feat: dpkg license improvement for non SPDX licenses #3090

Closed
spiffcs opened this issue Aug 1, 2024 · 2 comments · Fixed by #3366

spiffcs commented Aug 1, 2024

What happened:
Sometimes syft encounters a dpkg copyright file whose license cannot be identified by the regular expressions used to match on its contents.

In the following example we should find things like:

NVIDIA Software License Agreement and CUDA Supplement to Software License Agreement

Reads contents of copyright:
func fetchCopyrightContents(resolver file.Resolver, dbLocation file.Location, m pkg.DpkgDBEntry) (io.ReadCloser, *file.Location) {
	if resolver == nil {
		return nil, nil
	}

	// look for /usr/share/docs/NAME/copyright files
	copyrightPath := path.Join(docsPath, m.Package, "copyright")
	location := resolver.RelativeFileByPath(dbLocation, copyrightPath)

	// we may not have a copyright file for each package, ignore missing files
	if location == nil {
		return nil, nil
	}

	reader, err := resolver.FileContentsByLocation(*location)
	if err != nil {
		log.Warnf("failed to fetch deb copyright contents (package=%s): %s", m.Package, err)
	}
	defer internal.CloseAndLogError(reader, location.RealPath)

	l := location.WithAnnotation(pkg.EvidenceAnnotationKey, pkg.SupportingEvidenceAnnotation)

	return reader, &l
}

Sends contents for parsing

licenseStrs := parseLicensesFromCopyright(copyrightReader)
for _, licenseStr := range licenseStrs {
	p.Licenses.Add(pkg.NewLicenseFromLocations(licenseStr, copyrightLocation.WithoutAnnotations()))
}
// keep a record of the file where this was discovered
p.Locations.Add(*copyrightLocation)

Searches for license clause

func parseLicensesFromCopyright(reader io.Reader) []string {
	findings := strset.New()
	scanner := bufio.NewScanner(reader)

	for scanner.Scan() {
		line := scanner.Text()
		if value := findLicenseClause(licensePattern, "license", line); value != "" {
			findings.Add(value)
		}
		if value := findLicenseClause(commonLicensePathPattern, "license", line); value != "" {
			findings.Add(value)
		}
	}

	results := findings.List()
	sort.Strings(results)

	return results
}
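
To make the failure mode concrete, here is a minimal, self-contained sketch of the same line-by-line scan. The two patterns below are illustrative stand-ins (assumptions, not necessarily the expressions syft uses), and `findClause` and `scanLicenses` are hypothetical names. A machine-readable copyright file yields results, while free-form text such as the NVIDIA agreement title yields nothing:

```go
package main

import (
	"bufio"
	"fmt"
	"regexp"
	"sort"
	"strings"
)

// Illustrative stand-ins for the two patterns referenced above; the
// expressions syft actually uses may differ.
var (
	licenseFieldPattern      = regexp.MustCompile(`^License: (?P<license>\S+)`)
	commonLicensePathPattern = regexp.MustCompile(`/usr/share/common-licenses/(?P<license>[0-9A-Za-z_.\-]+)`)
)

// findClause returns the named capture group from the first match on line,
// or "" when the pattern does not match.
func findClause(pattern *regexp.Regexp, group, line string) string {
	match := pattern.FindStringSubmatch(line)
	idx := pattern.SubexpIndex(group)
	if match == nil || idx < 0 {
		return ""
	}
	return match[idx]
}

// scanLicenses applies both patterns to every line and returns the sorted,
// de-duplicated set of captured license strings.
func scanLicenses(contents string) []string {
	seen := map[string]struct{}{}
	scanner := bufio.NewScanner(strings.NewReader(contents))
	for scanner.Scan() {
		line := scanner.Text()
		for _, p := range []*regexp.Regexp{licenseFieldPattern, commonLicensePathPattern} {
			if v := findClause(p, "license", line); v != "" {
				seen[v] = struct{}{}
			}
		}
	}
	results := make([]string, 0, len(seen))
	for v := range seen {
		results = append(results, v)
	}
	sort.Strings(results)
	return results
}

func main() {
	structured := "License: GPL-2+\n On Debian systems the full text is in /usr/share/common-licenses/GPL-2\n"
	freeForm := "NVIDIA Software License Agreement and CUDA Supplement to\nSoftware License Agreement\n"

	fmt.Println(scanLicenses(structured)) // [GPL-2 GPL-2+]
	fmt.Println(scanLicenses(freeForm))   // []
}
```

The free-form case is the one this issue is about: a copyright file exists, but nothing in it looks like a `License:` field or a common-licenses path, so the package ends up with no license at all.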

What you expected to happen:
If a copyright file is found, SOME license information should be created for the package. Reporting no licenses at all is a bug.
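
One possible shape for that behavior, as a rough sketch only: when the pattern scan returns nothing but the copyright file has content, emit a placeholder identifier derived from the content instead of dropping the license. `licensesOrFallback` and the `LicenseRef-unknown-sha256-` naming are hypothetical and not something syft is known to use.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"strings"
)

// licensesOrFallback is a hypothetical helper. It returns the parsed licenses
// when there are any; otherwise, if the copyright file is non-empty, it returns
// a single placeholder identifier built from a truncated SHA-256 of the
// contents so that the package still carries SOME license information.
func licensesOrFallback(parsed []string, contents string) []string {
	if len(parsed) > 0 {
		return parsed
	}
	if strings.TrimSpace(contents) == "" {
		return nil
	}
	sum := sha256.Sum256([]byte(contents))
	// The identifier scheme is an assumption made for illustration.
	return []string{fmt.Sprintf("LicenseRef-unknown-sha256-%x", sum[:8])}
}

func main() {
	contents := "NVIDIA Software License Agreement and CUDA Supplement to Software License Agreement\n"

	// Nothing was parsed, so a content-derived placeholder is produced.
	fmt.Println(licensesOrFallback(nil, contents))

	// Parsed results are passed through untouched.
	fmt.Println(licensesOrFallback([]string{"GPL-2"}, contents))
}
```

Because the placeholder is derived from the file contents, identical unknown license texts across packages would map to the same identifier.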

Steps to reproduce the issue:

syft -o json nvidia/cuda:12.5.1-cudnn-runtime-ubuntu20.04 | grant list -o json | jq -r '.results[] | [.license.license_id, .license.name] | @csv' | sed 's/"//g'
  • Output of syft version: devel (tip of main)
  • OS (e.g: cat /etc/os-release or similar): OSX
@spiffcs spiffcs added the bug Something isn't working label Aug 1, 2024
@spiffcs spiffcs changed the title from "feat: dpkg license improvment" to "feat: dpkg license improvement for non SPDX licenses" Aug 1, 2024
@spiffcs spiffcs self-assigned this Aug 8, 2024
@spiffcs spiffcs moved this to In Progress in OSS Aug 8, 2024

spiffcs commented Aug 8, 2024

I've tracked down a couple of data sources syft could use to identify non-SPDX licenses. I'm currently looking at ways to incorporate them into license identification when generating the SBOM:

https://github.com/nexB/scancode-toolkit
https://github.com/nexB/scancode-licensedb
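
As a rough sketch of how a license database could feed identification: normalize the copyright text and look for known license titles in it. This does not use either project's data format or API. The table entries, the `example-` keys, and `matchKnownLicenses` are all hypothetical, and only the two titles come from the example in this issue.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// knownLicense is a hypothetical record: a key plus the human-readable title
// to search for in copyright text.
type knownLicense struct {
	Key   string
	Title string
}

// knownLicenses is a tiny illustrative table; a real one would be generated
// from a license database rather than written by hand.
var knownLicenses = []knownLicense{
	{Key: "example-nvidia-sla", Title: "NVIDIA Software License Agreement"},
	{Key: "example-cuda-supplement", Title: "CUDA Supplement to Software License Agreement"},
}

// matchKnownLicenses collapses all whitespace (so titles wrapped across lines
// still match), lowercases the text, and returns the sorted keys of every
// title found in it.
func matchKnownLicenses(contents string) []string {
	normalized := strings.ToLower(strings.Join(strings.Fields(contents), " "))
	var keys []string
	for _, l := range knownLicenses {
		if strings.Contains(normalized, strings.ToLower(l.Title)) {
			keys = append(keys, l.Key)
		}
	}
	sort.Strings(keys)
	return keys
}

func main() {
	contents := "NVIDIA Software License Agreement and CUDA Supplement to\nSoftware License Agreement\n"
	fmt.Println(matchKnownLicenses(contents)) // [example-cuda-supplement example-nvidia-sla]
}
```

Title matching is deliberately naive here. Full-text matching against license bodies would be more robust, at a higher cost per copyright file.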

@HeyeOpenSource

If you'd like a simplified solution to include custom licenses, you might want to take a look here:
https://github.com/HeyeOpenSource/syft/tree/Custom_Licenses 😁

N.B.: I just ran make test on it without any failures.
