This README file contains information on integrating Osselot into your openembedded build process using the meta-osselot layer.
In recent years the topic open-source license compliance in commercial software has seen an increase in interest. As a result, an increasing number of industry partners and customers are asking for an "open source clearance document", as a guarantee that the software in question does not violate any open-source (especially copy-left) license terms.
Creating such a clearing document is a multi-step process:
- You need to identify which packages are shipped within your product.
- You need to identify (curate) the various licenses and copyright notices within those packages. Depending on the requirements, this might have to happen on a file-to-file, rather than a per-package basis.
- You need to verify that all license obligations are adhered to.
- You need to provide this information to the customer in a suitable format.
If clearance is to be done on a file-to-file basis (a requirement that becomes increasingly predominant), this poses a huge challenge, especially in the (open)embedded context, where entire custom-built firmwares need to be considered.
However, even though the firmwares are custom-built, core components used in the underlying (Linux-based) operating systems are largely identical. Incidentally, these are the packages that require the most work during curation, due to their large codebase (it is an entire operating system after all). Therefore, instead of having every vendor to do the same data curation work for these packages for themselves, the idea of sharing and re-using curated license information within the community is only logical and fits spirit of open source.
Cue to Osselot, a relative new project tackling this challenge. In a nutshell, Osselot is an open-source database of curated license information on various open source projects that is made available as git repository. Additionally, Osselot provides documentation and tooling for re-using curation data on divergence in package version and/or individual files.
Osselot therefore helps covering step No2 (and to an extend No4) of the open source clearance process.
Still, the question remains: How can we easily identify packages relevant for license compliance, make Osselot data available wherever possible, as well as identifying divergences between source code cleared in Osselot and source code used in the openembedded build? This is where this layer, meta-osselot comes into play.
Meta-osselot integrates directly in the bitbake build process. It will identify any target-relevant package and attempt to find the package (in the best-matching version) as JSON SPDX file within the Osselot database. If a suitable package is found, meta-osselot will compare the file checksums for all source code within the "S" build directory against the available curation data. It is worth noting that since meta-osselot uses the Osselot git repository as data source, you can easily replace the upstream repository with your own fork of the Osselot curation database, thus allowing using your own curated data as well.
The results of this comparison, together with other meta-information as well as the available Osselot curation data, will be then provided as build artefacts.
Run bitbake-layers add-layer meta-osselot
to add the core layer to your build.
If you are using kas, the configuration looks as follows:
repos:
meta-osselot:
url: "https://github.com/iris-GmbH/meta-osselot.git"
branch: "<YOCTO_RELEASE>"
...
To enable the Osselot integration, simply add INHERIT += "osselot"
to your local.conf
file. Now osselot tasks will automatically run for every package or image you build.
Alternatively, if you are only interested in the osselot output and not in building packages add --runonly=populate_osselot
to your bitbake command, e.g:
bitbake core-image-minimal --runonly=populate_osselot
Meta-osselot can be configured via bitbake environment variables, either on a global or per recipe basis (or both).
Available configuration options are as follows.
Global configuration is done in your local.conf
file or a distro configuration file.
Variable | Description | Default value |
---|---|---|
OSSELOT_HASH_ALGORITHM |
The hash algorithm used when determining equivalence between source code and curation data | "md5" |
OSSELOT_DEPLOYDIR |
The folder in which Osselot artifact data will be deployed | "${DEPLOY_DIR}/osselot" |
OSSELOT_SRC_URI |
The bitbake SRC_URI configuration for fetching curation data |
"git://github.com/Open-Source-Compliance/package-analysis.git;protocol=https;branch=main" |
OSSELOT_SRCREV |
The revision of the curation data to use (default: latest) | "${AUTOREV}" |
OSSELOT_PV |
The package version of the curation data | "1.0+git${SRCPV}" |
OSSELOT_IGNORE_LICENSES |
Ignore packages with the listed licenses (whitespace separated) | "CLOSED" |
OSSELOT_IGNORE_SOURCE_GLOBS |
Globally ignore source code files in S which paths match these globs (whitespace separated) |
".pc/**/* patches/series .git/**/*" |
OSSELOT_IGNORE_PACKAGE_SUFFIXES |
Ignore packages ending with one of the specified suffixes (whitespace separated) | "${SPECIAL_PKGSUFFIX}" |
Per-recipe configuration is done in either the original *.bb
recipe file or by appending to an existing (upstream) recipe using *.bbapped
.
Variable | Description | Default value |
---|---|---|
OSSELOT_NAME |
The name of this package within the Osselot database | "${BPN}" |
OSSELOT_VERSION |
The version of this package within the Osselot database | "${PV}" |
OSSELOT_IGNORE |
Set to "1" to ignore this recipe |
"0" |
OSSELOT_IGNORE_SOURCE_GLOBS |
Within this recipe, ignore source code files in S where the paths match these globs (whitespace separated) |
".pc/**/* patches/series .git/**/*" |
OSSELOT_HASH_EQUIVALENCE |
Set source code hash equivalence of one or more hashes. Equal hashes are colon separated, statements are whitespace separated, e.g. "aaaa:bbbb:cccc dddd:ffff" |
"" |
We recommend overwriting OSSELOT_SRC_URI
with your own fork of the package-analysis
repository, as this allows you to push and re-use your own curated data. However, remember to keep your fork up-to-date with upstream repository for the latest curation data.
When adding curation data to the git repository, be aware that following assumptions are made by meta-osselot, otherwise your curation data will not be identified:
- Available packages are subdirectories within the
analysed-packages
directory (depth is irrelevant). The package name equals the directory name. - Available packages have one or more versions available as direct subdirectory within the package directory.
- These version directories start with the prefix "version-". The package version equals anything after this prefix.
- The version directory contains at least one valid SPDX file in JSON format.
We also recommend to upstream your curation data to the Osselot project for an additional layer of quality control, and for the open-source spirit of improving the Osselot database by making your curation data available to others.
There might be false negatives when matching packages against the Osselot data folder structure due to mismatches in name or version formatting between the recipe and the Osselot database.
For example, within the openembedded-core recipe expat the name of the package is "expat" and the version is "2.5.0". In Osselot however, the same package is named "libexpat" and the version is "R_2_5_0".
In these cases OSSELOT_NAME
and/or OSSELOT_VERSION
need to be overwritten, either within the recipe itself for your own custom layers, or in a matching .bbappend
file if the mismatch occurs within an upstream layer. In the latter case, please open an issue in this projects issue tracker, or contribute a patch to add the .bbappend
file for the appropriate layer within the bbappend folder of this repository, so that we can fix this for everyone.
By default, meta-osselot will already exclude packages where one the following is true:
- The package name contains a non-target suffix, i.e. it will not end up in the target product (see
SPECIAL_PKGSUFFIX
which is the default value forOSSELOT_IGNORE_PACKAGE_SUFFIXES
) - The package does not have any source code (
S
folder not existent) - The recipe
LICENSE
is set to"CLOSED"
(seeOSSELOT_IGNORE_LICENSES
variable) - The recipe has
OSSELOT_IGNORE
set to"1"
(see preconfigured bbappends within meta-osselot)
You may globally append to the OSSELOT_IGNORE_LICENSES
variable, if you want to exclude recipe with based on other licenses (e.g. your custom defined company license)
Also, you may set OSSELOT_IGNORE = "1"
within any recipe *.bb
or *.bbappend
file, if you wish to ignore this package.
Caution
Always be cautious when ignoring source file globs on a global level, you might end up ignoring valid cases!
There are valid use-cases for ignoring source files from a license compliance perspective.
For example, when you are confident that there files do not end up in the final product. Adding these files to the ignore list will help reduce the diff between version mismatches.
Another valid reason are release tarballs. Openembedded recipes sometimes use software release tarballs (e.g. GitHub releases) rather than the corresponding git source code (which is used during curation in Osselot). These release tarballs might contain additional files (e.g. pre-generated configure files) that are not relevant from a license compliance perspective.
Append to the OSSELOT_IGNORE_SOURCE_GLOBS
variable if you want to ignore a file (or glob), e.g.: OSSELOT_IGNORE_SOURCE_GLOBS += "tests/**/*"
.
There might be cases of source code differences between two versions of the same file, e.g. when applying patches via bitbake recipes. In most of these cases, there are no license compliance relevant changes done to these files. In these cases you can use OSSELOT_HASH_EQUIVALENCE
variable to define equivalences between two or more hashes, e.g.:
# first hash equivalence statement
OSSELOT_HASH_EQUIVALENCE += "bcb82bc370eb937e3b310e98d20aa906:d6b29fc355b6ab0f9f4bb9d4e03e9304:cb31a703b96c1ab2f80d164e9676fe7d"
# second hash equivalence statement
OSSELOT_HASH_EQUIVALENCE += "d3b07384d113edec49eaa6238ad5ff00:c157a79031e1c40f85931829bc5fc552"
If an openembedded package source code checksum mismatches the corresponding SPDX checksum entry, meta-osselot will evaluate all available hash equivalence statements. If both the source code checksum as well as the SPDX checksum are available within the same hash equivalence statement, the file will be marked accordingly in the output.
After a successful build, all files relevant to meta-osselot will be stored in the ${OSSELOT_DEPLOYDIR}
.
The file at ${OSSELOT_DEPLOYDIR}/${PN}/${PN}-${PV}-meta.json
contains relevant meta- and checksum-information on the package.
Please submit any patches against the meta-osselot layer via a GitHub Pull Request.