The REUSE project is an effort to make Open Source projects provide copyright and license information for all their files in a machine-readable way.
When a project is fully REUSE compliant, you can easily figure out the copyright and license situation for every single file it holds.
The easiest way to accomplish this is to make sure that all files have the correct header with the appropriate copyright info and SPDX-License-Identifier specified, but REUSE also offers ways to provide that metadata in adjacent files, for files where prepending that info isn’t sensible.
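To make the header idea concrete, here is a minimal Python sketch of what such a check could look like. It is not the REUSE tool and not curl's own script. The header text, the function name `header_info` and the 40-line window are all assumptions made for illustration, and the author name and year in the sample header are placeholders.

```python
import re

# Illustrative header only: the name and year are placeholders,
# not the text curl actually ships.
EXAMPLE_HEADER = """\
/*
 * Copyright (C) 2022 Example Author
 *
 * SPDX-License-Identifier: curl
 */
"""

# Captures the first token after the tag, so a compound expression
# such as "MIT OR Apache-2.0" would only yield "MIT" in this sketch.
_SPDX_RE = re.compile(r"SPDX-License-Identifier:\s*(\S+)")
# Assumption: a copyright notice starts with one of these markers.
_COPYRIGHT_RE = re.compile(r"(SPDX-FileCopyrightText:|Copyright|©)\s*\S+")


def header_info(text, max_lines=40):
    """Return (has_copyright, license_id or None) for the top of a file."""
    head = "\n".join(text.splitlines()[:max_lines])
    license_match = _SPDX_RE.search(head)
    has_copyright = _COPYRIGHT_RE.search(head) is not None
    license_id = license_match.group(1) if license_match else None
    return has_copyright, license_id
```

Calling `header_info(EXAMPLE_HEADER)` reports that a copyright line is present and that the identifier is `curl`. A file without any header yields `(False, None)`.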
What we needed to do
We were already in a fairly good place before this push. We have a script that verifies the presence of a copyright header in files (including checking the end year against the latest git commit), with a list of files that are deliberately skipped.
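The end-year check mentioned above can be sketched roughly like this in Python. This is not curl's actual script. The header pattern (`Copyright (C) 1998 - 2022` or a single year), the helper names, and the choice of the author date of the latest commit are assumptions made for illustration.

```python
import re
import subprocess

# Assumed header shape: "Copyright (C) 1998 - 2022, ..." or "Copyright (C) 2022 ..."
_YEARS_RE = re.compile(r"Copyright \(C\) (?:(\d{4})\s*-\s*)?(\d{4})")


def header_end_year(text):
    """Return the last year named in the copyright line, or None if absent."""
    match = _YEARS_RE.search(text)
    return int(match.group(2)) if match else None


def last_commit_year(path):
    """Ask git for the year of the latest commit touching path (author date)."""
    out = subprocess.run(
        ["git", "log", "-1", "--format=%ad", "--date=format:%Y", "--", path],
        capture_output=True, text=True, check=True,
    ).stdout.strip()
    return int(out) if out else None


def year_is_current(text, commit_year):
    """True when the header's end year is at least the latest commit year."""
    end = header_end_year(text)
    return end is not None and end >= commit_year
```

A file whose header ends in 2021 but whose latest commit is from 2022 would be flagged, since `year_is_current` returns `False` for it.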
The biggest things we needed to do were:
- Add the SPDX identifier all over
- Make sure that the skipped files also have copyright and licensing info provided
- Add a CI job that verifies that we remain compliant
I also ended up adjusting our own copyright scan script to use the REUSE metadata files instead of its own ignore filters. That made it even easier for us to make sure we are and remain compliant: that every single file in the curl git repository has a known and documented license and copyright situation.
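As a rough illustration of "header or metadata file", the following Python sketch looks for the identifier in the file itself and then falls back to an adjacent file. The `<name>.license` convention is my understanding of the REUSE specification. Whether curl uses that mechanism or another REUSE metadata file is not stated here, and the function name `license_source` is hypothetical.

```python
from pathlib import Path

TAG = "SPDX-License-Identifier:"


def license_source(path):
    """Return 'header', 'sidecar', or None for a single file."""
    path = Path(path)
    try:
        # Only the start of the file matters for a header check.
        head = path.read_text(encoding="utf-8", errors="replace")[:4096]
    except OSError:
        head = ""
    if TAG in head:
        return "header"
    # Assumed convention: metadata for logo.png lives in logo.png.license
    sidecar = path.with_name(path.name + ".license")
    if sidecar.is_file():
        if TAG in sidecar.read_text(encoding="utf-8", errors="replace"):
            return "sidecar"
    return None
```

Any file for which this returns `None` has no documented license in this simplified model, which is the kind of file a compliance check would flag.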
As a bonus, the cleanup work helped us detect an example file that stood out, which we got relicensed, and we removed two older files that had their own unique licenses (without any good reason).
There are 3518 files in the curl git repository at this exact moment.
Compliant!
Starting mid-June 2022, curl is 100% REUSE compliant. curl 7.84.0 will be the first release made in this state.
Motivation
I think it is a good idea to have perfect control over the copyright and license situation for every single file, and to make sure that the situation is documented well enough that anyone and everyone can check it out and learn how things lie. No surprises.
Companies have obviously figured out this info before, to a degree that satisfied them, since curl is widely used, even commercially, and has been for a long time. But I believe that providing the information in an even easier and more descriptive way makes things even better, for existing and future users.
I also think that the low threshold for us to reach this compliance was a factor. We were almost there already. We just needed to polish up some small details, and I think that made it worth it.
This cleanup also makes sure we have perfect control and knowledge of the license situation, now and going forward. I think this can be expected from a project aiming for the gold standard.
The curl SPDX license identifier
Keen readers will notice that curl has its own license identifier. It is called the curl license. Not MIT, X or a BSD variation. curl.
The reason for this is good old stupidity. In January 2001 we adopted the MIT license for use in the project because we believed it better matched what we wanted compared to the previous license situation. We started out with a dual license situation together with the MPL license we had used previously, but the MPL part was removed completely in October 2002.
For reasons that have since been forgotten, we thought it was a good idea to edit the license text. To trim it a little. Since August 2002, the license text that started out as an MIT/X license is no longer a perfect copy. It is a derivative. Very similar and almost identical. But it’s not the same.
When the SPDX project created their set of identifiers for well-used licenses out in the FOSS world they decided that the curl license is different enough from the MIT/X license to treat it separately and give it its own identifier. I know of no other project than curl that uses this particular edited version of the MIT license.
In hindsight, I believe the editing of the license text back in 2002 was dumb. I regret it, but I will not change it again. I think we can live with this situation pretty well.
Credits
Most of the heavy lifting necessary to make curl compliant was done by Max Mehl.