http://open-source-security-software.net/project/scancode-toolkit/releases.atom Recent releases for scancode-toolkit 2025-05-18T06:30:54.962471+00:00 python-feedgen scancode-toolkit v1.0.0 scancode-toolkit v1.0.0 2015-07-01T15:21:32+00:00 Initial release. To install, download the scancode-toolkit-1.0.0.zip or scancode-toolkit-1.0.0.tar.bz2 from the Downloads section below and follow installation instructions in the README at https://github.com/nexB/scancode-toolkit/blob/master/README.rst You can download the source code of pre-built binaries from these locations: https://github.com/nexB/scancode-thirdparty-src/archive/v1.0.0.tar.gz https://github.com/nexB/scancode-thirdparty-src/archive/v1.0.0.zip 2015-07-01T15:21:32+00:00 scancode-toolkit v1.1.0 scancode-toolkit v1.1.0 2015-07-06T10:25:47+00:00 This is a minor bug fix release. Using the `--extract` option in conjunction with `--license` or `--copyright` will return an error. To install, download `scancode-toolkit-1.1.0.zip` or `scancode-toolkit-1.1.0.tar.bz2` from the Downloads section below and follow installation instructions in the README at https://github.com/nexB/scancode-toolkit/blob/master/README.rst You can download the source code of pre-built binaries from these locations: https://github.com/nexB/scancode-thirdparty-src/archive/v1.0.0.tar.gz https://github.com/nexB/scancode-thirdparty-src/archive/v1.0.0.zip 2015-07-06T10:25:47+00:00 scancode-toolkit v1.2.1 scancode-toolkit v1.2.1 2015-07-13T15:12:50+00:00 This is a major bug fix release. The `--extract` option is no longer slow and now displays progress information when archives are extracted. Using `--extract` with `--verbose` displays detailed progress messages. 
To install, download `scancode-toolkit-1.2.1.zip` or `scancode-toolkit-1.2.1.tar.bz2` from the Downloads section below and follow installation instructions in the README at https://github.com/nexB/scancode-toolkit/blob/master/README.rst You can download the source code of pre-built binaries from these locations: https://github.com/nexB/scancode-thirdparty-src/archive/v1.0.0.tar.gz https://github.com/nexB/scancode-thirdparty-src/archive/v1.0.0.zip 2015-07-13T15:12:50+00:00 scancode-toolkit v1.2.2 scancode-toolkit v1.2.2 2015-07-14T14:17:12+00:00 This is a minor bug fix release. The `--extract` option now accepts relative paths. To install, download `scancode-toolkit-1.2.2.zip` or `scancode-toolkit-1.2.2.tar.bz2` from the Downloads section below and follow installation instructions in the README at https://github.com/nexB/scancode-toolkit/blob/master/README.rst You can download the source code of pre-built binaries from these locations: https://github.com/nexB/scancode-thirdparty-src/archive/v1.0.0.tar.gz https://github.com/nexB/scancode-thirdparty-src/archive/v1.0.0.zip 2015-07-14T14:17:12+00:00 scancode-toolkit v1.2.3 scancode-toolkit v1.2.3 2015-07-16T08:02:21+00:00 This is a major bug fix release for Windows. The `--extract` option was not working on Windows in previous 1.2.x pre-releases. To install, download `scancode-toolkit-1.2.3.zip` or `scancode-toolkit-1.2.3.tar.bz2` from the Downloads section below and follow installation instructions in the README at https://github.com/nexB/scancode-toolkit/blob/master/README.rst You can download the source code of pre-built binaries from these locations: https://github.com/nexB/scancode-thirdparty-src/archive/v1.0.0.tar.gz https://github.com/nexB/scancode-thirdparty-src/archive/v1.0.0.zip 2015-07-16T08:02:21+00:00 scancode-toolkit v1.2.4 scancode-toolkit v1.2.4 2015-07-22T14:14:50+00:00 This is a minor bug fix release, including the ability to scan a single file and some improved copyright scanning. 
To install, download `scancode-toolkit-1.2.4.zip` or `scancode-toolkit-1.2.4.tar.bz2` from the Downloads section below and follow installation instructions in the README at https://github.com/nexB/scancode-toolkit/blob/master/README.rst You can download the source code of pre-built binaries from these locations: https://github.com/nexB/scancode-thirdparty-src/archive/v1.0.0.tar.gz https://github.com/nexB/scancode-thirdparty-src/archive/v1.0.0.zip 2015-07-22T14:14:50+00:00 scancode-toolkit v1.3.0 scancode-toolkit v1.3.0 2015-07-24T14:35:17+00:00 This is a feature and bug fix release: - scancode now ignores version control directories by default (.svn, .git, etc) - Improved copyright and license detections (new rules, etc.) - other minor improvements and minor bug fixes. - experimental and unsupported inclusion of Linux-32 bits pre-built binaries To install, download `scancode-toolkit-1.3.0.zip` or `scancode-toolkit-1.3.0.tar.bz2` from the Downloads section below and follow installation instructions in the README at https://github.com/nexB/scancode-toolkit/blob/master/README.rst You can download the source code of pre-built binaries from these locations: https://github.com/nexB/scancode-thirdparty-src/archive/v1.0.0.tar.gz https://github.com/nexB/scancode-thirdparty-src/archive/v1.0.0.zip 2015-07-24T14:35:17+00:00 scancode-toolkit v1.3.1 scancode-toolkit v1.3.1 2015-07-27T18:54:04+00:00 This is a feature and bug fix release: - fixed --verbose option - Improved copyright and license detections (new rules, etc.) - other minor improvements and minor bug fixes. 
- fix for experimental and unsupported inclusion of Linux-32 bits pre-built binaries To install, download `scancode-toolkit-1.3.1.zip` or `scancode-toolkit-1.3.1.tar.bz2` from the Downloads section below and follow installation instructions in the README at https://github.com/nexB/scancode-toolkit/blob/master/README.rst You can download the source code of pre-built binaries from these locations: https://github.com/nexB/scancode-thirdparty-src/archive/v1.0.0.tar.gz https://github.com/nexB/scancode-thirdparty-src/archive/v1.0.0.zip 2015-07-27T18:54:04+00:00 scancode-toolkit v1.4.0 scancode-toolkit v1.4.0 2015-11-24T18:42:47+00:00 This is a feature and bug fix release: - Separated JSON data into a separate file for the html app. - Added support for scanning package and file information. - New and improved license rules and licenses. - Created new extractcode standalone command. Extracting archives is no longer part of the scancode command. - ScanCode can now be called from anywhere. - Various minor improvements and bug fixes. 
To install, download `scancode-toolkit-1.4.0.zip` or `scancode-toolkit-1.4.0.tar.bz2` from the Downloads section below and follow installation instructions in the README at https://github.com/nexB/scancode-toolkit/blob/master/README.rst You can download the source code of pre-built binaries from these locations: https://github.com/nexB/scancode-thirdparty-src/archive/v1.0.0.tar.gz https://github.com/nexB/scancode-thirdparty-src/archive/v1.0.0.zip 2015-11-24T18:42:47+00:00 scancode-toolkit v1.4.2 scancode-toolkit v1.4.2 2015-12-03T11:51:53+00:00 This is a major bug fix release for v1.4.0: - The release archives were missing some code (packagedcode) - Improved --quiet option for command line operations To install, download `scancode-toolkit-1.4.2.zip` or `scancode-toolkit-1.4.2.tar.bz2` from the Downloads section below and follow installation instructions in the README at https://github.com/nexB/scancode-toolkit/blob/master/README.rst You can download the source code of pre-built binaries from these locations: https://github.com/nexB/scancode-thirdparty-src/archive/v1.0.0.tar.gz https://github.com/nexB/scancode-thirdparty-src/archive/v1.0.0.zip 2015-12-03T11:51:53+00:00 scancode-toolkit v1.4.3 scancode-toolkit v1.4.3 2015-12-10T17:14:05+00:00 This is a minor bug fix release for v1.4: - In the HTML app, the scanned path was hardcoded as `scancode-toolkit2/scancode-toolkit/samples` instead of displaying the path that was scanned. 
To install, download `scancode-toolkit-1.4.3.zip` or `scancode-toolkit-1.4.3.tar.bz2` from the Downloads section below and follow installation instructions in the README at https://github.com/nexB/scancode-toolkit/blob/master/README.rst You can also download the source code for pre-built third-party binaries from these locations: https://github.com/nexB/scancode-thirdparty-src/archive/v1.0.0.tar.gz https://github.com/nexB/scancode-thirdparty-src/archive/v1.0.0.zip 2015-12-10T17:14:05+00:00 scancode-toolkit v1.5.0 scancode-toolkit v1.5.0 2015-12-15T15:06:17+00:00 This is a significant new feature release: - The HTML app now displays a license summary graphic - Copyright holders and Authors are now collected together with copyrights - New email and url scan options: scan for URLs and emails: new useful origin clues - New and improved license and detection rules and other minor bug fixes These new scans are for now only available in the JSON output. To install, download `scancode-toolkit-1.5.0.zip` or `scancode-toolkit-1.5.0.tar.bz2` from the Downloads section below and follow installation instructions in the README at https://github.com/nexB/scancode-toolkit/blob/master/README.rst You can also download the source code for pre-built third-party binaries from these locations: https://github.com/nexB/scancode-thirdparty-src/archive/v1.0.0.tar.gz https://github.com/nexB/scancode-thirdparty-src/archive/v1.0.0.zip 2015-12-15T15:06:17+00:00 scancode-toolkit v1.6.0 scancode-toolkit v1.6.0 2016-01-29T22:59:07+00:00 This is a significant new feature release: - The HTML app now displays a copyright summary graphic - Improved HTML app UI enhancements - New and improved license and detection rules and other minor bug fixes To install, download `scancode-toolkit-1.6.0.zip` or `scancode-toolkit-1.6.0.tar.bz2` from the Downloads section below and follow installation instructions in the README at https://github.com/nexB/scancode-toolkit/blob/master/README.rst You can also download 
the source code for pre-built third-party binaries from these locations: https://github.com/nexB/scancode-thirdparty-src/archive/v1.0.0.tar.gz https://github.com/nexB/scancode-thirdparty-src/archive/v1.0.0.zip 2016-01-29T22:59:07+00:00 scancode-toolkit v1.6.1 scancode-toolkit v1.6.1 2016-03-01T19:55:53+00:00 This is an unofficial release with experimental python wheel support. **Unless you want to help with testing this release, please use instead the stable version 1.6.0 at https://github.com/nexB/scancode-toolkit/releases/latest** 2016-03-01T19:55:53+00:00 scancode-toolkit v1.6.3 scancode-toolkit v1.6.3 2016-06-24T16:36:01+00:00 This is an unofficial release with experimental improved package detection support __Unless you want to help with testing this release, please use instead the stable version 1.6.0 at https://github.com/nexB/scancode-toolkit/releases/latest __ 2016-06-24T16:36:01+00:00 scancode-toolkit v2.0.0.rc1 scancode-toolkit v2.0.0.rc1 2016-10-07T21:13:46+00:00 Early release candidate used for testing 2016-10-07T21:13:46+00:00 scancode-toolkit v2.0.0.rc2 scancode-toolkit v2.0.0.rc2 2017-01-16T15:07:25+00:00 This is a stable release candidate for v2.0 that can be used for testing and production use. This is a significant new feature release. Changelog is in preparation. To install, download `scancode-toolkit-2.0.0rc2.zip` or `scancode-toolkit-2.0.0rc2.tar.bz2` from the Downloads section below and follow installation instructions in the README at https://github.com/nexB/scancode-toolkit/blob/develop/README.rst You can also download the source code for pre-built third-party binaries from these locations: https://github.com/nexB/scancode-thirdparty-src/archive/v1.0.0.tar.gz https://github.com/nexB/scancode-thirdparty-src/archive/v1.0.0.zip 2017-01-16T15:07:25+00:00 scancode-toolkit v2.0.0.rc3 scancode-toolkit v2.0.0.rc3 2017-06-16T16:28:03+00:00 This is a stable and final release candidate for v2.0 that can be used for testing and production use. 
This is a significant new feature release. Changelog is in preparation. To install, download `scancode-toolkit-2.0.0rc3.zip` or `scancode-toolkit-2.0.0rc3.tar.bz2` from the Downloads section below and follow installation instructions in the `README.rst` file or at https://github.com/nexB/scancode-toolkit/blob/develop/README.rst Also available is a Python wheel from Pypi: install on Python 2 with `pip install scancode-toolkit` You can also download the source code for pre-built third-party binaries from these locations: - https://github.com/nexB/scancode-thirdparty-src/archive/v1.0.0.tar.gz - https://github.com/nexB/scancode-thirdparty-src/archive/v1.0.0.zip 2017-06-16T16:28:03+00:00 scancode-toolkit v2.0.0 scancode-toolkit v2.0.0 2017-06-23T10:01:22+00:00 This is a major release with several new and improved features and bug fixes. To install, download `scancode-toolkit-2.0.0.zip` or `scancode-toolkit-2.0.0.tar.bz2` from the Downloads section below and follow installation instructions in the README at https://github.com/nexB/scancode-toolkit/blob/master/README.rst You can also download the source code for pre-built third-party binaries from these locations: * https://github.com/nexB/scancode-thirdparty-src/archive/v1.0.0.tar.gz * https://github.com/nexB/scancode-thirdparty-src/archive/v1.0.0.zip Thank you to all contributors to this release and the 200+ stars and 60+ forks on GitHub! Some of the key highlights include: * License: Brand new, faster and more accurate detection engine. New and improved licenses and over 2500 new detection rules * Package and dependencies: new and improved detection of multiple package formats: NPM, Maven, NuGet, PHP Composer, Python Pypi and RPM. In most cases direct, declared dependencies are also reported. * Scan outputs: New SPDX tag/values and RDF outputs. Improved compact JSON format. 
* Copyright: several false positives are no longer returned and copyrights are more accurate * Archive extraction: support for shallow extraction and new archive types * Performance: everything is generally faster and less memory hungry. Scans can run on multiple processes in parallel with the new `--processes` option speeding up things even further. * You can now install ScanCode as a library from Pypi with `pip install scancode-toolkit` 2017-06-23T10:01:22+00:00 scancode-toolkit v2.0.1 scancode-toolkit v2.0.1 2017-07-03T16:27:15+00:00 This is a minor release with some minor improved features and bug fixes. To install, download `scancode-toolkit-2.0.1.zip` or `scancode-toolkit-2.0.1.tar.bz2` from the Downloads section below and follow installation instructions in the README at https://github.com/nexB/scancode-toolkit/blob/master/README.rst This is also available as a Python library from Pypi with `pip install scancode-toolkit` You can also download the source code for pre-built third-party binaries from these locations: * https://github.com/nexB/scancode-thirdparty-src/archive/v1.0.0.tar.gz * https://github.com/nexB/scancode-thirdparty-src/archive/v1.0.0.zip Thank you to all contributors to this release and the 200+ stars and 60+ forks on GitHub! Key changes: * New and improved license detection, including refined match scoring for #534 * Bug fixed in License detection leading to a very long scan time for some rare JavaScript files. Reported by @jarnugirdhar * New "base_name" attribute returned with file information. Reported by @chinyeungli * Bug fixed in Maven POM package detection. Reported by @kalagp 2017-07-03T16:27:15+00:00 scancode-toolkit v2.1.0 scancode-toolkit v2.1.0 2017-09-22T20:07:09+00:00 This is a minor release with several new and improved features and bug fixes but no significant API changes. 
To install, download `scancode-toolkit-2.1.0.zip` or `scancode-toolkit-2.1.0.tar.bz2` from the Downloads section below and follow installation instructions in the README at https://github.com/nexB/scancode-toolkit/blob/master/README.rst This is also available as a Python library from Pypi with `pip install scancode-toolkit` You can also download the source code for pre-built third-party binaries from these locations: * https://github.com/nexB/scancode-thirdparty-src/archive/v1.0.0.tar.gz * https://github.com/nexB/scancode-thirdparty-src/archive/v1.0.0.zip Key changes: * New plugin architecture by @yashdsaraf * Several new and improved licenses and license detection rules * Multiple bug fixes Thank you to all contributors to this release and the 240+ stars and 70+ forks on GitHub! Some of the contributors to this release with either code and bug reports include (and this list is likely missing some): * @abuhman * @chinyeungli * @jimjag * @JonoYang * @jpopelka * @majurg * @mjherzog * @pgier * @pkajaba * @pombredanne * @scottctr * @sschuberth * @yahalom5776 * @yashdsaraf 2017-09-22T20:07:09+00:00 scancode-toolkit v2.2.1 scancode-toolkit v2.2.1 2017-10-05T23:18:07+00:00 This is a minor release with several bug fixes, one new feature and one (minor) API change. To install, download `scancode-toolkit-2.2.1.zip` or `scancode-toolkit-2.2.1.tar.bz2` from the Downloads section below and follow installation instructions in the README at https://github.com/nexB/scancode-toolkit/blob/master/README.rst This is also available as a Python library from Pypi with `pip install scancode-toolkit` You can also download the source code for pre-built third-party binaries from these locations: * https://github.com/nexB/scancode-thirdparty-src/archive/v1.0.0.tar.gz * https://github.com/nexB/scancode-thirdparty-src/archive/v1.0.0.zip ### API change: * Licenses data now contains a new reference_url attribute instead of a dejacode_url attribute. 
This defaults to the public DejaCode URL and can be configured with the new --license-url-template command line option. ### New feature: * There is a new "--format jsonlines" output format option. In this format, each line in the output is a valid JSON document. The first line contains a "header" object with header-level data such as notice, version, etc. Each line after the first contains the scan results for a single file formatted with the same structure as a whole scan results JSON document but without any header-level attributes. See also http://jsonlines.org/ ### Other changes: * Several new and improved license detection rules have been added. The logic of detection has been refined to handle some rare corner cases. The underscore character "_" is treated as part of a license word and the handling of negative and false_positive license rules has been simplified. * Several issues in dealing with codebases with non-ASCII, non-UTF-decodable file paths and other filesystem encoding-related bugs have been fixed. * Several copyright detection bugs have been fixed. * PHP Composer and RPM packages are now detected with --package * Several other package types are now detected with --package even though only a few attributes may be returned for now until full parsers are added. * Several NPM package parsing bugs have been fixed. * There are some minor performance improvements when scanning some large files for licenses. Thank you to all contributors to this release and the 250+ stars and 80+ forks on GitHub! 2017-10-05T23:18:07+00:00 scancode-toolkit v2.9.0b1 scancode-toolkit v2.9.0b1 2018-03-02T21:35:42+00:00 This is a pre-release of what will come up for 3.0 This has a lot of new changes including improved plugins, speed and detection that are not yet fully documented but it can be used for testing. 
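The `jsonlines` output format described in the v2.2.1 notes above can be consumed one line at a time. The following is a minimal sketch that assumes only what those notes state: the first line is a header object and every later line holds the results for a single file. The function name `split_jsonlines_scan` and the sample keys (`header`, `files`, `path`) are hypothetical placeholders for illustration, not a documented schema.

```python
import io
import json


def split_jsonlines_scan(lines):
    """Return (header, file_results) from an iterable of JSON Lines text.

    Assumption taken from the release notes: the first non-empty line is
    the header object and each following line holds one file's results.
    """
    documents = [json.loads(line) for line in lines if line.strip()]
    if not documents:
        return None, []
    return documents[0], documents[1:]


# Hypothetical sample data; the key names are illustrative only.
sample = io.StringIO(
    '{"header": {"notice": "example notice", "version": "2.2.1"}}\n'
    '{"files": [{"path": "src/a.c"}]}\n'
    '{"files": [{"path": "src/b.c"}]}\n'
)
header, results = split_jsonlines_scan(sample)
print(len(results))
```

Because each line is an independent JSON document, a real consumer could also process the file lazily instead of collecting every record in a list, which may matter for large scans.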
2018-03-02T21:35:42+00:00 scancode-toolkit v2.9.1 scancode-toolkit v2.9.1 2018-03-22T16:23:59+00:00 This is a stable pre-release of what will come up for 3.0 This has a lot of new changes including improved license detection, plugins, speed and detection that are not yet fully documented but it can be used for testing. 2018-03-22T16:23:59+00:00 scancode-toolkit v2.9.2 scancode-toolkit v2.9.2 2018-05-08T15:02:05+00:00 This is a stable pre-release of what will come up for 3.0 This has a lot of new changes and bug fixes including improved SPDX license detection, package reporting and additional plugins and more: these are not yet fully documented but this release can be used for testing and is stable. Some major changes include: - **A security fix** The support for Rar archives extraction in extractcode has been changed and downgraded to use libarchive instead of 7zip as a mitigation for a 7Zip vulnerability referenced as CVE-2018-10115 https://nvd.nist.gov/vuln/detail/CVE-2018-10115 . As a result, you may expect some extraction failures when extracting some Rar archives as fewer Rar archive formats are supported by libarchive. When the bug is properly fixed on all OSes in 7Zip this may be reverted. - The package models have been updated significantly and streamlined. They now also use the Package URL (purl) semantics. If you rely on the previous v2.x models and data structures with `--package` scans, things are rather improved now. Documentation will come up next. 
- The license detection has been updated in several ways: - a new --license-expression option returns license expressions (using ScanCode keys) - several licenses have been added, updated or retired after a sync with the latest SPDX license list v3.1 and AboutCode - SPDX license identifiers are now detected by the license scan 2018-05-08T15:02:05+00:00 scancode-toolkit v3.0.0 scancode-toolkit v3.0.0 2019-02-14T19:54:54+00:00 This is the first 3.0 release with the best, fastest and most efficient ScanCode ever released. This release contains many improvements, fixes and new features including breaking API changes (when compared to 2.2.x). See the CHANGELOG for details at https://github.com/nexB/scancode-toolkit/blob/master/CHANGELOG.rst To install, download scancode-toolkit-3.0.0.zip or scancode-toolkit-3.0.0.tar.bz2 from the Downloads section below and follow installation instructions in the README at https://github.com/nexB/scancode-toolkit/blob/master/README.rst This is also available as a Python library from Pypi with `pip install scancode-toolkit` You can also download the corresponding source code for bundled pre-built third-party binaries from these locations: - https://github.com/nexB/scancode-thirdparty-src/archive/v1.0.0.tar.gz - https://github.com/nexB/scancode-thirdparty-src/archive/v1.0.0.zip 2019-02-14T19:54:54+00:00 scancode-toolkit v3.0.2 scancode-toolkit v3.0.2 2019-02-15T15:32:44+00:00 This is a minor bug fix version for 3.0.0. See https://github.com/nexB/scancode-toolkit/releases/tag/v3.0.0 for major changes. - A tracing flag was turned on in the summary module by mistake. Reported by @tdruez #1374 - Correct a Maven parsing error. Reported and fixed by @linexb #1373 - Set proper links in the README. 
Reported and fixed by @sschuberth #1371 - No changes from v3.0.1 See the CHANGELOG for details at https://github.com/nexB/scancode-toolkit/blob/master/CHANGELOG.rst To install, download scancode-toolkit-3.0.2.zip or scancode-toolkit-3.0.2.tar.bz2 from the Downloads section below and follow installation instructions in the README at https://github.com/nexB/scancode-toolkit/blob/master/README.rst This is also available as a Python library from Pypi with `pip install scancode-toolkit` You can also download the corresponding source code for bundled pre-built third-party binaries from these locations: - https://github.com/nexB/scancode-thirdparty-src/archive/v1.0.0.tar.gz - https://github.com/nexB/scancode-thirdparty-src/archive/v1.0.0.zip 2019-02-15T15:32:44+00:00 scancode-toolkit v3.1.1 scancode-toolkit v3.1.1 2019-09-04T21:07:09+00:00 2019-09-04T21:07:09+00:00 scancode-toolkit v3.2.0rc1 scancode-toolkit v3.2.0rc1 2020-09-08T18:44:56+00:00 This is the first release candidate of 3.2 2020-09-08T18:44:56+00:00 scancode-toolkit v3.2.1rc2 scancode-toolkit v3.2.1rc2 2020-09-11T16:01:22+00:00 This is the first release candidate of 3.2 2020-09-11T16:01:22+00:00 scancode-toolkit v3.2.2rc3 scancode-toolkit v3.2.2rc3 2020-10-22T09:18:23+00:00 This is the third release candidate of 3.2 Notable changes: - Ensure commoncode can become a standalone package #2233 - Add Dockerfile to build docker image from ScanCode sources #2265 2020-10-22T09:18:23+00:00 scancode-toolkit v3.2.3 scancode-toolkit v3.2.3 2020-10-27T18:51:44+00:00 This is the final 3.2 release. Notable changes from the previous release candidate: - Collect Windows executable metadata #652 - Fix minor bugs - Add Dockerfile to build docker image from ScanCode sources #2265 2020-10-27T18:51:44+00:00 scancode-toolkit v21.2.9 scancode-toolkit v21.2.9 2021-02-09T18:11:34+00:00 This is a major new release. 
Some of the highlights include: Security: - Update vulnerable LXML to version 4.6.2 to fix https://nvd.nist.gov/vuln/detail/CVE-2020-27783 This was detected thanks to https://github.com/nexb/vulnerablecode Operating system support: - Drop support for Python 2 #295 - Drop support for 32 bits on Windows #335 - Add support for Python 64 bits on Windows 64 bits #335 - Add support for Python 3.6, 3.7, 3.8 and 3.9 on Linux, Windows and macOS. These are now tested on Azure. - Add deprecation message for native Windows support #2366 License scanning: - Improve license detection accuracy with over 8400 new license detection rules added or updated - Remove the previously deprecated --license-diag option - Include pre-built license index in release archives to speed up start #988 - Use SPDX LicenseRef-scancode namespace for all license keys not in SPDX - Replace DEJACODE_LICENSE_URL with SCANCODE_LICENSEDB_URL at https://scancode-licensedb.aboutcode.org #2165 Package scanning: - Add detection of package-installed files - Add analysis of system package installed databases for Debian, OpenWRT and Alpine Linux packages - Add support for Alpine Linux, Debian, OpenWRT. Copyright scanning: - Improve detection with minor grammar fixes Misc.: - Adopt a new calendar date-based versioning for scancode-toolkit version numbers - Update thirdparty dependencies and built-in plugins - Allow installation without extractcode and typecode native plugins. Instead one can elect to install these or not to have a lighter footprint if needed. - Update configuration and bootstrap scripts to support a new PyPI-like repository at https://thirdparty.aboutcode.org/pypi/ - Create new release scripts to populate released archives with just the required wheels of a given OS and Python version. 
- Updated scancode.bat to handle % signs in the arguments #1876 Big thank you to all contributors and in particular: - Abhishek Kumar - Ayan Sinha Mahapatra - Ayush Bhardwaj - Chin Yeung Li - Dennis Clark - Duncan Howe - John Horan - Jono Yang - Maximilian Huber - Michael Herzog - Philippe Ombredanne - Sankha Das - Scott Pakin - Steven Esser - Tushar Upadhyay 2021-02-09T18:11:34+00:00 scancode-toolkit v21.2.25 scancode-toolkit v21.2.25 2021-02-25T22:19:15+00:00 This is a minor new release. Some of the highlights include: Installation: - Resolve reported installation issues on macOS, Windows and Linux - Stop using extras for a default wheel installation - Build new scancode-toolkit-mini package with limited dependencies for use when packaging in distros and similar - The new Dockerfile will create smaller images and containers License scanning: - Over 150 new and updated licenses - Support the latest SPDX license list v3.11 - Improve license detection accuracy with over 740 new and improved license detection rules - Fix license cache handling issues Misc.: - Update extractcode, typecode and their native dependencies for better support of latest versions of macOS. Big thank you to all contributors! 2021-02-25T22:19:15+00:00 scancode-toolkit v21.3.31 scancode-toolkit v21.3.31 2021-04-01T15:21:49+00:00 This is a major version with no breaking API changes. Attention: the next version will bring up some significant API changes summarized in the [CHANGELOG](https://github.com/nexB/scancode-toolkit/blob/fa3e3662868c331232f75a93d4b8a1797bc8ddb5/CHANGELOG.rst#breaking-api-changes). Security: - Update dependency versions for security. 
License scanning: - Add 22 new and update 71 existing reference licenses - Update licenses to include the SPDX license list 3.12 - Improve license detection accuracy with over 2300 new and improved license detection rules - Undeprecate the regexp license and deprecate the hs-regexp-orig license - Improve license db initial load time with caching for faster scancode start time - Ensure that license short names are no more than 50 characters long - Thank you to: - Dennis Clark @DennisClark - Chin-Yeung Li @chinyeungli - Armijn Hemmel @armijnhemel - Sarita Singh @itssingh - Akanksha Garg @akugarg Copyright scanning: - Detect SPDX-FileCopyrightText as defined by the FSFE Reuse project - Fix bug when using the --filter-clues command line option Thank you to Van Lindberg @VanL - Allow calling copyright detection from text lines to ease integration Thank you to Jelmer Vernooij @jelmer Package scanning: - Add support for installed RPMs detection internally (not wired to scans) Thank you to Chin-Yeung Li @chinyeungli - Improve handling of Debian copyright files with faster and more accurate license detection Thank you to Thomas Druez @tdruez - Add new built-in support for installed_files report. Only available when used as a library. - Improve support for RPM, npm, Debian, build scripts (Bazel) and Go packages Thank you to: - Divyansh Sharma @Divyansh2512 - Jonothan Yang @JonoYang - Steven Esser @majurg - Add new support to collect information from semi-structured Readme files and related metadata files. Thank you to: - Jonothan Yang @JonoYang - Steven Esser @majurg Outputs: - Add new Debian copyright-formatted output. Thank you to Jelmer Vernooij @jelmer - Fix bug in --include where directories were not skipped correctly Thank you to Pierre Tardy @tardyp Misc. 
and documentation improvements: - Update the way tests assertions are made Thank you to Aditya Viki @adityaviki - Thank you to Aryan Kenchappagol @aryanxk02 The sources of third-party dependencies are available for download here in https://github.com/nexB/thirdparty-packages/ and in https://github.com/nexB/scancode-plugins. 2021-04-01T15:21:49+00:00 scancode-toolkit v21.6.7 scancode-toolkit v21.6.7 2021-06-08T08:34:53+00:00 This is a major new release with important security and bug fixes, as well as significant improvements in license detection. Many thanks to all the contributors who made this possible and in particular: - Akanksha Garg @akugarg - Ayan Sinha Mahapatra @AyanSinhaMahapatra - Dennis Clark @DennisClark - François Granade @farialima - Hanna Modica @hanna-modica - Jelmer Vernooij @jelmer - Jono Yang @JonoYang - Konrad Weihmann @priv-kweihmann - Philippe Ombredanne @pombredanne - Pierre Tardy @tardyp - Sarita Singh @itssingh - Sebastian Thomas @sebathomas - Steven Esser @majurg - Till Jaeger @LeChasseur - Thomas Druez @tdruez ### Breaking API changes: - The configure scripts for Linux, macOS and Windows have been entirely refactored and should be considered as new. These are now only native scripts (.bat on Windows and .sh on POSIX) and the Python script etc/configure.py has been removed. Use the PYTHON_EXECUTABLE environment variable to point to an alternative non-default Python executable; this works on all OSes. ### Security updates: - Update minimum versions and pinned version of thirdparty dependencies to benefit from latest improvements and security fixes. 
This includes in particular these issues: - pkg:pypi/pygments: (low severity, limited impact) CVE-2021-20270, CVE-2021-27291 - pkg:pypi/lxml: (low severity, likely no impact) CVE-2021-28957 - pkg:pypi/nltk: (low severity, likely no impact) CVE-2019-14751 - pkg:pypi/jinja2: (low severity, likely no impact) CVE-2020-28493, CVE-2019-10906 - pkg:pypi/pycryptodome: (high severity) CVE-2018-15560 (dropped since no longer used by pdfminer) ### Outputs: - The JSON output packages section has a new "extra_data" attribute, which is a JSON object that can contain arbitrary data that are specific to a package type. ### License detection: - The SPDX license list has been updated to 3.13 - Add 42 new and update 45 existing licenses. - Over 14,300 new and improved license detection rules have been added. A large number of these (~13,400) are to avoid false positive detection. ### Copyright detection: - Improved speed and fixed some timeout issues. Fixed minor misc. bugs. - Allow calling copyright detection from text lines to ease integration. ### Package detection: - A new "extra_data" dictionary is now part of the "packages" data in the returned JSON. This is used to store arbitrary type-specific data that cannot fit in the Package data structure. - The Debian copyright files license detection has been reworked and significantly improved. - The PyPI package detection and manifest parsing has been reworked and significantly improved. - The detection of Windows executables and DLLs metadata has been enabled. These metadata are returned as packages. ### Other: - Most third-party libraries have been updated to their newer versions. Some dependency constraints have been relaxed to help some usage as a library. - The on-commit CI tests now validate that we can install from PyPI without problem. - Fix several installation issues. - Add new function to detect copyrights from lines. 
2021-06-08T08:34:53+00:00 scancode-toolkit v21.7.30 scancode-toolkit v21.7.30 2021-07-30T22:32:55+00:00 This is a minor release with several bug fixes, major performance improvements and support for new and improved package formats. Many thanks to all the contributors who made this possible, and in particular: - Abhigya Verma @abhi27-web - Ayan Sinha Mahapatra @AyanSinhaMahapatra - Dennis Clark @DennisClark - Jono Yang @JonoYang - Mayur Agarwal @mrmayurgithub - Philippe Ombredanne @pombredanne - Pierre Tardy @tardyp ## Key changes: ### Outputs: - Add new YAML-formatted output. This is exactly the same data structure as for the JSON output - Add new Debian machine readable copyright output. - The CSV output "Resource" column has been renamed to "path". - The SPDX output now has the mandatory DocumentNamespace attribute per SPDX specs #2344 ### Copyright detection: - The copyright detection speed has been significantly improved, with the tests taking roughly 1/2 of the time to run. This is achieved mostly by replacing NLTK with the minimal, simplified subset we need, in a new library named pygmars. ### License detection: - Add new licenses: now tracking 1763 licenses - Add new license detection rules: now tracking 29475 license detection rules - We have also improved license expression parsing and processing ### Package detection: - The Debian packages declared license detection has been significantly improved. - The Alpine packages declared license detection has been significantly improved. - There is new support for shell parsing and Alpine packages APKBUILD data collection. - There is new support for various Windows packages detection using multiple techniques including MSI, Windows registry and several more. - There is new support for Distroless Debian-like installed packages. - There is new support for Dart Pub package manifests.
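The rename of the CSV "Resource" column to "path" noted under Outputs can break scripts that read older and newer CSV files side by side. A minimal sketch of a reader that tolerates both headers follows; the helper name `resource_paths`, the `note` column and the sample rows are hypothetical and only serve the example.

```python
import csv
import io


def resource_paths(csv_text):
    """Return the resource paths from CSV text, accepting either column name."""
    reader = csv.DictReader(io.StringIO(csv_text))
    fields = reader.fieldnames or []
    if "path" in fields:
        column = "path"  # header used after the rename
    elif "Resource" in fields:
        column = "Resource"  # header used before the rename
    else:
        raise ValueError("no 'path' or 'Resource' column found")
    return [row[column] for row in reader]


# Hypothetical sample data: same rows, old and new header.
OLD_STYLE = "Resource,note\nsrc/a.c,first\nsrc/b.c,second\n"
NEW_STYLE = "path,note\nsrc/a.c,first\nsrc/b.c,second\n"

if __name__ == "__main__":
    print(resource_paths(OLD_STYLE))
    print(resource_paths(NEW_STYLE))
```

Preferring "path" when both headers could appear keeps the reader aligned with the newer format.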
2021-07-30T22:32:55+00:00 scancode-toolkit v21.8.4 scancode-toolkit v21.8.4 2021-08-05T17:50:40+00:00 This is a minor bug fix release primarily for Windows installation. There is no feature change. ## Installation: - Application installation on Windows works again. This fixes #2610 - We now build and test app bundles on all supported Python versions: 3.6 to 3.9 Thank you to @gunaztar for reporting the #2610 bug ## Documentation: - Documentation is updated to reference supported Python versions 3.6 to 3.9 2021-08-05T17:50:40+00:00 scancode-toolkit v30.1.0 scancode-toolkit v30.1.0 2021-09-26T20:20:52+00:00 This is a bug fix release for these bugs: - https://github.com/nexB/scancode-toolkit/issues/2717 We now return the package in the summaries as before. There is also a minor API change: we no longer return a count of "null" empty values in the summaries for license, copyrights, etc. Thank you to: - Thomas Druez @tdruez See also https://github.com/nexB/scancode-toolkit/tree/v30.0.0 for details on the main changes in v30.0.x ## What's Changed * Prepare bugfix release 30.0.1 #2713 by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/2715 * Return package details in summary #2717 by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/2718 **Full Changelog**: https://github.com/nexB/scancode-toolkit/compare/v30.0.1...v30.1.0 2021-09-26T20:20:52+00:00 scancode-toolkit v31.0.0b3 scancode-toolkit v31.0.0b3 2022-04-30T17:50:30+00:00 This is a beta release for the upcoming 31 release. v31 is a major release with many new features, and several bug fixes and improvements including major updates to the package and dependency collection and to the license detection. See https://github.com/nexB/scancode-toolkit/blob/v31.0.0b3/CHANGELOG.rst for an overview of the changes. Please try this release and report any installation issues so we can work towards a stable 31. Thank you! 
## What's Changed * Report `packages` at top level with file level `package_manifests` by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/2710 * Updated install.rst by @beastrun12j in https://github.com/nexB/scancode-toolkit/pull/2722 * Omnibus fall license improvements by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/2706 * Improve license detection by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/2737 * api.get_licenses: clarify and improve docstring for "min_score" argument by @zacchiro in https://github.com/nexB/scancode-toolkit/pull/2763 * rules with "unqualified" license names are references, not notices by @petergardfjall in https://github.com/nexB/scancode-toolkit/pull/2759 * Fix invalid license yaml files by resolving duplicated keys by @fangxlmr in https://github.com/nexB/scancode-toolkit/pull/2776 * Fix azure pipeline vmimage deprecations by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/2775 * Allow license rules to require the presence of certain defining keywords by @mrombout in https://github.com/nexB/scancode-toolkit/pull/2773 * Add first draft ROADMAP by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/2736 * Add CycloneDx output option by @agschrei in https://github.com/nexB/scancode-toolkit/pull/2698 * Remove regular expression futurewarning by @soimkim in https://github.com/nexB/scancode-toolkit/pull/2788 * fix docstring in debian_copyright.py by @adii21-Ux in https://github.com/nexB/scancode-toolkit/pull/2786 * fixes missing whitespace in prerequisites list by @altsalt in https://github.com/nexB/scancode-toolkit/pull/2778 * Add PackageManifest Class by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/2748 * Add new licenses and new detection rules by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/2765 * Rename first column of csv output to "path" by @JRavi2 in https://github.com/nexB/scancode-toolkit/pull/2016 * Detect 
unknown licenses #1675 by @akugarg in https://github.com/nexB/scancode-toolkit/pull/2592 * Improve copyright handling #2350 by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/2791 * Fixing OSI identifier for BSD-3-Clause; see also SPDX license metadata by @karsten-klein in https://github.com/nexB/scancode-toolkit/pull/2797 * Fix GPL license detection false positive #2793 by @KevinJi22 in https://github.com/nexB/scancode-toolkit/pull/2799 * 2789 inconsistent doc html app by @kunalchhabra37 in https://github.com/nexB/scancode-toolkit/pull/2795 * Fixed inconsistency in --html-app FILE in cli-reference by @maynaS in https://github.com/nexB/scancode-toolkit/pull/2790 * Replace freenode references with libera chat by @purna135 in https://github.com/nexB/scancode-toolkit/pull/2816 * Adopt nexB/skeleton and bump dependencies by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/2818 * Fix bug recognizing license as license_notice instead of license_text by @adii21-Ux in https://github.com/nexB/scancode-toolkit/pull/2817 * Fix incorrect license detection #2777 by @KevinJi22 in https://github.com/nexB/scancode-toolkit/pull/2811 * Remove skeleton from docs by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/2830 * Detect SPDX-FileContributor tags as authors by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/2838 * New license and copyright rule by @adii21-Ux in https://github.com/nexB/scancode-toolkit/pull/2837 * Add key phrase tags to GPL detection rule by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/2821 * Make --version output valid YAML for parsing #2856 by @KevinJi22 in https://github.com/nexB/scancode-toolkit/pull/2858 * Add Direct Note for Windows Users (New Comers) by @OsmiumOP in https://github.com/nexB/scancode-toolkit/pull/2857 * Fixed Typo in Documentation by @OsmiumOP in https://github.com/nexB/scancode-toolkit/pull/2862 * Remove version check locally by @adii21-Ux in 
https://github.com/nexB/scancode-toolkit/pull/2860 * License improvement winter 2022 by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/2828 * Update link to documentation by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/2867 * Improve license detection by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/2871 * Detect dependencies from build.gradle files by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/2822 * Fix small typo inside notes snippet by @Harshil-Jani in https://github.com/nexB/scancode-toolkit/pull/2829 * Add Package Instances #2691 by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/2825 * Improve license clarity scoring by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/2875 * Do not raise exception on package data mismatch #2886 by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/2887 * Release 31 by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/2888 * Add primary license in summary by @JonoYang in https://github.com/nexB/scancode-toolkit/pull/2884 * Remove usage of get_terminal_size in click by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/2916 * Fix doc builds by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/2896 * Update summary plugin by @JonoYang in https://github.com/nexB/scancode-toolkit/pull/2914 * Shorten long file names by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/2918 * Added new copyright test cases by @abhishak3 in https://github.com/nexB/scancode-toolkit/pull/2891 * Add system packages support in the new packages model by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/2909 * Fix typo in summary: ambigous->ambiguous by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/2922 * Add system environment to scan headers by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/2923 * Update METADATA.bzl 
parser by @JonoYang in https://github.com/nexB/scancode-toolkit/pull/2924 * Spring 2022 license updates by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/2921 * Process single package data file correctly by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/2933 * Fix package/dependency creation bugs by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/2932 ## New Contributors * @beastrun12j made their first contribution in https://github.com/nexB/scancode-toolkit/pull/2722 * @zacchiro made their first contribution in https://github.com/nexB/scancode-toolkit/pull/2763 * @fangxlmr made their first contribution in https://github.com/nexB/scancode-toolkit/pull/2776 * @mrombout made their first contribution in https://github.com/nexB/scancode-toolkit/pull/2773 * @agschrei made their first contribution in https://github.com/nexB/scancode-toolkit/pull/2698 * @soimkim made their first contribution in https://github.com/nexB/scancode-toolkit/pull/2788 * @adii21-Ux made their first contribution in https://github.com/nexB/scancode-toolkit/pull/2786 * @altsalt made their first contribution in https://github.com/nexB/scancode-toolkit/pull/2778 * @karsten-klein made their first contribution in https://github.com/nexB/scancode-toolkit/pull/2797 * @KevinJi22 made their first contribution in https://github.com/nexB/scancode-toolkit/pull/2799 * @kunalchhabra37 made their first contribution in https://github.com/nexB/scancode-toolkit/pull/2795 * @maynaS made their first contribution in https://github.com/nexB/scancode-toolkit/pull/2790 * @purna135 made their first contribution in https://github.com/nexB/scancode-toolkit/pull/2816 * @OsmiumOP made their first contribution in https://github.com/nexB/scancode-toolkit/pull/2857 * @Harshil-Jani made their first contribution in https://github.com/nexB/scancode-toolkit/pull/2829 * @abhishak3 made their first contribution in https://github.com/nexB/scancode-toolkit/pull/2891 **Full Changelog**: 
https://github.com/nexB/scancode-toolkit/compare/v30.1.0...v31.0.0b3 2022-04-30T17:50:30+00:00 scancode-toolkit v31.0.0b4 scancode-toolkit v31.0.0b4 2022-05-10T18:46:37+00:00 This is a beta release for the upcoming 31 release. v31 is a major release with many new features, and several bug fixes and improvements including major updates to the package and dependency collection and to the license detection. Several bugs have been fixed compared with b3. See https://github.com/nexB/scancode-toolkit/blob/v31.0.0b4/CHANGELOG.rst for an overview of the changes. Please try this release and report any installation issues so we can work towards a stable 31. Thank you! ## What's Changed * Populate for packages field correctly #2929 by @JonoYang in https://github.com/nexB/scancode-toolkit/pull/2939 * Prepare Release 31b4 by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/2941 * Duplicated dependencies package results by @JonoYang in https://github.com/nexB/scancode-toolkit/pull/2944 * Prepare Release 31b4 by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/2947 **Full Changelog**: https://github.com/nexB/scancode-toolkit/compare/v31.0.0b3...v31.0.0b4 2022-05-10T18:46:37+00:00 scancode-toolkit v31.0.0b5 scancode-toolkit v31.0.0b5 2022-05-17T23:45:04+00:00 This is a beta release for the upcoming 31 release. v31 is a major release with many new features, and several bug fixes and improvements including major updates to the package and dependency collection and to the license detection. Several bugs have been fixed compared with b4. See https://github.com/nexB/scancode-toolkit/blob/v31.0.0b5/CHANGELOG.rst for an overview of the changes in v31 compared to v30. Please try this release and report any installation issues so we can work towards a stable 31. Thank you!
## What's Changed * Add link to scancode-toolkit-reference-scans by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/2952 * Modify pypi PKG-INFO parse by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/2953 * Prepare Release 31.b5 by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/2962 **Full Changelog**: https://github.com/nexB/scancode-toolkit/compare/v31.0.0b4...v31.0.0b5 2022-05-17T23:45:04+00:00 scancode-toolkit v31.0.0rc1 scancode-toolkit v31.0.0rc1 2022-06-13T22:56:20+00:00 This is a release candidate for the upcoming 31 release. v31 is a major release with many new features, and several bug fixes and improvements including major updates to the package and dependency collection and to the license detection. Several bugs have been fixed when compared with 31.0.0b5. See https://github.com/nexB/scancode-toolkit/blob/v31.0.0rc1/CHANGELOG.rst for an overview of the changes in v31 compared to v30. Please try this release and report any installation issues so we can work towards a stable 31. Thank you! 
## What's Changed * Add black and isort as testing dependencies #2969 by @johnmhoran in https://github.com/nexB/scancode-toolkit/pull/2970 * Rename precise_license_detection field #2967 by @JonoYang in https://github.com/nexB/scancode-toolkit/pull/2968 * Convert package data dict to PackageData #2971 by @JonoYang in https://github.com/nexB/scancode-toolkit/pull/2973 * Update extractcode --shallow option description by @lf32 in https://github.com/nexB/scancode-toolkit/pull/2959 * Support shortcut flags for cli by @lf32 in https://github.com/nexB/scancode-toolkit/pull/2951 * Consider only copyrights in summry #2972 by @JonoYang in https://github.com/nexB/scancode-toolkit/pull/2974 * Reimplement get installed packages by @JonoYang in https://github.com/nexB/scancode-toolkit/pull/2988 * Report extracted_requirement correctly by @TG1999 in https://github.com/nexB/scancode-toolkit/pull/2984 * Improve packagecode and other release prep by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/2992 ## New Contributors * @lf32 made their first contribution in https://github.com/nexB/scancode-toolkit/pull/2959 **Full Changelog**: https://github.com/nexB/scancode-toolkit/compare/v31.0.0b5...v31.0.0rc1 2022-06-13T22:56:20+00:00 scancode-toolkit v31.0.0rc2 scancode-toolkit v31.0.0rc2 2022-06-16T19:40:58+00:00 This is a release candidate for the upcoming 31 release. v31 is a major release with many new features, and several bug fixes and improvements including major updates to the package and dependency collection and to the license detection. Several bugs have been fixed when compared with 31.0.0rc1. See https://github.com/nexB/scancode-toolkit/blob/v31.0.0rc2/CHANGELOG.rst for an overview of the changes in v31 compared to v30. Please try this release and report any installation issues so we can work towards a stable 31. Thank you! 
## What's Changed * Improve npm package processing by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/2997 * Update license detection by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/2998 * Add new license rules and license - Early summer 2022 by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/2999 * Bump version to 31.0.0rc2 by @JonoYang in https://github.com/nexB/scancode-toolkit/pull/3000 **Full Changelog**: https://github.com/nexB/scancode-toolkit/compare/v31.0.0rc1...v31.0.0rc2 2022-06-16T19:40:58+00:00 scancode-toolkit v31.0.0rc3 scancode-toolkit v31.0.0rc3 2022-07-28T15:31:42+00:00 This is a penultimate release candidate for the upcoming 31 release. v31 is a major release with many new features, and several bug fixes and improvements including major updates to the package and dependency collection and to the license detection. Several bugs have been fixed when compared with 31.0.0rc2. See https://github.com/nexB/scancode-toolkit/blob/v31.0.0rc3/CHANGELOG.rst for an overview of the changes in v31 compared to v30. Please try this release and report any installation issues so we can work towards a stable 31. Thank you! 
## What's Changed * Do not fail without packages in cyclonedx #2987 by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3005 * Fix relaunching scancode on Apple silicon using Rosetta 2 emulation #2835 by @MarcelBochtler in https://github.com/nexB/scancode-toolkit/pull/3018 * Clarify `unknown` license keys #2827 by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3023 * Yield Packages before other yieldables #3028 by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/3031 * Prepare Release 31.0.0rc3 by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/3029 ## New Contributors * @MarcelBochtler made their first contribution in https://github.com/nexB/scancode-toolkit/pull/3018 **Full Changelog**: https://github.com/nexB/scancode-toolkit/compare/v31.0.0rc2...v31.0.0rc3 2022-07-28T15:31:42+00:00 scancode-toolkit v31.0.0rc5 scancode-toolkit v31.0.0rc5 2022-08-02T17:21:55+00:00 This is one of the last release candidates for the upcoming 31 release. v31 is a major release with many new features, and several bug fixes and improvements including major updates to the package and dependency collection and to the license detection. Several bugs have been fixed compared with 31.0.0rc3, in particular the ability to properly report licenses in system package scans. See https://github.com/nexB/scancode-toolkit/blob/v31.0.0rc5/CHANGELOG.rst for an overview of the changes in v31 compared to v30. Please try this release and report any installation issues so we can work towards a stable 31. Thank you!
## What's Changed since 31 rc3 * Release 31 rc4 prep by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/3036 * Add package_adder argument to assemble() #3034 by @JonoYang in https://github.com/nexB/scancode-toolkit/pull/3035 **Full Changelog**: https://github.com/nexB/scancode-toolkit/compare/v31.0.0rc3...v31.0.0rc5 2022-08-02T17:21:55+00:00 scancode-toolkit v31.0.1 scancode-toolkit v31.0.1 2022-08-18T06:36:09+00:00 This is a major release with important bug and security fixes, new and improved features and API changes. Note that we no longer support Python 3.6. Use Python 3.7+ instead. Important API changes: ======================== - The data structure of the JSON output has changed for copyrights, authors and holders. We now use a proper name for attributes and not a generic "value". - The data structure of the JSON output has changed for packages. We now return "package_data" package information at the manifest file-level rather than "packages". This has all the data attributes of a "package_data" field plus others: "package_uuid", "package_data_files" and "files". - There is a new top-level "packages" attribute that contains package instances that can be aggregating data from multiple manifests. - There is a new top-level "dependencies" attribute that contains each dependency instance; these can be standalone or related to a package. These contain a new "extra_data" object. - There is a new resource-level attribute "for_packages" which refers to packages through package_uuids (pURL + uuid string). - The data structure for HTML output has been changed to include emails and urls under the "infos" object. The HTML template displays output for holders, authors, emails, and urls into separate tables like "licenses" and "copyrights". - The data structure for CSV output has been changed to rename the Resource column to "path". "copyright_holder" has been renamed to "holder".
The CSV output is deprecated and will be replaced in the future by an improved tabular format. - The license clarity scoring plugin has been overhauled to show new license clarity criteria. More details of the new scoring criteria are provided below. - The functionality of the summary plugin has been improved to provide declared origin and license information for the codebase being scanned. The previous summary plugin functionality has been preserved in the new ``tallies`` plugin. More details are provided below. - ScanCode has adopted the new code skeleton from https://github.com/nexB/skeleton The key change is the location of the virtual environment. It used to be created at the root of the scancode-toolkit directory. It is now created under the ``venv`` subdirectory. You must be aware of this if you use ScanCode from a git clone - ``DatafileHandler.assemble()``, ``DatafileHandler.assemble_from_many()``, and the other ``.assemble()`` methods from the other Package handlers from packagedcode, have been updated to yield Package items before Dependency or Resource items. This is particularly important in the case where we are calling the ``assemble()`` method outside of the scancode-toolkit context, where we need to ensure that a Package exists before we associate a Resource or Dependency to it. Copyright detection: ==================== - The data structure in the JSON is now using consistently named attributes as opposed to plain values. - Several copyright detection bugs have been fixed. - French and German copyright detection is improved. - Some spurious trailing dots in holders are not stripped. License detection: =================== - There have been significant license detection rules and licenses updates: - 107 new licenses have been added (total is now 1954) - 6780 new license detection rules have been added (total is now 32259) - 6753 existing false positive license rules have been removed (see below).
- The SPDX license list has been updated to the latest v3.17 - The rule attribute "only_known_words" has been renamed to "is_continuous" and its meaning has been updated and expanded. A rule tagged as "is_continuous" can only be matched if there are no gaps between matched words, be they stopwords, extra unknown or known words. This improves several false positive license detections. The processing for "is_continuous" has been merged in "key phrases" processing below. - Key phrases can now be defined in a RULE text by surrounding one or more words with double curly braces `{{` and `}}`. When defined, a RULE will only match when the key phrases match exactly. When all the text of a rule is a "key phrase", this is the same as being "is_continuous". - The "--unknown-licenses" option now also detects unknown licenses using a simple and effective ngrams-based matching in areas that are not matched or weakly matched. This helps detect things that look like a license but are not yet known as licenses. - False positive detection of "license lists" like the lists seen in license and package management tools has been entirely reworked. Rather than using thousands of small false positive rules, there is a new filter to detect a long run of license references and tags that is typical of license lists. As a result, thousands of rules have been replaced by a simpler filter, and the license detection is more accurate, faster and has fewer false positives. - The new license flag "is_generic" tags licenses that are "generic" licenses such as "other-permissive" or "other-copyleft". This is not yet returned in the JSON API. - When scanning binary files, the detection of single word rules is filtered when surrounded by gibberish or mixed case. For instance `$#%$GpL$` is a false positive and is no longer reported. - Several rules were incorrectly tagged as is_license_notice but were references, and have been requalified as is_license_reference.
All rules made of a single word have been requalified as is_license_reference if they were not qualified this way. - Matches to small license rules (with small defined as under 15 words) that are scattered over too many lines are now filtered as false matches. - Small, two-word matches that overlap the previous or next match by the word "license" (or similar words) are now filtered as false matches. - The new --licenses-reference option adds a new "licenses_reference" top level attribute to a scan when using the JSON and YAML outputs. This contains all the details and the full text of every license seen in a file or package license expression of a scan. This can be added after the fact using the --from-json option. - New experimental support for non-English licenses. Use the command ./scancode --reindex-licenses-for-all-languages to index all known non-English licenses and rules. From that point on, they will be detected. Because of this some licenses that were not tagged with their languages are now correctly tagged and they may not be detected unless you activate this new indexing feature. Package detection: ================== - Major changes in package detection and reporting: a codebase-level attribute `packages` with one or more `package_data` and the files for the packages are now reported. The specific changes made are: - The resource level attribute `packages` has been renamed to `package_data`, as these are really package data that are being detected, such as manifests, lockfiles or other package data. This has the data attributes of a `package_data` field plus others: `package_uuid`, `package_data_files` and `files`. - A new top-level attribute `packages` has been added which contains package instances created from `package_data` detected in the codebase. - A new codebase level attribute `dependencies` has been added which contains dependency instances created from lockfiles detected in the codebase.
- The package attribute `root_path` has been deleted from `package_data` in favour of the new format where there is no root conceptually, just a list of files for each package. - There is a new resource-level attribute `for_packages` which refers to packages through package_uids (pURL + uuid string). A `package_adder` function is now used to associate a Package to a Resource that is part of it. This gives us the flexibility to use the packagedcode Package handlers in other contexts where `for_packages` on Resource is not implemented in the same way as scancode-toolkit. - The package_data attribute `dependencies` (which is a list of DependentPackages), now has a new attribute `resolved_package` with a package data mapping. Also the `requirement` attribute is renamed to `extracted_requirement`. There is a new `extra_data` to collect extra data as needed. - For Pypi packages, python_requires is treated as a package dependency. License Clarity Scoring Update: =============================== - We are moving away from the original license clarity scoring designed for ClearlyDefined in the license clarity score plugin. The previous license clarity scoring logic produced a score that was misleading when it would return a low score due to the stringent scoring criteria. We are now using more general criteria to get a sense of what provenance information has been provided and whether or not there is a conflict in licensing between what licenses were declared at the top-level key files and what licenses have been detected in the files under the top-level. - The license clarity score is a value from 0-100 calculated by combining the weighted values determined for each of the scoring elements: - Declared license: - When true, indicates that the software package licensing is documented at top-level or well-known locations in the software project, typically in a package manifest, NOTICE, LICENSE, COPYING or README file. 
- Scoring Weight = 40 - Identification precision: - Indicates how well the license statement(s) of the software identify known licenses that can be designated by precise keys (identifiers) as provided in a publicly available license list, such as the ScanCode LicenseDB, the SPDX license list, the OSI license list, or a URL pointing to a specific license text in a project or organization website. - Scoring Weight = 40 - License texts: - License texts are provided to support the declared license expression in files such as a package manifest, NOTICE, LICENSE, COPYING or README. - Scoring Weight = 10 - Declared copyright: - When true, indicates that the software package copyright is documented at top-level or well-known locations in the software project, typically in a package manifest, NOTICE, LICENSE, COPYING or README file. - Scoring Weight = 10 - Ambiguous compound licensing: - When true, indicates that the software has a license declaration that makes it difficult to construct a reliable license expression, such as in the case of multiple licenses where the conjunctive versus disjunctive relationship is not well defined. - Scoring Weight = -10 - Conflicting license categories: - When true, indicates that the declared license expression of the software is in the permissive category, but that other potentially conflicting categories, such as copyleft and proprietary, have been detected in lower level code. - Scoring Weight = -20 Summary Plugin Update: ====================== - The summary plugin's behavior has been changed. Previously, it provided a count of the detected license expressions, copyrights, holders, authors, and programming languages from a scan. We have preserved this functionality by creating a new plugin called ``tallies``. All functionality of the previous summary plugin have been preserved in the tallies plugin. 
- The new summary plugin now attempts to determine a declared license expression, declared holder, and the primary programming language from a scan. And the updated license clarity score provides context on the quality of the license information provided in the codebase key files. - The new summary plugin also returns lists of tallies for the other "secondary" detected license expressions, copyright holders, and programming languages. All summary information is provided at the codebase-level attribute named ``summary``. Outputs: ======== - Added new outputs for the CycloneDx format. The CLI now exposes options to produce CycloneDx BOMs in either JSON or XML format - A new field ``warnings`` has been added to the headers of ScanCode toolkit output that contains any warning messages that occur during a scan. - The CSV output format --csv option is now deprecated. It will be replaced by new CSV and tabular output formats in the next ScanCode release. Visit https://github.com/nexB/scancode-toolkit/issues/3043 to provide inputs and feedback. Output version -------------- Scancode Data Output Version is now 2.0.0. Changes: - Rename resource level attribute `packages` to `package_data`. - Add top-level attribute `packages`. - Add top-level attribute `dependencies`. - Add resource-level attribute `for_packages`. - Remove `package-data` attribute `root_path`. - The fields of the license clarity scoring plugin have been replaced with the following fields. An overview of the new fields can be found in the "License Clarity Scoring Update" section above. - `score` - `declared_license` - `identification_precision` - `has_license_text` - `declared_copyrights` - `conflicting_license_categories` - `ambigious_compound_licensing` - The fields of the summary plugin have been replaced with the following fields. An overview of the new fields can be found in the "Summary Plugin Update" section above. 
  - `declared_license_expression`
  - `license_clarity_score`
  - `declared_holder`
  - `primary_language`
  - `other_license_expressions`
  - `other_holders`
  - `other_languages`

Documentation Update
========================

- Various documentation files have been updated to reflect API changes and correct minor documentation issues.

Development environment and Code API changes:
==============================================

- The main package API function `get_package_infos` is deprecated, and replaced by `get_package_data`.
- The Resource paths are always the same regardless of the strip-root or full-root arguments.
- The license cache consistency is no longer checked when you are using a git checkout. The SCANCODE_DEV_MODE tag file has been removed entirely. Use the --reindex-licenses option instead to rebuild the license index.
- We can now regenerate test fixtures using the new SCANCODE_REGEN_TEST_FIXTURES environment variable. There is no need to replace the regen=False with regen=True in the code.

Miscellaneous
========================

- Added support for the following shortcut flags:
  - `-A` or `--about`
  - `-q` or `--quiet`
  - `-v` or `--verbose`
  - `-V` or `--version`
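The scoring weights listed in the license clarity scoring notes earlier in this release can be illustrated with a small sketch. It is not ScanCode's implementation: it assumes that the score is the plain sum of the weights of every element that is true, clamped to the 0..100 range, that the first weight of 40 belongs to the `declared_license` element, and that the compound licensing key is spelled `ambiguous_compound_licensing`. The `WEIGHTS` table and the `clarity_score` helper are hypothetical names.

```python
# Hypothetical sketch: weights are taken from the release notes; the
# summing and clamping logic is an assumption, not ScanCode's own code.
WEIGHTS = {
    "declared_license": 40,  # assumed owner of the first listed weight
    "identification_precision": 40,
    "has_license_text": 10,
    "declared_copyrights": 10,
    "ambiguous_compound_licensing": -10,  # key spelling is an assumption
    "conflicting_license_categories": -20,
}


def clarity_score(elements: dict) -> int:
    """Sum the weights of every element that is true, clamped to 0..100."""
    total = sum(weight for name, weight in WEIGHTS.items() if elements.get(name))
    return max(0, min(100, total))


if __name__ == "__main__":
    example = {
        "declared_license": True,
        "identification_precision": True,
        "has_license_text": True,
        "conflicting_license_categories": True,
    }
    print(clarity_score(example))
```

Keeping the weights in a single table makes it easy to compare the sketch against the list in the release notes if the weights change in a later release.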
## What's Changed * Report `packages` at top level with file level `package_manifests` by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/2710 * Updated install.rst by @beastrun12j in https://github.com/nexB/scancode-toolkit/pull/2722 * Omnibus fall license improvements by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/2706 * Improve license detection by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/2737 * api.get_licenses: clarify and improve docstring for "min_score" argument by @zacchiro in https://github.com/nexB/scancode-toolkit/pull/2763 * rules with "unqualified" license names are references, not notices by @petergardfjall in https://github.com/nexB/scancode-toolkit/pull/2759 * Fix invalid license yaml files by resolving duplicated keys by @fangxlmr in https://github.com/nexB/scancode-toolkit/pull/2776 * Fix azure pipeline vmimage deprecations by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/2775 * Allow license rules to require the presence of certain defining keywords by @mrombout in https://github.com/nexB/scancode-toolkit/pull/2773 * Add first draft ROADMAP by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/2736 * Add CycloneDx output option by @agschrei in https://github.com/nexB/scancode-toolkit/pull/2698 * Remove regular expression futurewarning by @soimkim in https://github.com/nexB/scancode-toolkit/pull/2788 * fix docstring in debian_copyright.py by @adii21-Ux in https://github.com/nexB/scancode-toolkit/pull/2786 * fixes missing whitespace in prerequisites list by @altsalt in https://github.com/nexB/scancode-toolkit/pull/2778 * Add PackageManifest Class by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/2748 * Add new licenses and new detection rules by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/2765 * Rename first column of csv output to "path" by @JRavi2 in https://github.com/nexB/scancode-toolkit/pull/2016 * Detect 
unknown licenses #1675 by @akugarg in https://github.com/nexB/scancode-toolkit/pull/2592 * Improve copyright handling #2350 by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/2791 * Fixing OSI identifier for BSD-3-Clause; see also SPDX license metadata by @karsten-klein in https://github.com/nexB/scancode-toolkit/pull/2797 * Fix GPL license detection false positive #2793 by @KevinJi22 in https://github.com/nexB/scancode-toolkit/pull/2799 * 2789 inconsistent doc html app by @kunalchhabra37 in https://github.com/nexB/scancode-toolkit/pull/2795 * Fixed inconsistency in --html-app FILE in cli-reference by @maynaS in https://github.com/nexB/scancode-toolkit/pull/2790 * Replace freenode references with libera chat by @purna135 in https://github.com/nexB/scancode-toolkit/pull/2816 * Adopt nexB/skeleton and bump dependencies by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/2818 * Fix bug recognizing license as license_notice instead of license_text by @adii21-Ux in https://github.com/nexB/scancode-toolkit/pull/2817 * Fix incorrect license detection #2777 by @KevinJi22 in https://github.com/nexB/scancode-toolkit/pull/2811 * Remove skeleton from docs by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/2830 * Detect SPDX-FileContributor tags as authors by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/2838 * New license and copyright rule by @adii21-Ux in https://github.com/nexB/scancode-toolkit/pull/2837 * Add key phrase tags to GPL detection rule by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/2821 * Make --version output valid YAML for parsing #2856 by @KevinJi22 in https://github.com/nexB/scancode-toolkit/pull/2858 * Add Direct Note for Windows Users (New Comers) by @OsmiumOP in https://github.com/nexB/scancode-toolkit/pull/2857 * Fixed Typo in Documentation by @OsmiumOP in https://github.com/nexB/scancode-toolkit/pull/2862 * Remove version check locally by @adii21-Ux in 
https://github.com/nexB/scancode-toolkit/pull/2860 * License improvement winter 2022 by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/2828 * Update link to documentation by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/2867 * Improve license detection by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/2871 * Detect dependencies from build.gradle files by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/2822 * Fix small typo inside notes snippet by @Harshil-Jani in https://github.com/nexB/scancode-toolkit/pull/2829 * Add Package Instances #2691 by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/2825 * Improve license clarity scoring by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/2875 * Do not raise exception on package data mismatch #2886 by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/2887 * Release 31 by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/2888 * Add primary license in summary by @JonoYang in https://github.com/nexB/scancode-toolkit/pull/2884 * Remove usage of get_terminal_size in click by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/2916 * Fix doc builds by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/2896 * Update summary plugin by @JonoYang in https://github.com/nexB/scancode-toolkit/pull/2914 * Shorten long file names by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/2918 * Added new copyright test cases by @abhishak3 in https://github.com/nexB/scancode-toolkit/pull/2891 * Add system packages support in the new packages model by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/2909 * Fix typo in summary: ambigous->ambiguous by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/2922 * Add system environment to scan headers by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/2923 * Update METADATA.bzl 
parser by @JonoYang in https://github.com/nexB/scancode-toolkit/pull/2924 * Spring 2022 license updates by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/2921 * Process single package data file correctly by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/2933 * Fix package/dependency creation bugs by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/2932 * Populate for packages field correctly #2929 by @JonoYang in https://github.com/nexB/scancode-toolkit/pull/2939 * Prepare Release 31b4 by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/2941 * Duplicated dependencies package results by @JonoYang in https://github.com/nexB/scancode-toolkit/pull/2944 * Prepare Release 31b4 by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/2947 * Add link to scancode-toolkit-reference-scans by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/2952 * Modify pypi PKG-INFO parse by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/2953 * Prepare Release 31.b5 by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/2962 * Add black and isort as testing dependencies #2969 by @johnmhoran in https://github.com/nexB/scancode-toolkit/pull/2970 * Rename precise_license_detection field #2967 by @JonoYang in https://github.com/nexB/scancode-toolkit/pull/2968 * Convert package data dict to PackageData #2971 by @JonoYang in https://github.com/nexB/scancode-toolkit/pull/2973 * Update extractcode --shallow option description by @lf32 in https://github.com/nexB/scancode-toolkit/pull/2959 * Support shortcut flags for cli by @lf32 in https://github.com/nexB/scancode-toolkit/pull/2951 * Consider only copyrights in summry #2972 by @JonoYang in https://github.com/nexB/scancode-toolkit/pull/2974 * Reimplement get installed packages by @JonoYang in https://github.com/nexB/scancode-toolkit/pull/2988 * Report extracted_requirement correctly by @TG1999 in 
https://github.com/nexB/scancode-toolkit/pull/2984 * Improve packagecode and other release prep by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/2992 * Improve npm package processing by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/2997 * Update license detection by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/2998 * Add new license rules and license - Early summer 2022 by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/2999 * Bump version to 31.0.0rc2 by @JonoYang in https://github.com/nexB/scancode-toolkit/pull/3000 * Do not fail without packages in cyclonedx #2987 by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3005 * Fix relaunching scancode on Apple silicon using Rosetta 2 emulation #2835 by @MarcelBochtler in https://github.com/nexB/scancode-toolkit/pull/3018 * Clarify `unknown` license keys #2827 by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3023 * Yield Packages before other yieldables #3028 by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/3031 * Prepare Release 31.0.0rc3 by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/3029 * Release 31 rc4 prep by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/3036 * Add package_adder argument to assemble() #3034 by @JonoYang in https://github.com/nexB/scancode-toolkit/pull/3035 * Report proprietary license if key phrase #3039 by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/3041 * Improve release scripts #3040 by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/3046 * Update DatafileHandler default methods by @JonoYang in https://github.com/nexB/scancode-toolkit/pull/3042 * Prepare release 31 by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/3053 ## New Contributors * @beastrun12j made their first contribution in https://github.com/nexB/scancode-toolkit/pull/2722 * @zacchiro made their first contribution in 
https://github.com/nexB/scancode-toolkit/pull/2763 * @fangxlmr made their first contribution in https://github.com/nexB/scancode-toolkit/pull/2776 * @mrombout made their first contribution in https://github.com/nexB/scancode-toolkit/pull/2773 * @agschrei made their first contribution in https://github.com/nexB/scancode-toolkit/pull/2698 * @soimkim made their first contribution in https://github.com/nexB/scancode-toolkit/pull/2788 * @adii21-Ux made their first contribution in https://github.com/nexB/scancode-toolkit/pull/2786 * @altsalt made their first contribution in https://github.com/nexB/scancode-toolkit/pull/2778 * @karsten-klein made their first contribution in https://github.com/nexB/scancode-toolkit/pull/2797 * @KevinJi22 made their first contribution in https://github.com/nexB/scancode-toolkit/pull/2799 * @kunalchhabra37 made their first contribution in https://github.com/nexB/scancode-toolkit/pull/2795 * @maynaS made their first contribution in https://github.com/nexB/scancode-toolkit/pull/2790 * @purna135 made their first contribution in https://github.com/nexB/scancode-toolkit/pull/2816 * @OsmiumOP made their first contribution in https://github.com/nexB/scancode-toolkit/pull/2857 * @Harshil-Jani made their first contribution in https://github.com/nexB/scancode-toolkit/pull/2829 * @abhishak3 made their first contribution in https://github.com/nexB/scancode-toolkit/pull/2891 * @lf32 made their first contribution in https://github.com/nexB/scancode-toolkit/pull/2959 * @MarcelBochtler made their first contribution in https://github.com/nexB/scancode-toolkit/pull/3018 **Full Changelog**: https://github.com/nexB/scancode-toolkit/compare/v30.1.0...v31.0.1 2022-08-18T06:36:09+00:00 scancode-toolkit v31.0.2 scancode-toolkit v31.0.2 2022-08-25T20:40:59+00:00 This is minor release with minor bug fixes and feature improvements. 
## What's Changed * Improve license detection with rules and licenses by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/3030 * Fix issues in `PythonInstalledWheelMetadataFile.assign_package_to_resources()` by @JonoYang in https://github.com/nexB/scancode-toolkit/pull/3062 * Add new and improved licenses and rules - summer 2022 by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/3064 * Prepare release 31.0.2 by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/3065 **Full Changelog**: https://github.com/nexB/scancode-toolkit/compare/v31.0.1...v31.0.2 2022-08-25T20:40:59+00:00 scancode-toolkit v31.1.0 scancode-toolkit v31.1.0 2022-08-29T13:20:18+00:00 v31.1.0 - 2022-08-29 ---------------------------------- This is a minor release with critical bug fixes and minor updates. - Fix a critical bug in license detection ## What's Changed * Hot fix for license scan failure #3067 by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/3070 **Full Changelog**: https://github.com/nexB/scancode-toolkit/compare/v31.0.2...v31.1.0 2022-08-29T13:20:18+00:00 scancode-toolkit v31.1.1 scancode-toolkit v31.1.1 2022-09-02T13:15:53+00:00 This is a minor release with a bug fix. - Do not display tracing/debug outputs at runtime reported by @soimkim **Full Changelog**: https://github.com/nexB/scancode-toolkit/compare/v31.1.0...v31.1.1 2022-09-02T13:15:53+00:00 scancode-toolkit v31.2.1 scancode-toolkit v31.2.1 2022-10-05T13:18:04+00:00 This is a minor release with small bug fixes and minor feature updates. 
- Update SPDX license list to 3.18 - Improve how we discard license matches that are "gibberish" - And new and improve existing license and license detection rules ## What's Changed * Prepare Release 31.1.1 by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/3085 * Process Gemfile.lock processing #3072 by @JonoYang in https://github.com/nexB/scancode-toolkit/pull/3090 * Prefer using PKG-INFO from .egg-info in assemble #3083 by @JonoYang in https://github.com/nexB/scancode-toolkit/pull/3091 * Correct purl type for cocoapods #3081 by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3096 * Fixed restructuredtext bulleted list to use * by @bwjohnson-ss in https://github.com/nexB/scancode-toolkit/pull/3116 * Restore license texts of deprecated licenses by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/3101 * GitHub Workflows security hardening by @sashashura in https://github.com/nexB/scancode-toolkit/pull/3117 * Update plugins docs and fix links by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3110 * Replace gemfileparser with gemfileparser2 by @JonoYang in https://github.com/nexB/scancode-toolkit/pull/3098 * Yield package before assigning to resource by @JonoYang in https://github.com/nexB/scancode-toolkit/pull/3115 * Fix summary holder bug by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3114 * Improve Author Detection by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3119 * Prepare release 31.2 by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/3104 ## New Contributors * @bwjohnson-ss made their first contribution in https://github.com/nexB/scancode-toolkit/pull/3116 * @sashashura made their first contribution in https://github.com/nexB/scancode-toolkit/pull/3117 **Full Changelog**: https://github.com/nexB/scancode-toolkit/compare/v31.1.1...v31.2.1 2022-10-05T13:18:04+00:00 scancode-toolkit v31.2.3 scancode-toolkit v31.2.3 
2022-12-24T08:13:44+00:00

This is a bugfix release. There is a fix for an installation issue with the new "packaging" version 22.0. It is replaced by a fork named "packvers" to work around https://github.com/pypa/packaging/issues/530 and provide an emergency fix for #3171 and #3177.

We updated these dependencies:

- https://github.com/nexB/pip-requirements-parser
- https://github.com/nexB/dparse2/

With the new:

- https://github.com/nexB/packvers/

We also improved the compatibility of pre-built wheels and now build one wheel for each Python version to work around a Python pickle bug. We pinned SPDX tools to cope with the upcoming API breaking changes.

**Full Changelog**: https://github.com/nexB/scancode-toolkit/compare/v31.2.1...v31.2.3

2022-12-24T08:13:44+00:00 scancode-toolkit v31.2.4 scancode-toolkit v31.2.4 2023-01-11T10:15:30+00:00

This is a bugfix release. There is a fix for a license index issue caused by the new "attrs" version 22.2.0: objects pickled with the previous version of attrs (the pickled index) cannot be unpickled with newer versions. We have vendored attrs using [vendorize](https://github.com/mwilliamson/python-vendorize) for use in the license index such that it isn't impacted by new package versions.

See more details at https://github.com/nexB/scancode-toolkit/issues/3179

**Full Changelog**: https://github.com/nexB/scancode-toolkit/compare/v31.2.3...v31.2.4

2023-01-11T10:15:30+00:00 scancode-toolkit v32.0.0rc1 scancode-toolkit v32.0.0rc1 2023-01-22T17:52:10+00:00

This is a major new release with API breaking changes. v32.0.0rc1 is the first release candidate and we expect to have a few more.

## Important API changes:

This is a major release with major API and output format changes and significant feature updates. In particular, we changed the output format for the licenses and packages, and we changed some of the command line options. The output format version is now 3.0.0.
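Since the output format version moves to 3.0.0, a consumer of ScanCode JSON output may want to check the version before reading license fields. The following is a minimal sketch under stated assumptions: it assumes that the scan was loaded into a Python dict and that the version is stored under an `output_format_version` key in the entries of a top-level `headers` list. These key names are not confirmed by these notes, and `output_format_major` and `uses_license_detections` are hypothetical helper names.

```python
def output_format_major(scan: dict) -> int:
    """Return the major output format version found in the scan headers.

    Assumes a top-level "headers" list whose entries may carry an
    "output_format_version" string such as "3.0.0".
    """
    for header in scan.get("headers", []):
        version = header.get("output_format_version")
        if version:
            # Keep only the part before the first dot: "3.0.0" -> 3
            return int(str(version).split(".")[0])
    raise ValueError("no output_format_version found in scan headers")


def uses_license_detections(scan: dict) -> bool:
    """True when the scan is expected to use the 3.x license data layout."""
    return output_format_major(scan) >= 3
```

A caller could branch on `uses_license_detections(scan)` to decide whether to read the older `licenses` attribute or the newer `license_detections` attribute.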
## Package detection:

- Update ``GemfileLockParser`` to track the gem which the Gemfile.lock is for, which we assign to the new ``GemfileLockParser.primary_gem`` field. Update ``GemfileLockHandler.parse()`` to handle the case where there is a primary gem detected from a Gemfile.lock. If there is a primary gem, a single ``Package`` is created and the detected gem data within the Gemfile.lock are assigned as dependencies. If there is no primary gem, then all of the dependencies are collected into a Package with no name and yielded. https://github.com/nexB/scancode-toolkit/issues/3072
- Fix issue where dependencies were not reported when scanning an extracted Python project by modifying ``BaseExtractedPythonLayout.assemble()`` to favor using package data from a PKG-INFO file from an egg-info directory. Package data from a PKG-INFO file from an egg-info directory contains the dependency information collected from the requirements.txt file alongside PKG-INFO. https://github.com/nexB/scancode-toolkit/issues/3083
- Fix issue where we were returning incorrect purl package ``type`` for cocoapods. ``pods`` was being returned as a purl type for cocoapods; it should be ``cocoapods`` instead. https://github.com/package-url/purl-spec/blob/master/PURL-TYPES.rst#cocoapods https://github.com/nexB/scancode-toolkit/issues/3081
- Code for parsing a Maven POM, npm package.json, freebsd manifest and haxelib JSON has been separated into two functions: one that creates a PackageData object from the parsed Resource, and another that calls the previous function and yields the PackageData. This was done so that we can use the package manifest data parsing code outside of the scancode-toolkit context in other libraries.

## License detection:

- The SPDX license list has been updated to the latest v3.19
- This is a major update to license detection where we now combine one or more license matches in a larger license detection.
  This approach improves the accuracy of license detection and removes a larger number of false positive or ambiguous license detections. For details, see https://github.com/nexB/scancode-toolkit/issues/2878
- There is a new ``license_detections`` codebase level attribute with all the unique license detections in the whole scan, both in resources and packages. This has the 3 attributes also present in package/resource level license detections: ``license_expression``, ``matches`` and ``detection_log`` and has two additional attributes:
  - ``identifier``: the ``license_expression`` with a UUID created from the detection contents; it is the same for identical detections.
  - ``count``: Number of times in the codebase this unique license detection was encountered.
- The data structure of the JSON output has changed for licenses at file level:
  - The ``licenses`` attribute is deleted.
  - A new ``for_license_detections`` attribute is added which references the codebase level unique license detections; this is a list of ``identifier`` strings from the codebase level license detections it references.
  - A new ``license_detections`` attribute contains license detections in that file. This object has three attributes: ``license_expression``, ``detection_log`` and ``matches``. ``matches`` is a list of license matches and is roughly the same as ``licenses`` in the previous version with additional structure changes detailed below.
  - A new attribute ``license_clues`` contains license matches with the same data structure as the ``matches`` attribute in ``license_detections``. This contains license matches that are mere clues and were not considered to be a proper conclusive license detection.
  - The ``license_expressions`` list of license expressions is deleted and replaced by a ``detected_license_expression`` single expression. Similarly ``spdx_license_expressions`` was removed and replaced by ``detected_license_expression_spdx``.
  - See `license updates documentation <https://scancode-toolkit.readthedocs.io/en/latest/explanations/license-detection-reference.html#change-in-license-data-format-resource>`_ for examples and details.
- The data structure of license attributes in ``package_data`` and the codebase level ``packages`` has been updated accordingly:
  - There is a new ``license_detections`` attribute for the primary, top-level declared licenses of a package and an ``other_license_detections`` attribute for the other secondary detections.
  - The ``license_expression`` is replaced by the ``declared_license_expression`` and ``other_license_expression`` attributes with their SPDX counterparts ``declared_license_expression_spdx`` and ``other_license_expression_spdx``. These expressions are parallel to detections.
  - The ``declared_license`` attribute is renamed ``extracted_license_statement`` and is now a YAML-encoded string. See `license updates documentation <https://scancode-toolkit.readthedocs.io/en/latest/explanations/license-detection-reference.html#change-in-license-data-format-package>`_ for examples and details.
- The license matches structure has changed: we used to report one match for each license ``key`` of a matched license expression. We now report instead one single match for each matched license expression, and list the license keys as a ``licenses`` attribute. This avoids data duplication. Inside each match, we list the match and matched rule attributes directly, avoiding nesting. See `license updates doc <https://scancode-toolkit.readthedocs.io/en/latest/explanations/license-detection-reference.html#licensematch-result-data>`_ for examples and details.
- There are new codebase level attributes, included by default with `--licenses`, to report reference license metadata and texts once for each license matched across the scan; we now have two codebase level attributes: ``license_references`` and ``license_rule_references`` that list unique detected license and license rules.
  This reference data is also removed from license matches in all levels i.e. from codebase, package and resource level license detections and resource level license clues. See `license updates documentation <https://scancode-toolkit.readthedocs.io/en/latest/explanations/license-detection-reference.html#comparision-before-after-license-references>`_ for examples and details.
- We replaced the ``scancode --reindex-licenses`` command line option with a new separate command named ``scancode-reindex-licenses``.
  - The ``--reindex-licenses-for-all-languages`` CLI option is also moved to the ``scancode-reindex-licenses`` command as an option ``--all-languages``.
- We can now detect licenses using custom license texts and license rules stored in a directory or packaged as a plugin for consistent reuse and deployment.
  - There is an ``--additional-directory`` option with the ``scancode-reindex-licenses`` command to add the licenses from a directory.
  - There is also a ``--only-builtin`` option to use only builtin licenses, ignoring any additional license plugins.
  - See https://github.com/nexB/scancode-toolkit/issues/480 for more details.
- We combined the license data file and text file of each license in a single file with a .LICENSE extension. The .yml data file is now included at the top of each .LICENSE file as "YAML frontmatter". The same applies to license rules and their .RULE and .yml files. This halves the number of data files from about 60,000 to 30,000. Git line history is preserved for the combined text + yml files.
  - See https://github.com/nexB/scancode-toolkit/issues/3049
- There is a new console script ``scancode-license-data`` to export license data in JSON, YAML and HTML, with indexes and a static website for use in the licensedb web site. This becomes the API way to get scancode license data. See https://github.com/nexB/scancode-toolkit/issues/2738
- The deprecated "--is-license-text" option has been removed.
This is now built-in with the --license-text option and --info and exposed with the "percentage_of_license_text" attribute. ## All Changes * Add support for external licenses in scans by @KevinJi22 in https://github.com/nexB/scancode-toolkit/pull/2979 * Separate Package parsing functions by @JonoYang in https://github.com/nexB/scancode-toolkit/pull/3135 * Update docs for deprecated and other options #3126 by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3127 * Add license dump option by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3100 * Combine license matches in new LicenseDetection by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/2961 * Fix issue 3155 by running `scancode-reindex-licenses` subcommand instead of using `--reindex-licenses` flag by @abhi-kr-2100 in https://github.com/nexB/scancode-toolkit/pull/3159 * Detect wurfl commercial license by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/3163 * Do not use packaging.LegacyVersion #3171 #3177 by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/3180 * More License Detection changes by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3154 * docs(fix): how to install Py. 
3.8 on recent Ubuntu by @camillem in https://github.com/nexB/scancode-toolkit/pull/3146 * Add links to basic options in docs by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3142 * install.rst: spelling by @vargenau in https://github.com/nexB/scancode-toolkit/pull/3184 * Release 32.0.0rc1 prep by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3150 * Remove deprecated images from CI and release-script by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3099 * Fix unhashable type error in cyclonedx #3016 by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3189 * Update license db generation by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3197 * Remove license text from index.json of licenseDB by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3201 * Support python 3.11 by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3199 * Properly assign boolean to is_resolved #3152 by @JonoYang in https://github.com/nexB/scancode-toolkit/pull/3153 * Vendor attrs to avoid unpickle issues #3179 #3192 by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/3193 * Remove trailing T in date by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/3203 * Restore help.html from nexB/scancode-licensedb#23 by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3202 * adapt code to new spdx-tools release by @meretp in https://github.com/nexB/scancode-toolkit/pull/3173 * Add nuget nuspec dependencies by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/3206 * Fix release scripts by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3208 * Fix attrs version in requirements by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3209 ## New Contributors * @abhi-kr-2100 made their first contribution in https://github.com/nexB/scancode-toolkit/pull/3159 * @camillem 
made their first contribution in https://github.com/nexB/scancode-toolkit/pull/3146 * @vargenau made their first contribution in https://github.com/nexB/scancode-toolkit/pull/3184 * @meretp made their first contribution in https://github.com/nexB/scancode-toolkit/pull/3173 **Full Changelog**: https://github.com/nexB/scancode-toolkit/compare/v31.2.4...v32.0.0rc1 2023-01-22T17:52:10+00:00 scancode-toolkit v32.0.0rc2 scancode-toolkit v32.0.0rc2 2023-02-17T20:08:08+00:00 This is the second release candidate for v32.0.0 with a few bug fixes, license rule additions, and updates in the release script now generating app archives for more python versions across Linux/Windows/MacOS. ## What's Changed * Work around heisen-failures in CI by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/3207 * Add HERE Proprietary rule for pom.xml files by @bennati in https://github.com/nexB/scancode-toolkit/pull/3212 * Add required phrase to JSR rule by @bennati in https://github.com/nexB/scancode-toolkit/pull/3218 * Fix choking license detection post-processing #3245 by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3247 * Build app archives for all python versions by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3232 * Bump version to v32.0.0rc2 by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3262 ## New Contributors * @bennati made their first contribution in https://github.com/nexB/scancode-toolkit/pull/3212 **Full Changelog**: https://github.com/nexB/scancode-toolkit/compare/v32.0.0rc1...v32.0.0rc2 2023-02-17T20:08:08+00:00 scancode-toolkit v32.0.0rc3 scancode-toolkit v32.0.0rc3 2023-03-20T16:14:53+00:00 This is the third release candidate for v32.0.0 with two major updates: - we have changed the way we report license detections. See https://github.com/nexB/scancode-toolkit/pull/3286#issue-1615411541 for more details on this. 
- added support for SPDX license list 3.20, adding several new licenses and detection rules. ## What's Changed * Add new and improve existing licenses by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3271 * Improve License Detection reporting by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3286 * Release v32.0.0rc3 prep by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3291 **Full Changelog**: https://github.com/nexB/scancode-toolkit/compare/v32.0.0rc2...v32.0.0rc3 2023-03-20T16:14:53+00:00 scancode-toolkit v31.2.5 scancode-toolkit v31.2.5 2023-04-13T09:10:59+00:00 This is a minor bug fix release. * Backport changes from https://github.com/nexB/scancode-toolkit/pull/3218 * Drop python 3.7 **Full Changelog**: https://github.com/nexB/scancode-toolkit/compare/v31.2.4...v31.2.5 2023-04-13T09:10:59+00:00 scancode-toolkit v32.0.0rc4 scancode-toolkit v32.0.0rc4 2023-04-20T23:25:27+00:00 ## What's Changed * Fix #3250: Invalid SPDX with empty file: no SHA1 by @vargenau in https://github.com/nexB/scancode-toolkit/pull/3279 * Add docs, changelog and authors in CONTRIBUTION and fix typos and errors by @OctoPie23 in https://github.com/nexB/scancode-toolkit/pull/3204 * Silence pyicu warning by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3280 * Fix licenses in HTML output by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3275 * Fix misc license detection related bugs by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3299 * Add copyright holder field to PackageData model by @keshav-space in https://github.com/nexB/scancode-toolkit/pull/3302 * Merge latest skeleton into scancode by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3305 * New licenses and license rules by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3309 * Update documentation for v32 by @AyanSinhaMahapatra in 
https://github.com/nexB/scancode-toolkit/pull/3292 * Get valid yaml output by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3220 * Fix-up the category of the 'ms-cla' license by @fviernau in https://github.com/nexB/scancode-toolkit/pull/3318 * Release prep V32.0.0rc4 by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3336 * Update release script to remove ubuntu18 by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3337 ## New Contributors * @OctoPie23 made their first contribution in https://github.com/nexB/scancode-toolkit/pull/3204 * @keshav-space made their first contribution in https://github.com/nexB/scancode-toolkit/pull/3302 **Full Changelog**: https://github.com/nexB/scancode-toolkit/compare/v32.0.0rc3...v32.0.0rc4 2023-04-20T23:25:27+00:00 scancode-toolkit v31.2.6 scancode-toolkit v31.2.6 2023-04-25T13:00:04+00:00 This is a minor hotfix release. - This fixes a crash when parsing a .deb Debian package filename reported in https://github.com/nexB/scancode-toolkit/issues/3259 **Full Changelog**: https://github.com/nexB/scancode-toolkit/compare/v31.2.5...v31.2.6 2023-04-25T13:00:04+00:00 scancode-toolkit v32.0.0 scancode-toolkit v32.0.0 2023-05-22T22:04:23+00:00 v32 of ScanCode is all about improved license detections! We have more licenses and rules, and major updates to how matches are post-processed into license detections. We also have major improvements in package license detections and unknown references, along with top-level detection summaries for licenses, and reference data for the licenses detected too. There are also a couple of API changes due to model changes in license data. See also https://github.com/nexB/scancode.io/ for a complete, customizable SCA solution using ScanCode and https://github.com/nexB/scancode-workbench/releases for visualizing data generated by ScanCode Toolkit.
# Important API changes: This is a major release with major API and output format changes and significant feature updates. In particular the output format has changed for the licenses and packages, and also for some of the command line options. The output format version is now 3.0.0. See https://github.com/nexB/scancode-toolkit/milestone/15 for more details on this release. Visit https://github.com/nexB/scancode-toolkit/discussions/3406 to discuss this release. ## Package detection: - Update ``GemfileLockParser`` to track the gem which the Gemfile.lock is for, which we assign to the new ``GemfileLockParser.primary_gem`` field. Update ``GemfileLockHandler.parse()`` to handle the case where there is a primary gem detected from a Gemfile.lock. If there is a primary gem, a single ``Package`` is created and the detected gem data within the Gemfile.lock are assigned as dependencies. If there is no primary gem, then all of the dependencies are collected into a ``Package`` with no name and yielded. https://github.com/nexB/scancode-toolkit/issues/3072 - Fix issue where dependencies were not reported when scanning an extracted Python project by modifying ``BaseExtractedPythonLayout.assemble()`` to favor using package data from a PKG-INFO file from an egg-info directory. Package data from a PKG-INFO file from an egg-info directory contains the dependency information collected from the requirements.txt file alongside PKG-INFO. https://github.com/nexB/scancode-toolkit/issues/3083 - Fix issue where we were returning incorrect purl package ``type`` for cocoapods. ``pods`` was being returned as a purl type for cocoapods; it should be ``cocoapods`` instead.
https://github.com/package-url/purl-spec/blob/master/PURL-TYPES.rst#cocoapods https://github.com/nexB/scancode-toolkit/issues/3081 - Code for parsing a Maven POM, npm package.json, freebsd manifest and haxelib JSON has been separated into two functions: one that creates a PackageData object from the parsed Resource, and another that calls the previous function and yields the PackageData. This was done so that we can use the package manifest data parsing code outside of the scancode-toolkit context in other libraries. - The PackageData model now includes a ``holder`` field, which is populated with holder data extracted from the copyright field if copyright data is present, otherwise it remains empty. https://github.com/nexB/scancode-toolkit/issues/3290 - DatafileHandlers now have a classmethod named ``get_top_level_resources()``, which is supposed to yield the top-level Resources of a Package codebase, relative to a Package manifest file. ``maven.MavenPomXmlHandler`` is the first DatafileHandler that has this method implemented. ## License detection: - The SPDX license list has been updated to the latest v3.20 - This is a major update to license detection where we now combine one or more license matches in a larger license detection. This approach improves the accuracy of license detection and removes a large number of false positive or ambiguous license detections. For details, see https://github.com/nexB/scancode-toolkit/issues/2878 - There is a new ``license_detections`` codebase level attribute with all the unique license detections in the whole scan, both in resources and packages. This has the 3 attributes also present in package/resource level license detections: ``license_expression``, ``identifier`` and ``detection_log`` (present optionally if the ``--license-diagnostics`` option is enabled) with an additional attribute: - ``count``: Number of times in the codebase this unique license detection was encountered.
- The data structure of the JSON output has changed for licenses at file level: - The ``licenses`` attribute is deleted. - A new ``license_detections`` attribute contains license detections in that file. This object has three attributes: ``license_expression``, ``identifier`` and ``matches``. ``matches`` is a list of license matches and is roughly the same as ``licenses`` in the previous version with additional structure changes detailed below. ``identifier`` is the detected license expression with a UUID generated from the content of ``matches`` such that it is unique for unique detections. We also have another attribute ``detection_log`` with diagnostics information if the ``--license-diagnostics`` option is enabled. - A new attribute ``license_clues`` contains license matches with the same data structure as the ``matches`` attribute in ``license_detections``. This contains license matches that are mere clues and were not considered to be a proper conclusive license detection. - The ``license_expressions`` list of license expressions is deleted and replaced by a ``detected_license_expression`` single expression. Similarly ``spdx_license_expressions`` was removed and replaced by ``detected_license_expression_spdx``. - See `license updates documentation <https://scancode-toolkit.readthedocs.io/en/latest/explanations/license-detection-reference.html#change-in-license-data-format-resource>`_ for examples and details. - The data structure of license attributes in ``package_data`` and the codebase level ``packages`` has been updated accordingly: - There is a new ``license_detections`` attribute for the primary, top-level declared licenses of a package and an ``other_license_detections`` attribute for the other secondary detections. - The ``license_expression`` is replaced by the ``declared_license_expression`` and ``other_license_expression`` attributes with their SPDX counterparts ``declared_license_expression_spdx`` and ``other_license_expression_spdx``.
These expressions are parallel to detections. - The ``declared_license`` attribute is renamed ``extracted_license_statement`` and is now a YAML-encoded string, which can be parsed to recreate the original extracted license statement. Previously this was a nested Python object (lists, dicts or strings), but now this is always a YAML string. See `license updates documentation <https://scancode-toolkit.readthedocs.io/en/latest/explanations/license-detection-reference.html#change-in-license-data-format-package>`_ for examples and details. - The license matches structure has changed: we used to report one match for each license ``key`` of a matched license expression. We now report instead one single match for each matched license expression, and list the license keys as a ``licenses`` attribute. This avoids data duplication. Inside each match, we list each match and matched rule attributes directly, avoiding nesting. See `license updates doc <https://scancode-toolkit.readthedocs.io/en/latest/explanations/license-detection-reference.html#licensematch-result-data>`_ for examples and details. - There are new codebase level attributes with ``--license-references`` to report reference license metadata and texts once for each license matched across the scan; we now have two codebase level attributes: ``license_references`` and ``license_rule_references`` that list unique detected licenses and license rules. This reference data is also removed from license matches in all levels i.e. from codebase, package and resource level license detections and resource level license clues, irrespective of this CLI option being used, i.e. default with ``--licenses``.
See `license updates documentation <https://scancode-toolkit.readthedocs.io/en/latest/explanations/license-detection-reference.html#comparision-before-after-license-references>`_ - We replaced the ``scancode --reindex-licenses`` command line option with a new separate command named ``scancode-reindex-licenses``. - The ``--reindex-licenses-for-all-languages`` CLI option has also moved to the ``scancode-reindex-licenses`` command as an option ``--all-languages``. - We can now detect licenses using custom license texts and license rules stored in a directory or packaged as a plugin for consistent reuse and deployment. - There is an ``--additional-directory`` option with the ``scancode-reindex-licenses`` command to add the licenses from a directory. - There is also a ``--only-builtin`` option to use only builtin licenses ignoring any additional license plugins. - See https://github.com/nexB/scancode-toolkit/issues/480 for more details. - We combined the license data file and text file of each license in a single file with a .LICENSE extension. The .yml data file is now included at the top of each .LICENSE file as "YAML frontmatter". The same applies to license rules and their .RULE and .yml files. This halves the number of data files from about 60,000 to 30,000. Git line history is preserved for the combined text + yml files. - See https://github.com/nexB/scancode-toolkit/issues/3049 - There is a new console script ``scancode-license-data`` to export license data in JSON, YAML and HTML, with indexes and a static website for use in the licensedb web site. This becomes the API way to get ScanCode license data. See https://github.com/nexB/scancode-toolkit/issues/2738 - The deprecated "--is-license-text" option has been removed. This is now built-in with the --license-text option and --info and exposed with the "percentage_of_license_text" attribute.
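The file-level and codebase-level JSON changes described above can be sketched as a small reader. This is a minimal illustration and not ScanCode's own API: the attribute names (``license_detections``, ``identifier``, ``matches``, ``license_clues``, ``detected_license_expression``, ``detected_license_expression_spdx``, ``count``) are taken from these notes, while the top-level ``files`` list, the ``path`` key, the helper names and all sample values are assumptions made for the example.

```python
import json

# Hypothetical, hand-written fragment shaped like the v32 output described
# in these notes. Only the attribute names come from the notes; the values,
# the "files" list and the "path" key are assumptions for illustration.
SAMPLE_SCAN = json.loads("""
{
  "license_detections": [
    {"license_expression": "mit", "identifier": "mit-example-id", "count": 2}
  ],
  "files": [
    {
      "path": "src/a.py",
      "detected_license_expression": "mit",
      "detected_license_expression_spdx": "MIT",
      "license_detections": [
        {"license_expression": "mit", "identifier": "mit-example-id", "matches": [{}]}
      ],
      "license_clues": []
    },
    {
      "path": "README",
      "detected_license_expression": null,
      "detected_license_expression_spdx": null,
      "license_detections": [],
      "license_clues": [{}]
    }
  ]
}
""")


def summarize_file_licenses(scan):
    """Return one summary row per file using the v32 attribute names."""
    rows = []
    for resource in scan.get("files", []):
        detections = resource.get("license_detections") or []
        rows.append({
            "path": resource.get("path"),
            # Single expressions replace the old license_expressions lists.
            "expression": resource.get("detected_license_expression"),
            "spdx": resource.get("detected_license_expression_spdx"),
            "identifiers": [d.get("identifier") for d in detections],
            # "matches" plays roughly the role of the old "licenses" list.
            "match_count": sum(len(d.get("matches") or []) for d in detections),
            # Clues are matches that did not become a conclusive detection.
            "clue_count": len(resource.get("license_clues") or []),
        })
    return rows


def unique_detection_counts(scan):
    """Map each codebase-level detection identifier to its "count"."""
    return {
        d["identifier"]: d.get("count", 0)
        for d in scan.get("license_detections", [])
    }


for row in summarize_file_licenses(SAMPLE_SCAN):
    print(row["path"], row["expression"], row["identifiers"])
```

Code written against the pre-v32 ``licenses`` and ``license_expressions`` attributes would need a similar adjustment; the linked license updates documentation remains the authoritative description of the format.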
- The license dump() has been modified to add an extra space at empty newlines for license files which also have multiple indentation levels as this was generating invalid YAML output files when ``--license-text`` or ``--license-references`` was enabled. See https://github.com/nexB/scancode-toolkit/issues/3219 - A bugfix has been added to the ``--unknown-licenses`` option where we would crash when using this option without using ``--matched-text`` option. This is now working correctly and also better tested. See https://github.com/nexB/scancode-toolkit/issues/3343 ## What's Changed * Add support for external licenses in scans by @KevinJi22 in https://github.com/nexB/scancode-toolkit/pull/2979 * Separate Package parsing functions by @JonoYang in https://github.com/nexB/scancode-toolkit/pull/3135 * Update docs for deprecated and other options #3126 by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3127 * Add license dump option by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3100 * Combine license matches in new LicenseDetection by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/2961 * Fix issue 3155 by running `scancode-reindex-licenses` subcommand instead of using `--reindex-licenses` flag by @abhi-kr-2100 in https://github.com/nexB/scancode-toolkit/pull/3159 * Detect wurfl commercial license by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/3163 * Do not use packaging.LegacyVersion #3171 #3177 by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/3180 * More License Detection changes by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3154 * docs(fix): how to install Py. 
3.8 on recent Ubuntu by @camillem in https://github.com/nexB/scancode-toolkit/pull/3146 * Add links to basic options in docs by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3142 * install.rst: spelling by @vargenau in https://github.com/nexB/scancode-toolkit/pull/3184 * Release 32.0.0rc1 prep by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3150 * Remove deprecated images from CI and release-script by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3099 * Fix unhashable type error in cyclonedx #3016 by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3189 * Update license db generation by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3197 * Remove license text from index.json of licenseDB by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3201 * Support python 3.11 by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3199 * Properly assign boolean to is_resolved #3152 by @JonoYang in https://github.com/nexB/scancode-toolkit/pull/3153 * Vendor attrs to avoid unpickle issues #3179 #3192 by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/3193 * Remove trailing T in date by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/3203 * Restore help.html from nexB/scancode-licensedb#23 by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3202 * adapt code to new spdx-tools release by @meretp in https://github.com/nexB/scancode-toolkit/pull/3173 * Add nuget nuspec dependencies by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/3206 * Fix release scripts by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3208 * Fix attrs version in requirements by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3209 * Work around heisen-failures in CI by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/3207 * Add HERE Proprietary rule 
for pom.xml files by @bennati in https://github.com/nexB/scancode-toolkit/pull/3212 * Add required phrase to JSR rule by @bennati in https://github.com/nexB/scancode-toolkit/pull/3218 * Fix choking license detection post-processing #3245 by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3247 * Build app archives for all python versions by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3232 * Bump version to v32.0.0rc2 by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3262 * Add new and improve existing licenses by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3271 * Improve License Detection reporting by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3286 * Release v32.0.0rc3 prep by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3291 * Fix #3250: Invalid SPDX with empty file: no SHA1 by @vargenau in https://github.com/nexB/scancode-toolkit/pull/3279 * Add docs, changelog and authors in CONTRIBUTION and fix typos and errors by @shricodev in https://github.com/nexB/scancode-toolkit/pull/3204 * Silence pyicu warning by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3280 * Fix licenses in HTML output by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3275 * Fix misc license detection related bugs by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3299 * Add copyright holder field to PackageData model by @keshav-space in https://github.com/nexB/scancode-toolkit/pull/3302 * Merge latest skeleton into scancode by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3305 * New licenses and license rules by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3309 * Update documentation for v32 by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3292 * Get valid yaml output by @AyanSinhaMahapatra in 
https://github.com/nexB/scancode-toolkit/pull/3220 * Fix-up the category of the 'ms-cla' license by @fviernau in https://github.com/nexB/scancode-toolkit/pull/3318 * Release prep V32.0.0rc4 by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3336 * Update release script to remove ubuntu18 by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3337 * Update doc to reference attrib in AbcTK by @chinyeungli in https://github.com/nexB/scancode-toolkit/pull/3252 * Add new proprietary license detection rule by @ninad365 in https://github.com/nexB/scancode-toolkit/pull/3234 * Only trigger license rule with Freetype by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/3227 * Fix unknown license detection by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3345 * Fix typo #3363 by @JonoYang in https://github.com/nexB/scancode-toolkit/pull/3364 * Port v31.2.5 hotfix by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/3351 * Add `get_top_level_resources()` to `DatafileHandler` class by @JonoYang in https://github.com/nexB/scancode-toolkit/pull/3315 * 3396 update get license detections and expression by @JonoYang in https://github.com/nexB/scancode-toolkit/pull/3397 * Bump commoncode version to 31.0.2 by @JonoYang in https://github.com/nexB/scancode-toolkit/pull/3399 * Do not set version to empty string in npm_api_url #3393 by @JonoYang in https://github.com/nexB/scancode-toolkit/pull/3398 * Format extracted_license_statement as YAML by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3402 * Release prep v32 by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3405 ## New Contributors * @abhi-kr-2100 made their first contribution in https://github.com/nexB/scancode-toolkit/pull/3159 * @camillem made their first contribution in https://github.com/nexB/scancode-toolkit/pull/3146 * @vargenau made their first contribution in 
https://github.com/nexB/scancode-toolkit/pull/3184 * @meretp made their first contribution in https://github.com/nexB/scancode-toolkit/pull/3173 * @bennati made their first contribution in https://github.com/nexB/scancode-toolkit/pull/3212 * @shricodev made their first contribution in https://github.com/nexB/scancode-toolkit/pull/3204 * @keshav-space made their first contribution in https://github.com/nexB/scancode-toolkit/pull/3302 * @ninad365 made their first contribution in https://github.com/nexB/scancode-toolkit/pull/3234 **Full Changelog**: https://github.com/nexB/scancode-toolkit/compare/v31.2.4...v32.0.0 2023-05-22T22:04:23+00:00 scancode-toolkit v32.0.1 scancode-toolkit v32.0.1 2023-05-23T20:59:18+00:00 This is a minor bugfix release. There are fixes for two issues in this release: - https://github.com/nexB/scancode-toolkit/issues/3407 typecode had an improper import of ctypes.utils; this is fixed in a new release v30.0.1 of typecode - https://github.com/nexB/scancode-toolkit/issues/3408 the setup.cfg and setup-mini.cfg were not aligned for plugin entrypoints.
## What's Changed * Release prep v32.0.1 by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3410 **Full Changelog**: https://github.com/nexB/scancode-toolkit/compare/v32.0.0...v32.0.1 2023-05-23T20:59:18+00:00 scancode-toolkit v32.0.2 scancode-toolkit v32.0.2 2023-05-29T13:58:16+00:00 This is a minor license update release with: * new and updated licenses in LicenseDB * license-expression V30.1.1 with support for the new licenses ## What's Changed * Add new licenses to licenseDB by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3414 * Release Prep v32.0.2 by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3415 * Add doc redirects by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3413 **Full Changelog**: https://github.com/nexB/scancode-toolkit/compare/v32.0.1...v32.0.2 2023-05-29T13:58:16+00:00 scancode-toolkit v32.0.3 scancode-toolkit v32.0.3 2023-06-06T19:46:58+00:00 This is a minor bugfix release with the following updates: - We were missing scancode-toolkit-mini releases from v32.0.0rc2 and also the scancode-toolkit release wheels including and after v32.0.0rc2 were actually scancode-toolkit-mini releases. Reference: https://github.com/nexB/scancode-toolkit/issues/3421 - Updated github actions, for more details see https://github.com/nexB/skeleton/issues/75 ## What's Changed * Fix scancode-toolkit-mini and release prep v32.0.3 #3421 by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3422 **Full Changelog**: https://github.com/nexB/scancode-toolkit/compare/v32.0.2...v32.0.3 2023-06-06T19:46:58+00:00 scancode-toolkit v32.0.4 scancode-toolkit v32.0.4 2023-06-07T20:29:56+00:00 This is a minor bugfix release with the following updates: - Fixes a performance issue arising from license detection on files happening in a single-threaded process_codebase step when the license CLI option is disabled for a package scan.
Reference: https://github.com/nexB/scancode-toolkit/pull/3423 ## What's Changed * Fix package scan only performance by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3423 **Full Changelog**: https://github.com/nexB/scancode-toolkit/compare/v32.0.3...v32.0.4 2023-06-07T20:29:56+00:00 scancode-toolkit v32.0.5rc3 scancode-toolkit v32.0.5rc3 2023-06-24T14:22:05+00:00 2023-06-24T14:22:05+00:00 scancode-toolkit v32.0.6 scancode-toolkit v32.0.6 2023-07-19T14:54:05+00:00 This is a minor release with a lot of license and package detection improvements, especially for maven packages. We also support the SPDX license list 3.21 now. The main updates over the previous stable release are: * New and updated licenses, including support for newly released SPDX license list version 3.21. For more details see https://github.com/nexB/scancode-toolkit/pull/3437 * Fixes in summary plugin for licenses, and top-level license detections. https://github.com/nexB/scancode-toolkit/pull/3430 * Updated maven license and package detections, with fixes for various maven package manifest parsing, improved top-level package assembly, ecosystem specific package license detection, fixes in --todo plugin, updated license detection rules/heuristics and other misc changes. For more details see: https://github.com/nexB/scancode-toolkit/pull/3447 * Improved Gemfile.lock parsing. For more details see https://github.com/nexB/scancode-toolkit/pull/3444 * Auto-review plugin to get todo items for scan review, with the new --todo CLI option. For more details see: https://github.com/nexB/scancode-toolkit/pull/3353 * Misc. license and copyright detection improvements at https://github.com/nexB/scancode-toolkit/pull/3346 * Other misc. minor bugfixes detailed in all the previous release-candidates.
## What's Changed * Ambiguous Detections ToDo items by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3353 * License detection improvements and review by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/3346 * Fix maven pom resource assignment by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3427 * Bump version to v32.0.5rc1 by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3428 * Bump version to v32.0.5rc2 by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3433 * Release prep v32.0.5rc3 by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3436 * Update licenses and rules by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3437 * Fix licenses data in summary plugin by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3430 * Update proprietary-license_553.RULE by @pombredanne in https://github.com/nexB/scancode-toolkit/pull/3441 * support parsing BUNDLED WITH by @akostadinov in https://github.com/nexB/scancode-toolkit/pull/3444 * Update maven detections by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3447 * Release prep v32.0.6 by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3454 ## New Contributors * @akostadinov made their first contribution in https://github.com/nexB/scancode-toolkit/pull/3444 **Full Changelog**: https://github.com/nexB/scancode-toolkit/compare/v32.0.4...v32.0.6 2023-07-19T14:54:05+00:00 scancode-toolkit v32.0.7 scancode-toolkit v32.0.7 2023-09-28T12:41:06+00:00 [v32.0.6 - 2023-07-13](https://github.com/nexB/scancode-toolkit/blob/develop/CHANGELOG.rst#v3206---2023-07-13) This is a minor release with a lot of license detection improvements, with new and updated license detection rules and new licenses. 
- 33 new licenses, 30 licenses updated - 70 new and updated license rules The main updates over the previous stable release are: - Add `is_license_clue` and `is_deprecated` attributes to the license Rule class to support license clue detection, and always maintain consistency of unique rule names. Also fix another license detection bug related to license clues and a bug in setup.cfg license detection, and make license detection identifiers python-safe. See https://github.com/nexB/scancode-toolkit/pull/3462 - Update/Add new licenses and license rules. See https://github.com/nexB/scancode-toolkit/pull/3470 https://github.com/nexB/scancode-toolkit/pull/3513 - Bump commoncode to v31.0.3 fixing a VirtualCodebase creation issue when there is a directory under the root with the same name as the root directory itself. https://github.com/nexB/commoncode/issues/57 https://github.com/nexB/scancode-toolkit/pull/3495 ## What's Changed * Edit `check_rdf_scan` so that SPDX rdf tests don't automatically pass #3448 by @armintaenzertng in https://github.com/nexB/scancode-toolkit/pull/3451 * Update misc detections by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3462 * Bump commoncode to v31.0.3 by @JonoYang in https://github.com/nexB/scancode-toolkit/pull/3495 * Update and add licenses by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3470 * Update licenses and rules by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3513 * Release prep 32.0.7 by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3527 ## New Contributors * @armintaenzertng made their first contribution in https://github.com/nexB/scancode-toolkit/pull/3451 **Full Changelog**: https://github.com/nexB/scancode-toolkit/compare/v32.0.6...v32.0.7 2023-09-28T12:41:06+00:00 scancode-toolkit v32.0.8 scancode-toolkit v32.0.8 2023-10-16T19:38:05+00:00 ## What's Changed * Fixed epoch parser failing for numeric values by @OmkarPh in
https://github.com/nexB/scancode-toolkit/pull/3520 * Update license rules and detections by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3519 * License rules update by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3545 * Bump version to v32.0.8 by @AyanSinhaMahapatra in https://github.com/nexB/scancode-toolkit/pull/3548 ## New Contributors * @OmkarPh made their first contribution in https://github.com/nexB/scancode-toolkit/pull/3520 **Full Changelog**: https://github.com/nexB/scancode-toolkit/compare/v32.0.7...v32.0.8 2023-10-16T19:38:05+00:00