CI Gate Security is a series of gate verification checks centred on preventing sensitive information or potentially malicious objects from being merged into OPNFV repos.
It also performs cursory checks to verify that code files and documentation contain a license header.
Anteater works with whitelists and blacklists built from standard regular expressions, so it can easily be extended to verify any content for any nominated string or file type, even beyond security.
- User commits a patch to the repo
- Jenkins calls the 'anteater' tool with the list of files
- Each file is processed through the following checks:
  - Binary check
    - If the file is binary, it is logged under the Jenkins 'code review' section
  - File name / content checks
    - The file contents, file name, and file extension are searched for sensitive data or dangerous strings. If any are found, they are logged under the Jenkins 'code review' section
  - License check
    - A cursory search verifies that some sort of license is included (the full license header is later checked by Legal). If a license is not present, this is logged under the Jenkins 'code review' section
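The binary check in the flow above can be sketched with a common heuristic (an illustrative example only, not anteater's actual detection logic): treat any file whose first block contains a NUL byte as binary.

```python
def is_binary(path, blocksize=1024):
    """Heuristic binary check: a NUL byte in the first block of a
    file usually indicates non-text content. Illustrative sketch,
    not anteater's actual implementation."""
    with open(path, "rb") as f:
        return b"\x00" in f.read(blocksize)
```

A text file such as a Python script passes this check, while an ELF binary or image (which begins with or contains NUL bytes) is flagged for review.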
Anteater tool overview
Anteater is a standard POSIX-compliant CLI application. It supports two modes of operation:

- Perform a scan against a patchset. A patchset is a file supplied by Jenkins that lists the files committed in a patch, one file per line.
- Perform a scan against an entire project. This scan walks through all files within the repo.
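A patchset file of this shape is trivial to consume; a minimal sketch (the helper name is ours, not anteater's):

```python
def read_patchset(path):
    """Read a Jenkins-supplied patchset file: one committed file
    path per line, ignoring blank lines."""
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]
```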
Configuration of Anteater 'master_list.yaml'
A single file called 'master_list.yaml' is used to declare blacklists within anteater. A blacklist is a nothing more then a standard regular expression. If anteater matches regular expression set within master_list.yaml, it will fail the jenkins job (the commited patch).
A global ignore list is configured under the binaries_ignore: directive.
The following example default list allows common git artefacts to pass the gate unchallenged. The complete list will be developed by consensus of the PTLs prior to implementation.
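An illustrative sketch of such a list (the entries shown are hypothetical examples, not the agreed default list):

```yaml
binaries_ignore:
  - \.gitignore
  - \.gitmodules
  - \.gitattributes
  - \.gitreview
```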
file_names & file_contents
The file_names directive will report at the gate any files whose file name matches a regular expression set under the `file_names` directive in 'master_list.yaml'.
The file_contents directive will report any files whose contents match any of the nominated regex patterns under `file_contents`.
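The behaviour of the two directives can be sketched as follows (the patterns shown are hypothetical examples, not the project's agreed blacklist):

```python
import re

# Hypothetical blacklist patterns in the style of master_list.yaml
file_name_patterns = [r".*\.pem$", r".*_rsa$"]                    # e.g. key material
file_content_patterns = [r"password\s*=", r"BEGIN RSA PRIVATE KEY"]

def name_flagged(filename):
    """Return True if the file name matches any blacklisted pattern."""
    return any(re.match(p, filename) for p in file_name_patterns)

def content_flagged(text):
    """Return True if the file contents match any blacklisted pattern."""
    return any(re.search(p, text) for p in file_content_patterns)
```

A match by either helper corresponds to the gate reporting the file and the patch receiving a -1 vote.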
If these patterns are discovered by anteater, Jenkins will vote -1.
Should the pattern be a false positive, a patch needs to be supplied to the project exception file (see the next section, 'Project exceptions for file names, file contents and binaries').
Project exceptions for file names, file contents and binaries
Project-specific exceptions can be added for file_names and file_contents by using the name of the repository within the anteater/exceptions/ directory.
A cursory check is made to verify that either the string 'copyright' or 'SPDX' is present within the file. A simple check was agreed upon, as the correct format is already checked by the legal team.
The checks only occur against the nominated file extensions in `license_exts`, and files may be ignored for license checks using `license_ignore`.
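The cursory check amounts to a case-insensitive substring search; a minimal sketch (the helper name is ours, not anteater's):

```python
def has_license_marker(text):
    """Cursory license check: pass if either 'copyright' or 'SPDX'
    appears anywhere in the file contents. The full header format
    is validated separately by the legal team."""
    lowered = text.lower()
    return "copyright" in lowered or "spdx" in lowered
```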
License Check in Root Directory
If a run of anteater is made with the --path argument, a check is also made for a license file in the root directory of the repository.
All checks which are logged as FAIL are written to three log files, which are then linked from Gerrit for the particular patch that failed.
Anteater will be a non-voting check for the 'E' release, and a voting check for the 'F' release.
To allow projects to catch up on previous failures from already merged patches, a daily job will run that scans all files.
- If the patch object is a binary, perform a ClamAV scan.
- File checksums: all whitelisted binaries will have a checksum (sha256) generated and placed into waivers, to prevent exploiting identical naming to get past the gate.
- Security lint scanning: (re)introduce Bandit, RATS, etc.
- Improved gatechecks.yaml: improve formatting (easier to read / more friendly).
- PyPI-hosted version.
- Regexp testing tools.
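Generating the waiver checksums mentioned above is straightforward with Python's hashlib (a sketch under the assumption of sha256 as stated; the waiver file format itself is not yet defined):

```python
import hashlib

def sha256_of(path, chunk=8192):
    """Compute the sha256 checksum of a whitelisted binary, so a
    same-named but different file cannot slip past the gate."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk), b""):
            h.update(block)
    return h.hexdigest()
```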
The cursory license check is OK for code contributed to OPNFV, but just as important is any reference to code that the submitted code interfaces with. So we need to be able to scan the references to ensure that the contribution and its references are compatible under OPNFV’s policy. For example:
- It is acceptable for an OPNFV-hosted module to be Eclipse-licensed and import a GPL-licensed module's interfaces
- Example: the VES collectd plugin in the Barometer project: https://git.opnfv.org/barometer/tree/3rd_party/collectd-ves-plugin/ves_plugin/ves_plugin.py. It would *not* be acceptable for the VES collectd plugin to be APL 2.0-licensed and import a GPL-licensed module's interfaces.
- If possible, license metadata inside binaries (e.g. an image, document, slide deck, …) needs to be explicit (our current practice is to have an umbrella license in the root of the repo).
- Scanning of code and referenced code (which becomes part of the OPNFV platform when built) for licenses and known vulnerabilities
- … other examples need to be developed to establish some policies that the tool can validate
We may need to incorporate additional tools, e.g. Fossology, or proprietary toolchains (e.g. Blackduck — we should see if we can get an open source project use license from them).