
Introduction

CI Gate Security is a series of gate verify checks centred on preventing sensitive information or potentially malicious objects from being merged into OPNFV repos.

It also performs cursory checks to verify that code files and documentation contain a license header.

Anteater works from whitelists and blacklists written as standard regular expressions, so it can easily be extended to check content for any nominated string or file type, even beyond security.

Work Flow

  1. User commits patch to repo
  2. Jenkins calls the 'anteater' tool with a list of files
  3. Each file is processed through the following checks:
    1. Binary check
      1. If the file is binary, it is logged under the Jenkins 'code review' section
    2. File name / content checks
      1. The file contents, file name and file extension are searched for sensitive data or dangerous strings. If any are found, this is logged under the Jenkins 'code review' section
    3. License check
      1. A cursory search verifies that some sort of license is included (the full license header is later checked by Legal). If a license is not present, this is logged under the Jenkins 'code review' section
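The per-file flow above can be sketched in Python roughly as follows. This is an illustration only, not anteater's actual code: the function names `is_binary` and `scan_file`, the NUL-byte heuristic for spotting binaries, and the choice to skip content and licence checks for binary files are all assumptions. The patterns are a small sample taken from the configuration shown later on this page.

```python
import re
from pathlib import Path

# Sample patterns for illustration; the real lists live in master_list.yaml.
FILE_NAME_PATTERNS = [re.compile(r"\.key$"), re.compile(r"id_rsa")]
FILE_CONTENT_PATTERNS = [re.compile(r"private_key"), re.compile(r"curl(.*?)bash")]
LICENCE_MARKERS = ("copyright", "spdx")


def is_binary(data: bytes) -> bool:
    """Assumed heuristic: treat any file containing a NUL byte as binary."""
    return b"\x00" in data


def scan_file(path: str) -> list:
    """Run the three checks on one file and return the names of failed checks."""
    findings = []
    data = Path(path).read_bytes()

    # 1. Binary check
    binary = is_binary(data)
    if binary:
        findings.append("binary")

    # 2. File name check (applied to the base name only)
    if any(p.search(Path(path).name) for p in FILE_NAME_PATTERNS):
        findings.append("file_name")

    # Assumption: content and licence checks only make sense for text files.
    if binary:
        return findings

    text = data.decode("utf-8", errors="replace")

    # 2 (continued). File content check
    if any(p.search(text) for p in FILE_CONTENT_PATTERNS):
        findings.append("file_content")

    # 3. Licence check (cursory, case-insensitive)
    if not any(marker in text.lower() for marker in LICENCE_MARKERS):
        findings.append("licence")

    return findings
```

An empty list means the file passed every check; anything else would be reported back to the code review.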

 

anteater

Anteater tool overview

Anteater is a standard POSIX compliant CLI application.

 

root command    argument     subcommand
anteater        --project    --patchset
anteater        --project    --path
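For readers who want to picture how such a command line is parsed, here is a minimal sketch using Python's standard `argparse` module. It is not taken from anteater itself, which may use a different parsing library. The function name `build_parser`, the help strings, and the rule that `--patchset` and `--path` cannot be combined are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that accepts the options listed in the table above."""
    parser = argparse.ArgumentParser(prog="anteater")
    parser.add_argument("--project", required=True, help="project (repository) name")
    # Assumption: a run scans either a patchset or a whole path, never both.
    target = parser.add_mutually_exclusive_group(required=True)
    target.add_argument("--patchset", help="file listing the files in a patch")
    target.add_argument("--path", help="root directory of the repo to scan")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.project, args.patchset or args.path)
```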

Usage

Perform a scan against a patchset. A patchset is a file supplied by Jenkins that lists the files committed in a patch, one file per line.

$ anteater --project releng --patchset /home/opfnv/jjb/patchset
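Because a patchset is just a list of file paths, one per line, reading it takes only a few lines of Python. The sketch below is illustrative: `read_patchset` is a hypothetical helper name, and ignoring blank lines and surrounding whitespace is an assumption.

```python
from pathlib import Path


def read_patchset(patchset_path: str) -> list:
    """Return the file paths listed in a patchset file, one per line."""
    lines = Path(patchset_path).read_text().splitlines()
    # Assumption: blank lines and surrounding whitespace carry no meaning.
    return [line.strip() for line in lines if line.strip()]
```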

Perform a scan against an entire project. This scan will walk through all files within the repo. 

$ anteater --project releng --path /repo/releng
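A whole-project scan amounts to visiting every file under the given path. A minimal sketch with `os.walk` might look like this; `walk_project` is a hypothetical name, and whether anteater skips any directories while walking is not stated here, so nothing is skipped.

```python
import os


def walk_project(root: str):
    """Yield the path of every file found under root, directory by directory."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            yield os.path.join(dirpath, name)
```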


Configuration of Anteater 'master_list.yaml'

A single file called 'master_list.yaml' is used to declare blacklists within anteater. A blacklist is nothing more than a standard regular expression. If anteater matches a regular expression set within master_list.yaml, it will fail the Jenkins job (the committed patch).

binaries

A global ignore list is configured under the binary_ignore: directive.

See the following example default list, that allows common git artefacts to pass gate with no challenge. The complete list will be developed by consensus of PTLs prior to implementation.

binaries:
  binary_ignore:
    - \.git/(index|objects)
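To show what the example pattern lets through, here is a small Python sketch. The helper name `is_ignored_binary` is hypothetical, and using `re.search` (a match anywhere in the path) is an assumption about how the patterns are applied.

```python
import re

# Pattern copied from the example binary_ignore list above.
BINARY_IGNORE = [re.compile(r"\.git/(index|objects)")]


def is_ignored_binary(path: str) -> bool:
    """True if the path matches any binary_ignore pattern."""
    return any(pattern.search(path) for pattern in BINARY_IGNORE)
```

With this pattern, `releng/.git/index` and anything under `.git/objects` would pass unchallenged, while a binary such as `docs/arch.pptx` would still be reported.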

file_audits

file_names & file_contents

The file_names directive reports at gate any file whose name matches one of the regular expressions configured under the `file_names` directive in 'master_list.yaml'.

The file_contents directive reports any file that contains one of the regex patterns nominated under `file_contents`.

If anteater finds any of these patterns, Jenkins votes -1.

Should the pattern be a false positive, a patch needs to be submitted to the project exception file (see the next section, 'Project exceptions for file names, file contents and binaries').

file_audits:
  file_names:
    - \.asc$
    - \.gpg$
    - \.key$
    - \.md5
    - \.sig$
    - aws_access_key_id
    - aws_secret_access_key
    - id_rsa
  file_contents:
    - -----BEGIN\sRSA\sPRIVATE\sKEY----
    - "curl(.*?)bash"
    - "git(.*?)clone"
    - "sh(.*?)curl"
    - dual_ec_drbg
    - eval
    - gost
    - md[245]
    - panama
    - private_key
    - rc4
    - ripemd
    - secret
    - sha0
    - snefru
    - ssh_key
    - sslv[12]
    - streebog
    - tlsv1
    - wget
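The sketch below illustrates how lists like these could be applied. It is not anteater's implementation: the helper names are hypothetical, only a few of the patterns above are included, and applying `re.search` case-sensitively to each line is an assumption.

```python
import re

# A few entries copied from the example file_audits lists above.
FILE_NAMES = [r"\.asc$", r"\.gpg$", r"\.key$", r"id_rsa"]
FILE_CONTENTS = [
    r"-----BEGIN\sRSA\sPRIVATE\sKEY----",
    r"curl(.*?)bash",
    r"md[245]",
    r"wget",
]


def match_file_name(name: str) -> list:
    """Return every file_names pattern that matches the given file name."""
    return [p for p in FILE_NAMES if re.search(p, name)]


def match_file_contents(text: str) -> list:
    """Return (line number, pattern) pairs for every file_contents hit."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for pattern in FILE_CONTENTS:
            # Assumption: patterns are applied per line with re.search.
            if re.search(pattern, line):
                hits.append((lineno, pattern))
    return hits
```

Note that unanchored patterns match anywhere in a line, so a short pattern such as `eval` or `secret` will also hit longer words that contain it. This is the usual source of false positives that the project exceptions are meant to handle.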

Project exceptions for file names, file contents and binaries

Project-specific exceptions can be added for file_names, file_contents and binaries in a file named after the repository within the anteater/exceptions/ directory.

For example:

anteater/exceptions/releng_exceptions.yaml
binaries:
  binary_ignore:
    network_architecture.pptx: 
      - d0d7dfc73e0fac09d920ebbdf8cd4e0ef623f15d6246ff20d7a6d12c9a48bf41
    network_architecture.docx:
      - f81d21ae8d9ebd01c3b63dafe84046a9acb3f65b6I81d21ae8d9ebd01c3b63da
file_audits:
  file_names:
    - somefile_name
  file_contents:
    - ^#
    - -s  set secret key
    - "PKG_MAP\\[wget\\]"
    - "\\[wget\\]=wget"
    - "git clone(.*)\\.openstack\\.org"
    - "git clone(.*)gerrit\\.opnfv\\.org"
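One way to picture how such an exceptions file might be used is sketched below. The semantics are an assumption, since this page does not spell them out: a blacklisted line is let through when an exception pattern matches the same line, and a binary is let through when its SHA-256 digest is listed under its file name. The helper names are hypothetical.

```python
import re

# Two blacklist entries from master_list.yaml and two exceptions from the example above.
BLACKLIST = [r"git(.*?)clone", r"wget"]
CONTENT_EXCEPTIONS = [r"^#", r"git clone(.*)gerrit\.opnfv\.org"]

# Binary waiver copied from the example above: file name -> allowed digests.
WAIVERS = {
    "network_architecture.pptx": [
        "d0d7dfc73e0fac09d920ebbdf8cd4e0ef623f15d6246ff20d7a6d12c9a48bf41",
    ],
}


def line_is_flagged(line: str) -> bool:
    """Flag a blacklisted line unless an exception pattern also matches it."""
    if not any(re.search(p, line) for p in BLACKLIST):
        return False
    return not any(re.search(p, line) for p in CONTENT_EXCEPTIONS)


def binary_is_waived(name: str, sha256_digest: str) -> bool:
    """True if the digest is listed under the file name in the waivers."""
    return sha256_digest in WAIVERS.get(name, [])
```

A broad exception such as `^#` waives every comment line, which is why exception patches go through normal code review.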


License Checks

A cursory check is made to verify that the strings of either 'copyright' or 'SPDX' are set within the file. A simple check was agreed on, as the correct format is already checked by the legal team.

The checks only run against the file extensions nominated in `licence_exts`, and individual files can be excluded from licence checks using `license_ignore`.

 

licence:
  licence_exts: ['.rst','.md','.py','.sh','.java','.rb']
  license_ignore: ['__init__.py']
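The licence check described above is simple enough to sketch in a few lines of Python. The helper names are hypothetical, and treating the search for 'copyright' or 'SPDX' as case-insensitive is an assumption.

```python
import os

# Values copied from the example configuration above.
LICENCE_EXTS = [".rst", ".md", ".py", ".sh", ".java", ".rb"]
LICENCE_IGNORE = ["__init__.py"]


def needs_licence_check(path: str) -> bool:
    """True if the file has a nominated extension and is not on the ignore list."""
    name = os.path.basename(path)
    ext = os.path.splitext(name)[1]
    return ext in LICENCE_EXTS and name not in LICENCE_IGNORE


def has_licence(text: str) -> bool:
    """Cursory check: look for 'copyright' or 'SPDX' anywhere in the text."""
    lowered = text.lower()
    # Assumption: the match is case-insensitive.
    return "copyright" in lowered or "spdx" in lowered
```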


License Check in Root Directory

If a run of anteater is made with the --path argument, 

 

Logging Framework

All checks that are logged as FAIL are written to a separate log file per check, which is then linked from Gerrit for the particular patch that failed.

For example:

anteater-gate-binaries-<project>.log
anteater-gate-file-name-<project>.log
anteater-gate-file-content-<project>.log
anteater-gate-licence-<project>.log
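The naming scheme above can be expressed as a small helper. This is only a sketch of the pattern shown in the list; `log_file_names` is a hypothetical name.

```python
# Check names as they appear in the log file names above.
CHECKS = ["binaries", "file-name", "file-content", "licence"]


def log_file_names(project: str) -> list:
    """Return the per-check log file names for a project."""
    return [f"anteater-gate-{check}-{project}.log" for check in CHECKS]
```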


Implementation Approach

Anteater will be a non voting check for 'E' release, and a voting check for 'F' release.

To allow projects to catch up on previous fails from already merged patches, a daily job will run that will scan all files.

 

Wish List

ClamAV integration. 

If patch object is a binary, perform ClamAV scan.

File checksums

Every whitelisted binary will have a checksum (sha256) generated and placed into its waiver, so that a different file with the same name cannot be used to get past the gate.
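Generating such a checksum needs nothing beyond Python's standard `hashlib`. The sketch below reads the file in chunks so that large binaries do not have to fit in memory; `sha256_of_file` and the chunk size are illustrative choices, not part of anteater.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 digest of a file, reading it chunk by chunk."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The resulting 64-character hex string is what would be listed under the binary's file name in the project exceptions file.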

Security Lint Scanning

(re)Introduce Bandit, RATS etc.

Improved gatechecks.yaml

Improve formatting (easier to read / more friendly)

Developer Tools

PyPi hosted version.

Regexp testing tools.

Deep Scanning

The cursory license check is OK for code contributed to OPNFV, but just as important is any reference to code that the submitted code interfaces with. So we need to be able to scan the references to ensure that the contribution and its references are compatible under OPNFV’s policy. For example:

  • If possible, license metadata inside binaries (e.g. an image, document, slide deck, …) needs to be explicit (our current practice is to have an umbrella license in the root of the repo).
  • Scanning of code and referenced code (which becomes part of the OPNFV platform when built) for licenses and known vulnerabilities
  • … other examples need to be developed to establish some policies that the tool can validate

We may need to incorporate additional tools e.g. Fossology or proprietary toolchains (e.g. Blackduck – we should see if we can get an Open Source project use license from them).

5 Comments

  1. Aric Gardner,

    shall we use git-lfs and store binary files into git-lfs storage from gerrit?

    1. I think that's only needed when you have really large binary objects that need version control. The checks in anteater are more to ensure someone does not sneak in a binary file that happens to be a trojan.

  2. It would be nice to guide projects or contributors on how to store this info in Jenkins instead of waivers.

     

    secret content
    functest:
      file_names: [\.gpg,test_id_rsa]
      file_contents: [not_a_secret]
    1. Julien, I don't think having these on Jenkins is a good idea. It is not just that Jenkins is not where this type of thing should be configured; it would also make manual scanning by users nearly impossible, by moving an important piece of configuration away from where it belongs.

      In general, I see Jenkins as cron on steroids, and using it for more than this is not a good idea.

      1. To add to this, a key point of keeping the security strings in a gerrit repo is that if someone requires a 'waiver', it goes through the standard code review procedure, where others can check the regex they have used to ensure it does not open up a wide hole.