Attesting Image Scans With Kyverno


The subject of vulnerabilities in container images is serious business. As an image author, one of the things you should be doing is ensuring you know what those vulnerabilities are, and that you aren't relying on what a scan told you three months ago to make decisions about running that image today. Bring Kubernetes into the mix, and you're probably running lots of images at scale, which makes this even more difficult. In this article, I want to show something which can help with both ends of that process: producing vulnerability scans and using them to decide whether to run an image under Kubernetes. Using a combination of GitHub Actions, Trivy, Cosign, and Kyverno, you will have an automated, end-to-end system which allows you to not only produce vulnerability information but also make those critical decisions in an ongoing, automated fashion.

The first step to gaining visibility into your image vulnerabilities is to produce such a list. There are a few tools which can assist here; the most popular seem to be Grype and Trivy. The second step is to plumb this scan into your CI pipeline. Scanning at build time is good (and essential), but because new vulnerabilities are found all the time, a one-time scan is no good if you're running that same image version weeks or months down the line. You need a way for that image to get re-scanned periodically. The third step is to make that list of discovered vulnerabilities available somewhere else so you can make decisions based upon it. Good for you, I have all three covered (if you're using GitHub, that is).
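
Before wiring anything into CI, you can try that first step locally. As a rough sketch (assuming you have Trivy installed; the image reference is my demo image and the flags mirror the workflow settings shown later):

```shell
# Scan an image and write the vulnerability report as JSON, skipping
# findings that have no fix available yet.
trivy image --format json --ignore-unfixed --output trivy-scan.json \
  ghcr.io/chipzoller/zulu:latest
```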

The following is a GitHub Action I have been tinkering with for some time now, and I think I finally have it in a good enough position where it checks all of these boxes and works quite well.

name: vulnerability-scan
on:
  schedule:
    - cron: '23 1 * * *' # Every day at 01:23
env:
  REGISTRY: ghcr.io
  IMAGE_NAME: ${{ github.repository }}
jobs:
  scan:
    runs-on: ubuntu-20.04
    permissions:
      contents: read
    outputs:
      scan-digest: ${{ steps.calculate-scan-hash.outputs.scan_digest }}
    steps:
    - name: Scan for vulnerabilities
      uses: aquasecurity/trivy-action@503d3abc15463af68b817d685982721f134256a5 # v0.6.0
      with:
        image-ref: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:latest
        format: 'json'
        ignore-unfixed: true
        output: trivy-scan.json

    - name: Add scan metadata
      uses: sergeysova/jq-action@9ac92a6da6d616b4cebdddc0059e36a1ad43fab1 # v2.1.0
      with:
        cmd: cat trivy-scan.json | jq '. + {timestamp:(now|todateiso8601)}' | jq '. + {scanner:"trivy"}' > scan.json

    - name: Calculate scan file hash
      id: calculate-scan-hash
      run: |
        SCAN_DIGEST=$(sha256sum scan.json | awk '{print $1}')
        echo "::set-output name=scan_digest::$SCAN_DIGEST"
        echo "Hash of scan.json is: $SCAN_DIGEST"

    - name: Upload vulnerability scan report
      uses: actions/upload-artifact@6673cd052c4cd6fcf4b4e6e60ea986c889389535 # v3.0.0
      with:
        name: scan.json
        path: scan.json
        if-no-files-found: error
  attest:
    runs-on: ubuntu-20.04
    permissions:
      contents: write
      actions: read
      packages: write
      id-token: write
    env:
      SCAN_DIGEST: "${{ needs.scan.outputs.scan-digest }}"
    needs: scan
    steps:
    - name: Download scan
      uses: actions/download-artifact@fb598a63ae348fa914e94cd0ff38f362e927b741 # v3.0.0
      with:
        name: scan.json

    - name: Verify scan
      run: |
        set -euo pipefail
        echo "Hash of scan.json should be: $SCAN_DIGEST"
        COMPUTED_HASH=$(sha256sum scan.json | awk '{print $1}')
        echo "The current computed hash for scan.json is: $COMPUTED_HASH"
        echo "If the two above hashes don't match, scan.json has been tampered with."
        echo "$SCAN_DIGEST  scan.json" | sha256sum --strict --check --status || exit 2

    - name: Install Cosign
      uses: sigstore/cosign-installer@7e0881f8fe90b25e305bbf0309761e9314607e25 # v2.4.0
      with:
        cosign-release: 'v1.10.0'

    - name: Log in to GHCR
      uses: docker/login-action@49ed152c8eca782a232dede0303416e8f356c37b # v2.0.0
      with:
        registry: ${{ env.REGISTRY }}
        username: ${{ github.actor }}
        password: ${{ secrets.GITHUB_TOKEN }}

    - name: Attest Scan
      run: cosign attest --replace --predicate scan.json --type https://trivy.aquasec.com/scan/v2 ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:latest
      env:
        COSIGN_EXPERIMENTAL: "true"

This action has two jobs: scan and attest.

The scan job will:

  1. Use Trivy to scan your image (right now it's just set to the latest tag, which you can configure).
  2. Add some additional fields to the scan results, the most important being the time when the scan was produced.
  3. Hash the scan file as a tamper-detection mechanism.
  4. Upload the scan as an artifact on the workflow.
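
Steps 3 and 4 exist because the scan file passes between two jobs as an artifact, and the digest gives the second job a way to detect tampering in transit. The handoff can be sketched in plain shell (assuming GNU coreutils, as on the Ubuntu runners; note that `sha256sum --check` expects two spaces between the digest and the filename):

```shell
set -euo pipefail

# Producer side: write the scan and record its digest.
echo '{"scanner":"trivy"}' > scan.json
SCAN_DIGEST=$(sha256sum scan.json | awk '{print $1}')

# Consumer side: recompute the digest and refuse to proceed on a mismatch.
if echo "$SCAN_DIGEST  scan.json" | sha256sum --check --status; then
  echo "scan.json verified"
else
  echo "scan.json has been tampered with" >&2
  exit 2
fi
```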

The attest job will immediately follow and will:

  1. Download the scan report from the previous job.
  2. Verify the hash to ensure it hasn't been altered.
  3. Install Cosign which will be used to attest the scan.
  4. Log in to GitHub Container Registry as this is where we will push the signed attestation.
  5. Use Cosign's keyless signing ability to attest the scan, replacing any existing attestation. Note that the predicate type used here is custom; I'm calling it https://trivy.aquasec.com/scan/v2. This is because, as of right now, things like scans don't have an official predicate type, so we need to define one. You can use anything here, perhaps something referencing your organization.
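
If you want to eyeball the result, you can pull the attestation back down from the registry. As a sketch (assuming Cosign 1.x in keyless mode, with jq available; the image and predicate type are the ones from this workflow):

```shell
# Fetch and verify the scan attestation; each result is a DSSE envelope
# whose base64 payload contains the Trivy report plus our added fields.
COSIGN_EXPERIMENTAL=1 cosign verify-attestation \
  --type https://trivy.aquasec.com/scan/v2 \
  ghcr.io/chipzoller/zulu:latest | jq -r '.payload' | base64 -d | jq .
```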

The last step here is important and is why I decided it was time to write this article. Prior to Cosign 1.10, the replacement wasn't working properly. Now that it is, you can schedule this workflow (as I have done already) and know that each time the image has its scan checked, it's always the latest one. More on that point in a moment.

Now you have scheduled scans of your images taking place. You're also attesting them in an automated and secured way. The final step in the process is to verify these scans and make decisions based upon them prior to allowing your image to actually run.

For this last piece, we can use Kyverno's image verification abilities to do all sorts of useful things. In this example, the most basic things I want to check prior to allowing this image to run in my Kubernetes environment are: 1) is it signed the way I expect? and 2) is the vulnerability scan current? The first check is designed to establish not only that the image is signed but that, in this case, the signer was my specific GitHub Action. The second check is designed to ensure we're always looking at fresh data and that an attacker hasn't prevented access to it in order to hide a potentially harmful vulnerability they were somehow able to inject.

Below is an example of that Kyverno policy which performs these checks.

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: check-vulnerabilities
spec:
  validationFailureAction: enforce
  webhookTimeoutSeconds: 10
  failurePolicy: Fail
  rules:
    - name: not-older-than-one-week
      match:
        any:
        - resources:
            kinds:
              - Pod
      verifyImages:
      - imageReferences:
        - "ghcr.io/chipzoller/zulu:*"
        attestors:
        - entries:
          - keyless:
              subject: "https://github.com/chipzoller/zulu/.github/workflows/*"
              issuer: "https://token.actions.githubusercontent.com"
        attestations:
        - predicateType: https://trivy.aquasec.com/scan/v2
          conditions:
          - all:
            - key: "{{ time_since('','{{timestamp}}','') }}"
              operator: LessThanOrEquals
              value: "168h"

Even without Kyverno knowledge, I'm willing to bet you can figure out most of what this policy is designed to do. That's because Kyverno does not require a programming language; it uses common YAML paradigms and idioms with which you're already familiar, making it simple to read and write. If you aren't familiar with Kyverno, I recommend my Exploring Kyverno blog series.

Let's walk through this policy and explain its functions.

  1. spec.validationFailureAction tells Kyverno what action to take if the validation defined in the rule fails. In this case enforce means, "block this thing from running."
  2. spec.webhookTimeoutSeconds tells Kyverno the maximum length of time it should wait before timing this check out. It's at 10 seconds which is plenty long.
  3. spec.failurePolicy tells the Kubernetes API server what should happen if it doesn't get a response from Kyverno. Fail here means to deny the request. Another option is Ignore.
  4. spec.rules[0].match shows we're matching on Pods, no matter where they come from.
  5. spec.rules[0].verifyImages[0].imageReferences tells Kyverno which container images it should perform this validation check on. I'm naming my test image (which you can also use), ghcr.io/chipzoller/zulu, and checking against all of its tags. You can list multiple images here, name specific tags, and more.
  6. spec.rules[0].verifyImages[0].attestors[0] tells Kyverno that for the specific image reference, it needs to make sure it was signed in keyless mode and that the subject is a workflow from my specific GitHub repository (note the wildcard at the end as I'm not naming a specific workflow file) and that the issuer of that signature is GitHub Actions.
  7. spec.rules[0].verifyImages[0].attestations[0] tells Kyverno to search for a predicate specifically named https://trivy.aquasec.com/scan/v2 and to deny the Pod if the scan is any older than seven days (168 hours).
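
That freshness condition is worth internalizing. Kyverno evaluates it with the time_since JMESPath function, but the logic amounts to something like this shell sketch (using GNU date; the timestamp stands in for the field our workflow injected into the scan):

```shell
set -euo pipefail

MAX_AGE_HOURS=168  # one week, matching the policy's "168h"

# Stand-in for the {{timestamp}} field from the attested scan; here we
# fabricate one twelve hours in the past for illustration.
SCAN_TIMESTAMP=$(date -u -d "12 hours ago" +%Y-%m-%dT%H:%M:%SZ)

scan_epoch=$(date -d "$SCAN_TIMESTAMP" +%s)
now_epoch=$(date +%s)
age_hours=$(( (now_epoch - scan_epoch) / 3600 ))

if [ "$age_hours" -le "$MAX_AGE_HOURS" ]; then
  echo "scan is ${age_hours}h old: fresh enough, Pod admitted"
else
  echo "scan is ${age_hours}h old: stale, Pod blocked"
fi
```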

Let's now put all these pieces together and test this policy with a real image which was scanned, signed, and attested with this process.

Create the Kyverno policy in your cluster and try to submit a Pod which names a container image which meets these criteria. I'll be using the same demo image named in the policy, and hopefully if you, the reader, try this out at some point in the future I haven't broken anything--but it's very possible!

apiVersion: v1
kind: Pod
metadata:
  name: mypod
spec:
  containers:
  - name: zulu
    image: ghcr.io/chipzoller/zulu:latest

$ k apply -f pod.yaml 
pod/mypod created

I'm using a release candidate of Kyverno 1.7.2 here, and the Pod is successfully created. Let's test a failure to ensure the policy is truly working. As of this writing, the last attested scan took place about twelve hours ago based upon my GitHub Action run history, so let's crank down the time field in our Kyverno policy so it's even lower than that, which should produce a failure. Change that value field from 168h to something like 10h, replace the Kyverno policy, and let's try again.
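
With that change, the tail of the policy would read:

```yaml
        attestations:
        - predicateType: https://trivy.aquasec.com/scan/v2
          conditions:
          - all:
            - key: "{{ time_since('','{{timestamp}}','') }}"
              operator: LessThanOrEquals
              value: "10h"  # was 168h; narrower than the scan's ~12-hour age
```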

$ k apply -f pod.yaml 
Error from server: error when creating "pod.yaml": admission webhook "mutate.kyverno.svc-fail" denied the request: 

resource Pod/default/mypod was blocked due to the following policies

check-vulnerabilities:
  not-older-than-one-week: 'failed to verify signature for ghcr.io/chipzoller/zulu:latest:
    .attestors[0].entries[0].keyless: attestation checks failed for ghcr.io/chipzoller/zulu:latest
    and predicate https://trivy.aquasec.com/scan/v2'

As you can see, Kyverno caught this: the attested scan wasn't within the time window we wanted, and so the Pod was prevented from running.

This is a really good start to ensuring you are continually attesting to the latest vulnerabilities in your images. The next thing you'll probably want to do is build some sort of allow/deny list of the CVEs that might be reported so you can state which do and don't matter in your case. We'll have to save that subject for another blog, however. Hopefully this was helpful, and please ping me with any feedback or criticism you may have on this post.


Special thanks to Jim Bugwadia, the bringer of these abilities to Kyverno and also the original author of the GitHub Action workflow I presented here. Thanks, Jim!