Taking Out The Trash: Automatic Cleanup of Bad Resources with Kyverno
Are you tired of your developers not fixing their apps even when they know they're violating policies? Tired of harassing them via email or Slack? In this blog, I'll show you how you can use Kyverno to find and automatically remove those "bad" resources, allowing you to take out your cluster's trash.
One of the lesser-used, yet insanely useful, capabilities in Kyverno is its cleanup policies, which allow you to apply a declarative, policy-based approach to removing resources in a Kubernetes cluster, in a very fine-grained manner, on a recurring schedule. By contrast, one of the heavily-used capabilities is Kyverno's policy reports, which give you a decoupled, in-cluster window into how your developers' and users' resources are faring against your policies, particularly validate policies.
The typical Kyverno adoption journey goes something like this: Once upon a time, there was a Kubernetes cluster with a bunch of resources in it. One day, a battle-scarred platform engineer was tasked with implementing policy in the cluster to improve security and prevent "bad stuff" from happening. Not wanting to disrupt operations, he adds Kyverno policies to the cluster in Audit mode, which produce policy reports. Those reports record the success or failure status of the existing resources in that cluster when compared to the policies which were added. "Good" resources are said to succeed; "bad" resources are said to fail. The engineer now has a naughty list of things that need to be fixed, so he sets off into the corporate wilderness to find the owners of these resources and inform them of their wicked ways, delivering an ultimatum from on high that they be fixed within a certain number of days. Except those days come and go and the resources still aren't fixed. He once again pesters the owners to remediate, yet the cacophony of silence reigns.
If this sounds like you, or maybe you fear this could become you, I have some good news. Kyverno can find all the naughty resources in the cluster and automatically remove them for you after your grace period has expired.
Before going further, I should add a strong word of caution here. Deleting resources is serious business, and before you try anything below you should do the responsible thing and establish agreements with the necessary stakeholders. You should absolutely test this in a non-production environment, scoping it down as tightly as possible so as not to produce adverse effects.
The flow here is pretty simple:
- Scan policy reports for failed results.
- Annotate the resource with the time the first failure was observed.
- Periodically run cleanup, deleting only the resources which have failed for whatever grace period you want.
Doing this, while not super complex, does involve a few parts to ensure the cleanup process only removes the resources it should.
For example, in the strategy covered here, I'm not looking for a specific policy to have failed; any failure counts. This can, of course, be configured. I also want to be careful that, if developers remediate their issues and their policy reports show all successes, the cleanup does not occur. To build this out, I'm demonstrating the concept with Deployments and only in a specific Namespace. Also note that I built this assuming Kyverno 1.11 and its per-resource reports, not the earlier style of reporting. It could still work for earlier versions, but modifications would be needed.
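If you haven't seen one before, a per-resource policy report looks roughly like this. This is abbreviated, and every name and value here is purely illustrative; the key points are that the report is named after the UID of the resource it covers and carries a summary of pass/fail counts:

```yaml
apiVersion: wgpolicyk8s.io/v1alpha2
kind: PolicyReport
metadata:
  # Named after the UID of the Deployment this report covers.
  name: 3a9c1f2e-5b7d-4c11-9e2a-0d8f6b4a7c3d
  namespace: engineering
scope:
  apiVersion: apps/v1
  kind: Deployment
  name: payments-api
  namespace: engineering
results:
- policy: require-run-as-nonroot
  rule: run-as-non-root
  result: fail
  message: "validation error: running as root is not allowed."
  timestamp:
    seconds: 1700000000
summary:
  error: 0
  fail: 1
  pass: 4
  skip: 0
  warn: 0
```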
To start, we need some permissions for both the background controller (which will handle mutations of existing resources) and the cleanup controller (which will do the actual deletions).
The permissions below use labels to aggregate to the proper "parent" ClusterRoles.
```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    app.kubernetes.io/component: background-controller
    app.kubernetes.io/instance: kyverno
    app.kubernetes.io/part-of: kyverno
  name: kyverno:update-deployments
rules:
- apiGroups:
  - apps
  resources:
  - deployments
  verbs:
  - update
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    app.kubernetes.io/component: cleanup-controller
    app.kubernetes.io/instance: kyverno
    app.kubernetes.io/part-of: kyverno
  name: kyverno:clean-violating-resources
rules:
- apiGroups:
  - apps
  resources:
  - deployments
  verbs:
  - get
  - watch
  - list
  - delete
```
Now, let's look at the meat of the system. We need a couple of "mutate existing" rules here to add and remove annotations. Although it would have been nice to rely on the policy reports alone, that's not possible for a couple of reasons. First, the policy report's creation date is unreliable because you cannot assume it was created due to a failure; it could have been a success or a skip. Second, the rule result timestamp (given in Unix time) gets refreshed on every background scan. To calculate the grace period properly, you need to know the first time a failure was observed and then, conversely, remove or change that annotation once ALL the failures are gone.
With that said, here's the policy below with both rules.
```yaml
apiVersion: kyverno.io/v2beta1
kind: ClusterPolicy
metadata:
  name: manage-cleanup-annotations
spec:
  background: true
  mutateExistingOnPolicyUpdate: true
  rules:
  - name: failed-annotation
    match:
      any:
      - resources:
          kinds:
          - Deployment
          namespaces:
          - engineering
    mutate:
      targets:
      - apiVersion: apps/v1
        kind: Deployment
        name: "{{ request.object.metadata.name }}"
        namespace: "{{ request.object.metadata.namespace }}"
        context:
        - name: failedpolicyresults
          apiCall:
            urlPath: "/apis/wgpolicyk8s.io/v1alpha2/namespaces/{{ request.namespace }}/policyreports?fieldSelector=metadata.name={{ request.object.metadata.uid }}"
            jmesPath: items[0].summary.fail || `0`
        preconditions:
          all:
          - key: "{{ failedpolicyresults }}"
            operator: GreaterThan
            value: 0
      patchStrategicMerge:
        metadata:
          annotations:
            corp.io/hasfailedresults: "true"
            +(corp.io/lastfailedtime): "{{ time_now_utc() }}"
  - name: success-annotation
    match:
      any:
      - resources:
          kinds:
          - Deployment
          namespaces:
          - engineering
    mutate:
      targets:
      - apiVersion: apps/v1
        kind: Deployment
        name: "{{ request.object.metadata.name }}"
        namespace: "{{ request.object.metadata.namespace }}"
        context:
        - name: failedpolicyresults
          apiCall:
            urlPath: "/apis/wgpolicyk8s.io/v1alpha2/namespaces/{{ request.namespace }}/policyreports?fieldSelector=metadata.name={{ request.object.metadata.uid }}"
            jmesPath: items[0].summary.fail || `0`
        preconditions:
          all:
          - key: "{{ failedpolicyresults }}"
            operator: Equals
            value: 0
          - key: '{{ request.object.metadata.annotations."corp.io/hasfailedresults" || "" }}'
            operator: Equals
            value: "true"
          - key: '{{ request.object.metadata.annotations."corp.io/lastfailedtime" || "" }}'
            operator: Equals
            value: "?*"
      patchesJson6902: |-
        - path: /metadata/annotations/corp.io~1hasfailedresults
          op: remove
        - path: /metadata/annotations/corp.io~1lastfailedtime
          op: remove
```
The policy operates on Deployments in the `engineering` Namespace. The first rule adds the annotation `corp.io/hasfailedresults` with a value of `true` when one or more failures are detected, along with `corp.io/lastfailedtime` set to the current time. The `+()` anchor is used so this value does not get overwritten on subsequent background scans, because that would wipe our record of when a failure was first observed.
This is also why the policy targets Deployments rather than Pods: more than likely, Pods are owned by some higher-level controller, and deleting them would just have those controllers spit the same bad Pods out again, resulting in an eternal game of whack-a-mole.
The second rule looks for successes (no failures) and, if found, will remove those two annotations if they exist.
These both run in Kyverno's background mode, which is key, because a policy report is delayed from when a Deployment is created or updated. By default, background scanning occurs every hour, so the next time the scan occurs, if there is even one failure, the annotations will be applied. They will not be removed until no failures remain.
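Once a background scan observes a failure, the Deployment carries something like the following (the timestamp value here is just illustrative of what `time_now_utc()` produces):

```yaml
metadata:
  annotations:
    corp.io/hasfailedresults: "true"
    corp.io/lastfailedtime: "2023-11-20T14:10:03Z"
```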
Although, as I mentioned, this looks for any failures, it could be modified if you have a specific policy you're interested in. The following JMESPath query could be used to consider only failures in the `require-run-as-nonroot` policy:

```
items[0].results[?policy=='require-run-as-nonroot'].result | [0] || 'empty'
```
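As a rough sketch, the first rule's context and precondition might change like this for that case. The variable name `nonrootresult` is just something I made up for illustration; adjust to taste:

```yaml
        context:
        - name: nonrootresult
          apiCall:
            urlPath: "/apis/wgpolicyk8s.io/v1alpha2/namespaces/{{ request.namespace }}/policyreports?fieldSelector=metadata.name={{ request.object.metadata.uid }}"
            jmesPath: "items[0].results[?policy=='require-run-as-nonroot'].result | [0] || 'empty'"
        preconditions:
          all:
          - key: "{{ nonrootresult }}"
            operator: Equals
            value: fail
```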
Now that annotations are created and removed reliably, it's time for the cleanup.
The ClusterCleanupPolicy below again targets Deployments in the `engineering` Namespace and deletes the ones which have failed results, but only if the failure was first observed more than, in this case, one hour ago. The cleanup runs on the tenth minute of every hour, so the effective grace period is between one and two hours: a failure first recorded at 10:30 is only 40 minutes old at the 11:10 run, but an hour and 40 minutes old at the 12:10 run. You'll almost certainly want to increase this grace period if you adopt this process "for real", but in testing you probably don't want to wait, for example, a week or two to see the effects.
```yaml
apiVersion: kyverno.io/v2beta1
kind: ClusterCleanupPolicy
metadata:
  name: clean-violating-resources
spec:
  match:
    any:
    - resources:
        kinds:
        - Deployment
        namespaces:
        - engineering
  conditions:
    all:
    - key: '{{ target.metadata.annotations."corp.io/hasfailedresults" || "" }}'
      operator: Equals
      value: "true"
    - key: "{{ time_since('','{{ target.metadata.annotations.\"corp.io/lastfailedtime\" }}','') || '' }}"
      operator: GreaterThan
      value: 1h
  schedule: "10 * * * *"
```
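If you want to watch the whole flow end to end in a test cluster, a Deployment like the one below makes an easy guinea pig. This assumes you already have at least one Audit-mode validate policy it fails, for example the `require-run-as-nonroot` policy mentioned earlier; the name and image are just examples:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: bad-nginx
  namespace: engineering
spec:
  replicas: 1
  selector:
    matchLabels:
      app: bad-nginx
  template:
    metadata:
      labels:
        app: bad-nginx
    spec:
      containers:
      - name: nginx
        # No securityContext, so it should fail a policy like require-run-as-nonroot.
        image: nginx:latest
```

After the next background scan adds the annotations, and once the grace period lapses, the cleanup controller should remove it at its next scheduled run.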
Lastly, because these two annotations are sensitive, it's best to add a validate rule ensuring that only Kyverno's ServiceAccounts have permission to manipulate them. The policy below does just that.
```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: protect-annotation
spec:
  validationFailureAction: Enforce
  background: false
  rules:
  - name: protect-hasfailedresults
    match:
      any:
      - resources:
          kinds:
          - Deployment
          operations:
          - UPDATE
          namespaces:
          - engineering
    exclude:
      any:
      - subjects:
        - kind: ServiceAccount
          name: kyverno-?*
          namespace: kyverno
    validate:
      message: "Removing or modifying this annotation is prohibited."
      deny:
        conditions:
          any:
          - key: "{{request.object.metadata.annotations.\"corp.io/hasfailedresults\" || 'empty' }}"
            operator: NotEquals
            value: "{{request.oldObject.metadata.annotations.\"corp.io/hasfailedresults\" || 'empty' }}"
          - key: "{{request.object.metadata.annotations.\"corp.io/lastfailedtime\" || 'empty' }}"
            operator: NotEquals
            value: "{{request.oldObject.metadata.annotations.\"corp.io/lastfailedtime\" || 'empty' }}"
```
With this combination of policies, you now have a neat, automated cleanup system for resources that fail your policies. There's plenty more you could do here, but I hope you find this useful.