One-Time Pass Codes for Kyverno
In real life, imposed rules often have cases where exceptions may be required but on a case-by-case basis. Policy is really no different here. While prevention of objectively "bad" behavior should be commonplace and enforced as widely as possible, there are valid situations where the rule may need to be bent slightly. I've covered how some of these exceptions work in Kyverno in the past, but I also wanted to explore the possibility of creating some sort of "self-driving" exception system even if just conceptual in nature. In this blog, I'll share a fun little concept project I concocted on how to use Kyverno to implement a one-time pass code system for allowing these exceptions. It's probably not highly practical, but it does give you a sense of what's possible and just how powerful and flexible Kyverno can be to deliver even semi-crazy use cases like this one.
Chances are high you're using some sort of validation policies in your cluster if you're reading this article. And chances are also pretty high that at least one of those policies is in Enforce mode which, as you probably know, will prevent a "bad" resource from being created should it violate one or more rules in the policy. There are a couple of ways to provide exceptions in Kyverno. One of those is to define an exclude block in a rule and list them there. Another is to define them centrally in another Kubernetes resource like a ConfigMap. And yet another still is to use the formal PolicyException resource introduced in Kyverno 1.9. These are all really useful mechanisms that you should try to employ. But what if in some situations you just wanted to be mostly hands off and provide somewhat looser control? What if you could just let developers and other users know how they can get around policy but still with some form of an access system? I thought I'd play around with that idea a bit and wanted to see if I could do something like a one-time pass code system for Kyverno. It turns out that because of the amazing flexibility and power of Kyverno, not only can this be done but it really wasn't that difficult!
At the end of the day, the idea is this: provide a unique one-time pass code (OTP) back to a user if their resource is blocked by a validate rule, but ensure that the code and its use are documented so they can be audited. And, obviously, prevent any code from being used more than once.
With a combination of a couple different Kyverno policies which use both validation and mutation for existing resources, this is all possible. The full sequence of how I wanted this to work is shown below.
And here's how to put this together.
First, we'll need a Namespace I'm calling platform in which to put our ConfigMap used as the OTP journal. Obviously, in a case where, for some reason, you wanted to implement this in a "real" environment, you'd absolutely want to protect this with RBAC so users can't read it. This ConfigMap has a key called codes with just some starter codes to give you an idea of the formatting and sample contents.
apiVersion: v1
kind: ConfigMap
metadata:
  name: otp
  namespace: platform
data:
  codes: |-
    - ua8v92pg
    - 9akvm2o7
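The Namespace and the RBAC protection mentioned above aren't spelled out, so here is a minimal sketch of what they might look like. The Role and RoleBinding name otp-journal-reader and the platform-admins group are hypothetical placeholders, not part of the original setup. Since Kubernetes RBAC is purely additive, "protecting" the journal mostly means not granting ordinary users read access to ConfigMaps in platform and granting it only to whoever audits the codes.

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: platform
---
# Hypothetical Role: read-only access to just the OTP journal ConfigMap.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: otp-journal-reader
  namespace: platform
rules:
- apiGroups: [""]
  resources: ["configmaps"]
  resourceNames: ["otp"]
  verbs: ["get"]
---
# Hypothetical binding: only members of the (assumed) platform-admins group can read it.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: otp-journal-reader
  namespace: platform
subjects:
- kind: Group
  name: platform-admins
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: otp-journal-reader
  apiGroup: rbac.authorization.k8s.io
```

Depending on how Kyverno is installed, its own service accounts may also need permission to update ConfigMaps in this Namespace for the mutate rules used later.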
Next, we need to create the validation rules. There are two rules going on in this policy.
- The invalid-otp rule is universal and not tied to any specific rule or other policy. It simply checks, on creation of Deployments which have the otp label set, that the code is valid and hasn't already been consumed. This will come into play later.
- The host-namespaces-otp rule is just an existing rule from the Pod Security Standards set of Kyverno policies which has been slightly modified to look up codes from the ConfigMap mentioned earlier. You'll see that the OTP code is actually created in the message field of this rule. This is important because in the next phase, we'll harvest this information to be the input driver for the ConfigMap.
Also, notice how I've used spec.applyRules: One in this policy and ordered the rules such that invalid-otp is first. This is to prevent creation of yet another OTP if a user specifies either an invalid code or one which has already been consumed. Although OTP codes will be generated automatically any time there is a Deployment which fails the host-namespaces-otp rule, we only want a code to be generated when they aren't trying to specify one in the first place.
Below is the full validation policy.
apiVersion: kyverno.io/v2beta1
kind: ClusterPolicy
metadata:
  name: disallow-host-namespaces-otp
spec:
  validationFailureAction: Enforce
  background: false
  applyRules: One
  rules:
  - name: invalid-otp
    match:
      any:
      - resources:
          kinds:
          - Deployment
          operations:
          - CREATE
          selector:
            matchLabels:
              otp: "?*"
    context:
    - name: otp
      configMap:
        name: otp
        namespace: platform
    preconditions:
      all:
      - key: "{{ request.object.metadata.labels.otp }}"
        operator: AnyNotIn
        value: "{{ parse_yaml(otp.data.codes) }}"
    validate:
      message: The code {{ request.object.metadata.labels.otp }} is invalid or has already been used.
      deny: {}
  - name: host-namespaces-otp
    match:
      any:
      - resources:
          kinds:
          - Deployment
          operations:
          - CREATE
    context:
    - name: otp
      configMap:
        name: otp
        namespace: platform
    preconditions:
      all:
      - key: "{{ request.object.metadata.labels.otp || '' }}"
        operator: AnyNotIn
        value: "{{ parse_yaml(otp.data.codes) }}"
    validate:
      message: >-
        Sharing the host namespaces is disallowed. The fields spec.hostNetwork,
        spec.hostIPC, and spec.hostPID must be unset or set to `false`. To get around this,
        you may use a one-time pass code "{{ random('[0-9a-z]{8}') }}" assigned as the value of
        a label with key "otp". Use of this code will be recorded along with your username.
      pattern:
        spec:
          template:
            spec:
              =(hostPID): false
              =(hostIPC): false
              =(hostNetwork): false
The net effect here is if a user tries to create a "bad" Deployment which violates the host-namespaces-otp rule, it'll block them but return a message containing the OTP code and how to use it. Notice also how I'm warning in the message that, if you use this code, it'll be recorded for audit purposes.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: busybox
  namespace: default
  labels:
    app: busybox
spec:
  replicas: 1
  selector:
    matchLabels:
      app: busybox
  template:
    metadata:
      labels:
        app: busybox
    spec:
      hostIPC: true
      containers:
      - image: busybox:1.28
        name: busybox
        command: ["sleep", "9999"]
$ kubectl apply -f baddeploy.yaml
Error from server: error when creating "baddeploy.yaml": admission webhook "validate.kyverno.svc-fail" denied the request:

resource Deployment/default/busybox was blocked due to the following policies

disallow-host-namespaces-otp:
  host-namespaces-otp: 'validation error: Sharing the host namespaces is disallowed.
    The fields spec.hostNetwork, spec.hostIPC, and spec.hostPID must be unset or set
    to `false`. To get around this, you may use a one-time pass code "ee4co4k8" assigned
    as the value of a label with key "otp". Use of this code will be recorded along
    with your username. rule host-namespaces-otp failed at path /spec/template/spec/hostIPC/'
Next, we need to implement the ConfigMap management system so that OTP codes are added when they need to be and removed upon first use. This was the fun part. Let me explain how this works.
First, in the add-otp rule, in order to dynamically add the OTP codes to the ConfigMap, we're parsing them out of the Event Kyverno generates whenever there's a blocked resource. This Event, which is just a standard Kubernetes v1 Event, contains the message which contains the OTP we saw earlier. Since Kyverno can match on these Events (you will need to update your resource filter to allow this), we can use that specific Event as the trigger for a mutate-existing rule on our ConfigMap.
Note: if you remove the Event resource filter you will increase the processing load on Kyverno which will, in turn, require more resources.
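As a rough sketch of that resource filter change: in a default installation the filters typically live in the resourceFilters key of Kyverno's own ConfigMap, assumed here to be named kyverno in the kyverno Namespace. The entries shown are an abridged illustration, not the full default list, and their exact form varies between Kyverno versions. The only point is that the [Event,*,*] entry has been removed while every other entry stays as it was.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: kyverno        # assumed default name; check your installation
  namespace: kyverno   # assumed default Namespace
data:
  # Abridged, illustrative value: keep whatever entries you already have
  # and remove only [Event,*,*] so that Kyverno starts processing Events.
  resourceFilters: "[*/*,kube-system,*][*/*,kube-public,*][*/*,kube-node-lease,*][Node,*,*]"
```

If Kyverno was installed with Helm, the same change is usually better made through the chart's values so it survives upgrades.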
With this OTP code extracted from the message, we can append it to the ConfigMap.
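To make the extraction step concrete, here is roughly what the JMESPath expression used in the add-otp rule does with the message, written out as the YAML list that split() would produce. The message text is abridged, and whatever Kyverno places in front of the policy message inside the Event is not shown. What matters is that the code sits between the first pair of double quotes.

```yaml
# split(request.object.message, '"') applied to an abridged message such as
#   ...you may use a one-time pass code "ee4co4k8" assigned as the value of a label with key "otp". Use of...
# would yield a list along these lines:
- "...you may use a one-time pass code "
- "ee4co4k8"        # index 1, selected by `| [1]`, becomes the otp variable
- " assigned as the value of a label with key "
- "otp"
- ". Use of..."
```

This only holds as long as nothing earlier in the Event message contains a double quote, so it is worth keeping the policy message free of quotes ahead of the code.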
Second, in the manage-otp rule, we're watching for the creation of Deployments that set the otp label and, if that value is valid, we're modifying its entry in the ConfigMap to record the timestamp and username of the actor who consumed it. This serves a dual purpose: because this information has been appended, the entry no longer matches the bare code, so the code itself is invalidated. That's much better than simply deleting the code from the list.
Below is the second policy with both rules.
apiVersion: kyverno.io/v2beta1
kind: ClusterPolicy
metadata:
  name: manage-otp-list
spec:
  rules:
  - name: add-otp
    match:
      any:
      - resources:
          kinds:
          - v1/Event
          names:
          - "disallow-host-namespaces-otp.?*"
    preconditions:
      all:
      - key: "{{ request.object.reason }}"
        operator: Equals
        value: PolicyViolation
      - key: "{{ contains(request.object.message, 'one-time pass code') }}"
        operator: Equals
        value: true
    context:
    - name: otp
      variable:
        jmesPath: split(request.object.message,'"') | [1]
    mutate:
      targets:
      - apiVersion: v1
        kind: ConfigMap
        name: otp
        namespace: platform
      patchStrategicMerge:
        data:
          codes: |-
            {{ @ }}
            - {{ otp }}
  - name: manage-otp
    match:
      any:
      - resources:
          kinds:
          - Deployment
          operations:
          - CREATE
          selector:
            matchLabels:
              otp: "?*"
    context:
    - name: otp
      configMap:
        name: otp
        namespace: platform
    preconditions:
      all:
      - key: "{{ request.object.metadata.labels.otp }}"
        operator: AnyIn
        value: "{{ parse_yaml(otp.data.codes) }}"
    mutate:
      targets:
      - apiVersion: v1
        kind: ConfigMap
        name: otp
        namespace: platform
        context:
        - name: used
          variable:
            jmesPath: replace_all(target.data.codes,'{{request.object.metadata.labels.otp}}','{{request.object.metadata.labels.otp}}-{{time_now_utc()}}-{{request.userInfo.username}}')
      patchStrategicMerge:
        data:
          codes: |-
            {{ used }}
Try it out with a Deployment which uses a code issued this way (1t1h360g in this example).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: busybox
  namespace: default
  labels:
    app: busybox
    otp: 1t1h360g
spec:
  replicas: 1
  selector:
    matchLabels:
      app: busybox
  template:
    metadata:
      labels:
        app: busybox
    spec:
      hostIPC: true
      containers:
      - image: busybox:1.28
        name: busybox
        command: ["sleep", "9999"]
When a valid code is consumed, Kyverno will update the ConfigMap to transform this
apiVersion: v1
kind: ConfigMap
metadata:
  name: otp
  namespace: platform
data:
  codes: |-
    - ua8v92pg
    - 9akvm2o7
    - 1t1h360g
into this
apiVersion: v1
kind: ConfigMap
metadata:
  name: otp
  namespace: platform
data:
  codes: |-
    - ua8v92pg
    - 9akvm2o7
    - 1t1h360g-2023-06-21T15:04:59Z-czoller
Alright, let's try it out end-to-end and see this whole thing work!
Create a "bad" Deployment.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: busybox
  namespace: default
  labels:
    app: busybox
spec:
  replicas: 1
  selector:
    matchLabels:
      app: busybox
  template:
    metadata:
      labels:
        app: busybox
    spec:
      hostIPC: true
      containers:
      - image: busybox:1.28
        name: busybox
        command: ["sleep", "9999"]
$ kubectl apply -f baddeploy.yaml
Error from server: error when creating "baddeploy.yaml": admission webhook "validate.kyverno.svc-fail" denied the request:

resource Deployment/default/busybox was blocked due to the following policies

disallow-host-namespaces-otp:
  host-namespaces-otp: 'validation error: Sharing the host namespaces is disallowed.
    The fields spec.hostNetwork, spec.hostIPC, and spec.hostPID must be unset or set
    to `false`. To get around this, you may use a one-time pass code "uq1s17g8" assigned
    as the value of a label with key "otp". Use of this code will be recorded along
    with your username. rule host-namespaces-otp failed at path /spec/template/spec/hostIPC/'
Let's use the code uq1s17g8 just provided.
I'll take the same "bad" Deployment and add that as the value of a label called otp.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: busybox
  namespace: default
  labels:
    app: busybox
    otp: uq1s17g8
spec:
  replicas: 1
  selector:
    matchLabels:
      app: busybox
  template:
    metadata:
      labels:
        app: busybox
    spec:
      hostIPC: true
      containers:
      - image: busybox:1.28
        name: busybox
        command: ["sleep", "9999"]
$ kubectl apply -f baddeploy.yaml
deployment.apps/busybox created
Let's ensure someone cannot use this same code a second time, so we'll delete the Deployment we just created.
$ kubectl delete deploy busybox
deployment.apps "busybox" deleted
And try to create the same exact Deployment once again.
$ kubectl apply -f baddeploy.yaml
Error from server: error when creating "baddeploy.yaml": admission webhook "validate.kyverno.svc-fail" denied the request:

resource Deployment/default/busybox was blocked due to the following policies

disallow-host-namespaces-otp:
  invalid-otp: The code uq1s17g8 is invalid or has already been used.
There you can see that the same code uq1s17g8 is now flagged as invalid since it was used once before.
As a privileged cluster admin, we can also check our otp ConfigMap and see who used a code and when.
$ kubectl -n platform get cm otp -o yaml
apiVersion: v1
data:
  codes: |-
    - ua8v92pg
    - 9akvm2o7
    - 1t1h360g-2023-06-21T15:04:59Z-czoller
    - uq1s17g8-2023-06-21T15:10:18Z-jdoe
kind: ConfigMap
metadata:
  annotations:
    policies.kyverno.io/last-applied-patches: |
      manage-otp.manage-otp-list.kyverno.io: replaced /data/codes
  creationTimestamp: "2023-06-20T13:01:27Z"
  name: otp
  namespace: platform
  resourceVersion: "5147565"
  uid: ed2cce4e-6cf4-4309-b2cc-a2c45493ef4e
And there you have it, your very own OTP system for Kyverno which is self-managed and allows for auditing.
Even though this concept probably isn't very practical to use in the real world, I had fun just experimenting with the idea to see if it was possible. Who knows, maybe some of you out there can even use this!