Kubernetes Audit Logging
Overview
Kubernetes audit logging provides a chronological record of all API server requests, capturing who did what, when, and to which resources. It is one of the most critical security tools in a Kubernetes cluster and a high-priority topic on the CKS exam.
CKS Exam Relevance
Audit logging is one of the most commonly tested topics in Domain 5. You should be able to:
- Write an audit policy from scratch
- Configure the API server to use audit logging
- Read and interpret audit log entries
- Identify suspicious activity from audit logs
How Audit Logging Works
Every request to the Kubernetes API server passes through an audit pipeline. The audit system records each request at one or more stages and captures details based on the configured audit level.
Audit Levels
The audit level determines how much detail is recorded for each event. Levels are evaluated per-rule in the audit policy.
| Level | What is Recorded | Use Case |
|---|---|---|
| None | Nothing -- the event is skipped entirely | Reduce noise for known-safe requests |
| Metadata | Request metadata only (user, timestamp, resource, verb) -- no request or response body | General auditing without excessive data |
| Request | Metadata + request body (but not response body) | Track what was submitted to the API |
| RequestResponse | Metadata + request body + response body | Full forensic detail for sensitive operations |
Performance Impact
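RequestResponse generates significantly more data than Metadata. Use it selectively for sensitive operations such as RBAC changes. Be cautious with Secrets and ConfigMaps that hold sensitive data: at Request or RequestResponse level their contents are written to the audit log, so Metadata is usually the safer choice for them. Using RequestResponse globally can overwhelm storage and degrade API server performance.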
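To make the volume difference concrete, here is a minimal sketch that trims one event to each level and compares the serialized size. The event layout, the `LEVEL_PARTS` mapping, and the 1000-character payloads are hypothetical stand-ins chosen for illustration; real audit events have a richer structure.

```python
import json

# Top-level parts kept at each audit level, following the table above.
# "metadata" stands in for user, timestamp, resource, verb, and so on.
LEVEL_PARTS = {
    "None": [],
    "Metadata": ["metadata"],
    "Request": ["metadata", "requestObject"],
    "RequestResponse": ["metadata", "requestObject", "responseObject"],
}


def record_at_level(event, level):
    """Return the parts of a full event that a level keeps, or None if skipped."""
    parts = LEVEL_PARTS[level]
    if not parts:
        return None  # level None: nothing is written
    return {part: event[part] for part in parts}


def recorded_size(event, level):
    """Size in characters of the JSON that would be written (0 if skipped)."""
    record = record_at_level(event, level)
    return 0 if record is None else len(json.dumps(record))


# Hypothetical event; the payload sizes are made up for illustration.
full_event = {
    "metadata": {"user": "admin", "verb": "create", "resource": "secrets"},
    "requestObject": {"data": "x" * 1000},
    "responseObject": {"data": "x" * 1000, "status": "created"},
}

for level in LEVEL_PARTS:
    print(f"{level:16} {recorded_size(full_event, level)} characters")
```

Each step up adds a full object body, which is why a global RequestResponse rule grows the log so quickly.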
Level Comparison Example
For a kubectl create secret command:
None: (nothing recorded)
Metadata: user=admin, verb=create, resource=secrets, namespace=default, timestamp=...
Request: Above + the full Secret object submitted (including its data)
RequestResponse: Above + the API server's response (the created Secret object)
Audit Stages
Each API request passes through up to four stages. The audit policy can specify which stages to capture, though the default captures all applicable stages.
| Stage | When it Fires | Description |
|---|---|---|
| RequestReceived | As soon as the request arrives | Before any processing. Always fires. |
| ResponseStarted | After response headers are sent but before body | Only for long-running requests (watch, exec, port-forward) |
| ResponseComplete | After the full response is sent | The most common stage for completed requests |
| Panic | When the API server panics | Error handling -- indicates a server bug |
Exam Tip
For the CKS exam, you will most commonly work with RequestReceived and ResponseComplete stages. The ResponseStarted stage is only relevant for long-running requests like kubectl exec or kubectl logs -f.
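The stage table can be sketched as a small function. This is an illustrative model, not API server code: `stages_recorded` and its parameters are hypothetical names, and the Panic stage is left out because it only appears when the server fails instead of completing the request.

```python
def stages_recorded(long_running, omit_stages=()):
    """Stages that produce an audit event for a request that completes normally."""
    stages = ["RequestReceived"]
    if long_running:
        # Only long-running requests (watch, exec, port-forward) reach this stage.
        stages.append("ResponseStarted")
    stages.append("ResponseComplete")
    # Stages named in omitStages are dropped from the output.
    return [stage for stage in stages if stage not in omit_stages]


print(stages_recorded(long_running=False))
print(stages_recorded(long_running=True, omit_stages=["RequestReceived"]))
```

With `omit_stages=["RequestReceived"]`, a short request yields a single event per request, which is the usual reason for omitting that stage.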
Writing Audit Policy Rules
The audit policy is a YAML file that defines what to audit and at what level. Rules are evaluated top-down -- the first matching rule wins.
Audit Policy Structure
apiVersion: audit.k8s.io/v1
kind: Policy
# Stages listed under omitStages are skipped for every rule.
# Leave omitStages out entirely to record all stages.
omitStages:
  - "RequestReceived"
rules:
  # Each rule specifies:
  # - level: None, Metadata, Request, or RequestResponse
  # - resources: what API resources to match
  # - verbs: what operations to match
  # - users/userGroups: who to match
  # - namespaces: which namespaces to match
  - level: <level>
    resources:
      - group: "" # core API group
        resources: ["pods"]
    verbs: ["create", "update", "delete"]
    users: ["system:admin"]
    namespaces: ["production"]
Rule Matching Logic
Rules are evaluated in order, and the first matching rule determines the audit level. If no rule matches, the request is not logged, which is why a policy usually ends with a catch-all rule.
Critical Rule
Always include a catch-all rule at the end of your policy. Without one, events that do not match any rule will not be logged at all.
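First-match-wins evaluation can be sketched as follows. This is a simplified model under stated assumptions: rules and requests are plain dictionaries, and the hypothetical `rule_matches` helper only handles users, verbs, namespaces, and resources. The real matcher also covers userGroups, nonResourceURLs, sub-resources, and wildcards.

```python
def rule_matches(rule, req):
    """Simplified matcher: a field that is absent from the rule matches anything."""
    if "users" in rule and req["user"] not in rule["users"]:
        return False
    if "verbs" in rule and req["verb"] not in rule["verbs"]:
        return False
    if "namespaces" in rule and req["namespace"] not in rule["namespaces"]:
        return False
    if "resources" in rule:
        return any(
            req["group"] == entry["group"] and req["resource"] in entry["resources"]
            for entry in rule["resources"]
        )
    return True


def audit_level(rules, req):
    """Return the level of the first matching rule, or "None" when nothing matches."""
    for rule in rules:
        if rule_matches(rule, req):
            return rule["level"]
    return "None"  # no rule matched: the request is not logged


rules = [
    {"level": "None", "users": ["system:kube-proxy"], "verbs": ["watch"]},
    {"level": "Metadata", "resources": [{"group": "", "resources": ["secrets"]}]},
]
catch_all = {"level": "Metadata"}  # no other fields, so it matches everything

secret_get = {"user": "alice", "verb": "get", "group": "",
              "resource": "secrets", "namespace": "default"}
pod_delete = {"user": "alice", "verb": "delete", "group": "",
              "resource": "pods", "namespace": "default"}
proxy_watch = {"user": "system:kube-proxy", "verb": "watch", "group": "",
               "resource": "endpoints", "namespace": "default"}

print(audit_level(rules, secret_get))                 # Metadata
print(audit_level(rules, pod_delete))                 # None: nothing matched
print(audit_level(rules + [catch_all], pod_delete))   # Metadata
print(audit_level([catch_all] + rules, proxy_watch))  # Metadata: catch-all placed first shadows the None rule
```

The last call shows why ordering matters: a catch-all placed first matches every request, so the more specific rules after it never apply.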
Comprehensive Audit Policy Example
apiVersion: audit.k8s.io/v1
kind: Policy
# Do not log RequestReceived stage to reduce volume
omitStages:
  - "RequestReceived"
rules:
  # Rule 1: Do not log read-only requests to the following resources
  # These generate high volume and low security value
  - level: None
    resources:
      - group: ""
        resources: ["endpoints", "services", "services/status"]
    verbs: ["get", "watch", "list"]
  # Rule 2: Do not log watch requests by kube-proxy
  - level: None
    users: ["system:kube-proxy"]
    verbs: ["watch"]
  # Rule 3: Do not log requests to certain non-resource URLs
  - level: None
    nonResourceURLs:
      - "/api*"
      - "/version"
      - "/healthz*"
      - "/readyz*"
      - "/livez*"
  # Rule 4: Log Secret access at Metadata level
  # Do NOT use Request or RequestResponse for Secrets -- they would log secret values
  - level: Metadata
    resources:
      - group: ""
        resources: ["secrets"]
  # Rule 5: Log ConfigMap and PersistentVolume changes at Request level
  - level: Request
    resources:
      - group: ""
        resources: ["configmaps", "persistentvolumes"]
    verbs: ["create", "update", "patch", "delete"]
  # Rule 6: Log all changes to RBAC resources at RequestResponse level
  - level: RequestResponse
    resources:
      - group: "rbac.authorization.k8s.io"
        resources: ["roles", "rolebindings", "clusterroles", "clusterrolebindings"]
    verbs: ["create", "update", "patch", "delete"]
  # Rule 7: Log pod exec, attach, and port-forward at Metadata level
  - level: Metadata
    resources:
      - group: ""
        resources: ["pods/exec", "pods/attach", "pods/portforward"]
  # Rule 8: Log node and namespace changes at RequestResponse level
  - level: RequestResponse
    resources:
      - group: ""
        resources: ["nodes", "namespaces"]
    verbs: ["create", "update", "patch", "delete"]
  # Rule 9: Log all ServiceAccount token requests at Metadata level
  - level: Metadata
    resources:
      - group: ""
        resources: ["serviceaccounts/token"]
  # Rule 10: Catch-all -- log everything else at Metadata level
  - level: Metadata
    omitStages:
      - "RequestReceived"
Exam Tip
In the exam, you might be asked to create a policy that logs specific resources at specific levels. Pay close attention to:
- The API group ("" for core, "rbac.authorization.k8s.io" for RBAC, etc.)
- Sub-resources like pods/exec and pods/log
- The verb list -- get, list, watch, create, update, patch, delete
Configuring the API Server for Audit Logging
To enable audit logging, you must modify the kube-apiserver static pod manifest and add the appropriate flags, volume mounts, and host path volumes.
Step 1: Create the Audit Policy File
Save your audit policy to a file on the control plane node:
# Create the directory for audit files
sudo mkdir -p /etc/kubernetes/audit
# Write the audit policy
sudo vi /etc/kubernetes/audit/policy.yaml
Step 2: Create the Audit Log Directory
sudo mkdir -p /var/log/kubernetes/audit
Step 3: Modify the API Server Manifest
Edit /etc/kubernetes/manifests/kube-apiserver.yaml:
apiVersion: v1
kind: Pod
metadata:
  name: kube-apiserver
  namespace: kube-system
spec:
  containers:
    - name: kube-apiserver
      command:
        - kube-apiserver
        # ... existing flags ...
        # Add these audit flags:
        - --audit-policy-file=/etc/kubernetes/audit/policy.yaml
        - --audit-log-path=/var/log/kubernetes/audit/audit.log
        - --audit-log-maxage=30
        - --audit-log-maxbackup=10
        - --audit-log-maxsize=100
      volumeMounts:
        # ... existing volume mounts ...
        # Add these mounts:
        - name: audit-policy
          mountPath: /etc/kubernetes/audit/policy.yaml
          readOnly: true
        - name: audit-log
          mountPath: /var/log/kubernetes/audit
  volumes:
    # ... existing volumes ...
    # Add these volumes:
    - name: audit-policy
      hostPath:
        path: /etc/kubernetes/audit/policy.yaml
        type: File
    - name: audit-log
      hostPath:
        path: /var/log/kubernetes/audit
        type: DirectoryOrCreate
API Server Audit Flags Reference
| Flag | Description | Example |
|---|---|---|
| --audit-policy-file | Path to the audit policy file | /etc/kubernetes/audit/policy.yaml |
| --audit-log-path | Path to the audit log file | /var/log/kubernetes/audit/audit.log |
| --audit-log-maxage | Max days to retain old log files | 30 |
| --audit-log-maxbackup | Max number of old log files to keep | 10 |
| --audit-log-maxsize | Max size in MB before log rotation | 100 |
| --audit-webhook-config-file | Path to webhook backend config | /etc/kubernetes/audit/webhook.yaml |
| --audit-webhook-batch-max-wait | Max time to wait before sending webhook batch | 5s |
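The rotation flags together put a rough ceiling on disk usage. The sketch below is a back-of-the-envelope estimate, assuming `--audit-log-maxbackup` is greater than zero, files rotate exactly at `--audit-log-maxsize`, and rotated files are stored uncompressed. The function name is hypothetical.

```python
def max_audit_log_mb(maxsize_mb, maxbackup):
    """Rough upper bound in MB: the active file plus the retained rotated files."""
    return maxsize_mb * (maxbackup + 1)


# Example values from the table above: 100 MB per file, 10 old files kept.
print(max_audit_log_mb(100, 10))  # 1100
```

With the example values, the log directory needs about 1.1 GB of headroom. Files older than `--audit-log-maxage` days are removed as well, so actual usage can be lower.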
Critical Steps -- Do Not Skip
When configuring audit logging on the API server:
- Both the policy file and the log directory must be mounted as volumes
- The policy file mount must be readOnly: true
- The log directory mount should NOT be readOnly
- After saving the manifest, wait for the API server to restart (kubelet detects the change)
- Verify with kubectl get pods -n kube-system -- if the API server does not come back, check your YAML syntax
Step 4: Verify Audit Logging
# Wait for API server to restart
kubectl get nodes
# Check that the audit log file is being written
ls -la /var/log/kubernetes/audit/audit.log
# View recent audit events
tail -5 /var/log/kubernetes/audit/audit.log | jq .
Audit Backends
Log Backend
The log backend writes audit events to a JSON log file on the node's filesystem. This is the most common backend and the one you will use in the CKS exam.
# API server flags for log backend
--audit-policy-file=/etc/kubernetes/audit/policy.yaml
--audit-log-path=/var/log/kubernetes/audit/audit.log
--audit-log-maxage=30
--audit-log-maxbackup=10
--audit-log-maxsize=100
Webhook Backend
The webhook backend sends audit events to an external HTTP endpoint (e.g., a SIEM system, Elasticsearch, or custom collector).
# Webhook configuration file (/etc/kubernetes/audit/webhook.yaml)
apiVersion: v1
kind: Config
clusters:
  - name: audit-webhook
    cluster:
      server: https://audit-collector.example.com/audit
      certificate-authority: /etc/kubernetes/audit/ca.crt
contexts:
  - name: audit-webhook
    context:
      cluster: audit-webhook
current-context: audit-webhook
# API server flags for webhook backend
--audit-webhook-config-file=/etc/kubernetes/audit/webhook.yaml
--audit-webhook-batch-max-wait=5s
INFO
You can use both backends simultaneously. The log backend provides local storage for immediate investigation, while the webhook backend forwards events to a centralized logging system.
Reading and Interpreting Audit Logs
Audit Event Structure
Each audit event is a JSON object with the following key fields:
{
"kind": "Event",
"apiVersion": "audit.k8s.io/v1",
"level": "RequestResponse",
"auditID": "a0b1c2d3-e4f5-6789-abcd-ef0123456789",
"stage": "ResponseComplete",
"requestURI": "/api/v1/namespaces/default/secrets",
"verb": "create",
"user": {
"username": "system:admin",
"groups": ["system:masters", "system:authenticated"]
},
"sourceIPs": ["192.168.1.100"],
"userAgent": "kubectl/v1.28.0",
"objectRef": {
"resource": "secrets",
"namespace": "default",
"name": "my-secret",
"apiVersion": "v1"
},
"responseStatus": {
"metadata": {},
"code": 201
},
"requestObject": { "...": "..." },
"responseObject": { "...": "..." },
"requestReceivedTimestamp": "2024-01-15T10:30:00.000000Z",
"stageTimestamp": "2024-01-15T10:30:00.050000Z"
}
Key Fields for Investigation
| Field | Purpose |
|---|---|
| verb | The API operation: get, list, watch, create, update, patch, delete |
| user.username | Who made the request |
| sourceIPs | Where the request came from |
| objectRef.resource | What resource was targeted |
| objectRef.namespace | In which namespace |
| objectRef.name | Specific resource name |
| responseStatus.code | HTTP response code (201=created, 403=forbidden, etc.) |
| requestObject | The full request body (only at Request/RequestResponse level) |
| responseObject | The full response body (only at RequestResponse level) |
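As a minimal sketch of how these fields are pulled out of a log entry, the function below parses one line of the log file and returns only the investigation fields. It assumes each line holds one JSON event, as the log backend writes them. The `summarize` helper and the sample line are illustrative, with values borrowed from the example event above.

```python
import json


def summarize(line):
    """Extract the key investigation fields from one audit log line."""
    event = json.loads(line)
    ref = event.get("objectRef") or {}  # absent for non-resource requests
    status = event.get("responseStatus") or {}
    return {
        "user": event.get("user", {}).get("username"),
        "verb": event.get("verb"),
        "resource": ref.get("resource"),
        "namespace": ref.get("namespace"),
        "name": ref.get("name"),
        "code": status.get("code"),
        "sourceIPs": event.get("sourceIPs", []),
    }


# Sample line modeled on the example event above (Metadata level, so no bodies).
sample_line = (
    '{"kind":"Event","apiVersion":"audit.k8s.io/v1","level":"Metadata",'
    '"stage":"ResponseComplete","verb":"create",'
    '"user":{"username":"system:admin","groups":["system:masters"]},'
    '"sourceIPs":["192.168.1.100"],'
    '"objectRef":{"resource":"secrets","namespace":"default",'
    '"name":"my-secret","apiVersion":"v1"},'
    '"responseStatus":{"metadata":{},"code":201},'
    '"requestReceivedTimestamp":"2024-01-15T10:30:00.000000Z"}'
)

print(summarize(sample_line))
```

Using `.get` with defaults keeps the function from failing on events that lack `objectRef`, such as requests to non-resource URLs.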
Common jq Queries for Audit Logs
# Find all Secret access
cat audit.log | jq 'select(.objectRef.resource=="secrets")'
# Find all failed requests (403 Forbidden)
cat audit.log | jq 'select(.responseStatus.code==403)'
# Find all delete operations
cat audit.log | jq 'select(.verb=="delete")'
# Find requests from a specific user
cat audit.log | jq 'select(.user.username=="badactor")'
# Find pod exec events
cat audit.log | jq 'select(.objectRef.subresource=="exec")'
# Find all requests from a specific source IP
cat audit.log | jq 'select(any(.sourceIPs[]?; . == "10.0.0.5"))'
# Find ServiceAccount token creation
cat audit.log | jq 'select(.objectRef.resource=="serviceaccounts" and .objectRef.subresource=="token")'
# Find RBAC changes
# (the // "" fallback avoids a jq error on events that have no objectRef)
cat audit.log | jq 'select((.objectRef.resource // "") | test("^(cluster)?role(binding)?s$"))'
# Count events by verb
cat audit.log | jq -s 'group_by(.verb) | map({verb: .[0].verb, count: length})'
Exam Tip
In the exam, you will likely need to use jq to filter audit logs. Practice these queries beforehand. The most common exam scenarios are:
- "Find which user created a specific resource"
- "Identify all operations on Secrets in a namespace"
- "Determine what changes were made to RBAC"
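The same kind of filtering can be scripted when jq is not the preferred tool. The sketch below answers "who did X to resource Y, and when" over an iterable of log lines. The `find_actors` function and the sample events are hypothetical, and only ResponseComplete events are counted so that each request appears once.

```python
import json


def find_actors(lines, verb, resource, namespace=None, name=None):
    """Return (timestamp, username) pairs for completed requests that match."""
    results = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        ref = event.get("objectRef") or {}
        if event.get("stage") != "ResponseComplete":
            continue
        if event.get("verb") != verb or ref.get("resource") != resource:
            continue
        if namespace is not None and ref.get("namespace") != namespace:
            continue
        if name is not None and ref.get("name") != name:
            continue
        results.append((event.get("requestReceivedTimestamp"),
                        event.get("user", {}).get("username")))
    return results


# Hypothetical events, serialized one per line like the audit log file.
deployment_ref = {"resource": "deployments", "namespace": "production", "name": "web"}
log_lines = [json.dumps(event) for event in [
    {"stage": "RequestReceived", "verb": "delete", "user": {"username": "dev-user"},
     "objectRef": deployment_ref, "requestReceivedTimestamp": "2024-01-15T10:30:00.000000Z"},
    {"stage": "ResponseComplete", "verb": "delete", "user": {"username": "dev-user"},
     "objectRef": deployment_ref, "requestReceivedTimestamp": "2024-01-15T10:30:00.000000Z"},
    {"stage": "ResponseComplete", "verb": "get", "user": {"username": "other-user"},
     "objectRef": deployment_ref, "requestReceivedTimestamp": "2024-01-15T10:31:00.000000Z"},
    {"stage": "ResponseComplete", "verb": "get", "user": {"username": "probe"},
     "requestURI": "/healthz"},
]]

print(find_actors(log_lines, "delete", "deployments", namespace="production"))
```

On a control plane node, the same function can be fed an open file handle for `/var/log/kubernetes/audit/audit.log` instead of the sample list.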
Common Exam Scenarios
Scenario 1: Enable Audit Logging
Task: Enable audit logging on the API server with a policy that logs:
- All Secret operations at Metadata level
- All namespace create/delete at RequestResponse level
- Everything else at Metadata level
Approach:
- Write the audit policy file
- Create the log directory
- Modify the API server manifest with flags and volume mounts
- Wait for API server restart and verify
Scenario 2: Investigate Suspicious Activity
Task: Someone deleted a Deployment in the production namespace. Find who did it and when using audit logs.
Approach:
cat /var/log/kubernetes/audit/audit.log | \
jq 'select(.objectRef.resource=="deployments" and
.objectRef.namespace=="production" and
.verb=="delete")'
Scenario 3: Create a Targeted Audit Policy
Task: Create a policy that captures full request/response for RBAC changes but only metadata for everything else.
Approach:
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  - level: RequestResponse
    resources:
      - group: "rbac.authorization.k8s.io"
        resources: ["roles", "rolebindings", "clusterroles", "clusterrolebindings"]
  - level: Metadata
Watch Out
The catch-all rule - level: Metadata with no other fields matches everything. This is the standard way to create a default rule. Do not forget it -- without a catch-all, unmatched events are silently dropped.
Audit Logging Checklist
Use this checklist when configuring audit logging in the exam:
- [ ] Audit policy file created at the specified path
- [ ] Audit policy has correct apiVersion: audit.k8s.io/v1 and kind: Policy
- [ ] Rules are in the correct order (most specific first, catch-all last)
- [ ] API server manifest includes --audit-policy-file flag
- [ ] API server manifest includes --audit-log-path flag
- [ ] Volume mount for the audit policy file (readOnly: true)
- [ ] Volume mount for the audit log directory
- [ ] Host path volume for the audit policy file (type: File)
- [ ] Host path volume for the audit log directory (type: DirectoryOrCreate)
- [ ] API server has restarted successfully
- [ ] Audit log file is being written to
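Part of this checklist can be checked mechanically. The sketch below assumes the relevant pieces of the manifest (the `command` list, `volumeMounts`, and `volumes`) have already been loaded into Python lists and dictionaries, and that container paths equal host paths as in the manifest shown earlier. The `audit_config_problems` function is a hypothetical helper, not part of any Kubernetes tooling.

```python
REQUIRED_FLAGS = ("--audit-policy-file", "--audit-log-path")


def audit_config_problems(command, volume_mounts, volumes):
    """Return a list of problems found; an empty list means the basics are in place."""
    problems = []
    flags = dict(arg.split("=", 1) for arg in command
                 if arg.startswith("--") and "=" in arg)
    mount_paths = [mount["mountPath"].rstrip("/") for mount in volume_mounts]
    volume_names = {volume["name"] for volume in volumes}

    for flag in REQUIRED_FLAGS:
        path = flags.get(flag)
        if path is None:
            problems.append(f"missing flag {flag}")
        elif not any(path == p or path.startswith(p + "/") for p in mount_paths):
            # The file the flag points to must be visible inside the container.
            problems.append(f"{flag}={path} is not covered by a volumeMount")

    for mount in volume_mounts:
        if mount["name"] not in volume_names:
            problems.append(f"volumeMount {mount['name']} has no matching volume")
    return problems


command = [
    "kube-apiserver",
    "--audit-policy-file=/etc/kubernetes/audit/policy.yaml",
    "--audit-log-path=/var/log/kubernetes/audit/audit.log",
]
volume_mounts = [
    {"name": "audit-policy", "mountPath": "/etc/kubernetes/audit/policy.yaml", "readOnly": True},
    {"name": "audit-log", "mountPath": "/var/log/kubernetes/audit"},
]
volumes = [
    {"name": "audit-policy",
     "hostPath": {"path": "/etc/kubernetes/audit/policy.yaml", "type": "File"}},
    {"name": "audit-log",
     "hostPath": {"path": "/var/log/kubernetes/audit", "type": "DirectoryOrCreate"}},
]

print(audit_config_problems(command, volume_mounts, volumes))      # []
print(audit_config_problems(command, volume_mounts[:1], volumes))  # log path not mounted
```

A forgotten log directory mount is a common reason the API server fails to come back after the manifest is edited, so it is worth catching before saving.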
Quick Reference
# Minimum API server flags needed:
--audit-policy-file=/etc/kubernetes/audit/policy.yaml
--audit-log-path=/var/log/kubernetes/audit/audit.log
# Minimum volume mounts needed:
# 1. Policy file mount (readOnly)
# 2. Log directory mount
# Minimum volumes needed:
# 1. Policy file hostPath (type: File)
# 2. Log directory hostPath (type: DirectoryOrCreate)