Kubernetes Audit Logging

Overview

Kubernetes audit logging provides a chronological record of all API server requests, capturing who did what, when, and to which resources. It is one of the most critical security tools in a Kubernetes cluster and a high-priority topic on the CKS exam.

CKS Exam Relevance

Audit logging is one of the most commonly tested topics in Domain 5. You should be able to:

  • Write an audit policy from scratch
  • Configure the API server to use audit logging
  • Read and interpret audit log entries
  • Identify suspicious activity from audit logs

How Audit Logging Works

Every request to the Kubernetes API server passes through an audit pipeline. The audit system records each request at one or more stages and captures details based on the configured audit level.

Audit Levels

The audit level determines how much detail is recorded for each event. Levels are evaluated per-rule in the audit policy.

Level | What is Recorded | Use Case
None | Nothing -- the event is skipped entirely | Reduce noise for known-safe requests
Metadata | Request metadata only (user, timestamp, resource, verb) -- no request or response body | General auditing without excessive data
Request | Metadata + request body (but not response body) | Track what was submitted to the API
RequestResponse | Metadata + request body + response body | Full forensic detail for sensitive operations

Performance Impact

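RequestResponse generates significantly more data than Metadata. Use it selectively for high-value objects such as RBAC resources. Be careful with Secrets and with ConfigMaps that hold sensitive data: at Request or RequestResponse level their contents are written to the audit log, so Metadata is usually the safer choice for them. Using RequestResponse globally can overwhelm storage and degrade API server performance.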

Level Comparison Example

For a kubectl create secret command:

None:              (nothing recorded)
Metadata:          user=admin, verb=create, resource=secrets, namespace=default, timestamp=...
Request:           Above + the full Secret object submitted (including its data)
RequestResponse:   Above + the API server's response (created Secret object)

Audit Stages

Each API request passes through up to four stages. The audit policy can specify which stages to capture, though the default captures all applicable stages.

Stage | When it Fires | Description
RequestReceived | As soon as the request arrives | Before any processing. Always fires.
ResponseStarted | After response headers are sent but before body | Only for long-running requests (watch, exec, port-forward)
ResponseComplete | After the full response is sent | The most common stage for completed requests
Panic | When the API server panics | Error handling -- indicates a server bug

Exam Tip

For the CKS exam, you will most commonly work with RequestReceived and ResponseComplete stages. The ResponseStarted stage is only relevant for long-running requests like kubectl exec or kubectl logs -f.

Writing Audit Policy Rules

The audit policy is a YAML file that defines what to audit and at what level. Rules are evaluated top-down -- the first matching rule wins.

Audit Policy Structure

yaml
apiVersion: audit.k8s.io/v1
kind: Policy
# omitStages lists stages to skip for every rule; remove it to record all stages
omitStages:
  - "RequestReceived"
rules:
  # Each rule specifies:
  # - level: None, Metadata, Request, or RequestResponse
  # - resources: what API resources to match
  # - verbs: what operations to match
  # - users/groups: who to match
  # - namespaces: which namespaces to match
  - level: <level>
    resources:
      - group: ""          # core API group
        resources: ["pods"]
    verbs: ["create", "update", "delete"]
    users: ["system:admin"]
    namespaces: ["production"]

Rule Matching Logic

Rules are evaluated in order. The first matching rule determines the audit level, and later rules are ignored for that request. If no rule matches, the event is not logged, which is why a policy should end with a catch-all rule.

Critical Rule

Always include a catch-all rule at the end of your policy. Without one, events that do not match any rule will not be logged at all.
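
The first-match behavior described above can be illustrated with a small shell sketch. This is a toy model, not the API server's implementation: the "resource=level" rule format, the ANY placeholder for a rule with no filters, and the audit_level helper are all hypothetical names invented for this example.

```shell
# Toy model of first-match rule evaluation (hypothetical format, see lead-in).
# Each entry is "resource=level"; ANY stands in for a rule with no filters.
RULES="secrets=Metadata configmaps=Request ANY=Metadata"

audit_level() {
  # $1 is the requested resource, e.g. "secrets".
  for rule in $RULES; do
    pattern="${rule%%=*}"   # text before the first "="
    level="${rule#*=}"      # text after the first "="
    if [ "$pattern" = "ANY" ] || [ "$pattern" = "$1" ]; then
      echo "$level"         # first match wins; later rules are never consulted
      return 0
    fi
  done
  echo "not logged"         # no rule matched and there is no catch-all
}

audit_level configmaps      # prints "Request"
audit_level pods            # prints "Metadata" (falls through to the catch-all)

# Moving a broad rule to the top shadows every rule below it.
RULES="ANY=None secrets=Metadata"
audit_level secrets         # prints "None"
```

The last call shows why ordering matters: once a broad rule sits first, the more specific Secret rule below it can never match.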

Comprehensive Audit Policy Example

yaml
apiVersion: audit.k8s.io/v1
kind: Policy

# Do not log RequestReceived stage to reduce volume
omitStages:
  - "RequestReceived"

rules:
  # Rule 1: Do not log read-only requests (get, watch, list) for these resources
  # They generate high volume and have low security value
  - level: None
    resources:
      - group: ""
        resources: ["endpoints", "services", "services/status"]
    verbs: ["get", "watch", "list"]

  # Rule 2: Do not log watch requests from kube-proxy
  - level: None
    users: ["system:kube-proxy"]
    verbs: ["watch"]

  # Rule 3: Do not log requests to these non-resource URLs
  - level: None
    nonResourceURLs:
      - "/api*"
      - "/version"
      - "/healthz*"
      - "/readyz*"
      - "/livez*"

  # Rule 4: Log Secret access at Metadata level
  # Do NOT use RequestResponse for Secrets -- it would log secret values
  - level: Metadata
    resources:
      - group: ""
        resources: ["secrets"]

  # Rule 5: Log ConfigMap and PersistentVolume changes at Request level
  - level: Request
    resources:
      - group: ""
        resources: ["configmaps", "persistentvolumes"]
    verbs: ["create", "update", "patch", "delete"]

  # Rule 6: Log all changes to RBAC resources at RequestResponse level
  - level: RequestResponse
    resources:
      - group: "rbac.authorization.k8s.io"
        resources: ["roles", "rolebindings", "clusterroles", "clusterrolebindings"]
    verbs: ["create", "update", "patch", "delete"]

  # Rule 7: Log pod exec, attach, and port-forward at Metadata level
  - level: Metadata
    resources:
      - group: ""
        resources: ["pods/exec", "pods/attach", "pods/portforward"]

  # Rule 8: Log node and namespace changes at RequestResponse level
  - level: RequestResponse
    resources:
      - group: ""
        resources: ["nodes", "namespaces"]
    verbs: ["create", "update", "patch", "delete"]

  # Rule 9: Log all ServiceAccount token requests at Metadata level
  - level: Metadata
    resources:
      - group: ""
        resources: ["serviceaccounts/token"]

  # Rule 10: Catch-all -- log everything else at Metadata level
  - level: Metadata
    omitStages:
      - "RequestReceived"

Exam Tip

In the exam, you might be asked to create a policy that logs specific resources at specific levels. Pay close attention to:

  • The API group ("" for core, "rbac.authorization.k8s.io" for RBAC, etc.)
  • Sub-resources like pods/exec and pods/log
  • The verb list -- get, list, watch, create, update, patch, delete

Configuring the API Server for Audit Logging

To enable audit logging, you must modify the kube-apiserver static pod manifest and add the appropriate flags, volume mounts, and host path volumes.

Step 1: Create the Audit Policy File

Save your audit policy to a file on the control plane node:

bash
# Create the directory for audit files
sudo mkdir -p /etc/kubernetes/audit

# Write the audit policy
sudo vi /etc/kubernetes/audit/policy.yaml

Step 2: Create the Audit Log Directory

bash
sudo mkdir -p /var/log/kubernetes/audit

Step 3: Modify the API Server Manifest

Edit /etc/kubernetes/manifests/kube-apiserver.yaml:

yaml
apiVersion: v1
kind: Pod
metadata:
  name: kube-apiserver
  namespace: kube-system
spec:
  containers:
  - name: kube-apiserver
    command:
    - kube-apiserver
    # ... existing flags ...
    # Add these audit flags:
    - --audit-policy-file=/etc/kubernetes/audit/policy.yaml
    - --audit-log-path=/var/log/kubernetes/audit/audit.log
    - --audit-log-maxage=30
    - --audit-log-maxbackup=10
    - --audit-log-maxsize=100
    volumeMounts:
    # ... existing volume mounts ...
    # Add these mounts:
    - name: audit-policy
      mountPath: /etc/kubernetes/audit/policy.yaml
      readOnly: true
    - name: audit-log
      mountPath: /var/log/kubernetes/audit
  volumes:
  # ... existing volumes ...
  # Add these volumes:
  - name: audit-policy
    hostPath:
      path: /etc/kubernetes/audit/policy.yaml
      type: File
  - name: audit-log
    hostPath:
      path: /var/log/kubernetes/audit
      type: DirectoryOrCreate

API Server Audit Flags Reference

Flag | Description | Example
--audit-policy-file | Path to the audit policy file | /etc/kubernetes/audit/policy.yaml
--audit-log-path | Path to the audit log file | /var/log/kubernetes/audit/audit.log
--audit-log-maxage | Max days to retain old log files | 30
--audit-log-maxbackup | Max number of old log files to keep | 10
--audit-log-maxsize | Max size in MB before log rotation | 100
--audit-webhook-config-file | Path to webhook backend config | /etc/kubernetes/audit/webhook.yaml
--audit-webhook-batch-max-wait | Max time to wait before sending webhook batch | 5s
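
The rotation flags together put a rough ceiling on disk usage. The sketch below works through the arithmetic for the example values, assuming rotated files are kept uncompressed and that --audit-log-maxage has not yet pruned any of them; the variable names are illustrative only.

```shell
# Values mirror the example flags above; adjust them to match your manifest.
maxsize_mb=100   # --audit-log-maxsize
maxbackup=10     # --audit-log-maxbackup

# One active file plus up to $maxbackup rotated files, each up to $maxsize_mb.
worst_case_mb=$(( maxsize_mb * (maxbackup + 1) ))
echo "Worst-case audit log footprint: ${worst_case_mb} MB"
```

With the example values this comes to 1100 MB, which is worth comparing against the free space on the volume that holds /var/log/kubernetes/audit.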

Critical Steps -- Do Not Skip

When configuring audit logging on the API server:

  1. Both the policy file and the log directory must be mounted as volumes
  2. The policy file mount should be readOnly: true
  3. The log directory mount should NOT be readOnly
  4. After saving the manifest, wait for the API server to restart (kubelet detects the change)
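  5. Verify with kubectl get pods -n kube-system -- if the API server does not come back, kubectl itself will fail, so check your YAML syntax and inspect the container on the node with crictl ps -a and crictl logs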
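
Before saving the manifest, a quick text check can catch a forgotten flag or mount from the list above. The helper below is a hypothetical sketch: check_audit_manifest is not a Kubernetes tool, and the paths it searches for are the example paths used in this guide. It only greps for text, so it cannot detect indentation errors or a mount that points at the wrong directory.

```shell
# Hypothetical helper: report whether a kube-apiserver manifest mentions
# each audit setting used in this guide. Text search only, not YAML-aware.
check_audit_manifest() {
  manifest="$1"
  status=0
  for needle in \
    "--audit-policy-file=" \
    "--audit-log-path=" \
    "mountPath: /etc/kubernetes/audit/policy.yaml" \
    "mountPath: /var/log/kubernetes/audit"
  do
    # -F treats the needle as a fixed string; -- stops option parsing
    # so needles that begin with dashes are not read as grep flags.
    if grep -qF -- "$needle" "$manifest"; then
      echo "ok       $needle"
    else
      echo "MISSING  $needle"
      status=1
    fi
  done
  return "$status"
}

# Usage on a control plane node (path taken from Step 3):
# check_audit_manifest /etc/kubernetes/manifests/kube-apiserver.yaml
```

The function returns a non-zero status when anything is missing, so it can also gate a script that copies a drafted manifest into place.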

Step 4: Verify Audit Logging

bash
# Confirm the API server is responding again (retry until it does)
kubectl get nodes

# Check that the audit log file is being written
ls -la /var/log/kubernetes/audit/audit.log

# View recent audit events
tail -5 /var/log/kubernetes/audit/audit.log | jq .

Audit Backends

Log Backend

The log backend writes audit events to a JSON log file on the node's filesystem. This is the most common backend and the one you will use in the CKS exam.

bash
# API server flags for log backend
--audit-policy-file=/etc/kubernetes/audit/policy.yaml
--audit-log-path=/var/log/kubernetes/audit/audit.log
--audit-log-maxage=30
--audit-log-maxbackup=10
--audit-log-maxsize=100

Webhook Backend

The webhook backend sends audit events to an external HTTP endpoint (e.g., a SIEM system, Elasticsearch, or custom collector).

yaml
# Webhook configuration file (/etc/kubernetes/audit/webhook.yaml)
apiVersion: v1
kind: Config
clusters:
- name: audit-webhook
  cluster:
    server: https://audit-collector.example.com/audit
    certificate-authority: /etc/kubernetes/audit/ca.crt
contexts:
- name: audit-webhook
  context:
    cluster: audit-webhook
current-context: audit-webhook

bash
# API server flags for webhook backend
--audit-webhook-config-file=/etc/kubernetes/audit/webhook.yaml
--audit-webhook-batch-max-wait=5s

INFO

You can use both backends simultaneously. The log backend provides local storage for immediate investigation, while the webhook backend forwards events to a centralized logging system.

Reading and Interpreting Audit Logs

Audit Event Structure

Each audit event is a JSON object with the following key fields:

json
{
  "kind": "Event",
  "apiVersion": "audit.k8s.io/v1",
  "level": "RequestResponse",
  "auditID": "a0b1c2d3-e4f5-6789-abcd-ef0123456789",
  "stage": "ResponseComplete",
  "requestURI": "/api/v1/namespaces/default/secrets",
  "verb": "create",
  "user": {
    "username": "system:admin",
    "groups": ["system:masters", "system:authenticated"]
  },
  "sourceIPs": ["192.168.1.100"],
  "userAgent": "kubectl/v1.28.0",
  "objectRef": {
    "resource": "secrets",
    "namespace": "default",
    "name": "my-secret",
    "apiVersion": "v1"
  },
  "responseStatus": {
    "metadata": {},
    "code": 201
  },
  "requestObject": { "...": "..." },
  "responseObject": { "...": "..." },
  "requestReceivedTimestamp": "2024-01-15T10:30:00.000000Z",
  "stageTimestamp": "2024-01-15T10:30:00.050000Z"
}

Key Fields for Investigation

Field | Purpose
verb | The API operation: get, list, watch, create, update, patch, delete
user.username | Who made the request
sourceIPs | Where the request came from
objectRef.resource | What resource was targeted
objectRef.namespace | In which namespace
objectRef.name | Specific resource name
responseStatus.code | HTTP response code (201=created, 403=forbidden, etc.)
requestObject | The full request body (only at Request/RequestResponse level)
responseObject | The full response body (only at RequestResponse level)
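
During an investigation it often helps to collapse each event into a single line built from the fields above. The following is a minimal sketch assuming jq is installed; summarize_audit is a hypothetical helper name, and the sample event is trimmed from the structure shown earlier rather than taken from a real cluster.

```shell
# Sample event trimmed from the structure above; real logs hold one JSON object per line.
sample_log=$(mktemp)
cat > "$sample_log" <<'EOF'
{"verb":"create","user":{"username":"system:admin"},"sourceIPs":["192.168.1.100"],"objectRef":{"resource":"secrets","namespace":"default","name":"my-secret"},"responseStatus":{"code":201},"stageTimestamp":"2024-01-15T10:30:00.050000Z"}
EOF

# Hypothetical helper: one tab-separated line per event (when, who, verb, what, result).
# The // operator substitutes "-" when a field is absent, e.g. for non-resource URLs.
summarize_audit() {
  jq -r '[
    .stageTimestamp,
    .user.username,
    .verb,
    ((.objectRef.namespace // "-") + "/" + (.objectRef.resource // "-") + "/" + (.objectRef.name // "-")),
    ((.responseStatus.code // "-") | tostring)
  ] | @tsv' "$1"
}

summarize_audit "$sample_log"
```

On a control plane node the same function could be pointed at /var/log/kubernetes/audit/audit.log and piped into grep or sort to narrow the results.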

Common jq Queries for Audit Logs

bash
# Find all Secret access
cat audit.log | jq 'select(.objectRef.resource=="secrets")'

# Find all forbidden requests (HTTP 403)
cat audit.log | jq 'select(.responseStatus.code==403)'

# Find all delete operations
cat audit.log | jq 'select(.verb=="delete")'

# Find requests from a specific user
cat audit.log | jq 'select(.user.username=="badactor")'

# Find pod exec events
cat audit.log | jq 'select(.objectRef.subresource=="exec")'

# Find all requests from a specific source IP
# (any(...) emits each matching event once and tolerates a missing sourceIPs field)
cat audit.log | jq 'select(any(.sourceIPs[]?; . == "10.0.0.5"))'

# Find ServiceAccount token creation
cat audit.log | jq 'select(.objectRef.resource=="serviceaccounts" and .objectRef.subresource=="token")'

# Find RBAC changes
# (the // "" fallback keeps test() from failing on events that have no objectRef)
cat audit.log | jq 'select((.objectRef.resource // "") | test("^(roles|rolebindings|clusterroles|clusterrolebindings)$"))'

# Count events by verb
cat audit.log | jq -s 'group_by(.verb) | map({verb: .[0].verb, count: length})'

Exam Tip

In the exam, you will likely need to use jq to filter audit logs. Practice these queries beforehand. The most common exam scenarios are:

  • "Find which user created a specific resource"
  • "Identify all operations on Secrets in a namespace"
  • "Determine what changes were made to RBAC"

Common Exam Scenarios

Scenario 1: Enable Audit Logging

Task: Enable audit logging on the API server with a policy that logs:

  • All Secret operations at Metadata level
  • All namespace create/delete at RequestResponse level
  • Everything else at Metadata level

Approach:

  1. Write the audit policy file
  2. Create the log directory
  3. Modify the API server manifest with flags and volume mounts
  4. Wait for API server restart and verify
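
A policy for the first step of this approach might look like the sketch below. The draft location and the POLICY_DRAFT variable are assumptions made so the snippet runs without root; in the exam, the task names the path to use.

```shell
# Draft location is an assumption; the exam task will name the real path.
POLICY_DRAFT="${POLICY_DRAFT:-/tmp/audit-policy.yaml}"

cat > "$POLICY_DRAFT" <<'EOF'
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  # All Secret operations: metadata only, so secret values stay out of the log
  - level: Metadata
    resources:
      - group: ""
        resources: ["secrets"]
  # Namespace create/delete: full request and response bodies
  - level: RequestResponse
    resources:
      - group: ""
        resources: ["namespaces"]
    verbs: ["create", "delete"]
  # Catch-all: everything else at Metadata
  - level: Metadata
EOF

# Then install it where --audit-policy-file points, for example:
# sudo cp "$POLICY_DRAFT" /etc/kubernetes/audit/policy.yaml
```

The two specific rules come before the catch-all so that first-match evaluation reaches them; the Secret rule is technically covered by the catch-all but stating it explicitly mirrors the task wording.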

Scenario 2: Investigate Suspicious Activity

Task: Someone deleted a Deployment in the production namespace. Find who did it and when using audit logs.

Approach:

bash
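# Filter for the delete, then keep only who did it, when, and to which object
cat /var/log/kubernetes/audit/audit.log | \
  jq 'select(.objectRef.resource=="deployments" and
      .objectRef.namespace=="production" and
      .verb=="delete")
      | {user: .user.username, when: .requestReceivedTimestamp, name: .objectRef.name}'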

Scenario 3: Create a Targeted Audit Policy

Task: Create a policy that captures full request/response for RBAC changes but only metadata for everything else.

Approach:

yaml
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  - level: RequestResponse
    resources:
      - group: "rbac.authorization.k8s.io"
        resources: ["roles", "rolebindings", "clusterroles", "clusterrolebindings"]
  - level: Metadata

Watch Out

The catch-all rule - level: Metadata with no other fields matches everything. This is the standard way to create a default rule. Do not forget it -- without a catch-all, unmatched events are silently dropped.

Audit Logging Checklist

Use this checklist when configuring audit logging in the exam:

  • [ ] Audit policy file created at the specified path
  • [ ] Audit policy has correct apiVersion: audit.k8s.io/v1 and kind: Policy
  • [ ] Rules are in the correct order (most specific first, catch-all last)
  • [ ] API server manifest includes --audit-policy-file flag
  • [ ] API server manifest includes --audit-log-path flag
  • [ ] Volume mount for the audit policy file (readOnly: true)
  • [ ] Volume mount for the audit log directory
  • [ ] Host path volume for the audit policy file (type: File)
  • [ ] Host path volume for the audit log directory (type: DirectoryOrCreate)
  • [ ] API server has restarted successfully
  • [ ] Audit log file is being written to

Quick Reference

bash
# Minimum API server flags needed:
--audit-policy-file=/etc/kubernetes/audit/policy.yaml
--audit-log-path=/var/log/kubernetes/audit/audit.log

# Minimum volume mounts needed:
# 1. Policy file mount (readOnly)
# 2. Log directory mount

# Minimum volumes needed:
# 1. Policy file hostPath (type: File)
# 2. Log directory hostPath (type: DirectoryOrCreate)

Released under the MIT License.