
Security Contexts Deep Dive

What Are Security Contexts?

A SecurityContext defines privilege and access-control settings for a pod or container. It is the primary mechanism for controlling the Linux security properties of your workloads in Kubernetes.

CKS Exam Relevance

SecurityContext questions appear on almost every CKS exam. You must be able to configure pod-level and container-level settings from memory, understand the precedence rules, and know which settings prevent privilege escalation.

Pod-Level vs Container-Level Security Contexts

Kubernetes allows security settings at two levels, with container-level settings taking precedence over pod-level settings when both are defined.

What Goes Where

| Setting | Pod-Level | Container-Level | Notes |
|---|---|---|---|
| runAsUser | Yes | Yes | Container overrides pod |
| runAsGroup | Yes | Yes | Container overrides pod |
| runAsNonRoot | Yes | Yes | Container overrides pod |
| fsGroup | Yes | No | Pod-level only |
| supplementalGroups | Yes | No | Pod-level only |
| sysctls | Yes | No | Pod-level only |
| readOnlyRootFilesystem | No | Yes | Container-level only |
| allowPrivilegeEscalation | No | Yes | Container-level only |
| capabilities | No | Yes | Container-level only |
| privileged | No | Yes | Container-level only |
| seccompProfile | Yes | Yes | Container overrides pod |
| seLinuxOptions | Yes | Yes | Container overrides pod |

Key Exam Concept

When the same setting is defined at both the pod level and the container level, the container-level value wins for that container. A pod with runAsUser: 1000 and a container with runAsUser: 2000 will run that container as UID 2000; other containers in the pod still run as UID 1000.


Core Security Context Settings

runAsUser, runAsGroup, runAsNonRoot

These settings control which Linux user and group the container process runs as.

yaml
apiVersion: v1
kind: Pod
metadata:
  name: security-user-demo
spec:
  securityContext:
    runAsUser: 1000          # All containers run as UID 1000
    runAsGroup: 3000         # Primary group GID 3000
    runAsNonRoot: true       # Reject if image tries to run as root
  containers:
  - name: app
    image: busybox:1.36
    command: ["sh", "-c", "id && sleep 3600"]
    # Inherits pod-level: runs as user 1000, group 3000
  - name: sidecar
    image: busybox:1.36
    command: ["sh", "-c", "id && sleep 3600"]
    securityContext:
      runAsUser: 2000        # Overrides pod-level: runs as UID 2000
      # Still inherits runAsGroup: 3000 and runAsNonRoot: true

Verification

After creating the pod, verify the user:

bash
kubectl exec security-user-demo -c app -- id
# uid=1000 gid=3000 groups=3000

kubectl exec security-user-demo -c sidecar -- id
# uid=2000 gid=3000 groups=3000

What Happens with runAsNonRoot?

When runAsNonRoot: true is set, the kubelet validates that the container is not running as UID 0. If the container image's USER directive is root (or absent, which defaults to root) and no runAsUser is specified, the container is not started and the pod reports a CreateContainerConfigError status with this message:

Error: container has runAsNonRoot and image will run as root

yaml
# This will FAIL - image defaults to root, no runAsUser override
apiVersion: v1
kind: Pod
metadata:
  name: will-fail
spec:
  securityContext:
    runAsNonRoot: true
  containers:
  - name: app
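    image: nginx:1.25       # nginx image runs as root by default

yaml
# This passes the runAsNonRoot check - runAsUser overrides the image default.
# The stock nginx image may still crash as UID 1000 unless its writable paths are provided.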
apiVersion: v1
kind: Pod
metadata:
  name: will-succeed
spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 1000
  containers:
  - name: app
    image: nginx:1.25
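
To see why a pod such as will-fail never starts, the waiting state of its container can be read back from the API. The helper below is a minimal sketch: the function name waiting_reason is hypothetical, and it assumes a single-container pod, so it only inspects the first entry in containerStatuses.

```bash
# Hypothetical helper: print the waiting reason and message of a pod's first container.
# Assumes kubectl is configured for the cluster and the pod has one container.
waiting_reason() {
  local pod="$1"
  kubectl get pod "$pod" \
    -o jsonpath='{.status.containerStatuses[0].state.waiting.reason}{": "}{.status.containerStatuses[0].state.waiting.message}{"\n"}'
}

# Usage (requires cluster access):
#   waiting_reason will-fail
```

For the will-fail pod this should report CreateContainerConfigError together with the runAsNonRoot message quoted above; a container that is already running has no waiting state, so the fields come back empty.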

fsGroup and supplementalGroups

These are pod-level only settings that control file ownership and group membership.

yaml
apiVersion: v1
kind: Pod
metadata:
  name: fsgroup-demo
spec:
  securityContext:
    runAsUser: 1000
    runAsGroup: 3000
    fsGroup: 2000               # Volumes owned by GID 2000
    supplementalGroups:
    - 4000                      # Added to supplemental groups
    - 5000
  containers:
  - name: app
    image: busybox:1.36
    command: ["sh", "-c", "id && ls -la /data && sleep 3600"]
    volumeMounts:
    - name: data-vol
      mountPath: /data
  volumes:
  - name: data-vol
    emptyDir: {}

What fsGroup does:

  1. All files created in mounted volumes are owned by GID 2000
  2. The GID 2000 is added to each container's supplemental groups
  3. For volume types that support ownership management, Kubernetes recursively changes the group ownership of the volume contents to the fsGroup when the volume is mounted

Verification

bash
kubectl exec fsgroup-demo -- id
# uid=1000 gid=3000 groups=2000,3000,4000,5000

kubectl exec fsgroup-demo -- ls -la /data
# drwxrwsrwx 2 root 2000 ... .

readOnlyRootFilesystem

Forces the container's root filesystem to be read-only. An attacker who compromises the process can then no longer modify binaries or drop files onto the root filesystem; only explicitly mounted writable volumes remain writable.

yaml
apiVersion: v1
kind: Pod
metadata:
  name: readonly-demo
spec:
  containers:
  - name: app
    image: nginx:1.25
    securityContext:
      readOnlyRootFilesystem: true    # Cannot write to /
    volumeMounts:
    - name: tmp
      mountPath: /tmp                  # Writable temp directory
    - name: cache
      mountPath: /var/cache/nginx      # nginx needs to write cache
    - name: run
      mountPath: /var/run              # nginx needs PID file
  volumes:
  - name: tmp
    emptyDir: {}
  - name: cache
    emptyDir: {}
  - name: run
    emptyDir: {}

Common Exam Pattern

Many exam questions ask you to set readOnlyRootFilesystem: true on a pod that requires certain writable paths. You must know to use emptyDir volumes for directories the application needs to write to (like /tmp, /var/run, /var/cache).
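
A quick way to confirm the result is to attempt a write inside the container. The sketch below is illustrative only: probe_writable is a hypothetical helper name, and it assumes the image ships sh, touch, and rm (distroless images do not).

```bash
# Hypothetical helper: report whether a path inside a pod's container is writable.
# Assumes the container image provides sh, touch, and rm.
probe_writable() {
  local pod="$1" path="$2"
  # The path is passed as a positional argument so it is not re-parsed by the inner shell.
  if kubectl exec "$pod" -- sh -c 'touch "$1/.probe" && rm "$1/.probe"' sh "$path" >/dev/null 2>&1; then
    echo "$path: writable"
  else
    echo "$path: read-only or not permitted"
  fi
}

# Usage (requires cluster access):
#   probe_writable readonly-demo /      # expected to fail on a read-only root
#   probe_writable readonly-demo /tmp   # expected to succeed through the emptyDir
```

The helper cannot distinguish a read-only filesystem from a plain permission error, so treat a failure as a prompt to look at the actual error message.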


allowPrivilegeEscalation

Controls whether a process can gain more privileges than its parent process. This is critical for security.

yaml
apiVersion: v1
kind: Pod
metadata:
  name: no-escalation
spec:
  containers:
  - name: app
    image: busybox:1.36
    command: ["sh", "-c", "sleep 3600"]
    securityContext:
      allowPrivilegeEscalation: false   # No SUID, no privilege gain
      runAsNonRoot: true
      runAsUser: 1000

What it does:

  • Sets the no_new_privs flag on the container process
  • Prevents SUID binaries from being effective
  • Prevents a process from gaining capabilities its parent doesn't have
  • Default is true -- you must explicitly set it to false (it is always true when the container is privileged or has CAP_SYS_ADMIN)
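
The no_new_privs flag can be observed from inside the container, because the kernel exposes it in /proc/<pid>/status. This is a minimal sketch: the helper name no_new_privs is hypothetical, and it assumes the image provides grep and that the node kernel reports the NoNewPrivs field.

```bash
# Hypothetical helper: print the NoNewPrivs value (0 or 1) for PID 1 of a pod's container.
# Assumes grep exists in the image; awk runs locally on the kubectl output.
no_new_privs() {
  local pod="$1"
  kubectl exec "$pod" -- grep NoNewPrivs /proc/1/status | awk '{print $2}'
}

# Usage (requires cluster access):
#   no_new_privs no-escalation
```

With allowPrivilegeEscalation: false the value should be 1; a value of 0 means the flag is not set for that process.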

Security Best Practice

Always set allowPrivilegeEscalation: false unless your application explicitly requires SUID binaries or special capability escalation. This single setting blocks privilege escalation through setuid/setgid binaries and file capabilities, a common step in container breakout chains.


Linux Capabilities

Capabilities break the monolithic root privilege into discrete units. You can drop unnecessary capabilities and add only what is needed.

yaml
apiVersion: v1
kind: Pod
metadata:
  name: capabilities-demo
spec:
  containers:
  - name: app
    image: busybox:1.36
    command: ["sh", "-c", "sleep 3600"]
    securityContext:
      capabilities:
        drop:
        - ALL                    # Drop ALL capabilities first
        add:
        - NET_BIND_SERVICE       # Only add back what's needed

Common capabilities to know:

| Capability | What It Allows | Should You Allow? |
|---|---|---|
| NET_BIND_SERVICE | Bind to ports < 1024 | Only if needed |
| NET_RAW | Use RAW/PACKET sockets | Rarely -- used for ping |
| SYS_PTRACE | Trace processes | Debugging only |
| SYS_ADMIN | Broad admin operations | Almost never |
| NET_ADMIN | Network configuration | CNI plugins only |
| SYS_TIME | Set system clock | Almost never |

Best Practice Pattern

The most secure capability configuration is "drop ALL, add back only what you need":

yaml
capabilities:
  drop: ["ALL"]
  add: ["NET_BIND_SERVICE"]   # Only if required

Privileged Containers

A privileged container has all Linux capabilities and can access all devices on the host. This is the opposite of security hardening.

yaml
# DANGEROUS - Never do this unless absolutely required
apiVersion: v1
kind: Pod
metadata:
  name: privileged-pod
spec:
  containers:
  - name: app
    image: busybox:1.36
    securityContext:
      privileged: true          # Full host access

Never Use in Production

A privileged container can:

  • Access all host devices (/dev/*)
  • Modify the host kernel via /proc and /sys
  • Load kernel modules
  • Escape the container entirely

On the CKS exam, you will likely be asked to identify and remove privileged: true from pod specs.


Complete Hardened Pod Example

This example combines all security context best practices into a single pod specification:

yaml
apiVersion: v1
kind: Pod
metadata:
  name: hardened-pod
  namespace: production
  labels:
    app: secure-app
spec:
  securityContext:
    runAsUser: 10000
    runAsGroup: 10000
    runAsNonRoot: true
    fsGroup: 10000
    seccompProfile:
      type: RuntimeDefault
  automountServiceAccountToken: false   # Don't mount SA token
  containers:
  - name: app
    image: gcr.io/distroless/static:nonroot
    securityContext:
      allowPrivilegeEscalation: false
      readOnlyRootFilesystem: true
      capabilities:
        drop:
        - ALL
    resources:
      limits:
        memory: "128Mi"
        cpu: "250m"
      requests:
        memory: "64Mi"
        cpu: "125m"
    volumeMounts:
    - name: tmp
      mountPath: /tmp
  volumes:
  - name: tmp
    emptyDir:
      sizeLimit: 64Mi

Exam Checklist

When hardening a pod on the exam, apply this checklist:

  1. runAsNonRoot: true -- prevent root execution
  2. runAsUser: <non-zero> -- specify a non-root UID
  3. readOnlyRootFilesystem: true -- immutable root FS
  4. allowPrivilegeEscalation: false -- no privilege gain
  5. capabilities: drop: ["ALL"] -- minimal capabilities
  6. automountServiceAccountToken: false -- no service account token mounted, so the pod has no API credentials by default
  7. seccompProfile: type: RuntimeDefault -- syscall filtering
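
The checklist can be turned into a rough pre-flight check on a manifest file. The sketch below is deliberately crude and is not a substitute for reading the spec: check_hardening is a hypothetical helper, it only greps the text, and it does not parse YAML, so it cannot tell which container a setting belongs to or whether ALL appears under drop rather than add.

```bash
# Hypothetical helper: text-level check of a manifest against the hardening checklist.
# It greps for each setting and does not understand YAML structure.
check_hardening() {
  local manifest="$1" missing=0 pattern
  for pattern in \
    'runAsNonRoot: *true' \
    'runAsUser: *[1-9][0-9]*' \
    'readOnlyRootFilesystem: *true' \
    'allowPrivilegeEscalation: *false' \
    '(- *|\[ *)"?ALL' \
    'automountServiceAccountToken: *false' \
    'type: *RuntimeDefault'
  do
    if ! grep -Eq -e "$pattern" "$manifest"; then
      echo "missing: $pattern"
      missing=1
    fi
  done
  # Exit status 0 means every pattern was found somewhere in the file.
  return "$missing"
}

# Usage:
#   check_hardening pod.yaml && echo "all checklist items present"
```

Because the match is textual, a passing result only says each setting appears somewhere in the file; multi-container pods still need a manual look.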

Security Context Precedence Rules

Default Behavior

If no securityContext is configured at any level and the image has no USER directive, the container runs as root (UID 0). This is why explicit security contexts are essential.
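
The lookup order for a setting that exists at both levels can be sketched as follows, using runAsUser as the example: the container-level value is used if present, otherwise the pod-level value, otherwise whatever the image defines. The function name effective_run_as_user is hypothetical, and the sketch only reads the pod spec; it does not inspect the image.

```bash
# Hypothetical helper: resolve the runAsUser a container gets from the pod spec.
# Order: container-level value, then pod-level value, then the image default.
effective_run_as_user() {
  local pod="$1" container="$2" uid
  uid=$(kubectl get pod "$pod" \
    -o jsonpath="{.spec.containers[?(@.name=='$container')].securityContext.runAsUser}")
  if [ -z "$uid" ]; then
    # Nothing set on the container: fall back to the pod-level securityContext.
    uid=$(kubectl get pod "$pod" -o jsonpath='{.spec.securityContext.runAsUser}')
  fi
  if [ -z "$uid" ]; then
    echo "image default"
  else
    echo "$uid"
  fi
}

# Usage (requires cluster access):
#   effective_run_as_user security-user-demo sidecar
#   effective_run_as_user security-user-demo app
```

When the result is "image default", the UID comes from the image's USER directive, which means root if the image sets none.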


Common CKS Exam Patterns

Pattern 1: Fix a Pod Running as Root

You are given a running pod and asked to ensure it runs as a non-root user:

bash
# Check current user
kubectl exec <pod-name> -- id
# uid=0(root) gid=0(root) groups=0(root)

# Edit the pod (you'll need to delete and recreate)
kubectl get pod <pod-name> -o yaml > pod.yaml
# Add securityContext, then:
kubectl delete pod <pod-name>
kubectl apply -f pod.yaml

Pattern 2: Enable Read-Only Root Filesystem

Add readOnlyRootFilesystem: true and provide writable volumes where needed:

bash
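# List directories the current user can write to (candidates for emptyDir mounts).
# This shows what is writable, not what the application actually writes; check the app's docs or logs too.
kubectl exec <pod-name> -- find / -writable -type d 2>/dev/null
# Then add emptyDir volumes for the paths the application needs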

Pattern 3: Remove Privileged Mode

Identify and fix overly permissive pods:

bash
# Find privileged pods across all namespaces (prints namespace/name once per pod)
kubectl get pods -A -o json | jq -r '.items[] | select(any(.spec.containers[]; .securityContext.privileged == true)) | "\(.metadata.namespace)/\(.metadata.name)"'

Pattern 4: Configure Specific Capabilities

Drop all and add only required capabilities:

bash
# Check current capabilities
kubectl exec <pod-name> -- cat /proc/1/status | grep -i cap
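
The CapEff, CapPrm, and CapBnd lines in that output are hexadecimal bitmasks in which bit N stands for capability number N from the kernel's linux/capability.h. Where capsh is not installed, the mask can be decoded with shell arithmetic. This is a minimal sketch: decode_caps is a hypothetical helper, and the name list covers capability numbers 0 through 40 as defined in recent kernels.

```bash
# Hypothetical helper: decode a hex capability mask (as shown in /proc/<pid>/status)
# into a comma-separated list of capability names. Bit N maps to the Nth name below.
decode_caps() {
  local mask bit out name
  mask=$((0x$1))
  bit=0
  out=""
  for name in CHOWN DAC_OVERRIDE DAC_READ_SEARCH FOWNER FSETID KILL SETGID SETUID \
    SETPCAP LINUX_IMMUTABLE NET_BIND_SERVICE NET_BROADCAST NET_ADMIN NET_RAW \
    IPC_LOCK IPC_OWNER SYS_MODULE SYS_RAWIO SYS_CHROOT SYS_PTRACE SYS_PACCT \
    SYS_ADMIN SYS_BOOT SYS_NICE SYS_RESOURCE SYS_TIME SYS_TTY_CONFIG MKNOD LEASE \
    AUDIT_WRITE AUDIT_CONTROL SETFCAP MAC_OVERRIDE MAC_ADMIN SYSLOG WAKE_ALARM \
    BLOCK_SUSPEND AUDIT_READ PERFMON BPF CHECKPOINT_RESTORE
  do
    # Test bit number $bit of the mask.
    if [ $(( (mask >> bit) & 1 )) -eq 1 ]; then
      out="${out:+$out,}$name"
    fi
    bit=$((bit + 1))
  done
  echo "$out"
}

# Usage:
#   decode_caps 0000000000000400   # prints NET_BIND_SERVICE
```

A container that drops ALL and adds only NET_BIND_SERVICE while running as root should show a CapEff of 0000000000000400; for a non-root user the effective set is normally empty even when capabilities are added, so compare CapBnd as well.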

Quick Reference

bash
# Verify security context of a running pod
kubectl get pod <name> -o jsonpath='{.spec.securityContext}'
kubectl get pod <name> -o jsonpath='{.spec.containers[0].securityContext}'

# Check running user inside container
kubectl exec <pod-name> -- id
kubectl exec <pod-name> -- whoami   # fails if the UID has no /etc/passwd entry

# Check capabilities
kubectl exec <pod-name> -- cat /proc/1/status | grep Cap

# Decode capabilities bitmask
capsh --decode=00000000a80425fb

# List processes with their UIDs
kubectl exec <pod-name> -- ps aux

Released under the MIT License.