Ensuring Container Immutability
Overview
Container immutability is a security principle that states containers should not be modified at runtime. Once a container image is built and deployed, its filesystem, binaries, and configuration should remain unchanged. Any modification at runtime -- installing packages, downloading scripts, modifying configuration files -- indicates either a misconfigured application or a security compromise.
CKS Exam Relevance
Container immutability is a frequently tested topic on the CKS exam. You should be able to:
- Enforce immutability using readOnlyRootFilesystem
- Configure writable directories using emptyDir volumes for legitimate needs
- Detect mutable containers that violate immutability
- Enforce immutability at scale using Pod Security Standards or OPA/Gatekeeper
What Immutability Means in a Container Context
An immutable container:
- Cannot modify its root filesystem at runtime
- Cannot install new packages or download new binaries
- Cannot modify configuration files after startup
- Can only write to explicitly designated writable directories (e.g., /tmp, /var/log)
- Receives all configuration through environment variables, ConfigMaps, or Secrets mounted at startup
Why Immutability Matters for Security
| Threat | How Immutability Helps |
|---|---|
| Malware installation | Attacker cannot write malicious binaries to the root filesystem |
| Configuration tampering | Application config cannot be modified at runtime |
| Backdoor persistence | Attacker cannot modify existing binaries or add new ones |
| Container drift | Running container always matches the built image |
| Forensic integrity | Filesystem changes are isolated to designated volumes |
| Compliance | Auditors can verify that deployed containers match approved images |
Immutable Container Architecture
readOnlyRootFilesystem Enforcement
The most direct way to enforce container immutability is setting readOnlyRootFilesystem: true in the container's security context.
Basic Example
apiVersion: v1
kind: Pod
metadata:
  name: immutable-pod
  namespace: default
spec:
  containers:
    - name: app
      image: nginx:1.25
      securityContext:
        readOnlyRootFilesystem: true
      ports:
        - containerPort: 80

WARNING
Setting readOnlyRootFilesystem: true without providing writable directories will cause many applications to fail. Most applications need to write to /tmp, /var/run, /var/cache, or similar directories.
What Happens Without Writable Directories
# If nginx tries to write its PID file:
# nginx: [emerg] open() "/var/run/nginx.pid" failed (30: Read-only file system)
# If an app tries to write to /tmp:
# Error: EROFS: read-only file system, open '/tmp/data.json'

Using emptyDir for Writable Directories
emptyDir volumes provide ephemeral writable storage that exists only for the lifetime of the pod. They are the standard solution for providing writable directories to immutable containers.
Nginx with Immutable Root Filesystem
apiVersion: v1
kind: Pod
metadata:
  name: immutable-nginx
  namespace: default
spec:
  containers:
    - name: nginx
      image: nginx:1.25
      securityContext:
        readOnlyRootFilesystem: true
        runAsNonRoot: false  # the official nginx image starts as root
        allowPrivilegeEscalation: false
      ports:
        - containerPort: 80
      volumeMounts:
        # Nginx needs to write to these directories
        - name: tmp
          mountPath: /tmp
        - name: var-run
          mountPath: /var/run
        - name: var-cache-nginx
          mountPath: /var/cache/nginx
        - name: var-log-nginx
          mountPath: /var/log/nginx
  volumes:
    - name: tmp
      emptyDir: {}
    - name: var-run
      emptyDir: {}
    - name: var-cache-nginx
      emptyDir: {}
    - name: var-log-nginx
      emptyDir: {}

Python Application with Immutable Root Filesystem
apiVersion: v1
kind: Pod
metadata:
  name: immutable-python-app
  namespace: default
spec:
  containers:
    - name: app
      image: python:3.11-slim
      command: ["python", "-m", "http.server", "8080"]
      securityContext:
        readOnlyRootFilesystem: true
        runAsNonRoot: true
        runAsUser: 1000
        allowPrivilegeEscalation: false
      ports:
        - containerPort: 8080
      volumeMounts:
        - name: tmp
          mountPath: /tmp
  volumes:
    - name: tmp
      emptyDir: {}

Application with Configuration from ConfigMap
apiVersion: v1
kind: Pod
metadata:
  name: immutable-configured-app
  namespace: default
spec:
  containers:
    - name: app
      image: myapp:1.0
      securityContext:
        readOnlyRootFilesystem: true
        runAsNonRoot: true
        runAsUser: 1000
        allowPrivilegeEscalation: false
      env:
        - name: DB_HOST
          valueFrom:
            configMapKeyRef:
              name: app-config
              key: db-host
        - name: DB_PASSWORD
          valueFrom:
            secretKeyRef:
              name: app-secrets
              key: db-password
      volumeMounts:
        - name: tmp
          mountPath: /tmp
        - name: config
          mountPath: /etc/app/config.yaml
          subPath: config.yaml
          readOnly: true
  volumes:
    - name: tmp
      emptyDir: {}
    - name: config
      configMap:
        name: app-config

StartupProbe and Exec Considerations
Some containers need to perform initialization tasks that require filesystem writes. There are several patterns to handle this:
Pattern 1: Init Container for Setup
apiVersion: v1
kind: Pod
metadata:
  name: immutable-with-init
spec:
  initContainers:
    - name: setup
      image: busybox:1.36
      command: ["sh", "-c", "cp /defaults/* /config/"]
      volumeMounts:
        - name: config-dir
          mountPath: /config
        - name: defaults
          mountPath: /defaults
  containers:
    - name: app
      image: myapp:1.0
      securityContext:
        readOnlyRootFilesystem: true
      volumeMounts:
        - name: config-dir
          mountPath: /app/config
          readOnly: true
        - name: tmp
          mountPath: /tmp
  volumes:
    - name: config-dir
      emptyDir: {}
    - name: defaults
      configMap:
        name: default-config
    - name: tmp
      emptyDir: {}

Pattern 2: Memory-Backed emptyDir for Sensitive Temp Files
apiVersion: v1
kind: Pod
metadata:
  name: immutable-memory-tmp
spec:
  containers:
    - name: app
      image: myapp:1.0
      securityContext:
        readOnlyRootFilesystem: true
      volumeMounts:
        - name: tmp
          mountPath: /tmp
  volumes:
    - name: tmp
      emptyDir:
        medium: Memory   # Uses tmpfs -- data stays in RAM
        sizeLimit: 64Mi  # Limits memory usage

Exam Tip
Using emptyDir with medium: Memory keeps temporary files in RAM (tmpfs) instead of on the node's disk, which is safer for sensitive data. However, everything written to the volume counts against the container's memory limit.
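As a rough illustration of that accounting, the sketch below pairs a memory-backed /tmp with an explicit memory limit. The pod name, the image, and the specific sizes are placeholder assumptions; the only point is that the limit should leave room for the application's working set plus whatever it may write to the tmpfs volume.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: immutable-memory-tmp-sized   # hypothetical name
spec:
  containers:
    - name: app
      image: myapp:1.0               # placeholder image
      securityContext:
        readOnlyRootFilesystem: true
      resources:
        limits:
          # Assumed budget: roughly 192Mi for the process itself plus up to
          # 64Mi of tmpfs data, because files in the memory-backed volume
          # are charged to this container's memory limit.
          memory: 256Mi
      volumeMounts:
        - name: tmp
          mountPath: /tmp
  volumes:
    - name: tmp
      emptyDir:
        medium: Memory
        sizeLimit: 64Mi              # caps how much RAM /tmp may consume
```

If process memory plus tmpfs usage exceeds the limit, the container may be OOM-killed, and exceeding sizeLimit can lead to failed writes or pod eviction depending on the cluster version, so the two numbers are worth sizing together.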
Detecting Mutable Containers
Using kubectl to Find Non-Immutable Pods
# Find pods with at least one container lacking readOnlyRootFilesystem: true
# (any() prints each pod once, even if several containers match)
kubectl get pods -A -o json | jq -r '
  .items[]
  | select(
      any(.spec.containers[];
          (.securityContext.readOnlyRootFilesystem // false) == false)
    )
  | "\(.metadata.namespace)/\(.metadata.name)"
'
# Check a specific pod's security context
kubectl get pod <name> -o jsonpath='{.spec.containers[*].securityContext.readOnlyRootFilesystem}'
# Detailed check of all containers in a pod
kubectl get pod <name> -o json | jq '.spec.containers[] | {
  name: .name,
  readOnlyRootFilesystem: (.securityContext.readOnlyRootFilesystem // false)
}'

Using Falco to Detect Runtime Modifications
# Falco rule to detect writes to binary directories
- rule: Write below binary dir in container
  desc: >
    Detect writes to /bin, /sbin, /usr/bin, /usr/sbin in containers.
    These directories should never be modified at runtime.
  condition: >
    open_write and container and
    (fd.name startswith /bin or
     fd.name startswith /sbin or
     fd.name startswith /usr/bin or
     fd.name startswith /usr/sbin)
  output: >
    Binary directory modified in container
    (user=%user.name file=%fd.name process=%proc.name
    container=%container.name pod=%k8s.pod.name
    ns=%k8s.ns.name image=%container.image.repository)
  priority: CRITICAL

Enforcement via Pod Security Standards (PSS)
Kubernetes Pod Security Standards provide built-in enforcement of security baselines. While PSS does not have a dedicated readOnlyRootFilesystem check, the Restricted profile enforces many immutability-related settings.
Pod Security Admission (PSA) Labels
apiVersion: v1
kind: Namespace
metadata:
  name: production
  labels:
    # Enforce restricted standard
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/enforce-version: latest
    # Warn on restricted violations
    pod-security.kubernetes.io/warn: restricted
    pod-security.kubernetes.io/warn-version: latest
    # Audit restricted violations
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/audit-version: latest

INFO
The Restricted PSS profile enforces allowPrivilegeEscalation: false, runAsNonRoot: true, and capability restrictions, but does not enforce readOnlyRootFilesystem. For that, you need OPA/Gatekeeper or a custom admission webhook.
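To make the gap concrete, the following sketch shows a pod that should satisfy the Restricted profile yet still has a writable root filesystem. The pod name and image are hypothetical, and the image is assumed to run correctly as a non-root user.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: restricted-but-mutable      # hypothetical name
  namespace: production
spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 1000
    seccompProfile:
      type: RuntimeDefault
  containers:
    - name: app
      image: myapp:1.0              # placeholder image
      securityContext:
        allowPrivilegeEscalation: false
        capabilities:
          drop:
            - ALL
        # readOnlyRootFilesystem is deliberately left unset: Pod Security
        # Admission does not check it, so the root filesystem stays writable.
```

Pod Security Admission would be expected to admit this pod into a namespace enforcing the Restricted profile, which is why readOnlyRootFilesystem needs an additional policy layer.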
Enforcement via OPA/Gatekeeper
OPA/Gatekeeper can enforce readOnlyRootFilesystem across the cluster.
ConstraintTemplate
apiVersion: templates.gatekeeper.sh/v1
kind: ConstraintTemplate
metadata:
  name: k8srequirereadonlyrootfilesystem
spec:
  crd:
    spec:
      names:
        kind: K8sRequireReadOnlyRootFilesystem
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package k8srequirereadonlyrootfilesystem

        violation[{"msg": msg}] {
          container := input.review.object.spec.containers[_]
          not container.securityContext.readOnlyRootFilesystem
          msg := sprintf(
            "Container '%v' must set securityContext.readOnlyRootFilesystem to true",
            [container.name]
          )
        }

        violation[{"msg": msg}] {
          container := input.review.object.spec.initContainers[_]
          not container.securityContext.readOnlyRootFilesystem
          msg := sprintf(
            "Init container '%v' must set securityContext.readOnlyRootFilesystem to true",
            [container.name]
          )
        }

Constraint
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sRequireReadOnlyRootFilesystem
metadata:
  name: require-readonly-rootfs
spec:
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Pod"]
    namespaces:
      - production
      - staging
  parameters: {}

Testing the Constraint
# This pod should be rejected
kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: mutable-pod
  namespace: production
spec:
  containers:
    - name: app
      image: nginx:1.25
      # No readOnlyRootFilesystem set
EOF
# Error: Container 'app' must set securityContext.readOnlyRootFilesystem to true
# This pod should be accepted
kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: immutable-pod
  namespace: production
spec:
  containers:
    - name: app
      image: nginx:1.25
      securityContext:
        readOnlyRootFilesystem: true
      volumeMounts:
        - name: tmp
          mountPath: /tmp
  volumes:
    - name: tmp
      emptyDir: {}
EOF
# pod/immutable-pod created

Complete Immutable Pod Specification
Here is a comprehensive example that combines all immutability best practices:
apiVersion: v1
kind: Pod
metadata:
  name: fully-immutable-app
  namespace: production
  labels:
    app: secure-app
spec:
  # Use a dedicated service account with minimal permissions
  serviceAccountName: app-sa
  automountServiceAccountToken: false
  # Security settings at the pod level
  securityContext:
    runAsNonRoot: true
    runAsUser: 1000
    runAsGroup: 3000
    fsGroup: 2000
    seccompProfile:
      type: RuntimeDefault
  containers:
    - name: app
      image: myapp:1.0@sha256:abc123...  # Pin by digest
      ports:
        - containerPort: 8080
          protocol: TCP
      # Container-level security
      securityContext:
        readOnlyRootFilesystem: true
        allowPrivilegeEscalation: false
        capabilities:
          drop:
            - ALL
          # No capabilities added
      # Resource limits (prevents resource abuse)
      resources:
        requests:
          cpu: 100m
          memory: 128Mi
        limits:
          cpu: 200m
          memory: 256Mi
      # Configuration via environment and mounted configs
      env:
        - name: APP_PORT
          value: "8080"
        - name: LOG_LEVEL
          value: "info"
      envFrom:
        - configMapRef:
            name: app-env-config
      # Volume mounts -- only necessary writable directories
      volumeMounts:
        - name: tmp
          mountPath: /tmp
        - name: app-config
          mountPath: /etc/app
          readOnly: true
        - name: tls-certs
          mountPath: /etc/tls
          readOnly: true
      # Health checks
      livenessProbe:
        httpGet:
          path: /healthz
          port: 8080
        initialDelaySeconds: 10
        periodSeconds: 15
      readinessProbe:
        httpGet:
          path: /readyz
          port: 8080
        initialDelaySeconds: 5
        periodSeconds: 10
  volumes:
    - name: tmp
      emptyDir:
        medium: Memory
        sizeLimit: 64Mi
    - name: app-config
      configMap:
        name: app-config
    - name: tls-certs
      secret:
        secretName: app-tls

Immutability Checklist
Use this checklist to verify container immutability:
- [ ] readOnlyRootFilesystem: true set for all containers
- [ ] emptyDir volumes provided for necessary writable directories (/tmp, /var/run, etc.)
- [ ] runAsNonRoot: true to prevent root access
- [ ] allowPrivilegeEscalation: false to prevent privilege escalation
- [ ] All capabilities dropped (drop: [ALL])
- [ ] Configuration provided via ConfigMaps/Secrets, not baked into the image
- [ ] Image pinned by digest, not just tag
- [ ] automountServiceAccountToken: false if API access is not needed
- [ ] Resource limits set to prevent resource abuse
- [ ] No writable hostPath volumes mounted
Exam Tip
In the CKS exam, a common task is to fix a pod that is not immutable. The typical fix involves:
- Add readOnlyRootFilesystem: true to the security context
- Add emptyDir volumes for directories the application needs to write to
- Verify the pod starts and runs correctly after the change
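The same fix applies when the workload is a Deployment rather than a bare Pod; the fields then live under spec.template.spec. The following is a minimal sketch of the fixed workload, assuming a hypothetical Deployment named web whose nginx container needs the writable paths listed in the comments.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web                          # hypothetical name
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: nginx
          image: nginx:1.25
          securityContext:
            readOnlyRootFilesystem: true   # step 1: lock the root filesystem
          volumeMounts:                    # step 2: writable paths nginx needs
            - name: var-run
              mountPath: /var/run
            - name: var-cache-nginx
              mountPath: /var/cache/nginx
            - name: tmp
              mountPath: /tmp
      volumes:
        - name: var-run
          emptyDir: {}
        - name: var-cache-nginx
          emptyDir: {}
        - name: tmp
          emptyDir: {}
```

For the third step, checking that the new pods reach Running (for example with kubectl get pods) and that the logs show no read-only file system errors is usually enough to confirm the change.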