Host Namespaces and Privileges
Overview
Linux namespaces are the fundamental isolation mechanism that makes containers possible. Each namespace type provides a separate view of a specific system resource. Kubernetes allows pods to opt out of this isolation and share namespaces with the host -- a powerful but extremely dangerous configuration.
This section covers the risks of sharing host namespaces, running privileged containers, and the security controls you must enforce to prevent these attack vectors.
Linux Namespaces in Containers
hostNetwork
When hostNetwork: true is set, the pod uses the host's network namespace instead of getting its own.
What It Enables
- Pod sees all network interfaces on the host (eth0, docker0, cni0, etc.)
- Pod can bind to any host port directly
- Pod can see all network traffic on the host
- Pod has the same IP address as the host node
- Pod can access node-level services listening on localhost
The Risk
# DANGEROUS: Pod with host network access
apiVersion: v1
kind: Pod
metadata:
  name: host-network-pod
spec:
  hostNetwork: true # Shares the host's network namespace
  containers:
  - name: attacker
    image: nicolaka/netshoot
    command: ["sleep", "infinity"]
Why hostNetwork Is Dangerous
A container with hostNetwork: true can:
- Sniff all network traffic on the node (including other pods)
- Access the kubelet API on localhost:10250
- Access the metadata service (cloud provider instance metadata)
- Bind to any port on the host, potentially impersonating services
- Bypass NetworkPolicies (which operate on pod IPs, not host IPs)
- Access etcd if running on a control plane node (localhost:2379)
Legitimate Use Cases
Only a few workloads genuinely need hostNetwork:
- CNI plugins (Calico, Cilium, Flannel) -- they configure the host network
- kube-proxy -- manages iptables rules on the host
- Ingress controllers -- sometimes need direct host port access
- Monitoring agents -- node-level network metrics
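As an illustration of a legitimate use, a node-level agent is usually deployed as a DaemonSet that opts into the host network while still dropping every other privilege. The following is only a sketch: the names, namespace, and image (node-metrics-agent, monitoring, example.com/node-agent:1.0) are hypothetical placeholders, not a real product.

```yaml
# Hypothetical node agent: names, namespace, and image are placeholders.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-metrics-agent
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app: node-metrics-agent
  template:
    metadata:
      labels:
        app: node-metrics-agent
    spec:
      hostNetwork: true                    # the one exception this workload needs
      dnsPolicy: ClusterFirstWithHostNet   # keep cluster DNS while on the host network
      containers:
      - name: agent
        image: example.com/node-agent:1.0
        securityContext:
          privileged: false
          allowPrivilegeEscalation: false
          readOnlyRootFilesystem: true
          capabilities:
            drop: ["ALL"]
```

Keeping such workloads in a dedicated namespace makes it easier to apply a looser policy there while the rest of the cluster stays locked down.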
hostPID
When hostPID: true is set, the pod shares the host's PID namespace.
What It Enables
- Pod can see all processes running on the host
- Pod can see processes in other containers
- Pod can send signals to host processes (with appropriate capabilities)
- Pod can read /proc/<pid>/ of host processes
The Risk
# DANGEROUS: Pod with host PID namespace
apiVersion: v1
kind: Pod
metadata:
  name: host-pid-pod
spec:
  hostPID: true # Shares the host's PID namespace
  containers:
  - name: attacker
    image: busybox
    command: ["sleep", "infinity"]
Why hostPID Is Dangerous
A container with hostPID: true can:
- List all host processes: ps aux shows everything on the node
- Read environment variables of other processes: cat /proc/<pid>/environ (may contain secrets)
- Read process memory (with SYS_PTRACE capability)
- Send signals to host processes: kill -9 <host-pid>
- Access /proc filesystem entries for kubelet, dockerd, etcd
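When the real requirement is only that containers in the same pod can see each other's processes (for example, a debugging sidecar), the pod-level shareProcessNamespace field is usually enough and does not expose host processes. A minimal sketch, assuming hypothetical pod and container names:

```yaml
# Sketch: share a PID namespace between containers of one pod only.
apiVersion: v1
kind: Pod
metadata:
  name: shared-pid-pod          # hypothetical name
spec:
  shareProcessNamespace: true   # containers in this pod share one PID namespace
  containers:
  - name: app
    image: nginx:1.27
  - name: debug
    image: busybox
    command: ["sleep", "infinity"]
```

The debug container can then see the app container's processes, but nothing on the node outside the pod.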
Demonstrating the Risk
# Inside a pod with hostPID: true
# See ALL host processes
ps aux
# Read environment variables of process 1 (systemd/init);
# this typically also requires root and the SYS_PTRACE capability
cat /proc/1/environ | tr '\0' '\n'
# See kubelet's command line arguments (may expose tokens);
# -o picks the oldest match in case pgrep finds several PIDs
cat /proc/$(pgrep -o kubelet)/cmdline | tr '\0' ' '
hostIPC
When hostIPC: true is set, the pod shares the host's IPC namespace.
What It Enables
- Pod can access host shared memory segments
- Pod can access host semaphores and message queues
- Pod can communicate with host processes via System V IPC
The Risk
# DANGEROUS: Pod with host IPC namespace
apiVersion: v1
kind: Pod
metadata:
  name: host-ipc-pod
spec:
  hostIPC: true # Shares the host's IPC namespace
  containers:
  - name: attacker
    image: busybox
    command: ["sleep", "infinity"]
Why hostIPC Is Dangerous
A container with hostIPC: true can:
- Read shared memory of host processes (may contain sensitive data)
- Interfere with host IPC mechanisms
- Access databases that use shared memory (PostgreSQL, Oracle)
- Enable side-channel attacks through shared memory inspection
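Workloads that only need more shared memory for their own processes rarely need hostIPC. A common alternative, sketched here under assumptions, is a memory-backed emptyDir mounted at /dev/shm, which stays private to the pod. The pod name, image, and size limit below are hypothetical placeholders.

```yaml
# Sketch: pod-private shared memory instead of hostIPC.
apiVersion: v1
kind: Pod
metadata:
  name: private-shm-pod               # hypothetical name
spec:
  containers:
  - name: app
    image: example.com/shm-app:1.0    # placeholder image
    volumeMounts:
    - name: shm
      mountPath: /dev/shm
  volumes:
  - name: shm
    emptyDir:
      medium: Memory                  # tmpfs-backed volume
      sizeLimit: "256Mi"              # illustrative value
```

Containers in the same pod already share an IPC namespace with each other, so this covers most intra-pod shared-memory needs without touching the host.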
Privileged Containers
Setting privileged: true is the most dangerous configuration possible. It effectively removes all container isolation.
What Privileged Mode Grants
| Feature | Normal Container | Privileged Container |
|---|---|---|
| Capabilities | ~14 default | ALL capabilities |
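| Device Access | Minimal default set (/dev/null, /dev/zero, etc.) | Access to ALL host devices |
| AppArmor | Runtime default profile (where supported) | Disabled |
| Seccomp | RuntimeDefault when configured (Unconfined otherwise) | Disabled |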
| /proc | Masked paths | Full access |
| /sys | Read-only | Read-write |
| SELinux | Enforced | Unconfined |
| Cgroups | Enforced | Can modify |
# EXTREMELY DANGEROUS: Privileged container
apiVersion: v1
kind: Pod
metadata:
  name: privileged-pod
spec:
  containers:
  - name: root-access
    image: ubuntu:22.04
    securityContext:
      privileged: true # Full host access
    command: ["sleep", "infinity"]
Why Privileged Containers Are Effectively Root on the Host
A privileged container can:
- Mount the host filesystem: mount /dev/sda1 /mnt -- read/write everything on the host
- Load kernel modules: insmod malicious.ko -- run code in the kernel
- Access all devices: /dev/mem, /dev/sda -- raw disk and memory access
- Modify iptables: change firewall rules, redirect traffic
- Escape the container: trivially break out to the host
- Compromise the entire cluster: pivot to other nodes via the kubelet
There is almost never a legitimate reason to run a privileged container in production.
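In many cases a workload that appears to need privileged mode only needs one or two capabilities. The following sketch shows that narrower grant, assuming a hypothetical network tool that needs NET_ADMIN; the pod name and image are placeholders.

```yaml
# Sketch: grant one capability instead of privileged: true.
apiVersion: v1
kind: Pod
metadata:
  name: net-admin-pod                 # hypothetical name
spec:
  containers:
  - name: net-tool
    image: example.com/net-tool:1.0   # placeholder image
    securityContext:
      privileged: false
      allowPrivilegeEscalation: false
      capabilities:
        drop: ["ALL"]                 # start from zero capabilities
        add: ["NET_ADMIN"]            # add back only what the tool needs
```

NET_ADMIN is still a powerful capability and is not permitted by the Baseline standard, so a pod like this should be treated as an exception and reviewed accordingly.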
Container Escape from Privileged Pod
This demonstrates why privileged containers are so dangerous:
# Inside a privileged container -- escape to host filesystem
# (the root device name varies by node; /dev/sda1 is an example)
mkdir -p /mnt/host
mount /dev/sda1 /mnt/host
# Now you can read/write the entire host filesystem
cat /mnt/host/etc/shadow
cat /mnt/host/etc/kubernetes/admin.conf
# Or, if the pod also sets hostPID: true, use nsenter to get a host shell
# (without hostPID, PID 1 is the container's own init process)
nsenter --target 1 --mount --uts --ipc --net --pid -- /bin/bash
# You are now running as root on the host
ProcMount Settings
The /proc filesystem exposes kernel and process information. By default, the container runtime masks sensitive paths within /proc, and procMount: Default keeps that masking in place.
Default vs Unmasked
# Default (masked) - safe
securityContext:
  procMount: Default
# Masked paths: /proc/acpi, /proc/kcore, /proc/keys,
# /proc/latency_stats, /proc/timer_list, /proc/timer_stats,
# /proc/sched_debug, /proc/scsi
# Unmasked - dangerous, exposes all of /proc
securityContext:
  procMount: Unmasked
WARNING
procMount: Unmasked should almost never be used. It exposes sensitive kernel information that can aid in container escapes and privilege escalation.
Read-Only Root Filesystem
Setting readOnlyRootFilesystem: true prevents the container from writing to its root filesystem.
Why It Matters
- Prevents attackers from modifying binaries in the container
- Blocks web shell drops and malware installation
- Forces applications to use designated writable volumes (emptyDir, etc.)
- Enforces immutable infrastructure principles
apiVersion: v1
kind: Pod
metadata:
  name: readonly-pod
spec:
  containers:
  - name: app
    image: nginx:1.27
    securityContext:
      readOnlyRootFilesystem: true
    volumeMounts:
    - name: cache
      mountPath: /var/cache/nginx
    - name: run
      mountPath: /var/run
    - name: tmp
      mountPath: /tmp
  volumes:
  - name: cache
    emptyDir: {}
  - name: run
    emptyDir: {}
  - name: tmp
    emptyDir: {}
Handling Read-Only Filesystem
When readOnlyRootFilesystem: true is set, applications that need to write temporary files will fail. The solution is to mount emptyDir volumes at the paths where the application needs to write (typically /tmp, /var/run, /var/cache).
Running as Non-Root
Running containers as non-root is one of the most important security practices.
runAsNonRoot
This tells the kubelet to verify at container start that the process will not run as root (UID 0). If the image is configured to run as root, or declares a non-numeric user that the kubelet cannot verify, the container fails to start.
apiVersion: v1
kind: Pod
metadata:
  name: nonroot-pod
spec:
  securityContext:
    runAsNonRoot: true # Reject if UID is 0
  containers:
  - name: app
    image: nginx:1.27 # This will FAIL -- nginx runs as root by default
runAsUser and runAsGroup
Explicitly set the UID and GID:
apiVersion: v1
kind: Pod
metadata:
  name: specific-user-pod
spec:
  securityContext:
    runAsUser: 1000 # Run as UID 1000
    runAsGroup: 1000 # Run as GID 1000
    fsGroup: 1000 # Mounted volumes are group-owned by this GID
    runAsNonRoot: true # Additional validation
  containers:
  - name: app
    image: python:3.12-slim
    command: ["python", "-m", "http.server", "8080"]
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop:
        - ALL
allowPrivilegeEscalation
This controls whether a process can gain more privileges than its parent:
securityContext:
  allowPrivilegeEscalation: false
When set to false:
- Setuid binaries cannot escalate privileges
- The no_new_privs flag is set on the process
- The process cannot gain capabilities beyond what it started with
Always Set to false
Unless your application specifically requires setuid binaries (extremely rare), always set allowPrivilegeEscalation: false. This is required by the Restricted Pod Security Standard.
Host Namespace Sharing Risks
Pod Security Standards Summary
Pod Security Standards define three levels that restrict these dangerous configurations. This is covered in detail in the Pod Security Standards section.
| Setting | Privileged | Baseline | Restricted |
|---|---|---|---|
| hostNetwork | Allowed | Denied | Denied |
| hostPID | Allowed | Denied | Denied |
| hostIPC | Allowed | Denied | Denied |
| privileged | Allowed | Denied | Denied |
| hostPorts | Allowed | Limited | Limited |
| runAsNonRoot | Not required | Not required | Required |
| allowPrivilegeEscalation | Allowed | Allowed | Must be false |
| capabilities | Any | Limited adds | Drop ALL, may add only NET_BIND_SERVICE |
| readOnlyRootFilesystem | Not required | Not required | Not required* |
| seccompProfile | Any | Any except Unconfined | RuntimeDefault or Localhost |
Note: readOnlyRootFilesystem is recommended but not strictly required by the Restricted standard.
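These levels are typically applied per namespace through Pod Security Admission labels. A minimal sketch, assuming a hypothetical namespace called secure-apps:

```yaml
# Sketch: enforce the Restricted standard on one namespace.
apiVersion: v1
kind: Namespace
metadata:
  name: secure-apps   # hypothetical namespace name
  labels:
    pod-security.kubernetes.io/enforce: restricted       # reject violating pods
    pod-security.kubernetes.io/enforce-version: latest
    pod-security.kubernetes.io/warn: restricted          # warn the client on violations
    pod-security.kubernetes.io/audit: restricted         # record violations in the audit log
```

Starting with only the warn and audit labels is a common way to see what would break before switching enforce to the stricter level.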
Complete Hardened Pod Example
This example combines all the host-level restrictions:
apiVersion: v1
kind: Pod
metadata:
  name: fully-hardened-pod
  labels:
    app: secure-app
spec:
  # Pod-level security
  securityContext:
    runAsUser: 10001
    runAsGroup: 10001
    runAsNonRoot: true
    fsGroup: 10001
    seccompProfile:
      type: RuntimeDefault
  # Explicitly deny host namespace sharing
  hostNetwork: false
  hostPID: false
  hostIPC: false
  # Prevent service account token automounting
  automountServiceAccountToken: false
  containers:
  - name: app
    image: python:3.12-slim
    command: ["python", "-m", "http.server", "8080"]
    ports:
    - containerPort: 8080
    # Container-level security
    securityContext:
      privileged: false
      allowPrivilegeEscalation: false
      readOnlyRootFilesystem: true
      capabilities:
        drop:
        - ALL
    # Resource limits (prevent DoS)
    resources:
      limits:
        memory: "128Mi"
        cpu: "250m"
      requests:
        memory: "64Mi"
        cpu: "100m"
    # Writable volumes for app needs
    volumeMounts:
    - name: tmp
      mountPath: /tmp
  volumes:
  - name: tmp
    emptyDir:
      sizeLimit: "50Mi"
Identifying Risky Pods
Finding Pods with Dangerous Settings
# Find pods with hostNetwork
kubectl get pods -A -o json | \
  jq -r '.items[] | select(.spec.hostNetwork==true) |
  "\(.metadata.namespace)/\(.metadata.name)"'
# Find pods with hostPID
kubectl get pods -A -o json | \
  jq -r '.items[] | select(.spec.hostPID==true) |
  "\(.metadata.namespace)/\(.metadata.name)"'
# Find privileged containers (any() lists each pod once, even with several matches)
kubectl get pods -A -o json | \
  jq -r '.items[] | select(any(.spec.containers[]; .securityContext.privileged==true)) |
  "\(.metadata.namespace)/\(.metadata.name)"'
# Find pods that do not set runAsNonRoot at the pod level
# (these may run as root; individual containers can still set it themselves)
kubectl get pods -A -o json | \
  jq -r '.items[] | select(.spec.securityContext.runAsNonRoot!=true) |
  "\(.metadata.namespace)/\(.metadata.name)"'
# Combined: find all pods with any dangerous setting
kubectl get pods -A -o json | jq -r '
  .items[] |
  select(
    .spec.hostNetwork==true or
    .spec.hostPID==true or
    .spec.hostIPC==true or
    any(.spec.containers[]; .securityContext.privileged==true)
  ) |
  "\(.metadata.namespace)/\(.metadata.name)"'
Quick Reference
Exam Speed Reference
Deny all host access:
spec:
  hostNetwork: false
  hostPID: false
  hostIPC: false
  containers:
  - name: app
    securityContext:
      privileged: false
      allowPrivilegeEscalation: false
      runAsNonRoot: true
      readOnlyRootFilesystem: true
      capabilities:
        drop: ["ALL"]
Find risky pods:
# Privileged pods
kubectl get pods -A -o json | jq '.items[] | select(any(.spec.containers[]; .securityContext.privileged==true)) | .metadata.name'
# Host network pods
kubectl get pods -A -o json | jq '.items[] | select(.spec.hostNetwork==true) | .metadata.name'
Key Exam Takeaways
- hostNetwork, hostPID, hostIPC should be false for all workloads unless absolutely necessary
- privileged: true is the most dangerous setting -- it removes all container isolation
- readOnlyRootFilesystem: true prevents filesystem modification attacks
- runAsNonRoot: true ensures the container does not run as root
- allowPrivilegeEscalation: false prevents gaining additional privileges
- Pod Security Admission enforces these settings at the namespace level, using the Pod Security Standards levels
- On the exam, you may need to identify and fix pods with dangerous settings