Host Namespaces and Privileges

Overview

Linux namespaces are the fundamental isolation mechanism that makes containers possible. Each namespace type provides a separate view of a specific system resource. Kubernetes allows pods to opt out of this isolation and share namespaces with the host -- a powerful but extremely dangerous configuration.

This section covers the risks of sharing host namespaces, running privileged containers, and the security controls you must enforce to prevent these attack vectors.

Linux Namespaces in Containers

hostNetwork

When hostNetwork: true is set, the pod uses the host's network namespace instead of getting its own.

What It Enables

Pod sees all network interfaces on the host (eth0, docker0, cni0, etc.)
Pod can bind to any host port directly
Pod can see all network traffic on the host
Pod has the same IP address as the host node
Pod can access node-level services listening on localhost

The Risk

yaml

# DANGEROUS: Pod with host network access
apiVersion: v1
kind: Pod
metadata:
  name: host-network-pod
spec:
  hostNetwork: true   # Shares the host's network namespace
  containers:
  - name: attacker
    image: nicolaka/netshoot
    command: ["sleep", "infinity"]

Why hostNetwork Is Dangerous

A container with hostNetwork: true can:

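Sniff traffic on the node's interfaces, including traffic to and from other pods (packet capture needs CAP_NET_RAW, which Docker and containerd grant by default)
Access the kubelet API on localhost:10250
Reach the cloud provider's instance metadata service as if it were the node
Bind to any port on the host, potentially impersonating services
Bypass NetworkPolicies (which operate on pod IPs, not host IPs)
Reach etcd on localhost:2379 if running on a control plane node (etcd is typically still protected by client certificate authentication)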
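
One way to see the exposure from inside a pod is to list the listening TCP sockets the pod can observe. The sketch below assumes a Linux /proc layout and defines a hypothetical helper, listening_ports (not part of any tool), that decodes the hex ports in /proc/net/tcp. In a pod with hostNetwork: true the file reflects the node's network namespace, so node services such as the kubelet (10250) may show up; in a regular pod only the pod's own listeners appear.

```shell
# listening_ports [FILE]: print the TCP ports in LISTEN state, one per line.
# FILE defaults to /proc/net/tcp (IPv4); pass /proc/net/tcp6 for IPv6 sockets.
listening_ports() {
  # Column 2 is the local address as HEXIP:HEXPORT; column 4 is the
  # socket state, where 0A means LISTEN.
  awk 'NR > 1 && $4 == "0A" { n = split($2, a, ":"); print a[n] }' \
    "${1:-/proc/net/tcp}" |
  while read -r hexport; do
    echo "$((0x$hexport))"   # convert the hex port to decimal
  done | sort -n -u
}
```

Inside the pod, run listening_ports for IPv4 and listening_ports /proc/net/tcp6 for IPv6 listeners.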

Legitimate Use Cases

Only a few workloads genuinely need hostNetwork:

CNI plugins (Calico, Cilium, Flannel) -- they configure the host network
kube-proxy -- manages iptables rules on the host
Ingress controllers -- sometimes need direct host port access
Monitoring agents -- node-level network metrics

hostPID

When hostPID: true is set, the pod shares the host's PID namespace.

What It Enables

Pod can see all processes running on the host
Pod can see processes in other containers
Pod can send signals to host processes (with appropriate capabilities)
Pod can read /proc/<pid>/ of host processes

The Risk

yaml

# DANGEROUS: Pod with host PID namespace
apiVersion: v1
kind: Pod
metadata:
  name: host-pid-pod
spec:
  hostPID: true   # Shares the host's PID namespace
  containers:
  - name: attacker
    image: busybox
    command: ["sleep", "infinity"]

Why hostPID Is Dangerous

A container with hostPID: true can:

List all host processes: ps aux shows everything on the node
Read environment variables of other processes: cat /proc/<pid>/environ (may contain secrets)
Read process memory (with SYS_PTRACE capability)
Send signals to host processes: kill -9 <host-pid>
Access /proc filesystem entries for kubelet, dockerd, etcd

Demonstrating the Risk

bash

# Inside a pod with hostPID: true
# See ALL host processes
ps aux

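# Read environment variables of process 1 (systemd/init)
# Reading another process's environ is subject to ptrace access checks,
# so this may be denied without root or CAP_SYS_PTRACE
cat /proc/1/environ | tr '\0' '\n'

# See kubelet's command line arguments (may expose tokens)
# pgrep can return several PIDs; take the first exact match
cat "/proc/$(pgrep -x kubelet | head -n 1)/cmdline" | tr '\0' ' '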

hostIPC

When hostIPC: true is set, the pod shares the host's IPC namespace.

What It Enables

Pod can access host shared memory segments
Pod can access host semaphores and message queues
Pod can communicate with host processes via System V IPC

The Risk

yaml

# DANGEROUS: Pod with host IPC namespace
apiVersion: v1
kind: Pod
metadata:
  name: host-ipc-pod
spec:
  hostIPC: true   # Shares the host's IPC namespace
  containers:
  - name: attacker
    image: busybox
    command: ["sleep", "infinity"]

Why hostIPC Is Dangerous

A container with hostIPC: true can:

Read shared memory of host processes (may contain sensitive data)
Interfere with host IPC mechanisms
Access databases that use shared memory (PostgreSQL, Oracle)
Enable side-channel attacks through shared memory inspection
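
To mirror the hostPID demonstration, the following sketch lists the System V shared memory segments visible from a pod. It assumes the kernel exposes /proc/sysvipc/shm, and shm_segments is a hypothetical helper name. With hostIPC: true the list reflects the host's IPC namespace; in a regular pod it is usually empty.

```shell
# shm_segments [FILE]: print "shmid size_bytes creator_pid" for each
# System V shared memory segment. FILE defaults to /proc/sysvipc/shm.
shm_segments() {
  # Skip the header row; the columns start with: key shmid perms size cpid
  awk 'NR > 1 { print $2, $4, $5 }' "${1:-/proc/sysvipc/shm}"
}
```

Where the ipcs utility is installed, ipcs -m, ipcs -q, and ipcs -s give a similar view of shared memory, message queues, and semaphores.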

Privileged Containers

Setting privileged: true is the most dangerous configuration possible. It effectively removes all container isolation.

What Privileged Mode Grants

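Feature	Normal Container	Privileged Container
Capabilities	Runtime default set (~14)	ALL capabilities
Device Access	Minimal default devices only (such as /dev/null and /dev/zero)	Access to ALL host devices
AppArmor	Runtime default profile (where AppArmor is enabled)	Unconfined
Seccomp	Applied if configured (Kubernetes defaults to Unconfined unless RuntimeDefault is set)	Disabled
/proc	Masked paths	Full access
/sys	Read-only	Read-write
SELinux	Enforced (where SELinux is enabled)	Unconfined
Cgroups	Enforced	Can modify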
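
The capabilities row can be checked from inside a running container. As a minimal sketch, the hypothetical helper count_caps below counts the bits set in the CapEff mask that Linux reports in /proc/self/status. The value 00000000a80425fb is the mask commonly seen with the Docker and containerd default capability set; the exact mask in a privileged container depends on the kernel version, so the second value is only an example.

```shell
# count_caps HEXMASK: print how many capability bits are set in HEXMASK.
count_caps() {
  mask=$((0x$1))
  count=0
  while [ "$mask" -gt 0 ]; do
    count=$((count + (mask & 1)))   # add the lowest bit
    mask=$((mask >> 1))             # move on to the next bit
  done
  echo "$count"
}

count_caps 00000000a80425fb   # prints 14
count_caps 000001ffffffffff   # prints 41
```

Inside a container, pass the live value instead: count_caps "$(awk '/^CapEff:/ {print $2}' /proc/self/status)".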

yaml

# EXTREMELY DANGEROUS: Privileged container
apiVersion: v1
kind: Pod
metadata:
  name: privileged-pod
spec:
  containers:
  - name: root-access
    image: ubuntu:22.04
    securityContext:
      privileged: true   # Full host access
    command: ["sleep", "infinity"]

Why Privileged Containers Are Effectively Root on the Host

A privileged container can:

Mount the host filesystem: mount /dev/sda1 /mnt -- read/write everything on the host (the device name varies by node)
Load kernel modules: insmod malicious.ko -- run code in the kernel
Access all devices: /dev/mem, /dev/sda -- raw disk and memory access
Modify iptables: change firewall rules, redirect traffic
Escape the container: trivially break out to the host
Compromise the entire cluster: pivot to other nodes using the kubelet credentials and other secrets stored on the node

Outside of a few node-level system components (some CNI and CSI node plugins, for example), there is almost never a legitimate reason to run a privileged container in production.

Container Escape from Privileged Pod

This demonstrates why privileged containers are so dangerous:

bash

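# Inside a privileged container -- escape to host filesystem
# (the root device name varies by node; check with lsblk or fdisk -l)
mkdir -p /mnt/host
mount /dev/sda1 /mnt/host

# Now you can read/write the entire host filesystem
cat /mnt/host/etc/shadow
# admin.conf exists only on kubeadm control plane nodes
cat /mnt/host/etc/kubernetes/admin.conf

# Or use nsenter to get a host shell
# (PID 1 is the host's init only if the pod also sets hostPID: true)
nsenter --target 1 --mount --uts --ipc --net --pid -- /bin/bash
# You are now running as root on the host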

ProcMount Settings

The /proc filesystem exposes kernel and process information. By default, the container runtime masks sensitive paths within /proc, and Kubernetes exposes this behavior through the procMount field.

Default vs Unmasked

yaml

# Default (masked) - safe
securityContext:
  procMount: Default
  # Masked paths: /proc/acpi, /proc/kcore, /proc/keys,
  # /proc/latency_stats, /proc/timer_list, /proc/timer_stats,
  # /proc/sched_debug, /proc/scsi

# Unmasked - dangerous, exposes all of /proc
securityContext:
  procMount: Unmasked

WARNING

procMount: Unmasked should almost never be used. It exposes sensitive kernel information that can aid in container escapes and privilege escalation.

Read-Only Root Filesystem

Setting readOnlyRootFilesystem: true prevents the container from writing to its root filesystem.

Why It Matters

Prevents attackers from modifying binaries in the container
Blocks web shell drops and malware installation
Forces applications to use designated writable volumes (emptyDir, etc.)
Enforces immutable infrastructure principles

yaml

apiVersion: v1
kind: Pod
metadata:
  name: readonly-pod
spec:
  containers:
  - name: app
    image: nginx:1.27
    securityContext:
      readOnlyRootFilesystem: true
    volumeMounts:
    - name: cache
      mountPath: /var/cache/nginx
    - name: run
      mountPath: /var/run
    - name: tmp
      mountPath: /tmp
  volumes:
  - name: cache
    emptyDir: {}
  - name: run
    emptyDir: {}
  - name: tmp
    emptyDir: {}

Handling Read-Only Filesystem

When readOnlyRootFilesystem: true is set, applications that need to write temporary files will fail. The solution is to mount emptyDir volumes at the paths where the application needs to write (typically /tmp, /var/run, /var/cache).
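
A quick way to confirm which paths still accept writes is to probe them from inside the container. The sketch below uses a hypothetical helper, check_writable, that tries to create and remove a marker file in each directory; the marker name .write-test is an arbitrary choice.

```shell
# check_writable DIR...: report whether each directory accepts writes.
check_writable() {
  for dir in "$@"; do
    if touch "$dir/.write-test" 2>/dev/null; then
      rm -f "$dir/.write-test"
      echo "$dir: writable"
    else
      echo "$dir: not writable"
    fi
  done
}
```

For the readonly-pod example, running check_writable / /tmp /var/run /var/cache/nginx inside the container would be expected to report only the three emptyDir mounts as writable.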

Running as Non-Root

Running containers as non-root is one of the most important security practices.

runAsNonRoot

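Setting runAsNonRoot: true tells the kubelet to verify at start time that the container does not run as root (UID 0). If the image is configured to run as root and no runAsUser overrides it, the container is not started and the pod reports CreateContainerConfigError.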

yaml

apiVersion: v1
kind: Pod
metadata:
  name: nonroot-pod
spec:
  securityContext:
    runAsNonRoot: true    # Reject if UID is 0
  containers:
  - name: app
    image: nginx:1.27     # This will FAIL -- nginx runs as root by default

runAsUser and runAsGroup

Explicitly set the UID and GID:

yaml

apiVersion: v1
kind: Pod
metadata:
  name: specific-user-pod
spec:
  securityContext:
    runAsUser: 1000       # Run as UID 1000
    runAsGroup: 1000      # Run as GID 1000
    fsGroup: 1000         # Supported volumes are owned by this GID (also a supplemental group)
    runAsNonRoot: true    # Additional validation
  containers:
  - name: app
    image: python:3.12-slim
    command: ["python", "-m", "http.server", "8080"]
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop:
          - ALL

allowPrivilegeEscalation

This controls whether a process can gain more privileges than its parent:

yaml

securityContext:
  allowPrivilegeEscalation: false

When set to false:

Setuid binaries cannot escalate privileges
no_new_privs flag is set on the process
The process cannot gain capabilities beyond what it started with

Always Set to false

Unless your application specifically requires setuid binaries (extremely rare), always set allowPrivilegeEscalation: false. This is required by the Restricted Pod Security Standard.
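
Whether the flag took effect can be verified from inside the container: Linux 4.10 and later report it as the NoNewPrivs field in /proc/<pid>/status. A minimal sketch, with no_new_privs as a hypothetical helper name:

```shell
# no_new_privs [FILE]: print 1 if the no_new_privs flag is set, 0 if not.
# FILE defaults to the status file of the current process.
no_new_privs() {
  awk '/^NoNewPrivs:/ { print $2 }' "${1:-/proc/self/status}"
}
```

Called without arguments in a container that sets allowPrivilegeEscalation: false, the helper should print 1, because the flag is inherited by every child process.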

Pod Security Standards Summary

Pod Security Standards define three levels that restrict these dangerous configurations. This is covered in detail in the Pod Security Standards section.

Setting	Privileged	Baseline	Restricted
`hostNetwork`	Allowed	Denied	Denied
`hostPID`	Allowed	Denied	Denied
`hostIPC`	Allowed	Denied	Denied
`privileged`	Allowed	Denied	Denied
`hostPorts`	Allowed	Limited	Limited
`runAsNonRoot`	Not required	Not required	Required
`allowPrivilegeEscalation`	Allowed	Allowed	Must be false
`capabilities`	Any	Limited adds	Drop ALL; may add only NET_BIND_SERVICE
`readOnlyRootFilesystem`	Not required	Not required	Not required*
`seccompProfile`	Any	Unconfined not allowed	RuntimeDefault or Localhost

Note: readOnlyRootFilesystem is recommended but not strictly required by the Restricted standard.
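
These levels are applied per namespace through Pod Security Admission labels. As a sketch, the hypothetical wrapper enforce_restricted below sets the enforce, warn, and audit labels to restricted with kubectl label; the namespace name in the usage note is a placeholder.

```shell
# enforce_restricted NAMESPACE: apply the Restricted level to a namespace.
enforce_restricted() {
  kubectl label --overwrite namespace "$1" \
    pod-security.kubernetes.io/enforce=restricted \
    pod-security.kubernetes.io/warn=restricted \
    pod-security.kubernetes.io/audit=restricted
}
```

Usage: enforce_restricted my-namespace. Pods that already run in the namespace are not evicted; the enforce label only rejects new pods that violate the level.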

Complete Hardened Pod Example

This example combines all the host-level restrictions:

yaml

apiVersion: v1
kind: Pod
metadata:
  name: fully-hardened-pod
  labels:
    app: secure-app
spec:
  # Pod-level security
  securityContext:
    runAsUser: 10001
    runAsGroup: 10001
    runAsNonRoot: true
    fsGroup: 10001
    seccompProfile:
      type: RuntimeDefault
  
  # Explicitly deny host namespace sharing
  hostNetwork: false
  hostPID: false
  hostIPC: false
  
  # Prevent service account token automounting
  automountServiceAccountToken: false
  
  containers:
  - name: app
    image: python:3.12-slim
    command: ["python", "-m", "http.server", "8080"]
    ports:
    - containerPort: 8080
    
    # Container-level security
    securityContext:
      privileged: false
      allowPrivilegeEscalation: false
      readOnlyRootFilesystem: true
      capabilities:
        drop:
          - ALL
    
    # Resource limits (prevent DoS)
    resources:
      limits:
        memory: "128Mi"
        cpu: "250m"
      requests:
        memory: "64Mi"
        cpu: "100m"
    
    # Writable volumes for app needs
    volumeMounts:
    - name: tmp
      mountPath: /tmp
  
  volumes:
  - name: tmp
    emptyDir:
      sizeLimit: "50Mi"

Identifying Risky Pods

Finding Pods with Dangerous Settings

bash

# Find pods with hostNetwork
kubectl get pods -A -o json | \
  jq -r '.items[] | select(.spec.hostNetwork==true) | 
  "\(.metadata.namespace)/\(.metadata.name)"'

# Find pods with hostPID
kubectl get pods -A -o json | \
  jq -r '.items[] | select(.spec.hostPID==true) | 
  "\(.metadata.namespace)/\(.metadata.name)"'

# Find privileged containers (any() avoids listing a pod once per container)
kubectl get pods -A -o json | \
  jq -r '.items[] | select(any(.spec.containers[]; .securityContext.privileged==true)) |
  "\(.metadata.namespace)/\(.metadata.name)"'

# Find pods that do not set runAsNonRoot at the pod level
# (candidates only: the image or a container-level setting may still use a non-root user)
kubectl get pods -A -o json | \
  jq -r '.items[] | select(.spec.securityContext.runAsNonRoot!=true) |
  "\(.metadata.namespace)/\(.metadata.name)"'

# Combined: find all pods with any dangerous setting
kubectl get pods -A -o json | jq -r '
  .items[] |
  select(
    .spec.hostNetwork==true or
    .spec.hostPID==true or
    .spec.hostIPC==true or
    any(.spec.containers[]; .securityContext.privileged==true)
  ) |
  "\(.metadata.namespace)/\(.metadata.name)"'

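The queries above print only pod names and skip init containers. As a sketch, the hypothetical helper audit_pods below reads the same pod list on stdin and prints one line per finding, naming the setting and checking init containers as well. It assumes jq is installed.

```shell
# audit_pods: read `kubectl get pods -A -o json` output on stdin and print
# one "namespace/name<TAB>finding" line per risky setting.
audit_pods() {
  jq -r '
    .items[]
    | "\(.metadata.namespace)/\(.metadata.name)" as $pod
    | (
        (if .spec.hostNetwork == true then "hostNetwork" else empty end),
        (if .spec.hostPID == true then "hostPID" else empty end),
        (if .spec.hostIPC == true then "hostIPC" else empty end),
        ( ((.spec.containers // []) + (.spec.initContainers // []))[]
          | select(.securityContext.privileged == true)
          | "privileged container: \(.name)" )
      )
    | "\($pod)\t\(.)"
  '
}
```

Usage: kubectl get pods -A -o json | audit_pods
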
Quick Reference

Exam Speed Reference

Deny all host access:

yaml

spec:
  hostNetwork: false
  hostPID: false
  hostIPC: false
  containers:
  - name: app
    securityContext:
      privileged: false
      allowPrivilegeEscalation: false
      runAsNonRoot: true
      readOnlyRootFilesystem: true
      capabilities:
        drop: ["ALL"]

Find risky pods:

bash

# Privileged pods
kubectl get pods -A -o json | jq '.items[] | select(any(.spec.containers[]; .securityContext.privileged==true)) | .metadata.name'

# Host network pods
kubectl get pods -A -o json | jq '.items[] | select(.spec.hostNetwork==true) | .metadata.name'

Key Exam Takeaways

hostNetwork, hostPID, hostIPC should be false for all workloads unless absolutely necessary
privileged: true is the most dangerous setting -- it removes all container isolation
readOnlyRootFilesystem: true prevents filesystem modification attacks
runAsNonRoot: true ensures the container does not run as root
allowPrivilegeEscalation: false prevents gaining additional privileges
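Most of these settings are enforced per namespace by Pod Security Admission, which applies the Pod Security Standards (readOnlyRootFilesystem is not among the settings they check)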
On the exam, you may need to identify and fix pods with dangerous settings
