Container Internals Deep Dive 01: Cgroups
How cgroups enforce resource limits for CPU, memory, and I/O in container workloads.
This post is part of the Container Internals Deep Dive series:
- Part 1: Container Internals Deep Dive 00
- Part 2: Container Internals Deep Dive 01: Cgroups
- Part 3: Container Internals Deep Dive 02: Namespaces
- Part 4: Container Internals Deep Dive 03: Network Namespaces and CNI
- Part 5: Container Internals Deep Dive 04: containerd Internals
- Part 6: Container Internals Deep Dive 05: OCI Standard
- Part 7: Container Internals Deep Dive 06: runc vs crun
- Part 8: Container Internals Deep Dive 07: Rootless Containers with Podman
- Part 9: Container Internals Deep Dive 08: Kata Containers
- Part 10: Container Internals Deep Dive 09: Firecracker microVM
Series: 2/10. In part 00 we covered chroot. In this part we cover cgroups.
Cgroups (control groups) are the Linux kernel mechanism used to account for and limit resources for a set of processes.
## Why cgroups matter for containers
Without cgroups, one noisy workload can consume all CPU or memory on a host. With cgroups, each container can be bounded and monitored.
Typical controls:
- `cpu.max` (v2): CPU quota and period
- `memory.max`: hard memory limit
- `memory.high`: soft memory pressure threshold
- `pids.max`: process count cap
- `io.max`: block I/O limits
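These knobs are plain text files in the cgroup filesystem. As a sketch of the `cpu.max` format, here is how a runtime might derive the `quota period` pair from a millicore CPU limit (the function name is hypothetical; 100000 microseconds is the kernel's default period):

```shell
# Sketch: derive a cpu.max value from a CPU limit in millicores.
cpus_to_cpu_max() {
  period=100000                      # default period in microseconds
  quota=$(( $1 * period / 1000 ))   # $1 = CPU limit in millicores
  echo "$quota $period"             # the "quota period" format cpu.max expects
}

cpus_to_cpu_max 1500   # -> 150000 100000, i.e. 1.5 CPUs per 100 ms period
```

A quota larger than the period simply means the group may use more than one CPU's worth of time per period.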
## cgroups v1 vs v2 (quick view)
- v1: separate hierarchies per controller; operationally messy
- v2: unified hierarchy; cleaner semantics and pressure reporting
Most modern distros and Kubernetes setups are moving to cgroups v2.
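A quick way to tell which version a host is running is to check the filesystem type mounted at `/sys/fs/cgroup` (assuming the standard mount point):

```shell
# cgroup2fs indicates the v2 unified hierarchy;
# tmpfs typically indicates a v1 per-controller layout underneath.
stat -fc %T /sys/fs/cgroup
```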
## Hands-on: inspect a running container
```shell
docker run --rm -d --name cgroup-demo --memory=256m --cpus=1 nginx:stable
docker inspect cgroup-demo --format '{{.HostConfig.Memory}} {{.HostConfig.NanoCpus}}'
docker exec cgroup-demo sh -c 'cat /proc/self/cgroup'
```
On a cgroups v2 host, inspect the limits directly from the host's cgroup filesystem:
```shell
CID=$(docker inspect cgroup-demo --format '{{.Id}}')
# Path assumes the systemd cgroup driver, the default on most modern distros
CG=/sys/fs/cgroup/system.slice/docker-${CID}.scope
sudo cat "$CG/memory.max"
sudo cat "$CG/cpu.max"
```
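Both files hold simple text values: `memory.max` is a byte count (or `max`), and `cpu.max` is a `quota period` pair. As a small sketch (the helper name is hypothetical), turning a `cpu.max` line back into a CPU count looks like this:

```shell
# Sketch: interpret a cpu.max line as a CPU count.
cpu_max_to_cpus() {
  case "$1" in
    "max "*) echo unlimited ;;                        # no quota configured
    *) echo "$1" | awk '{ printf "%.1f\n", $1/$2 }' ;;
  esac
}

cpu_max_to_cpus "100000 100000"   # -> 1.0 (what --cpus=1 above produces)
cpu_max_to_cpus "max 100000"      # -> unlimited
```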
## Common operational mistakes
- Setting only a memory limit with no CPU cap, causing host-level CPU contention.
- Setting tight memory limits without profiling peak RSS.
- Ignoring `memory.high`, which enables graceful reclaim before the OOM killer fires.
## Kubernetes mapping
Kubernetes requests and limits translate to cgroup controls via the runtime.
- CPU limit -> quota (`cpu.max`)
- Memory limit -> `memory.max`
- Pod-level and container-level constraints are composed by the kubelet and the runtime
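For memory, the translation is a unit conversion: a Kubernetes quantity like `256Mi` becomes a byte count in `memory.max`. A simplified sketch (the function name is hypothetical, and real kubelet quantity parsing handles many more suffixes than `Mi`/`Gi`):

```shell
# Sketch: convert a Kubernetes memory quantity into a memory.max byte value.
k8s_mem_to_memory_max() {
  case "$1" in
    *Mi) echo $(( ${1%Mi} * 1024 * 1024 )) ;;
    *Gi) echo $(( ${1%Gi} * 1024 * 1024 * 1024 )) ;;
    *)   echo "$1" ;;   # assume plain bytes
  esac
}

k8s_mem_to_memory_max 256Mi   # -> 268435456, matching --memory=256m above
```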
## Takeaway
Cgroups are the resource contract for containers. If your SLOs matter, cgroup tuning is not optional.