Docker guide
What Docker does in production
Docker standardizes how apps are built, shipped, and run. A single image runs the same on a laptop, CI runner, or cloud VM — reducing “works on my machine” drift. Teams use Docker for microservices, local dev stacks, CI test environments, and as the runtime under Kubernetes (containerd/CRI).
Images vs containers
- Image — read-only template (layers + metadata). Built from a Dockerfile or pulled from a registry
- Container — runnable instance of an image with a writable layer, process, and config (env, ports, mounts)
Images are identified by repository:tag (e.g. nginx:1.25).
The default tag is latest — pin explicit tags in production.
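The repository:tag rule above can be sketched in a few lines. This is a simplified illustration, not Docker's own reference parser: the function names are hypothetical, and digest references (name@sha256:...) are deliberately rejected rather than handled.

```python
def split_image_ref(ref: str) -> tuple[str, str]:
    """Split 'repository[:tag]' into (repository, tag), defaulting to 'latest'."""
    if "@" in ref:
        # Digest references are out of scope for this sketch.
        raise ValueError("digest references are not handled here")
    # A colon marks a tag only if it comes after the last slash;
    # otherwise it is a registry port, as in localhost:5000/app.
    last_slash = ref.rfind("/")
    last_colon = ref.rfind(":")
    if last_colon > last_slash:
        return ref[:last_colon], ref[last_colon + 1:]
    return ref, "latest"


def is_pinned(ref: str) -> bool:
    """True when the reference names an explicit tag other than 'latest'."""
    return split_image_ref(ref)[1] != "latest"
```

A deploy script could call is_pinned() on every image name and refuse to continue when it returns False.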
How a container runs
When you docker run, the typical workflow is:
- Image resolve — pull from registry if missing locally
- Container create — Docker (via containerd) sets up namespaces (PID, network, mount, UTS, IPC)
- Cgroups — CPU, memory, and I/O limits applied if configured
- Filesystem — image layers mounted read-only; writable container layer + any volumes attached
- Process start — entrypoint/CMD runs as PID 1 inside the container
- Lifecycle — runs until PID 1 exits, or until docker stop sends SIGTERM and then, after a grace period (10 seconds by default), SIGKILL
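Because docker stop delivers SIGTERM to PID 1 and falls back to SIGKILL after the grace period, the main process should handle SIGTERM itself. On Linux, a PID 1 process with no handler installed for SIGTERM is not terminated by it, so the stop would wait out the full grace period. The following is a minimal sketch of a cooperative shutdown; the function names and the placeholder work loop are illustrative assumptions, not part of Docker.

```python
import signal
import threading

# Set when a shutdown signal arrives; the work loop checks it.
stop_requested = threading.Event()


def handle_shutdown(signum, frame):
    """Record the request so the work loop can exit cleanly."""
    stop_requested.set()


def install_handlers() -> None:
    # docker stop sends SIGTERM; Ctrl-C on an attached terminal sends SIGINT.
    signal.signal(signal.SIGTERM, handle_shutdown)
    signal.signal(signal.SIGINT, handle_shutdown)


def serve_until_stopped(poll_seconds: float = 1.0) -> str:
    """Run the placeholder work loop until a shutdown signal is seen."""
    while not stop_requested.is_set():
        # Real work would go here; wait() returns early once the event is set.
        stop_requested.wait(timeout=poll_seconds)
    # Flush buffers and close connections here before returning.
    return "stopped"
```

An entrypoint script would call install_handlers() once at startup and then serve_until_stopped().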
Containers are not VMs — they share the host kernel. Isolation is strong for normal workloads but not equivalent to separate machines for all threat models.
Container internals
A container is not magic — it is a normal Linux process (or tree of processes) with extra kernel features applied. Understanding namespaces, cgroups, the union filesystem, and the runtime stack helps when debugging “it works on the host but not in the container” and when tuning limits or disk use.
Linux namespaces — isolation
Namespaces give each container its own view of system resources. Processes inside cannot see host processes or networks unless explicitly shared.
- PID — process IDs restart at 1 inside the container; host PIDs are hidden
- Network — own interfaces, routes, and ports (published ports are NAT rules on the host)
- Mount — separate filesystem root (/ inside the container)
- UTS — hostname and domain name
- IPC — shared memory and semaphores isolated from other containers
- User — UID/GID mapping (rootless Docker/Podman map container root to an unprivileged host user)
cgroups — limits and accounting
Control groups (cgroups) cap and measure resource use. When you set
docker run --memory 512m --cpus 1.5, Docker configures cgroup limits
the kernel enforces — exceeding memory can OOM-kill the container (exit 137).
- Memory — hard limit; swap behavior depends on host and flags
- CPU — shares or hard caps (--cpus, cpu_quota)
- pids — max processes per container (--pids-limit)
- blkio — I/O weight and throttling on supported setups
docker stats reads cgroup metrics. No limits means a container can
consume all host RAM or CPU — always set limits on shared production hosts.
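The arithmetic behind these flags is simple enough to sketch. The helpers below are hypothetical and only illustrate the conversions: --memory 512m is read as binary megabytes, --cpus 1.5 corresponds to a CPU quota of 150000 microseconds per 100000-microsecond period (the default period), and exit 137 is 128 plus the SIGKILL signal number.

```python
import signal

CFS_PERIOD_US = 100_000  # default scheduler period in microseconds

_UNITS = {"b": 1, "k": 1024, "m": 1024**2, "g": 1024**3}


def memory_to_bytes(value: str) -> int:
    """Convert a size such as '512m' to bytes using binary multiples."""
    value = value.strip().lower()
    if value and value[-1] in _UNITS:
        return int(value[:-1]) * _UNITS[value[-1]]
    return int(value)  # a plain number is taken as bytes


def cpus_to_quota(cpus: float, period_us: int = CFS_PERIOD_US) -> int:
    """CPU time in microseconds the container may use per period."""
    return round(cpus * period_us)


def exit_code_for_signal(signum: int) -> int:
    """Exit status reported for a process killed by a signal."""
    return 128 + signum
```

For example, exit_code_for_signal(signal.SIGKILL) gives the 137 seen after an OOM kill, and exit_code_for_signal(signal.SIGTERM) gives 143.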
overlayfs — image layers and the writable layer
Docker stores images as stacked read-only layers. At run time the engine mounts them with overlayfs (or another graph driver) and adds a thin writable container layer on top. File changes inside a running container go to that writable layer; the image layers underneath stay unchanged.
- Each Dockerfile instruction that changes the filesystem can create a new layer
- Deleting a file in a container often creates a “whiteout” — the lower layer file is hidden, not erased from the image
- Data you must keep belongs in volumes or bind mounts — the writable layer is discarded when the container is removed
- On disk: image data under /var/lib/docker/overlay2/ (default driver on modern Linux)
Large images and many layers fill disk; use docker history and
docker system df to audit. Multi-stage builds shrink the final image by leaving build-only tools and intermediate files out of it.
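The lookup rule of a layered filesystem can be shown with a toy model. This is not how overlayfs is implemented in the kernel; it is a sketch that uses dictionaries as layers and a marker object as the whiteout, with made-up paths and contents.

```python
WHITEOUT = object()  # marker meaning "deleted in an upper layer"


def resolve(path, layers):
    """Look up path in layers ordered from top (writable) to bottom (base)."""
    for layer in layers:
        if path in layer:
            entry = layer[path]
            return None if entry is WHITEOUT else entry
    return None


# Two read-only image layers and one writable container layer.
base = {"/etc/app.conf": "v1", "/bin/tool": "binary"}
app = {"/app/main.py": "print('hi')"}
writable = {}
layers = [writable, app, base]

# A write inside the container lands in the writable layer only.
writable["/etc/app.conf"] = "v2"
# A delete records a whiteout; the base layer still holds the file.
writable["/bin/tool"] = WHITEOUT
```

Resolving against a fresh writable layer, as in resolve("/bin/tool", [{}, app, base]), finds the file again, which mirrors what happens when a container is removed and a new one is started from the same image.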
containerd — the runtime below Docker
Modern Docker does not talk to the kernel directly for every operation.
containerd is a core container runtime: pull images, create containers,
start/stop processes, and manage snapshots. The Docker daemon (dockerd)
exposes the user-friendly API; containerd does the heavy lifting. A low-level CLI,
ctr, can manage containerd objects when debugging without Docker.
Kubernetes uses containerd (or CRI-O) via the Container Runtime
Interface (CRI) — not dockerd. Nodes run pods through containerd; kubectl
never calls docker run on modern clusters. See the
Kubernetes lab for the control-plane view.
On a host with both Docker and K8s, you may see containerd shared or separate
depending on install method.
Stack in practice: docker CLI → dockerd →
containerd → runc (OCI runtime) → namespaces + cgroups +
rootfs mount → your process as PID 1.
Dockerfile essentials
A Dockerfile defines image build steps:
- FROM — base image
- RUN — build-time commands (install packages, compile)
- COPY / ADD — add files into the image (COPY preferred)
- ENV — environment variables baked in or defaulted
- EXPOSE — documents ports (does not publish them)
- CMD / ENTRYPOINT — default command when container starts
Order layers for cache efficiency — put rarely changing steps (base, apt install)
before frequently changing app code. Use .dockerignore to exclude
node_modules, .git, and build artifacts from the context.
Volumes and bind mounts
- Bind mount — -v /host/path:/container/path — maps a host directory into the container
- Named volume — Docker-managed storage in /var/lib/docker/volumes/ — survives container removal
- tmpfs — in-memory mount for ephemeral secrets or caches
Permission issues often appear when the container user ID does not match the host
directory owner. Fix with user: in Compose, chown on the host,
or an entrypoint that adjusts permissions.
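When a mount fails with a permission error, it helps to compare the process identity with the directory owner from inside the container. The helper below is a hypothetical diagnostic sketch that an entrypoint could run against a mounted path; it only reports and does not change ownership.

```python
import os


def ownership_report(path: str) -> dict:
    """Compare the current process UID/GID with the owner of path."""
    st = os.stat(path)
    return {
        "process_uid": os.getuid(),
        "process_gid": os.getgid(),
        "owner_uid": st.st_uid,
        "owner_gid": st.st_gid,
        "uid_matches": os.getuid() == st.st_uid,
        # os.access checks with the real UID/GID of the process.
        "writable": os.access(path, os.W_OK),
    }
```

If uid_matches is False and writable is False, the usual fixes apply: run the container as the owning UID, change ownership on the host, or adjust permissions in the entrypoint.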
Networking
Default bridge network: containers get private IPs; publish ports with
-p host:container. Common modes:
- bridge — default isolated network with port mapping
- host — container shares host network stack (no port mapping isolation)
- none — no networking
- user-defined bridge — containers resolve each other by name (Compose creates one per project)
From the host, reach a published port on localhost. Container-to-container
traffic on the same user-defined network uses service names as DNS.
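A quick TCP probe is often enough to tell whether a published port or a service name is reachable. This is a minimal sketch with a hypothetical function name; the host names and ports in the usage note are examples, not values from any particular setup.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS lookup failures.
        return False
```

From the host, port_is_open("localhost", 8080) would test a container started with -p 8080:80. From another container on the same user-defined network, port_is_open("db", 5432) would test a service assumed to be named db.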
Docker Compose
A Compose file (docker-compose.yml, run with docker-compose or, in Compose v2, docker compose) defines
multi-container stacks such as web + database + cache. One file describes images, env vars,
ports, volumes, networks, and dependencies (depends_on). Use it for local
dev and small deployments; production often moves to Kubernetes, where the same concepts
(services, env vars, volumes) carry over even though the manifest format differs.
Registry and image lifecycle
Images are pulled from registries (Docker Hub, ECR, GCR, Harbor). Authenticate with
docker login. Scan images for CVEs; rebuild base images regularly.
Prune unused images and build cache — disk fills quickly on CI and dev machines:
docker system df, docker image prune.
Docker daemon and CLI
The docker CLI talks to the dockerd daemon via a Unix
socket (/var/run/docker.sock). Permission denied on the socket usually means
the user is neither root nor a member of the docker group. The daemon manages images,
containers, networks, and volumes on that host.
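The common socket failures can be told apart with a few filesystem checks. The helper below is a rough diagnostic sketch, not part of Docker: the function name and return strings are made up for illustration, and it assumes the default socket path mentioned above.

```python
import os
import stat

DOCKER_SOCKET = "/var/run/docker.sock"


def diagnose_socket(path: str = DOCKER_SOCKET) -> str:
    """Give a rough reason why the docker CLI might fail to reach dockerd."""
    try:
        st = os.stat(path)
    except FileNotFoundError:
        # dockerd may not be running, or it listens somewhere else.
        return "missing"
    if not stat.S_ISSOCK(st.st_mode):
        return "not-a-socket"
    if not os.access(path, os.R_OK | os.W_OK):
        # Typically: the user is neither root nor in the docker group.
        return "permission-denied"
    return "ok"
```

A result of "ok" only means the socket file is accessible; it does not prove the daemon behind it is healthy.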
Podman (related)
Podman offers a Docker-compatible CLI without a long-running daemon —
useful for rootless containers on RHEL/Fedora. Most commands map directly
(podman run, podman build). See the
Podman lab and
podman scenarios for daemonless and rootless specifics.
Production practices
- One process per container — or a proper init if you need multiple (tini, dumb-init)
- Logs to stdout/stderr — collected by docker logs or a log driver
- Health checks — HEALTHCHECK in Dockerfile or Compose healthcheck
- Resource limits — --memory, --cpus to avoid starving the host
- Non-root user — USER in Dockerfile when the app allows it
- Read-only root — --read-only plus tmpfs for writable dirs where supported
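Several of these practices map directly to docker run flags. The sketch below assembles such a command line as an argument list; the function name, the default values, and the image name example/app:1.4.2 are placeholders chosen for illustration, and a given image may need different writable directories or a different user.

```python
import shlex


def hardened_run_args(image, *, memory="512m", cpus="1.5", pids_limit=256,
                      user="1000:1000", tmpfs_dirs=("/tmp",), init=True):
    """Build a docker run argument list with limits and a read-only root."""
    args = [
        "docker", "run", "--detach",
        "--memory", memory,               # memory limit
        "--cpus", str(cpus),              # CPU cap
        "--pids-limit", str(pids_limit),  # max processes
        "--user", user,                   # run as a non-root UID:GID
        "--read-only",                    # read-only root filesystem
    ]
    if init:
        args.append("--init")             # small init process as PID 1
    for directory in tmpfs_dirs:
        args += ["--tmpfs", directory]    # writable in-memory directories
    args.append(image)
    return args


# Print the command as a single shell-quoted line for review.
print(shlex.join(hardened_run_args("example/app:1.4.2")))
```

Keeping the arguments as a list makes it straightforward to pass them to subprocess.run without shell quoting issues.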
Learning resources
- Docker documentation — docs.docker.com
- Dockerfile reference — docs.docker.com — Dockerfile
- Compose — docs.docker.com — Compose
- Storage — docs.docker.com — storage
- Networking — docs.docker.com — network
Practice scenarios
Hands-on Docker scenarios on live Linux VMs: docker