Why Docker?
Docker solves the "works on my machine" problem. It packages your application with everything it needs -- runtime, libraries, system tools, configs -- into a portable, reproducible image that runs as a container.
Before Docker, deploying meant wrestling with dependency versions, OS differences, and configuration drift. Docker removes most of that friction: you build an image once and run it on any host with a compatible container runtime and CPU architecture.
Containers vs Virtual Machines
Containers and VMs both isolate applications, but they work at different levels.
      Virtual Machines                      Containers
+-------+ +-------+ +-------+      +-------+ +-------+ +-------+
|  App  | |  App  | |  App  |      |  App  | |  App  | |  App  |
+-------+ +-------+ +-------+      +-------+ +-------+ +-------+
| Bins  | | Bins  | | Bins  |      | Bins  | | Bins  | | Bins  |
| Libs  | | Libs  | | Libs  |      | Libs  | | Libs  | | Libs  |
+-------+ +-------+ +-------+      +-------+-+-------+-+-------+
| Guest | | Guest | | Guest |      |     Container Runtime     |
|  OS   | |  OS   | |  OS   |      |         (Docker)          |
+-------+ +-------+ +-------+      +---------------------------+
+---------------------------+      |          Host OS          |
|        Hypervisor         |      +---------------------------+
+---------------------------+      |         Hardware          |
|          Host OS          |      +---------------------------+
+---------------------------+
|         Hardware          |
+---------------------------+
| Aspect | Virtual Machine | Container |
|---|---|---|
| Isolation Level | Full OS isolation | Process-level isolation |
| Boot Time | Minutes | Seconds |
| Size | Gigabytes (full OS) | Megabytes (app + deps) |
| Resource Usage | Heavy (each VM runs full OS) | Lightweight (shares host kernel) |
| Portability | Portable (VM image) | Highly portable (OCI standard) |
| Performance | Near-native with overhead | Near-native, minimal overhead |
| Use Case | Different OS requirements | Microservices, CI/CD, dev environments |
Key insight: VMs virtualize hardware, containers virtualize the operating system. Containers share the host kernel, which is why they're so much lighter.
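One way to see the shared kernel for yourself is to compare the kernel release reported on the host with the one reported inside a container. This is a minimal sketch that assumes a Linux host, a running Docker daemon, and access to the public alpine image; the function name compare_kernels is only illustrative. On Docker Desktop for macOS or Windows the container reports the kernel of Docker's Linux VM, so the two values will differ there.

```shell
# Compare the kernel release seen by the host and by a container.
# Assumes: Linux host, Docker installed, the public "alpine" image can be pulled.
compare_kernels() {
  host_kernel=$(uname -r)
  container_kernel=$(docker run --rm alpine uname -r) || return 1
  echo "host:      $host_kernel"
  echo "container: $container_kernel"
  # Succeeds only when both report the same kernel release.
  [ "$host_kernel" = "$container_kernel" ]
}

# Usage (needs a running Docker daemon):
#   compare_kernels && echo "same kernel"
```

A VM running the same command would report its guest kernel instead, which is the practical difference the table above describes.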
Core Concepts
Images vs Containers
An image is a read-only template. A container is a runnable instance of an image: Docker adds a thin writable layer on top of the image and starts a process in it.
Image (template)              Container (running instance)
+------------------+          +------------------+
| Node.js 20       |          | Node.js 20       |
| App code         |  ---->   | App code         |
| Dependencies     |   run    | Dependencies     |
| Config files     |          | Config files     |
+------------------+          | + writable layer |
    (immutable)               +------------------+
                               (running process)
Think of it like a class vs an object. The image is the class definition, the container is the instantiated object.
# List local images
docker images
# List running containers
docker ps
# List all containers (including stopped)
docker ps -a
Docker Hub and Registries
Docker Hub is the default public registry for Docker images. It hosts official images for Node.js, Python, PostgreSQL, Redis, Nginx, and thousands more.
# Pull an image from Docker Hub
docker pull node:20-alpine
# Pull a specific version
docker pull postgres:16
# Pull from a private registry
docker pull registry.company.com/my-app:latest
Dockerfile Deep Dive
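A Dockerfile is a text file with instructions to build an image. Instructions that change the filesystem (RUN, COPY, ADD) each add a layer; the others only record metadata in the image configuration.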
Basic Dockerfile for a Node.js App
# Use official Node.js image as base
FROM node:20-alpine
# Set working directory inside the container
WORKDIR /app
# Copy package files first (for layer caching)
COPY package.json package-lock.json ./
# Install production dependencies only
# (--omit=dev replaces the deprecated --only=production flag)
RUN npm ci --omit=dev
# Copy application code
COPY . .
# Expose the port the app listens on
EXPOSE 3000
# Define the command to start the app
CMD ["node", "server.js"]
Dockerfile Instructions Reference
| Instruction | Purpose | Example |
|---|---|---|
| FROM | Base image | FROM node:20-alpine |
| WORKDIR | Set working directory | WORKDIR /app |
| COPY | Copy files from host to image | COPY . . |
| ADD | Like COPY but supports URLs and tar extraction | ADD archive.tar.gz /app |
| RUN | Execute command during build | RUN npm install |
| CMD | Default command when container starts | CMD ["node", "app.js"] |
| ENTRYPOINT | Fixed command (CMD becomes arguments) | ENTRYPOINT ["node"] |
| ENV | Set environment variable | ENV NODE_ENV=production |
| EXPOSE | Document which port the app uses | EXPOSE 3000 |
| ARG | Build-time variable | ARG VERSION=1.0 |
| VOLUME | Create a mount point | VOLUME /data |
| USER | Set the user for subsequent instructions | USER node |
| HEALTHCHECK | Define a health check command | HEALTHCHECK CMD curl -f http://localhost:3000 |
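The interaction between ENTRYPOINT and CMD is the part of this table that most often trips people up, so here is a small shell model of it. It is a simplified sketch of the exec-form rules only, not Docker's implementation, and the function name effective_command is made up for illustration.

```shell
# Simplified model of how the container's command line is assembled
# (exec form only). Hypothetical helper, not part of Docker.
# Usage: effective_command "<entrypoint>" "<cmd>" [docker-run-args...]
effective_command() {
  entrypoint=$1
  default_cmd=$2
  shift 2
  if [ "$#" -gt 0 ]; then
    # Arguments given after the image name in `docker run` replace CMD entirely.
    args="$*"
  else
    args=$default_cmd
  fi
  # ENTRYPOINT (if any) is always kept; CMD or the run arguments are appended.
  if [ -n "$entrypoint" ]; then
    echo "$entrypoint $args"
  else
    echo "$args"
  fi
}

effective_command "node" "app.js"              # ENTRYPOINT ["node"], CMD ["app.js"]
effective_command "node" "app.js" "worker.js"  # docker run my-app worker.js
effective_command "" "node app.js" "/bin/sh"   # no ENTRYPOINT: run args replace CMD
```

The three calls print `node app.js`, `node worker.js`, and `/bin/sh`, which is why an image with only CMD can be opened with a shell directly while an image with ENTRYPOINT needs `--entrypoint` to do the same.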
Multi-Stage Builds
Multi-stage builds dramatically reduce final image size by separating the build environment from the runtime environment.
# Stage 1: Build
FROM node:20-alpine AS builder
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci
COPY . .
RUN npm run build
# Drop dev dependencies so only production packages reach the final stage
RUN npm prune --omit=dev
# Stage 2: Production
FROM node:20-alpine AS production
WORKDIR /app
# Copy only what we need from the builder stage
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
COPY --from=builder /app/package.json ./
# Run as non-root user
USER node
EXPOSE 3000
CMD ["node", "dist/server.js"]
Build stage image: ~800MB (Node.js + dev deps + source + build tools)
Final image: ~150MB (Node.js + prod deps + compiled output)
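To check figures like these against your own build, you can tag the intermediate stage and compare the two image sizes. The helper below is a sketch that assumes Docker is installed; image_size_mb and the tags in the usage comment are hypothetical names, and the sizes you see will depend on your dependencies.

```shell
# Print an image's size in megabytes (decimal), using `docker image inspect`.
image_size_mb() {
  size_bytes=$(docker image inspect --format '{{.Size}}' "$1") || return 1
  echo $((size_bytes / 1000000))
}

# Usage (needs Docker and the multi-stage Dockerfile above; tags are hypothetical):
#   docker build --target builder -t my-app:build .
#   docker build -t my-app:1.0 .
#   echo "builder: $(image_size_mb my-app:build) MB, final: $(image_size_mb my-app:1.0) MB"
```

The `--target` flag stops the build at the named stage, which is what makes the build-stage image available for comparison.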
.dockerignore
Like .gitignore, this file tells Docker which files to exclude from the build context. Without it, docker build sends everything in the directory to the Docker daemon, and COPY . . copies it into the image -- including node_modules, .git, and other unnecessary files.
node_modules
.git
.gitignore
.env
.env.local
dist
coverage
*.md
.vscode
.idea
Dockerfile
docker-compose.yml
Building and Running
Build an Image
# Build from current directory
docker build -t my-app:1.0 .
# Build with a specific Dockerfile
docker build -t my-app:1.0 -f Dockerfile.production .
# Build with build arguments
docker build --build-arg VERSION=2.0 -t my-app:2.0 .
Run a Container
# Run in foreground
docker run my-app:1.0
# Run in background (detached)
docker run -d my-app:1.0
# Run with port mapping (host:container)
docker run -d -p 8080:3000 my-app:1.0
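# Run with environment variables
# (inside a container, "localhost" is the container itself, so point DATABASE_URL at a host the container can reach)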
docker run -d -p 3000:3000 \
-e DATABASE_URL=postgres://localhost/mydb \
-e NODE_ENV=production \
my-app:1.0
# Run with a volume mount (for development)
docker run -d -p 3000:3000 \
-v $(pwd)/src:/app/src \
my-app:1.0
# Run with a name
docker run -d --name api-server -p 3000:3000 my-app:1.0
# Run interactively (useful for debugging)
docker run -it my-app:1.0 /bin/sh
Container Lifecycle
# Stop a running container
docker stop api-server
# Start a stopped container
docker start api-server
# Restart a container
docker restart api-server
# Remove a stopped container
docker rm api-server
# Force remove a running container
docker rm -f api-server
# View container logs
docker logs api-server
# Follow logs in real time
docker logs -f api-server
# Execute a command in a running container
docker exec -it api-server /bin/sh
# Inspect container details
docker inspect api-server
Layer Caching
Docker caches each layer. If a layer hasn't changed, Docker reuses the cached version. This makes builds fast -- but order matters.
Bad: Cache Invalidated on Every Code Change
FROM node:20-alpine
WORKDIR /app
# ANY file change invalidates this layer
COPY . .
# Reinstalls ALL deps every time
RUN npm ci
CMD ["node", "server.js"]
Good: Dependencies Cached Separately
FROM node:20-alpine
WORKDIR /app
# Only changes when deps change
COPY package.json package-lock.json ./
# Cached unless package files change
RUN npm ci
# Code changes only invalidate this layer
COPY . .
CMD ["node", "server.js"]
Layer caching flow:
Instruction | First Build | Code Change | Dep Change
---------------------|-------------|-------------|------------
FROM node:20-alpine | Build | CACHED | CACHED
COPY package*.json | Build | CACHED | Build
RUN npm ci | Build | CACHED | Build
COPY . . | Build | Build | Build
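The table can be reproduced with a toy model of the cache. The sketch below is not how Docker is implemented; it only assumes the general idea that a step is reusable when its parent step, its instruction text, and the content of the files it reads are all unchanged. The names step_key and compute_keys are made up for this illustration.

```shell
# Toy build-cache model: each step's key depends on its parent's key,
# its instruction text, and the content of the files it reads.
workdir=$(mktemp -d)
printf '%s\n' '{"name":"demo","version":"1.0.0"}' > "$workdir/package.json"
printf '%s\n' 'console.log("v1")' > "$workdir/server.js"

step_key() {
  parent_key=$1
  instruction=$2
  shift 2
  {
    printf '%s\n%s\n' "$parent_key" "$instruction"
    if [ "$#" -gt 0 ]; then cat "$@"; fi
  } | cksum
}

compute_keys() {
  copy_deps=$(step_key "base" "COPY package.json ./" "$workdir/package.json")
  run_install=$(step_key "$copy_deps" "RUN npm ci")
  copy_code=$(step_key "$run_install" "COPY . ." "$workdir/package.json" "$workdir/server.js")
}

# First build: remember the keys.
compute_keys
old_install=$run_install
old_code=$copy_code

# Change application code only, then "rebuild".
printf '%s\n' 'console.log("v2")' > "$workdir/server.js"
compute_keys

if [ "$run_install" = "$old_install" ]; then install_status=CACHED; else install_status=Build; fi
if [ "$copy_code" = "$old_code" ]; then code_status=CACHED; else code_status=Build; fi
echo "RUN npm ci: $install_status"
echo "COPY . .:   $code_status"
```

Because each key includes its parent's key, editing package.json in this model would change every key after it, which matches the "Dep Change" column.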
Caching Tips
- Put rarely-changing instructions first -- base image, system deps
- Copy dependency files before source code -- package.json before COPY . .
- Combine RUN commands to reduce layers
- Use specific base image tags -- node:20-alpine not node:latest
# Combine RUN commands to reduce layers
RUN apt-get update && \
apt-get install -y --no-install-recommends \
curl \
ca-certificates && \
rm -rf /var/lib/apt/lists/*
Docker Compose
Docker Compose lets you define and run multi-container applications with a single YAML file. Perfect for local development environments.
Basic docker-compose.yml
services:
  app:
    build: .
    ports:
      - "3000:3000"
    environment:
      - DATABASE_URL=postgres://user:pass@db:5432/myapp
      - REDIS_URL=redis://cache:6379
    depends_on:
      - db
      - cache
    volumes:
      - ./src:/app/src # Hot reload in development
  db:
    image: postgres:16-alpine
    environment:
      POSTGRES_USER: user
      POSTGRES_PASSWORD: pass
      POSTGRES_DB: myapp
    ports:
      - "5432:5432"
    volumes:
      - postgres_data:/var/lib/postgresql/data
  cache:
    image: redis:7-alpine
    ports:
      - "6379:6379"
volumes:
  postgres_data:
Compose Commands
# Start all services
docker compose up
# Start in background
docker compose up -d
# Build and start
docker compose up --build
# Stop all services
docker compose down
# Stop and remove volumes (reset data)
docker compose down -v
# View logs for all services
docker compose logs
# View logs for a specific service
docker compose logs app
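# Scale a service (the service must not publish a fixed host port, or the replicas will conflict)
docker compose up -d --scale app=3
# Run a command inside the running app container
docker compose exec app npm run migrate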
Full-Stack Development Example
services:
  frontend:
    build:
      context: ./frontend
      dockerfile: Dockerfile.dev
    ports:
      - "3000:3000"
    volumes:
      - ./frontend/src:/app/src
    environment:
      - NEXT_PUBLIC_API_URL=http://localhost:4000
  api:
    build:
      context: ./api
      dockerfile: Dockerfile.dev
    ports:
      - "4000:4000"
    volumes:
      - ./api/src:/app/src
    environment:
      - DATABASE_URL=postgres://dev:devpass@db:5432/appdb
      - JWT_SECRET=local-dev-secret
    depends_on:
      db:
        condition: service_healthy
  db:
    image: postgres:16-alpine
    environment:
      POSTGRES_USER: dev
      POSTGRES_PASSWORD: devpass
      POSTGRES_DB: appdb
    ports:
      - "5432:5432"
    volumes:
      - pgdata:/var/lib/postgresql/data
      - ./db/init.sql:/docker-entrypoint-initdb.d/init.sql
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U dev"]
      interval: 5s
      timeout: 5s
      retries: 5
  adminer:
    image: adminer
    ports:
      - "8080:8080"
    depends_on:
      - db
volumes:
  pgdata:
Docker Networking
Containers need to communicate with each other and the outside world. Docker provides several network drivers.
Network Types
+------------------------------------------------------------+
|                        Docker Host                         |
|                                                            |
|  bridge (default)         host            none             |
|  +---------------+    +----------+    +----------+         |
|  | container A   |    |container |    |container |         |
|  | 172.17.0.2    |    |shares    |    |no network|         |
|  +-------+-------+    |host's    |    +----------+         |
|          |            |network   |                         |
|  +-------+-------+    +----------+                         |
|  | container B   |                                         |
|  | 172.17.0.3    |                                         |
|  +---------------+                                         |
+------------------------------------------------------------+
# List networks
docker network ls
# Create a custom network
docker network create my-network
# Run container on a specific network
docker run -d --network my-network --name api my-app
# Connect a running container to a network
docker network connect my-network existing-container
# Inspect network
docker network inspect my-network
Service Discovery
In Docker Compose, services can reach each other by service name. Docker's internal DNS resolves the names.
services:
  api:
    environment:
      # "db" resolves to the database container's IP
      - DATABASE_URL=postgres://user:pass@db:5432/myapp
      # "cache" resolves to the Redis container's IP
      - REDIS_URL=redis://cache:6379
  db:
    image: postgres:16-alpine
  cache:
    image: redis:7-alpine
No need for IP addresses. Docker handles it.
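To make the connection concrete, the host part of each connection URL is exactly the Compose service name that Docker's DNS resolves. The sketch below pulls that host out with plain shell parameter expansion; url_host is a hypothetical helper written for this example and only handles simple URLs of the shape used above.

```shell
# Extract the host from a URL like scheme://user:pass@host:port/path.
# Hypothetical helper for illustration; no IPv6 or query-string handling.
url_host() {
  rest=${1#*://}      # drop the scheme
  rest=${rest#*@}     # drop credentials, if present
  rest=${rest%%/*}    # drop the path
  echo "${rest%%:*}"  # drop the port
}

url_host "postgres://user:pass@db:5432/myapp"   # prints: db
url_host "redis://cache:6379"                   # prints: cache

# With the stack running, the same names can be looked up from inside a service
# (assumes the api image ships getent, as Alpine-based images normally do):
#   docker compose exec api getent hosts db
```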
Docker Volumes
Volumes persist data beyond the container lifecycle. Without volumes, data is lost when a container is removed.
# Create a named volume
docker volume create app-data
# Mount a volume when running a container
docker run -d -v app-data:/app/data my-app
# Mount a host directory (bind mount)
docker run -d -v $(pwd)/data:/app/data my-app
# List volumes
docker volume ls
# Inspect a volume
docker volume inspect app-data
# Remove unused volumes
docker volume prune
Volume Types
| Type | Syntax | Use Case |
|---|---|---|
| Named Volume | -v mydata:/app/data | Database storage, persistent data |
| Bind Mount | -v ./src:/app/src | Development hot reload |
| tmpfs Mount | --tmpfs /app/temp | Temporary data (RAM only) |
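The persistence claim is easy to verify by writing to a named volume from one short-lived container and reading it back from another. This is a sketch assuming a running Docker daemon and access to the public alpine image; the volume name demo-data and the function name volume_roundtrip are placeholders.

```shell
# Write a file through one container, read it back through another.
volume_roundtrip() {
  docker volume create demo-data >/dev/null || return 1
  # The first container writes into the volume and is removed on exit (--rm).
  docker run --rm -v demo-data:/data alpine sh -c 'echo hello > /data/greeting' || return 1
  # A brand-new container still sees the file, because it lives in the volume.
  docker run --rm -v demo-data:/data alpine cat /data/greeting || return 1
  # Clean up the demo volume.
  docker volume rm demo-data >/dev/null
}

# Usage (needs a running Docker daemon):
#   volume_roundtrip
```

Running the same two containers without the `-v` flag would fail at the second step, since the file would have disappeared with the first container's writable layer.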
Docker Best Practices
1. Use Specific Base Image Tags
# Bad: unpredictable, changes over time
FROM node:latest
# Good: pinned version, deterministic
FROM node:20.11-alpine
2. Run as Non-Root User
FROM node:20-alpine
WORKDIR /app
COPY . .
RUN npm ci --omit=dev
USER node
CMD ["node", "server.js"]
3. Use HEALTHCHECK
# Note: curl is not included in Alpine-based images; install it or use wget there
HEALTHCHECK \
CMD curl -f http://localhost:3000/health || exit 1
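Once an image defines a health check, scripts can wait on the reported status instead of sleeping for a fixed time. The function below is a minimal sketch that polls `docker inspect`; wait_healthy, its defaults, and the container name in the usage line are assumptions for illustration.

```shell
# Poll a container's health status until it is healthy, unhealthy, or we give up.
# Usage: wait_healthy <container> [max_attempts] [delay_seconds]
wait_healthy() {
  container=$1
  max_attempts=${2:-30}
  delay=${3:-2}
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    status=$(docker inspect --format '{{.State.Health.Status}}' "$container" 2>/dev/null)
    case $status in
      healthy)   echo "healthy after $attempt check(s)"; return 0 ;;
      unhealthy) echo "container reported unhealthy" >&2; return 1 ;;
    esac
    # Any other value (for example "starting") means: keep waiting.
    attempt=$((attempt + 1))
    sleep "$delay"
  done
  echo "timed out waiting for health status" >&2
  return 1
}

# Usage (needs Docker and a container whose image defines a HEALTHCHECK):
#   wait_healthy api-server && echo "ready"
```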
4. Minimize Image Size
# Use alpine base images (roughly ~130MB for node:20-alpine vs ~900MB for node:20)
FROM node:20-alpine
# Remove unnecessary files
RUN npm ci --omit=dev && \
npm cache clean --force
# Use multi-stage builds (see above)
5. One Process Per Container
Each container should run one main process and do one job. Don't put your API, database, and Redis in the same container.
# Right: separate services
services:
  api:
    build: ./api
  db:
    image: postgres:16-alpine
  cache:
    image: redis:7-alpine
6. Use Environment Variables for Configuration
# Define defaults in Dockerfile
ENV NODE_ENV=production
ENV PORT=3000
# Override at runtime
# docker run -e PORT=8080 my-app
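The precedence rule is simple: a value passed at run time wins, otherwise the default baked into the image applies. The sketch below models that in plain shell; effective_setting is a made-up helper, and it simplifies one detail, since Docker would pass an explicitly empty value (`-e PORT=`) through as an empty string rather than falling back to the default.

```shell
# Model of ENV default vs. runtime override (illustrative only).
# Usage: effective_setting <image-default> [runtime-override]
effective_setting() {
  image_default=$1
  runtime_override=${2-}
  if [ -n "$runtime_override" ]; then
    echo "$runtime_override"
  else
    echo "$image_default"
  fi
}

effective_setting 3000        # like: docker run my-app
effective_setting 3000 8080   # like: docker run -e PORT=8080 my-app
```

The first call prints `3000` and the second prints `8080`.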
7. Add Metadata with Labels
# Placeholder address; replace with your team's contact
LABEL maintainer="team@example.com"
LABEL version="1.2.0"
LABEL description="API server for user management"
Cleanup Commands
Docker can consume significant disk space. Regular cleanup helps.
# Remove all stopped containers
docker container prune
# Remove unused images
docker image prune
# Remove unused volumes
docker volume prune
# Remove everything unused (containers, images, networks, volumes)
docker system prune -a --volumes
# Check disk usage
docker system df
Production Dockerfile Template
Putting it all together -- a production-ready Dockerfile for a Node.js application:
# syntax=docker/dockerfile:1
# Stage 1: Dependencies
FROM node:20-alpine AS deps
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci --omit=dev && npm cache clean --force
# Stage 2: Build
FROM node:20-alpine AS builder
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci
COPY . .
RUN npm run build
# Stage 3: Production
FROM node:20-alpine AS production
# Add labels
# Placeholder address; replace with your team's contact
LABEL maintainer="team@example.com"
# Security: run as non-root
RUN addgroup -g 1001 -S appgroup && \
adduser -S appuser -u 1001 -G appgroup
WORKDIR /app
# Copy production dependencies from deps stage
COPY --from=deps /app/node_modules ./node_modules
# Copy built application from builder stage
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/package.json ./
# Set environment
ENV NODE_ENV=production
ENV PORT=3000
# Health check
HEALTHCHECK \
CMD wget --no-verbose --tries=1 --spider http://localhost:3000/health || exit 1
# Switch to non-root user
USER appuser
EXPOSE 3000
CMD ["node", "dist/server.js"]
Debugging Docker Issues
Common Problems and Solutions
# Container exits immediately -- check logs
docker logs <container-id>
# Port already in use
docker ps # find what's using the port
docker stop <container-using-port>
# Build cache issues -- force rebuild
docker build --no-cache -t my-app .
# Permission denied in container
# Check USER instruction, file ownership (--chown)
# Container can't connect to another container
# Ensure they're on the same network (lookup by container name only works on user-defined networks, not the default bridge)
docker network inspect bridge
# Out of disk space
docker system prune -a
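For the "port already in use" case, `docker ps` can be narrowed to the containers that publish a given host port instead of scanning the full list. This is a small sketch assuming Docker is installed; containers_on_port is a hypothetical helper name. If it prints nothing, the port is held by a non-Docker process and a tool such as lsof or ss is the next step.

```shell
# List containers that publish the given host port.
containers_on_port() {
  docker ps --filter "publish=$1" --format '{{.Names}}: {{.Ports}}'
}

# Usage (needs a running Docker daemon):
#   containers_on_port 8080
```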
Interactive Debugging
# Shell into a running container
docker exec -it my-container /bin/sh
# Run a one-off container for debugging
docker run -it --rm node:20-alpine /bin/sh
# Override the entrypoint for debugging
docker run -it --entrypoint /bin/sh my-app
Key Takeaways
- Docker packages apps with their dependencies into portable containers
- Containers share the host kernel -- lighter and faster than VMs
- Dockerfiles define how to build images; each instruction is a cacheable build step, and RUN, COPY, and ADD add filesystem layers
- Order matters: put rarely-changing instructions first for better caching
- Multi-stage builds keep production images small
- Docker Compose orchestrates multi-container development environments
- Services in Compose can reach each other by name (DNS-based service discovery)
- Use volumes for persistent data, bind mounts for development
- Production images: specific tags, non-root user, health checks, minimal size
- One process per container is the fundamental design principle