Marcell CD

Docker: A Developer's Guide

What is Docker, and why should you, as a software developer, understand how it works? Have you ever run a Docker command and wondered, “Why does this even work?” — or “Why doesn’t it?” You don’t need to be an expert, but knowing the fundamentals will make your life easier: it will help you write efficient Docker images and resolve bugs and performance issues.

What is Docker?

One thing we need to make clear is that Docker is not a virtual machine (VM). A VM emulates an entire computer: its own kernel, its own operating system (OS), and its own hardware abstraction. That’s heavy to run and execute. Docker takes a different approach compared to VMs.

Docker uses features built into the Linux kernel — specifically namespaces and cgroups — to isolate processes. Your container is just a process running on your host machine, but it thinks it’s alone in the world. It has its own filesystem, its own network, its own process tree.

|  | Virtual Machine | Docker Container |
| --- | --- | --- |
| Boots | A full OS | A process |
| Startup time | Minutes | Milliseconds |
| Size | Gigabytes | Megabytes |
| Isolation | Hardware-level | Kernel-level |

This is why Docker is so fast and lightweight. You’re not booting a computer — you’re starting a process.
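You can see this for yourself — a quick sketch, assuming Docker is installed and the daemon is running (the image and container name are illustrative): the container’s main process shows up like any other process on the host.

```shell
# Start a container that just sleeps
docker run -d --name just-a-process alpine sleep 300

# The "container" is visible from the host as an ordinary process
docker top just-a-process

# Clean up
docker rm -f just-a-process
```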


Images vs Containers

Let’s clear up the main difference between images and containers, since the two are often confused.

You can think of it like this:

Image → like a class in OOP
Container → like an instance of that class

You can run many containers from the same image, and they don’t interfere with each other. When a container writes data, those changes live only in that container — the original image is never touched.

# Pull an image from Docker Hub
docker pull nginx

# Run a container from that image
docker run -d -p 8080:80 nginx

# Run a second, completely independent container from the same image
docker run -d -p 8081:80 nginx

Both containers share the same image underneath, but live completely separate lives.


🧅 Layers — Why Docker Is Efficient

Images aren’t monolithic blobs. They’re made of layers, stacked on top of each other. Each layer represents a set of changes to the image’s filesystem.

This is a critical concept because:

  1. Layers are cached. If a layer hasn’t changed, Docker reuses it.
  2. Layers are shared between images. If two images use the same base, they don’t duplicate that data.

Here’s what that looks like in practice:

Layer 4: Copy your app code        ← changes often
Layer 3: Install npm dependencies  ← changes sometimes
Layer 2: Install Node.js           ← rarely changes
Layer 1: Ubuntu base image         ← almost never changes

Docker executes a Dockerfile from top to bottom, and as soon as one layer changes, every layer after it must be rebuilt. This is why instruction order in your Dockerfile matters enormously for build speed.

For example, compare the following two Dockerfiles.

🚨 Slow Dockerfile (wrong order):

FROM node:20
COPY . .                  # Copies everything first
RUN npm install           # Runs install AFTER — cache busts every time code changes!
CMD ["node", "index.js"]

✅ Fast Dockerfile (correct order):

FROM node:20
COPY package*.json ./     # Copy only what npm needs first
RUN npm install           # This layer is now cached as long as package.json doesn't change
COPY . .                  # Copy your app code last
CMD ["node", "index.js"]

In the optimized version, npm install is only re-run when your dependencies actually change — not every time you edit a .js file.
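You can watch the cache in action — a sketch, assuming Docker (with BuildKit) is installed and the optimized Dockerfile above is in the current directory:

```shell
# First build — every layer executes
docker build -t myapp .

# Touch only application code, not package.json
echo "// tweak" >> index.js

# Rebuild — BuildKit reports the dependency layers as cached
# and only re-runs the final COPY . .
docker build -t myapp .
```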


📄 The Dockerfile — Thinking in Build Steps

A Dockerfile is a recipe for building an image. Every instruction creates a new layer. Here’s what the most important instructions actually do:

| Instruction | What it does |
| --- | --- |
| FROM | Sets the base image to build on top of |
| RUN | Executes a shell command during the build |
| COPY | Copies files from your machine into the image |
| ENV | Sets environment variables available at build and runtime |
| EXPOSE | Documents which port the container listens on (informational) |
| CMD | The default command to run when a container starts (overridable) |
| ENTRYPOINT | The fixed command that always runs (CMD becomes its arguments) |

Understanding Docker Layers

Each instruction in a Dockerfile creates a new layer in the image. Think of layers like transparent sheets stacked on top of each other — each one adds or modifies something from the previous layers. This layered architecture is what makes Docker images efficient:

Real-World Example: Node.js Application

Here’s a production-ready Dockerfile for a Node.js app with explanations:

# 1. Start from official Node image
FROM node:20-alpine

# 2. Set working directory inside the container
WORKDIR /app

# 3. Set an environment variable
ENV NODE_ENV=production

# 4. Copy dependency files and install — cached layer
COPY package*.json ./
RUN npm ci --only=production

# 5. Copy the rest of your source code
COPY . .

# 6. Document the port
EXPOSE 3000

# 7. Start the app
CMD ["node", "server.js"]
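To build and run this image — a sketch, assuming Docker is installed (the tag name is illustrative; the port follows the EXPOSE above):

```shell
# Build the image from the Dockerfile in the current directory
docker build -t my-node-app .

# Run it, mapping host port 3000 to the container's port 3000
docker run -d -p 3000:3000 --name my-node-app my-node-app
```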

Why this order? Dependency files change far less often than application code, so copying package*.json and running npm ci before COPY . . keeps the expensive install layer cached across most rebuilds.

Additional Important Instructions

| Instruction | What it does | When to use |
| --- | --- | --- |
| ARG | Build-time variables (not available at runtime) | API keys for private registries, version numbers |
| USER | Sets the user/UID to run as | Security: avoid running as root |
| VOLUME | Creates a mount point for external volumes | Database files, uploaded content |
| HEALTHCHECK | Defines how Docker checks if container is healthy | Production monitoring |
| LABEL | Adds metadata to the image | Version info, maintainer, description |
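Here’s a sketch combining several of these instructions in one Dockerfile — the health endpoint, label values, and file names are illustrative, not from any particular project:

```dockerfile
# Base image with a built-in unprivileged 'node' user
FROM node:20-alpine
LABEL maintainer="team@example.com" description="Example API image"
WORKDIR /app

COPY package*.json ./
RUN npm ci --only=production
COPY . .

# Security: don't run as root
USER node

EXPOSE 3000

# Mark the container unhealthy if the (illustrative) /health endpoint fails
HEALTHCHECK --interval=30s --timeout=5s --retries=3 \
  CMD wget -qO- http://localhost:3000/health || exit 1

CMD ["node", "server.js"]
```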

CMD vs ENTRYPOINT — The Key Difference

Both define what runs when a container starts, but they behave differently:

# CMD — easily overridden at runtime
CMD ["node", "server.js"]
# docker run myimage node other-script.js  ← works fine, overrides CMD

# ENTRYPOINT — the container IS this command
ENTRYPOINT ["node"]
CMD ["server.js"]           # default argument to ENTRYPOINT
# docker run myimage other-script.js  ← runs: node other-script.js

When to use which? Use CMD alone when users should be able to swap out the whole command easily. Use ENTRYPOINT (with CMD supplying default arguments) when the container should always wrap the same executable, such as a CLI tool.
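A common ENTRYPOINT pattern is wrapping a CLI tool. A sketch (curl here is just an example tool):

```dockerfile
FROM alpine:3.19
RUN apk add --no-cache curl

# The container always runs curl; CMD supplies a default argument
ENTRYPOINT ["curl", "-s"]
CMD ["https://example.com"]
```

With this image, `docker run myimage https://other.example` runs `curl -s https://other.example` — the argument replaces CMD, but the ENTRYPOINT stays fixed.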

Best Practices for Writing Dockerfiles

  1. Use specific base image tags

    # ❌ Bad - might change unexpectedly
    FROM node:latest
    
    # ✅ Good - predictable
    FROM node:20.12.0-alpine
  2. Combine RUN commands to reduce layers

    # ❌ Creates 3 layers
    RUN apt-get update
    RUN apt-get install -y curl
    RUN apt-get clean
    
    # ✅ Creates 1 layer
    RUN apt-get update && \
        apt-get install -y curl && \
        apt-get clean && \
        rm -rf /var/lib/apt/lists/*
  3. Use .dockerignore to exclude files

    # .dockerignore
    node_modules
    .git
    .env
    *.log
  4. Multi-stage builds for smaller images

    # Build stage
    FROM node:20-alpine AS builder
    WORKDIR /app
    COPY package*.json ./
    RUN npm ci
    COPY . .
    RUN npm run build
    
    # Production stage
    FROM node:20-alpine
    WORKDIR /app
    COPY package*.json ./
    RUN npm ci --only=production
    COPY --from=builder /app/dist ./dist
    CMD ["node", "dist/server.js"]

Common Patterns by Language/Framework

Python/Django:

FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
EXPOSE 8000
CMD ["gunicorn", "myapp.wsgi:application", "--bind", "0.0.0.0:8000"]

Go:

FROM golang:1.21-alpine AS builder
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN go build -o main .

FROM alpine:latest
RUN apk --no-cache add ca-certificates
WORKDIR /root/
COPY --from=builder /app/main .
CMD ["./main"]

Debugging Dockerfile Builds

When things go wrong, these techniques help:

# Build with detailed output
docker build --progress=plain --no-cache -t myapp .

# Debug a specific build stage
docker build --target builder -t myapp-debug .

# Inspect intermediate layers
docker history myapp

# Run commands in a failed build container
docker run -it <image-id-from-failed-step> /bin/sh

Remember: A well-written Dockerfile is the foundation of a reliable containerized application. Take time to understand each instruction and optimize for both build speed and final image size.


💾 Volumes — Solving the Persistence Problem

Containers are ephemeral. When a container dies, any data it wrote to its filesystem is gone. Forever.

This is a feature, not a bug — it keeps containers predictable and stateless. But obviously, for databases and file uploads, you need data to survive restarts. That’s what volumes are for.

A volume is a directory that lives outside the container on the host filesystem, but is mounted into the container so it can read and write there.

# Create a named volume
docker volume create mydata

# Mount it into a container at /app/data
docker run -v mydata:/app/data myimage

# Use a bind mount (maps a specific host folder)
docker run -v /home/user/myproject:/app myimage

Now even if the container is destroyed and recreated, the data in the volume persists.
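A quick way to convince yourself — assuming Docker is installed and the mydata volume from above exists (the alpine image and file name are illustrative):

```shell
# Write a file into the volume from a throwaway container
docker run --rm -v mydata:/data alpine sh -c 'echo hello > /data/greeting'

# A brand-new container sees the same file — the data outlived its writer
docker run --rm -v mydata:/data alpine cat /data/greeting
```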

Host Machine                   Container
───────────────────────────────────────────
/var/lib/docker/volumes/mydata ←→ /app/data

   Data lives here, safe and sound

Rule of thumb: Anything stateful (databases, uploaded files, logs) should live in a volume. Your app code and dependencies should be baked into the image.


🌐 Networking — How Containers Talk to Each Other

By default, Docker creates a private internal network. Containers on the same network can talk to each other by name — Docker has a built-in DNS resolver that maps container names to their IP addresses.

# Create a custom network
docker network create myapp-network

# Start a database on that network
docker run -d \
  --name postgres-db \
  --network myapp-network \
  -e POSTGRES_PASSWORD=secret \
  postgres:16

# Start your app on the same network
docker run -d \
  --name my-api \
  --network myapp-network \
  -p 3000:3000 \
  myimage

Now inside my-api, you can connect to the database using the hostname postgres-db:

// In your Node.js app — no IP addresses needed!
const connectionString = "postgresql://postgres:secret@postgres-db:5432/mydb"

Docker resolves postgres-db to the correct container IP automatically. This is incredibly powerful — your app doesn’t need to know or care about internal IPs.
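You can verify the name resolution directly — a sketch assuming the myapp-network and postgres-db containers from above are running:

```shell
# Resolve the container name from another container on the same network
docker run --rm --network myapp-network alpine nslookup postgres-db

# Or check that the database is reachable by name
docker run --rm --network myapp-network postgres:16 pg_isready -h postgres-db
```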


🎼 Docker Compose — Orchestrating the Whole Thing

Running every container manually with docker run gets unwieldy fast. Docker Compose lets you define your entire multi-container application in a single docker-compose.yml file.

Docker Compose is more commonly used locally and for development purposes, but it is less common in production environments. For production-grade orchestration, teams typically migrate to Kubernetes or Docker Swarm, though Compose can work for simpler production deployments.

Complete Example: Node.js API with PostgreSQL and Redis

Here’s a production-ready example with health checks and proper configurations:

# docker-compose.yml
services:
  api:
    build: . # Build from the Dockerfile in this directory
    ports:
      - "3000:3000"
    environment:
      - NODE_ENV=development
      - DATABASE_URL=postgresql://postgres:secret@db:5432/mydb
      - REDIS_URL=redis://cache:6379
    depends_on:
      db:
        condition: service_healthy # Wait for DB to be ready
      cache:
        condition: service_healthy
    volumes:
      - .:/app # Bind mount for live code reloading
      - /app/node_modules # Keep container's node_modules
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:3000/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s

  db:
    image: postgres:16
    environment:
      POSTGRES_PASSWORD: secret
      POSTGRES_DB: mydb
    volumes:
      - postgres-data:/var/lib/postgresql/data # Persist database data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 10s
      timeout: 5s
      retries: 5

  cache:
    image: redis:7-alpine
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 10s
      timeout: 5s
      retries: 3

volumes:
  postgres-data: # Named volume, managed by Docker

networks:
  default:
    driver: bridge

Essential Docker Compose Commands

# Start everything in the background
docker compose up -d

# See what's running
docker compose ps

# Stream logs from all services
docker compose logs -f

# Stream logs from specific service
docker compose logs -f api

# Execute commands in running containers
docker compose exec api npm test

# Rebuild images before starting
docker compose up -d --build

# Scale a service to multiple instances
docker compose up -d --scale api=3

# Tear everything down (volumes are preserved)
docker compose down

# Tear down AND delete volumes (fresh start)
docker compose down -v

All services are automatically placed on the same network, so api can reach db and cache by name — exactly as we covered in the networking section.

Advanced Docker Compose Features

1. Health Checks and Dependencies

Health checks ensure services are actually ready, not just started:

services:
  api:
    depends_on:
      db:
        condition: service_healthy # Waits for health check to pass
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:3000/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s # Grace period for startup

2. Multiple Compose Files for Different Environments

Use override files to customize for different environments:

# docker-compose.yml (base configuration)
services:
  api:
    image: myapp:latest
    environment:
      - LOG_LEVEL=info

# docker-compose.override.yml (auto-loaded for development)
services:
  api:
    build: .
    volumes:
      - .:/app
    environment:
      - LOG_LEVEL=debug

# docker-compose.prod.yml (production overrides)
services:
  api:
    restart: always
    environment:
      - LOG_LEVEL=warning
    deploy:
      replicas: 3

# Development (uses base + override automatically)
docker compose up

# Production (explicitly specify files)
docker compose -f docker-compose.yml -f docker-compose.prod.yml up -d

3. Profiles for Conditional Services

Run different service combinations based on profiles:

services:
  api:
    image: myapp:latest

  db:
    image: postgres:16

  debug-tools:
    image: busybox
    profiles: ["debug"] # Only starts when debug profile is active

  monitoring:
    image: prometheus
    profiles: ["monitoring", "production"]

# Start only core services
docker compose up

# Include debug tools
docker compose --profile debug up

# Include monitoring stack
docker compose --profile monitoring up

4. Resource Limits and Reservations

Control resource usage for production deployments:

services:
  api:
    image: myapp:latest
    deploy:
      resources:
        limits:
          cpus: "0.5"
          memory: 512M
        reservations:
          cpus: "0.25"
          memory: 256M

Docker Compose vs Other Orchestrators

| Feature | Docker Compose | Kubernetes | Docker Swarm |
| --- | --- | --- | --- |
| Complexity | Simple YAML, easy to learn | Steep learning curve | Moderate complexity |
| Use Case | Development, small production | Enterprise production | Simple production clusters |
| Scaling | Single host only | Multi-host, auto-scaling | Multi-host, manual scaling |
| Self-healing | Basic restart policies | Advanced with pod management | Basic service recovery |
| Load Balancing | Manual with nginx/HAProxy | Built-in service mesh | Built-in simple LB |
| Setup Time | Minutes | Hours to days | 30 minutes |

Best Practices for Docker Compose

  1. Don't rely on the obsolete version field

    # The top-level "version:" key (e.g. version: "3.8") is obsolete in the
    # Compose Specification — modern docker compose ignores it, so omit it
  2. Use environment files for sensitive data

    services:
      api:
        env_file:
          - .env # Git-ignored file with secrets
          - .env.local # Local overrides
  3. Leverage build arguments for flexible images

    services:
      api:
        build:
          context: .
          args:
            NODE_VERSION: 20
            APP_ENV: ${APP_ENV:-development}
  4. Use explicit container names for easier debugging

    services:
      api:
        container_name: myapp_api_1
  5. Define restart policies for production

    services:
      api:
        restart: unless-stopped # or "always" for critical services

Common Patterns and Examples

Full-Stack Application with Frontend

services:
  frontend:
    build: ./frontend
    ports:
      - "80:80"
    depends_on:
      - api

  api:
    build: ./backend
    environment:
      - DATABASE_URL=postgresql://user:pass@db:5432/app
    depends_on:
      db:
        condition: service_healthy

  db:
    image: postgres:16
    volumes:
      - db-data:/var/lib/postgresql/data

volumes:
  db-data:

Microservices with Service Discovery

services:
  gateway:
    image: nginx
    ports:
      - "80:80"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf

  auth-service:
    build: ./services/auth
    expose:
      - "3001" # Internal port only

  user-service:
    build: ./services/users
    expose:
      - "3002"

  order-service:
    build: ./services/orders
    expose:
      - "3003"

Debugging Docker Compose Applications

# Validate compose file syntax
docker compose config

# See real-time events
docker compose events

# Run one-off commands
docker compose run --rm api npm test

# Start specific services only
docker compose up db cache

# Remove orphan containers
docker compose up --remove-orphans

Remember: Docker Compose excels at defining relationships between containers and managing them as a unit. While it’s primarily a development tool, it can handle simple production deployments. For complex production needs requiring high availability, auto-scaling, or multi-host deployments, consider graduating to Kubernetes or Docker Swarm.


🧠 Putting It All Together

Here’s the mental model to keep in your head:

Dockerfile
    ↓  docker build
  Image  ──────────────────────── (Layers, cached, shared)
    ↓  docker run / docker compose up
Container  ──────────────────────  (Isolated process)

    ├── Network  ────────────────  (Talk to other containers by name)
    └── Volume   ────────────────  (Persist data outside the container)

Once these ideas click, Docker stops feeling like magic (or black magic) and becomes a genuinely elegant tool. The commands stop being something you copy from Stack Overflow and start being something you can reason about.