Docker Tutorial for Beginners: Core Concepts, Dockerfile, and Compose — All in One Place

The first time I ran docker run hello-world I honestly had no idea what just happened. There was a wall of green text, something downloaded, something ran, and I just kind of nodded at my screen like yeah, that makes sense.

It didn’t make sense. I had no idea what an image was versus a container. I didn’t know what a layer was. I just knew Docker was “important” and I needed to learn it, so I fumbled through it for a few weeks until things started clicking.

This post is what I wish existed back then. I’m going to cover the mental models first — because if you try to learn Docker by memorizing commands without the mental models, you’ll keep forgetting everything and re-Googling the same stuff. Then we’ll do a real Dockerfile, a cheatsheet worth bookmarking, and Docker Compose for when you need multiple services talking to each other.

The Mental Models (Do Not Skip This Part)

Seriously. A lot of beginner Docker tutorials go straight to commands and that’s why Docker feels slippery to learn. Spend five minutes here and the rest of this becomes obvious.

Images vs. Containers

An image is a blueprint. A container is what you get when you actually run that blueprint.

Think of it like a class vs. an object in any OOP language. The class is the definition — the object is the live thing in memory. You can spin up ten containers from the same image and they’re completely separate from each other. Stop one, the others keep going. Delete a container, the image is still sitting there. It’s a clean separation that trips people up early because the terminology is a bit overloaded.

Layers

Every instruction in a Dockerfile is a build step, and the ones that change the filesystem (RUN, COPY, ADD) each add a layer to the image. Docker caches each step individually, which ends up being the most useful thing to understand for day-to-day work. If neither the instruction nor the files it depends on have changed, Docker reuses the cached result instead of rebuilding it. Change something in the middle of your Dockerfile, and Docker reruns everything from that point down.

This sounds like an implementation detail but it directly affects how you write Dockerfiles. More on this when we get to the actual example.

Registries

Docker Hub is the default registry — it’s basically GitHub but for images. When you run docker pull postgres, that’s pulling from Docker Hub. You can set up a private registry (AWS ECR, GitHub Container Registry, etc.) which is what most teams do in production.

Volumes

Containers are ephemeral. When a container exits and you remove it, whatever was written inside it is gone. That’s by design. Volumes let you persist data outside the container’s lifecycle — so your database isn’t wiped every time you restart. This matters a lot once you’re running anything stateful.

Networks

By default, containers can’t find each other by name. On Docker’s default bridge network they can reach each other by IP address, but there’s no name resolution. User-defined networks fix that with built-in DNS. The useful part here is that Docker Compose creates one of these networks automatically, and containers on it can reach each other using the service name as a hostname. So instead of hardcoding localhost:5432, your Node app just talks to postgres:5432 and Docker handles the routing. That one feature alone makes Compose worth using.

Container Lifecycle

Containers move through a simple set of states:

created → running → stopped → removed

What you want to do                        Command
Create a container (but don’t start it)    docker create
Start a created container                  docker start
Stop a running container                   docker stop
Remove a stopped container                 docker rm
Create and start in one go                 docker run

docker run is the one you’ll actually use constantly. It’s just create + start combined, with a bunch of useful flags. The form I use most:

bash

docker run -d --name my-app -p 3000:3000 my-node-image

-d runs it detached (background), --name gives it a name you can reference later instead of the random hash, and -p 3000:3000 maps port 3000 on the container to port 3000 on your machine. That last part confuses people at first — it’s host:container, not the other way around.
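If the host:container order keeps slipping, it can help to spell it out once. This is only a small illustrative sketch: describe_mapping is a hypothetical helper of mine, not a Docker command, and it only handles the plain two-part HOST:CONTAINER form (not the variant with a bind address in front).

```shell
# Hypothetical helper: split a -p mapping into its two halves.
# The order is HOST:CONTAINER, same as in `docker run -p`.
describe_mapping() {
  mapping="$1"
  host_port="${mapping%%:*}"        # everything before the first colon
  container_port="${mapping##*:}"   # everything after the last colon
  echo "host ${host_port} -> container ${container_port}"
}

describe_mapping "8080:3000"   # prints: host 8080 -> container 3000
```

So with -p 8080:3000 you would browse to localhost:8080 on your machine, while the app inside the container still listens on 3000.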

Writing a Dockerfile That Doesn’t Waste Your Time

Let me show you a Dockerfile for a Node.js app and explain the one trick that makes builds actually fast.

dockerfile

FROM node:20-alpine

WORKDIR /app

COPY package.json package-lock.json ./

RUN npm install

COPY . .

EXPOSE 3000

CMD ["node", "src/index.js"]

The important thing here is the order of those COPY lines.

I copy package.json and package-lock.json first, run npm install, then copy everything else. Most people don’t think about this and just do COPY . . right at the top, which means every single time you change any source file (which is constantly) Docker has to redo the npm install step. That’s painful when you’ve got 200 dependencies.

But because npm install sits in its own layer after the two package files are copied, Docker caches it. As long as neither file changes, that layer never reruns. Your code changes, Docker rebuilds only from COPY . . onward, and you skip the slow install entirely. First time I figured this out I felt like I’d unlocked something.
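To see why the ordering works, here is a rough sketch that imitates the idea with a plain checksum. It is not how Docker computes its cache keys internally (the real thing also looks at file metadata and the preceding steps), and install_layer_key is a hypothetical name I made up. The point is only that the fingerprint covers the files the first COPY brings in and nothing else.

```shell
# Hypothetical helper: fingerprint only the files the first COPY brings in.
install_layer_key() {
  cat "$1/package.json" "$1/package-lock.json" | cksum
}

# Throwaway project directory for the demo.
demo=$(mktemp -d)
echo '{"name":"demo"}' > "$demo/package.json"
echo '{}' > "$demo/package-lock.json"
mkdir -p "$demo/src"
echo 'console.log("v1")' > "$demo/src/index.js"

before=$(install_layer_key "$demo")

# Edit application source only: the fingerprint stays the same.
echo 'console.log("v2")' > "$demo/src/index.js"
after_src=$(install_layer_key "$demo")

# Edit the manifest: the fingerprint changes.
echo '{"name":"demo","version":"2"}' > "$demo/package.json"
after_pkg=$(install_layer_key "$demo")

[ "$before" = "$after_src" ] && echo "source edit: install layer could be reused"
[ "$before" != "$after_pkg" ] && echo "package.json edit: install layer must rerun"

rm -rf "$demo"
```

With COPY . . at the top instead, the equivalent fingerprint would cover every file in the project, so any source edit would invalidate the install step.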

A couple other things in that file worth noting:

node:20-alpine — Alpine Linux is a tiny base image. The full node:20 image is something like 1GB+. Alpine is under 200MB. If you’re pushing images to a registry regularly, this adds up.

EXPOSE 3000 — this is just documentation. It doesn’t actually publish the port. You still need -p when you run the container. A lot of tutorials don’t make this clear and it causes confusion.

Cheatsheet

Bookmark this tab. These are the commands I actually use.

Images

bash

docker images                         # What's on my machine
docker pull nginx                     # Pull from Docker Hub
docker build -t my-app:1.0 .          # Build from current directory
docker rmi my-app:1.0                 # Remove an image
docker image prune                    # Clean up dangling images

Containers

bash

docker ps                             # Running containers
docker ps -a                          # All containers, including stopped
docker run -d -p 3000:3000 my-app     # Run detached, map port
docker run --rm -it ubuntu bash       # Interactive, auto-delete on exit
docker stop <id>                      # Stop gracefully
docker rm <id>                        # Remove stopped container
docker logs <id>                      # See what's happening inside
docker exec -it <id> sh               # Shell in while it's running

Cleanup

bash

docker system prune                   # Remove stopped containers, unused networks, dangling images, unused build cache
docker system prune -a                # Same but also unused images
docker volume prune                   # Remove unused volumes (anonymous only on Docker 23+; add -a for named ones)

The docker exec -it <id> sh one is genuinely useful for debugging. Something’s broken, you don’t know why, just shell in and look around. Saved me a lot of time.

Docker Compose

Single-container Docker is fine for a simple service. But once you have a Node API, a Postgres database, and a frontend all needing to run together and communicate — you don’t want to be writing three separate docker run commands with a bunch of flags and manually creating networks. That’s where Compose comes in.

Here’s a docker-compose.yml for something like a resource-booking system — Node.js backend, Postgres, and a frontend:

yaml

# No top-level "version" key: it is obsolete under the Compose Specification and ignored by docker compose

services:
  postgres:
    image: postgres:15
    environment:
      POSTGRES_DB: booking_db
      POSTGRES_USER: admin
      POSTGRES_PASSWORD: secret
    volumes:
      - postgres_data:/var/lib/postgresql/data
    ports:
      - "5432:5432"

  api:
    build: ./api
    environment:
      DATABASE_URL: postgres://admin:secret@postgres:5432/booking_db
      NODE_ENV: development
    ports:
      - "3000:3000"
    depends_on:
      - postgres
    volumes:
      - ./api:/app
      - /app/node_modules

  frontend:
    build: ./frontend
    ports:
      - "5173:5173"
    depends_on:
      - api

volumes:
  postgres_data:

A few things that are easy to miss here:

The DATABASE_URL uses postgres as the hostname — not localhost, not an IP. That’s the service name, and Compose’s internal DNS resolves it automatically. Took me an embarrassingly long time to internalize this when I first started.

The double volume on api:

  • ./api:/app mounts your local source into the container so live changes reflect without rebuilding
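  • /app/node_modules is a trick: it’s an anonymous volume that keeps the node_modules installed in the image visible. Without it, the ./api bind mount would hide everything under /app, including the image’s node_modules, and the container would see whatever is (or isn’t) in your local folder instead, which breaks in confusing ways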

depends_on tells Compose to start postgres before api. Worth knowing: it doesn’t wait for Postgres to be ready to accept connections, just started. If your app boots fast and Postgres isn’t done initializing, you’ll get connection errors. The workaround is a health check or a small retry loop in your app startup. I’ve hit this bug more than once.
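For the retry-loop option, here is a minimal sketch in shell. The retry function, its name, and its argument order are my own invention rather than anything Docker or Compose provides. It just reruns a command until it succeeds or the attempts run out.

```shell
# retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Reruns COMMAND until it exits 0, sleeping DELAY_SECONDS between tries.
# Returns 1 if COMMAND still fails on the last attempt.
retry() {
  max_attempts="$1"
  delay="$2"
  shift 2
  attempt=1
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after ${attempt} attempts: $*" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}

# Trivial demo: a command that succeeds on the first try.
retry 3 0 true && echo "ready"
```

In a container entrypoint you might call it as retry 30 1 pg_isready -h postgres -p 5432 before starting Node. That assumes the PostgreSQL client tools are installed in the image, which they are not in node:20-alpine by default. Any other "is the database up yet" command would slot in the same way.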

Compose commands you’ll use:

bash

docker compose up           # Start everything, attached
docker compose up -d        # Start everything, detached
docker compose down         # Stop and remove containers
docker compose down -v      # Stop and remove containers + volumes (careful — this wipes your DB)
docker compose logs api     # Logs for a specific service
docker compose exec api sh  # Shell into a running service

One More Thing: .dockerignore

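Before you ship your first image, create a .dockerignore file. It works much like .gitignore: anything matching a pattern listed there gets excluded from the build context Docker sends to the daemon. One difference worth knowing is that patterns are matched relative to the root of the build context, so a bare node_modules only matches the top-level folder (use **/node_modules to catch nested ones).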

Without it, you’re sending your entire node_modules (potentially hundreds of MB), your .env file with secrets, your .git history, everything — into the build. Builds get slow and images get bloated.

node_modules
.env
.git
dist
*.log

That’s a reasonable starting point for a Node.js project.

Also, and I can’t stress this enough: don’t put real secrets in your Dockerfile. Environment variables set with ENV in a Dockerfile get baked into the image and are visible to anyone who pulls it (docker inspect and docker history will both show them). Use .env files at runtime or proper secrets management. This is a mistake people make in production and it’s not fun to clean up.
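As a sketch of the "at runtime" option: docker run accepts an --env-file flag that reads KEY=value lines from a file on the host and sets them in the container, so the values never end up in the image. The wrapper below is hypothetical (run_with_env_file is my name, and my-app is just a placeholder image). All it adds is a check that the file exists before calling Docker.

```shell
# Hypothetical wrapper: run an image with variables loaded from a host-side file.
# Usage: run_with_env_file PATH_TO_ENV_FILE IMAGE
run_with_env_file() {
  env_file="$1"
  image="$2"
  if [ ! -f "$env_file" ]; then
    echo "missing env file: $env_file" >&2
    return 1
  fi
  # --env-file injects the variables when the container starts, not at build time.
  docker run --rm --env-file "$env_file" "$image"
}
```

Calling run_with_env_file .env my-app would then start the container with whatever is in .env. Since .env is also listed in .dockerignore, it stays out of the build context.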

Where to Go From Here

If you got through all of this, you’re in solid shape for day-to-day Docker usage. The next things worth learning are multi-stage builds (big for keeping production image sizes down) and health checks in Compose. After that, if you’re heading toward Kubernetes territory, a lot of what you’ve learned here carries over directly.

Questions or something that didn’t click? Drop a comment below — I check them.