What is Docker? Deep Dive into Containerization
Hey everyone! Welcome back to the tutorial series. Today we are going to take a deep dive into Docker - one of the most important tools in modern software development!
If you've heard terms like "containers", "images", "Docker Hub" and wondered what they mean - this is the episode for you!
What we will cover:
- What is Docker?
- The Problem Before Docker
- Docker vs Virtual Machines
- Docker Architecture
- Docker Images
- Docker Containers
- Dockerfile - Building Custom Images
- Docker Hub
- Essential Docker Commands
- Docker Volumes
- Docker Networks
- Docker Compose
- Real-World Example
- Interview Questions
What is Docker?
Let's start with the most basic question - What exactly is Docker?
Here's the official definition:
"Docker is an open-source platform that automates the deployment, scaling, and management of applications using containerization."
Wait, what does that mean? Let me break it down for you!
- Containerization - A method of packaging an application with all its dependencies
- Container - A lightweight, standalone, executable package
- Isolation - Each container runs in its own isolated environment
- Portability - "Works on my machine" becomes "Works on EVERY machine"
In simple words:
Docker = Package your application + dependencies + environment into a single unit that runs ANYWHERE!
Think of Docker like a Shipping Container:
==========================================
Before shipping containers:
- Different goods packed differently
- Hard to transport across ships, trucks, trains
- Damaged during transfers
After shipping containers:
- Standard size container
- Works on any ship, truck, or train
- Protected and isolated
Docker does the SAME for software!
- Standard container format
- Works on any machine with Docker
- Isolated and protected
The Problem Before Docker
To understand why Docker was created, let's see what problems existed before it!
Problem 1: "It works on my machine!"
The Classic Developer Problem:
==============================
Developer: "The app works perfectly on my laptop!"
Tester:    "It's crashing on my machine!"
DevOps:    "It doesn't work on the server!"
WHY? Different environments!
- Different OS versions
- Different library versions
- Different configurations
- Missing dependencies
Problem 2: Dependency Hell
Dependency Conflicts:
=====================
App A needs: Python 2.7, Node 14, MySQL 5.7
App B needs: Python 3.9, Node 18, MySQL 8.0
Running both on the same machine? NIGHTMARE!
- Version conflicts
- Library clashes
- Configuration mess
Problem 3: Slow and Heavy Virtual Machines
Virtual Machine Problems:
=========================
┌─────────────────────────┐
│       App (50MB)        │
├─────────────────────────┤
│   Guest OS (2-10GB!)    │  ← Heavy!
├─────────────────────────┤
│       Hypervisor        │
├─────────────────────────┤
│         Host OS         │
├─────────────────────────┤
│        Hardware         │
└─────────────────────────┘
- Each VM needs its own OS (GBs of space!)
- Slow to start (minutes)
- Resource hungry (RAM, CPU)
- Hard to share and distribute
Docker solves ALL these problems!
Docker vs Virtual Machines
This is a very important comparison to understand!
Virtual Machine (VM): Docker Container:
===================== =================
┌───────────┐ ┌───────────┐ ┌───────────┐ ┌───────────┐
│ App A │ │ App B │ │ App A │ │ App B │
├───────────┤ ├───────────┤ ├───────────┤ ├───────────┤
│ Bins/Libs│ │ Bins/Libs│ │ Bins/Libs│ │ Bins/Libs│
├───────────┤ ├───────────┤ └───────────┴─┴───────────┘
│ Guest OS │ │ Guest OS │ │
│ (2-10GB) │ │ (2-10GB) │ ┌──────────┴──────────┐
└───────────┴─┴───────────┘ │ Docker Engine │
│ └──────────┬──────────┘
┌──────────┴──────────┐ │
│ Hypervisor │ ┌──────────┴──────────┐
└──────────┬──────────┘ │ Host OS │
│ └──────────┬──────────┘
┌──────────┴──────────┐ │
│ Host OS │ ┌──────────┴──────────┐
└──────────┬──────────┘ │ Hardware │
│ └─────────────────────┘
┌──────────┴──────────┐
│ Hardware │ NO Guest OS needed!
└─────────────────────┘ Containers share Host OS kernel!
| Feature | Virtual Machine | Docker Container |
|---|---|---|
| OS | Each VM has its own OS | Shares host OS kernel |
| Size | GBs (includes full OS) | MBs (only app + dependencies) |
| Startup Time | Minutes | Seconds |
| Performance | Slower (hardware virtualization) | Near native (OS-level virtualization) |
| Resource Usage | High (RAM, CPU for each OS) | Low (shared resources) |
| Isolation | Complete (separate OS) | Process-level isolation |
| Portability | Less portable | Highly portable |
Key Insight: Containers virtualize the OS, while VMs virtualize the hardware!
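Want to feel the difference yourself? Here's a quick sketch (assuming Docker is installed; the alpine image is tiny and will be pulled automatically if you don't have it):
# Time how long it takes to start a container, run a command, and remove it
time docker run --rm alpine echo "hello from a container"
# Once the image is cached locally this typically finishes in a second or two -
# compare that with booting a full virtual machine, which takes minutes.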
Docker Architecture
Let's understand how Docker works behind the scenes!
Docker Architecture:
====================
┌─────────────────────────────────────────────────────────────┐
│ DOCKER CLIENT │
│ │
│ docker build docker pull docker run │
│ │ │ │ │
└─────────┴───────────────┴──────────────┴────────────────────┘
│
REST API
│
▼
┌─────────────────────────────────────────────────────────────┐
│ DOCKER HOST │
│ ┌───────────────────────────────────────────────────────┐ │
│ │ DOCKER DAEMON (dockerd) │ │
│ │ │ │
│ │ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │ │
│ │ │ Container 1 │ │ Container 2 │ │ Container 3 │ │ │
│ │ └─────────────┘ └─────────────┘ └─────────────┘ │ │
│ │ │ │
│ │ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │ │
│ │ │ Image 1 │ │ Image 2 │ │ Image 3 │ │ │
│ │ └─────────────┘ └─────────────┘ └─────────────┘ │ │
│ └───────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ DOCKER REGISTRY │
│ (Docker Hub) │
│ │
│ node python nginx mongodb │
│ redis postgres ubuntu alpine │
└─────────────────────────────────────────────────────────────┘
Key Components Explained
1. Docker Client
- The command-line tool you interact with
- Sends commands to Docker Daemon via REST API
- Commands like: docker run, docker build, docker pull
2. Docker Daemon (dockerd)
- The background service running on host
- Manages Docker objects (images, containers, networks, volumes)
- Listens for Docker API requests
3. Docker Registry
- Storage and distribution system for Docker images
- Docker Hub is the default public registry
- You can also create private registries
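You can actually see the client/daemon split on your own machine. A quick sketch (exact output varies by installation):
# Prints a "Client" section and a "Server: Docker Engine" section -
# proof that the CLI and the daemon are separate programs talking over the API
docker version
# High-level daemon info: number of containers, images, storage driver, etc.
docker info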
Docker Images
An image is a read-only template with instructions for creating a container. Think of it like a blueprint or a class in OOP!
Docker Image Concept:
=====================
Image = Blueprint/Template (Read-Only)
Container = Running Instance of Image
Just like:
Class = Blueprint
Object = Instance of Class
One Image → Multiple Containers!
           ┌─────────────────┐
           │   Node Image    │
           └────────┬────────┘
                    │
      ┌─────────────┼─────────────┐
      │             │             │
      ▼             ▼             ▼
┌───────────┐ ┌───────────┐ ┌───────────┐
│   App 1   │ │   App 2   │ │   App 3   │
│ Container │ │ Container │ │ Container │
└───────────┘ └───────────┘ └───────────┘
Image Layers:
Docker images are built in layers. Each instruction in a Dockerfile creates a new layer!
Image Layers Example:
=====================
┌─────────────────────────┐
│ Layer 5: CMD node app   │ ← Run command
├─────────────────────────┤
│ Layer 4: COPY . .       │ ← Your code
├─────────────────────────┤
│ Layer 3: npm install    │ ← Dependencies
├─────────────────────────┤
│ Layer 2: WORKDIR /app   │ ← Working directory
├─────────────────────────┤
│ Layer 1: node:18-alpine │ ← Base image
└─────────────────────────┘
Benefits of Layers:
- Cached! (faster builds)
- Shared between images
- Only changed layers rebuild
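You can inspect these layers yourself with docker history (covered again in the commands section below). A quick sketch, assuming you've already built an image called my-node-app:
# List the layers of an image, newest first, with the instruction that created each
docker history my-node-app
# Show the full (untruncated) commands behind each layer
docker history --no-trunc my-node-app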
Docker Containers
A container is a runnable instance of an image. It's what actually runs your application!
Container Characteristics:
==========================
┌─────────────────────────────────────┐
│             CONTAINER               │
│                                     │
│   ┌─────────────────────────────┐   │
│   │     Your Application        │   │
│   ├─────────────────────────────┤   │
│   │        Dependencies         │   │
│   ├─────────────────────────────┤   │
│   │     Runtime Environment     │   │
│   └─────────────────────────────┘   │
│                                     │
│   Features:                         │
│   ✅ Isolated process               │
│   ✅ Own filesystem                 │
│   ✅ Own network interface          │
│   ✅ Resource limits (CPU, RAM)     │
│   ✅ Portable                       │
└─────────────────────────────────────┘
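Those resource limits are set with flags on docker run. A minimal sketch (the values here are arbitrary examples):
# Cap the container at 256 MB of RAM and half a CPU core
docker run -d --name limited-nginx --memory 256m --cpus 0.5 nginx
# Watch live CPU/RAM usage to confirm the limits
docker stats limited-nginx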
Container Lifecycle:
Container States:
=================
docker create docker start
│ │
▼ ▼
┌─────────┐ ┌─────────┐
│ Created │──────────→│ Running │
└─────────┘ └────┬────┘
│
┌────────────────┼────────────────┐
│ │ │
docker stop docker pause docker kill
│ │ │
▼ ▼ ▼
┌─────────┐ ┌─────────┐ ┌─────────┐
│ Stopped │ │ Paused │ │ Dead │
└─────────┘ └─────────┘ └─────────┘
│
docker rm
│
▼
┌─────────┐
│ Removed │
└─────────┘
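You can walk through this whole lifecycle with a throwaway container. A small sketch using the nginx image:
# Create (but don't start) a container → state: Created
docker create --name lifecycle-demo nginx
# Start it → state: Running
docker start lifecycle-demo
# Pause and resume the processes inside it
docker pause lifecycle-demo
docker unpause lifecycle-demo
# Stop it gracefully → state: Stopped (Exited)
docker stop lifecycle-demo
# Remove it → gone
docker rm lifecycle-demo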
Dockerfile - Building Custom Images
A Dockerfile is a text file with instructions to build a Docker image. It's like a recipe!
Dockerfile Syntax:
==================
# Comment
INSTRUCTION arguments
Common Instructions:
- FROM       → Base image
- WORKDIR    → Set working directory
- COPY       → Copy files from host to image
- RUN        → Execute commands during build
- ENV        → Set environment variables
- EXPOSE     → Document which port to expose
- CMD        → Default command when container starts
- ENTRYPOINT → Configure container as executable
Example Dockerfile for Node.js App:
# Use Node.js base image
FROM node:18-alpine
# Set working directory inside container
WORKDIR /app
# Copy package files first (for better caching)
COPY package*.json ./
# Install dependencies
RUN npm install
# Copy rest of the application code
COPY . .
# Expose port 3000
EXPOSE 3000
# Command to run the application
CMD ["node", "index.js"]
Building the Image:
# Build image from Dockerfile
docker build -t my-node-app .
# Breakdown:
# docker build    → Build command
# -t my-node-app  → Tag/name for the image
# .               → Build context (current directory)
OUTPUT:
[+] Building 45.2s (10/10) FINISHED
 => [1/5] FROM node:18-alpine
 => [2/5] WORKDIR /app
 => [3/5] COPY package*.json ./
 => [4/5] RUN npm install
 => [5/5] COPY . .
 => exporting to image
 => naming to docker.io/library/my-node-app
Dockerfile Best Practices
Best Practices:
===============
1. Use specific base image tags (not 'latest')
❌ FROM node
✅ FROM node:18-alpine
2. Use .dockerignore file
node_modules
.git
.env
*.log
3. Order instructions by change frequency
(least changing → most changing)
Better layer caching!
4. Combine RUN commands
❌ RUN apt-get update
RUN apt-get install -y curl
✅ RUN apt-get update && apt-get install -y curl
5. Use multi-stage builds for smaller images (full sketch after this list)
FROM node:18 AS builder
# build steps...
FROM node:18-alpine
COPY --from=builder /app/dist ./dist
6. Don't run as root
RUN adduser -D appuser
USER appuser
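Putting practices 1, 3, 5, and 6 together, here's a sketch of a multi-stage, non-root Dockerfile for a Node.js app with a build step (the "build" script and the dist/ output folder are assumptions about your project - adjust them to match yours):
# ---- Build stage: full Node image with dev tooling ----
FROM node:18 AS builder
WORKDIR /app
COPY package*.json ./
RUN npm install
COPY . .
RUN npm run build              # assumes a "build" script that outputs to dist/
# ---- Runtime stage: small Alpine image with only what's needed to run ----
FROM node:18-alpine
WORKDIR /app
COPY package*.json ./
RUN npm install --omit=dev     # production dependencies only
COPY --from=builder /app/dist ./dist
RUN adduser -D appuser         # don't run as root
USER appuser
EXPOSE 3000
CMD ["node", "dist/index.js"]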
Docker Hub
Docker Hub is the world's largest container registry - like GitHub but for Docker images!
Docker Hub:
===========
┌─────────────────────────────────────────┐
│               DOCKER HUB                │
│             hub.docker.com              │
│                                         │
│  Official Images:                       │
│  ┌────────┐  ┌────────┐  ┌────────┐     │
│  │  node  │  │ python │  │ nginx  │     │
│  └────────┘  └────────┘  └────────┘     │
│  ┌────────┐  ┌────────┐  ┌────────┐     │
│  │ mongo  │  │ redis  │  │postgres│     │
│  └────────┘  └────────┘  └────────┘     │
│                                         │
│  Your Images:                           │
│  ┌──────────────────────────────┐       │
│  │   username/my-awesome-app    │       │
│  └──────────────────────────────┘       │
└─────────────────────────────────────────┘
Commands:
- docker pull   → Download image
- docker push   → Upload image
- docker search → Search for images
Working with Docker Hub:
# Pull an image from Docker Hub
docker pull node:18-alpine
# Login to Docker Hub
docker login
# Tag your image for pushing
docker tag my-node-app username/my-node-app:v1.0
# Push to Docker Hub
docker push username/my-node-app:v1.0
# Pull your image on another machine
docker pull username/my-node-app:v1.0
Essential Docker Commands
Let's learn the most important Docker commands!
Image Commands
# List all images
docker images
# Pull an image
docker pull nginx:latest
# Build an image
docker build -t my-app:v1 .
# Remove an image
docker rmi image_name
# Remove all unused images
docker image prune -a
# Inspect an image
docker inspect image_name
# View image history (layers)
docker history image_name
Container Commands
# Run a container
docker run nginx
# Run in detached mode (background)
docker run -d nginx
# Run with port mapping
docker run -d -p 8080:80 nginx
# Host port 8080 → Container port 80
# Run with name
docker run -d --name my-nginx -p 8080:80 nginx
# Run with environment variables
docker run -d -e NODE_ENV=production my-app
# Run interactive mode (with terminal)
docker run -it ubuntu bash
# List running containers
docker ps
# List all containers (including stopped)
docker ps -a
# Stop a container
docker stop container_name
# Start a stopped container
docker start container_name
# Restart a container
docker restart container_name
# Remove a container
docker rm container_name
# Remove running container (force)
docker rm -f container_name
# View container logs
docker logs container_name
# Follow logs in real-time
docker logs -f container_name
# Execute command in running container
docker exec -it container_name bash
# View container resource usage
docker stats
Complete Example - Running Node.js App
Step-by-Step Example:
=====================
1. Create a simple Node.js app
--------------------------------
// index.js
const express = require('express');
const app = express();
app.get('/', (req, res) => {
res.send('Hello from Docker!');
});
app.listen(3000, () => {
console.log('Server running on port 3000');
});
2. Create package.json
----------------------
{
"name": "docker-demo",
"version": "1.0.0",
"main": "index.js",
"dependencies": {
"express": "^4.18.2"
}
}
3. Create Dockerfile
--------------------
FROM node:18-alpine
WORKDIR /app
COPY package*.json ./
RUN npm install
COPY . .
EXPOSE 3000
CMD ["node", "index.js"]
4. Create .dockerignore
-----------------------
node_modules
.git
.env
5. Build the image
------------------
docker build -t my-express-app .
6. Run the container
--------------------
docker run -d -p 3000:3000 --name express-server my-express-app
7. Test it!
-----------
curl http://localhost:3000
# Output: Hello from Docker!
8. View logs
------------
docker logs express-server
9. Stop and remove
------------------
docker stop express-server
docker rm express-server
Docker Volumes
Volumes are used to persist data generated by containers. By default, a container's data is lost when the container is removed!
The Data Problem:
=================
Container without Volume:
┌─────────────────────┐
│ Container │
│ ┌───────────────┐ │
│ │ App Data │ │ ← Data stored here
│ └───────────────┘ │
└─────────────────────┘
│
docker rm
│
▼
DATA LOST! 😱
Container with Volume:
┌─────────────────────┐
│ Container │
│ ┌───────────────┐ │
│ │ /app/data │──┼────┐
│ └───────────────┘ │ │
└─────────────────────┘ │
│ │
docker rm │
│ ▼
▼ ┌─────────────┐
Container gone │ Volume │
│ (Persists!)│
└─────────────┘
DATA SAFE! ✅
Types of Volumes:
1. Named Volumes (Recommended)
==============================
docker volume create my-data
docker run -v my-data:/app/data my-app
# Docker manages the storage location
# Best for production
2. Bind Mounts
==============
docker run -v /host/path:/container/path my-app
docker run -v $(pwd):/app my-app
# Maps host directory to container
# Great for development (live reload!)
3. Anonymous Volumes
====================
docker run -v /app/data my-app
# Docker creates random name
# Not recommended (hard to manage)
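Want proof that a named volume really outlives its container? A tiny experiment (the names demo-data and note.txt are just examples):
# Create a volume and write a file into it from a throwaway container
docker volume create demo-data
docker run --rm -v demo-data:/data alpine sh -c "echo 'still here' > /data/note.txt"
# That container is gone - but a brand-new container sees the same file
docker run --rm -v demo-data:/data alpine cat /data/note.txt
# Output: still here
# Clean up
docker volume rm demo-data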
Volume Commands:
# Create a volume
docker volume create my-volume
# List volumes
docker volume ls
# Inspect a volume
docker volume inspect my-volume
# Remove a volume
docker volume rm my-volume
# Remove all unused volumes
docker volume prune
# Run container with volume
docker run -d \
-v my-volume:/data \
--name my-container \
my-image
Docker Networks
Networks allow containers to communicate with each other and the outside world!
Docker Network Types:
=====================
1. BRIDGE (Default)
   - Containers on same bridge can communicate
   - Isolated from other networks
2. HOST
   - Container uses host's network directly
   - No network isolation
3. NONE
   - No networking
   - Complete isolation
4. CUSTOM BRIDGE (Recommended)
   - User-defined bridge network
   - DNS resolution by container name!
Container Communication:
========================
Default Bridge Network:
┌─────────────────────────────────────┐
│           bridge (docker0)          │
│                                     │
│  ┌───────────┐     ┌───────────┐    │
│  │ Container │     │ Container │    │
│  │     A     │     │     B     │    │
│  │ 172.17.0.2│     │ 172.17.0.3│    │
│  └───────────┘     └───────────┘    │
│                                     │
│  Can communicate via IP only!       │
└─────────────────────────────────────┘
Custom Bridge Network:
┌─────────────────────────────────────┐
│             my-network              │
│                                     │
│  ┌───────────┐     ┌───────────┐    │
│  │ Container │     │ Container │    │
│  │   "app"   │───→ │   "db"    │    │
│  └───────────┘     └───────────┘    │
│                                     │
│  Can communicate via NAME!          │
│  app can reach db using "db"        │
└─────────────────────────────────────┘
Network Commands:
# Create a network
docker network create my-network
# List networks
docker network ls
# Inspect a network
docker network inspect my-network
# Run container on specific network
docker run -d --network my-network --name app my-app
# Connect running container to network
docker network connect my-network container_name
# Disconnect from network
docker network disconnect my-network container_name
# Remove a network
docker network rm my-network
Practical Example - App + Database:
# Create network
docker network create app-network
# Run MongoDB container
docker run -d \
--network app-network \
--name mongodb \
-e MONGO_INITDB_ROOT_USERNAME=admin \
-e MONGO_INITDB_ROOT_PASSWORD=secret \
mongo:6
# Run Node.js app container
docker run -d \
--network app-network \
--name node-app \
-p 3000:3000 \
-e MONGO_URL=mongodb://admin:secret@mongodb:27017 \
my-node-app
# node-app can connect to MongoDB using hostname "mongodb"!
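If something isn't connecting, you can double-check the wiring (a quick sketch):
# Both "mongodb" and "node-app" should be listed under "Containers"
docker network inspect app-network
# The app's logs should show whether it reached the database
# (what exactly gets logged depends on your application)
docker logs node-app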
Docker Compose
Docker Compose is a tool for defining and running multi-container Docker applications. Instead of running multiple docker commands, you define everything in a YAML file!
Without Docker Compose:
=======================
# Terminal 1
docker network create app-net
# Terminal 2
docker run -d --network app-net --name db \
-e POSTGRES_PASSWORD=secret \
-v db-data:/var/lib/postgresql/data \
postgres:15
# Terminal 3
docker run -d --network app-net --name redis redis:7
# Terminal 4
docker run -d --network app-net --name app \
-p 3000:3000 \
-e DATABASE_URL=postgres://postgres:secret@db:5432 \
-e REDIS_URL=redis://redis:6379 \
my-app
😫 So many commands! Hard to manage!
With Docker Compose:
====================
docker-compose up -d
😎 One command! Everything defined in YAML!
docker-compose.yml Example:
version: '3.8'
services:
# Node.js Application
app:
build: .
ports:
- "3000:3000"
environment:
- NODE_ENV=production
- DATABASE_URL=postgres://postgres:secret@db:5432/myapp
- REDIS_URL=redis://redis:6379
depends_on:
- db
- redis
volumes:
- ./logs:/app/logs
restart: unless-stopped
# PostgreSQL Database
db:
image: postgres:15-alpine
environment:
- POSTGRES_USER=postgres
- POSTGRES_PASSWORD=secret
- POSTGRES_DB=myapp
volumes:
- db-data:/var/lib/postgresql/data
restart: unless-stopped
# Redis Cache
redis:
image: redis:7-alpine
volumes:
- redis-data:/data
restart: unless-stopped
# Nginx Reverse Proxy
nginx:
image: nginx:alpine
ports:
- "80:80"
volumes:
- ./nginx.conf:/etc/nginx/nginx.conf:ro
depends_on:
- app
restart: unless-stopped
volumes:
db-data:
redis-data:
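One detail worth noticing: the app service reaches PostgreSQL at the hostname db and Redis at redis because Compose puts all services on a shared network with DNS by service name. Before starting the stack, you can also sanity-check the file (a quick sketch, run from the directory containing docker-compose.yml):
# Print the fully-resolved configuration and catch syntax errors early
docker-compose config
# List just the service names defined in the file
docker-compose config --services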
Docker Compose Commands:
# Start all services
docker-compose up
# Start in detached mode
docker-compose up -d
# Build and start
docker-compose up --build
# Stop all services
docker-compose down
# Stop and remove volumes
docker-compose down -v
# View logs
docker-compose logs
# Follow logs
docker-compose logs -f
# View logs for specific service
docker-compose logs app
# List running services
docker-compose ps
# Execute command in service
docker-compose exec app bash
# Scale a service
docker-compose up -d --scale app=3
# Restart a service
docker-compose restart app
Real-World Architecture Example
Production Architecture with Docker:
====================================
┌─────────────┐
│ Client │
│ (Browser) │
└──────┬──────┘
│
▼
┌────────────────────────┐
│ Load Balancer │
│ (Nginx) │
└───────────┬────────────┘
│
┌─────────────────┼─────────────────┐
│ │ │
▼ ▼ ▼
┌───────────────┐ ┌───────────────┐ ┌───────────────┐
│ App Server │ │ App Server │ │ App Server │
│ Container │ │ Container │ │ Container │
│ (Node.js) │ │ (Node.js) │ │ (Node.js) │
└───────┬───────┘ └───────┬───────┘ └───────┬───────┘
│ │ │
└─────────────────┼─────────────────┘
│
┌─────────────────┼─────────────────┐
│ │ │
▼ ▼ ▼
┌───────────────┐ ┌───────────────┐ ┌───────────────┐
│ PostgreSQL │ │ Redis │ │ Elasticsearch│
│ Container │ │ Container │ │ Container │
└───────────────┘ └───────────────┘ └───────────────┘
│
▼
┌───────────────┐
│ Volume │
│ (Persistent) │
└───────────────┘
Quick Recap
| Concept | Description |
|---|---|
| Docker | Platform for containerizing applications |
| Image | Read-only template/blueprint for containers |
| Container | Running instance of an image |
| Dockerfile | Instructions to build an image |
| Docker Hub | Public registry for Docker images |
| Volume | Persistent storage for containers |
| Network | Communication between containers |
| Docker Compose | Tool for multi-container applications |
Interview Questions
Q: What is Docker?
"Docker is an open-source containerization platform that packages applications with all their dependencies into standardized units called containers. This ensures the application runs consistently across different environments."
Q: What is the difference between Docker Image and Container?
"An image is a read-only template containing instructions for creating a container - like a blueprint or class. A container is a running instance of an image - like an object created from a class. One image can create multiple containers."
Q: What is the difference between Docker and Virtual Machines?
"Docker containers share the host OS kernel and virtualize at the OS level, making them lightweight (MBs) and fast to start (seconds). VMs virtualize at the hardware level, each requiring its own OS, making them heavier (GBs) and slower to start (minutes)."
Q: What is a Dockerfile?
"A Dockerfile is a text file containing instructions to build a Docker image. It specifies the base image, dependencies, application code, environment variables, and the command to run when the container starts."
Q: What are Docker Volumes and why are they needed?
"Docker volumes are used to persist data generated by containers. By default, container data is ephemeral and lost when the container is removed. Volumes store data outside the container's filesystem, ensuring data persistence across container restarts and removals."
Q: What is Docker Compose?
"Docker Compose is a tool for defining and running multi-container Docker applications. You use a YAML file to configure your application's services, networks, and volumes, then create and start all services with a single command."
Q: How do containers communicate with each other?
"Containers communicate through Docker networks. On a user-defined bridge network, containers can reach each other using container names as hostnames. Docker provides DNS resolution, so a container named 'db' can be reached by other containers on the same network using 'db' as the hostname."
Q: What is the difference between CMD and ENTRYPOINT?
"Both specify commands to run when a container starts. CMD provides defaults that can be overridden when running the container. ENTRYPOINT sets a fixed command that always runs, with CMD arguments appended to it. Use ENTRYPOINT when the container should always run the same executable."
Q: What is layer caching in Docker?
"Docker images are built in layers, and each layer is cached. When rebuilding an image, Docker reuses cached layers if the instruction and context haven't changed. This speeds up builds significantly. That's why we copy package.json before copying all source code - dependencies are cached unless package.json changes."
Q: How do you reduce Docker image size?
"Use smaller base images (alpine), multi-stage builds, combine RUN commands to reduce layers, use .dockerignore to exclude unnecessary files, remove package manager caches, and avoid installing unnecessary packages."
Key Points to Remember
- Docker = Containerization platform
- Container = Lightweight, isolated environment
- Image = Blueprint for containers (read-only)
- Dockerfile = Recipe to build images
- Images have layers - cached for faster builds
- Docker Hub = Public registry for images
- Volumes = Persist data beyond container lifecycle
- Networks = Container communication
- Custom bridge = DNS resolution by container name
- Docker Compose = Multi-container orchestration
- docker-compose.yml = Define services, networks, volumes
- Containers vs VMs = OS-level vs hardware virtualization
- Use .dockerignore to exclude files
- Use specific tags, not 'latest'
- Multi-stage builds for smaller images
What's Next?
Now you understand Docker from the ground up! Here's what you can explore next:
- Docker Swarm - Container orchestration
- Kubernetes - Production-grade orchestration
- CI/CD with Docker - Automated deployments
- Docker Security Best Practices
Keep coding, keep learning! See you in the next one!