Learn the Basics of Docker and Docker Compose as a Software Developer

Like virtual machines (VMs), Docker is also a technology for deploying and running applications in isolated environments. However, Docker has the advantage of being more lightweight, more portable, easier to set up, and faster to start, which makes it particularly well-suited for microservices and modern, cloud-native applications.

Docker is a must-know tool for software developers nowadays, no matter whether you are a frontend, backend, or DevOps developer. Even as a data analyst or data scientist, knowing Docker will make your life much easier, and it will make it much easier for you to work with other developers on the team.

However, many new developers still lack the basics of Docker and we need to spend quite some time onboarding them with the fundamentals. To make it easier for new developers to get started with Docker, we will introduce the fundamentals of Docker and Docker Compose in this post, with some easy-to-follow examples. Hopefully, this post will make life easier for both trainers and trainees.


Docker image vs Docker container

We will first introduce some basic concepts to give you a quick overview. This part will be concise, as most of the details can easily be found online. By the way, this knowledge can also come in handy in job interviews 😃. However, if you are more interested in the code, you can skip the theoretical part and jump directly to the coding sections.

A Docker image is like a blueprint or a template for creating Docker containers. It can be thought of as a snapshot of an application together with its dependencies. Docker images are stored in registries, such as Docker Hub or Google Container Registry.

A Docker container is a running instance of a Docker image and provides an isolated environment for running applications. A Docker container is isolated from other containers and also the host system.
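To make these two concepts concrete, here is a minimal sketch (assuming Docker is already installed; the container names demo-1 and demo-2 are arbitrary):

```shell
# Pull an image (the blueprint) from Docker Hub.
docker pull hello-world

# Create two independent containers from the same image.
docker run --name demo-1 hello-world
docker run --name demo-2 hello-world

# List all containers: one image, two separate container instances.
docker container ls -a
```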


Docker containers vs Virtual machines

Virtual machines (VMs) and Docker containers are both technologies that can be used for deploying and running applications in isolated environments. However, the infrastructure and the technology behind them are quite different which makes them suitable for different use cases.

Virtual machines use hardware-level virtualization and each VM has its own operating system. It has the advantage of better isolation but at the cost of more resource usage and lower performance. Virtual machines are basically like the computers you are using now. You can perform the same actions on it as on your local computer.

You can create virtual machines locally using some tools like VMware or VirtualBox. However, a more common form is the VMs on Cloud platforms, an example is the Compute Engine on the Google Cloud Platform. You can customize the CPU, memory, and disk size when creating a VM to make it meet your requirements. Normally, we create VMs when we need to use them for some heavy calculations like training machine learning models, or when we need to start multiple Docker containers on the same machine.

On the other hand, Docker containers share the kernel with the host and are more lightweight, more portable, and generally perform better. It is much easier to run the same container on both GCP and AWS than to move a virtual machine from GCP to AWS. Docker containers are a natural choice for modern microservices, where a big application is divided into smaller, independent, and loosely coupled services that communicate with each other over APIs. Serverless services like Cloud Functions, Cloud Run, and App Engine on GCP all use Docker containers, either directly or behind the scenes.

Docker containers are also perfect for local development because we can install all kinds of applications or services on the same computer with no need to worry about them interfering with each other or with the system. We will later introduce Docker Compose which is a tool for defining and running multi-container Docker applications.


Install Docker

OK, enough abstract talking, let’s see some code that is more fun 😄.

Firstly, we need to install Docker on our computer. It’s recommended to follow the official instructions on how to install Docker on different operating systems, rather than using some “handy” third-party scripts. Following the official instructions can make sure the latest Docker and Docker Compose are installed in a proper way.

If you are an Ubuntu user, you can choose to install either Docker Desktop or just Docker Engine. You will need to restart your computer after installation so that Docker's authentication works properly.


Run a Hello World Docker container

Unless we need to build a microservice ourselves, we can normally use publicly available Docker images directly. Let's use the "hello-world" image on Docker Hub to run a container and test whether Docker is installed properly:

docker run hello-world

The Docker image will be automatically pulled from Docker Hub if it cannot be found locally.

If you see “Hello from Docker!” on your screen, then Docker has been installed successfully.

We can check the Docker image downloaded with this command:

docker image ls

And you can see the information for the Docker image:

REPOSITORY    TAG       IMAGE ID       CREATED         SIZE
hello-world   latest    feb5d9fea6a5   19 months ago   13.3kB

The repository is the name of the Docker image, which can be found on Docker Hub, and a tag is a label assigned to a specific version of the image.
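For example, we can pull a specific version of an image by its tag instead of the default latest tag (the exact tags available on Docker Hub may change over time):

```shell
# Pull a specific tagged version of the MySQL image.
docker pull mysql:8.0.33

# List the local MySQL images together with their tags.
docker image ls mysql
```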

We can find out more information about a Docker image with the inspect command:

docker image inspect hello-world

Importantly, you can find out which command is run in the Docker container in the Cmd section:

"Cmd": [
    "/hello"
]

We can check currently running Docker containers with the docker ps command. However, when this command is run, nothing will be shown. This is because the container was just started, ran the /hello command, and then stopped.

We can find the stopped container with this command:

docker container ls -a

It will show all containers, including those that are stopped:

CONTAINER ID   IMAGE         COMMAND    CREATED          STATUS                      PORTS     NAMES
7d79e14b6cb3   hello-world   "/hello"   12 minutes ago   Exited (0) 12 minutes ago             optimistic_jang

We can remove stopped containers with this command:

docker container rm optimistic_jang

optimistic_jang is a random container name assigned by Docker. Remember to press the Tab key to have command autocompletion without typing everything explicitly.

We can also use the --rm option to specify that a Docker container should be deleted once it is stopped:

docker run --rm hello-world

More advanced and more practical ways of running Docker containers

Above, we just ran the hello-world container to get a feel for Docker. In practice, the usage is much more complex. Let's check the example that has been used in many previous posts regarding MySQL.

We will start a MySQL server locally using Docker:

# Create a volume to persist the data.
$ docker volume create mysql8-data

# Create the container for MySQL.
$ docker run --name mysql8 -d -e MYSQL_ROOT_PASSWORD=root -p 13306:3306 -v mysql8-data:/var/lib/mysql mysql:8

# Connect to the local MySQL server in Docker.
$ docker exec -it mysql8 mysql -u root -proot

mysql> SELECT VERSION();
+-----------+
| VERSION() |
+-----------+
| 8.0.33    |
+-----------+
1 row in set (0.00 sec)

The following actions are performed with the above commands:

  1. Create a named volume to persist the data even when the container is deleted. If we don’t create and use a named volume, Docker will create an anonymous volume automatically which will be deleted when the container is removed. We can also mount a local folder onto a Docker container, which is more commonly used when using Docker Compose as we will see soon.
  2. Many more options are specified in the docker run command:
  • --name (there is no short version) specifies a custom name for the Docker container. If we don't specify one, a random name will be created for us automatically, as shown in the hello-world example above.
  • -d or --detach runs the container in the background and prints the container ID. This is important when using a container as a service that should keep running even after the console is closed.
  • -e or --env sets environment variables for the Docker container. In this example, we specify the password for the MySQL root user using the MYSQL_ROOT_PASSWORD environment variable.
  • -p or --publish publishes a container's port(s) to the host. The format is -p HOST_PORT:CONTAINER_PORT. In this example, container port 3306 is published to port 13306 on the host computer. Therefore, we can access the MySQL server on port 13306 on the host. However, inside the container, or inside a network as we will see in the Docker Compose example, the server is still accessed on port 3306.
  • -v or --volume mounts a named volume to the container. We should create the volume beforehand so it can be mounted here.

3. We can get into a running container and run commands there using the docker exec command. This is extremely helpful when you don't want to install database clients like redis-cli, mysql, or mongosh locally. In this example, we run the MySQL client (mysql) directly inside the container to check the version of the server.
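As a small sketch of this workflow, we can also open an interactive shell inside the running container, or inspect the named volume from the host:

```shell
# Open an interactive bash shell inside the running MySQL container.
docker exec -it mysql8 bash

# Back on the host: see where the named volume stores its data.
docker volume inspect mysql8-data
```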

If you have a MySQL client installed locally, you can also connect to the MySQL server running in the Docker container on port 13306:

mysql -h 127.0.0.1 -P 13306 -u root -proot

Note that -P specifies the port and -p the password. You can, and normally should, omit the password for security reasons; you will then be prompted to enter it.

In case anything goes wrong with the Docker container, we can inspect the container and also check the logs. These are the basic debugging commands when using Docker:

# Check the status of the Docker container.
docker container inspect mysql8

# Check the logs as a snapshot.
docker logs mysql8

# Check the logs continuously.
docker logs -f mysql8

Build custom Docker images with Dockerfile

Instead of just using pre-built public Docker images, we may also need to build Docker images by ourselves, especially when developing microservices.

To build a custom Docker image, we need to create a special file called Dockerfile, which contains instructions for building a Docker image.

In order to have some simple microservice code to play with, let’s clone this repo for the FastAPI Essentials post, which has a Dockerfile with the content as follows:

FROM python:3.11

WORKDIR /app

COPY ./requirements.txt /app/requirements.txt

RUN pip install --no-cache-dir --upgrade -r /app/requirements.txt

COPY ./app /app

CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "80"]

We will explain each command here:

  • FROM — Specifies the base image to use for building the new image.
  • WORKDIR — Sets the working directory for any subsequent RUN, CMD, ENTRYPOINT, COPY, or ADD commands. If the directory does not exist, it will be created automatically.
  • COPY — Copies files or directories from the host to the container’s filesystem. Note that for a Python microservice, we should always copy the requirements.txt file before the application code. 
    This is because requirements.txt is changed much less frequently than the application code. If the application code is also copied before the RUN command to install the package dependencies, we would need to reinstall everything whenever any application code is changed, which is very inefficient. 
    Under the hood, it’s because Docker creates a separate layer for each command, and if the application code is changed, new layers would be recreated for all subsequent commands.
  • RUN — Executes a command inside the image, which is often used for installing packages or running scripts. If we need to run multiple commands in the same RUN instruction, we can use && and \ to concatenate them, for example:
RUN pip install -U pip && \
    pip install --no-cache-dir -U -r /app/requirements.txt
  • CMD — Provides the default command and arguments for the container when it is run. If a command is provided when starting the container, the CMD will be overridden.
    There is another related command ENTRYPOINT that should be mentioned here, which also specifies the executable that should run when the container is started. However, it will not be overwritten by the command provided when starting the container. Besides, when both CMD and ENTRYPOINT are provided, the ENTRYPOINT would specify the executable that will be run and CMD would provide the default arguments for the ENTRYPOINT command.
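To illustrate how ENTRYPOINT and CMD work together, here is a hypothetical variant of the Dockerfile above (a sketch, not the version used in the repo):

```Dockerfile
# ENTRYPOINT fixes the executable; CMD provides overridable default arguments.
ENTRYPOINT ["uvicorn", "main:app"]
CMD ["--host", "0.0.0.0", "--port", "80"]
```

With this setup, passing arguments to docker run would replace only the CMD part, while uvicorn main:app would still be executed.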

We can create a .dockerignore file and specify which folders or files should be ignored when a Docker image is built. By excluding virtual environment files or Python compilation files, we can reduce the size of the Docker image, which can make it easier to share and deploy.
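A minimal .dockerignore for a Python project might look like this (the exact entries depend on your project layout):

```
.git/
.venv/
__pycache__/
*.pyc
```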

We should go into the folder where Dockerfile is located and run the following command to build the Docker image for our microservice:

docker build -t fastapi-demo:latest .

The -t or --tag option specifies the name and tag for the Docker image to be built. The tag is optional, but we should normally specify one or more tags for our Docker images to identify different versions.

The dot (.) specifies the current folder where Dockerfile is located. When it is located somewhere else, we should specify the path to the folder explicitly.
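If the Dockerfile lives somewhere else, you can also point to it explicitly with the -f option (the paths here are hypothetical):

```shell
# Build with a Dockerfile in a subdirectory, using the current folder as context.
docker build -t fastapi-demo:latest -f docker/Dockerfile .
```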

When the Docker image is built, we can run the container:

docker run --name fastapi-demo -d --rm -p 8000:80 -v ./app:/app fastapi-demo:latest

We can test our microservice running in the Docker container by sending a request to it:

curl -X 'POST' \
  'http://127.0.0.1:8000/products' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "name": "string",
  "price": 0,
  "pid": 0
}'

Since we are using FastAPI, you can also open the interactive API docs that can be accessed at http://127.0.0.1:8000/docs, and try out the APIs there directly.


Docker Compose V1 and V2

Docker Compose is a tool for defining and running multi-container Docker applications. It allows you to manage multiple containers, including their configurations and dependencies in a simple and declarative way using a single docker-compose.yml file.

When using Docker Compose, it should be noted that there are two versions of it, namely V1 and V2.

Docker Compose V1 is written in Python. It is now deprecated, and support for it will stop at the end of June 2023. The docker-compose command is used for Compose V1. If possible, we should always use Docker Compose V2.

Docker Compose V2 is a rewrite in Go with major performance improvements, and it supports most of the features of V1. To use Compose V2, we use the docker compose command, without the hyphen in between. The compose command is now integrated directly into the Docker CLI, rather than being a standalone command like docker-compose.

docker compose is expected to be a drop-in replacement for docker-compose. However, newer docker-compose.yaml files that use new features may only run with docker compose.
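You can check which variant is available on your machine:

```shell
# Compose V2, integrated into the Docker CLI.
docker compose version

# Compose V1, if the standalone binary is still installed.
docker-compose version
```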


Use Docker Compose to manage multiple containers

Let’s write a docker-compose.yaml file to manage both the FastAPI and MySQL Docker containers at the same time.

Before getting started, let’s stop and remove the FastAPI and MySQL containers to avoid port conflicts because the same host ports are used here.

docker container stop fastapi-demo
docker container rm fastapi-demo

docker container stop mysql8
docker container rm mysql8

The docker-compose.yaml file will have the content as follows:

version: "3.9"

services:
  fastapi-demo:
    build:
      context: .
    ports:
      - target: 80
        published: 8000
    volumes:
      - type: bind
        source: ./app
        target: /app
    networks:
      - microservice-net

  mysql:
    image: mysql:8.0.33
    restart: always
    environment:
      MYSQL_DATABASE: data
    env_file:
      - .env
    ports:
      - 13306:3306
    volumes:
      - type: volume
        source: mysql8-data
        target: /var/lib/mysql
    networks:
      - microservice-net

volumes:
  mysql8-data:
    external: true

networks:
  microservice-net:
    driver: bridge

Before we explain each key of the docker-compose.yaml file, we should know what is a service in the context of Docker Compose.

A service in the docker-compose.yml file can be thought of as a template for creating and running one or more Docker containers (more than one container when using Docker Swarm which is not covered in this post).

When we run docker compose up, the containers of each service defined in the file will be created and started according to the configurations. A service in Docker Compose basically works like a system service.

The second-level keys directly under the services key are the services that will be started, which are the FastAPI and MySQL services, respectively.

Now let’s check some of the most commonly used keys for the services:

  • build — Defines the build context for a service, which is normally the directory containing the Dockerfile. It's only needed when we want to build a custom Docker image. Note that when both build and image are specified, Compose builds the image and tags it with the name given under image, rather than pulling it from a registry.
  • image — Specifies the Docker image to use for a service. It is normally some official image from Docker Hub, but can also be custom images in your private Docker registry like Google Container Registry.
  • ports — Maps container ports to host ports. target is the container port and published the host port. You can also use the traditional HOST_PORT:CONTAINER_PORT format as used with the docker run command.
  • volumes — Mounts volumes to a service. We can bind a host folder to a container folder to share files between the host and the container, or use a named volume (defined under the top-level volumes key) to persist data for the container. The type is bind and volume for these two kinds of volumes, respectively.
    In this example, we use bind for the FastAPI service because we want to share the code with the container so the latest code is always used. On the other hand, a named volume is used for the MySQL service so the data is persisted even when the container is deleted.
  • networks — Specifies custom networks (defined under the top-level networks key) for the services. Those with the same network can communicate with each other by service names. Therefore, in the FastAPI service of this example, we can access the MySQL server by host mysql and port 3306.
  • restart — Specifies the restart policy for a service. For a data server service like MySQL, we always want to restart when anything goes wrong. However, for the FastAPI service, we should not set the restart policy so we can check the logs and also debug more conveniently.
  • environment — Specifies some environment variables that are not sensitive and can be exposed in the repository.
  • env_file — Specifies a file containing sensitive environment variables that should not be exposed in the repository.
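As a quick sketch of service-name resolution, we can verify from inside the FastAPI container that the hostname mysql resolves (this assumes the stack is up; the Python one-liner works because our image is based on python:3.11):

```shell
# Resolve the "mysql" service name from inside the fastapi-demo container.
docker compose exec fastapi-demo \
    python -c "import socket; print(socket.gethostbyname('mysql'))"
```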

Some comments on the top-level volumes and networks keys:

  • For volumes that should be created by Docker Compose automatically, we should skip the external: true option, which is used to specify that a pre-created volume should be used instead.
  • For networks, the driver is bridge if it’s used on a single node, and overlay on multiple ones. The latter is only used in Docker Swarm.

Now we can build or pull the images, start the containers, and check the logs with the following commands:

# Build the images.
docker compose build

# Pull the images that are not built.
docker compose pull

# Start the containers. -d is to run them in the background (detached).
docker compose up -d

# Check the logs
docker compose logs -f fastapi-demo
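When you are done, the whole stack can be stopped and cleaned up in one go:

```shell
# Stop and remove the containers and the network (volumes are kept).
docker compose down

# Also remove volumes declared in the compose file.
# External volumes like mysql8-data are never removed by this command.
docker compose down --volumes
```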

Clean up Docker

After you have been using Docker for some time, you may have a lot of dangling images, volumes, or stopped containers, which can take a lot of resources and impact the performance of your computer.

A dangling image is an image that has no associated tag and is not referenced by any container. We can safely remove dangling images with the prune command:

docker image prune

A dangling volume is a volume that is not used by any containers and can also be deleted with the prune command:

docker volume prune

Similarly, stopped containers can be removed by:

docker container prune

Finally, there is a more powerful command:

docker system prune

This command will remove all unused data.

WARNING! This will remove:
  - all stopped containers
  - all networks not used by at least one container
  - all dangling images
  - all dangling build cache

Are you sure you want to continue? [y/N] 

Note that volumes are not pruned by docker system prune by default. You need to include them explicitly with the --volumes option:

docker system prune --volumes

In this post, we have introduced the basic concepts of Docker and Docker Compose, as well as useful commands and configurations for them. The details of Dockerfile and docker-compose.yaml file are explained with simple examples. With the knowledge of this post, you will be very comfortable getting started using Docker and Docker Compose in your work.

