Dockerfile Implementations.

Krupakar Reddy
Mar 5, 2022

A Dockerfile is a text file consisting of commands and instructions used to build an image. The image is a packaged application that can be moved to various platforms and run there; this approach is called containerization.

When building an image, the goals are a small image size and good performance. This article describes some good practices that help achieve both.

In a Dockerfile, instructions that change the filesystem create a layer, and each such layer adds to the size of the image.

Terms to Understand:

  1. Build context: The set of files present at the specified location or path. These files are sent to the Docker daemon when the docker build command runs, so it can use them in the build of the image.
  2. Docker daemon: The daemon is the part of Docker Engine that manages Docker objects such as images, volumes, networks, and containers.
  3. To improve the build speed of Docker images, we can exclude from the build context anything that is not required to build the image.
  4. .dockerignore: This file is used to exclude files that are unnecessary in the container. It is similar to .gitignore. Unwanted file or directory names are listed in .dockerignore so that they are left out of the context, which reduces the size of the context during the image build.
  5. .dockerignore helps to speed up builds, and the file should be placed in the root directory of the build context.
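To illustrate how the build context and .dockerignore interact, here is a minimal sketch. The .dockerignore entries, the alpine:3.15 tag, and the /app path are assumptions chosen for illustration, not values from a real project.

```dockerfile
# Assumed .dockerignore in the root of the build context (hypothetical entries):
#   .git
#   *.log
#   tmp/
# Files matching these patterns are not sent to the Docker daemon.

FROM alpine:3.15

WORKDIR /app

# COPY only sees what remains of the context after .dockerignore is applied,
# so .git, log files, and tmp/ never reach the image.
COPY . /app
```

The image would be built with docker build pointed at the directory that holds both the Dockerfile and the .dockerignore file.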

Reducing the layers:

→ In a Dockerfile, only instructions such as RUN, COPY, and ADD create layers. Other instructions create temporary intermediate images and do not increase the size of the image.

→ Reducing the number of these instructions in a Dockerfile reduces the number of build steps and layers. This can be done by using a single RUN instruction that chains multiple commands.
→ The MAINTAINER and CMD instructions do not generate layers. (MAINTAINER is deprecated in favor of LABEL.)

Include Multi-line arguments in Dockerfile:

→ Long lines are hard to read through to the end. With multi-line arguments we can break a long line into several lines, which improves readability. Docker still treats the broken lines as a single line.

→ Instead of repeating the same instruction on several lines, use a single instruction and end each line with a backslash ( \ ). The backslash tells Docker that the instruction continues on the next line.

This helps avoid duplication of packages, makes the list easier to update, and makes PRs a lot easier to read and review.

Example:

→ Try to avoid multiple RUN instructions wherever possible; this improves image size and build speed. For example, if we need to create a directory and create a user, we should write RUN mkdir <directory_name> && useradd <user_name> instead of using two separate RUN instructions.

Commands in Dockerfile:

FROM: Used to pull a base image. It is better to use current official Alpine-based Docker images, which reduce the size of the image.

RUN: Used to install packages or to execute commands.

When to use “ \ ” and “ && \ ”:

When one command takes several arguments of the same kind, ‘ \ ’ alone is enough to continue the line. ‘ && \ ’ is used to chain different commands within the same RUN instruction.

Example:

Use of ‘ \ ’
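A minimal sketch of this usage. The centos:7 base image is assumed only because the article uses yum, and the package names are illustrative; substitute the yum-based image and packages you actually use.

```dockerfile
# Assumed yum-based base image; not prescribed by the article.
FROM centos:7

# One RUN instruction, one command (yum install), several packages.
# Each trailing backslash continues the same command on the next line.
RUN yum install -y \
    curl \
    git \
    unzip \
    wget
```

Keeping one package per line in alphabetical order makes duplicates easy to spot.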

In the above example, a single RUN instruction is used to install multiple packages. The install operation is the same for every package, so ‘ \ ’ alone is used.

Example:

Use of ‘ && \ ’
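A minimal sketch of this usage. The centos:7 base image, the /opt/app directory, and the appuser name are hypothetical values chosen for illustration.

```dockerfile
# Assumed base image that ships the useradd command; not prescribed by the article.
FROM centos:7

# Two different commands in one RUN instruction.
# '&&' runs useradd only if mkdir succeeded; '\' continues the line.
RUN mkdir -p /opt/app && \
    useradd appuser
```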

In the above example, one RUN instruction is used both to create a directory and to add a user. Because mkdir and useradd are two different commands in the same RUN instruction, we use ‘ && \ ’.

Update Machines:

When we update the system packages before installing anything, the update should be combined with the install in the same instruction. On a RHEL-based OS, for example, the update command is yum update.

→ Using yum update alone in a RUN statement can cause caching issues, and subsequent yum install instructions may install outdated packages or fail.

→ These caching issues arise because Docker sees the initial instructions as identical to those of a previous build and reuses the cached layers. The yum update step is then taken from the cache instead of being executed again, so the build can get outdated package versions.

→ To avoid these caching issues and build failures, we run the update and install commands in the same instruction, such as RUN yum update -y && yum install -y <package_name>, which makes the Dockerfile install the latest package versions. This is known as “cache busting”.

→ We can also specify versions for the packages we install. This forces the build to retrieve that particular version regardless of what is in the cache, which is called “version pinning”.

Example:
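A minimal sketch combining cache busting and version pinning. The centos:7 base image and the exact version string python-2.7.5 are assumptions; the version that can be pinned depends on what the configured repositories provide.

```dockerfile
# Assumed yum-based base image; not prescribed by the article.
FROM centos:7

# Update and install in the same RUN instruction so both steps are
# cached (or re-run) together. Changing the pinned version changes the
# instruction text, which invalidates the cached layer.
RUN yum update -y && \
    yum install -y \
    python-2.7.5 \
    wget
```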

The python package is pinned to a 2.7 version. If the image previously used an older version, specifying the new one busts the cache for the yum update step and installs the new version.

ADD: This instruction can be used for several activities, such as copying files from the local machine into the container. If the local file is a tar archive, ADD copies it and extracts it automatically.

ADD can also download a file from a remote location when given a URL.

COPY: This instruction copies files from a local directory into the container. COPY is preferred over ADD because it is more transparent about what it does. ADD is the better choice when an archive needs to be extracted.
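The difference can be sketched as follows. The file names app.conf and site.tar.gz and the destination paths are hypothetical, and both files are assumed to exist in the build context.

```dockerfile
FROM alpine:3.15

# COPY: plain copy of a file from the build context into the image.
COPY app.conf /etc/app/app.conf

# ADD: a local tar archive is copied and unpacked into the destination directory.
ADD site.tar.gz /var/www/
```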

In a Dockerfile, instructions such as EXPOSE and ENV, whose values do not change frequently (static values), can be placed near the top of the file. This speeds up builds, since those steps stay cached, and helps prevent mistakes such as package duplication.

Example:
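A minimal sketch of this ordering. The APP_HOME variable, port 8080, and the alpine:3.15 tag are hypothetical values chosen for illustration.

```dockerfile
FROM alpine:3.15

# Rarely changing settings first, so their cached steps are reused.
ENV APP_HOME=/opt/app
EXPOSE 8080

# Steps that change often (application files) come later.
WORKDIR ${APP_HOME}
COPY . .
```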

In the above example, EXPOSE and ENV are placed at the top of the Dockerfile.
