Optimizing Docker Images for Production

Why do we need to optimize Docker images...🤓?

When it comes to running containerized applications in a production environment, optimizing your Docker images is a crucial step. Here's why: 1 2 3 4 5

  • Reduced Image Size:

    • Smaller Docker images take less time to build, push, and pull, leading to faster deployment times.
    • Smaller images require less storage space, both on the build server and in the production environment, resulting in cost savings and improved efficiency.
  • Faster Startup Times:

    • Optimized Docker images start up faster because they contain only the necessary dependencies and components.
    • Quick container startup is crucial for scalability and responsiveness in production environments.
  • Improved Security:

    • Smaller Docker images have a reduced attack surface, as they contain fewer installed packages and libraries.
    • This makes it easier to maintain and secure the production environment, with fewer potential vulnerabilities to address.
  • Efficient Resource Utilization:

    • Optimized Docker images consume fewer system resources (CPU, memory, storage) compared to larger, bloated images.
    • This allows for more efficient use of infrastructure resources, leading to cost savings and better overall performance.
  • Streamlined Deployment:

    • Smaller, optimized Docker images are easier to manage and distribute, especially in a CI/CD pipeline.
    • This simplifies the deployment process and reduces the risk of issues during the rollout of new versions.
  • Compliance and Regulations:

    • In some industries, there are regulatory requirements or guidelines that mandate the use of optimized, minimal Docker images to reduce the attack surface and improve security.
    • By optimizing your Docker images, you can ensure compliance with these standards.

By optimizing your Docker images for production, you can improve the overall efficiency, performance, and security of your containerized applications, leading to better user experiences and lower operational costs.

Requirements & Setup:

  1. A Next.js application

  2. Ensure that the app spins up successfully:

terminal
cd talk-with-data # cd <directory_name>

npm run dev

Code

  1. Create a Dockerfile for the Next.js application
Dockerfile
FROM node:18
WORKDIR /app

COPY package*.json ./

RUN npm install

COPY . .

RUN npm run build

EXPOSE 3000

CMD ["sh", "-c", "npm run start"]
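Since COPY . . copies the entire project directory, it's worth adding a .dockerignore first so that node_modules, local build output, and the .git history don't get baked into the image. A minimal sketch, assuming a standard Next.js project layout:

```shell
# Create a .dockerignore so `COPY . .` skips local artifacts
# (entries assume a standard Next.js project layout)
cat > .dockerignore <<'EOF'
node_modules
.next
.git
npm-debug.log
EOF
```

Without this, the locally installed node_modules would be copied in and then shadowed by the RUN npm install layer, inflating the build context for no benefit.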
  2. Build the image and spin up a container

Build the docker image with

terminal
docker build -t nextjs-app .

Spin up a container with

terminal
docker run -it --rm -p 3000:3000 nextjs-app
  3. Docker image before optimization

    The time taken to build the Docker image is around 134 seconds.

    Figure 1: Time taken to build the image before optimization

Execute this in your terminal to display the size of the image in bytes.

terminal
docker image inspect --format='{{.Size}}' nextjs-app

The size of the image is around 1.92GB.
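Since inspect prints raw bytes, a quick shell conversion makes the number readable. The byte count below is an example stand-in for the inspect command's output:

```shell
# Convert the byte count reported by `docker image inspect` to MB
# (example value; substitute the actual output of the inspect command)
bytes=1920000000
echo "$((bytes / 1000000)) MB"   # → 1920 MB
```

Alternatively, docker image ls nextjs-app prints a human-readable size directly.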

Figure 2: Size of the image before optimization
  4. Optimizing the Docker image 6
Dockerfile
FROM node:18-alpine AS builder
WORKDIR /app

COPY package*.json ./

RUN npm install

COPY . .

RUN npm run build

FROM node:18-alpine
WORKDIR /app

COPY --from=builder /app/package*.json ./
COPY --from=builder /app/public ./public
COPY --from=builder /app/next.config.js ./
COPY --from=builder /app/.next/standalone ./
COPY --from=builder /app/.next/static ./.next/static

EXPOSE 3000

ENTRYPOINT ["node", "server.js"]
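One caveat: the .next/standalone directory copied above only exists if Next.js standalone output mode is enabled; otherwise the build stage won't produce it. A minimal next.config.js sketch, assuming Next.js 12.2 or later where the output option is available:

```javascript
// next.config.js — enable standalone output so `next build` emits
// .next/standalone (including the server.js the ENTRYPOINT runs)
/** @type {import('next').NextConfig} */
const nextConfig = {
  output: 'standalone',
};

module.exports = nextConfig;
```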

What changes did I perform?

The changes made to optimize the Docker image are as follows:

  1. Used the node:18-alpine base image instead of node:18:

    • The node:18-alpine image is based on the Alpine Linux distribution, which is a much smaller and more lightweight Linux distribution compared to the default Debian-based node:18 image.
    • This reduces the overall size of the Docker image, leading to faster build times and smaller image size.
  2. Introduced a multi-stage build:

    • The Dockerfile now uses a multi-stage build, where the first stage (builder) is responsible for building the application, and the second stage (node:18-alpine) is used for the final image.
    • This allows you to separate the build dependencies from the runtime dependencies, resulting in a smaller final image.
  3. Copied only the necessary files:

    • In the second stage, the Dockerfile only copies the required files (package*.json, public, next.config.js, and the standalone and static directories) from the first stage, instead of copying the entire application directory.
    • In the first Dockerfile, the application is built using the full-fledged node:18 environment, which includes all the necessary build dependencies. However, these build dependencies are not needed in the final runtime environment. By using a multi-stage build, you can separate the build environment from the runtime environment, and only copy the necessary artifacts to the final image.
  4. Used the ENTRYPOINT instruction instead of CMD:

    • Both CMD and ENTRYPOINT set the command a container runs at startup, but arguments passed to docker run replace a CMD entirely, whereas with ENTRYPOINT they are appended as additional arguments.
    • Using ENTRYPOINT ensures that node server.js is always executed when the container starts, without the risk of it being accidentally overridden (it can still be replaced explicitly with the --entrypoint flag).

Docker image after optimization

The new image took around 120 seconds to build.

Figure 3: Time taken to build the image after optimization

Figure 4: Size of the Docker image after optimization

The size of the new image is 185 MB, less than 10% of the original size. 🚀
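The reduction is easy to sanity-check with shell arithmetic (sizes in MB, taken from the figures above):

```shell
# Ratio of optimized image size to original, as an integer percentage
before=1920  # ~1.92 GB before optimization, in MB
after=185    # 185 MB after optimization
echo "$((after * 100 / before))%"   # → 9%
```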

In the realm of computing, the significance of time and space cannot be overstated. Therefore, it is crucial to streamline our resources effectively to minimize both computational time and storage space utilization. - Benny Daniel

Till next time!

References:

Footnotes

  1. N. Zhao, V. Tarasov, H. Albahar, A. Anwar, L. Rupprecht, D. Skourtis, A. K. Paul, K. Chen, and A. R. Butt, "Large-Scale Analysis of Docker Images and Performance Implications for Container Storage Systems," IEEE Transactions on Parallel and Distributed Systems, 2020. DOI: 10.1109/TPDS.2020.3034517

  2. "Large-Scale Analysis of Docker Images and Performance Implications for Container Storage Systems," IEEE Xplore, Oct. 28, 2020. [Online]. Available: https://ieeexplore.ieee.org/document/9242268

  3. "Improvement of container scheduling for Docker using Ant Colony Optimization," IEEE Xplore. [Online]. Available: http://ieeexplore.ieee.org/document/7886112/

  4. "Optimizing the docker container usage based on load scheduling," IEEE Xplore. [Online]. Available: https://ieeexplore.ieee.org/document/7972269

  5. "A Docker Container Anomaly Monitoring System Based on Optimized Isolation Forest," Aug. 20, 2019. [Online]. Available: https://ieeexplore.ieee.org/document/8807263

  6. A. H. Arul, "talkwithdata," GitHub, 2023. [Online]. Available: https://github.com/arvindharul/talkwithdata. [Accessed: 06-Mar-2024].