Multiple RUN vs. single chained RUN in Dockerfile, which is better?

Dockerfile.1 executes multiple RUN:

FROM busybox
RUN echo This is the A > a
RUN echo This is the B > b
RUN echo This is the C > c

Dockerfile.2 joins them:

FROM busybox
RUN echo This is the A > a &&\
    echo This is the B > b &&\
    echo This is the C > c

Each RUN creates a layer, so I always assumed that fewer layers is better and thus Dockerfile.2 is better.

This is obviously true when a RUN removes something added by a previous RUN (i.e. yum install nano && yum clean all), but in cases where every RUN adds something, there are a few points we need to consider:

  1. Layers are supposed to just add a diff above the previous one, so if the later layer does not remove something added in a previous one, there should not be much disk space saving advantage between both methods.

  2. Layers are pulled in parallel from Docker Hub, so Dockerfile.1, although probably slightly bigger, would theoretically get downloaded faster.

  3. If adding a 4th sentence (i.e. echo This is the D > d) and locally rebuilding, Dockerfile.1 would build faster thanks to cache, but Dockerfile.2 would have to run all 4 commands again.

So, the question: Which is a better way to do a Dockerfile?

4 Answers
4

Leave a Comment