Create a PyTorch Docker image ready for production

Given a PyTorch model, how should we put it in a Docker image, with all the related dependencies, ready to be deployed?

Nov 3, 2020 | Riccardo Padovani | [email protected]

Photo by Michael Dziedzic on Unsplash

You know the drill: your Data Science team has created an amazing PyTorch model, and now they want you to put it in production. They give you a .pt file and a preprocessing script. What now?

Luckily, AWS and Facebook have created a project called TorchServe to put PyTorch models in production, similarly to TensorFlow Serving. It is a well-crafted Docker image into which you can load your models. In this tutorial we will see how to customize the Docker image to include your model, how to install other dependencies inside it, and which configuration options are available.

We include the PyTorch model directly inside the Docker image instead of loading it at runtime; while loading it at runtime has some advantages and makes sense in some scenarios (such as testing labs where you want to try a lot of different models), I don’t think it is suitable for production. Including the model directly in the Docker image has several advantages: the image is a single, self-contained, versioned artifact, deployments and rollbacks are atomic, and there is no dependency on an external model store at startup.

Let’s now get our hands dirty and dive into what is necessary to have the Docker image running!

Building the model archive

The TorchServe Docker image needs a model archive to work: it’s an archive containing the model and some configuration files. To create it, first install TorchServe, and have a PyTorch model available somewhere on your machine.

To create this model archive, we need only one command:

torch-model-archiver --model-name <MODEL_NAME> --version <MODEL_VERSION>  --serialized-file <MODEL> --export-path <WHERE_TO_SAVE_THE_MODEL_ARCHIVE>

There are four options we need to specify in this command:

- --model-name: the name under which the model will be served (it also determines the name of the .mar file);
- --version: the version of the model, useful to keep track of which iteration is deployed;
- --serialized-file: the path to the serialized .pt model file;
- --export-path: the directory where the model archive will be saved.

Putting it all together, the command should be something similar to:

torch-model-archiver --model-name predict_the_future --version 1.0 --serialized-file ~/models/predict_the_future --export-path model-store/

After having run it, we now have a file with the .mar extension, the first step toward putting our PyTorch model in production! .mar files are actually just .zip files with a different extension, so feel free to open one and inspect it to see how it works behind the scenes.

Some pre-processing is probably necessary before invoking the model. If this is the case, we can create a handler file containing all the necessary instructions. This file can have external dependencies, so we can put an entire application in front of our model.
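As a sketch of what such a handler can look like: TorchServe accepts a module-level handle(data, context) entry point. Everything below is illustrative — the payload parsing and the echoed response stand in for real pre-processing and inference:

```python
# handler.py — minimal sketch of a TorchServe custom handler.
# TorchServe calls handle(data, context) for every request batch.
import json


def preprocess(data):
    # Requests arrive as a list of dicts; the raw payload sits under
    # "body" or "data" depending on how the request was sent.
    row = data[0]
    payload = row.get("body") or row.get("data")
    if isinstance(payload, (bytes, bytearray)):
        payload = json.loads(payload)
    return payload


def handle(data, context):
    # TorchServe may call the handler with no data during model load.
    if data is None:
        return None
    inputs = preprocess(data)
    # Placeholder for real inference: a production handler would load
    # the model from context.system_properties["model_dir"] and run it.
    # Return one result per request in the batch.
    return [{"echo": inputs}]
```

A real handler would replace the placeholder with the actual model invocation; the structure (parse the request list, run inference, return one result per request) stays the same.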

To include the handler file in the model archive, we only need to add the --handler flag to the command above, like this:

torch-model-archiver --model-name predict_the_future --version 1.0 --serialized-file ~/models/predict_the_future --export-path model-store/ --handler handler.py

Create the Docker image

Now that we have the model archive, we can include it in the TorchServe Docker image. Besides the model archive, we also need to create a configuration file to tell TorchServe where to find the models to load automatically at startup.

We need a config.properties file similar to the one below; later in this tutorial we will see what these lines mean, and what other options are available.

inference_address=http://0.0.0.0:8080
management_address=http://0.0.0.0:8081
number_of_netty_threads=32
job_queue_size=1000
model_store=/home/model-server/model-store

Docker image with just the model

If we only need to include the model archive and the config file, the Dockerfile is quite straightforward: we just copy the files, and everything else is managed by TorchServe itself. Our Dockerfile will thus be:

FROM pytorch/torchserve as production

COPY config.properties /home/model-server/config.properties
COPY predict_the_future.mar /home/model-server/model-store

TorchServe already includes torch, torchvision, torchtext, and torchaudio, so there is no need to add them. To see the current versions of these libraries, see the requirements file in the TorchServe repository on GitHub.
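Once the image is built and the container is running with ports 8080 and 8081 published, TorchServe answers health checks on GET /ping and serves predictions under /predictions/&lt;model-name&gt; on the inference port. A minimal smoke-test sketch, assuming the container runs on localhost (the port matches the inference_address in our config.properties):

```python
# Sketch of a smoke test against a running TorchServe container.
import json
import urllib.request


def prediction_url(host, model_name):
    # Predictions are served on the inference port (8080 in our
    # config.properties) under /predictions/<model-name>.
    return f"http://{host}:8080/predictions/{model_name}"


def ping(host):
    # TorchServe reports {"status": "Healthy"} when it is ready.
    with urllib.request.urlopen(f"http://{host}:8080/ping") as resp:
        return json.load(resp)
```

With the container running, `ping("localhost")` should report a healthy status, and a POST with a JSON body to `prediction_url("localhost", "predict_the_future")` invokes our model.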

Docker image with the model and external dependencies

What if our Python handler needs additional Python dependencies?

In this case, we want to use a two-stage Docker build: in the first stage we install our dependencies, and then we copy them over to the final image. We list our dependencies in a file called requirements.txt, and we use pip, the package installer for Python, to install them. The pip documentation about the format of the requirements file is very complete.
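As an example, a hypothetical requirements.txt for a handler that manipulates images could look like this (the packages and the pinned versions are purely illustrative):

```
Pillow==8.0.1
numpy==1.19.4
```

Pinning exact versions keeps the Docker build reproducible: the same requirements.txt always produces the same set of installed packages.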

The Dockerfile is now something like this:

ARG BASE_IMAGE=ubuntu:18.04

# Compile image loosely based on pytorch compile image
FROM ${BASE_IMAGE} AS compile-image
ENV PYTHONUNBUFFERED TRUE

# Install Python and pip, and build-essentials if some requirements need to be compiled
RUN apt-get update && \
    DEBIAN_FRONTEND=noninteractive apt-get install --no-install-recommends -y \
    python3-dev \
    python3-distutils \
    python3-venv \
    curl \
    build-essential \
    && rm -rf /var/lib/apt/lists/* \
    && cd /tmp \
    && curl -O https://bootstrap.pypa.io/get-pip.py \
    && python3 get-pip.py

RUN python3 -m venv /home/venv

ENV PATH="/home/venv/bin:$PATH"

RUN update-alternatives --install /usr/bin/python python /usr/bin/python3 1
RUN update-alternatives --install /usr/local/bin/pip pip /usr/local/bin/pip3 1

# The part above is cached by Docker for future builds
# We can now copy the requirements file from the local system
# and install the dependencies
COPY requirements.txt .

RUN pip install --no-cache-dir -r requirements.txt

FROM pytorch/torchserve as production

# Copy dependencies after having built them
COPY --from=compile-image /home/venv /home/venv

# We use curl for health checks on AWS Fargate
USER root
RUN apt-get update && \
    DEBIAN_FRONTEND=noninteractive apt-get install --no-install-recommends -y \
    curl \
    && rm -rf /var/lib/apt/lists/*

USER model-server

COPY config.properties /home/model-server/config.properties
COPY predict_the_future.mar /home/model-server/model-store

If PyTorch is among the dependencies, we should change the line that installs the requirements from

RUN pip install --no-cache-dir -r requirements.txt

to

RUN pip install --no-cache-dir -r requirements.txt -f https://download.pytorch.org/whl/torch_stable.html

In this way we use the pre-built Python wheels for PyTorch instead of building the packages from scratch: it is faster and requires fewer resources, making it suitable also for small CI/CD systems.
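For example, assuming a CPU-only image, the PyTorch entries in requirements.txt can pin the pre-built CPU wheels hosted at that index (the versions below are illustrative, current at the time of writing):

```
torch==1.7.0+cpu
torchvision==0.8.1+cpu
```

The `+cpu` suffix selects the CPU-only builds, which are much smaller than the CUDA-enabled ones — a good fit when inference runs on CPU instances.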

Configuring the Docker image

We created a configuration file above, but what does it do? Going through all the possible configuration options here would be impossible, so I refer you to the official documentation. Among the other things explained there, there is a way to configure Cross-Origin Resource Sharing (necessary to use the model as an API over the web), a guide on how to enable SSL, and much more.

There is one set of configuration parameters in particular I’d like to focus on: the ones related to logging. First, for production environments, I suggest setting async_logging to true: it can delay the output a bit, but allows a higher throughput. Then, it’s important to notice that by default TorchServe logs every message, including the ones with severity DEBUG. In production we probably don’t want this, especially because it can become quite verbose.
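Enabling asynchronous logging is a single extra line in config.properties:

```
async_logging=true
```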

To override the default behavior, we need to create a new file called log4j.properties. For more information on every possible option, I suggest getting familiar with the official log4j guide. To start, copy the default TorchServe configuration, and increase the severity of the printed messages. In particular, change

log4j.logger.org.pytorch.serve = DEBUG, ts_log
log4j.logger.ACCESS_LOG = INFO, access_log

to

log4j.logger.org.pytorch.serve = WARN, ts_log
log4j.logger.ACCESS_LOG = WARN, access_log

We also need to copy this new file into the Docker image, so add the logging config right after the config file:

COPY config.properties /home/model-server/config.properties
COPY log4j.properties /home/model-server/log4j.properties

We need to inform TorchServe about this new config file, and we do so by adding a line to config.properties:

vmargs=-Dlog4j.configuration=file:///home/model-server/log4j.properties

We now have a fully functional TorchServe Docker image with our custom model, ready to be deployed!

For any question, comment, feedback, criticism, or suggestion on how to improve my English, reach me on Twitter (@rpadovani93) or drop an email at [email protected].

Ciao,
R.