You know the drill: your Data Science team has created an amazing PyTorch model, and now they want you to put it in production. They give you a .pt file and a preprocessing script. What now?
Luckily, AWS and Facebook have created a project called TorchServe to put PyTorch models in production, similarly to TensorFlow Serving. It is a well-crafted Docker image where you can upload your models. In this tutorial, we will see how to customize the Docker image to include your model, how to install other dependencies inside it, and which configuration options are available.
We include the PyTorch model directly inside the Docker image, instead of loading it at runtime. While loading it at runtime has some advantages and makes sense in some scenarios (such as testing labs where you want to try a lot of different models), I don’t think it is suitable for production. Including the model directly in the Docker image has several advantages:
if you use CI/CD you can achieve reproducible builds;
to spawn a new instance serving your model, you only need access to your Docker registry, and not also to a storage solution holding the model;
you need to authenticate only to your Docker registry, and not to the storage solution;
it makes it easier to keep track of what has been deployed, because you only have to check the Docker image version, and not the model version. This is especially important if you have a cluster of instances serving your model.
Let’s now get our hands dirty and dive into what is necessary to have the Docker image running!
Building the model archive
The TorchServe Docker image needs a model archive to work: it’s a file containing the model and some configuration files. To create it, first install TorchServe locally, and have a PyTorch model available somewhere on your PC.
To create this model archive, we need only one command:
torch-model-archiver --model-name <MODEL_NAME> --version <MODEL_VERSION> --serialized-file <MODEL> --export-path <WHERE_TO_SAVE_THE_MODEL_ARCHIVE>
There are four options we need to specify in this command:
MODEL_NAME is an identifier to recognize the model; we can use whatever we want here. It’s useful when we include multiple models inside the same Docker image, a nice feature of TorchServe that we won’t cover for now;
MODEL_VERSION is used to identify, as the name implies, the version of the model;
MODEL is the path, on the local PC, of the .pt file acting as the model;
WHERE_TO_SAVE_THE_MODEL_ARCHIVE is a local directory where TorchServe will put the model archive it generates.
Putting it all together, the command should be something similar to:
torch-model-archiver --model-name predict_the_future --version 1.0 --serialized-file ~/models/predict_the_future --export-path model-store/
After having run it, we have a file with the .mar extension, the first step toward putting our PyTorch model in production!
Some pre-processing before invoking the model is probably necessary. If this is the case, we can create a handler file where we put all the necessary instructions. This file can have external dependencies, so we can code an entire application in front of our model.
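To make this more concrete, here is a minimal handler sketch based on TorchServe’s BaseHandler class. The class name, the expected JSON payload, and the tensor conversion are assumptions made for this example, not requirements of TorchServe: adapt them to your own model.

# handler.py — a minimal sketch of a custom handler (illustrative assumptions inside)
import torch
from ts.torch_handler.base_handler import BaseHandler


class PredictTheFutureHandler(BaseHandler):
    # BaseHandler loads the model and wires handle() to
    # preprocess -> inference -> postprocess, so we only override the edges.

    def preprocess(self, data):
        # TorchServe passes a batch of requests; each item exposes the payload
        # under "data" or "body". Here we assume a JSON body like
        # {"data": [1.0, 2.0, 3.0]}.
        rows = []
        for row in data:
            payload = row.get("data") or row.get("body")
            rows.append(payload["data"])
        return torch.tensor(rows, dtype=torch.float32)

    def postprocess(self, inference_output):
        # Return one JSON-serializable result per request in the batch.
        return inference_output.tolist()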
To include the handler file in the model archive, we need only to add the --handler flag to the command above, like this:
torch-model-archiver --model-name predict_the_future --version 1.0 --serialized-file ~/models/predict_the_future --export-path model-store/ --handler handler.py
Creating the Docker image
Now that we have the model archive, we can include it in the TorchServe Docker image. Other than the model archive, we need to create a configuration file as well, to tell TorchServe which model to load automatically at startup.
We need a config.properties file similar to the following. Later in this tutorial we will see what these lines mean, and what other options are available.
inference_address=http://0.0.0.0:8080
management_address=http://0.0.0.0:8081
number_of_netty_threads=32
job_queue_size=1000
model_store=/home/model-server/model-store
Docker image with just the model
If we need to include just the model archive and the config file, the Dockerfile is quite straightforward: we just have to copy the files, and everything else will be managed by TorchServe itself. Our Dockerfile will thus be:
FROM pytorch/torchserve as production

COPY config.properties /home/model-server/config.properties
COPY predict_the_future.mar /home/model-server/model-store
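To verify the image locally, a quick smoke test could look like the following. The image tag and sample_input.json are placeholders I made up for this example, while /ping and /predictions/<MODEL_NAME> come from the TorchServe inference API:

# Build and run the image (ports match the config.properties above)
docker build -t predict-the-future:1.0 .
docker run --rm -p 8080:8080 -p 8081:8081 predict-the-future:1.0

# In another terminal: health check, then a test prediction
curl http://localhost:8080/ping
curl -X POST http://localhost:8080/predictions/predict_the_future -T sample_input.json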
TorchServe already includes torch, torchvision, torchtext, and torchaudio, so there is no need to add them. To see the current version of these libraries, please see the requirements file of TorchServe on GitHub.
Docker image with the model and external dependencies
What if we need additional Python dependencies for our Python handler?
In this case, we want to use a two-stage Docker build: in the first stage we build our dependencies, and then we copy them over to the final image. We list our dependencies in a file called requirements.txt, and we use pip, the package installer for Python, to install them. The pip documentation about the format of the requirements file is very complete.
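As an example, a requirements.txt for a handler that does some extra pre-processing might look like this; the packages listed here are only placeholders for whatever your handler actually imports:

# requirements.txt — placeholder dependencies for the handler
numpy
pillow
requests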
The Dockerfile is now something like this:
ARG BASE_IMAGE=ubuntu:18.04

# Compile image loosely based on pytorch compile image
FROM ${BASE_IMAGE} AS compile-image
ENV PYTHONUNBUFFERED TRUE

# Install Python and pip, and build-essential if some requirements need to be compiled
RUN apt-get update && \
    DEBIAN_FRONTEND=noninteractive apt-get install --no-install-recommends -y \
    python3-dev \
    python3-distutils \
    python3-venv \
    curl \
    build-essential \
    && rm -rf /var/lib/apt/lists/* \
    && cd /tmp \
    && curl -O https://bootstrap.pypa.io/get-pip.py \
    && python3 get-pip.py

RUN python3 -m venv /home/venv

ENV PATH="/home/venv/bin:$PATH"

RUN update-alternatives --install /usr/bin/python python /usr/bin/python3 1
RUN update-alternatives --install /usr/local/bin/pip pip /usr/local/bin/pip3 1

# The part above is cached by Docker for future builds
# We can now copy the requirements file from the local system
# and install the dependencies
COPY requirements.txt .

RUN pip install --no-cache-dir -r requirements.txt

FROM pytorch/torchserve as production

# Copy dependencies after having built them
COPY --from=compile-image /home/venv /home/venv

# We use curl for health checks on AWS Fargate
USER root
RUN apt-get update && \
    DEBIAN_FRONTEND=noninteractive apt-get install --no-install-recommends -y \
    curl \
    && rm -rf /var/lib/apt/lists/*

USER model-server

COPY config.properties /home/model-server/config.properties
COPY predict_the_future.mar /home/model-server/model-store
If PyTorch is among the dependencies, we should change the line that installs the requirements from
RUN pip install --no-cache-dir -r requirements.txt
to
RUN pip install --no-cache-dir -r requirements.txt -f https://download.pytorch.org/whl/torch_stable.html
In this way, we will use the pre-built Python packages for PyTorch instead of building them from scratch: it will be faster and require fewer resources, making it suitable also for small CI/CD systems.
Configuring the Docker image
We created a configuration file above, but what does it do? Of course, going through all the possible configuration options would be impossible, so I leave here the link to the documentation. Among the other things explained there, there is a way to configure Cross-Origin Resource Sharing (necessary to use the model as an API over the web), a guide on how to enable SSL, and much more.
There is a set of configuration parameters in particular I’d like to focus on: the ones related to logging. First, for production environments, I suggest setting async_logging to true: it could delay the output a bit, but it allows a higher throughput. Then, it’s important to notice that by default TorchServe captures every message, including the ones with severity DEBUG. In production, we probably don’t want this, especially because it can become quite verbose.
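Enabling async logging should be a single extra line in the config.properties we created earlier:

async_logging=true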
To override the default behavior, we need to create a new file called log4j.properties. For more information on every possible option, I suggest getting familiar with the official log4j guide. To start, copy the default TorchServe configuration, and increase the severity of the printed messages. In particular, change
log4j.logger.org.pytorch.serve = DEBUG, ts_log
log4j.logger.ACCESS_LOG = INFO, access_log
to
log4j.logger.org.pytorch.serve = WARN, ts_log
log4j.logger.ACCESS_LOG = WARN, access_log
We also need to copy this new file into the Docker image, so copy the logging config just after the config file:
COPY config.properties /home/model-server/config.properties
COPY log4j.properties /home/model-server/log4j.properties
We need to inform TorchServe about this new config file, and we do so by adding a line to config.properties:
vmargs=-Dlog4j.configuration=file:///home/model-server/log4j.properties
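Putting together everything we have touched in this tutorial, the final config.properties could look like this (the async_logging line is the optional one discussed above):

inference_address=http://0.0.0.0:8080
management_address=http://0.0.0.0:8081
number_of_netty_threads=32
job_queue_size=1000
model_store=/home/model-server/model-store
async_logging=true
vmargs=-Dlog4j.configuration=file:///home/model-server/log4j.properties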
We now have a fully functional TorchServe Docker image, with our custom model, ready to be deployed!
For any question, comment, feedback, criticism, or suggestion on how to improve my English, leave a comment below, or drop an email at [email protected].
Ciao,
R.