# Containerize a Model
A convenient way of [deploying a model](../howtos/deploy_on_server.md) is to
package it up in a Docker container. Thanks to bazel, this is really easy to
do: you just have to append a few lines to your model's `BUILD.bazel`. Here is
how it's done.

**Note:** This walkthrough works with your installed container runtime, no
matter if it's **Docker or e.g. Podman**. Also, we'll create images in the open
[OCI](https://github.com/opencontainers/image-spec) image format.

Let's try containerizing our [first model](../tutorials/write_first_model.md),
as it doesn't need any additional weights files. We'll see
[down below](#adding-weights-and-data) how to add those. We'll also see how to
add GPU/TPU support for our container there.
Bazel creates images from tar archives. The steps required for containerization
are:

1. Let bazel create a manifest for the tar file to come.
2. Let bazel create a tar archive of everything needed for the model to run.
   - See also: [Deploying Models on a Server](../howtos/deploy_on_server.md),
     where we prepare a tar file, copy it to a remote GPU server, and run it
     there.
3. Let bazel create a container image for Linux x86_64.
4. Let bazel load the image _(OPTIONAL)_.
5. Let bazel push the image straight to the Docker registry.
6. [Add weights and data](#adding-weights-and-data) and GPU/TPU support
   _(OPTIONAL)_.

**Note:** Every tar archive we create (one in this example) becomes its own
layer in the container image.
## Dockerizing our first model
We need to add a few "imports" at the beginning of our `BUILD.bazel` so we can
use their rules to define our 5 additional targets:
```python
load("@aspect_bazel_lib//lib:tar.bzl", "mtree_spec", "tar")
load("@aspect_bazel_lib//lib:transitions.bzl", "platform_transition_filegroup")
load("@rules_oci//oci:defs.bzl", "oci_image", "oci_load", "oci_push")
load("@zml//bazel:zig.bzl", "zig_cc_binary")

zig_cc_binary(
    name = "simple_layer",
    main = "main.zig",
    deps = [
        "@zml//async",
        "@zml//zml",
    ],
)
```
### 1. The Manifest
To get started, let's make bazel generate a manifest that will be used when
creating the TAR archive.
```python
# Manifest created from the simple_layer binary and friends
mtree_spec(
    name = "mtree",
    srcs = [":simple_layer"],
)
```
It is as easy as that: we define that we want everything needed for our binary
to be included in the manifest.
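For orientation: the generated manifest is in libarchive's `mtree` format, one
line per entry with its path and attributes. An illustrative (not verbatim)
excerpt could look like this, with the `content=` path pointing at the built
artifact:

```
simple_layer/simple_layer uid=0 gid=0 mode=0755 type=file content=bazel-out/.../simple_layer
```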
### 2. The TAR
Creating the TAR archive is equally easy; it's just a few more lines of bazel:
```python
# Create a tar archive from the above manifest
tar(
    name = "archive",
    srcs = [":simple_layer"],
    args = [
        "--options",
        "zstd:compression-level=9",
    ],
    compress = "zstd",
    mtree = ":mtree",
)
```
Note that we specify high **zstd** compression, which serves two purposes: it
keeps the tar files small, and it also makes them quick to extract.
### 3. The Image
Creating the actual image is a two-step process:

- First, we use a rule that creates an
  [OCI](https://github.com/opencontainers/image-spec) image (open image
  format).
- Second, we force the actual OCI image to always be built for Linux x86_64,
  regardless of the host we're building the image **on**.
```python
# The actual docker image, with entrypoint, created from tar archive
oci_image(
    name = "image_",
    base = "@distroless_cc_debian12",
    entrypoint = ["./{}/simple_layer".format(package_name())],
    tars = [":archive"],
)
```
See how we use string interpolation to fill in the folder name for the
container's entrypoint? If your `BUILD.bazel` lives in `examples/simple_layer`,
`package_name()` expands the entrypoint to
`./examples/simple_layer/simple_layer`.

Next, we use a transition rule to force the container to be built for
Linux x86_64:
```python
# We always want to create the image for Linux
platform_transition_filegroup(
    name = "image",
    srcs = [":image_"],
    target_platform = "@zml//platforms:linux_amd64",
)
```
And that's almost it! You can already build the image:
```
bazel build --config=release //simple_layer:image
INFO: Analyzed target //simple_layer:image (1 packages loaded, 8 targets configured).
INFO: Found 1 target...
Target //simple_layer:image up-to-date:
bazel-out/k8-dbg-ST-f832ad0148ae/bin/simple_layer/image_
INFO: Elapsed time: 0.279s, Critical Path: 0.00s
INFO: 1 process: 1 internal.
INFO: Build completed successfully, 1 total action
```
... and inspect `./bazel-out`. Bazel tells you the exact path to the `image_`.
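What bazel built there is an OCI image layout directory. As a rough sketch of
what to expect (file names per the OCI image layout spec; the JSON contents
below are minimal placeholders, not what bazel actually writes), such a layout
looks like this:

```shell
# Minimal sketch of an OCI image layout directory, similar in shape to the
# `image_` output under bazel-out (contents are placeholders)
mkdir -p oci-demo/blobs/sha256

# Marker file identifying the directory as an OCI layout
printf '{"imageLayoutVersion": "1.0.0"}' > oci-demo/oci-layout

# Top-level index that references the image manifest(s) by digest
printf '{"schemaVersion": 2, "manifests": []}' > oci-demo/index.json

ls oci-demo   # blobs  index.json  oci-layout
```

The actual config, manifest, and layer blobs all live under `blobs/sha256/`,
addressed by their digest.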
### 4. The Load
While inspecting the image is surely interesting, we usually want to load the
image so we can run it.
There is a bazel rule for that: `oci_load`. When we append the following lines
to `BUILD.bazel`:
```python
# Load will immediately load the image (eg: docker load)
oci_load(
    name = "load",
    image = ":image",
    repo_tags = [
        "distroless/simple_layer:latest",
    ],
)
```
... then we can load the image and run it with the following commands:
```
bazel run --config=release //simple_layer:load
docker run --rm distroless/simple_layer:latest
```
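After loading, you can verify that the image landed in your local image store
(a sketch; substitute your container runtime for `docker` if needed):

```
docker image ls distroless/simple_layer
```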
### 5. The Push
We just need to add one more target to the build file before we can push the
image to a container registry:
```python
# Bazel target for pushing the Linux image to the docker registry
oci_push(
    name = "push",
    image = ":image",
    remote_tags = ["latest"],
    # override with -- --repository foo.bar/org/image
    repository = "index.docker.io/renerocksai/simple_layer",
)
```
This will push the `simple_layer` image with the tag `latest` (you can add more)
to the docker registry:
```
bazel run --config=release //simple_layer:push
```
When dealing with both a public and a private container registry, or if you
just want to try it out **right now**, you can always override the repository
on the command line:
```
bazel run --config=release //simple_layer:push -- --repository my.server.com/org/image
```
## Adding weights and data
Dockerizing a model that doesn't need any weights was easy. But what if you want
to create a complete care-free package of a model plus all required weights and
supporting files?
We'll use the [MNIST
example](https://github.com/zml/zml/tree/master/examples/mnist) to illustrate
how to build Docker images that also contain data files.
You can `bazel run --config=release //mnist:push -- --repository
index.docker.io/my_org/zml_mnist` in the `./examples` folder if you want to try
it out.
**Note:** Please add one or more of the following parameters to specify all the
platforms your containerized model should support.
- NVIDIA CUDA: `--@zml//runtimes:cuda=true`
- AMD RoCM: `--@zml//runtimes:rocm=true`
- Google TPU: `--@zml//runtimes:tpu=true`
- AWS Trainium/Inferentia 2: `--@zml//runtimes:neuron=true`
- **AVOID CPU:** `--@zml//runtimes:cpu=false`
**Example:**
```
bazel run //mnist:push --config=release --@zml//runtimes:cuda=true -- --repository index.docker.io/my_org/zml_mnist
```
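Once a CUDA-enabled image is pushed, it can be run on a GPU host. A sketch,
assuming the NVIDIA Container Toolkit is installed on the host and using the
repository name from the example above:

```
docker run --rm --gpus all index.docker.io/my_org/zml_mnist:latest
```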
### Manifest and Archive
We only add one more target to the `BUILD.bazel` to construct the commandline
for the `entrypoint` of the container. All other steps basically remain the
same.
Let's start with creating the manifest and archive:
```python
load("@aspect_bazel_lib//lib:expand_template.bzl", "expand_template")
load("@aspect_bazel_lib//lib:tar.bzl", "mtree_spec", "tar")
load("@aspect_bazel_lib//lib:transitions.bzl", "platform_transition_filegroup")
load("@rules_oci//oci:defs.bzl", "oci_image", "oci_load", "oci_push")
load("@zml//bazel:zig.bzl", "zig_cc_binary")
# The executable
zig_cc_binary(
    name = "mnist",
    args = [
        "$(location @com_github_ggerganov_ggml_mnist//file)",
        "$(location @com_github_ggerganov_ggml_mnist_data//file)",
    ],
    data = [
        "@com_github_ggerganov_ggml_mnist//file",
        "@com_github_ggerganov_ggml_mnist_data//file",
    ],
    main = "mnist.zig",
    deps = [
        "@zml//async",
        "@zml//zml",
    ],
)

# Manifest created from the executable (incl. its data: weights and dataset)
mtree_spec(
    name = "mtree",
    srcs = [":mnist"],
)

# Create a tar archive from the above manifest
tar(
    name = "archive",
    srcs = [":mnist"],
    args = [
        "--options",
        "zstd:compression-level=9",
    ],
    compress = "zstd",
    mtree = ":mtree",
)
```
### Entrypoint
Our container entrypoint commandline is not just the name of the executable
anymore, as we need to pass the weights file and the test dataset to MNIST. A
simple string interpolation will not be enough.
For this reason, we use the `expand_template` rule, like this:
```python
# A convenience template for creating the "command line" for the entrypoint
expand_template(
    name = "entrypoint",
    data = [
        ":mnist",
        "@com_github_ggerganov_ggml_mnist//file",
        "@com_github_ggerganov_ggml_mnist_data//file",
    ],
    substitutions = {
        ":model": "$(rlocationpath @com_github_ggerganov_ggml_mnist//file)",
        ":data": "$(rlocationpath @com_github_ggerganov_ggml_mnist_data//file)",
    },
    template = [
        "./{}/mnist".format(package_name()),
        "./{}/mnist.runfiles/:model".format(package_name()),
        "./{}/mnist.runfiles/:data".format(package_name()),
    ],
)
```
- `data`, which is identical to `data` in the `mnist` target used for running
  the model, tells bazel which files are needed.
- In `substitutions`, we define what `:model` and `:data` need to be replaced
  with.
- In `template`, we construct the actual entrypoint command line.
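Assuming the package is `mnist`, the expanded entrypoint therefore ends up as a
list roughly like the following (the angle-bracket placeholders stand in for
the actual `rlocationpath` values, which depend on the external repository
layout):

```
[
    "./mnist/mnist",
    "./mnist/mnist.runfiles/<rlocationpath of the model file>",
    "./mnist/mnist.runfiles/<rlocationpath of the data file>",
]
```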
### Image, Push
From here on, everything is analogous to the `simple_layer` example, with one
exception: in the `image_` target, we don't fill in the `entrypoint` directly,
but use the expanded template, which we conveniently named `entrypoint` above.
```python
# The actual docker image, with entrypoint, created from tar archive
oci_image(
    name = "image_",
    base = "@distroless_cc_debian12",
    # the entrypoint comes from the expand_template rule `entrypoint` above
    entrypoint = ":entrypoint",
    tars = [":archive"],
)

# We always want to create the image for Linux
platform_transition_filegroup(
    name = "image",
    srcs = [":image_"],
    target_platform = "@zml//platforms:linux_amd64",
)

# Load will immediately load the image (eg: docker load)
oci_load(
    name = "load",
    image = ":image",
    repo_tags = [
        "distroless/mnist:latest",
    ],
)

# Bazel target for pushing the Linux image to our docker registry
oci_push(
    name = "push",
    image = ":image",
    remote_tags = ["latest"],
    # override with -- --repository foo.bar/org/image
    repository = "index.docker.io/steeve/mnist",
)
```
And that's it! With one simple bazel command, you can push a neatly packaged
MNIST model, including weights and dataset, to the docker registry:
```
bazel run //mnist:push --@zml//runtimes:cuda=true -- --repository index.docker.io/my_org/zml_mnist
```