Docker for Machine Learning¶
Docker standardizes ML workflows by eliminating "works on my machine" problems across data collection, experimentation, training, evaluation, deployment, and monitoring stages.
Key Benefits for ML¶
- Reproducible experiments with pinned dependency versions
- Consistent environments across dev/staging/production
- Team onboarding - new engineers get an identical setup instantly
- Model serving in production with isolated dependencies
- GPU access via NVIDIA Container Toolkit
ML Development Environment¶
JupyterLab with Persistent Notebooks¶
A volume mount ensures notebooks persist beyond the container lifecycle.
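A minimal sketch of this setup - the `jupyter/base-notebook` image and the `./notebooks` host path are examples, not prescribed by this doc:

```shell
# Run JupyterLab with the notebook directory bind-mounted from the host;
# files written to /home/jovyan/work survive container removal.
docker run -d -p 8888:8888 \
  -v "$PWD/notebooks":/home/jovyan/work \
  jupyter/base-notebook
```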
MLflow Experiment Tracking¶
Connecting JupyterLab to MLflow¶
Both run as separate containers. Connect via host networking or Docker network. JupyterLab notebooks train models and log metrics/parameters/artifacts to MLflow.
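One way to wire the two containers together, sketched as a Compose file (service names and image tags are assumptions; `MLFLOW_TRACKING_URI` is the standard MLflow client variable):

```yaml
services:
  mlflow:
    image: ghcr.io/mlflow/mlflow:latest
    ports: ["5000:5000"]
    command: mlflow server --host 0.0.0.0
  jupyter:
    image: jupyter/base-notebook
    ports: ["8888:8888"]
    environment:
      # Notebooks reach MLflow by service name over the default Compose network
      - MLFLOW_TRACKING_URI=http://mlflow:5000
    depends_on:
      - mlflow
```

With this in place, `mlflow.log_metric()` calls from a notebook go to the `mlflow` service without any host-networking tricks.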
Containerizing ML Applications¶
Dockerfile for ML App¶
```dockerfile
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
EXPOSE 7860
CMD ["python", "app.py"]
```
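Building and running the image might look like this (the `ml-app` tag is an example):

```shell
docker build -t ml-app .
docker run -p 7860:7860 ml-app
```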
ML Stack with Docker Compose¶
```yaml
services:
  mlflow:
    image: ghcr.io/mlflow/mlflow:latest
    ports: ["5555:5000"]
    command: mlflow server --host 0.0.0.0
  api:
    build: ./src/api
    ports: ["8000:8000"]
    depends_on: [mlflow]
  streamlit:
    build: ./src/streamlit
    ports: ["8501:8501"]
    environment:
      - API_URL=http://api:8000
    depends_on: [api]
```
ML Pipeline Workflow¶
- Raw data -> preprocessing -> clean data
- Feature engineering -> selected features
- Model training -> model.pkl + encoders
- Package model in FastAPI wrapper -> Docker image
- Package Streamlit frontend -> Docker image
- Compose all services together
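The packaging step for the model (step 4 above) can be sketched as a Dockerfile; the artifact and module names (`model.pkl`, `encoders.pkl`, `main.py`) are assumptions for illustration:

```dockerfile
# Package trained artifacts with a FastAPI wrapper served by uvicorn
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY model.pkl encoders.pkl main.py ./
EXPOSE 8000
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
```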
Docker Model Runner (Local LLM)¶
Apple Silicon only (M1-M4); requires Docker Desktop 4.40 or later.
```shell
docker model list         # list downloaded models
docker model pull <model> # download from Docker Hub AI models
docker model run <model>  # run inference
```
Integration endpoints:
- From containers: `http://modelrunner.docker.internal/v1`
- From host: `http://localhost:<port>/v1`
- OpenAI-compatible API - the same connection string format works for both
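Because the API is OpenAI-compatible, a plain chat-completion request works against it. A sketch from inside another container, using the endpoint above (the `ai/llama3.2` model name is an example of the Docker Hub AI namespace):

```shell
curl http://modelrunner.docker.internal/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "ai/llama3.2", "messages": [{"role": "user", "content": "Hello"}]}'
```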
Docker MCP Toolkit (AI Agents)¶
MCP (Model Context Protocol) connects LLMs to external tools (filesystem, GitHub, databases) through standardized interfaces.
```json
{
  "mcpServers": {
    "github": {
      "image": "docker.io/mcp/github",
      "env": { "GITHUB_PERSONAL_ACCESS_TOKEN": "<token>" }
    }
  }
}
```
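For context, an MCP server packaged as a container is typically launched over stdio; a rough manual equivalent of the config above (flags shown are standard `docker run` options, but the exact invocation the toolkit uses is an assumption):

```shell
docker run -i --rm \
  -e GITHUB_PERSONAL_ACCESS_TOKEN \
  docker.io/mcp/github
```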
Deployment Targets¶
- Docker Hub - public registry, free for public images
- Hugging Face Spaces - push Dockerfile + code, auto-builds and deploys
- AWS App Runner - ECR image -> managed deployment with auto-scaling
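Pushing to Docker Hub, the first target above, is two commands (username and tag are placeholders):

```shell
docker tag ml-app <dockerhub-username>/ml-app:latest
docker push <dockerhub-username>/ml-app:latest
```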
Gotchas¶
- Alpine base images are smaller but may lack compilation tools needed for ML libraries (numpy, scipy, torch)
- Volume mounts are critical - without them, trained models and notebooks are lost on container removal
- GPU access requires NVIDIA Container Toolkit and the `--gpus` flag
- ML images are often large (2-10GB) - multi-stage builds and `.dockerignore` help reduce size
- The `--no-cache-dir` flag on pip prevents caching packages inside image layers
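The multi-stage approach mentioned above can be sketched like this - wheels are built in a full image and installed into a slim one, so compilers never reach the final layer (stage names and paths are illustrative):

```dockerfile
# Build stage: has gcc etc. needed to compile ML library wheels
FROM python:3.11 AS builder
WORKDIR /app
COPY requirements.txt .
RUN pip wheel --no-cache-dir -r requirements.txt -w /wheels

# Runtime stage: slim image, install prebuilt wheels only
FROM python:3.11-slim
WORKDIR /app
COPY --from=builder /wheels /wheels
COPY requirements.txt .
RUN pip install --no-cache-dir --no-index --find-links=/wheels -r requirements.txt \
    && rm -rf /wheels
COPY . .
CMD ["python", "app.py"]
```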
See Also¶
- docker compose - multi-container orchestration for ML stacks
- dockerfile and image building - image optimization techniques
- container registries - ECR, ACR, Docker Hub for ML images
- monitoring and observability - monitoring ML services in production