A model in a notebook helps no one. MLOps is the engineering discipline that gets models into production and keeps them working — the difference between a data scientist who produces interesting charts and one who ships systems the business depends on. This module walks the whole last mile: persist and version a trained model, track your experiments so results are reproducible, wrap the model in a web API, containerise it with Docker so it runs anywhere, and monitor it in production because — unlike ordinary software — models silently rot as the world changes. This is the skill set that separates senior practitioners from beginners.
1. Why MLOps? From notebook to production
Training is maybe 10% of a real ML project. The other 90% is everything around it: data pipelines, versioning, serving, monitoring and retraining. MLOps brings software-engineering rigour (and DevOps ideas) to machine learning.
| Notebook ML | Production ML (MLOps) |
|---|---|
| runs once, by you | runs continuously, automatically |
| data sits in a CSV | data flows from live pipelines |
| “it works on my machine” | reproducible anywhere (Docker) |
| accuracy in a cell | monitored metrics + alerts |
| forgotten after the demo | versioned, retrained, maintained |
- Training is ~10% of a project; data, serving, monitoring and retraining are the rest.
- MLOps applies software/DevOps rigour: versioning, reproducibility, automation, monitoring.
- Unlike normal software, models silently decay as data drifts — maintenance is built-in.
2. Saving & versioning models
Step one of deployment: persist the trained model so you can load it elsewhere without retraining. For scikit-learn, joblib is the standard.
Save and load
import joblib
# Save the entire fitted pipeline (preprocessing + model together)
joblib.dump(pipeline, 'model.joblib')
# Later, in a totally different process:
loaded = joblib.load('model.joblib')
print(loaded.predict(X_new[:3]))
# Output: [1 0 1]
Saving the whole Pipeline from Module 5 guarantees that train-time and serve-time transforms match exactly.
Version everything that makes a model
- Code: Git (you already do this).
- Data: DVC or dataset hashes, so you know what it trained on.
- Model artefacts: a model registry with versions and stages (Staging → Production).
- Environment: a pinned requirements.txt so dependencies match.
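The four items above can be tied together in a small manifest stored next to the model file. What follows is a minimal sketch using only the standard library, not a substitute for DVC or a model registry. The helper names (file_sha256, git_commit, build_manifest, write_manifest) and the manifest fields are hypothetical choices made for illustration.

```python
import hashlib
import json
import platform
import subprocess
from datetime import datetime, timezone
from pathlib import Path


def file_sha256(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def git_commit():
    """Return the current Git commit hash, or None if it cannot be read."""
    try:
        result = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            capture_output=True, text=True, check=True,
        )
        return result.stdout.strip()
    except (OSError, subprocess.CalledProcessError):
        return None


def build_manifest(data_path, model_path, requirements_path):
    """Collect one fingerprint each for code, data, model and environment."""
    return {
        "created_utc": datetime.now(timezone.utc).isoformat(),
        "code_commit": git_commit(),
        "data_sha256": file_sha256(data_path),
        "model_sha256": file_sha256(model_path),
        "requirements_sha256": file_sha256(requirements_path),
        "python_version": platform.python_version(),
    }


def write_manifest(manifest, out_path):
    """Write the manifest as indented JSON (hypothetical file layout)."""
    Path(out_path).write_text(json.dumps(manifest, indent=2))
```

Calling build_manifest('train.csv', 'model.joblib', 'requirements.txt') and writing the result beside the model would let a later reader check which data, code and dependencies produced it. The file names here are placeholders.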
- Use joblib.dump/load to persist and restore scikit-learn models.
- Always save the entire Pipeline so serving uses the exact same preprocessing as training.
- Reproducibility needs four things versioned: code, data, model artefact, and environment.
3. Experiment tracking with MLflow
Real projects train dozens of models with different features and settings. MLflow logs each run's parameters, metrics and artefacts so you can compare them and reproduce the winner — no more “which notebook had the 0.94 model?”
Track a run
import mlflow
import mlflow.sklearn
from sklearn.ensemble import RandomForestClassifier
with mlflow.start_run(run_name='rf-200-trees'):
    # Everything logged inside this block is attached to the same run
    params = {'n_estimators': 200, 'max_depth': 10}
    model = RandomForestClassifier(**params, random_state=42).fit(X_tr, y_tr)
    acc = model.score(X_te, y_te)
    mlflow.log_params(params)
    mlflow.log_metric('accuracy', acc)
    mlflow.sklearn.log_model(model, 'model')
    print('Logged accuracy:', round(acc, 3))
# Output: Logged accuracy: 0.958
Run mlflow ui and open http://localhost:5000 to browse every run side by side — sort by metric, inspect parameters, and download any model.
- MLflow logs parameters, metrics and model artefacts for every training run.
- mlflow ui lets you compare runs and reproduce the best one.
- The model registry promotes a run through stages (Staging → Production).
4. Serving a model with FastAPI
To make predictions available to apps, wrap the model in a web API. FastAPI is the modern Python choice: fast, with automatic validation and interactive docs.
# app.py
from fastapi import FastAPI
from pydantic import BaseModel
import joblib
app = FastAPI(title='Churn Predictor')
model = joblib.load('model.joblib') # loaded once at startup
class Customer(BaseModel):            # auto-validated request schema
    features: list[float]
@app.post('/predict')
def predict(item: Customer):
    # Column 1 of predict_proba is the probability of the positive (churn) class
    proba = model.predict_proba([item.features])[0, 1]
    return {'churn_probability': round(float(proba), 4)}
# Run it, then call it
uvicorn app:app --reload
curl -X POST http://localhost:8000/predict \
  -H 'Content-Type: application/json' \
  -d '{"features": [0.2, 45.0, 1.0, 3.0]}'
# Output: {"churn_probability": 0.8123}
FastAPI auto-generates interactive docs at /docs, so others can try your endpoint in the browser.
- FastAPI wraps a model in a web endpoint with automatic request validation (Pydantic).
- Load the model once at startup, not per request, for speed.
- FastAPI auto-generates interactive docs at /docs for easy testing.
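The curl call from this section can also be made from Python, which is handy when another service or a test script needs predictions. The sketch below uses only the standard library and assumes the API is running at http://localhost:8000 as in the example. The function names build_predict_request and predict_churn are hypothetical.

```python
import json
import urllib.request


def build_predict_request(features, base_url="http://localhost:8000"):
    """Build the same POST request that the curl command sends."""
    body = json.dumps({"features": [float(x) for x in features]}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict_churn(features, base_url="http://localhost:8000", timeout=5.0):
    """Send the request and return the churn probability from the JSON reply."""
    request = build_predict_request(features, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.loads(response.read().decode("utf-8"))
    return payload["churn_probability"]
```

With the server running, predict_churn([0.2, 45.0, 1.0, 3.0]) returns the probability as a float. A payload that fails Pydantic validation comes back as an HTTP 422 response, which urlopen raises as urllib.error.HTTPError, so a real client would likely want to catch that.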
5. Packaging with Docker
“It works on my machine” is not deployment. Docker packages your code, dependencies and runtime into a single image that runs identically on any machine — your laptop, a colleague's, or a cloud server.
A Dockerfile for the API
FROM python:3.11-slim
WORKDIR /app
# Install dependencies first (better layer caching)
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Copy the app and the trained model
COPY app.py model.joblib ./
EXPOSE 8000
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]
# Build the image, then run a container
docker build -t churn-api .
docker run -p 8000:8000 churn-api
# Output:
# INFO: Uvicorn running on http://0.0.0.0:8000
# INFO: Application startup complete.
Anyone who starts the image with docker run gets the identical environment, eliminating the dependency mismatches that break deployments. From here, the container deploys to Kubernetes, AWS, GCP or any cloud the same way.
Use a -slim base, copy requirements.txt before your code (so dependency layers cache), and use a .dockerignore. Smaller images build faster, deploy faster and have a smaller attack surface.
- Docker packages code + dependencies + runtime into one portable image.
- A Dockerfile defines the build; docker build then docker run launches the container.
- Containers run identically everywhere and deploy cleanly to any cloud or Kubernetes.
6. Monitoring, drift & retraining
A deployed model is not done — it is on probation. The world changes, and the data it sees in production drifts away from its training data, quietly eroding accuracy. Monitoring catches this before users do.
Two kinds of drift
- Data drift: the input distribution shifts (e.g. new customer demographics).
- Concept drift: the relationship between inputs and target changes (e.g. behaviour after a price change or a pandemic).
Detect drift in code
from scipy import stats
# Compare a feature's training vs recent-production distribution
stat, p = stats.ks_2samp(train_feature, live_feature)
print(f'KS statistic: {stat:.3f}, p-value: {p:.4f}')
if p < 0.05:
    print('Drift detected -- flag for review / retraining')
# Output:
# KS statistic: 0.214, p-value: 0.0008
# Drift detected -- flag for review / retraining
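To make the KS statistic less of a black box, here is a pure-Python sketch of what it measures: the largest vertical gap between the empirical cumulative distributions of the two samples. It is for illustration only. scipy's ks_2samp also computes the p-value, which this sketch leaves out, and the function name ks_statistic is hypothetical.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Largest absolute gap between the two empirical CDFs."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    largest_gap = 0.0
    # Both CDFs are step functions that only change at observed values,
    # so checking the gap at every observed value is enough.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)  # share of sample_a that is <= x
        cdf_b = bisect_right(b, x) / len(b)  # share of sample_b that is <= x
        largest_gap = max(largest_gap, abs(cdf_a - cdf_b))
    return largest_gap
```

A value of 0 means the two samples have the same empirical distribution, and 1 means they do not overlap at all. On very large samples the p-value can drop below 0.05 for shifts too small to matter, so it may be worth looking at the size of the statistic as well as the p-value before triggering a retrain.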
- Data drift = inputs shift; concept drift = the input-output relationship changes.
- Detect drift statistically (e.g. KS test) and monitor live accuracy on a dashboard.
- Mature MLOps closes the loop: log, monitor, alert, and retrain automatically on fresh data.
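Concept drift often leaves the input distributions untouched, so it tends to show up only once true labels arrive and live accuracy can be measured. The following is a minimal sketch of a rolling accuracy check that could feed a dashboard or an alert. The class name RollingAccuracyMonitor, the window size and the thresholds are all hypothetical values chosen for illustration.

```python
from collections import deque


class RollingAccuracyMonitor:
    """Track accuracy over the most recent labelled predictions."""

    def __init__(self, window=500, min_samples=100, alert_below=0.90):
        # Each entry is True where the prediction matched the true label;
        # the deque drops the oldest entry once the window is full.
        self.outcomes = deque(maxlen=window)
        self.min_samples = min_samples
        self.alert_below = alert_below

    def record(self, predicted, actual):
        """Store whether one prediction turned out to be correct."""
        self.outcomes.append(predicted == actual)

    def accuracy(self):
        """Accuracy over the current window, or None before any labels arrive."""
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def should_alert(self):
        """True once enough labels are in and accuracy is under the threshold."""
        if len(self.outcomes) < self.min_samples:
            return False
        return self.accuracy() < self.alert_below
```

In use, record() would be called whenever a true outcome becomes known (for example, when a customer actually churns or renews), and should_alert() would be polled by whatever scheduler or alerting tool the team already runs.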
★ Hands-on Project — Deploy a Model as a Container
Take a trained model all the way to a running, containerised API — the deliverable that proves you can ship.
- Train a model from an earlier module inside a scikit-learn Pipeline and log the run with MLflow (params + metrics).
- Persist the full pipeline with joblib.dump and pin your dependencies in requirements.txt.
- Write a FastAPI app.py that loads the model at startup and exposes a POST /predict endpoint with a Pydantic request schema.
- Run it with uvicorn and test it with curl and the interactive /docs page.
- Write a Dockerfile (slim base, cached deps), then docker build and docker run the API.
- Confirm the containerised endpoint returns the same predictions as your local run.
- Add a simple drift check: a script that compares a feature's training vs new distribution with a KS test and prints a warning.
- Write a short README (how to build, run, call) and a note on how you'd monitor and retrain it, then commit to your portfolio.
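For the step that compares the containerised endpoint with your local run, exact equality is likely to fail because the API from Section 4 rounds probabilities to four decimals. One possible approach is a small tolerance check like the sketch below. The function names and the default tolerance of 1e-4 are assumptions made for illustration.

```python
import math


def predictions_match(local, remote, abs_tol=1e-4):
    """True when both lists have the same length and agree within abs_tol."""
    if len(local) != len(remote):
        return False
    return all(math.isclose(a, b, abs_tol=abs_tol) for a, b in zip(local, remote))


def first_mismatch(local, remote, abs_tol=1e-4):
    """Index of the first pair that disagrees, or None if every pair matches."""
    for index, (a, b) in enumerate(zip(local, remote)):
        if not math.isclose(a, b, abs_tol=abs_tol):
            return index
    return None
```

Here local would hold the unrounded predict_proba values from your notebook and remote the churn_probability values returned by the container for the same rows.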