# MLOps Workflow Overview

This guide illustrates the complete MLOps workflow from data processing to model deployment, showing how the different stages fit together.
## Complete MLOps Pipeline

```mermaid
flowchart TB
    Start[📥 Raw Data]

    subgraph Phase1["🔧 DataOps Phase"]
        direction TB
        V1[🔍 Validate Raw Data]
        P1[🔄 Process & Clean]
        V2[🔍 Validate Processed]
        F1[🎯 Engineer Features]
        V3[🔍 Validate Features]
        DVC1[💾 Version with DVC]
        V1 --> P1 --> V2 --> F1 --> V3 --> DVC1
    end

    subgraph Phase2["🤖 ModelOps Phase"]
        direction TB
        E1[📊 Experiment & Train]
        E2[📈 Track with MLflow]
        E3[🔬 Evaluate Models]
        R1[📝 Register Best Model]
        E1 --> E2 --> E3 --> R1
    end

    subgraph Phase3["🚀 Deployment Phase"]
        direction TB
        D1[🐳 Build Containers]
        D2[🌐 Deploy API]
        D3[📊 Deploy Dashboard]
        D1 --> D2
        D1 --> D3
    end

    subgraph Phase4["📈 Monitoring Phase"]
        direction TB
        M1[📉 Track Performance]
        M2[🔍 Detect Drift]
        M3[🔔 Alert on Issues]
        M1 --> M2 --> M3
    end

    Start --> Phase1
    Phase1 --> Phase2
    Phase2 --> Phase3
    Phase3 --> Phase4
    Phase4 -.retrain trigger.-> Phase1

    style Phase1 fill:#e8f5e9
    style Phase2 fill:#e3f2fd
    style Phase3 fill:#fff3e0
    style Phase4 fill:#fce4ec
```
## Phase 1: DataOps

**Goal:** Transform raw data into validated, versioned features ready for modeling.
### Commands

```bash
# Run the complete pipeline
bash scripts/dataops_workflow.sh
```

Or run individual steps:

```bash
# Step 1: Validate raw data
python src/data/validate_data.py --stage raw --fail-on-error

# Step 2: Process and clean data
python -m src.data.make_dataset

# Step 3: Validate processed data
python src/data/validate_data.py --stage processed --fail-on-error

# Step 4: Engineer features
python -m src.features.build_features

# Step 5: Validate features
python src/data/validate_data.py --stage features --fail-on-error

# Step 6: Version with DVC
dvc add data/processed/train_clean.parquet
dvc add data/processed/train_features.parquet
git add data/processed/*.dvc .gitignore
git commit -m "data: version processed data and features"
```
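The validation steps gate every stage of the pipeline: bad data fails fast instead of propagating downstream. As an illustration of the kind of checks `validate_data.py` might run (the column names and rules here are assumptions for the sketch, not the project's actual schema):

```python
# Minimal sketch of stage-gating data validation. Column names and
# rules are illustrative assumptions, not the repo's actual schema.

REQUIRED_COLUMNS = {"Store", "Date", "Sales"}


def validate_rows(rows):
    """Return a list of human-readable errors; an empty list means the batch passes."""
    errors = []
    for i, row in enumerate(rows):
        missing = REQUIRED_COLUMNS - row.keys()
        if missing:
            errors.append(f"row {i}: missing columns {sorted(missing)}")
            continue
        if any(row[col] is None for col in REQUIRED_COLUMNS):
            errors.append(f"row {i}: null in required column")
        elif row["Sales"] < 0:
            errors.append(f"row {i}: negative Sales value {row['Sales']}")
    return errors


good = [{"Store": 1, "Date": "2015-08-01", "Sales": 5263}]
bad = [{"Store": 1, "Date": None, "Sales": -10}]
assert validate_rows(good) == []
assert len(validate_rows(bad)) == 1
```

A `--fail-on-error` flag then amounts to exiting non-zero whenever the error list is non-empty, which is what lets each validation step halt the pipeline.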
### Outputs

- `data/processed/train_clean.parquet` - Cleaned and merged data
- `data/processed/train_features.parquet` - Feature-engineered dataset
- `data/processed/*.dvc` - DVC metadata files
Learn more: DataOps Workflow Guide
## Phase 2: ModelOps

**Goal:** Train, evaluate, and register production-ready models.
### Commands

```bash
# Option A: Jupyter Notebooks (exploration)
jupyter lab
# Navigate to: notebooks/03-baseline-models.ipynb
#              notebooks/04-advanced-models-and-ensembles.ipynb

# Option B: Python Scripts (production)
python -m src.models.train_baselines
python -m src.models.train_advanced
python -m src.models.ensembles

# View experiments in MLflow
mlflow ui
# Open: http://localhost:5000
```
### Workflow Details

```mermaid
flowchart LR
    A[Load Features] --> B[Time-Series CV Split]
    B --> C[Train Models]
    C --> D[Log to MLflow]
    D --> E[Evaluate RMSPE]
    E --> F{Best Model?}
    F -->|Yes| G[Register Model]
    F -->|No| H[Continue Experiments]
    H --> C
    G --> I[Production Ready]

    style G fill:#c8e6c9
    style I fill:#81c784
```
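Two steps in this diagram are worth unpacking. A time-series CV split keeps training data strictly earlier than validation data (a plain chronological split, unlike shuffled K-fold), and RMSPE (root mean squared percentage error) is the evaluation metric. A minimal plain-Python sketch of both, not the repo's implementation:

```python
import math


def expanding_window_splits(n_samples, n_splits):
    """Yield (train_idx, val_idx) pairs where training always precedes
    validation in time, so no future data leaks into training."""
    fold = n_samples // (n_splits + 1)
    for k in range(1, n_splits + 1):
        train = list(range(0, fold * k))
        val = list(range(fold * k, fold * (k + 1)))
        yield train, val


def rmspe(actual, predicted):
    """Root mean squared percentage error; zero-actual rows are skipped
    since the percentage error is undefined there."""
    terms = [((a - p) / a) ** 2 for a, p in zip(actual, predicted) if a != 0]
    return math.sqrt(sum(terms) / len(terms))


splits = list(expanding_window_splits(n_samples=100, n_splits=4))
assert all(max(tr) < min(va) for tr, va in splits)  # chronological order holds
assert rmspe([100, 200], [100, 200]) == 0.0         # perfect predictions
```

Because RMSPE is a percentage error, a 10-unit miss on low-sales days hurts far more than on high-sales days, which is why validating on realistic chronological folds matters.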
### Outputs

- MLflow experiments with metrics, parameters, and artifacts
- Trained model files in `models/`
- Registered production model in the MLflow Model Registry
- Performance metrics in `outputs/metrics/`
Learn more: Model Training Guide
## Phase 3: Deployment

**Goal:** Deploy models via API and interactive dashboard.
### Commands

```bash
# Deploy all services with Docker Compose
docker-compose up --build

# Or deploy individually:

# FastAPI prediction service
docker build -t rossmann-api -f Dockerfile .
docker run -p 8000:8000 rossmann-api

# Streamlit dashboard
docker build -t rossmann-dashboard -f Dockerfile.streamlit .
docker run -p 8501:8501 rossmann-dashboard

# MLflow tracking server
mlflow server --host 0.0.0.0 --port 5000
```
### Access Services
- API: http://localhost:8000
- API Docs: http://localhost:8000/docs
- Dashboard: http://localhost:8501
- MLflow: http://localhost:5000
### Deployment Architecture

```mermaid
flowchart TB
    User[👤 User] --> LB[Load Balancer]
    LB --> API1[FastAPI Instance 1]
    LB --> API2[FastAPI Instance 2]
    API1 --> ML[MLflow Registry]
    API2 --> ML
    ML --> Model[Production Model]
    API1 --> Logger[Prediction Logger]
    API2 --> Logger
    Logger --> Monitor[Monitoring Service]
    Dashboard[📊 Streamlit Dashboard] --> ML

    style Model fill:#81c784
    style Monitor fill:#ffd54f
```
Learn more: Deployment guide coming soon
## Phase 4: Monitoring

**Goal:** Track model performance and detect data drift in production.
### Commands

```bash
# Generate drift report
python -m src.monitoring.drift_detection \
    --reference data/processed/train_features.parquet \
    --current data/production/current_batch.parquet \
    --output monitoring/drift_reports/

# Track model performance
python -m src.monitoring.performance \
    --predictions outputs/predictions/production.csv \
    --actuals data/production/actuals.csv \
    --output monitoring/performance_reports/

# View reports
open monitoring/drift_reports/latest_drift_report.html
```
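Drift detection boils down to comparing the distribution of each feature in the `--reference` data against the `--current` production batch. One common approach (shown here as a plain-Python illustration, not necessarily what `src.monitoring.drift_detection` uses internally) is the two-sample Kolmogorov-Smirnov statistic:

```python
import bisect


def ks_statistic(reference, current):
    """Two-sample Kolmogorov-Smirnov statistic: the maximum gap between
    the two empirical CDFs. 0 means identical distributions; values near
    1 mean the samples barely overlap."""
    ref = sorted(reference)
    cur = sorted(current)

    def ecdf(sample, x):
        # Fraction of sample points <= x.
        return bisect.bisect_right(sample, x) / len(sample)

    return max(
        abs(ecdf(ref, v) - ecdf(cur, v))
        for v in sorted(set(ref) | set(cur))
    )


reference = [1.0, 2.0, 3.0, 4.0, 5.0]
shifted = [x + 10 for x in reference]  # a completely drifted batch
assert ks_statistic(reference, reference) == 0.0
assert ks_statistic(reference, shifted) == 1.0
```

A drift report then flags any feature whose statistic exceeds a chosen threshold, which is what feeds the alerting stage below.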
### Monitoring Dashboard

```mermaid
flowchart LR
    subgraph Metrics["📊 Key Metrics"]
        M1[RMSPE]
        M2[Prediction Latency]
        M3[Request Volume]
    end

    subgraph Drift["🔍 Drift Detection"]
        D1[Feature Drift]
        D2[Target Drift]
        D3[Concept Drift]
    end

    subgraph Alerts["🔔 Alerting"]
        A1[Performance Degradation]
        A2[Data Quality Issues]
        A3[System Errors]
    end

    Metrics --> Alerts
    Drift --> Alerts
    Alerts --> Action[🔄 Trigger Retraining]

    style Action fill:#ffcdd2
```
### Outputs
- Drift detection reports (HTML/JSON)
- Performance metrics over time
- Alerts for issues requiring attention
Learn more: Monitoring guide coming soon
## End-to-End Example

Here's a complete workflow from scratch:
### 1. Initial Setup (5 minutes)

```bash
# Clone and install
git clone https://github.com/bradleyboehmke/rossmann-forecasting.git
cd rossmann-forecasting
pip install uv
uv venv && source .venv/bin/activate
uv pip install -e .
```
### 2. DataOps (10 minutes)

```bash
# Run complete data pipeline
bash scripts/dataops_workflow.sh

# Result: Validated features ready for modeling
```
### 3. ModelOps (30 minutes)

```bash
# Train models with MLflow tracking
python -m src.models.train_baselines
python -m src.models.train_advanced

# View experiments
mlflow ui

# Register best model (via MLflow UI or programmatically)
```
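Programmatic registration means picking the best run and handing its `runs:/<run_id>/model` URI to `mlflow.register_model(model_uri, name)`. A sketch of the selection logic (the run IDs, metric values, and registry name below are made up for illustration):

```python
# Sketch: choose the best run by RMSPE and build the model URI that
# mlflow.register_model(model_uri, name) expects.


def best_model_uri(runs):
    """runs: list of (run_id, rmspe) pairs. Lower RMSPE is better."""
    run_id, _ = min(runs, key=lambda r: r[1])
    return f"runs:/{run_id}/model"


runs = [("abc123", 0.142), ("def456", 0.118), ("ghi789", 0.131)]
uri = best_model_uri(runs)
assert uri == "runs:/def456/model"

# With MLflow installed and a tracking server running, registration is then:
#   mlflow.register_model(uri, "rossmann-forecaster")  # hypothetical name
```

In practice you would fetch the run list from the tracking server (e.g. via MLflow's search APIs) rather than hard-coding it, but the selection rule is the same.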
### 4. Deployment (5 minutes)

```bash
# Deploy all services
docker-compose up --build

# Test API
curl -X POST http://localhost:8000/predict \
    -H "Content-Type: application/json" \
    -d '{"store": 1, "date": "2015-08-01", ...}'
```
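The same request can be made from Python with only the standard library. The curl example above elides the full payload; the complete field list is whatever the API schema at `/docs` specifies, so only the two fields shown there appear here:

```python
import json
import urllib.request


def predict(payload, url="http://localhost:8000/predict"):
    """POST a prediction request using only the standard library.
    Defined but not called here; requires the API to be running."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())


# Remaining fields are elided, as in the curl example; see /docs for the schema.
payload = {"store": 1, "date": "2015-08-01"}
assert json.loads(json.dumps(payload)) == payload  # serializes cleanly
```

With the services up, `predict(payload)` returns the decoded JSON response body.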
### 5. Monitoring (Ongoing)

```bash
# Generate monitoring reports
python -m src.monitoring.drift_detection
python -m src.monitoring.performance

# Review dashboards and alerts
```
## Continuous Improvement Loop

```mermaid
flowchart LR
    P[📊 Monitor Production] --> D{Drift or<br/>Degradation?}
    D -->|Yes| R[🔄 Retrain]
    D -->|No| P
    R --> DataOps[🔧 DataOps:<br/>New Data]
    DataOps --> ModelOps[🤖 ModelOps:<br/>Retrain]
    ModelOps --> E[📈 Evaluate]
    E --> Better{Better than<br/>Current?}
    Better -->|Yes| Deploy[🚀 Deploy New Model]
    Better -->|No| Keep[Keep Current Model]
    Deploy --> P
    Keep --> P

    style R fill:#ffcdd2
    style Deploy fill:#c8e6c9
```
The workflow is continuous:
- Monitor production performance and data quality
- Detect when model performance degrades or data drifts
- Retrain using fresh data through the DataOps pipeline
- Evaluate if the new model outperforms the current one
- Deploy only if the new model is better
- Repeat continuously to maintain model performance
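The two decision points in this loop reduce to simple rules. As a sketch (the tolerance and threshold values are illustrative assumptions, not values from the repo):

```python
def should_retrain(current_rmspe, baseline_rmspe, drift_score,
                   degradation_tolerance=0.10, drift_threshold=0.2):
    """Trigger retraining when RMSPE degrades beyond the tolerance
    relative to the deployment-time baseline, or when a drift score
    (e.g. a KS statistic) exceeds its threshold."""
    degraded = current_rmspe > baseline_rmspe * (1 + degradation_tolerance)
    drifted = drift_score > drift_threshold
    return degraded or drifted


def should_deploy(candidate_rmspe, current_rmspe):
    """Promote the retrained model only if it beats the current one."""
    return candidate_rmspe < current_rmspe


assert not should_retrain(0.105, 0.100, drift_score=0.05)  # within tolerance
assert should_retrain(0.150, 0.100, drift_score=0.05)      # degraded
assert should_retrain(0.100, 0.100, drift_score=0.50)      # drifted
assert should_deploy(0.095, 0.100)
```

Gating deployment on a strict improvement is what keeps the loop from churning models that are no better than the one already serving traffic.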
## Quick Reference

### Common Commands
| Task | Command |
|---|---|
| Run DataOps pipeline | `bash scripts/dataops_workflow.sh` |
| Train baseline models | `python -m src.models.train_baselines` |
| Train advanced models | `python -m src.models.train_advanced` |
| View MLflow experiments | `mlflow ui` (http://localhost:5000) |
| Deploy all services | `docker-compose up --build` |
| Generate drift report | `python -m src.monitoring.drift_detection` |
| Run tests | `pytest tests/ -v` |
| View documentation | `mkdocs serve` (http://localhost:8000) |
### Key Directories

- `data/raw/` - Original immutable data
- `data/processed/` - Cleaned data and features
- `models/` - Trained model artifacts
- `mlruns/` - MLflow experiment tracking
- `monitoring/` - Drift and performance reports
- `outputs/` - Predictions and metrics
## Next Steps
Now that you understand the complete workflow:
- Quick Start - Get up and running in 5 minutes
- DataOps Workflow - Deep dive into data processing
- Model Training - Learn experiment tracking
- Deployment - Deploy to production (Coming Soon)
- Monitoring - Track performance (Coming Soon)