MLOps Training Documentation
Complete study guide for MLOps infrastructure and deployment
Study Path Overview
This documentation follows a hybrid structure:
- docs/ (you are here) - Conceptual learning and theory
- module-X/ - Hands-on labs and practice code
Quick Start
- Choose your module below
- Read the conceptual documentation in docs/module-X/
- Practice with labs in the module-X/ folder
Module 1: Infrastructure & Prerequisites
Goal: Master Git, AWS, Kubernetes, and Terraform fundamentals for MLOps
Study Path
| Order | Topic | Description | Lab Location |
|---|---|---|---|
| 1 | Git for Teams | Version control, branching strategies, and team collaboration | module-01/git/ |
| 2 | AWS Cloud Services | Cloud services, security, networking, and AI/ML | module-01/aws/ |
| 3 | Kubernetes | Container orchestration for production workloads | module-01/k8s/ |
| 4 | Terraform | Infrastructure as Code fundamentals | module-01/terraform/ |
Module 1 Documentation
Git for Teams:
- Git Overview - Complete Git collaboration guide
- Git Basics & Configuration - Essential commands and setup
- Understanding Git Areas - How Git manages files
- Branching Strategies - Compare workflows
- Remote Operations - Working with remotes
- Pull Requests & Code Review - Collaboration process
- Merge Conflicts - Resolving conflicts
- Repository Governance - Team contribution models
- Team Conventions - Standards and best practices
- Workflow Examples - Real-world scenarios
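As a warm-up for the branching and merge topics listed above, here is a minimal sketch of one feature-branch cycle in a throwaway repository. The function name `practice_feature_branch`, the branch name `feature/notes`, the file names, and the placeholder identity are all illustrative assumptions, not part of the course labs.

```bash
# Sketch: one feature-branch cycle in a throwaway repository.
# The body runs in a subshell so `set -e` does not leak into the caller.
practice_feature_branch() (
  set -euo pipefail
  repo_dir="$1"
  # Placeholder identity passed per command, so global Git config is untouched.
  git_cmd=(git -C "$repo_dir" -c user.name="Lab User" -c user.email="lab@example.com")

  mkdir -p "$repo_dir"
  git init -q "$repo_dir"

  # Initial commit on the main branch.
  echo "# Practice repo" > "$repo_dir/README.md"
  "${git_cmd[@]}" add README.md
  "${git_cmd[@]}" commit -q -m "Initial commit"
  "${git_cmd[@]}" branch -M main

  # Do the work on a short-lived feature branch.
  "${git_cmd[@]}" checkout -q -b feature/notes
  echo "First note" > "$repo_dir/notes.txt"
  "${git_cmd[@]}" add notes.txt
  "${git_cmd[@]}" commit -q -m "Add notes"

  # Merge back with an explicit merge commit, then delete the branch.
  "${git_cmd[@]}" checkout -q main
  "${git_cmd[@]}" merge -q --no-ff -m "Merge feature/notes" feature/notes
  "${git_cmd[@]}" branch -q -d feature/notes
)
```

Running `practice_feature_branch /tmp/git-practice` and then `git -C /tmp/git-practice log --oneline --graph` shows the resulting history. The `--no-ff` flag is used here only so the merge is visible as its own commit; a team's chosen branching strategy may prefer fast-forward or squash merges instead.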
AWS Cloud Services:
- AWS Overview Guide - Complete AWS CLF-C02 reference
- Cloud Concepts & Security
- Core Services (Compute, Storage, Database, Networking, Analytics)
- AI/ML Services
- Deployment Methods
- Billing & Pricing
- LocalStack Practice Guides
Kubernetes:
- K8s Overview - Complete K8s guide
- Why Kubernetes? - Production orchestration
- Core Objects - Object model, namespaces, pods, labels
- Workloads - Deployments, StatefulSets, Jobs
- Storage - PVs, PVCs, StorageClasses
- Configuration - ConfigMaps and Secrets
- Network - Services and Ingress
- Architecture - Control plane and nodes
- Helm - Package management
- Monitoring - Observability
Terraform:
Lab Locations
| Lab | Description | Location |
|---|---|---|
| Git for Teams | Git practice exercises and examples | module-01/git/ |
| LocalStack | AWS services practice locally | module-01/aws/localstack/ |
| Kubernetes | K8s hands-on practice | module-01/k8s/ |
| Terraform Basics | Infrastructure as Code fundamentals | module-01/terraform/basics/ |
| Terraform Examples | Example configurations | module-01/terraform/examples/ |
| Terraform Exercises | Practice exercises | module-01/terraform/exercises/ |
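Before starting, it can help to confirm that the lab directories from the table above exist in your checkout. The following is a small sketch; the helper name `missing_labs` and the array name `MODULE_01_LABS` are hypothetical, and it assumes you run it from the repository root.

```bash
# Print every expected lab directory that is missing under a repository root.
missing_labs() {
  local root="$1"; shift
  local dir
  for dir in "$@"; do
    [ -d "$root/$dir" ] || echo "$dir"
  done
}

# Module 1 lab locations, copied from the Lab Locations table.
MODULE_01_LABS=(
  module-01/git
  module-01/aws/localstack
  module-01/k8s
  module-01/terraform/basics
  module-01/terraform/examples
  module-01/terraform/exercises
)
```

Calling `missing_labs . "${MODULE_01_LABS[@]}"` prints nothing when every directory is present and otherwise lists the paths that were not found.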
Module 2: Model Deployment
Coming soon - Batch API deployment with FastAPI
Lab Location: module-02/batch-api/
Module 3: Deployment and Operation
Goal: Implement automated testing, CI/CD pipelines, and monitoring
Study Path
| Order | Topic | Description | Lab Location |
|---|---|---|---|
| 1 | Testing | Unit, integration, and contract testing | module-03/testing/ |
| 2 | CI/CD | GitHub Actions workflows and pipelines | module-03/cicd/ |
| 3 | Monitoring & Observability | Grafana LGTM+P stack | module-03/monitoring/ |
Module 3 Documentation
Testing:
CI/CD:
Monitoring & Observability:
- Quick Start with intro-to-mltp
- Grafana Overview - Visualization platform
- Grafana Mimir - Scalable metrics storage
- Grafana Loki - Centralized log aggregation
- Grafana Tempo - Distributed tracing
- Grafana Pyroscope - Continuous profiling
- Quickstart Guide
Lab Locations
| Lab | Description | Location |
|---|---|---|
| Testing | Unit, integration, contract testing | module-03/testing/ |
| CI/CD | GitHub Actions workflows | module-03/cicd/github-actions/ |
| Monitoring | Grafana LGTM+P stack demo | module-03/monitoring/ |
Study Tips
For Each Module
- Read first - Start with the conceptual guide in docs/
- Practice second - Run the lab exercises in module-X/
- Experiment - Modify configurations and observe changes
- Review - Re-read documentation with practical context
For Hands-on Skills
- Complete all lab exercises - Don't skip!
- Break things intentionally - Learn to troubleshoot
- Build variations - Modify exercises to solve new problems
- Document your learnings - Keep notes
Example Study Workflow
```bash
# 1. Read the conceptual guide (Git for Teams)
cat docs/module-01/git/README.md
# 2. Navigate to the lab
cd module-01/git
# 3. Practice Git workflows
# Create a practice repository, branches, merges, etc.
# 4. Read the AWS guide (path is relative to module-01/git)
cat ../../docs/module-01/aws/README.md
# 5. Navigate to the LocalStack lab
cd ../aws/localstack
# 6. Start the lab environment
docker compose up -d
# 7. Practice the exercises
aws --endpoint-url=http://localhost:4566 s3 mb s3://my-bucket
# 8. Clean up (-v also removes the lab's volumes)
docker compose down -v
```
Additional Resources
External References
General:
- Git Documentation
- GitHub Flow Guide
- Kubernetes Documentation
- Docker Documentation
- Terraform Documentation
AWS (Reference for CLF-C02 Exam):
Testing & CI/CD:
Monitoring & Observability:
Internal Tools
- module-01/git/ - Git practice exercises
- module-01/aws/localstack/ - LocalStack lab environment
- module-01/k8s/ - Kubernetes hands-on practice
- module-01/terraform/ - Terraform practice
- module-03/testing/ - Testing labs
- module-03/cicd/github-actions/ - CI/CD workflows
- module-03/monitoring/ - Grafana LGTM+P stack
Progress Tracking
Track your progress by checking off completed modules:
Module 1: Infrastructure & Prerequisites
- [ ] Git for Teams (Basics, Branching Strategies, Collaboration)
- [ ] AWS Cloud Services (Core Services, Security, AI/ML)
- [ ] Kubernetes (Core Objects, Workloads, Storage, Networking)
- [ ] Terraform Basics
- [ ] LocalStack Practice Labs
Module 2: Model Deployment
- [ ] Batch API with FastAPI
- [ ] Model Deployment Patterns
Module 3: Deployment and Operation
- [ ] Testing (Unit, Integration, Contract)
- [ ] CI/CD Pipelines (GitHub Actions)
- [ ] Monitoring & Observability (Grafana LGTM+P)
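If you keep this checklist in a Markdown file, a short helper can summarize how far along you are. This is a sketch only: the function name `checklist_progress` is hypothetical, and it assumes items are marked `- [x]` when completed and `- [ ]` otherwise.

```bash
# Print "checked/total" for Markdown task-list items in a file.
checklist_progress() {
  local file="$1"
  local total checked
  # grep -c prints 0 and exits non-zero when nothing matches; `|| true` keeps going.
  total="$(grep -cE '^[[:space:]]*- \[[ xX]\]' "$file" || true)"
  checked="$(grep -cE '^[[:space:]]*- \[[xX]\]' "$file" || true)"
  echo "${checked}/${total}"
}
```

For example, `checklist_progress path/to/your-checklist.md` prints a line such as `3/10`; the path is a placeholder for wherever you store your copy of the list.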
Last Updated: January 2026