SRE for Machine Learning and Data Pipelines Virtual Internship
In this virtual internship, students will learn how to ensure the reliability and availability of machine learning models and data pipelines. They will gain hands-on experience with model versioning, data quality monitoring, and incident response. By the end of the internship, students will be equipped with the skills to become Site Reliability Engineers (SREs) for machine learning and data-intensive applications.
Track Overview
Tasks & Milestones
Task 1: Implement Model Versioning
Intermediate
In this task, students will set up a model versioning system using Git and MLflow to track model changes and artifacts.
Task 2: Develop Automated Model Deployment Pipeline
Intermediate
In this task, students will create an automated pipeline to deploy machine learning models to production.
Task 1: Implement Data Quality Monitoring
Intermediate
In this task, students will set up data quality checks and monitoring for their machine learning pipelines.
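As a rough illustration of what a data quality check might look like, the sketch below validates a batch of records for missing required fields and out-of-range values. The function name `check_rows`, the field names, and the thresholds are all hypothetical and are not part of the internship materials; a real pipeline would more likely rely on a dedicated validation library.

```python
from typing import Any


def check_rows(
    rows: list[dict[str, Any]],
    required: list[str],
    ranges: dict[str, tuple[float, float]],
) -> list[str]:
    """Return human-readable descriptions of data quality violations."""
    issues = []
    for i, row in enumerate(rows):
        # Completeness check: every required field must be present and non-null.
        for field in required:
            if row.get(field) is None:
                issues.append(f"row {i}: missing {field}")
        # Validity check: numeric fields must fall inside their allowed range.
        for field, (low, high) in ranges.items():
            value = row.get(field)
            if value is not None and not (low <= value <= high):
                issues.append(f"row {i}: {field}={value} outside [{low}, {high}]")
    return issues


# Hypothetical sample batch with one missing ID and one implausible age.
rows = [
    {"user_id": 1, "age": 34},
    {"user_id": None, "age": 29},
    {"user_id": 3, "age": 212},
]
issues = check_rows(rows, required=["user_id", "age"], ranges={"age": (0, 120)})
for issue in issues:
    print(issue)
```

A monitoring job could run a check like this on every incoming batch and raise an alert when the list of issues is non-empty.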
Task 2: Set up Observability for Machine Learning Pipelines
Intermediate
In this task, students will implement observability tools to monitor the health and performance of their machine learning pipelines.
Task 1: Develop Incident Response and Mitigation Plan
Intermediate
In this task, students will create an incident response and mitigation plan for a machine learning pipeline.
Task 2: Implement SLIs and SLOs for Reliability Engineering
Intermediate
In this task, students will define service-level indicators (SLIs) and service-level objectives (SLOs) to measure and improve the reliability of a machine learning pipeline.
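To make the relationship between an SLI and an SLO concrete, here is a minimal sketch that computes an availability SLI as the fraction of successful pipeline runs and then derives how much of the error budget remains. The function names, the run counts, and the 99% target are assumptions chosen for illustration only.

```python
def availability_sli(successful_runs: int, total_runs: int) -> float:
    """SLI: fraction of pipeline runs that completed successfully."""
    if total_runs <= 0:
        raise ValueError("total_runs must be positive")
    return successful_runs / total_runs


def error_budget_remaining(sli: float, slo: float) -> float:
    """Fraction of the error budget still unspent.

    The error budget is the failure rate the SLO tolerates (1 - slo).
    A negative result means the SLO has been breached.
    """
    if not 0.0 < slo < 1.0:
        raise ValueError("slo must be strictly between 0 and 1")
    budget = 1.0 - slo   # allowed failure rate
    spent = 1.0 - sli    # observed failure rate
    return 1.0 - spent / budget


# Hypothetical window: 995 of 1000 runs succeeded against a 99% SLO.
sli = availability_sli(successful_runs=995, total_runs=1000)
remaining = error_budget_remaining(sli, slo=0.99)
print(f"SLI = {sli:.3f}, error budget remaining = {remaining:.0%}")
```

In this example half of the allowed failures have been used, so roughly half of the error budget is left for the rest of the measurement window.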
Task 1: Implement Infrastructure as Code
Intermediate
In this task, students will use Terraform to define and manage the infrastructure for a machine learning pipeline.
Task 2: Implement Workflow Automation
Intermediate
In this task, students will use Airflow to automate the workflows and processes for a machine learning pipeline.
Prerequisites
- Familiarity with cloud computing and containerization
- Experience with Python or other programming languages
Certificate
Certificate of Completion
Earn a certificate upon successful completion