Kubernetes Observability and Monitoring Virtual Internship
In this 12-week virtual internship, you will learn to monitor and observe Kubernetes clusters using industry-standard tools such as Prometheus and Grafana. Through hands-on projects, you will set up monitoring and alerting, define and analyze SLIs and SLOs, and implement incident management and automation to keep your Kubernetes environments highly available and reliable. By the end of the internship, you will have a solid foundation in Kubernetes observability and the practical skills expected of a Site Reliability Engineer (SRE).
Track Overview
Tasks & Milestones
Kubernetes Cluster Monitoring Setup
Intermediate: Set up a Kubernetes cluster and install Prometheus and Grafana for monitoring
Defining SLIs and SLOs for a Kubernetes Service
Intermediate: Identify appropriate SLIs and set SLO targets for a Kubernetes service
Implementing Incident Management Workflows
Intermediate: Set up incident management and automation using Alertmanager and other tools
Implementing Distributed Tracing in Kubernetes
Advanced: Set up a distributed tracing solution (e.g., Jaeger) to trace end-to-end application requests
Kubernetes Observability Capstone Project
Advanced: Design and implement a production-ready Kubernetes observability solution
Prerequisites
- Intermediate-level experience with Kubernetes
- Familiarity with Linux and shell scripting
- Basic understanding of monitoring and observability concepts
Certificate
Certificate of Completion
Earn a certificate upon successful completion