Site Reliability Engineering
# SLOs, error budgets, and on-call sanity
Embed SRE practices into your engineering org: define SLIs/SLOs, build error budgets, design on-call runbooks, and move your team from reactive firefighting to proactive reliability.
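To make the error budget idea concrete, here is a minimal sketch in Python, assuming an availability SLO measured over a rolling 30-day window. The function names, the 99.9% target, and the request counts are illustrative assumptions, not part of any specific tool or engagement.

```python
def error_budget_minutes(slo_target: float, window_days: int = 30) -> float:
    """Allowed downtime, in minutes, for an availability SLO over a window."""
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    total_minutes = window_days * 24 * 60
    return (1.0 - slo_target) * total_minutes


def budget_remaining(slo_target: float, good_events: int, total_events: int) -> float:
    """Fraction of the error budget still unspent (negative if overspent)."""
    if total_events <= 0:
        return 1.0  # no traffic yet, so nothing has been spent
    allowed_bad = (1.0 - slo_target) * total_events
    actual_bad = total_events - good_events
    return 1.0 - actual_bad / allowed_bad


if __name__ == "__main__":
    # Hypothetical numbers: 99.9% SLO, 1,000,000 requests, 500 of them failed.
    print(f"{error_budget_minutes(0.999):.1f} minutes of budget per 30 days")
    print(f"{budget_remaining(0.999, 999_500, 1_000_000):.0%} of budget remaining")
```

Under these assumptions, a 99.9% target leaves about 43.2 minutes of downtime per 30 days, and 500 failed requests out of 1,000,000 spends roughly half of the budget. An error budget policy then states what happens as that remaining fraction approaches zero, for example pausing risky releases.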
$ cat deliverables
- → SLI/SLO definition workshop
- → Error budget policy
- → Runbook library (Markdown)
- → Alerting rule audit & cleanup
- → Incident post-mortem template
$ cat tech-stack
Prometheus
Grafana
PagerDuty
Opsgenie
Loki
OpenTelemetry