- Home
- Capabilities
- Operate Oncall
On-Call Rotation
Quick Reference
What & Why
Definition
>= 90% of production services have defined on-call rotation with < 15min incident response SLA.
Business Value
Prevents infrastructure drift entirely and reduces environment provisioning time from 2 weeks to 2 hours through versioned infrastructure code Achieving < 15min mean incident response time is a key milestone toward this goal.
Context
This capability is part of the Foundation milestone's focus on establish baseline practices (testable, releasable, monitorable). Essential for teams targeting DF, MTTR improvements.
Success Criteria
< 15min mean incident response time
Measurement
PagerDuty/Opsgenie incident acknowledgment time
Evidence
- On-call schedule
- Incident response metrics
- Escalation policy
In Practice
Real-World Implementation
Teams create PagerDuty schedule, rotate weekly, set escalation policy (primary 5min, secondary 10min), track MTTA.
Concrete Example
Implementation Guide
Implementation Steps
Follow the measurement approach: PagerDuty/Opsgenie incident acknowledgment time
For detailed step-by-step guidance, refer to the Infrastructure & Operations Baseline Implementation Kit.
Resources
Implementation Kit
Infrastructure & Operations Baseline KitTemplates
Browse all templatesRelated Resources
View learning pathsRelated Capabilities
Enables
What this unlocks
Complementary
Often adopted together, from the Infrastructure & Operations Baseline epic
Troubleshooting & FAQs
Common Issues
Issue: Target metric not improving
Solution: Verify measurement is accurate, check if prerequisites are fully implemented, review evidence artifacts for completeness
Issue: Team resistance to adoption
Solution: Start with pilot team, demonstrate value with metrics, provide training and support during transition
Issue: Inconsistent implementation across teams
Solution: Create shared templates and guidelines, establish regular sync meetings, use automation to enforce standards
Frequently Asked Questions
Can we implement this before completing prerequisites?
While possible, it's not recommended. Prerequisites ensure foundational practices are in place, making this capability more effective and easier to adopt.
How long does implementation typically take?
Most capabilities can be implemented within 90 days when tackled as part of the Foundation milestone. Individual timelines vary based on team size and existing practices.