Infrastructure & Operations Baseline
Infrastructure as Code for all infrastructure, runbook standards, and operational readiness practices.
Business Value
Minimizes infrastructure drift and reduces environment provisioning time from 2 weeks to 2 hours through versioned infrastructure code.
DORA Impact
- Deployment Frequency
- Mean Time to Recover
Key Features
- Infrastructure as Code
- Operational Runbooks
- On-Call Rotation
- Autoscaling Configuration
- Backup and Recovery
Who
When
Foundation (0-90 days)
Capabilities in This Epic
Infrastructure as Code
>= 70% of infrastructure managed via IaC (Terraform, Pulumi, CloudFormation) in version control.
Operational Runbooks
>= 80% of critical services have runbooks for deployment, incident response, and disaster recovery.
On-Call Rotation
>= 90% of production services have a defined on-call rotation with a < 15 min incident response SLA.
Autoscaling Configuration
>= 70% of stateless services have horizontal autoscaling based on CPU/memory or custom metrics.
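To make "horizontal autoscaling based on CPU/memory or custom metrics" concrete, here is a minimal sketch of a proportional scaling rule. It assumes the rule documented for the Kubernetes Horizontal Pod Autoscaler (desired = ceil(current * currentMetric / targetMetric)); the epic does not name a specific autoscaler, and the function name, replica bounds, and tolerance default are illustrative.

```python
import math


def desired_replicas(current, current_metric, target_metric,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Proportional scaling rule (assumed, modeled on the Kubernetes HPA).

    current        -- replicas running now
    current_metric -- observed average metric (e.g. CPU utilization in %)
    target_metric  -- metric value the autoscaler tries to hold
    """
    if current <= 0 or target_metric <= 0:
        raise ValueError("current and target_metric must be positive")

    ratio = current_metric / target_metric

    # Ignore small deviations so the service does not flap between sizes.
    if abs(ratio - 1.0) <= tolerance:
        return current

    wanted = math.ceil(current * ratio)

    # Clamp to the configured bounds (hypothetical defaults).
    return max(min_replicas, min(max_replicas, wanted))
```

For example, 4 replicas at 90% CPU against a 60% target scale out to 6, while the same 4 replicas at 63% stay put because the deviation is inside the tolerance band.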
Backup and Recovery
>= 90% of stateful services (databases, volumes) have automated backups with tested recovery procedures.
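The thresholds above can be tracked with a small coverage check. The sketch below is one possible way to do it, assuming you can count in-scope items and compliant items per capability from your own inventory; the capability keys, the `Coverage` class, and `evaluate` are hypothetical names, and only the percentage targets come from this epic.

```python
from dataclasses import dataclass

# Percentage targets copied from the capability definitions in this epic.
TARGETS = {
    "iac": 70.0,          # infrastructure managed via IaC
    "runbooks": 80.0,     # critical services with runbooks
    "on_call": 90.0,      # production services with on-call rotation
    "autoscaling": 70.0,  # stateless services with horizontal autoscaling
    "backups": 90.0,      # stateful services with tested backups
}


@dataclass
class Coverage:
    capability: str
    covered: int  # items that meet the capability
    total: int    # items in scope for the capability

    @property
    def percent(self) -> float:
        # An empty scope counts as 0% rather than dividing by zero.
        return 100.0 * self.covered / self.total if self.total else 0.0


def evaluate(rows):
    """Return {capability: (percent rounded to 0.1, meets_target)}."""
    return {
        r.capability: (round(r.percent, 1), r.percent >= TARGETS[r.capability])
        for r in rows
    }
```

Calling `evaluate([Coverage("iac", 7, 10), Coverage("backups", 8, 10)])` reports IaC at 70.0% (target met) and backups at 80.0% (target not met).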
Implementation Journey
Prerequisites
Complete these before starting:
- Infrastructure provisioning currently manual or semi-automated
- Target cloud provider or platform selected
- Team has access to infrastructure resources
Typical Timeline
4 weeks
Effort Estimate
Breakdown by role:
Team Composition
Cross-functional team including: Platform, SRE
Applicable Environments
Success Metrics
Entry Criteria
Prerequisites to start implementing this epic:
Exit Criteria
Criteria defined at the Foundation milestone level:
DORA Metrics Impact
Resources
Implementation Kit
Step-by-step guide, templates, and tools for this epic
View Infrastructure & Operations Baseline Implementation Kit
Common Pitfalls
Mitigation: Use remote state backend (S3, Azure Storage). Enable state locking. Back up state files regularly.
Mitigation: Detect drift with scheduled scans. Block manual changes via policy. Document exceptions with tickets.
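A scheduled drift scan can be as simple as running a plan and reading its exit code. The sketch below assumes Terraform is the IaC tool and relies on the `-detailed-exitcode` flag of `terraform plan` (0 = no changes, 1 = error, 2 = changes present); the function names and the `workdir` parameter are hypothetical. Note that a non-empty plan can also mean committed code has not been applied yet, so treat a "drift" result as a prompt to investigate.

```python
import subprocess


def classify_plan_exit(code: int) -> str:
    """Map the exit code of `terraform plan -detailed-exitcode` to a status."""
    if code == 0:
        return "in_sync"  # live infrastructure matches the code
    if code == 2:
        return "drift"    # plan is non-empty: drift or unapplied changes
    return "error"        # exit code 1 (or anything unexpected)


def scan(workdir: str) -> str:
    """Run a read-only plan in `workdir` and classify the result."""
    result = subprocess.run(
        ["terraform", "plan", "-detailed-exitcode", "-input=false", "-no-color"],
        cwd=workdir,
        capture_output=True,
        text=True,
    )
    return classify_plan_exit(result.returncode)
```

A scheduler (cron, a CI schedule, or similar) could call `scan` per workspace and open a ticket whenever the status is not `in_sync`.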
Mitigation: Require PR review for IaC changes. Run terraform plan in CI. Separate plan and apply steps with approval gate.
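The plan/apply separation can be sketched as two command lists with an approval flag between them. This is an illustration, not a prescribed pipeline: it assumes Terraform, and the function name, the `approved` flag, and the `tfplan` file name are hypothetical. The point it shows is that apply consumes the exact saved plan that reviewers approved.

```python
def pipeline_commands(approved: bool, plan_file: str = "tfplan"):
    """Return the Terraform commands a CI run should execute.

    The plan step always runs and saves its output. The apply step is
    added only after approval, and it applies the saved plan file instead
    of re-planning, so what was reviewed is what gets applied.
    """
    steps = [["terraform", "plan", "-input=false", f"-out={plan_file}"]]
    if approved:
        steps.append(["terraform", "apply", "-input=false", plan_file])
    return steps
```

In a real pipeline the two steps would usually be separate jobs, with the plan file passed along as an artifact and the approval enforced by the CI system.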
Next Steps
After Completing This Epic
Once you've met all exit criteria, consider these next steps:
- Review metrics to validate DORA improvements
- Document lessons learned and update team playbooks
- Share success stories with other teams
Alternative Paths
Other epics that can be tackled in parallel: