    DevOps
    Way of Working

    Infrastructure & Operations Baseline

    Foundation Milestone
    Phase: operate
    DF
    MTTR

    Infrastructure as Code for all infrastructure, runbook standards, and operational readiness practices.

    Business Value

    Reduces infrastructure drift and cuts environment provisioning time from 2 weeks to 2 hours by managing infrastructure as versioned code.

    DORA Impact

    • Deployment Frequency
    • Mean Time to Recover

    Key Features

    • Infrastructure as Code
    • Operational Runbooks
    • On-Call Rotation
    • Autoscaling Configuration
    • Backup and Recovery

    Who

    platform
    sre

    When

    Foundation (0-90 days)

    Capabilities in This Epic

    1. Infrastructure as Code

    >= 70% of infrastructure managed via IaC (Terraform, Pulumi, CloudFormation) in version control.

    Target: >= 70% infrastructure in IaC
    2. Operational Runbooks

    >= 80% of critical services have runbooks for deployment, incident response, and disaster recovery.

    Target: >= 80% services have runbooks
    3. On-Call Rotation

    >= 90% of production services have a defined on-call rotation with a < 15 min incident response SLA.

    Target: < 15 min mean incident response time
    4. Autoscaling Configuration

    >= 70% of stateless services have horizontal autoscaling based on CPU/memory or custom metrics.

    Target: >= 70% services have autoscaling
    5. Backup and Recovery

    >= 90% of stateful services (databases, volumes) have automated backups with tested recovery procedures.

    Target: >= 90% stateful services have backups
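    Each capability above is a coverage percentage, but the denominators differ: all infrastructure for IaC, critical services for runbooks, production services for on-call, stateless services for autoscaling, and stateful services for backups. The following is a minimal sketch of how such a scorecard might be computed. The Service fields, the sample inventory, and the resource counts are hypothetical and not part of the roadmap; the 15 min response-time target is not modeled here.

```python
from dataclasses import dataclass


@dataclass
class Service:
    # Hypothetical inventory record; field names are assumptions for illustration.
    name: str
    critical: bool
    production: bool
    stateless: bool
    has_runbooks: bool = False
    has_on_call: bool = False
    has_autoscaling: bool = False
    has_backups: bool = False


def coverage(items, in_scope, is_covered):
    """Percentage of in-scope items that are covered, or None if nothing is in scope."""
    scoped = [item for item in items if in_scope(item)]
    if not scoped:
        return None
    return 100.0 * sum(1 for item in scoped if is_covered(item)) / len(scoped)


def capability_report(services, iac_managed, iac_total):
    """Compare each capability's coverage with the target stated in the epic."""
    measured = {
        # IaC is measured over infrastructure resources, not services.
        "iac": (100.0 * iac_managed / iac_total if iac_total else None, 70.0),
        "runbooks": (coverage(services, lambda s: s.critical, lambda s: s.has_runbooks), 80.0),
        "on_call": (coverage(services, lambda s: s.production, lambda s: s.has_on_call), 90.0),
        "autoscaling": (coverage(services, lambda s: s.stateless, lambda s: s.has_autoscaling), 70.0),
        "backups": (coverage(services, lambda s: not s.stateless, lambda s: s.has_backups), 90.0),
    }
    return {
        name: {"coverage": value, "target": target, "met": value is not None and value >= target}
        for name, (value, target) in measured.items()
    }


# Made-up sample data, only to show the shape of the calculation.
services = [
    Service("api", critical=True, production=True, stateless=True,
            has_runbooks=True, has_on_call=True, has_autoscaling=True),
    Service("worker", critical=False, production=True, stateless=True, has_on_call=True),
    Service("db", critical=True, production=True, stateless=False,
            has_runbooks=True, has_on_call=True, has_backups=True),
    Service("cache", critical=False, production=False, stateless=False),
]
report = capability_report(services, iac_managed=35, iac_total=50)

for name, row in report.items():
    print(f"{name}: {row['coverage']}% (target >= {row['target']}%) met={row['met']}")
```

    Keeping the scope predicate separate from the coverage predicate makes the denominator of each target explicit, which is where these percentages are most often miscounted.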

    Implementation Journey

    Prerequisites

    Complete these before starting:

    • Infrastructure provisioning currently manual or semi-automated
    • Target cloud provider or platform selected
    • Team has access to infrastructure resources

    Typical Timeline

    4 weeks

    Effort Estimate

    160 hours
    ≈ 20 days

    Breakdown by role:

    Platform: 100 hours
    SRE: 40 hours
    Security: 20 hours

    Team Composition

    Cross-functional team including: platform, sre, security

    Applicable Environments

    regulated
    non-regulated

    Success Metrics

    Exit Criteria

    Criteria defined at the Foundation milestone level:

    Deployment Frequency: >= weekly (staging)
    Lead Time: <= 7 days (commit to staging)
    Change Failure Rate: <= 20%
    MTTR: <= 4h (staging)
    Observability Coverage: >= 80% services instrumented
    CI Success: >= 90%
    Flaky Tests: < 5%
    SBOM Coverage: >= 90% services
    Secrets Policy: Approved secrets manager only
    PR Cycle Time: p50 <= 24h
    Build Success: main >= 95%, PR >= 90%
    Ownership Coverage: >= 90% services

    DORA Metrics Impact

    DF: 1/month to 1/week (4x increase)
    MTTR: 24 hours to 4 hours (83% reduction)
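    The two impact figures follow from simple ratios. The sketch below reproduces them, assuming a month is 52/12 weeks; the function names are illustrative only. Under that assumption the deployment frequency multiplier comes out slightly above 4, which the page rounds to 4x.

```python
WEEKS_PER_MONTH = 52 / 12  # assumption: average month length, about 4.33 weeks


def frequency_multiplier(before_per_month, after_per_month):
    """How many times more often deployments happen after the change."""
    return after_per_month / before_per_month


def percent_reduction(before, after):
    """Relative reduction from before to after, as a percentage."""
    return 100.0 * (before - after) / before


# DF: one deployment per month to one deployment per week.
df_gain = frequency_multiplier(before_per_month=1, after_per_month=WEEKS_PER_MONTH)

# MTTR: 24 hours to 4 hours.
mttr_cut = percent_reduction(before=24, after=4)

print(f"DF multiplier: {df_gain:.2f}x")
print(f"MTTR reduction: {mttr_cut:.1f}%")
```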

    Resources

    Implementation Kit

    Step-by-step guide, templates, and tools for this epic

    View Infrastructure & Operations Baseline Implementation Kit

    Templates

    Ready-to-use templates for implementing capabilities

    Browse All Templates

    Learn More

    Tutorials & Learning Paths
    Case Studies & Examples

    Common Pitfalls

    • IaC state file lost or corrupted, infrastructure orphaned
      Mitigation: Use remote state backend (S3, Azure Storage). Enable state locking. Back up state files regularly.
    • Manual infrastructure changes bypass IaC, causing drift
      Mitigation: Detect drift with scheduled scans. Block manual changes via policy. Document exceptions with tickets.
    • IaC changes applied without review, causing outages
      Mitigation: Require PR review for IaC changes. Run terraform plan in CI. Separate plan and apply steps with approval gate.
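    One possible way to implement the scheduled drift scan is to run terraform plan with the -detailed-exitcode flag, which exits with 0 when there is nothing to change, 1 on error, and 2 when the plan contains changes. The sketch below wraps that call; the function names and the injectable run parameter are assumptions made for illustration and testing, and a real scan would also need backend credentials and an initialized working directory. Note that a non-empty plan can mean either out-of-band changes or committed code that has not been applied yet.

```python
import subprocess


def plan_status(exit_code):
    """Map the exit code of `terraform plan -detailed-exitcode` to a status label."""
    if exit_code == 0:
        return "in_sync"   # plan succeeded, no changes
    if exit_code == 2:
        return "drift"     # plan succeeded, changes are pending
    return "error"         # exit code 1 or anything unexpected


def detect_drift(workdir, run=subprocess.run):
    """Run a read-only plan in workdir and report whether changes are pending.

    `run` is injectable so the function can be exercised without Terraform installed.
    """
    result = run(
        ["terraform", "plan", "-detailed-exitcode", "-input=false", "-no-color"],
        cwd=workdir,
        capture_output=True,
        text=True,
        check=False,  # exit code 2 is an expected outcome, not a failure
    )
    return plan_status(result.returncode)
```

    A scheduler (cron or a CI pipeline on a timer) could call detect_drift for each stack and open a ticket whenever the result is "drift", which also supports the "document exceptions with tickets" mitigation.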

    Next Steps

    After Completing This Epic

    Once you've met all exit criteria, consider these next steps:

    • Review metrics to validate DORA improvements
    • Document lessons learned and update team playbooks
    • Share success stories with other teams

    Continue To

    The natural next epic in the roadmap sequence:

    Observability & Monitoring Foundations

    Alternative Paths

    Other epics that can be tackled in parallel:

    • Backlog Quality & Planning Enablement
    • Code Quality & Review Standards
    • CI/CD & Build Automation
    • Testing Strategy & Quality Gates

    DevOps practices for the entire delivery lifecycle

    © 2019-2026 devopswow.com. Created by Burhan Öcüt
