AWS with Terraform (Day 27)

Automating AWS Infrastructure Using Terraform and GitHub Actions

Infrastructure automation is where DevOps truly becomes real. Writing Terraform code is only half the story; the real value comes when infrastructure changes are version-controlled, reviewed, scanned, approved, and applied automatically.

On Day 27 of my DevOps journey, I implemented a production-grade CI/CD pipeline to automate AWS infrastructure using Terraform and GitHub Actions, following best practices used in real-world teams.

This post walks through the architecture, workflow design, safety controls, and lessons learned.


Why Automate Terraform with CI/CD?

Running Terraform manually from a laptop works for learning, but it quickly breaks down in team and production environments:

  • Local state files are risky and hard to share

  • Credentials end up scattered across machines

  • No standardized reviews or approvals

  • No security or policy checks before apply

  • No clear audit trail

By integrating Terraform with GitHub Actions, infrastructure changes become:

  • Repeatable

  • Auditable

  • Review-driven

  • Safe for production


What This Setup Delivers

This pipeline implements a real-world Terraform automation pattern:

  • End-to-end CI/CD for Terraform

  • Multi-environment support (dev, test, prod)

  • Pull-request–based planning

  • Push-based apply with approvals

  • Static linting and security scanning

  • Centralized, secure Terraform state

  • Safe, manual destroy workflow


Reference Infrastructure Architecture

The infrastructure follows a high-availability two-tier AWS design:

  • Application Load Balancer in public subnets

  • Auto Scaling Group running application instances in private subnets

  • NAT Gateways for outbound internet access

  • Terraform state stored securely in S3

This mirrors common production patterns and keeps the demo realistic.


How the Pipeline Works

1. Pull Request → Plan Phase

When a developer opens a pull request:

  1. GitHub Actions runs automatically

  2. Terraform code is checked out

  3. TFLint runs to enforce best practices

  4. Trivy (IaC scanning) checks for security misconfigurations

  5. terraform plan is executed

  6. The plan output is uploaded as an artifact

  7. The plan is posted back to the PR for review

This ensures no change reaches AWS without visibility.
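The plan phase above can be sketched as an ordered list of CLI calls. This is only an illustration of the sequence, not the actual workflow file: the function name `plan_phase_commands`, the plan file name `tfplan`, and the exact flags are assumptions, and in the real pipeline each entry would be a GitHub Actions step.

```python
from typing import List


def plan_phase_commands(plan_file: str = "tfplan") -> List[List[str]]:
    """Return the plan-phase commands in the order the PR workflow runs them.

    The plan file name and the chosen flags are illustrative assumptions.
    """
    return [
        ["tflint"],                                      # lint the Terraform code
        ["trivy", "config", "."],                        # scan IaC for misconfigurations
        ["terraform", "init", "-input=false"],           # initialise backend and providers
        ["terraform", "plan", "-input=false", f"-out={plan_file}"],  # save the plan
        ["terraform", "show", "-no-color", plan_file],   # readable text for the PR comment
    ]
```

Each entry could be passed to `subprocess.run(cmd, check=True)`, so that a failing lint or scan stops the run before any plan is produced.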


2. Merge → Apply Phase

When the PR is merged, or code is pushed to an environment branch:

  1. The apply workflow is triggered

  2. The saved plan artifact is downloaded

  3. Terraform selects the workspace that matches the branch (branch → workspace)

    • dev → dev

    • test → test

    • main → prod

  4. terraform apply runs using the approved plan

For production, the workflow pauses and waits for manual approval.
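The branch-to-workspace selection can be sketched in a few lines. The mapping itself comes from the list above; the helper names `workspace_for_branch` and `apply_commands` and the plan file name `tfplan` are assumptions made for illustration.

```python
from typing import List, Optional

# Branch -> workspace mapping from the apply phase described above.
BRANCH_TO_WORKSPACE = {"dev": "dev", "test": "test", "main": "prod"}


def workspace_for_branch(branch: str) -> Optional[str]:
    """Return the Terraform workspace for a branch, or None if it is not deployable."""
    return BRANCH_TO_WORKSPACE.get(branch)


def apply_commands(branch: str, plan_file: str = "tfplan") -> List[List[str]]:
    """Build the apply-phase commands for a branch (plan file name is hypothetical)."""
    workspace = workspace_for_branch(branch)
    if workspace is None:
        # Refuse to apply from branches that do not map to an environment.
        raise ValueError(f"branch {branch!r} does not map to an environment")
    return [
        ["terraform", "init", "-input=false"],
        ["terraform", "workspace", "select", workspace],
        ["terraform", "apply", "-input=false", plan_file],  # apply the saved plan only
    ]
```

Applying a saved plan file means Terraform executes exactly what was reviewed and does not prompt for confirmation, which is why the approval has to happen before this step.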


3. Environment Protection for Production

To protect prod:

  • GitHub Environments require reviewer approval

  • Prod secrets are restricted

  • Only allowed branches can deploy

  • Manual confirmation is required before apply

This prevents accidental or unauthorized production changes.


Terraform State Management

State is stored in an S3 backend with:

  • Versioning enabled

  • Encryption at rest

  • State locking enabled (S3-native lock file; a DynamoDB lock table is optional)

Each environment uses a separate Terraform workspace, ensuring isolation between dev, test, and prod.
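To make the isolation concrete: with the S3 backend, each non-default workspace keeps its state under its own object key in the same bucket. The sketch below computes that key, assuming the backend's default `workspace_key_prefix` of `env:`. The function name and the example key `app/terraform.tfstate` are hypothetical.

```python
def state_object_key(key: str, workspace: str, workspace_key_prefix: str = "env:") -> str:
    """Return the S3 object key that holds the state for a workspace.

    The default workspace uses the configured key as-is; every other workspace
    is stored under <workspace_key_prefix>/<workspace>/<key>.
    """
    if workspace == "default":
        return key
    return f"{workspace_key_prefix}/{workspace}/{key}"


# Example with a hypothetical backend key:
for ws in ("dev", "test", "prod"):
    print(ws, "->", state_object_key("app/terraform.tfstate", ws))
```

Because the keys differ per workspace, a plan or apply in dev never reads or writes the prod state object.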


Linting and Security Scanning

Security and quality checks are enforced early:

TFLint

  • Detects likely errors such as invalid instance types, deprecated syntax, and unused declarations

  • Enforces Terraform best practices

  • Prevents bad patterns from merging

Trivy (IaC Scan)

  • Finds open security groups

  • Detects public-facing resources

  • Flags missing encryption or HTTPS

  • Provides line-level fix suggestions

Issues are caught before infrastructure is created.
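One way to turn scan results into a merge gate is to count findings by severity and fail the job on the serious ones. The sketch below assumes a Trivy JSON report (for example from `trivy config --format json .`) shaped as a top-level `Results` list whose entries carry a `Misconfigurations` list with a `Severity` field. Treat that shape, the function names, and the blocking severities as assumptions to check against your Trivy version.

```python
import json
from collections import Counter
from typing import Iterable


def count_misconfigurations(report_json: str) -> Counter:
    """Count findings per severity in a Trivy-style JSON report (shape assumed)."""
    report = json.loads(report_json)
    counts: Counter = Counter()
    for result in report.get("Results") or []:
        for finding in result.get("Misconfigurations") or []:
            counts[finding.get("Severity", "UNKNOWN")] += 1
    return counts


def should_block_merge(counts: Counter, blocking: Iterable[str] = ("HIGH", "CRITICAL")) -> bool:
    """Return True when any finding has a severity listed in `blocking`."""
    return any(counts[severity] > 0 for severity in blocking)
```

In a workflow, a script like this would exit non-zero when `should_block_merge` returns True, so the PR check fails before any plan is reviewed.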


Safe Destroy Workflow

Destroying infrastructure is dangerous if automated blindly. To avoid accidents:

  • Destruction runs only via workflow_dispatch (a manually triggered workflow)

  • User must select the environment

  • User must type a confirmation keyword (destroy)

  • Production destroy requires reviewer approval

This makes destruction intentional and controlled.
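The input checks behind such a workflow are simple enough to sketch. This is an illustration of the guard logic only; the function name `validate_destroy_request` is an assumption, and the reviewer approval for production is enforced by GitHub Environments rather than by code like this.

```python
ENVIRONMENTS = ("dev", "test", "prod")
CONFIRMATION_KEYWORD = "destroy"


def validate_destroy_request(environment: str, confirmation: str) -> str:
    """Validate the manual inputs of a destroy run and return the target environment.

    Raises ValueError when the environment is unknown or the confirmation
    keyword was not typed exactly.
    """
    if environment not in ENVIRONMENTS:
        raise ValueError(f"unknown environment: {environment!r}")
    if confirmation != CONFIRMATION_KEYWORD:
        raise ValueError("confirmation keyword does not match; refusing to destroy")
    return environment
```

Requiring an exact match (no trimming, no case folding) keeps the check strict, so a half-typed or auto-filled value never triggers a destroy.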


Common Pitfalls and Fixes

From hands-on experience:

  • Workflow not detected → YAML must be in .github/workflows/

  • No plan found → Artifact upload/download path mismatch

  • Terraform files in subfolder → Set working-directory explicitly

  • AWS auth failures → Ensure secrets are set per environment

These are small details, but they are critical in real pipelines.
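The first pitfall is easy to check mechanically. The sketch below tests whether a repository-relative path is somewhere GitHub Actions will pick it up: directly inside `.github/workflows/` with a `.yml` or `.yaml` extension. The function name is hypothetical.

```python
from pathlib import PurePosixPath

WORKFLOW_DIR = PurePosixPath(".github/workflows")


def is_detectable_workflow(repo_relative_path: str) -> bool:
    """Return True if the file sits directly in .github/workflows/ and is YAML."""
    path = PurePosixPath(repo_relative_path)
    return path.parent == WORKFLOW_DIR and path.suffix in {".yml", ".yaml"}
```

A file such as `workflows/plan.yml` at the repository root, or one nested in a subfolder of `.github/workflows/`, fails this check and would not be detected as a workflow.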


Best Practices Learned

  • Always run plan on the PR, and never let apply generate its own unreviewed plan

  • Keep separate workspaces per environment

  • Protect production with approvals

  • Automate linting and scanning early

  • Use saved plan artifacts for consistency

  • Never allow destroy without confirmation


Conclusion

This setup turns Terraform into a production-ready Infrastructure as Code workflow.

By combining Terraform with GitHub Actions, we get:

  • Predictable infrastructure changes

  • Clear audit trails

  • Strong security guardrails

  • Confidence in production deployments

This is how DevOps teams move fast without breaking things.

Day 27 completed. On to the next challenge.
