AWS with Terraform (Day 13)
Terraform Data Sources in AWS: Safely Using Existing VPCs, Subnets & AMIs
Today’s learning in my “30 Days of AWS & Terraform” series focuses on one of the most useful concepts for real-world infrastructure management: Terraform data sources.
Many teams repeatedly hardcode AMI IDs, VPC IDs, and subnet IDs in their Terraform configuration. These values change frequently and are usually managed externally by platform teams. Hardcoding them makes configurations fragile and non-portable, and forces a manual change on every update.
Terraform solves this problem through data sources, which allow you to query and reference existing cloud resources safely, instead of defining or recreating them.
Why Terraform Data Sources Matter
Data sources let you look up resources dynamically from AWS and consume their attributes without manually storing IDs.
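As a minimal sketch of the idea, the block below looks up the account's default VPC and exposes one of its attributes. The names default and default_vpc_cidr are illustrative choices, not part of the original session, and the example assumes the account and region still have a default VPC. Attributes are referenced as data.<TYPE>.<NAME>.<ATTRIBUTE>.

```hcl
# Read-only lookup: Terraform queries AWS for the default VPC
# and never creates, modifies, or destroys it.
data "aws_vpc" "default" {
  default = true
}

# Consume an attribute of the looked-up VPC without storing its ID anywhere.
output "default_vpc_cidr" {
  value = data.aws_vpc.default.cidr_block
}
```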
Key benefits
| Problem | Solution with Data Sources |
|---|---|
| Hardcoded AMI, VPC, or subnet IDs | Fetch IDs dynamically during plan and apply |
| Manual ID lookups and updates | Terraform resolves the IDs on every run |
| Risk of outdated AMIs | Always fetch the latest AMI |
| Resource duplication | Reuse existing shared infrastructure safely |
Outcome: Cleaner, more automated, scalable Terraform configurations that adapt to multi-team shared cloud environments.
Core Example: Provision EC2 Without Hardcoding AMI & Subnet
Use Terraform to fetch:
- The existing VPC by tag
- A specific subnet inside that VPC
- The latest Amazon Linux 2 AMI
✔ Terraform Data Sources Implementation
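The three lookups could be sketched roughly as follows. The data source names linux2 and shared for the AMI and subnet match the references used elsewhere in this post. The VPC name shared and both tag values are placeholders that must be replaced with the tags your platform team actually applies. The AMI name pattern is the commonly used one for Amazon Linux 2 on x86_64 and is an assumption about which image variant you want.

```hcl
# Existing VPC, selected by its Name tag (placeholder value).
data "aws_vpc" "shared" {
  tags = {
    Name = "shared-network-vpc"
  }
}

# One subnet inside that VPC, again selected by tag (placeholder value).
# The lookup fails at plan time unless exactly one subnet matches.
data "aws_subnet" "shared" {
  vpc_id = data.aws_vpc.shared.id

  tags = {
    Name = "shared-primary-subnet"
  }
}

# Most recent Amazon Linux 2 AMI published by Amazon.
data "aws_ami" "linux2" {
  most_recent = true
  owners      = ["amazon"]

  filter {
    name   = "name"
    values = ["amzn2-ami-hvm-*-x86_64-gp2"]
  }

  filter {
    name   = "virtualization-type"
    values = ["hvm"]
  }
}
```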
Use data source in EC2 resource
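A sketch of the instance definition, assuming data sources named data.aws_ami.linux2 and data.aws_subnet.shared are declared in the same configuration (the names used in the comparison table in this post). The resource name demo, the instance type, and the Name tag are illustrative assumptions.

```hcl
resource "aws_instance" "demo" {
  # Both IDs come from live lookups instead of hardcoded strings.
  ami       = data.aws_ami.linux2.id
  subnet_id = data.aws_subnet.shared.id

  # Assumed instance size for a demo; pick whatever fits your workload.
  instance_type = "t3.micro"

  tags = {
    Name = "day13-data-source-demo"
  }
}
```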
Why This Approach Is Better
| Traditional | With Data Sources |
|---|---|
| ami = "ami-123456789" | ami = data.aws_ami.linux2.id |
| subnet_id = "subnet-abc123" | subnet_id = data.aws_subnet.shared.id |
| Manual AMI refresh | Always fetch latest AMI |
| Prone to copy-paste errors | IDs resolved from live AWS state on every plan |
When To Use Data Sources
- Referencing shared infrastructure (networking, IAM, KMS, ECR, etc.)
- Provisioning resources across multiple environments or regions
- When AMIs are frequently refreshed
- When multiple teams share infra layers
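The multi-region case is where data sources pay off most visibly, because AMI IDs are specific to a region and a hardcoded ID from one region does not exist in another. The sketch below repeats the Amazon Linux 2 lookup against a second region. The provider alias west, the region us-west-2, and the names linux2_west and linux2_ami_id_west are hypothetical.

```hcl
# Hypothetical second region, declared as an aliased provider.
provider "aws" {
  alias  = "west"
  region = "us-west-2"
}

# Same filter as a default-region lookup, resolved against us-west-2,
# so the returned ID is valid in that region.
data "aws_ami" "linux2_west" {
  provider    = aws.west
  most_recent = true
  owners      = ["amazon"]

  filter {
    name   = "name"
    values = ["amzn2-ami-hvm-*-x86_64-gp2"]
  }
}

output "linux2_ami_id_west" {
  value = data.aws_ami.linux2_west.id
}
```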
Best Practices
| Recommendation | Reason |
|---|---|
| Prefer tag-based filtering | Stable across accounts, readable |
| Use most_recent = true carefully | Avoid unexpected major upgrades |
| Keep data sources in data.tf | Maintain clean file structure |
| Always review with terraform plan | Validates correct resource selection |
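One way to apply the most_recent recommendation is to keep the name filter narrow, so that "most recent" can only move within one image family, and to surface the selection as an output so it shows up when reviewing the plan. This is a sketch rather than the session's exact code: the names linux2_pinned and selected_ami are hypothetical, and the pattern assumes the usual Amazon Linux 2 naming scheme (amzn2-ami-hvm-2.0.<date>-x86_64-gp2).

```hcl
data "aws_ami" "linux2_pinned" {
  most_recent = true
  owners      = ["amazon"]

  # Narrow pattern: only the 2.0.x HVM x86_64 gp2 images can match,
  # so a newly published image family is never picked up by accident.
  filter {
    name   = "name"
    values = ["amzn2-ami-hvm-2.0.*-x86_64-gp2"]
  }
}

# Makes the chosen AMI visible in the plan's output changes.
output "selected_ami" {
  value = {
    id   = data.aws_ami.linux2_pinned.id
    name = data.aws_ami.linux2_pinned.name
  }
}
```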
Quick Checklist Before Applying
- Validate tag names exist on the target resources
- Run terraform plan
- Review selected resource IDs in plan output
- Destroy demo resources to avoid billing
Closing Summary
Terraform data sources bridge the gap between IaC and real cloud environments. They eliminate brittle hardcoded values, improve automation quality, and help teams safely consume shared infrastructure.
For EC2 provisioning: combine data.aws_vpc, data.aws_subnet, and data.aws_ami to deploy instances into existing VPCs and subnets while automatically selecting the appropriate AMI.
A simple framing: “This AMI has a data source, so I can simply reference data.aws_ami.linux2.id instead of hardcoding values.”
Using Terraform data sources correctly results in safe, scalable, production-ready IaC.
Here is the session link: