Infrastructure as Code: Why It Matters for Your Business
Manual infrastructure doesn’t scale. Infrastructure as Code does.
The Problem with Manual Infrastructure
When infrastructure is created by clicking through consoles:
- No history: Who changed what, when, and why?
- No reproducibility: Can you rebuild this environment from scratch?
- No review process: Changes go live without oversight
- Configuration drift: Production slowly diverges from staging
- Single points of failure: Only one person knows how it’s set up
This works for a weekend project. It doesn’t work for a business.
What Infrastructure as Code Actually Means
Infrastructure as Code (IaC) means defining your infrastructure in version-controlled files:
// This is infrastructure
const database = new rds.DatabaseInstance(this, "Database", {
  engine: rds.DatabaseInstanceEngine.postgres({ version: rds.PostgresEngineVersion.VER_15 }),
  instanceType: ec2.InstanceType.of(ec2.InstanceClass.T4G, ec2.InstanceSize.SMALL),
  vpc,
  allocatedStorage: 20,
  backupRetention: Duration.days(7),
});
This code:
- Lives in Git with full history
- Gets reviewed before merging
- Deploys consistently every time
- Can recreate the entire environment
The Business Case
1. Faster deployments
| Approach | Time to Deploy New Environment |
|---|---|
| Manual | 4-8 hours (if documented) |
| IaC | 15-30 minutes |
When a new client needs an isolated environment, or you need to spin up a demo, IaC makes it trivial.
2. Fewer errors
Manual deployments have a ~5-10% error rate on complex changes. IaC can bring this below 1% because:
- Changes are tested in staging first
- Code review catches mistakes
- The same code runs every time
3. Disaster recovery
If your AWS account got compromised tomorrow, could you rebuild?
With IaC: Run cdk deploy against a clean account to recreate the infrastructure, then restore your data from backups.
Without IaC: Days of manual reconstruction, assuming someone remembers how it was configured.
4. Knowledge transfer
When someone leaves the team, their infrastructure knowledge shouldn’t leave with them. With IaC, everything is documented in code.
Choosing a Tool
Three main options for AWS:
AWS CDK (Cloud Development Kit)
const bucket = new s3.Bucket(this, "DataBucket", {
  encryption: s3.BucketEncryption.S3_MANAGED,
  versioned: true,
  lifecycleRules: [{
    expiration: Duration.days(90),
    transitions: [{
      storageClass: s3.StorageClass.GLACIER,
      transitionAfter: Duration.days(30),
    }],
  }],
});
Best for: Teams comfortable with TypeScript/Python, complex infrastructure, AWS-only environments.
Strengths:
- Full programming language (loops, conditions, abstractions)
- Strong typing catches errors early
- High-level constructs reduce boilerplate
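To make the first strength concrete, here is a minimal sketch of a loop, a condition, and a small abstraction working together. To stay self-contained it uses plain objects rather than real CDK constructs; `BucketSpec`, `makeBucketSpec`, the team names, and the retention values are all hypothetical and only illustrate the idea.

```typescript
// Plain-object stand-in for a bucket definition. This is NOT a CDK type.
interface BucketSpec {
  id: string;
  versioned: boolean;
  expirationDays: number;
}

// Abstraction (hypothetical helper): one function encodes the team's
// conventions, so every bucket is declared the same way.
function makeBucketSpec(team: string, production: boolean): BucketSpec {
  return {
    id: team + "-data",
    // Condition: only production data is versioned (illustrative rule).
    versioned: production,
    // Illustrative retention values, not a recommendation.
    expirationDays: production ? 365 : 30,
  };
}

// Loop: one definition per team instead of copy-pasted blocks.
const teams = ["billing", "search", "reports"];
const bucketSpecs = teams.map((team) => makeBucketSpec(team, true));

console.log(bucketSpecs.map((spec) => spec.id).join(", "));
```

In a real CDK app the body of the helper would create an `s3.Bucket` instead of returning an object, but the shape of the code stays the same.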
Terraform
resource "aws_s3_bucket" "data" {
  bucket = "my-data-bucket"
}
resource "aws_s3_bucket_lifecycle_configuration" "data" {
  bucket = aws_s3_bucket.data.id
  rule {
    id     = "archive"
    status = "Enabled"
    # An empty filter applies the rule to every object in the bucket
    filter {}
    transition {
      days          = 30
      storage_class = "GLACIER"
    }
    expiration {
      days = 90
    }
  }
}
Best for: Multi-cloud environments, teams with existing Terraform experience.
Strengths:
- Cloud-agnostic
- Mature ecosystem
- Large community
CloudFormation
Resources:
  DataBucket:
    Type: AWS::S3::Bucket
    Properties:
      BucketEncryption:
        ServerSideEncryptionConfiguration:
          - ServerSideEncryptionByDefault:
              SSEAlgorithm: AES256
      LifecycleConfiguration:
        Rules:
          - Status: Enabled
            Transitions:
              - StorageClass: GLACIER
                TransitionInDays: 30
            ExpirationInDays: 90
Best for: Simple setups, teams already invested in CloudFormation.
Strengths:
- Native AWS support
- No additional tooling required
- Direct AWS integration
Our recommendation
For most AWS projects, we recommend CDK. The productivity gains from using a real programming language outweigh the learning curve.
Getting Started
Project structure
infrastructure/
├── bin/
│   └── app.ts              # Entry point
├── lib/
│   ├── stacks/
│   │   ├── network.ts      # VPC, subnets, security groups
│   │   ├── database.ts     # RDS, ElastiCache
│   │   ├── compute.ts      # Fargate, Lambda
│   │   └── cdn.ts          # CloudFront, S3
│   └── constructs/
│       ├── static-site.ts
│       └── fargate-api.ts
├── cdk.json
├── package.json
└── tsconfig.json
Start simple
Don’t try to codify everything at once. Start with:
- One environment (production or staging)
- Core infrastructure (database, compute, networking)
- New resources only (don’t migrate existing manual resources immediately)
Example: Basic web app infrastructure
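// lib/stacks/app-stack.ts
import { Stack, StackProps } from "aws-cdk-lib";
import * as ec2 from "aws-cdk-lib/aws-ec2";
import * as ecr from "aws-cdk-lib/aws-ecr";
import * as ecs from "aws-cdk-lib/aws-ecs";
import * as ecs_patterns from "aws-cdk-lib/aws-ecs-patterns";
import * as rds from "aws-cdk-lib/aws-rds";
import { Construct } from "constructs";

export interface AppStackProps extends StackProps {
  // ECR repository that holds the application image
  repository: ecr.IRepository;
}

export class AppStack extends Stack {
  constructor(scope: Construct, id: string, props: AppStackProps) {
    super(scope, id, props);

    // Networking: two AZs sharing a single NAT gateway
    const vpc = new ec2.Vpc(this, "Vpc", {
      maxAzs: 2,
      natGateways: 1,
    });

    // Database: the default VPC layout creates public and private-with-egress
    // subnets but no isolated ones, so the instance goes in the private subnets
    const database = new rds.DatabaseInstance(this, "Database", {
      engine: rds.DatabaseInstanceEngine.postgres({
        version: rds.PostgresEngineVersion.VER_15,
      }),
      instanceType: ec2.InstanceType.of(ec2.InstanceClass.T4G, ec2.InstanceSize.MICRO),
      vpc,
      vpcSubnets: { subnetType: ec2.SubnetType.PRIVATE_WITH_EGRESS },
      credentials: rds.Credentials.fromGeneratedSecret("postgres"),
    });

    // Application
    const cluster = new ecs.Cluster(this, "Cluster", { vpc });
    const service = new ecs_patterns.ApplicationLoadBalancedFargateService(this, "Service", {
      cluster,
      memoryLimitMiB: 512,
      cpu: 256,
      taskImageOptions: {
        image: ecs.ContainerImage.fromEcrRepository(props.repository),
        secrets: {
          // The generated secret is a JSON document, so inject individual
          // fields; the application assembles its connection string from them
          DB_HOST: ecs.Secret.fromSecretsManager(database.secret!, "host"),
          DB_USERNAME: ecs.Secret.fromSecretsManager(database.secret!, "username"),
          DB_PASSWORD: ecs.Secret.fromSecretsManager(database.secret!, "password"),
        },
      },
    });

    // Allow the Fargate tasks to reach the database on its default port
    database.connections.allowDefaultPortFrom(service.service);
  }
}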
Common Patterns
Environment separation
// bin/app.ts
// Account IDs below are placeholders; AWS account IDs are 12 digits
const app = new cdk.App();
new AppStack(app, "AppStaging", {
  env: { account: "123456789012", region: "us-east-1" },
  environment: "staging",
  instanceSize: "micro",
});
new AppStack(app, "AppProduction", {
  env: { account: "210987654321", region: "us-east-1" },
  environment: "production",
  instanceSize: "small",
});
Same code, different configurations. Staging and production stay in sync.
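One way to keep those per-environment values in a single place is a typed lookup table. The sketch below is plain TypeScript with no CDK dependency; `EnvName`, `EnvConfig`, `ENV_CONFIGS`, and `stackIdFor` are hypothetical names, and the account IDs are placeholders.

```typescript
type EnvName = "staging" | "production";

interface EnvConfig {
  account: string; // placeholder 12-digit AWS account ID
  region: string;
  instanceSize: "micro" | "small";
}

// Every environment must appear here; a missing or misspelled key
// is a compile-time error thanks to Record<EnvName, ...>.
const ENV_CONFIGS: Record<EnvName, EnvConfig> = {
  staging: { account: "123456789012", region: "us-east-1", instanceSize: "micro" },
  production: { account: "210987654321", region: "us-east-1", instanceSize: "small" },
};

// Derives the stack ID used above, e.g. "staging" -> "AppStaging".
function stackIdFor(env: EnvName): string {
  return "App" + env.charAt(0).toUpperCase() + env.slice(1);
}

const envNames: EnvName[] = ["staging", "production"];
for (const env of envNames) {
  const config = ENV_CONFIGS[env];
  // In a CDK app this is where the stack for each environment would be created.
  console.log(stackIdFor(env), config.account, config.region, config.instanceSize);
}
```

Adding a third environment then means adding one entry to the table rather than copying a block of stack code.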
Secrets management
Never hardcode secrets:
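// Bad: the key is committed to Git along with the code
const apiKey = "sk-1234567890";
// Good: reference the secret by name; its value never appears in the code
const apiKey = secretsmanager.Secret.fromSecretNameV2(
  this, "ApiKey", "/myapp/api-key"
);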
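On the application side, a secret injected by the platform (for example through the ECS `secrets` option used in the web app example above) arrives as an environment variable. A minimal sketch of reading it defensively might look like this; `requireSecret` and `EnvVars` are hypothetical names, and the environment is passed in as a plain object so the function is easy to test.

```typescript
// Environment variables as a plain map, so the function can be exercised
// without touching the real process environment.
type EnvVars = { [key: string]: string | undefined };

// Hypothetical helper: return the secret, or fail fast with a clear message
// instead of letting an undefined value surface later as a confusing error.
function requireSecret(env: EnvVars, key: string): string {
  const value = env[key];
  if (value === undefined || value === "") {
    throw new Error("Missing required secret: " + key);
  }
  return value;
}

// Usage with a fake environment; in a Node.js service you would pass process.env.
const fakeEnv: EnvVars = { API_KEY: "value-injected-at-runtime" };
const apiKey = requireSecret(fakeEnv, "API_KEY");

// Log that the secret loaded, never the secret itself.
console.log("API key loaded:", apiKey.length > 0);
```

Failing at startup when a secret is missing tends to be easier to debug than a failed request hours later.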
Cross-stack references
// Network stack exports VPC
export class NetworkStack extends Stack {
  public readonly vpc: ec2.Vpc;
  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);
    this.vpc = new ec2.Vpc(this, "Vpc", { maxAzs: 2 });
  }
}
// App stack imports VPC
const networkStack = new NetworkStack(app, "Network");
new AppStack(app, "App", { vpc: networkStack.vpc });
Deployment Workflow
Local development
# See what will change
cdk diff
# Deploy to your account
cdk deploy
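Conceptually, a diff compares what is deployed with what the code now describes and sorts the differences into creates, updates, and deletes. The sketch below shows that idea on plain objects. It is not how `cdk diff` is implemented (the real command compares CloudFormation templates); `ResourceMap`, `DiffPlan`, `planChanges`, and the sample resources are hypothetical.

```typescript
// Logical ID -> flat properties. A simplified stand-in for a template.
type ResourceProps = { [prop: string]: string };
type ResourceMap = { [logicalId: string]: ResourceProps };

interface DiffPlan {
  toCreate: string[];
  toUpdate: string[];
  toDelete: string[];
}

// True when both property sets have the same keys and values.
function sameProps(a: ResourceProps, b: ResourceProps): boolean {
  const aKeys = Object.keys(a);
  if (aKeys.length !== Object.keys(b).length) return false;
  for (const key of aKeys) {
    if (a[key] !== b[key]) return false;
  }
  return true;
}

function planChanges(deployed: ResourceMap, desired: ResourceMap): DiffPlan {
  const plan: DiffPlan = { toCreate: [], toUpdate: [], toDelete: [] };
  for (const id of Object.keys(desired)) {
    const wanted = desired[id];
    const current = deployed[id];
    if (wanted === undefined) continue;
    if (current === undefined) {
      plan.toCreate.push(id); // in code, not yet deployed
    } else if (!sameProps(current, wanted)) {
      plan.toUpdate.push(id); // deployed, but properties differ
    }
  }
  for (const id of Object.keys(deployed)) {
    if (desired[id] === undefined) {
      plan.toDelete.push(id); // deployed, but removed from code
    }
  }
  return plan;
}

const deployed: ResourceMap = {
  Database: { instanceType: "t4g.micro" },
  OldQueue: { visibilityTimeout: "30" },
};
const desired: ResourceMap = {
  Database: { instanceType: "t4g.small" },
  DataBucket: { versioned: "true" },
};
const plan = planChanges(deployed, desired);
console.log(plan);
```

Here the plan would create `DataBucket`, update `Database`, and delete `OldQueue`, which is the kind of summary worth reading before every deploy.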
CI/CD integration
# .github/workflows/infrastructure.yml
name: Infrastructure
on:
  push:
    branches: [main]
    paths: ['infrastructure/**']
# Required so the job can assume the AWS role through OIDC
permissions:
  id-token: write
  contents: read
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ secrets.AWS_ROLE_ARN }}
          aws-region: us-east-1
      - name: Install dependencies
        run: cd infrastructure && npm ci
      - name: CDK diff
        run: cd infrastructure && npx cdk diff
      - name: CDK deploy
        # --all is needed when the app defines more than one stack
        run: cd infrastructure && npx cdk deploy --all --require-approval never
Migration Strategy
For existing manual infrastructure:
Phase 1: Import
Define the existing resources in your stack code, then use cdk import to bring them under IaC management:
cdk import AppStack
# Follow prompts to import existing resources
Phase 2: Parity
Ensure the IaC definition matches reality:
cdk diff
# Should show no changes if import was successful
Phase 3: Iteration
Now all changes go through code:
# Make change in code
git commit -am "Increase database instance size"
git push
# CI/CD deploys the change
Mistakes to Avoid
1. Giant monolithic stacks
// Bad: Everything in one stack
class EverythingStack extends Stack {
  // 2000 lines of resources
}
// Good: Logical separation
class NetworkStack extends Stack { }
class DatabaseStack extends Stack { }
class ComputeStack extends Stack { }
2. Ignoring state management
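Both tools depend on state. Terraform records it in a state file (local by default, or in a remote backend such as S3). CDK keeps no state file of its own: it deploys through CloudFormation, which tracks the resources in each stack. Understand where state lives and how to handle state conflicts.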
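A common source of state conflicts is two deployments touching the same stack at once. Terraform backends can lock state during an operation, and CloudFormation rejects an update to a stack that is already being updated. The sketch below models that idea with an in-memory lock table; `StackLockTable` and the owner names are hypothetical, and real tools persist the lock remotely rather than in process memory.

```typescript
// In-memory stand-in for a lock table. Illustration only.
class StackLockTable {
  private holders: { [stackName: string]: string } = {};

  // Returns true if the caller now holds the lock, false if someone else does.
  acquire(stackName: string, owner: string): boolean {
    const current = this.holders[stackName];
    if (current !== undefined && current !== owner) {
      return false;
    }
    this.holders[stackName] = owner;
    return true;
  }

  // Only the current holder may release the lock.
  release(stackName: string, owner: string): boolean {
    if (this.holders[stackName] !== owner) {
      return false;
    }
    delete this.holders[stackName];
    return true;
  }
}

const locks = new StackLockTable();
const first = locks.acquire("AppProduction", "alice-laptop"); // true: lock was free
const second = locks.acquire("AppProduction", "ci-run");      // false: alice holds it
locks.release("AppProduction", "alice-laptop");
const third = locks.acquire("AppProduction", "ci-run");       // true: lock was released
console.log(first, second, third);
```

The practical takeaway is to route deployments through one pipeline where possible, so that a rejected second deployment is a rare event rather than a daily one.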
3. No environment separation
// Bad: Hardcoded values
const instanceType = "t4g.large";
// Good: Environment-specific configuration
const instanceType = props.environment === "production"
  ? "t4g.large"
  : "t4g.micro";
4. Skipping code review
Infrastructure changes should get the same review rigor as application code. A bad infrastructure change can take down production.
Measuring Success
After implementing IaC, track:
| Metric | Before IaC | After IaC |
|---|---|---|
| Deployment time | Hours | Minutes |
| Failed deployments | 5-10% | <1% |
| Time to spin up new env | 4-8 hours | 15-30 minutes |
| Configuration drift incidents | Monthly | Rare |
| Bus factor | 1-2 people | Team-wide |
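The first two rows are straightforward to compute from a deployment log. The sketch below assumes each deployment is recorded with a duration and an outcome; `DeploymentRecord`, both functions, and the sample numbers are hypothetical and do not come from real measurements.

```typescript
interface DeploymentRecord {
  durationMinutes: number;
  succeeded: boolean;
}

// Share of deployments that failed, as a percentage (0 for an empty log).
function failureRatePercent(records: DeploymentRecord[]): number {
  if (records.length === 0) return 0;
  let failures = 0;
  for (const record of records) {
    if (!record.succeeded) failures += 1;
  }
  return (failures / records.length) * 100;
}

// Mean deployment duration in minutes (0 for an empty log).
function averageDurationMinutes(records: DeploymentRecord[]): number {
  if (records.length === 0) return 0;
  let total = 0;
  for (const record of records) {
    total += record.durationMinutes;
  }
  return total / records.length;
}

// Made-up sample data for illustration.
const sampleDeployments: DeploymentRecord[] = [
  { durationMinutes: 10, succeeded: true },
  { durationMinutes: 20, succeeded: true },
  { durationMinutes: 30, succeeded: false },
  { durationMinutes: 20, succeeded: true },
];

console.log("Failure rate (%):", failureRatePercent(sampleDeployments));
console.log("Average duration (min):", averageDurationMinutes(sampleDeployments));
```

Recording the same numbers before the migration gives you a baseline to compare against.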
Lessons Learned
- Start with new projects. Migrating existing infrastructure is harder than starting fresh.
- Invest in learning. IaC has a learning curve, but it pays off quickly.
- Don’t over-engineer. Start simple, add complexity as needed.
- Test in staging. Never deploy infrastructure changes directly to production.
- Document decisions. Code shows what, comments explain why.
Manual infrastructure is technical debt that compounds over time. Infrastructure as Code is an investment that pays dividends with every deployment.