End-to-End Cloud-Native Deployment

Building Scalable & Secure Cloud Infrastructure with Terraform and AWS


Introduction

This technical walkthrough demonstrates how to deploy a secure, scalable microservice on AWS using infrastructure-as-code (Terraform), containerization (Docker), and serverless computing (ECS Fargate). We’ll dissect the architecture, code, and DevOps practices that ensure reliability and security in production environments.


Architecture Overview

Key Components:

  • Public Subnets (2): Host Application Load Balancer (ALB)

  • Private Subnets (2): Run ECS Fargate tasks (isolated)

  • NAT Gateway: Allows outbound traffic from private subnets

  • Security Groups: Layer 4 firewall rules


Docker Implementation

Dockerfile

# Stage 1: Build environment
FROM python:3.9-slim AS builder
WORKDIR /app
COPY requirements.txt .
# Install into /root/.local so only the dependencies are copied to the runtime stage
RUN pip install --user --no-cache-dir -r requirements.txt

# Stage 2: Runtime environment
FROM python:3.9-slim
WORKDIR /app

# Create non-root user
RUN useradd -m appuser && \
    mkdir -p /home/appuser/.local && \
    chown -R appuser:appuser /app /home/appuser

# Copy dependencies and code
COPY --from=builder --chown=appuser:appuser /root/.local /home/appuser/.local
COPY --chown=appuser:appuser src/ .

# Drop privileges (Dockerfile comments must sit on their own line)
USER appuser
ENV PATH=/home/appuser/.local/bin:$PATH

EXPOSE 8000
# Start FastAPI on the exposed port
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]

Best Practices:

  • Multi-Stage Build: Reduces final image size (98MB vs 350MB)

  • Non-Root User: Mitigates container breakout risks

  • Dependency Isolation: Prevents version conflicts


Terraform Infrastructure

1. VPC Module (modules/vpc/main.tf)

resource "aws_vpc" "main" {
  cidr_block           = "10.0.0.0/16"
  enable_dns_hostnames = true  # Required for ECS service discovery
}

resource "aws_subnet" "private" {
  count             = 2
  vpc_id            = aws_vpc.main.id
  cidr_block        = cidrsubnet(aws_vpc.main.cidr_block, 8, count.index + 2)
  availability_zone = data.aws_availability_zones.available.names[count.index]
}

resource "aws_nat_gateway" "main" {
  allocation_id = aws_eip.nat.id
  subnet_id     = aws_subnet.public[0].id  # Single NAT for cost optimization
}

Network Design:

  • CIDR: 10.0.0.0/16 (65k IPs)

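  • AZs: Subnets span two Availability Zones for high availability

  • NAT: A single gateway in the first public subnet carries outbound traffic for both private subnets (cheaper, but a single-AZ dependency)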
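
The VPC snippet above refers to `data.aws_availability_zones.available`, `aws_subnet.public`, and `aws_eip.nat` without showing them. The following is a minimal sketch of what those supporting resources might look like, not the author's actual module. The public subnet indexes (0 and 1), the internet gateway, and the route tables are assumptions chosen so they do not overlap with the private subnets, which `cidrsubnet` places at 10.0.2.0/24 and 10.0.3.0/24.

```hcl
# Assumed supporting resources for modules/vpc/main.tf (illustrative only).
data "aws_availability_zones" "available" {
  state = "available"
}

resource "aws_internet_gateway" "main" {
  vpc_id = aws_vpc.main.id
}

resource "aws_subnet" "public" {
  count  = 2
  vpc_id = aws_vpc.main.id
  # cidrsubnet("10.0.0.0/16", 8, n) yields 10.0.n.0/24, so these are 10.0.0.0/24 and 10.0.1.0/24
  cidr_block              = cidrsubnet(aws_vpc.main.cidr_block, 8, count.index)
  availability_zone       = data.aws_availability_zones.available.names[count.index]
  map_public_ip_on_launch = true
}

resource "aws_eip" "nat" {
  domain = "vpc" # AWS provider v5+; earlier versions used `vpc = true`
}

# Public subnets route to the internet gateway.
resource "aws_route_table" "public" {
  vpc_id = aws_vpc.main.id
  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.main.id
  }
}

# Private subnets route outbound traffic through the single NAT gateway.
resource "aws_route_table" "private" {
  vpc_id = aws_vpc.main.id
  route {
    cidr_block     = "0.0.0.0/0"
    nat_gateway_id = aws_nat_gateway.main.id
  }
}

resource "aws_route_table_association" "public" {
  count          = 2
  subnet_id      = aws_subnet.public[count.index].id
  route_table_id = aws_route_table.public.id
}

resource "aws_route_table_association" "private" {
  count          = 2
  subnet_id      = aws_subnet.private[count.index].id
  route_table_id = aws_route_table.private.id
}
```

Without the private route table, tasks in the private subnets would have no path to pull images or ship logs, which typically shows up as tasks stuck in a pending state.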

2. ECS Cluster (modules/ecs/main.tf)

resource "aws_ecs_cluster" "main" {
  name = "${var.env}-cluster"
  setting {
    name  = "containerInsights"
    value = "enabled"  # CloudWatch metrics
  }
}

resource "aws_ecs_task_definition" "app" {
  family                   = "${var.env}-task"
  requires_compatibilities = ["FARGATE"]  # Needed to run the task on Fargate
  cpu                      = 256   # CPU units (256 = 0.25 vCPU)
  memory                   = 512   # In MiB
  network_mode             = "awsvpc"
  execution_role_arn       = aws_iam_role.ecs_exec.arn

  container_definitions = jsonencode([{
    name      = "app",
    image     = var.container_image,
    essential = true,
    portMappings = [{
      containerPort = 8000,
      hostPort      = 8000  # Must match containerPort in awsvpc mode
    }],
    logConfiguration = {
      logDriver = "awslogs",
      options = {
        "awslogs-group"         = "/ecs/${var.env}-task",
        "awslogs-region"        = var.region,
        "awslogs-stream-prefix" = "app"  # Required on Fargate; the value "app" is an arbitrary choice
      }
    }
  }])
}

Fargate Configuration:

  • vCPU/Memory: 0.25 vCPU (256 CPU units) with 512 MiB (0.5 GB), a valid Fargate combination

  • Networking: awsvpc mode for ENI per task

  • Logging: CloudWatch integration
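
A task definition alone does not run anything; an ECS service is what places tasks in the private subnets and registers them with the load balancer. The sketch below shows one way that could look. The service name, `desired_count`, the log retention period, and the `var.private_subnet_ids` input are assumptions for illustration and do not appear in the original code.

```hcl
# Log group referenced by the task definition's awslogs configuration.
resource "aws_cloudwatch_log_group" "app" {
  name              = "/ecs/${var.env}-task"
  retention_in_days = 14 # assumed retention period
}

resource "aws_ecs_service" "app" {
  name            = "${var.env}-service" # hypothetical name
  cluster         = aws_ecs_cluster.main.id
  task_definition = aws_ecs_task_definition.app.arn
  desired_count   = 2 # assumed: one task per private subnet
  launch_type     = "FARGATE"

  network_configuration {
    subnets          = var.private_subnet_ids # assumed module input
    security_groups  = [aws_security_group.ecs.id]
    assign_public_ip = false # tasks reach the internet via the NAT gateway
  }

  load_balancer {
    target_group_arn = aws_lb_target_group.main.arn
    container_name   = "app" # must match the container name in the task definition
    container_port   = 8000
  }

  # The target group must be attached to a listener before the service registers tasks.
  depends_on = [aws_lb_listener.http]
}
```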

3. Load Balancer (modules/ecs/alb.tf)

resource "aws_lb" "main" {
  name               = "${var.env}-alb"
  subnets            = var.public_subnet_ids
  security_groups    = [aws_security_group.alb.id]
  internal           = false  # Internet-facing
}

resource "aws_lb_listener" "http" {
  load_balancer_arn = aws_lb.main.arn
  port              = 80
  protocol          = "HTTP"

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.main.arn
  }
}

Traffic Flow:

  1. ALB receives HTTP traffic on port 80

  2. Routes to target group on port 8000

  3. Target group health checks /health endpoint
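
The listener forwards to `aws_lb_target_group.main`, which is not shown above. A minimal sketch of a target group matching the traffic flow described here might look like the following. The `var.vpc_id` input and the health check thresholds are assumptions. Fargate tasks in `awsvpc` mode register by IP address, so `target_type` is set to `"ip"`.

```hcl
resource "aws_lb_target_group" "main" {
  name        = "${var.env}-tg" # hypothetical name
  port        = 8000
  protocol    = "HTTP"
  vpc_id      = var.vpc_id # assumed module input
  target_type = "ip"       # awsvpc tasks are registered by ENI IP, not instance ID

  health_check {
    path                = "/health"
    matcher             = "200"
    interval            = 30 # seconds between checks (assumed)
    timeout             = 5
    healthy_threshold   = 2
    unhealthy_threshold = 3
  }
}
```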


Security Implementation

1. IAM Roles

resource "aws_iam_role" "ecs_exec" {
  assume_role_policy = jsonencode({
    Version = "2012-10-17",
    Statement = [{
      Action = "sts:AssumeRole",
      Effect = "Allow",
      Principal = {
        Service = "ecs-tasks.amazonaws.com"  # Only ECS tasks may assume this role
      }
    }]
  })
}

Policy Restrictions:

  • ECS Tasks: Can’t modify infrastructure

  • Secrets: Pull from AWS Secrets Manager (optional)
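
The role above defines only who may assume it; it grants no permissions yet. For the execution role to pull images from ECR and write to CloudWatch Logs, a common approach is to attach the AWS managed execution policy, sketched below. Whether the original repository does this, or uses a custom policy instead, is not shown in the article.

```hcl
# Grants ECR pull and CloudWatch Logs write permissions to the execution role.
resource "aws_iam_role_policy_attachment" "ecs_exec" {
  role       = aws_iam_role.ecs_exec.name
  policy_arn = "arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy"
}
```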

2. Security Groups

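# ALB Security Group
resource "aws_security_group" "alb" {
  vpc_id = var.vpc_id  # Assumed module input; without it the group is created in the default VPC

  ingress {
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]  # Public access
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

# ECS Security Group
resource "aws_security_group" "ecs" {
  vpc_id = var.vpc_id  # Assumed module input

  ingress {
    from_port       = 8000
    to_port         = 8000
    protocol        = "tcp"
    security_groups = [aws_security_group.alb.id]  # Only ALB access
  }

  # Terraform removes the default allow-all egress rule, so outbound access
  # (image pulls, CloudWatch Logs) has to be declared explicitly
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}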

Zero-Trust Model:

  • ALB: Open inbound HTTP (port 80)

  • ECS: Only allows ALB traffic (port 8000)


Deployment Workflow

# 1. Build & Push Docker Image
docker build -t myrepo/simple-time-service:latest .
docker push myrepo/simple-time-service:latest

# 2. Terraform Deployment
terraform init
terraform plan -var="container_image=myrepo/simple-time-service:latest"
terraform apply -var="container_image=myrepo/simple-time-service:latest"

# 3. Verify
ALB_DNS=$(terraform output -raw alb_dns_name)
curl -v http://$ALB_DNS/health  # Expected: {"status": "healthy"}
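
The `terraform output -raw alb_dns_name` call above, and the `target_group_arn` output used in the troubleshooting steps, only work if those outputs are declared. A sketch of how they might be wired is shown below. The file names and the module label `ecs` are assumptions, since the article does not show the root module.

```hcl
# modules/ecs/outputs.tf (assumed file name)
output "alb_dns_name" {
  description = "Public DNS name of the load balancer"
  value       = aws_lb.main.dns_name
}

output "target_group_arn" {
  description = "ARN of the target group fronting the ECS tasks"
  value       = aws_lb_target_group.main.arn
}
```

```hcl
# Root outputs.tf, assuming the module is instantiated as module "ecs"
output "alb_dns_name" {
  value = module.ecs.alb_dns_name
}

output "target_group_arn" {
  value = module.ecs.target_group_arn
}
```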

Troubleshooting 503 Errors

Diagnosis Steps:

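  1. Target Group Health
aws elbv2 describe-target-health --target-group-arn $(terraform output -raw target_group_arn)
  2. ECS Task Logs
aws logs tail "/ecs/prod-task" --follow
  3. Network Connectivity
# Fargate tasks have no EC2 instance to SSH into; open a shell in the container with ECS Exec instead
# (requires enable_execute_command on the service; <task-id> is a placeholder)
aws ecs execute-command --cluster prod-cluster --task <task-id> --container app --interactive --command "/bin/sh"
# Then, inside the container (python:3.9-slim ships without curl):
#   python -c "import urllib.request; print(urllib.request.urlopen('http://localhost:8000/health').read())"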

Common Fixes:

  • Security Groups: Allow ALB → ECS traffic

  • Task Definition: Correct containerPort mapping

  • IAM Roles: Add ecs-tasks.amazonaws.com trust


Best Practices Checklist

| Category      | Practice                      | Implementation Example                  |
|---------------|-------------------------------|-----------------------------------------|
| Security      | Non-root containers           | USER appuser in Dockerfile              |
| Cost          | Fargate spot instances        | Add capacity_provider_strategy          |
| Reliability   | Multi-AZ deployment           | aws_subnet.private[*].availability_zone |
| Observability | CloudWatch Container Insights | setting { name = "containerInsights" }  |

Conclusion

This implementation showcases critical DevOps principles:

  1. Infrastructure-as-Code: Terraform manages 20+ AWS resources declaratively

  2. Secure by Default: Zero-trust networking, least privilege IAM

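  3. Cloud-Native: Serverless Fargate tasks need no server management and can scale automatically once an Application Auto Scaling policy is attached to the service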
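
Fargate does not change the task count on its own; scaling comes from Application Auto Scaling acting on the ECS service. The following is a minimal sketch of a CPU target-tracking policy. It assumes a service named `${var.env}-service`, and the capacity bounds and the 60% target are illustrative values that do not come from the article.

```hcl
resource "aws_appautoscaling_target" "ecs" {
  service_namespace  = "ecs"
  resource_id        = "service/${aws_ecs_cluster.main.name}/${var.env}-service" # assumed service name
  scalable_dimension = "ecs:service:DesiredCount"
  min_capacity       = 2 # assumed bounds
  max_capacity       = 6
}

resource "aws_appautoscaling_policy" "cpu" {
  name               = "${var.env}-cpu-target-tracking"
  policy_type        = "TargetTrackingScaling"
  service_namespace  = aws_appautoscaling_target.ecs.service_namespace
  resource_id        = aws_appautoscaling_target.ecs.resource_id
  scalable_dimension = aws_appautoscaling_target.ecs.scalable_dimension

  target_tracking_scaling_policy_configuration {
    predefined_metric_specification {
      predefined_metric_type = "ECSServiceAverageCPUUtilization"
    }
    target_value = 60 # add or remove tasks to hold average CPU near 60%
  }
}
```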


GitHub Repository: Particle41 DevOps Challenge
AWS Documentation: ECS Best Practices



AI-Native Infrastructure & Security Architecture Research | Subhanshu Mohan Gupta

Part 19 of 50

Independent research and deep technical exploration of AI-driven DevSecOps, resilient cloud architecture, cross-chain systems and large-scale distributed architecture.
