Build a Scalable Web Application with Auto Scaling on AWS using Terraform

In this article, we use Terraform to launch a scalable web application on AWS. By the end, we will have set up EC2 instances, an Elastic Load Balancer (ELB), and an Auto Scaling Group so that the application scales dynamically in response to incoming traffic. This configuration keeps costs down during periods of low demand while maintaining performance during traffic surges.

Project Description

EC2 Instances to host the web application.
Elastic Load Balancer (ELB), here an Application Load Balancer (ALB), to distribute traffic across multiple instances.
Auto Scaling Group (ASG) to dynamically adjust the number of EC2 instances based on traffic.
CloudWatch for monitoring and triggering auto scaling actions.

Auto Scaling automatically adjusts the number of EC2 instances based on the incoming traffic, ensuring that your app remains highly available without manual intervention. Imagine it’s Black Friday and your site experiences a sudden surge in visitors. Auto Scaling ensures that your infrastructure can handle the load and can scale back when the traffic drops.

Terraform allows you to manage this scaling infrastructure with code, making it easy to replicate and adjust as needed.

Project Structure

├── env
│   └── dev
│       ├── main.tf
│       ├── outputs.tf
│       ├── terraform.tf
│       ├── terraform.tfvars
│       └── vars.tf
├── modules
│   ├── alb
│   ├── asg
│   ├── cloud_watch
│   ├── launch_template
│   ├── shared
│   │   └── security_groups
│   └── vpc
└── templates
    └── user_data.sh
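
The tree lists a terraform.tf file under env/dev that the article does not show. A minimal sketch of what such a file typically contains follows. The version constraints, the aws_region variable, and its default are assumptions for illustration, not the project's actual settings.

```hcl
# env/dev/terraform.tf (sketch; version constraints are assumptions)
terraform {
  required_version = ">= 1.5.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}

# Hypothetical variable; it could equally be declared in vars.tf
variable "aws_region" {
  description = "AWS region to deploy into"
  type        = string
  default     = "us-east-1"
}

provider "aws" {
  region = var.aws_region
}
```

Pinning the provider keeps later runs of terraform init from silently picking up a new major version with changed resource arguments.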

Step 1: VPC

In this section, we configure the Virtual Private Cloud (VPC) and its associated resources. The code can seem complex, but let’s break it down to understand how it fits together.

First, we create a VPC using the aws_vpc resource.

Next, we compute the number of AZs to use as the smaller of the desired_azs variable and the number of AZs available in the region, then slice the list of AZ names down to that count.

For subnet creation, we define both public and private subnets in locals, and the subnet_definitions map merges them under unique keys such as public-0 and private-0. The aws_subnet resource iterates over this map with for_each, deriving each cidr_block with cidrsubnet from the subnet's cidr_add offset (idx for private subnets, 100 + idx for public ones) and assigning AZs round-robin. Public subnets reach the internet through an Internet Gateway (IGW), while private subnets route outbound traffic through a NAT Gateway.

The route tables are created with aws_route_table by iterating over the combined local, which yields one table per gateway. Public subnets are associated with the table whose default route targets the IGW, while private subnets are associated with the table that targets the NAT Gateway. Generating subnets and routing from input variables keeps the network module reusable: changing the subnet counts or the AZ count requires no changes to the resources themselves.
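
To make the AZ and CIDR arithmetic concrete before the full listing, here is a small standalone sketch that uses no provider. The CIDR range, the AZ names, and the desired_azs value are illustrative assumptions; only the functions (min, slice, cidrsubnet) and the offsets (idx and 100 + idx) come from the module.

```hcl
# Standalone sketch: every literal value below is an illustrative assumption
locals {
  example_cidr = "10.0.0.0/16"
  example_azs  = ["us-east-1a", "us-east-1b", "us-east-1c"]
  desired_azs  = 2

  # Never use more AZs than the region offers
  effective_azs = min(local.desired_azs, length(local.example_azs)) # 2
  az_list       = slice(local.example_azs, 0, local.effective_azs)  # ["us-east-1a", "us-east-1b"]

  # cidrsubnet(prefix, newbits, netnum): a /16 plus 8 new bits gives /24 networks
  private_0 = cidrsubnet(local.example_cidr, 8, 0)   # "10.0.0.0/24"
  private_1 = cidrsubnet(local.example_cidr, 8, 1)   # "10.0.1.0/24"
  public_0  = cidrsubnet(local.example_cidr, 8, 100) # "10.0.100.0/24"
  public_1  = cidrsubnet(local.example_cidr, 8, 101) # "10.0.101.0/24"

  # A subnet with idx 2 wraps around to the first AZ when two AZs are in use
  az_for_idx_2 = local.az_list[2 % length(local.az_list)] # "us-east-1a"
}

output "example" {
  value = {
    az_list      = local.az_list
    private      = [local.private_0, local.private_1]
    public       = [local.public_0, local.public_1]
    az_for_idx_2 = local.az_for_idx_2
  }
}
```

The same expressions can be evaluated one at a time in terraform console. Because private subnets start at offset 0 and public subnets at offset 100, the two ranges do not collide as long as fewer than 100 private subnets are requested.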

# Create a Virtual Private Cloud (VPC)
resource "aws_vpc" "vpc" {
  cidr_block           = var.cidr_block
  instance_tenancy     = "default"
  enable_dns_hostnames = var.dns_hostnames
  tags = {
    Name        = "${var.environment}-vpc"
    Environment = var.environment
  }
}

# Fetch available AZs in the chosen region
data "aws_availability_zones" "available_azs" {
  state = "available"
}

# Calculate effective AZs and limit based on the region's available AZs
locals {
  effective_azs = min(var.desired_azs, length(data.aws_availability_zones.available_azs.names))
  az_list       = slice(data.aws_availability_zones.available_azs.names, 0, local.effective_azs)
}

# **************** Subnet Creation (Public and Private) ********************* #
# Subnet data for public and private subnets
locals {
  public_subnets = [
    for idx in range(var.public_subnets_no) : {
      idx      = idx
      name     = "public"
      cidr_add = 100 + idx
    }
  ]
  private_subnets = [
    for idx in range(var.private_subnets_no) : {
      idx      = idx
      name     = "private"
      cidr_add = idx
    }
  ]
  # Combines both public and private subnet definitions into one map with unique keys
  subnet_definitions = merge(
    { for idx, subnet in local.public_subnets : "public-${idx}" => subnet },
    { for idx, subnet in local.private_subnets : "private-${idx}" => subnet }
  )
}

# Create subnets dynamically
resource "aws_subnet" "subnets" {
  for_each          = local.subnet_definitions
  vpc_id            = aws_vpc.vpc.id
  cidr_block        = cidrsubnet(var.cidr_block, 8, each.value.cidr_add)
  availability_zone = local.az_list[each.value.idx % length(local.az_list)]
  tags = {
    Name        = "${var.environment}-${each.value.name}-subnet-${each.key}"
    Environment = var.environment
  }
}

# Create an Internet Gateway for external connectivity
resource "aws_internet_gateway" "internet_gateway" {
  vpc_id = aws_vpc.vpc.id
  tags = {
    Name        = "${var.environment}-igw"
    Environment = var.environment
  }
}

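# Allocate an Elastic IP for the NAT Gateway
resource "aws_eip" "nat_eip" {
  tags = {
    Name        = "${var.environment}-nat-eip"
    Environment = var.environment
  }
}

# Create the NAT Gateway in the first public subnet
resource "aws_nat_gateway" "nat_gateway" {
  subnet_id     = aws_subnet.subnets["public-0"].id
  allocation_id = aws_eip.nat_eip.id
  tags = {
    Name        = "${var.environment}-nat-gw"
    Environment = var.environment
  }

  # The NAT Gateway relies on the Internet Gateway, so create that first
  depends_on = [aws_internet_gateway.internet_gateway]
}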

# Create Route Tables
locals {
  combined = [
    { "type" = "public", "id" = aws_internet_gateway.internet_gateway.id },
    { "type" = "private", "id" = aws_nat_gateway.nat_gateway.id }
  ]
}

resource "aws_route_table" "route_tables" {
  for_each = { for idx, gateway in local.combined : idx => gateway }
  vpc_id   = aws_vpc.vpc.id
  route {
    cidr_block = "0.0.0.0/0"
    # Use `nat_gateway_id` for NAT Gateways and `gateway_id` for others
    nat_gateway_id = each.value.type == "private" ? each.value.id : null
    gateway_id     = each.value.type == "public" ? each.value.id : null
  }
  tags = {
    Name        = "${var.environment}-${each.value.type}-route-table"
    Environment = var.environment
  }
}

# Associate public subnets with the route table
resource "aws_route_table_association" "public_route_table_association" {
  # Loop through public subnets to associate each with the public route table
  for_each       = { for key, subnet in local.subnet_definitions : key => aws_subnet.subnets[key] if subnet.name == "public" }
  route_table_id = aws_route_table.route_tables["0"].id
  subnet_id      = each.value.id
}

# Associate private subnets with the route table
resource "aws_route_table_association" "private_route_table_association" {
  for_each       = { for key, subnet in local.subnet_definitions : key => aws_subnet.subnets[key] if subnet.name == "private" }
  route_table_id = aws_route_table.route_tables["1"].id
  subnet_id      = each.value.id
}
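
The listing above reads several input variables whose declarations are not shown. The following is a sketch of how they might be declared. The variable names come from the code, while the file name, types, defaults, and validation rules are assumptions. The upper bounds follow from the cidrsubnet call: with 8 new bits the network number must stay within 0 to 255, and private offsets must stay below the public starting offset of 100.

```hcl
# modules/vpc/vars.tf (hypothetical file name; defaults are assumptions)
variable "environment" {
  description = "Environment name used in resource tags, e.g. dev"
  type        = string
}

variable "cidr_block" {
  description = "CIDR range of the VPC"
  type        = string
  default     = "10.0.0.0/16"

  validation {
    condition     = can(cidrhost(var.cidr_block, 0))
    error_message = "cidr_block must be a valid CIDR range."
  }
}

variable "dns_hostnames" {
  description = "Whether to enable DNS hostnames in the VPC"
  type        = bool
  default     = true
}

variable "desired_azs" {
  description = "Number of availability zones to spread subnets across"
  type        = number
  default     = 2

  validation {
    condition     = var.desired_azs >= 1
    error_message = "desired_azs must be at least 1."
  }
}

variable "public_subnets_no" {
  description = "Number of public subnets (network numbers 100 + idx)"
  type        = number
  default     = 2

  validation {
    # At least one public subnet is needed for the NAT Gateway (public-0),
    # and 100 + idx must not exceed 255
    condition     = var.public_subnets_no >= 1 && var.public_subnets_no <= 156
    error_message = "public_subnets_no must be between 1 and 156."
  }
}

variable "private_subnets_no" {
  description = "Number of private subnets (network numbers idx)"
  type        = number
  default     = 2

  validation {
    # Offsets 0..99 stay clear of the public range that starts at 100
    condition     = var.private_subnets_no >= 0 && var.private_subnets_no <= 100
    error_message = "private_subnets_no must be between 0 and 100."
  }
}
```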

Outputs

Outputs are provided for the VPC ID, public subnet IDs, and private subnet IDs, enabling easy reference to these resources for further configuration.

output "vpc_id" {
  description = "The ID of the VPC."
  value       = aws_vpc.vpc.id
}

output "public_subnet_ids" {
  description = "A list of IDs for the public subnets."
  value = [
    for key, subnet in aws_subnet.subnets :
    subnet.id if local.subnet_definitions[key].name == "public"
  ]
}

output "private_subnet_ids" {
  description = "A list of IDs for the private subnets."
  value = [
    for key, subnet in aws_subnet.subnets :
    subnet.id if local.subnet_definitions[key].name == "private"
  ]
}

Step 2: Security Groups

The security group configuration lives in a shared module that is used by both the launch template and the Application Load Balancer (ALB). Its ingress rules are driven by the allow_internet_access variable: when it is true, traffic on each port in inbound_ports is permitted from any IPv4 address (0.0.0.0/0); when it is false, traffic is accepted only from the security group passed in security_group_ref_id, which for the instances is the ALB's security group. The instances are therefore reachable only through the load balancer.

resource "aws_security_group" "sg" {
  description = "Allow HTTP traffic inbound and all outbound"
  vpc_id      = var.vpc_id
}

# One ingress rule per port; the source is either the internet or a referenced security group
resource "aws_vpc_security_group_ingress_rule" "allow_http" {
  for_each                     = { for idx, port in var.inbound_ports : idx => port }
  security_group_id            = aws_security_group.sg.id
  cidr_ipv4                    = var.allow_internet_access ? "0.0.0.0/0" : null
  referenced_security_group_id = var.allow_internet_access ? null : var.security_group_ref_id
  from_port                    = each.value
  to_port                      = each.value
  ip_protocol                  = "tcp"
}

# Allow all outbound traffic
resource "aws_vpc_security_group_egress_rule" "app_allow_all_outbound_alb" {
  security_group_id = aws_security_group.sg.id
  cidr_ipv4         = "0.0.0.0/0"
  ip_protocol       = "-1"
}

Outputs

The output block exports the ID of the security group associated with the ALB or Launch Template, making it available for further use in other parts of the infrastructure.

output "security_group_id" {
  description = "The ID of the Application Load Balancer/Launch Template security group"
  value       = aws_security_group.sg.id
}
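
The shared module's inputs are not shown in the article. One plausible set of declarations is sketched below. The variable names match the code above, but the types and defaults are assumptions.

```hcl
# modules/shared/security_groups variables (sketch; defaults are assumptions)
variable "vpc_id" {
  description = "ID of the VPC the security group belongs to"
  type        = string
}

variable "inbound_ports" {
  description = "TCP ports to open for inbound traffic"
  type        = list(number)
  default     = [80]
}

variable "allow_internet_access" {
  description = "If true, allow inbound traffic from 0.0.0.0/0; otherwise only from security_group_ref_id"
  type        = bool
  default     = false
}

variable "security_group_ref_id" {
  description = "Source security group used when allow_internet_access is false"
  type        = string
  default     = null
}
```

Defaults let callers omit arguments they do not need, for example security_group_ref_id when allow_internet_access is true. Defaulting allow_internet_access to false is a deliberate choice in this sketch: a caller has to opt in to public exposure.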

Step 3: Creating a Launch Template

Next, we create a Launch Template in Terraform, which defines how EC2 instances in the Auto Scaling Group (ASG) are configured. The template references the shared security group module for network security and uses the most recent Ubuntu AMI that matches the distro_version variable. The instance type comes from a variable, and the user data installs and starts Nginx so that each instance serves a simple web page.

The user data writes the instance's hostname into the web page via $(hostname -f), so refreshing the page shows that the load balancer spreads requests across instances.

module "security_group" {
  source                = "../shared/security_groups"
  vpc_id                = var.vpc_id
  inbound_ports         = [80]
  allow_internet_access = false
  security_group_ref_id = var.alb_security_group_id
}

data "aws_ami" "ubuntu" {
  most_recent = true

  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd*/ubuntu-*-${var.distro_version}-amd64-server-*"]
  }

  owners = ["099720109477"] # Canonical
}


resource "aws_launch_template" "launch_template" {
  name = "${var.environment}-launch-template"

  disable_api_stop        = true
  disable_api_termination = false

  image_id      = data.aws_ami.ubuntu.id
  instance_type = var.instance_type

  vpc_security_group_ids = [module.security_group.security_group_id]

  user_data = base64encode(<<-EOT
    #!/bin/bash
    sudo apt update -y
    sudo apt upgrade -y
    sudo apt install -y nginx
    sudo systemctl start nginx
    sudo systemctl enable nginx
    echo "<h1>Hello from Terraform at $(hostname -f)</h1>" | sudo tee /var/www/html/index.html
  EOT
  )
}

Outputs

This output exposes the whole launch template object, so the Auto Scaling Group module can later reference attributes such as its id and latest_version.

output "launch_template" {
  value = aws_launch_template.launch_template
}
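
The project tree also lists templates/user_data.sh, while the listing above inlines the script as a heredoc. If the script were kept in that file instead, it could be loaded as sketched here. The relative path is an assumption based on the tree, and the file is assumed to hold the same shell script.

```hcl
# modules/launch_template (sketch): load user data from the templates directory
locals {
  # Assumed path from modules/launch_template up to the repository root
  user_data_path = "${path.module}/../../templates/user_data.sh"

  # filebase64 reads the file and returns its contents base64-encoded,
  # which is the encoding aws_launch_template expects for user_data
  user_data_b64 = filebase64(local.user_data_path)
}
```

With this in place, the launch template's user_data argument could be set to local.user_data_b64 in place of the base64encode heredoc. Keeping the script in its own file also makes it easier to lint and test separately.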

Step 4: Creating a Load Balancer

In this step, we set up the Application Load Balancer (ALB) to distribute incoming HTTP traffic across all EC2 instances. Similarly to our previous step with the launch template, we use the same shared security group module for the ALB. This highlights the Terraform code's modularity and reusability. The shared security group module allows us to declare the security group centrally, making the code more maintainable and less prone to errors.

The ALB is internet-facing (internal = false) and is attached to the subnets passed in through alb_subnets. We define a Target Group on port 80 with a health check whose path (/ in this setup), interval, timeout, and thresholds come from variables, so the ALB only routes traffic to instances that pass the check.

Finally, the ALB's listener forwards incoming HTTP traffic on port 80 to the Target Group, so the web application stays available while the Auto Scaling Group adds or removes EC2 instances.

module "security_group" {
  source                = "../shared/security_groups"
  vpc_id                = var.vpc_id
  inbound_ports         = [80] # matches the listener port below
  allow_internet_access = true
}

resource "aws_lb" "alb" {
  name               = "${var.environment}-web-server-alb"
  internal           = false
  load_balancer_type = "application"
  security_groups    = [module.security_group.security_group_id]
  subnets            = var.alb_subnets
}

resource "aws_lb_target_group" "asg_target_group" {
  name     = "${var.environment}-asg-target-group"
  port     = 80
  protocol = "HTTP"
  vpc_id   = var.vpc_id

  health_check {
    path                = var.health_check_path
    interval            = var.health_check_interval
    timeout             = var.health_check_timeout
    healthy_threshold   = var.healthy_threshold
    unhealthy_threshold = var.unhealthy_threshold
  }
}

resource "aws_lb_listener" "http_listener" {
  load_balancer_arn = aws_lb.alb.arn
  port              = 80
  protocol          = "HTTP"

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.asg_target_group.arn
  }
}

Outputs

The output block provides key information, including the ALB's security group ID, the details of the ALB target group, and the ALB's DNS name, which is used for routing traffic to the application.

output "alb_security_group_id" {
  value = module.security_group.security_group_id
}

output "alb_target_group" {
  value = aws_lb_target_group.asg_target_group
}

output "alb_dns" {
  value = aws_lb.alb.dns_name
}
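
The health check and subnet inputs used by the ALB module are not declared in the article. A sketch of possible declarations follows. The names are taken from the listing, while the types, defaults, and validation rules are assumptions. The two-subnet minimum reflects the requirement that an Application Load Balancer span at least two availability zones.

```hcl
# modules/alb variables (sketch; defaults are assumptions)
variable "alb_subnets" {
  description = "Public subnet IDs to attach the ALB to"
  type        = list(string)

  validation {
    condition     = length(var.alb_subnets) >= 2
    error_message = "An Application Load Balancer needs subnets in at least two availability zones."
  }
}

variable "health_check_path" {
  description = "HTTP path the target group probes"
  type        = string
  default     = "/"

  validation {
    condition     = can(regex("^/", var.health_check_path))
    error_message = "health_check_path must start with a slash."
  }
}

variable "health_check_interval" {
  description = "Seconds between health checks"
  type        = number
  default     = 30
}

variable "health_check_timeout" {
  description = "Seconds to wait for a health check response"
  type        = number
  default     = 5
}

variable "healthy_threshold" {
  description = "Consecutive successes before a target is considered healthy"
  type        = number
  default     = 3
}

variable "unhealthy_threshold" {
  description = "Consecutive failures before a target is considered unhealthy"
  type        = number
  default     = 2
}
```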

Step 5: Defining the Auto Scaling Group

The Auto Scaling Group (ASG) keeps the number of web server instances in line with demand. The code sets the desired, minimum, and maximum instance counts, places the instances in the private subnets, registers them with the ALB target group, and launches them from the launch template so that every instance is configured identically. Two simple scaling policies are also defined, one adding an instance and one removing an instance, each with a 300-second cooldown; the CloudWatch alarms in the next step decide when they run.

resource "aws_autoscaling_group" "web_servers" {
  desired_capacity    = var.desired_capacity
  max_size            = var.max_size
  min_size            = var.min_size
  vpc_zone_identifier = var.private_subnet_ids

  launch_template {
    id      = var.launch_template.id
    version = var.launch_template.latest_version
  }

  target_group_arns = [var.alb_target_group.arn]

  health_check_type         = "ELB"
  health_check_grace_period = 300
  force_delete              = true
  wait_for_capacity_timeout = "0"
}

resource "aws_autoscaling_policy" "scale_up" {
  name                   = "${var.environment}-scale-up"
  autoscaling_group_name = aws_autoscaling_group.web_servers.name
  adjustment_type        = "ChangeInCapacity"
  scaling_adjustment     = 1
  cooldown               = 300
}

resource "aws_autoscaling_policy" "scale_down" {
  name                   = "${var.environment}-scale-down"
  autoscaling_group_name = aws_autoscaling_group.web_servers.name
  adjustment_type        = "ChangeInCapacity"
  scaling_adjustment     = -1
  cooldown               = 300
}

Outputs

The outputs display key details like the Auto Scaling Group's name and ID, the ARNs of the scaling policies (scale-up and scale-down), and the ARN of the ALB target group.

output "asg" {
  description = "Auto Scaling Group details"
  value = {
    name = aws_autoscaling_group.web_servers.name
    id   = aws_autoscaling_group.web_servers.id
  }
}

output "scaling_policies" {
  description = "Scaling policies ARNs"
  value = {
    scale_up   = aws_autoscaling_policy.scale_up.arn
    scale_down = aws_autoscaling_policy.scale_down.arn
  }
}

output "alb_target_group_arn" {
  description = "ARN of the ALB Target Group"
  value       = var.alb_target_group.arn
}
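
The ASG module receives whole objects (the launch template and the target group) rather than individual IDs. One way to declare such inputs is with object type constraints that name only the attributes the module reads, as sketched below. The types and defaults are assumptions; the variable and attribute names come from the listing above.

```hcl
# modules/asg variables (sketch; types and defaults are assumptions)
variable "environment" {
  description = "Environment name used in policy names"
  type        = string
}

variable "private_subnet_ids" {
  description = "Subnets the instances are launched into"
  type        = list(string)
}

variable "launch_template" {
  description = "Launch template object exported by the launch_template module"
  # Only the attributes used by the ASG are listed; Terraform drops the rest on conversion
  type = object({
    id             = string
    latest_version = number
  })
}

variable "alb_target_group" {
  description = "Target group object exported by the alb module"
  type = object({
    arn = string
  })
}

variable "desired_capacity" {
  description = "Number of instances to run initially"
  type        = number
  default     = 2
}

variable "min_size" {
  description = "Lower bound for the group size"
  type        = number
  default     = 1
}

variable "max_size" {
  description = "Upper bound for the group size"
  type        = number
  default     = 4
}
```

Declaring the object shape documents exactly which attributes the module depends on, at the cost of having to extend the type whenever the module starts using another attribute.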

Step 6: CloudWatch Alarms

Finally, we complete the setup with CloudWatch alarms that trigger the scaling policies based on CPU utilization.

The cpu-high alarm fires when average CPU utilization across the group exceeds 75% for two consecutive 60-second periods and invokes the scale_up policy to add an instance. When the alarm returns to the OK state, its ok_actions invoke the scale_down policy to remove an instance. The cpu-low alarm likewise invokes scale_down when average CPU utilization stays below 20% for two consecutive periods.

resource "aws_cloudwatch_metric_alarm" "cpu_high" {
  alarm_name          = "cpu-high-${var.asg.name}"
  comparison_operator = "GreaterThanThreshold"
  evaluation_periods  = 2
  metric_name         = "CPUUtilization"
  namespace           = "AWS/EC2"
  period              = 60
  statistic           = "Average"
  threshold           = 75
  alarm_description   = "Average CPU above 75% for two consecutive 60-second periods; triggers scale-up."

  dimensions = {
    AutoScalingGroupName = var.asg.name
  }

  alarm_actions = [var.asg_policy.scale_up]
  ok_actions    = [var.asg_policy.scale_down]
}

resource "aws_cloudwatch_metric_alarm" "cpu_low" {
  alarm_name          = "cpu-low-${var.asg.name}"
  comparison_operator = "LessThanThreshold"
  evaluation_periods  = 2
  metric_name         = "CPUUtilization"
  namespace           = "AWS/EC2"
  period              = 60
  statistic           = "Average"
  threshold           = 20
  alarm_description   = "Average CPU below 20% for two consecutive 60-second periods; triggers scale-down."

  dimensions = {
    AutoScalingGroupName = var.asg.name
  }

  alarm_actions = [var.asg_policy.scale_down]
}
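
The article does not show env/dev/main.tf, which ties the modules together. The sketch below illustrates how the outputs of each step could feed the next. The module sources follow the project tree and the argument names follow the variables used in each listing, but every literal value (CIDR range, counts, instance type, Ubuntu version, health check settings) is an assumption.

```hcl
# env/dev/main.tf (sketch; all literal values are assumptions)
locals {
  environment = "dev"
}

module "vpc" {
  source             = "../../modules/vpc"
  environment        = local.environment
  cidr_block         = "10.0.0.0/16"
  dns_hostnames      = true
  desired_azs        = 2
  public_subnets_no  = 2
  private_subnets_no = 2
}

module "alb" {
  source                = "../../modules/alb"
  environment           = local.environment
  vpc_id                = module.vpc.vpc_id
  alb_subnets           = module.vpc.public_subnet_ids
  health_check_path     = "/"
  health_check_interval = 30
  health_check_timeout  = 5
  healthy_threshold     = 3
  unhealthy_threshold   = 2
}

module "launch_template" {
  source                = "../../modules/launch_template"
  environment           = local.environment
  vpc_id                = module.vpc.vpc_id
  alb_security_group_id = module.alb.alb_security_group_id
  instance_type         = "t3.micro"
  distro_version        = "22.04"
}

module "asg" {
  source             = "../../modules/asg"
  environment        = local.environment
  private_subnet_ids = module.vpc.private_subnet_ids
  launch_template    = module.launch_template.launch_template
  alb_target_group   = module.alb.alb_target_group
  desired_capacity   = 2
  min_size           = 1
  max_size           = 4
}

module "cloud_watch" {
  source     = "../../modules/cloud_watch"
  asg        = module.asg.asg
  asg_policy = module.asg.scaling_policies
}

# Public entry point of the application
output "alb_dns" {
  value = module.alb.alb_dns
}
```

Terraform derives the creation order from these references: the VPC first, then the ALB, the launch template (which needs the ALB's security group ID), the ASG, and finally the alarms.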

Conclusion

And with that, we are done! With everything in place, from the VPC and dynamically generated subnets to the launch template, scaling policies, and CloudWatch alarms, we have built an automated, self-scaling environment that responds to changes in traffic.