Auto Scaling Group with Rolling Deployment Module
This Terraform Module creates an Auto Scaling Group (ASG) that can do a zero-downtime rolling deployment. That means every time you update your app (e.g. publish a new AMI), all you have to do is run `terraform apply` and the new version of your app will automatically roll out across your Auto Scaling Group. Note that this module only creates the ASG; it's up to you to create all the other related resources, such as the launch template, ELB, and security groups.
This module previously used launch configurations but has been updated to use launch templates, which AWS has recommended for some time. Launch configurations will finally be deprecated entirely on Dec 31st 2023.
What's an Auto Scaling Group?
An Auto Scaling Group (ASG) is used to manage a cluster of EC2 Instances. It can enforce pre-defined rules about how many instances to run in the cluster, scale the number of instances up or down depending on traffic, and automatically restart instances if they go down.
How does rolling deployment work?
Since Terraform does not have rolling deployment built in (see https://github.com/hashicorp/terraform/issues/1552), we are faking it using the `create_before_destroy` lifecycle property. This approach is based on the rolling deploy strategy used by HashiCorp itself, as described by Paul Hinze. As a result, every time you update your launch template (e.g. by specifying a new AMI to deploy), Terraform will:
- Create a new ASG with the new launch template.
- Wait for the new ASG to deploy successfully and for the instances to register with the load balancer (if you associated an ELB or ALB with this ASG).
- Destroy the old ASG.
Since the old ASG is only removed once the new ASG instances are registered with the ELB and serving traffic, there will be no downtime. Moreover, if anything goes wrong while rolling out the new ASG, it will be marked as tainted (i.e. marked for deletion next time) and the original ASG will be left unchanged, so again, there is no downtime.
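The steps above can be sketched in plain Terraform. This is a minimal illustration of the general `create_before_destroy` pattern, not the module's actual source: the resource names, variable names, instance type, and sizes are all assumptions made for the example.

```hcl
# Hypothetical inputs for this sketch (not part of the module's interface).
variable "ami_id" {
  type = string
}

variable "subnet_ids" {
  type = list(string)
}

variable "target_group_arns" {
  type = list(string)
}

resource "aws_launch_template" "app" {
  name_prefix   = "app-"
  image_id      = var.ami_id
  instance_type = "t3.micro"
}

resource "aws_autoscaling_group" "app" {
  # Embedding the launch template version in the name means a new version
  # changes the name, which forces Terraform to replace the whole ASG.
  name = "app-${aws_launch_template.app.latest_version}"

  min_size            = 2
  max_size            = 4
  desired_capacity    = 2
  vpc_zone_identifier = var.subnet_ids
  target_group_arns   = var.target_group_arns

  # Creation only succeeds once this many instances are healthy in the
  # load balancer, so the old ASG is not destroyed before then.
  min_elb_capacity = 2

  launch_template {
    id      = aws_launch_template.app.id
    version = aws_launch_template.app.latest_version
  }

  lifecycle {
    # Create the replacement ASG first, then destroy the old one.
    create_before_destroy = true
  }
}
```

Because the replacement is created before the old resource is destroyed, a failed rollout leaves the original ASG in place and still serving traffic.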
Note that if all we did was use `create_before_destroy`, on each redeploy, our ASG would reset to its hard-coded `desired_capacity`, losing the capacity changes from auto scaling policies. We solve this problem by using an external data source that runs the Python script `get-desired-capacity.py` to fetch the latest value of the `desired_capacity` parameter:
- If the script finds a value from an already-existing ASG, we use it, to ensure that the changes from auto scaling events are not lost.
- If the script doesn't find an already-existing ASG, that means this is the first deploy, and we fall back to the hard-coded `desired_capacity` value.
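The fallback logic can be sketched with Terraform's `external` data source. The script's real arguments and output format are not shown here, so the query keys, the `desired_capacity` result key, the script path, and the local names below are assumptions for illustration only.

```hcl
# The hard-coded value used on the first deploy.
variable "desired_capacity" {
  type = number
}

# The external data source passes `query` to the program as JSON on stdin
# and expects a JSON object of string values on stdout.
data "external" "desired_capacity" {
  # Assumed script location and interpreter.
  program = ["python3", "${path.module}/get-desired-capacity.py"]

  # Assumed query keys; the real script may expect different ones.
  query = {
    asg_tag_key   = "AsgId"
    asg_tag_value = "example-asg-id"
  }
}

locals {
  # Assumption: the script returns an empty or missing value when no ASG
  # exists yet. All external results are strings.
  existing_capacity = lookup(data.external.desired_capacity.result, "desired_capacity", "")

  # Prefer the live value from the existing ASG; otherwise fall back to the
  # hard-coded value.
  effective_desired_capacity = (
    local.existing_capacity != ""
    ? tonumber(local.existing_capacity)
    : var.desired_capacity
  )
}
```

The resulting local would then be passed to the ASG's `desired_capacity` argument in place of the raw variable.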
Sample Usage
The first example below is for Terraform; the second is the equivalent Terragrunt configuration.
# ------------------------------------------------------------------------------------------------------
# DEPLOY GRUNTWORK'S ASG-ROLLING-DEPLOY MODULE
# ------------------------------------------------------------------------------------------------------
module "asg_rolling_deploy" {
source = "git::git@github.com:gruntwork-io/terraform-aws-asg.git//modules/asg-rolling-deploy?ref=v0.21.19"
# ----------------------------------------------------------------------------------------------------
# REQUIRED VARIABLES
# ----------------------------------------------------------------------------------------------------
# The desired number of EC2 Instances to run in the ASG initially. Note that
# auto scaling policies may change this value. If you're using auto scaling
# policies to dynamically resize the cluster, you should actually leave this
# value as null.
desired_capacity = <number>
# The ID and version of the Launch Template to use for each EC2 instance in
# this ASG. The version value MUST be an output of the Launch Template
# resource itself. This ensures that a new ASG is created every time a new
# Launch Template version is created.
launch_template = <object(
id = string
name = string
version = string
)>
# The maximum number of EC2 Instances to run in the ASG
max_size = <number>
# The minimum number of EC2 Instances to run in the ASG
min_size = <number>
# A list of subnet IDs in the VPC where the EC2 Instances should be deployed
vpc_subnet_ids = <list(string)>
# ----------------------------------------------------------------------------------------------------
# OPTIONAL VARIABLES
# ----------------------------------------------------------------------------------------------------
# Override the auto-generated ASG name with this value.
asg_name = ""
# Capacity Rebalancing helps you maintain workload availability by proactively
# augmenting your fleet with a new Spot Instance before a running instance is
# interrupted by Amazon EC2
autoscaling_capacity_rebalance = false
# Defines the action the Auto Scaling group should take when the lifecycle
# hook timeout elapses or if an unexpected failure occurs. The value for this
# parameter can be either CONTINUE or ABANDON. The default value for this
# parameter is ABANDON.
autoscaling_lifecycle_hook_default_result = null
# Defines the amount of time, in seconds, that can elapse before the lifecycle
# hook times out. When the lifecycle hook times out, Auto Scaling performs the
# action defined in the DefaultResult parameter
autoscaling_lifecycle_hook_heartbeat_timeout = null
# Required if enable_autoscaling_lifecycle_hook is enabled. Instance state to
# which you want to attach the lifecycle hook. For a list of lifecycle hook
# types, see
# https://docs.aws.amazon.com/cli/latest/reference/autoscaling/describe-lifecycle-hook-types.html#examples
autoscaling_lifecycle_lifecycle_transition = null
# Contains additional information that you want to include any time Auto
# Scaling sends a message to the notification target.
autoscaling_lifecycle_notification_metadata = []
# ARN of the notification target that Auto Scaling will use to notify you when
# an instance is in the transition state for the lifecycle hook.
autoscaling_lifecycle_notification_target_arn = null
# ARN of the IAM role that allows the Auto Scaling group to publish to the
# specified notification target.
autoscaling_lifecycle_role_arn = null
# A list of custom tags to apply to the EC2 Instances in this ASG. Each item
# in this list should be a map with the parameters key, value, and
# propagate_at_launch.
custom_tags = []
# Timeout value for deletion operations on autoscale groups.
deletion_timeout = "10m"
# Toggles if the autoscaling_lifecycle_hook will be enabled or not. If
# enabled, the aws_autoscaling_lifecycle_hook resource will be created and
# attached to the ASG. Make sure you set all autoscaling_lifecycle_* variables
# to desired values if enabled.
enable_autoscaling_lifecycle_hook = false
# A list of metrics the ASG should enable for monitoring all instances in a
# group. The allowed values are GroupMinSize, GroupMaxSize,
# GroupDesiredCapacity, GroupInServiceInstances, GroupPendingInstances,
# GroupStandbyInstances, GroupTerminatingInstances, GroupTotalInstances.
enabled_metrics = []
# Time, in seconds, after an EC2 Instance comes into service before checking
# health.
health_check_grace_period = 300
# A list of Elastic Load Balancer (ELB) names to associate with this ASG. If
# you're using the Application Load Balancer (ALB), see var.target_group_arns.
load_balancers = []
# The maximum amount of time, in seconds, that an instance inside an ASG can
# be in service, values must be either equal to 0 or between 604800 and
# 31536000 seconds.
max_instance_lifetime = null
# Wait for this number of EC2 Instances to show up healthy in the load
# balancer on creation.
min_elb_capacity = 0
# Define policy using spot and on-demand instances.
mixed_instance_policy = null
# The key for the tag that will be used to associate a unique identifier with
# this ASG. This identifier will persist between redeploys of the ASG, even
# though the underlying ASG is being deleted and replaced with a different
# one.
tag_asg_id_key = "AsgId"
# A list of Application Load Balancer (ALB) target group ARNs to associate
# with this ASG. If you're using the Elastic Load Balancer (ELB), see
# var.load_balancers.
target_group_arns = []
# A list of policies to decide how the instances in the auto scale group
# should be terminated. The allowed values are OldestInstance, NewestInstance,
# OldestLaunchTemplate, AllocationStrategy, ClosestToNextInstanceHour,
# Default.
termination_policies = []
# Whether or not ELB or ALB health checks should be enabled. If set to true,
# the load_balancers or target_group_arns variable should be set depending on
# the load balancer type you are using. Setting this to false is useful for
# testing connectivity before health check endpoints are available.
use_elb_health_checks = true
# A maximum duration that Terraform should wait for the EC2 Instances to be
# healthy before timing out.
wait_for_capacity_timeout = "10m"
}
# ------------------------------------------------------------------------------------------------------
# DEPLOY GRUNTWORK'S ASG-ROLLING-DEPLOY MODULE
# ------------------------------------------------------------------------------------------------------
terraform {
source = "git::git@github.com:gruntwork-io/terraform-aws-asg.git//modules/asg-rolling-deploy?ref=v0.21.19"
}
inputs = {
# ----------------------------------------------------------------------------------------------------
# REQUIRED VARIABLES
# ----------------------------------------------------------------------------------------------------
# The desired number of EC2 Instances to run in the ASG initially. Note that
# auto scaling policies may change this value. If you're using auto scaling
# policies to dynamically resize the cluster, you should actually leave this
# value as null.
desired_capacity = <number>
# The ID and version of the Launch Template to use for each EC2 instance in
# this ASG. The version value MUST be an output of the Launch Template
# resource itself. This ensures that a new ASG is created every time a new
# Launch Template version is created.
launch_template = <object(
id = string
name = string
version = string
)>
# The maximum number of EC2 Instances to run in the ASG
max_size = <number>
# The minimum number of EC2 Instances to run in the ASG
min_size = <number>
# A list of subnet IDs in the VPC where the EC2 Instances should be deployed
vpc_subnet_ids = <list(string)>
# ----------------------------------------------------------------------------------------------------
# OPTIONAL VARIABLES
# ----------------------------------------------------------------------------------------------------
# Override the auto-generated ASG name with this value.
asg_name = ""
# Capacity Rebalancing helps you maintain workload availability by proactively
# augmenting your fleet with a new Spot Instance before a running instance is
# interrupted by Amazon EC2
autoscaling_capacity_rebalance = false
# Defines the action the Auto Scaling group should take when the lifecycle
# hook timeout elapses or if an unexpected failure occurs. The value for this
# parameter can be either CONTINUE or ABANDON. The default value for this
# parameter is ABANDON.
autoscaling_lifecycle_hook_default_result = null
# Defines the amount of time, in seconds, that can elapse before the lifecycle
# hook times out. When the lifecycle hook times out, Auto Scaling performs the
# action defined in the DefaultResult parameter
autoscaling_lifecycle_hook_heartbeat_timeout = null
# Required if enable_autoscaling_lifecycle_hook is enabled. Instance state to
# which you want to attach the lifecycle hook. For a list of lifecycle hook
# types, see
# https://docs.aws.amazon.com/cli/latest/reference/autoscaling/describe-lifecycle-hook-types.html#examples
autoscaling_lifecycle_lifecycle_transition = null
# Contains additional information that you want to include any time Auto
# Scaling sends a message to the notification target.
autoscaling_lifecycle_notification_metadata = []
# ARN of the notification target that Auto Scaling will use to notify you when
# an instance is in the transition state for the lifecycle hook.
autoscaling_lifecycle_notification_target_arn = null
# ARN of the IAM role that allows the Auto Scaling group to publish to the
# specified notification target.
autoscaling_lifecycle_role_arn = null
# A list of custom tags to apply to the EC2 Instances in this ASG. Each item
# in this list should be a map with the parameters key, value, and
# propagate_at_launch.
custom_tags = []
# Timeout value for deletion operations on autoscale groups.
deletion_timeout = "10m"
# Toggles if the autoscaling_lifecycle_hook will be enabled or not. If
# enabled, the aws_autoscaling_lifecycle_hook resource will be created and
# attached to the ASG. Make sure you set all autoscaling_lifecycle_* variables
# to desired values if enabled.
enable_autoscaling_lifecycle_hook = false
# A list of metrics the ASG should enable for monitoring all instances in a
# group. The allowed values are GroupMinSize, GroupMaxSize,
# GroupDesiredCapacity, GroupInServiceInstances, GroupPendingInstances,
# GroupStandbyInstances, GroupTerminatingInstances, GroupTotalInstances.
enabled_metrics = []
# Time, in seconds, after an EC2 Instance comes into service before checking
# health.
health_check_grace_period = 300
# A list of Elastic Load Balancer (ELB) names to associate with this ASG. If
# you're using the Application Load Balancer (ALB), see var.target_group_arns.
load_balancers = []
# The maximum amount of time, in seconds, that an instance inside an ASG can
# be in service, values must be either equal to 0 or between 604800 and
# 31536000 seconds.
max_instance_lifetime = null
# Wait for this number of EC2 Instances to show up healthy in the load
# balancer on creation.
min_elb_capacity = 0
# Define policy using spot and on-demand instances.
mixed_instance_policy = null
# The key for the tag that will be used to associate a unique identifier with
# this ASG. This identifier will persist between redeploys of the ASG, even
# though the underlying ASG is being deleted and replaced with a different
# one.
tag_asg_id_key = "AsgId"
# A list of Application Load Balancer (ALB) target group ARNs to associate
# with this ASG. If you're using the Elastic Load Balancer (ELB), see
# var.load_balancers.
target_group_arns = []
# A list of policies to decide how the instances in the auto scale group
# should be terminated. The allowed values are OldestInstance, NewestInstance,
# OldestLaunchTemplate, AllocationStrategy, ClosestToNextInstanceHour,
# Default.
termination_policies = []
# Whether or not ELB or ALB health checks should be enabled. If set to true,
# the load_balancers or target_group_arns variable should be set depending on
# the load balancer type you are using. Setting this to false is useful for
# testing connectivity before health check endpoints are available.
use_elb_health_checks = true
# A maximum duration that Terraform should wait for the EC2 Instances to be
# healthy before timing out.
wait_for_capacity_timeout = "10m"
}
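As a more concrete illustration of the required inputs, the sketch below wires a launch template into the module so that `version` comes from an output of the launch template resource itself. The resource name, variable names, instance type, and sizes are hypothetical; only the module inputs are taken from the listing above.

```hcl
# Hypothetical inputs for this sketch.
variable "ami_id" {
  type = string
}

variable "subnet_ids" {
  type = list(string)
}

variable "target_group_arn" {
  type = string
}

resource "aws_launch_template" "example" {
  name_prefix   = "example-"
  image_id      = var.ami_id
  instance_type = "t3.micro"
}

module "asg_rolling_deploy" {
  source = "git::git@github.com:gruntwork-io/terraform-aws-asg.git//modules/asg-rolling-deploy?ref=v0.21.19"

  # Every value here is an attribute of the launch template resource, so a
  # new template version is visible to the module on the next apply.
  launch_template = {
    id      = aws_launch_template.example.id
    name    = aws_launch_template.example.name
    version = aws_launch_template.example.latest_version
  }

  min_size         = 2
  max_size         = 4
  desired_capacity = 2
  vpc_subnet_ids   = var.subnet_ids

  # Wait for two instances to be healthy in the ALB target group before
  # the deployment is considered successful.
  target_group_arns = [var.target_group_arn]
  min_elb_capacity  = 2
}
```

Publishing a new AMI and running `terraform apply` would then create a new launch template version and roll it out.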
Reference
- Inputs
- Outputs
Required
desired_capacity
number
The desired number of EC2 Instances to run in the ASG initially. Note that auto scaling policies may change this value. If you're using auto scaling policies to dynamically resize the cluster, you should actually leave this value as null.
launch_template
object(…)
The ID and version of the Launch Template to use for each EC2 instance in this ASG. The version value MUST be an output of the Launch Template resource itself. This ensures that a new ASG is created every time a new Launch Template version is created.
object({
  id = string
  name = string
  version = string
})
max_size
number
The maximum number of EC2 Instances to run in the ASG
min_size
number
The minimum number of EC2 Instances to run in the ASG
vpc_subnet_ids
list(string)
A list of subnet IDs in the VPC where the EC2 Instances should be deployed
Optional
asg_name
string
Override the auto-generated ASG name with this value.
""
autoscaling_capacity_rebalance
bool
Capacity Rebalancing helps you maintain workload availability by proactively augmenting your fleet with a new Spot Instance before a running instance is interrupted by Amazon EC2
false
autoscaling_lifecycle_hook_default_result
Defines the action the Auto Scaling group should take when the lifecycle hook timeout elapses or if an unexpected failure occurs. The value for this parameter can be either CONTINUE or ABANDON. The default value for this parameter is ABANDON.
null
autoscaling_lifecycle_hook_heartbeat_timeout
Defines the amount of time, in seconds, that can elapse before the lifecycle hook times out. When the lifecycle hook times out, Auto Scaling performs the action defined in the DefaultResult parameter
null
autoscaling_lifecycle_lifecycle_transition
Required if enable_autoscaling_lifecycle_hook is enabled. Instance state to which you want to attach the lifecycle hook. For a list of lifecycle hook types, see https://docs.aws.amazon.com/cli/latest/reference/autoscaling/describe-lifecycle-hook-types.html#examples
null
autoscaling_lifecycle_notification_metadata
Any
Contains additional information that you want to include any time Auto Scaling sends a message to the notification target.
Any types represent complex values of variable type. For details, please consult `variables.tf` in the source repo.
[]
autoscaling_lifecycle_notification_target_arn
ARN of the notification target that Auto Scaling will use to notify you when an instance is in the transition state for the lifecycle hook.
null
autoscaling_lifecycle_role_arn
ARN of the IAM role that allows the Auto Scaling group to publish to the specified notification target.
null
custom_tags
list(object(…))
A list of custom tags to apply to the EC2 Instances in this ASG. Each item in this list should be a map with the parameters key, value, and propagate_at_launch.
list(object({
  key = string
  value = string
  propagate_at_launch = bool
}))
[]
Example
default = [
  {
    key = "foo"
    value = "bar"
    propagate_at_launch = true
  },
  {
    key = "baz"
    value = "blah"
    propagate_at_launch = true
  }
]
deletion_timeout
string
Timeout value for deletion operations on autoscale groups.
"10m"
enable_autoscaling_lifecycle_hook
bool
Toggles if the autoscaling_lifecycle_hook will be enabled or not. If enabled, the aws_autoscaling_lifecycle_hook resource will be created and attached to the ASG. Make sure you set all autoscaling_lifecycle_* variables to desired values if enabled.
false
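The lifecycle hook inputs are meant to be set together. The fragment below is a rough sketch of how they might be combined to pause instance termination and notify an SNS topic. The variable names and the chosen values are assumptions, and the required module inputs are left out for brevity, so this is not a complete configuration.

```hcl
# Hypothetical ARNs supplied by the caller.
variable "notification_topic_arn" {
  type = string
}

variable "notification_role_arn" {
  type = string
}

module "asg_rolling_deploy" {
  source = "git::git@github.com:gruntwork-io/terraform-aws-asg.git//modules/asg-rolling-deploy?ref=v0.21.19"

  # Required inputs (desired_capacity, launch_template, max_size, min_size,
  # vpc_subnet_ids) are omitted here; see Sample Usage.

  enable_autoscaling_lifecycle_hook = true

  # Fire the hook when an instance is about to be terminated.
  autoscaling_lifecycle_lifecycle_transition = "autoscaling:EC2_INSTANCE_TERMINATING"

  # If nothing completes the hook within 300 seconds, carry on with the
  # termination instead of abandoning it.
  autoscaling_lifecycle_hook_default_result    = "CONTINUE"
  autoscaling_lifecycle_hook_heartbeat_timeout = 300

  # Where to send the notification, and the role that allows the ASG to
  # publish to that target.
  autoscaling_lifecycle_notification_target_arn = var.notification_topic_arn
  autoscaling_lifecycle_role_arn                = var.notification_role_arn
}
```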
enabled_metrics
list(string)
A list of metrics the ASG should enable for monitoring all instances in a group. The allowed values are GroupMinSize, GroupMaxSize, GroupDesiredCapacity, GroupInServiceInstances, GroupPendingInstances, GroupStandbyInstances, GroupTerminatingInstances, GroupTotalInstances.
[]
Example
enabled_metrics = [
  "GroupDesiredCapacity",
  "GroupInServiceInstances",
  "GroupMaxSize",
  "GroupMinSize",
  "GroupPendingInstances",
  "GroupStandbyInstances",
  "GroupTerminatingInstances",
  "GroupTotalInstances"
]
health_check_grace_period
number
Time, in seconds, after an EC2 Instance comes into service before checking health.
300
load_balancers
list(string)
A list of Elastic Load Balancer (ELB) names to associate with this ASG. If you're using the Application Load Balancer (ALB), see target_group_arns.
[]
max_instance_lifetime
number
The maximum amount of time, in seconds, that an instance inside an ASG can be in service. Values must be either equal to 0 or between 604800 and 31536000 seconds.
null
min_elb_capacity
number
Wait for this number of EC2 Instances to show up healthy in the load balancer on creation.
0
mixed_instance_policy
Any
Define policy using spot and on-demand instances.
Any types represent complex values of variable type. For details, please consult `variables.tf` in the source repo.
null
tag_asg_id_key
string
The key for the tag that will be used to associate a unique identifier with this ASG. This identifier will persist between redeploys of the ASG, even though the underlying ASG is being deleted and replaced with a different one.
"AsgId"
target_group_arns
list(string)
A list of Application Load Balancer (ALB) target group ARNs to associate with this ASG. If you're using the Elastic Load Balancer (ELB), see load_balancers.
[]
termination_policies
list(string)
A list of policies to decide how the instances in the auto scale group should be terminated. The allowed values are OldestInstance, NewestInstance, OldestLaunchTemplate, AllocationStrategy, ClosestToNextInstanceHour, Default.
[]
use_elb_health_checks
bool
Whether or not ELB or ALB health checks should be enabled. If set to true, the load_balancers or target_group_arns variable should be set depending on the load balancer type you are using. Setting this to false is useful for testing connectivity before health check endpoints are available.
true
wait_for_capacity_timeout
string
A maximum duration that Terraform should wait for the EC2 Instances to be healthy before timing out.
"10m"