EC2 Backup Lambda Function Module
NOTE: This module is deprecated and will be removed in the future. Use the Data Lifecycle Manager based backup system instead.
This module can be used to make scheduled backups of an EC2 Instance and its EBS Volumes. Under the hood, this module uses terraform-aws-lambda to deploy a Lambda function that is triggered on a scheduled basis by Amazon CloudWatch Events and runs ec2-snapper to take a snapshot of the EC2 Instance.
Difference with Data Lifecycle Manager
As an alternative to lambda functions using ec2-snapper, we also have the ec2-backup module in the terraform-aws-server repo, which uses AWS Data Lifecycle Manager (DLM) to manage the EBS snapshots. Unlike the lambda functions, this is an AWS-native solution that does not have any infrastructure to manage.
Additionally, DLM selects volumes by tag, unlike the lambda function, which selects volumes by EC2 instance. This means that a single DLM policy can group all the snapshots together across deployments. For example, if you use blue-green deployments for your Jenkins server and rotate instances, the snapshots for the previous instance are still managed by the same DLM policy.
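To make the tag-based selection concrete, here is a minimal sketch of a DLM policy written directly against the AWS provider's aws_dlm_lifecycle_policy resource. It is not the implementation of the ec2-backup module; the tag key and value, the schedule, the retention count, and the dlm_execution_role_arn variable are all assumptions chosen for illustration.

```hcl
# Assumed input: an IAM role that DLM is allowed to assume (hypothetical variable).
variable "dlm_execution_role_arn" {
  description = "ARN of the IAM role DLM uses to create and delete snapshots."
  type        = string
}

resource "aws_dlm_lifecycle_policy" "jenkins_volumes" {
  description        = "Daily snapshots of volumes tagged for backup"
  execution_role_arn = var.dlm_execution_role_arn
  state              = "ENABLED"

  policy_details {
    resource_types = ["VOLUME"]

    # Any volume carrying this tag is covered, regardless of which
    # EC2 instance it is attached to. The tag itself is illustrative.
    target_tags = {
      Backup = "jenkins"
    }

    schedule {
      name = "daily"

      create_rule {
        interval      = 24
        interval_unit = "HOURS"
        times         = ["20:00"]
      }

      retain_rule {
        count = 14
      }

      copy_tags = true
    }
  }
}
```

Because the policy matches on the tag rather than on an instance, a replacement instance whose volume carries the same tag is picked up without any change to the policy.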
However, there are a few features that the lambda based backup functions support which are currently not available with DLM:
- Support backup schedules with frequencies longer than 1 day (e.g., weekly). DLM does not support any frequency longer than 1 day.
  - NOTE: There is an open PR in the AWS provider to add support for this.
- Minimum backup counts. The lambda based backup mechanism supports specifying to keep a minimum number of backups around.
Example code
- Check out the jenkins example for working sample code.
- See vars.tf for all parameters you can configure on this module.
Specifying an instance
To specify the instance to back up, provide the instance's name via the instance_name parameter. This should correspond to the value of the Name tag on your EC2 Instance.
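As an illustration, the sketch below tags an instance and keeps the name in one place so the same value can be passed to instance_name. The resource name, the instance type, the ami_id variable, and the value "jenkins" are assumptions, not part of this module.

```hcl
# Assumed input: the AMI to launch (hypothetical variable).
variable "ami_id" {
  description = "ID of the AMI to run on the example instance."
  type        = string
}

locals {
  # Single source of truth for the name; pass this same value to the
  # backup module as instance_name.
  backup_instance_name = "jenkins"
}

resource "aws_instance" "example" {
  ami           = var.ami_id
  instance_type = "t3.micro"

  tags = {
    # The backup function looks up the instance by this tag.
    Name = local.backup_instance_name
  }
}
```

Defining the name once in a local avoids the tag and the module input drifting apart.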
Configuring the schedule
You can specify how often this lambda function runs using the backup_job_schedule_expression parameter. This can be either a rate expression such as rate(1 day) or a cron expression such as cron(0 20 * * ? *). See Schedule Expressions for more information and examples.
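The schedule has to be stated twice: once as an expression and once in seconds via backup_job_alarm_period (described under Reference), because Terraform cannot convert one into the other. A minimal sketch of keeping the two side by side, with hypothetical local names:

```hcl
locals {
  # Run the backup once a day. The cron form cron(0 20 * * ? *) also
  # fires once a day (at 20:00 UTC), so it maps to the same period.
  backup_job_schedule_expression = "rate(1 day)"

  # The same interval expressed in seconds: 24 h * 60 min * 60 s = 86400.
  backup_job_alarm_period = 24 * 60 * 60
}
```

These locals would then be passed to the module inputs of the same names. If you change the schedule, update the period with it, or the alarm will watch the wrong time window.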
Triggering alarms if backup fails
Every time the function runs successfully, it will increment a CloudWatch Metric. We've configured a CloudWatch alarm to go off if the metric is not updated on the expected schedule, as that implies the backup has failed to run!
You can specify the metric namespace and name using the cloudwatch_metric_namespace and cloudwatch_metric_name parameters, respectively. You can specify the SNS topics to notify when the alarm goes off using the alarm_sns_topic_arns parameter.
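A minimal sketch of the alarm-related inputs, assuming you create a dedicated SNS topic for backup alarms. The topic name, metric namespace, and metric name below are illustrative values, not defaults of this module.

```hcl
# Hypothetical topic that receives a notification when the backup alarm fires.
resource "aws_sns_topic" "backup_alarms" {
  name = "ec2-backup-alarms"
}

locals {
  # alarm_sns_topic_arns is a list, so several topics can be notified.
  alarm_sns_topic_arns = [aws_sns_topic.backup_alarms.arn]

  # Illustrative metric identifiers; the function increments this metric
  # after every successful run.
  cloudwatch_metric_namespace = "Custom/Jenkins"
  cloudwatch_metric_name      = "jenkins-backup-job"
}
```

Pass these locals to the module inputs of the same names, and subscribe an email address or other endpoint to the topic so that someone actually sees the alarm.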
Cleaning up old snapshots
To prevent the number of snapshots from growing indefinitely and costing you a lot of money, ec2-snapper will automatically delete older snapshots. You can specify two parameters to control how many snapshots are kept around:
- delete_older_than: Delete all snapshots older than this duration. For example, if you set this parameter to 30d, then snapshots that are more than 30 days old will be deleted. See Delete AMIs older than for more info.
- require_at_least: Always keep around at least this many snapshots. This helps avoid deleting too much if you have, for example, a misconfiguration of the delete_older_than parameter.
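As a worked example of how the two parameters interact, consider the hypothetical values below together with a daily backup schedule:

```hcl
locals {
  # Snapshots more than 30 days old become eligible for deletion.
  delete_older_than = "30d"

  # Never go below 5 snapshots, even if all of them are older than 30 days.
  require_at_least = 5
}
```

With one backup per day, roughly 30 snapshots exist at any time. If the backups stop for several weeks, every snapshot eventually becomes older than 30 days, but require_at_least ensures that at least 5 snapshots are still kept rather than all of them being deleted.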
Sample Usage
- Terraform
- Terragrunt
# ------------------------------------------------------------------------------------------------------
# DEPLOY GRUNTWORK'S EC2-BACKUP MODULE
# ------------------------------------------------------------------------------------------------------
module "ec_2_backup" {

  source = "git::git@github.com:gruntwork-io/terraform-aws-ci.git//modules/ec2-backup?ref=v0.59.1"

  # ----------------------------------------------------------------------------------------------------
  # REQUIRED VARIABLES
  # ----------------------------------------------------------------------------------------------------

  # The ARN of SNS topics to notify if the CloudWatch alarm goes off because the
  # backup job failed.
  alarm_sns_topic_arns = <list(string)>

  # How often, in seconds, the backup lambda function is expected to run. This
  # is the same as var.backup_job_schedule_expression, but unfortunately,
  # Terraform offers no way to convert rate expressions to seconds. We add a
  # CloudWatch alarm that triggers if the value of var.cloudwatch_metric_name
  # and var.cloudwatch_metric_namespace isn't updated within this time period,
  # as that indicates the backup failed to run.
  backup_job_alarm_period = <number>

  # An expression that defines the schedule for how often to run the backup
  # lambda function. For example, cron(0 20 * * ? *) or rate(1 day).
  backup_job_schedule_expression = <string>

  # The name for the CloudWatch Metric the AWS lambda backup function will
  # increment every time the job completes successfully.
  cloudwatch_metric_name = <string>

  # The namespace for the CloudWatch Metric the AWS lambda backup function will
  # increment every time the job completes successfully.
  cloudwatch_metric_namespace = <string>

  # Delete all snapshots older than this value (e.g., 30d, 5h, or 15m). For
  # example, setting this to 30d means all snapshots more than 30 days old will
  # be deleted.
  delete_older_than = <string>

  # The name of the EC2 Instance to backup. This must be the value of the tag
  # 'Name' on that Instance.
  instance_name = <string>

  # The minimum number of snapshots to keep around. This ensures some number of
  # snapshots are never deleted, regardless of the value of
  # var.delete_older_than.
  require_at_least = <number>

  # ----------------------------------------------------------------------------------------------------
  # OPTIONAL VARIABLES
  # ----------------------------------------------------------------------------------------------------

  # When true, all IAM policies will be managed as dedicated policies rather
  # than inline policies attached to the IAM roles. Dedicated managed policies
  # are friendlier to automated policy checkers, which may scan a single
  # resource for findings. As such, it is important to avoid inline policies
  # when targeting compliance with various security standards.
  use_managed_iam_policies = true

}
# ------------------------------------------------------------------------------------------------------
# DEPLOY GRUNTWORK'S EC2-BACKUP MODULE
# ------------------------------------------------------------------------------------------------------
terraform {
  source = "git::git@github.com:gruntwork-io/terraform-aws-ci.git//modules/ec2-backup?ref=v0.59.1"
}

inputs = {

  # ----------------------------------------------------------------------------------------------------
  # REQUIRED VARIABLES
  # ----------------------------------------------------------------------------------------------------

  # The ARN of SNS topics to notify if the CloudWatch alarm goes off because the
  # backup job failed.
  alarm_sns_topic_arns = <list(string)>

  # How often, in seconds, the backup lambda function is expected to run. This
  # is the same as var.backup_job_schedule_expression, but unfortunately,
  # Terraform offers no way to convert rate expressions to seconds. We add a
  # CloudWatch alarm that triggers if the value of var.cloudwatch_metric_name
  # and var.cloudwatch_metric_namespace isn't updated within this time period,
  # as that indicates the backup failed to run.
  backup_job_alarm_period = <number>

  # An expression that defines the schedule for how often to run the backup
  # lambda function. For example, cron(0 20 * * ? *) or rate(1 day).
  backup_job_schedule_expression = <string>

  # The name for the CloudWatch Metric the AWS lambda backup function will
  # increment every time the job completes successfully.
  cloudwatch_metric_name = <string>

  # The namespace for the CloudWatch Metric the AWS lambda backup function will
  # increment every time the job completes successfully.
  cloudwatch_metric_namespace = <string>

  # Delete all snapshots older than this value (e.g., 30d, 5h, or 15m). For
  # example, setting this to 30d means all snapshots more than 30 days old will
  # be deleted.
  delete_older_than = <string>

  # The name of the EC2 Instance to backup. This must be the value of the tag
  # 'Name' on that Instance.
  instance_name = <string>

  # The minimum number of snapshots to keep around. This ensures some number of
  # snapshots are never deleted, regardless of the value of
  # var.delete_older_than.
  require_at_least = <number>

  # ----------------------------------------------------------------------------------------------------
  # OPTIONAL VARIABLES
  # ----------------------------------------------------------------------------------------------------

  # When true, all IAM policies will be managed as dedicated policies rather
  # than inline policies attached to the IAM roles. Dedicated managed policies
  # are friendlier to automated policy checkers, which may scan a single
  # resource for findings. As such, it is important to avoid inline policies
  # when targeting compliance with various security standards.
  use_managed_iam_policies = true

}
Reference
- Inputs
- Outputs
Required
alarm_sns_topic_arns (list(string))
The ARN of SNS topics to notify if the CloudWatch alarm goes off because the backup job failed.

backup_job_alarm_period (number)
How often, in seconds, the backup lambda function is expected to run. This is the same as backup_job_schedule_expression, but unfortunately, Terraform offers no way to convert rate expressions to seconds. We add a CloudWatch alarm that triggers if the value of cloudwatch_metric_name and cloudwatch_metric_namespace isn't updated within this time period, as that indicates the backup failed to run.

backup_job_schedule_expression (string)
An expression that defines the schedule for how often to run the backup lambda function. For example, cron(0 20 * * ? *) or rate(1 day).

cloudwatch_metric_name (string)
The name for the CloudWatch Metric the AWS lambda backup function will increment every time the job completes successfully.

cloudwatch_metric_namespace (string)
The namespace for the CloudWatch Metric the AWS lambda backup function will increment every time the job completes successfully.

delete_older_than (string)
Delete all snapshots older than this value (e.g., 30d, 5h, or 15m). For example, setting this to 30d means all snapshots more than 30 days old will be deleted.

instance_name (string)
The name of the EC2 Instance to backup. This must be the value of the tag 'Name' on that Instance.

require_at_least (number)
The minimum number of snapshots to keep around. This ensures some number of snapshots are never deleted, regardless of the value of delete_older_than.

Optional

use_managed_iam_policies (bool)
When true, all IAM policies will be managed as dedicated policies rather than inline policies attached to the IAM roles. Dedicated managed policies are friendlier to automated policy checkers, which may scan a single resource for findings. As such, it is important to avoid inline policies when targeting compliance with various security standards.
Default: true