CI Modules 0.59.7 (last updated in version 0.58.0)

EC2 Backup Lambda Function Module


NOTE: This module is deprecated and will be removed in the future. Use the Data Lifecycle Manager based backup system instead.

This module can be used to make scheduled backups of an EC2 Instance and its EBS Volumes. Under the hood, this module uses terraform-aws-lambda to deploy a Lambda function that is triggered on a scheduled basis by Amazon CloudWatch Events and runs ec2-snapper to take a snapshot of the EC2 Instance.

Difference with Data Lifecycle Manager

As an alternative to Lambda functions running ec2-snapper, the terraform-aws-server repo provides the ec2-backup module, which uses Amazon Data Lifecycle Manager (DLM) to manage EBS snapshots. Unlike the Lambda-based approach, DLM is an AWS-native service, so there is no infrastructure for you to manage.

Additionally, DLM selects volumes by tag, whereas the Lambda function selects volumes by EC2 instance. This means a single DLM policy can group snapshots together across deployments. For example, if you use blue-green deployments for your Jenkins server and rotate instances, the snapshots from the previous instance are still managed by the same DLM policy.
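To illustrate the tag-based selection described above, here is a minimal sketch that uses the AWS provider's aws_dlm_lifecycle_policy resource directly. It is not the interface of the ec2-backup module in terraform-aws-server; the variable name, the tag key and value, and the schedule settings are all hypothetical and chosen only for illustration.

```hcl
# Hypothetical input: an existing IAM role that DLM can assume.
variable "dlm_role_arn" {
  description = "ARN of an IAM role with permissions to manage EBS snapshots."
  type        = string
}

resource "aws_dlm_lifecycle_policy" "jenkins_volumes" {
  description        = "Daily snapshots of tagged Jenkins volumes"
  execution_role_arn = var.dlm_role_arn
  state              = "ENABLED"

  policy_details {
    resource_types = ["VOLUME"]

    # Any volume carrying this tag is covered by the policy, no matter which
    # instance it is attached to. The tag key and value are assumptions.
    target_tags = {
      Backup = "jenkins"
    }

    schedule {
      name = "daily"

      create_rule {
        interval      = 24
        interval_unit = "HOURS"
        times         = ["20:00"]
      }

      # Keep the 14 most recent snapshots.
      retain_rule {
        count = 14
      }
    }
  }
}
```

Because the policy matches on the tag rather than on an instance, a replacement instance whose volume carries the same tag keeps using the same policy.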

However, there are a few features that the lambda based backup functions support which are currently not available with DLM:

  • Backup schedules with frequencies longer than 1 day (e.g., weekly). DLM does not support any frequency longer than 1 day.
    • NOTE: There is an open PR in the AWS provider to add support for this.
  • Minimum backup counts. The Lambda-based backup mechanism lets you specify a minimum number of backups to always keep around.

Example code

  • Check out the jenkins example for working sample code.
  • See vars.tf for all parameters you can configure on this module.

Specifying an instance

To specify the instance to back up, provide the instance's name via the instance_name parameter. This must match the value of the Name tag on your EC2 Instance.
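As a rough sketch of how the Name tag and instance_name line up, the snippet below defines the name once and uses it as the tag on a hypothetical instance. The variable ami_id, the resource name, the instance type, and the value jenkins-server are assumptions made for illustration only.

```hcl
# Hypothetical input: the AMI to launch.
variable "ami_id" {
  description = "ID of the AMI to run on the instance."
  type        = string
}

locals {
  # Single source of truth for the instance name.
  instance_name = "jenkins-server"
}

resource "aws_instance" "jenkins" {
  ami           = var.ami_id
  instance_type = "t3.micro"

  # The backup module looks up the instance by the value of this tag.
  tags = {
    Name = local.instance_name
  }
}
```

You would then pass local.instance_name as the instance_name parameter of this module, so the tag and the parameter cannot drift apart.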

Configuring the schedule

You can specify how often this lambda function runs using the backup_job_schedule_expression parameter. This can be either a rate expression such as rate(1 day) or a cron expression such as cron(0 20 * * ? *). See Schedule Expressions for more information and examples.
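The schedule expression has to be kept in sync by hand with the backup_job_alarm_period parameter (listed in the sample usage), which expresses the same interval in seconds. A minimal sketch, assuming a daily backup; the local names are hypothetical:

```hcl
locals {
  # Run the backup once a day.
  backup_job_schedule_expression = "rate(1 day)"

  # Equivalent cron form: every day at 20:00. CloudWatch Events evaluates
  # cron expressions in UTC.
  # backup_job_schedule_expression = "cron(0 20 * * ? *)"

  # The same interval in seconds: 24 * 60 * 60 = 86400.
  backup_job_alarm_period = 24 * 60 * 60
}
```

If you change one value (say, to rate(12 hours)), remember to change the other to match (43200 seconds in that case).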

Triggering alarms if backup fails

Every time the function runs successfully, it will increment a CloudWatch Metric. We've configured a CloudWatch alarm to go off if the metric is not updated on the expected schedule, as that implies the backup has failed to run!

You can specify the metric namespace and name using the cloudwatch_metric_namespace and cloudwatch_metric_name parameters, respectively. You can specify the SNS topics to notify when the alarm goes off using the alarm_sns_topic_arns parameter, which takes a list of topic ARNs.
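The following sketch shows one way these alarm-related values might be wired up, assuming you create a dedicated SNS topic for backup alarms. The topic name, metric namespace, and metric name are hypothetical; use whatever fits your own conventions.

```hcl
# Hypothetical SNS topic that receives a notification when the alarm fires.
resource "aws_sns_topic" "backup_alarms" {
  name = "ec2-backup-alarms"
}

locals {
  # alarm_sns_topic_arns is a list, so several topics can be notified.
  alarm_sns_topic_arns = [aws_sns_topic.backup_alarms.arn]

  # Where the backup function records each successful run.
  cloudwatch_metric_namespace = "Custom/EC2Backup"
  cloudwatch_metric_name      = "jenkins-server-backup"
}
```

Using a distinct metric name per backed-up instance keeps one instance's successful runs from masking another's failures.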

Cleaning up old snapshots

To prevent the number of snapshots from growing infinitely and costing you a lot of money, ec2-snapper will automatically delete older snapshots. You can specify two parameters to control how many snapshots are kept around:

  • delete_older_than: Delete all snapshots older than this duration. For example, if you set this parameter to 30d, then snapshots that are more than 30 days old will be deleted. See Delete AMIs older than for more info.

  • require_at_least: Always keep around at least this many snapshots. This helps avoid deleting too much if you have, for example, a misconfiguration of the delete_older_than parameter.
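As an illustration of how the two retention parameters interact, here is a small sketch with hypothetical values, assuming a daily backup schedule:

```hcl
locals {
  # Snapshots more than 30 days old become candidates for deletion.
  delete_older_than = "30d"

  # Never let cleanup reduce the number of snapshots below 7, even if
  # they are all older than 30 days.
  require_at_least = 7
}
```

With daily backups, this keeps roughly the last 30 snapshots in normal operation. If backups stop running for a long stretch, the 7 most recent snapshots should still be retained rather than aging out.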

Sample Usage

main.tf

# ------------------------------------------------------------------------------------------------------
# DEPLOY GRUNTWORK'S EC2-BACKUP MODULE
# ------------------------------------------------------------------------------------------------------

module "ec_2_backup" {

  source = "git::git@github.com:gruntwork-io/terraform-aws-ci.git//modules/ec2-backup?ref=v0.59.7"

  # ----------------------------------------------------------------------------------------------------
  # REQUIRED VARIABLES
  # ----------------------------------------------------------------------------------------------------

  # The ARN of SNS topics to notify if the CloudWatch alarm goes off because the
  # backup job failed.
  alarm_sns_topic_arns = <list(string)>

  # How often, in seconds, the backup lambda function is expected to run. This
  # is the same as var.backup_job_schedule_expression, but unfortunately,
  # Terraform offers no way to convert rate expressions to seconds. We add a
  # CloudWatch alarm that triggers if the value of var.cloudwatch_metric_name
  # and var.cloudwatch_metric_namespace isn't updated within this time period,
  # as that indicates the backup failed to run.
  backup_job_alarm_period = <number>

  # An expression that defines the schedule for how often to run the backup
  # lambda function. For example, cron(0 20 * * ? *) or rate(1 day).
  backup_job_schedule_expression = <string>

  # The name for the CloudWatch Metric the AWS lambda backup function will
  # increment every time the job completes successfully.
  cloudwatch_metric_name = <string>

  # The namespace for the CloudWatch Metric the AWS lambda backup function will
  # increment every time the job completes successfully.
  cloudwatch_metric_namespace = <string>

  # Delete all snapshots older than this value (e.g., 30d, 5h, or 15m). For
  # example, setting this to 30d means all snapshots more than 30 days old will
  # be deleted.
  delete_older_than = <string>

  # The name of the EC2 Instance to backup. This must be the value of the tag
  # 'Name' on that Instance.
  instance_name = <string>

  # The minimum number of snapshots to keep around. This ensures some number of
  # snapshots are never deleted, regardless of the value of
  # var.delete_older_than.
  require_at_least = <number>

  # ----------------------------------------------------------------------------------------------------
  # OPTIONAL VARIABLES
  # ----------------------------------------------------------------------------------------------------

  # When true, all IAM policies will be managed as dedicated policies rather
  # than inline policies attached to the IAM roles. Dedicated managed policies
  # are friendlier to automated policy checkers, which may scan a single
  # resource for findings. As such, it is important to avoid inline policies
  # when targeting compliance with various security standards.
  use_managed_iam_policies = true

}
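The template above uses type placeholders such as <string>. For comparison, here is a sketch of the same module call with concrete values filled in, assuming a daily backup of an instance tagged Name = jenkins-server. Every value, the module label, and the alarm_sns_topic_arn variable are hypothetical.

```hcl
# Hypothetical input: an existing SNS topic for backup failure alerts.
variable "alarm_sns_topic_arn" {
  description = "ARN of the SNS topic to notify when the backup alarm fires."
  type        = string
}

module "jenkins_backup" {
  source = "git::git@github.com:gruntwork-io/terraform-aws-ci.git//modules/ec2-backup?ref=v0.59.7"

  # Back up the instance whose Name tag is "jenkins-server".
  instance_name = "jenkins-server"

  # Run once a day; the alarm period is the same interval in seconds.
  backup_job_schedule_expression = "rate(1 day)"
  backup_job_alarm_period        = 86400

  # Metric incremented on each successful run, and where to send alarms.
  cloudwatch_metric_namespace = "Custom/EC2Backup"
  cloudwatch_metric_name      = "jenkins-server-backup"
  alarm_sns_topic_arns        = [var.alarm_sns_topic_arn]

  # Retention: drop snapshots older than 30 days, but always keep at least 7.
  delete_older_than = "30d"
  require_at_least  = 7
}
```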


Reference

Required

alarm_sns_topic_arns (list(string), required)

The ARNs of the SNS topics to notify if the CloudWatch alarm goes off because the backup job failed.

backup_job_alarm_period (number, required)

How often, in seconds, the backup lambda function is expected to run. This is the same as backup_job_schedule_expression, but unfortunately, Terraform offers no way to convert rate expressions to seconds. We add a CloudWatch alarm that triggers if the value of cloudwatch_metric_name and cloudwatch_metric_namespace isn't updated within this time period, as that indicates the backup failed to run.

backup_job_schedule_expression (string, required)

An expression that defines the schedule for how often to run the backup lambda function. For example, cron(0 20 * * ? *) or rate(1 day).

cloudwatch_metric_name (string, required)

The name for the CloudWatch Metric the AWS lambda backup function will increment every time the job completes successfully.

cloudwatch_metric_namespace (string, required)

The namespace for the CloudWatch Metric the AWS lambda backup function will increment every time the job completes successfully.

delete_older_than (string, required)

Delete all snapshots older than this value (e.g., 30d, 5h, or 15m). For example, setting this to 30d means all snapshots more than 30 days old will be deleted.

instance_name (string, required)

The name of the EC2 Instance to back up. This must be the value of the tag 'Name' on that Instance.

require_at_least (number, required)

The minimum number of snapshots to keep around. This ensures some number of snapshots are never deleted, regardless of the value of delete_older_than.

Optional

use_managed_iam_policies (bool, optional)

When true, all IAM policies will be managed as dedicated policies rather than inline policies attached to the IAM roles. Dedicated managed policies are friendlier to automated policy checkers, which may scan a single resource for findings. As such, it is important to avoid inline policies when targeting compliance with various security standards.

Default: true