Cost Management
Overview
This service deploys a unified AWS cost-control stack:
- AWS Budgets: one or more threshold-based budgets (daily, monthly, etc.).
- AWS Cost Anomaly Detection (CAD): ML-based anomaly monitoring with subscriber notifications.
- Notification fan-out: a single SNS topic receives both Budgets and CAD events, with optional Slack delivery via a Lambda notifier (webhook URL read from Secrets Manager at runtime) and optional direct email subscriptions.
- Scheduled cloud-nuke (optional): an ECS Fargate task that runs cloud-nuke on a configurable schedule, defaulting to --dry-run. Useful for ephemeral, sandbox, or developer accounts.
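As a rough sketch of the notification fan-out, the inputs below turn on both Slack and email delivery for the shared SNS topic. The account name, name prefix, secret ARN, and email address are placeholders, and the example assumes the two AWS provider configurations (default and aws.us_east_1) are declared elsewhere in the root module.

```hcl
module "cost_management" {
  source = "git::git@github.com:gruntwork-io/terraform-aws-service-catalog.git//modules/mgmt/cost-management?ref=v2.9.1"

  account_name = "sandbox"     # placeholder
  name         = "cost-alerts" # placeholder

  # Slack delivery: the Lambda notifier reads the webhook URL from this
  # secret at runtime. The ARN below is a made-up placeholder.
  enable_slack_notifications            = true
  slack_webhook_url_secrets_manager_arn = "arn:aws:secretsmanager:us-east-1:111111111111:secret:slack-webhook-AbCdEf"

  # Direct email delivery: each address must confirm the SNS subscription
  # before it receives alerts.
  email_endpoints = ["finops@example.com"] # placeholder

  # Assumes both provider configurations exist in the calling module.
  providers = {
    aws           = aws
    aws.us_east_1 = aws.us_east_1
  }
}
```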
Learn
Under the hood, this service composes Gruntwork modules from terraform-aws-messaging (for the SNS topic) and terraform-aws-lambda (for the Slack notifier). If you are a subscriber and don't have access to those repos, email support@gruntwork.io.
Core concepts
- AWS Budgets: cost thresholds evaluated on a daily, monthly, quarterly, or annual cadence. Each budget publishes to an SNS topic when its threshold is crossed. See the AWS Budgets documentation for the data model and the available notification_type and time_unit values.
- AWS Cost Anomaly Detection: an ML-based monitor that observes spend patterns and emits an event when an anomaly is detected. See the Cost Anomaly Detection documentation.
- cloud-nuke: a Gruntwork tool that deletes resources in an AWS account, intended for cleanup of ephemeral or sandbox accounts. See the cloud-nuke README for the resource matrix and the --config file format.
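To make the budget fields concrete, here is an illustrative value for the budgets input that mixes a forecast-based monthly alert with an actual-spend quarterly alert. The names and amounts are invented for the example; only the field names and allowed values come from the input documentation in the Sample Usage and Reference sections.

```hcl
budgets = [
  {
    # Alerts when forecasted monthly spend reaches 80% of 800 USD (640 USD).
    name              = "monthly-forecast"
    time_unit         = "MONTHLY"
    limit_amount      = 800
    threshold_percent = 80
    notification_type = "FORECASTED"
  },
  {
    # Alerts when actual quarterly spend reaches 100% of 2400 USD.
    name              = "quarterly-actual"
    time_unit         = "QUARTERLY"
    limit_amount      = 2400
    threshold_percent = 100
    notification_type = "ACTUAL"
  },
]
```

Each entry becomes its own aws_budgets_budget, so a forecast alert and an actual-spend alert for the same period are expressed as two entries.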
Important caveats
AWS Cost Anomaly Detection is global, but pinned to us-east-1
aws_ce_anomaly_monitor and aws_ce_anomaly_subscription are global resources that must be created via the us-east-1
endpoint. This module requires an aliased provider named aws.us_east_1 configured for us-east-1. If your default
provider is already us-east-1, you can still alias it. See the example for the canonical configuration.
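A minimal sketch of the provider wiring described above, assuming a hypothetical home region of eu-west-1 and placeholder values for the required inputs. The linked example remains the canonical configuration.

```hcl
# Default provider: the region where the SNS topic, Lambda, and ECS
# resources should live. "eu-west-1" is only an illustrative choice.
provider "aws" {
  region = "eu-west-1"
}

# Aliased provider for Cost Anomaly Detection, which must be called
# through the us-east-1 endpoint.
provider "aws" {
  alias  = "us_east_1"
  region = "us-east-1"
}

module "cost_management" {
  source = "git::git@github.com:gruntwork-io/terraform-aws-service-catalog.git//modules/mgmt/cost-management?ref=v2.9.1"

  account_name = "sandbox"     # placeholder
  name         = "cost-alerts" # placeholder

  # Aliased providers are never inherited implicitly, so pass them explicitly.
  providers = {
    aws           = aws
    aws.us_east_1 = aws.us_east_1
  }
}
```

If the default provider already targets us-east-1, the same layout applies: both provider blocks simply use the same region.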
Only one DIMENSIONAL anomaly monitor per dimension per account
AWS allows at most one DIMENSIONAL aws_ce_anomaly_monitor per dimension (e.g., SERVICE) per account. If your account
already has one (created by another tool, a prior deployment, or the AWS console), this module's apply will fail with
ValidationException: Limit exceeded on dimensional spend monitor creation. Set enable_anomaly_detection = false and
attach an aws_ce_anomaly_subscription to the pre-existing monitor out-of-band, or destroy the existing monitor first.
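The out-of-band option could look roughly like the sketch below. It assumes enable_anomaly_detection = false on the module and that you know the ARN of the pre-existing monitor. Both variables are hypothetical names introduced for this example, and the threshold of 100 USD mirrors the module's default anomaly_threshold_amount.

```hcl
# Hypothetical inputs for this sketch; they are not part of the module.
variable "existing_monitor_arn" {
  type        = string
  description = "ARN of the DIMENSIONAL anomaly monitor that already exists in the account."
}

variable "alerts_topic_arn" {
  type        = string
  description = "ARN of the SNS topic that should receive anomaly notifications."
}

resource "aws_ce_anomaly_subscription" "existing_monitor" {
  # Cost Anomaly Detection resources must go through the us-east-1 endpoint.
  provider = aws.us_east_1

  name      = "cost-alerts-existing-monitor"
  frequency = "IMMEDIATE" # SNS subscribers only support IMMEDIATE

  monitor_arn_list = [var.existing_monitor_arn]

  subscriber {
    type    = "SNS"
    address = var.alerts_topic_arn
  }

  # Notify only when the total impact is at least 100 USD.
  threshold_expression {
    dimension {
      key           = "ANOMALY_TOTAL_IMPACT_ABSOLUTE"
      match_options = ["GREATER_THAN_OR_EQUAL"]
      values        = ["100"]
    }
  }
}
```

The SNS topic's access policy also has to allow the Cost Anomaly Detection service principal (costalerts.amazonaws.com) to publish. Whether the topic created by this module already grants that is not stated here, so verify it before relying on this setup.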
cloud-nuke is destructive
When enable_scheduled_cloud_nuke = true:
- The module defaults to cloud_nuke_dry_run = true. Dry-run mode logs what would be deleted without deleting anything.
- To enable real deletions, you must set both cloud_nuke_dry_run = false and acknowledge_destructive_cloud_nuke = true. The module enforces this via a plan-time precondition.
- The module does not ship an IAM policy granting cloud-nuke permissions to delete resources. You are expected to attach a separate policy to the role exposed as output.cloud_nuke_task_role_arn. A reference policy snippet is published in the cloud-nuke README. Your security team should review and trim it before attaching.
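The sketch below shows one way the two switches and the separate IAM policy might fit together. The image URI, VPC and subnet IDs, and the policy statement are placeholders chosen for illustration. In particular, the two EC2 actions are not the reference policy from the cloud-nuke README, which you should consult and trim for real use.

```hcl
module "cost_management" {
  source = "git::git@github.com:gruntwork-io/terraform-aws-service-catalog.git//modules/mgmt/cost-management?ref=v2.9.1"

  account_name = "sandbox"     # placeholder
  name         = "cost-alerts" # placeholder

  enable_scheduled_cloud_nuke = true
  cloud_nuke_image            = "111111111111.dkr.ecr.eu-west-1.amazonaws.com/cloud-nuke:latest" # placeholder
  vpc_id                      = "vpc-0123456789abcdef0"                                          # placeholder
  subnet_ids                  = ["subnet-0123456789abcdef0"]                                     # placeholder

  # Both values are required for real deletions; setting only one of them
  # fails the module's plan-time precondition.
  cloud_nuke_dry_run                 = false
  acknowledge_destructive_cloud_nuke = true

  providers = {
    aws           = aws
    aws.us_east_1 = aws.us_east_1
  }
}

# Deletion permissions are attached separately to the task role.
resource "aws_iam_role_policy" "cloud_nuke_deletions" {
  name = "cloud-nuke-deletions"

  # aws_iam_role_policy expects a role name, so take the last path segment
  # of the role ARN.
  role = reverse(split("/", module.cost_management.cloud_nuke_task_role_arn))[0]

  # Illustrative statement only: lets the task find and terminate EC2 instances.
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect   = "Allow"
      Action   = ["ec2:DescribeInstances", "ec2:TerminateInstances"]
      Resource = "*"
    }]
  })
}
```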
Deploy
See examples/for-learning-and-testing/mgmt/cost-management for a runnable example.
Sample Usage
- Terraform
- Terragrunt
# ------------------------------------------------------------------------------------------------------
# DEPLOY GRUNTWORK'S COST-MANAGEMENT MODULE
# ------------------------------------------------------------------------------------------------------
module "cost_management" {
source = "git::git@github.com:gruntwork-io/terraform-aws-service-catalog.git//modules/mgmt/cost-management?ref=v2.9.1"
# ----------------------------------------------------------------------------------------------------
# REQUIRED VARIABLES
# ----------------------------------------------------------------------------------------------------
# Human-readable AWS account name (e.g., 'sandbox', 'prod-payments'). Included
# as a label in Slack notifications so a single channel can disambiguate
# alerts from multiple accounts.
account_name = <string>
# Name prefix applied to all resources created by this module (SNS topic,
# Budgets, CAD, Slack Lambda, cloud-nuke task).
name = <string>
# ----------------------------------------------------------------------------------------------------
# OPTIONAL VARIABLES
# ----------------------------------------------------------------------------------------------------
# Explicit acknowledgement that you intend to run cloud-nuke in destructive
# (non-dry-run) mode. To enable real deletions, this must be true AND
# var.cloud_nuke_dry_run must be false. Enforced by a plan-time precondition.
acknowledge_destructive_cloud_nuke = false
# Dimension to monitor for anomalies. One of SERVICE or LINKED_ACCOUNT.
anomaly_monitor_dimension = "SERVICE"
# Notification frequency for the anomaly subscription. AWS only allows
# IMMEDIATE for SNS subscribers; DAILY and WEEKLY require EMAIL subscribers
# and are rejected at apply time.
anomaly_subscription_frequency = "IMMEDIATE"
# Minimum total impact, in USD, above which a detected anomaly triggers a
# subscription notification.
anomaly_threshold_amount = 100
# AWS Budgets to create. Each entry produces one aws_budgets_budget that
# publishes to the module's SNS topic when the threshold is crossed. Fields:
# 'name' is appended to var.name; 'time_unit' must be DAILY, MONTHLY,
# QUARTERLY, or ANNUALLY; 'limit_amount' is the cap in USD;
# 'threshold_percent' is the percentage of limit_amount at which to alert;
# 'notification_type' must be ACTUAL or FORECASTED.
budgets = [{"limit_amount":75,"name":"daily","notification_type":"ACTUAL","threshold_percent":100,"time_unit":"DAILY"},{"limit_amount":800,"name":"monthly-actual","notification_type":"ACTUAL","threshold_percent":100,"time_unit":"MONTHLY"}]
# Whether to assign a public IP to the cloud-nuke Fargate task ENI. Set to
# true when running in public subnets without a NAT gateway (typical for AWS
# default VPCs) so the task can reach AWS API endpoints and the image
# registry.
cloud_nuke_assign_public_ip = false
# Optional inline cloud-nuke config YAML. When non-null, the module writes the
# YAML to an S3 object and points cloud-nuke at it via --config. When null, no
# config file is passed and cloud-nuke uses its defaults. See
# https://github.com/gruntwork-io/cloud-nuke#config-file for the schema.
cloud_nuke_config_yaml = null
# Whether to run cloud-nuke in --dry-run mode, which logs what cloud-nuke
# would delete without actually deleting anything. See
# var.acknowledge_destructive_cloud_nuke for how to enable real deletions.
cloud_nuke_dry_run = true
# Container image reference for cloud-nuke (e.g. an ECR repository URI or a
# public image). Gruntwork does not currently publish a cloud-nuke OCI image,
# so operators must build and host their own — see the module README for a
# reference Dockerfile. Required when var.enable_scheduled_cloud_nuke is true.
cloud_nuke_image = null
# EventBridge Scheduler schedule expression for cloud-nuke runs. Accepts a
# cron(...) or rate(...) expression. The default is 07:00 UTC daily.
cloud_nuke_schedule_expression = "cron(0 7 * * ? *)"
# Security groups to attach to the cloud-nuke Fargate task ENI. Each must
# permit outbound HTTPS so the task can reach AWS API endpoints, S3 (for
# config), and ECR/GHCR (for the image pull). When empty, ECS attaches the VPC
# default security group, whose outbound rules may not permit these calls.
cloud_nuke_security_group_ids = []
# AWS regions cloud-nuke should operate against, passed as repeated --region
# flags. An empty list defers to cloud-nuke's default behavior, which is to
# operate on all regions enabled in the account.
cloud_nuke_target_regions = []
# Fargate task CPU units allocated to the cloud-nuke task. See
# https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task-cpu-memory-error.html
# for valid CPU/memory combinations.
cloud_nuke_task_cpu = 512
# Memory, in MB, allocated to the cloud-nuke Fargate task. See
# https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task-cpu-memory-error.html
# for valid CPU/memory combinations.
cloud_nuke_task_memory = 1024
# ARN of an existing ECS cluster to run the cloud-nuke task on. When null,
# this module creates a dedicated cluster named '<var.name>-cloud-nuke'. Only
# used when var.enable_scheduled_cloud_nuke is true.
ecs_cluster_arn = null
# Email addresses to subscribe to the cost alerts SNS topic. Each address
# receives an AWS confirmation email that must be acknowledged before delivery
# begins.
email_endpoints = []
# Whether to provision AWS Cost Anomaly Detection. Requires the aws.us_east_1
# aliased provider since CAD resources are global but only callable via
# us-east-1.
enable_anomaly_detection = true
# Whether to provision AWS Budgets alerts. When true, each entry in
# var.budgets becomes one aws_budgets_budget that publishes to the module's
# SNS topic.
enable_budget_alerts = true
# Whether to provision an EventBridge-scheduled Fargate task that runs
# cloud-nuke on var.cloud_nuke_schedule_expression. Off by default because
# cloud-nuke is destructive. When true, var.vpc_id and var.subnet_ids must
# also be set.
enable_scheduled_cloud_nuke = false
# Whether to deploy the Slack notifier Lambda and subscribe it to the cost
# alerts SNS topic. When true, var.slack_webhook_url_secrets_manager_arn must
# also be set.
enable_slack_notifications = false
# Memory, in MB, allocated to the Slack notifier Lambda.
lambda_memory_size = 128
# Maximum runtime, in seconds, of the Slack notifier Lambda before
# termination.
lambda_timeout = 30
# Retention, in days, for CloudWatch logs of the Slack notifier Lambda and the
# cloud-nuke Fargate task.
log_retention_days = 30
# ARN of a Secrets Manager secret whose SecretString is the Slack incoming
# webhook URL. Required when var.enable_slack_notifications is true. The
# Lambda reads this secret at runtime so the URL never appears in plan output
# or Terraform state.
slack_webhook_url_secrets_manager_arn = null
# Private subnet IDs the cloud-nuke Fargate task launches into. Subnets must
# have outbound internet access (e.g., via a NAT gateway) so the task can
# reach AWS API endpoints. Required when var.enable_scheduled_cloud_nuke is
# true.
subnet_ids = []
# VPC the cloud-nuke Fargate task runs in. Required when
# var.enable_scheduled_cloud_nuke is true.
vpc_id = null
}
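For comparison with the full variable listing above, a pared-down dry-run configuration might look like the following sketch. It assumes a default VPC with public subnets and no NAT gateway, which is why a public IP is assigned. All IDs, the image URI, and the chosen regions are placeholders.

```hcl
module "cost_management" {
  source = "git::git@github.com:gruntwork-io/terraform-aws-service-catalog.git//modules/mgmt/cost-management?ref=v2.9.1"

  account_name = "dev-sandbox" # placeholder
  name         = "cost-alerts" # placeholder

  # Scheduled cloud-nuke in the default dry-run mode: logs what would be
  # deleted without deleting anything.
  enable_scheduled_cloud_nuke = true
  cloud_nuke_dry_run          = true
  cloud_nuke_image            = "111111111111.dkr.ecr.us-east-1.amazonaws.com/cloud-nuke:latest" # placeholder

  # 03:00 UTC, Monday through Friday.
  cloud_nuke_schedule_expression = "cron(0 3 ? * MON-FRI *)"

  # Limit the run to two regions instead of every enabled region.
  cloud_nuke_target_regions = ["us-east-1", "us-west-2"]

  # Public subnets without a NAT gateway need a public IP on the task ENI.
  vpc_id                        = "vpc-0123456789abcdef0"      # placeholder
  subnet_ids                    = ["subnet-0123456789abcdef0"] # placeholder
  cloud_nuke_assign_public_ip   = true
  cloud_nuke_security_group_ids = ["sg-0123456789abcdef0"] # must allow outbound HTTPS

  providers = {
    aws           = aws
    aws.us_east_1 = aws.us_east_1
  }
}
```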
# ------------------------------------------------------------------------------------------------------
# DEPLOY GRUNTWORK'S COST-MANAGEMENT MODULE
# ------------------------------------------------------------------------------------------------------
terraform {
source = "git::git@github.com:gruntwork-io/terraform-aws-service-catalog.git//modules/mgmt/cost-management?ref=v2.9.1"
}
inputs = {
# ----------------------------------------------------------------------------------------------------
# REQUIRED VARIABLES
# ----------------------------------------------------------------------------------------------------
# Human-readable AWS account name (e.g., 'sandbox', 'prod-payments'). Included
# as a label in Slack notifications so a single channel can disambiguate
# alerts from multiple accounts.
account_name = <string>
# Name prefix applied to all resources created by this module (SNS topic,
# Budgets, CAD, Slack Lambda, cloud-nuke task).
name = <string>
# ----------------------------------------------------------------------------------------------------
# OPTIONAL VARIABLES
# ----------------------------------------------------------------------------------------------------
# Explicit acknowledgement that you intend to run cloud-nuke in destructive
# (non-dry-run) mode. To enable real deletions, this must be true AND
# var.cloud_nuke_dry_run must be false. Enforced by a plan-time precondition.
acknowledge_destructive_cloud_nuke = false
# Dimension to monitor for anomalies. One of SERVICE or LINKED_ACCOUNT.
anomaly_monitor_dimension = "SERVICE"
# Notification frequency for the anomaly subscription. AWS only allows
# IMMEDIATE for SNS subscribers; DAILY and WEEKLY require EMAIL subscribers
# and are rejected at apply time.
anomaly_subscription_frequency = "IMMEDIATE"
# Minimum total impact, in USD, above which a detected anomaly triggers a
# subscription notification.
anomaly_threshold_amount = 100
# AWS Budgets to create. Each entry produces one aws_budgets_budget that
# publishes to the module's SNS topic when the threshold is crossed. Fields:
# 'name' is appended to var.name; 'time_unit' must be DAILY, MONTHLY,
# QUARTERLY, or ANNUALLY; 'limit_amount' is the cap in USD;
# 'threshold_percent' is the percentage of limit_amount at which to alert;
# 'notification_type' must be ACTUAL or FORECASTED.
budgets = [{"limit_amount":75,"name":"daily","notification_type":"ACTUAL","threshold_percent":100,"time_unit":"DAILY"},{"limit_amount":800,"name":"monthly-actual","notification_type":"ACTUAL","threshold_percent":100,"time_unit":"MONTHLY"}]
# Whether to assign a public IP to the cloud-nuke Fargate task ENI. Set to
# true when running in public subnets without a NAT gateway (typical for AWS
# default VPCs) so the task can reach AWS API endpoints and the image
# registry.
cloud_nuke_assign_public_ip = false
# Optional inline cloud-nuke config YAML. When non-null, the module writes the
# YAML to an S3 object and points cloud-nuke at it via --config. When null, no
# config file is passed and cloud-nuke uses its defaults. See
# https://github.com/gruntwork-io/cloud-nuke#config-file for the schema.
cloud_nuke_config_yaml = null
# Whether to run cloud-nuke in --dry-run mode, which logs what cloud-nuke
# would delete without actually deleting anything. See
# var.acknowledge_destructive_cloud_nuke for how to enable real deletions.
cloud_nuke_dry_run = true
# Container image reference for cloud-nuke (e.g. an ECR repository URI or a
# public image). Gruntwork does not currently publish a cloud-nuke OCI image,
# so operators must build and host their own — see the module README for a
# reference Dockerfile. Required when var.enable_scheduled_cloud_nuke is true.
cloud_nuke_image = null
# EventBridge Scheduler schedule expression for cloud-nuke runs. Accepts a
# cron(...) or rate(...) expression. The default is 07:00 UTC daily.
cloud_nuke_schedule_expression = "cron(0 7 * * ? *)"
# Security groups to attach to the cloud-nuke Fargate task ENI. Each must
# permit outbound HTTPS so the task can reach AWS API endpoints, S3 (for
# config), and ECR/GHCR (for the image pull). When empty, ECS attaches the VPC
# default security group, whose outbound rules may not permit these calls.
cloud_nuke_security_group_ids = []
# AWS regions cloud-nuke should operate against, passed as repeated --region
# flags. An empty list defers to cloud-nuke's default behavior, which is to
# operate on all regions enabled in the account.
cloud_nuke_target_regions = []
# Fargate task CPU units allocated to the cloud-nuke task. See
# https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task-cpu-memory-error.html
# for valid CPU/memory combinations.
cloud_nuke_task_cpu = 512
# Memory, in MB, allocated to the cloud-nuke Fargate task. See
# https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task-cpu-memory-error.html
# for valid CPU/memory combinations.
cloud_nuke_task_memory = 1024
# ARN of an existing ECS cluster to run the cloud-nuke task on. When null,
# this module creates a dedicated cluster named '<var.name>-cloud-nuke'. Only
# used when var.enable_scheduled_cloud_nuke is true.
ecs_cluster_arn = null
# Email addresses to subscribe to the cost alerts SNS topic. Each address
# receives an AWS confirmation email that must be acknowledged before delivery
# begins.
email_endpoints = []
# Whether to provision AWS Cost Anomaly Detection. Requires the aws.us_east_1
# aliased provider since CAD resources are global but only callable via
# us-east-1.
enable_anomaly_detection = true
# Whether to provision AWS Budgets alerts. When true, each entry in
# var.budgets becomes one aws_budgets_budget that publishes to the module's
# SNS topic.
enable_budget_alerts = true
# Whether to provision an EventBridge-scheduled Fargate task that runs
# cloud-nuke on var.cloud_nuke_schedule_expression. Off by default because
# cloud-nuke is destructive. When true, var.vpc_id and var.subnet_ids must
# also be set.
enable_scheduled_cloud_nuke = false
# Whether to deploy the Slack notifier Lambda and subscribe it to the cost
# alerts SNS topic. When true, var.slack_webhook_url_secrets_manager_arn must
# also be set.
enable_slack_notifications = false
# Memory, in MB, allocated to the Slack notifier Lambda.
lambda_memory_size = 128
# Maximum runtime, in seconds, of the Slack notifier Lambda before
# termination.
lambda_timeout = 30
# Retention, in days, for CloudWatch logs of the Slack notifier Lambda and the
# cloud-nuke Fargate task.
log_retention_days = 30
# ARN of a Secrets Manager secret whose SecretString is the Slack incoming
# webhook URL. Required when var.enable_slack_notifications is true. The
# Lambda reads this secret at runtime so the URL never appears in plan output
# or Terraform state.
slack_webhook_url_secrets_manager_arn = null
# Private subnet IDs the cloud-nuke Fargate task launches into. Subnets must
# have outbound internet access (e.g., via a NAT gateway) so the task can
# reach AWS API endpoints. Required when var.enable_scheduled_cloud_nuke is
# true.
subnet_ids = []
# VPC the cloud-nuke Fargate task runs in. Required when
# var.enable_scheduled_cloud_nuke is true.
vpc_id = null
}
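With Terragrunt the module is typically used as the root module, so the aliased us-east-1 provider has to be generated alongside it. The block below is a sketch of one common way to do that with a generate block. The file name and the eu-west-1 home region are assumptions, and the example in this repository should take precedence if it differs.

```hcl
# Goes in the same terragrunt.hcl as the terraform and inputs blocks above.
generate "provider" {
  path      = "provider.tf"
  if_exists = "overwrite_terragrunt"
  contents  = <<EOF
# Default provider for regional resources. The region is illustrative.
provider "aws" {
  region = "eu-west-1"
}

# Aliased provider required for Cost Anomaly Detection.
provider "aws" {
  alias  = "us_east_1"
  region = "us-east-1"
}
EOF
}
```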
Reference
- Inputs
- Outputs
Required
- account_name (string): Human-readable AWS account name (e.g., 'sandbox', 'prod-payments'). Included as a label in Slack notifications so a single channel can disambiguate alerts from multiple accounts.
- name (string): Name prefix applied to all resources created by this module (SNS topic, Budgets, CAD, Slack Lambda, cloud-nuke task).
Optional
- acknowledge_destructive_cloud_nuke: Explicit acknowledgement that you intend to run cloud-nuke in destructive (non-dry-run) mode. To enable real deletions, this must be true AND cloud_nuke_dry_run must be false. Enforced by a plan-time precondition. Default: false
- anomaly_monitor_dimension: Dimension to monitor for anomalies. One of SERVICE or LINKED_ACCOUNT. Default: "SERVICE"
- anomaly_subscription_frequency: Notification frequency for the anomaly subscription. AWS only allows IMMEDIATE for SNS subscribers; DAILY and WEEKLY require EMAIL subscribers and are rejected at apply time. Default: "IMMEDIATE"
- anomaly_threshold_amount (number): Minimum total impact, in USD, above which a detected anomaly triggers a subscription notification. Default: 100
- budgets (list(object(…))): AWS Budgets to create. Each entry produces one aws_budgets_budget that publishes to the module's SNS topic when the threshold is crossed. Fields: 'name' is appended to name; 'time_unit' must be DAILY, MONTHLY, QUARTERLY, or ANNUALLY; 'limit_amount' is the cap in USD; 'threshold_percent' is the percentage of limit_amount at which to alert; 'notification_type' must be ACTUAL or FORECASTED.
  Type:
    list(object({
      name              = string
      time_unit         = string # DAILY | MONTHLY | QUARTERLY | ANNUALLY
      limit_amount      = number # USD
      threshold_percent = number
      notification_type = string # ACTUAL | FORECASTED
    }))
  Default:
    [
      {
        limit_amount = 75,
        name = "daily",
        notification_type = "ACTUAL",
        threshold_percent = 100,
        time_unit = "DAILY"
      },
      {
        limit_amount = 800,
        name = "monthly-actual",
        notification_type = "ACTUAL",
        threshold_percent = 100,
        time_unit = "MONTHLY"
      }
    ]
- cloud_nuke_assign_public_ip: Whether to assign a public IP to the cloud-nuke Fargate task ENI. Set to true when running in public subnets without a NAT gateway (typical for AWS default VPCs) so the task can reach AWS API endpoints and the image registry. Default: false
- cloud_nuke_config_yaml (string): Optional inline cloud-nuke config YAML. When non-null, the module writes the YAML to an S3 object and points cloud-nuke at it via --config. When null, no config file is passed and cloud-nuke uses its defaults. See https://github.com/gruntwork-io/cloud-nuke#config-file for the schema. Default: null
- cloud_nuke_dry_run: Whether to run cloud-nuke in --dry-run mode, which logs what cloud-nuke would delete without actually deleting anything. See acknowledge_destructive_cloud_nuke for how to enable real deletions. Default: true
- cloud_nuke_image (string): Container image reference for cloud-nuke (e.g. an ECR repository URI or a public image). Gruntwork does not currently publish a cloud-nuke OCI image, so operators must build and host their own. See the module README for a reference Dockerfile. Required when enable_scheduled_cloud_nuke is true. Default: null
- cloud_nuke_schedule_expression: EventBridge Scheduler schedule expression for cloud-nuke runs. Accepts a cron(...) or rate(...) expression. The default is 07:00 UTC daily. Default: "cron(0 7 * * ? *)"
- cloud_nuke_security_group_ids (list(string)): Security groups to attach to the cloud-nuke Fargate task ENI. Each must permit outbound HTTPS so the task can reach AWS API endpoints, S3 (for config), and ECR/GHCR (for the image pull). When empty, ECS attaches the VPC default security group, whose outbound rules may not permit these calls. Default: []
- cloud_nuke_target_regions (list(string)): AWS regions cloud-nuke should operate against, passed as repeated --region flags. An empty list defers to cloud-nuke's default behavior, which is to operate on all regions enabled in the account. Default: []
- cloud_nuke_task_cpu (number): Fargate task CPU units allocated to the cloud-nuke task. See https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task-cpu-memory-error.html for valid CPU/memory combinations. Default: 512
- cloud_nuke_task_memory (number): Memory, in MB, allocated to the cloud-nuke Fargate task. See https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task-cpu-memory-error.html for valid CPU/memory combinations. Default: 1024
- ecs_cluster_arn (string): ARN of an existing ECS cluster to run the cloud-nuke task on. When null, this module creates a dedicated cluster named '<name>-cloud-nuke'. Only used when enable_scheduled_cloud_nuke is true. Default: null
- email_endpoints (list(string)): Email addresses to subscribe to the cost alerts SNS topic. Each address receives an AWS confirmation email that must be acknowledged before delivery begins. Default: []
- enable_anomaly_detection: Whether to provision AWS Cost Anomaly Detection. Requires the aws.us_east_1 aliased provider since CAD resources are global but only callable via us-east-1. Default: true
- enable_budget_alerts: Whether to provision AWS Budgets alerts. When true, each entry in budgets becomes one aws_budgets_budget that publishes to the module's SNS topic. Default: true
- enable_scheduled_cloud_nuke: Whether to provision an EventBridge-scheduled Fargate task that runs cloud-nuke on cloud_nuke_schedule_expression. Off by default because cloud-nuke is destructive. When true, vpc_id and subnet_ids must also be set. Default: false
- enable_slack_notifications: Whether to deploy the Slack notifier Lambda and subscribe it to the cost alerts SNS topic. When true, slack_webhook_url_secrets_manager_arn must also be set. Default: false
- lambda_memory_size (number): Memory, in MB, allocated to the Slack notifier Lambda. Default: 128
- lambda_timeout (number): Maximum runtime, in seconds, of the Slack notifier Lambda before termination. Default: 30
- log_retention_days (number): Retention, in days, for CloudWatch logs of the Slack notifier Lambda and the cloud-nuke Fargate task. Default: 30
- slack_webhook_url_secrets_manager_arn: ARN of a Secrets Manager secret whose SecretString is the Slack incoming webhook URL. Required when enable_slack_notifications is true. The Lambda reads this secret at runtime so the URL never appears in plan output or Terraform state. Default: null
- subnet_ids (list(string)): Private subnet IDs the cloud-nuke Fargate task launches into. Subnets must have outbound internet access (e.g., via a NAT gateway) so the task can reach AWS API endpoints. Required when enable_scheduled_cloud_nuke is true. Default: []
- vpc_id (string): VPC the cloud-nuke Fargate task runs in. Required when enable_scheduled_cloud_nuke is true. Default: null
Outputs
- ARN of the Cost Anomaly Detection monitor, or null when disabled.
- ARN of the Cost Anomaly Detection subscription, or null when disabled.
- ARNs of the AWS Budgets created by this module, keyed by budget name.
- Name of the EventBridge schedule that triggers cloud-nuke runs, or null when disabled.
- ARN of the cloud-nuke ECS task definition, or null when disabled.
- cloud_nuke_task_role_arn: ARN of the IAM role assumed by the cloud-nuke Fargate task. Attach deletion permissions here. Null when scheduled cloud-nuke is disabled.
- ARN of the Slack notifier Lambda, or null when disabled.
- ARN of the SNS topic that fans out Budgets and Cost Anomaly Detection notifications.