EKS CloudWatch Agent Module
This Terraform module installs and configures the Amazon CloudWatch Agent on an EKS cluster so that each node runs the agent, which collects additional system-level metrics from the underlying Amazon EC2 instances and ships them to Amazon CloudWatch. These extra metrics enable CloudWatch Container Insights, which provides a single pane of glass for application, performance, host, control plane, and data plane insights.
This module uses the community Helm chart with a set of best-practice inputs.
This module is for setting up CloudWatch Agent for EKS clusters with worker nodes (self-managed or managed node groups) that have support for DaemonSets. CloudWatch Container Insights is not supported for EKS Fargate.
How does this work?
CloudWatch automatically collects metrics for many resources, such as CPU, memory, disk, and network. Container Insights also provides diagnostic information, such as container restart failures, to help you isolate issues and resolve them quickly.
In Amazon EKS and Kubernetes, using Container Insights requires using a containerized version of the CloudWatch agent to discover all of the running containers in a cluster. It collects performance data at every layer of the performance stack as log events using embedded metric format. From this data, CloudWatch creates aggregated metrics at the cluster, node, pod, task, and service level as CloudWatch metrics. The metrics that Container Insights collects are available in CloudWatch automatic dashboards, and also viewable in the Metrics section of the CloudWatch console.
cloudwatch-agent is installed as a Kubernetes DaemonSet, which ensures that there is one cloudwatch-agent Pod running per node. In this way, we are able to ensure that all workers in the cluster are running the cloudwatch-agent service for shipping the metric data into CloudWatch.
Note that metrics collected by CloudWatch Agent are charged as custom metrics. For more information about CloudWatch pricing, see Amazon CloudWatch Pricing.
You can read more about cloudwatch-agent in the GitHub repository.
You can also learn more about Container Insights in the official AWS docs.
Sample Usage
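The module can be deployed with either Terraform or Terragrunt. The Terraform configuration is shown first, followed by the Terragrunt equivalent.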
# ------------------------------------------------------------------------------------------------------
# DEPLOY GRUNTWORK'S EKS-CLOUDWATCH-AGENT MODULE
# ------------------------------------------------------------------------------------------------------
module "eks_cloudwatch_agent" {

  source = "git::git@github.com:gruntwork-io/terraform-aws-eks.git//modules/eks-cloudwatch-agent?ref=v0.72.0"

  # ----------------------------------------------------------------------------------------------------
  # REQUIRED VARIABLES
  # ----------------------------------------------------------------------------------------------------

  # Name of the EKS cluster where resources are deployed to.
  eks_cluster_name = <string>

  # Configuration for using the IAM role with Service Accounts feature to
  # provide permissions to the helm charts. This expects a map with two
  # properties: `openid_connect_provider_arn` and `openid_connect_provider_url`.
  # The `openid_connect_provider_arn` is the ARN of the OpenID Connect Provider
  # for EKS to retrieve IAM credentials, while `openid_connect_provider_url` is
  # the URL. Set to null if you do not wish to use IAM role with Service
  # Accounts.
  iam_role_for_service_accounts_config = <object(
    openid_connect_provider_arn = string
    openid_connect_provider_url = string
  )>

  # ----------------------------------------------------------------------------------------------------
  # OPTIONAL VARIABLES
  # ----------------------------------------------------------------------------------------------------

  # The Container repository to use for looking up the cloudwatch-agent
  # Container image when deploying the pods. When null, uses the default
  # repository set in the chart.
  aws_cloudwatch_agent_image_repository = null

  # Which version of amazon/cloudwatch-agent to install. When null, uses the
  # default version set in the chart.
  aws_cloudwatch_agent_version = null

  # The version of the aws-cloudwatch-metrics helm chart to deploy. Note that
  # this is different from the app/container version (use
  # var.aws_cloudwatch_agent_version to control the app/container version).
  aws_cloudwatch_metrics_chart_version = "0.0.7"

  # Create a dependency between the resources in this module to the interpolated
  # values in this list (and thus the source resources). In other words, the
  # resources in this module will now depend on the resources backing the values
  # in this list such that those resources need to be created before the
  # resources in this module, and the resources in this module need to be
  # destroyed before the resources in the list.
  dependencies = []

  # Used to name IAM roles for the service account. Recommended when
  # var.iam_role_for_service_accounts_config is configured.
  iam_role_name_prefix = null

  # Namespace to create the resources in.
  namespace = "kube-system"

  # Configure affinity rules for the Pod to control which nodes to schedule on.
  # Each item in the list should be a map with the keys `key`, `values`, and
  # `operator`, corresponding to the 3 properties of matchExpressions. Note that
  # all expressions must be satisfied to schedule on the node.
  pod_node_affinity = []

  # Specify the resource limits and requests for the cloudwatch-agent pods. Set
  # to null (default) to use chart defaults.
  pod_resources = null

  # Configure tolerations rules to allow the Pod to schedule on nodes that have
  # been tainted. Each item in the list specifies a toleration rule.
  pod_tolerations = []

}
# ------------------------------------------------------------------------------------------------------
# DEPLOY GRUNTWORK'S EKS-CLOUDWATCH-AGENT MODULE
# ------------------------------------------------------------------------------------------------------
terraform {
  source = "git::git@github.com:gruntwork-io/terraform-aws-eks.git//modules/eks-cloudwatch-agent?ref=v0.72.0"
}

inputs = {

  # ----------------------------------------------------------------------------------------------------
  # REQUIRED VARIABLES
  # ----------------------------------------------------------------------------------------------------

  # Name of the EKS cluster where resources are deployed to.
  eks_cluster_name = <string>

  # Configuration for using the IAM role with Service Accounts feature to
  # provide permissions to the helm charts. This expects a map with two
  # properties: `openid_connect_provider_arn` and `openid_connect_provider_url`.
  # The `openid_connect_provider_arn` is the ARN of the OpenID Connect Provider
  # for EKS to retrieve IAM credentials, while `openid_connect_provider_url` is
  # the URL. Set to null if you do not wish to use IAM role with Service
  # Accounts.
  iam_role_for_service_accounts_config = <object(
    openid_connect_provider_arn = string
    openid_connect_provider_url = string
  )>

  # ----------------------------------------------------------------------------------------------------
  # OPTIONAL VARIABLES
  # ----------------------------------------------------------------------------------------------------

  # The Container repository to use for looking up the cloudwatch-agent
  # Container image when deploying the pods. When null, uses the default
  # repository set in the chart.
  aws_cloudwatch_agent_image_repository = null

  # Which version of amazon/cloudwatch-agent to install. When null, uses the
  # default version set in the chart.
  aws_cloudwatch_agent_version = null

  # The version of the aws-cloudwatch-metrics helm chart to deploy. Note that
  # this is different from the app/container version (use
  # var.aws_cloudwatch_agent_version to control the app/container version).
  aws_cloudwatch_metrics_chart_version = "0.0.7"

  # Create a dependency between the resources in this module to the interpolated
  # values in this list (and thus the source resources). In other words, the
  # resources in this module will now depend on the resources backing the values
  # in this list such that those resources need to be created before the
  # resources in this module, and the resources in this module need to be
  # destroyed before the resources in the list.
  dependencies = []

  # Used to name IAM roles for the service account. Recommended when
  # var.iam_role_for_service_accounts_config is configured.
  iam_role_name_prefix = null

  # Namespace to create the resources in.
  namespace = "kube-system"

  # Configure affinity rules for the Pod to control which nodes to schedule on.
  # Each item in the list should be a map with the keys `key`, `values`, and
  # `operator`, corresponding to the 3 properties of matchExpressions. Note that
  # all expressions must be satisfied to schedule on the node.
  pod_node_affinity = []

  # Specify the resource limits and requests for the cloudwatch-agent pods. Set
  # to null (default) to use chart defaults.
  pod_resources = null

  # Configure tolerations rules to allow the Pod to schedule on nodes that have
  # been tainted. Each item in the list specifies a toleration rule.
  pod_tolerations = []

}
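Filled-in example

The reference listings use type placeholders such as <string>, so a filled-in call may be easier to adapt. The following is a minimal sketch, not a recommended configuration. The three input variables, the "dedicated" taint key, and the choice of kubernetes.io/os as the affinity label are hypothetical and exist only for illustration. The shape of each pod_tolerations item (key, operator, effect) is an assumption that the module passes these through as standard Kubernetes toleration fields; check the module's variable definitions before relying on it.

```hcl
# Hypothetical inputs: in a real configuration these values would typically
# come from the outputs of whatever module or resource created the EKS cluster.
variable "cluster_name" {
  type = string
}

variable "openid_connect_provider_arn" {
  type = string
}

variable "openid_connect_provider_url" {
  type = string
}

module "eks_cloudwatch_agent" {
  source = "git::git@github.com:gruntwork-io/terraform-aws-eks.git//modules/eks-cloudwatch-agent?ref=v0.72.0"

  eks_cluster_name = var.cluster_name

  # Both properties are required when using IAM roles for Service Accounts.
  iam_role_for_service_accounts_config = {
    openid_connect_provider_arn = var.openid_connect_provider_arn
    openid_connect_provider_url = var.openid_connect_provider_url
  }

  # Recommended whenever iam_role_for_service_accounts_config is set.
  iam_role_name_prefix = var.cluster_name

  # Each item needs the keys `key`, `values`, and `operator`; all expressions
  # must match for a node to be eligible.
  pod_node_affinity = [
    {
      key      = "kubernetes.io/os"
      operator = "In"
      values   = ["linux"]
    },
  ]

  # Assumed shape: standard Kubernetes toleration fields. "dedicated" is a
  # placeholder taint key.
  pod_tolerations = [
    {
      key      = "dedicated"
      operator = "Exists"
      effect   = "NoSchedule"
    },
  ]
}
```

Inputs that are left out, such as namespace and aws_cloudwatch_metrics_chart_version, fall back to the defaults shown in the listings above.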