How do I add a module to my Reference Architecture?
I have a Reference Architecture that only uses the modules it was deployed with. However, there are many other Gruntwork, open-source, and custom modules I also need to deploy. Do you know how I can do that?

---

[Tracked in ticket #109931](https://support.gruntwork.io/hc/requests/109931)
## Deploy a new module in the Reference Architecture

There are several ways you can accomplish this. I'll start from the most basic form, then extend that basic structure to be fully ["DRY"](https://en.wikipedia.org/wiki/Don%27t_repeat_yourself) by following the `_envcommon` pattern used here at Gruntwork.

### TL;DR

We must create a new `terragrunt.hcl` file that points to the location of the Terraform module we want to deploy. This new Terragrunt file will live in its own directory, separate from the other modules inside your Reference Architecture's repository.

> Writing and updating the `infrastructure-live` codebase is the only work that should be done on your local machine; the deployment happens through GitOps and a serverless CI/CD architecture called **Gruntwork Pipelines** that was built for you as part of the Reference Architecture deployment.

**_Where_** we create this new directory matters. The filesystem path dictates where in AWS the infrastructure will be deployed. For example, `development/us-east-2/development/services/eks-cluster/terragrunt.hcl` will look up the EKS Terraform module and then deploy it:

- In the `development` AWS account
- In the `us-east-2` AWS region
- Into the `app` VPC created during the initial Reference Architecture deployment, which is represented by the second `development` directory in the full module path

> There are actually two VPCs in each AWS account when the Reference Architecture is initially deployed: `app`, where the sample app runs, and `mgmt`, which runs Gruntwork Pipelines. The code for both lives inside your `infrastructure-live` repository, and the latter is represented by something like `development/us-east-2/mgmt/networking/vpc`.

Since each deployed _instance_ of a Terraform module will have its own directory, you should give those directories meaningful names like `payments-eks-cluster` or `risk-mgmt-aurora-database` that will be used across all environments. For example, a `data-team-ecs` module deployed three times, once for each environment, would exist in:

- `prod/us-west-2/data-team-ecs`
- `stage/us-west-2/data-team-ecs`
- `dev/us-east-2/data-team-ecs`

Once you've created your module directory and `terragrunt.hcl`, you can open up a pull request with your changes, merge it, and Gruntwork Pipelines will **_automatically deploy_** your Terraform module into the correct account and region.

```hcl
# Create a directory and file in your Reference Architecture, something like:
# development/us-east-2/development/services/demo-eks-cluster/terragrunt.hcl

# Define the Terraform module you want to deploy
terraform {
  source = "git@example.com:example-org/your-modules-here.git//modules/eks-cluster?ref=v0.0.1"
}

# Include the `terragrunt.hcl` located in the root of our Reference Architecture's
# repository because this file contains all the templating required to make our
# Terraform state and AWS provider DRY.
include "root" {
  path = find_in_parent_folders()
}

# Your module will expect some inputs. This is how to pass in values for those inputs:
inputs = {
  name   = "development-eks-cluster"
  vpc_id = "vpc-85556dc"
}
```

> ⚠️ This is **not DRY**, as we'd end up with duplicate code across environments. How to DRY up this code using Terragrunt and the `_envcommon` pattern is explained in the following guide.

### Background

If we want to follow the pattern of the already deployed modules, we'll first need a new directory and a `terragrunt.hcl` file. The directory name should be meaningful.
For example, suppose we had a `development` environment with five different ECS clusters. In that case, we'd need a directory for each ECS cluster, as those folders contain the `terragrunt.hcl` file that calls the Terraform module to create them. So it would look something like this:

```
development/us-east-2/development/services $ tree
.
├── team-a-ecs
│   └── terragrunt.hcl
├── team-b-ecs
│   └── terragrunt.hcl
├── team-c-ecs
│   └── terragrunt.hcl
├── team-d-ecs
│   └── terragrunt.hcl
└── team-e-ecs
    └── terragrunt.hcl
```

Each of those `terragrunt.hcl` files would look something like this:

```hcl
# development/us-east-2/development/services/team-e-ecs/terragrunt.hcl

# This is where we tell Terragrunt what Terraform module we want to deploy.
terraform {
  source = "git@example.com:gruntwork-io/terraform-aws-service-catalog.git//modules/services/ecs-cluster?ref=v0.0.1"
}

# This is where we provide the input values to the Terraform module we want to deploy.
inputs = {
  cluster_name          = "Team E"
  cluster_instance_type = "t2.micro"
}
```

### How to add a module using a new `terragrunt.hcl` file

In my case, I will add a `demo-vpc` directory with a `terragrunt.hcl` file that deploys a very simple module into my existing Reference Architecture. The first step is creating my directory and Terragrunt file:

```
mkdir development/us-east-2/development/services/demo-vpc
```

```
cd development/us-east-2/development/services/demo-vpc && touch terragrunt.hcl
```

Since my [demo module](https://github.com/gruntwork-io/terraform-fake-modules/tree/main/modules/aws/vpc) doesn't expect any inputs, I _could_ quickly (**but don't!**) get started by having only this one block in my `terragrunt.hcl` file:

```hcl
# development/us-east-2/development/services/demo-vpc/terragrunt.hcl

terraform {
  source = "git@github.com:gruntwork-io/terraform-fake-modules.git//modules/aws/vpc?ref=main"
}
```

And it _will_ run just fine, but can you **spot the bug** in our thinking?

> ⏰ **Hint:** Consider the lack of Terraform configuration in our `terragrunt.hcl` file, and remember that Terragrunt is a _thin_ wrapper for Terraform, not a replacement; it does not remove the need for fundamental concepts like providers and state.

### ⚠️ The issue with the above Terragrunt configuration (and not being _DRY_ enough)

If we run this job, everything will be created as expected, but if we re-run this job, it will create **duplicate resources**. This is bad for a number of reasons: not only will we have duplicate infrastructure that could potentially spin up and overload other services, but we also cannot quickly and cleanly destroy the duplicate infrastructure that's been created.

What we need to do is define a **[Terraform state](https://developer.hashicorp.com/terraform/language/state)** backend to keep track of all the infrastructure that has come in and out of service. If our Terraform configuration were set up correctly, instead of recreating these resources, it would detect that they already exist and that there are no configuration changes, so the deployment would do [nothing](https://en.wikipedia.org/wiki/NOP_(code)).

Let's understand what happened with this example and then remediate it in the following sections.

### So what happened when we re-ran our deployment, and how do we fix it?

We didn't define an S3 backend to store our Terraform state in. This behavior occurs because Gruntwork Pipelines uses serverless runners on ECS for the actual deployment. When we did the first `apply`, the state was stored locally.
However, that runner was spun down, and that local state was lost. The second `apply` recreated those resources, unaware of previous CI/CD jobs.

The Gruntwork Pipelines [architecture](https://docs.gruntwork.io/guides/build-it-yourself/pipelines/core-concepts/threat-model-of-ci-cd/) has many benefits but requires following best practices. Using a remote Terraform state is a major one of those best practices we enforce, but we also follow a pattern (described below) to make it painless to manage at scale.

> No matter what you use for CI/CD, always use a remote backend for your Terraform state!

In our `demo-vpc/terragrunt.hcl` file above, we got away with a lot, like Gruntwork Pipelines finding a default AWS provider automatically, but we can do better than that. To solve this, we _could_ **(but don't)** modify our `demo-vpc/terragrunt.hcl` to look something like this:

```hcl
# development/us-east-2/development/services/demo-vpc/terragrunt.hcl

terraform {
  source = "git@github.com:gruntwork-io/terraform-fake-modules.git//modules/aws/vpc?ref=main"
}

# Non-DRY example, do not use!
remote_state {
  backend = "s3"
  config = {
    bucket = "mybucket"
    key    = "path/to/my/key"
    region = "us-east-1"
  }
}
```

But if we had thousands of resources to manage, defining that `remote_state` block for every Terraform module we use would become burdensome and dangerous.

### Managing our Terraform state using DRY principles

In the **root** of our `infrastructure-live` repository that contains our Reference Architecture, there is a `terragrunt.hcl` file. Inside this file, you'll see a block of HCL that looks like this:

```hcl
# ----------------------------------------------------------------------------------------------------------------
# GENERATED REMOTE STATE BLOCK
# ----------------------------------------------------------------------------------------------------------------

# Generate the Terraform remote state block for storing state in S3
remote_state {
  backend = "s3"
  config = {
    encrypt = true
    bucket  = lower("${local.name_prefix}-${local.account_name}-${local.aws_region}-tf-state")
    key     = "${path_relative_to_include()}/terraform.tfstate"
    // ...
```

> Note that the only **hardcoded value** is `encrypt = true`, which is desirable for this use case; the rest are parameterized strings used to automatically gather environment context and use that to define values for things like the `bucket` name and `terraform.tfstate` location inside S3.

Pretty cool, but how do we use it?

### Not repeating ourselves

To access this generated remote state block that will automatically take care of everything for us, we need to _include_ the **root `terragrunt.hcl`** in our `demo-vpc/terragrunt.hcl` by doing the following:

```hcl
# development/us-east-2/development/services/demo-vpc/terragrunt.hcl

# Create our "VPC" (in reality, this module just outputs data that looks like a VPC was created).
terraform {
  source = "git@github.com:gruntwork-io/terraform-fake-modules.git//modules/aws/vpc?ref=main"
}

# Include the root `terragrunt.hcl` because this file contains all our Terraform state and AWS provider information.
include "root" {
  path = find_in_parent_folders()
}
```

### Adding inputs before deploying our module

This part is straightforward. Even though my module doesn't expect any inputs, I will add some, since 99% of modules out there expect at least some values.

```hcl
# development/us-east-2/development/services/demo-vpc/terragrunt.hcl

# Create our "VPC" (in reality, this module just outputs data that looks like a VPC was created).
terraform {
  source = "git@github.com:gruntwork-io/terraform-fake-modules.git//modules/aws/vpc?ref=main"
}

# Include the root `terragrunt.hcl` because this file contains all our Terraform state and AWS provider information.
include "root" {
  path = find_in_parent_folders()
}

# This module expects no inputs, but most will expect at least a few. This is how I pass in those values.
inputs = {
  cidr_block   = "10.222.0.0/18"
  nat_gw_count = 1
}
```

### Merging and what to expect

Open up a pull request with your new `terragrunt.hcl` file and module directory, review the `terragrunt plan` output, approve it, and hit merge. Gruntwork Pipelines will **deploy it in the correct environment automatically**. Nothing should be run from a local workstation.

Suppose this is the first time you're deploying your specific Terraform module. In that case, Terragrunt will automatically create the remote state for you, based on the changes we made above to `demo-vpc/terragrunt.hcl` by including the root `terragrunt.hcl` that contains all of the generated remote state information. The scripts that initialize Gruntwork Pipelines and use `git-updated-folders` to understand which environments have changed are in the `_ci` directory.

If we look in S3, we'll see a brand new `terraform.tfstate` file in a brand new location that matches our location in the repository, `development/us-east-2/development/services/demo-vpc/terragrunt.hcl`, in a bucket whose name matches `lower("${local.name_prefix}-${local.account_name}-${local.aws_region}-tf-state")`.

## Create global configurations in `_envcommon` to be DRY

We have now made the Terraform state and AWS provider information DRY, but what if we wanted to launch our `demo-vpc` in the `staging` environment? The most obvious answer is to do something like this:

```hcl
# staging/us-east-2/staging/services/demo-vpc/terragrunt.hcl

# Create our "VPC" (in reality, this module just outputs data that looks like a VPC was created).
terraform {
  source = "git@github.com:gruntwork-io/terraform-fake-modules.git//modules/aws/vpc?ref=main" # same as development
}

# Include the root `terragrunt.hcl` because this file contains all our Terraform state and AWS provider information.
include "root" {
  path = find_in_parent_folders()
}

# This module expects no inputs, but most will expect at least a few. This is how I pass in those values.
inputs = {
  cidr_block   = "10.223.0.0/18" # different from development
  nat_gw_count = 1               # same as development
}
```

Some values will be the same between `development` and `staging`, like `nat_gw_count = 1`. In fact, most values passed into Terraform modules are not unique to a single environment. Therefore, we can keep this common configuration in a separate HCL file and reference it from all of the `terragrunt.hcl` files that deploy Terraform modules.

To start the `_envcommon` pattern in this example, we will create a new HCL file at `_envcommon/services/demo-vpc.hcl`, which will hold all of our global values. We can always override the values in this file, but they are a great way to keep our `infrastructure-live` repository DRY:

```hcl
# _envcommon/services/demo-vpc.hcl

terraform {
  # We can override this in our terragrunt.hcl files. This is useful for promoting changes across environments.
  source = "${local.source_base_url}?ref=main"
}

locals {
  source_base_url = "git@github.com:gruntwork-io/terraform-fake-modules.git//modules/aws/vpc"
}

inputs = {
  # Safe default values available to all environments are defined here. They can be overridden in our terragrunt.hcl files.
cidr_block = "10.0.0.0/16" nat_gw_count = 1 } ``` ### How to use the global configuration in `_envcommon` We can now use a very generic `terragrunt.hcl` whenever we need to call this `demo-vpc` in our environments: ```hcl # development/us-east-2/development/services/demo-vpc/terragrunt.hcl # Create our "VPC." In reality, this module outputs data that looks like a VPC was created). terraform { source = "${include.envcommon.locals.source_base_url}?ref=main" } # Follow the _envcommon pattern and use the configuration located at _envcommon/services/demo-vpc.hcl include "envcommon" { path = "${dirname(find_in_parent_folders())}/_envcommon/services/demo-vpc.hcl" # We want to reference the variables from the included config in this configuration, so we expose it. expose = true } # Include the root `terragrunt.hcl` because this file contains all our Terraform state and AWS provider information. include "root" { path = find_in_parent_folders() } inputs = { cidr_block = "10.222.0.0/18" # Unique only to development, so we override the default _envcommon value of 10.0.0.0/16. } ``` ### Using prior art to maximize `_envcommon` DRYness for new modules Your Reference Architecture was deployed with several dozen modules. A good one to look at would be `_envcommon/data-stores/redis.hcl` to see `dependency` blocks. The purpose of `dependency` blocks is to extract values from other modules' existing infrastructure that your new module depends on, e.g., your Redis cluster needs a VPC to run in, and the Redis module expects a value for `vpc_id`. ```hcl # --------------------------------------------------------------------------------------------------------------------- # Dependencies are modules that need to be deployed before this one. # --------------------------------------------------------------------------------------------------------------------- dependency "vpc" { config_path = "${get_terragrunt_dir()}/../../networking/vpc" mock_outputs = { // ... } mock_outputs_allowed_terraform_commands = ["validate", ] } dependency "network_bastion" { config_path = "${get_terragrunt_dir()}/../../networking/openvpn-server" // ... ``` Many AWS resources depend on a VPC, and many Terraform modules expect a value for `vpc_id` to ensure the resources are deployed to the correct network location. We can be very DRY here. Rather than hardcoding the `vpc_id` in a `.tfvars` file or doing a [remote_state](https://registry.terraform.io/providers/hashicorp/terraform/latest/docs/data-sources/remote_state) data call, we can use the filesystem and directory structure to access values output from other modules using `config_path = // ..`. We access the `dependency` address space to pass values into our `inputs = {}` for the Terraform module we want to deploy. In this example, there are many safe default values like `instance_type = "cache.t3.micro"` to ensure we don't accidentally launch an expensive instance type, but we also use values from _three_ different Terraform modules that the Redis depends on: ```hcl # --------------------------------------------------------------------------------------------------------------------- # MODULE PARAMETERS # These are the variables we must pass in to use the module specified in the terragrunt configuration above. # This defines the parameters that are common across all environments. 
# ---------------------------------------------------------------------------------------------------------------------

inputs = {
  # Redis cluster name must be < 40 characters
  name = substr("redis-${local.name_prefix}-${lower(local.account_name)}", 0, 40)

  instance_type = "cache.t3.micro"

  vpc_id     = dependency.vpc.outputs.vpc_id
  subnet_ids = dependency.vpc.outputs.private_persistence_subnet_ids

  redis_version             = "5.0.6"
  replication_group_size    = 1
  enable_multi_az           = false
  enable_automatic_failover = false
  parameter_group_name      = "default.redis5.0"

  enable_cloudwatch_alarms = true
  alarms_sns_topic_arns    = [dependency.sns.outputs.topic_arn]

  # Here we allow any connection from the private app subnet tier of the VPC. You can further restrict network access by
  # security groups for better defense in depth.
  allow_connections_from_cidr_blocks     = dependency.vpc.outputs.private_app_subnet_cidr_blocks
  allow_connections_from_security_groups = [dependency.network_bastion.outputs.security_group_id]

  # Only apply changes during the scheduled maintenance window, as certain DB changes cause degraded performance or
  # downtime. For more info, see: https://docs.aws.amazon.com/AmazonElastiCache/latest/mem-ug/Clusters.Modify.html
  # We default to false, but in non-prod environments we set it to true to immediately roll out the changes.
  apply_immediately = false
}
```

At the bottom of `_envcommon/data-stores/redis.hcl` we can see those dependencies in action:

```hcl
inputs = {
  // ...
  vpc_id     = dependency.vpc.outputs.vpc_id
  subnet_ids = dependency.vpc.outputs.private_persistence_subnet_ids
  // ...
  alarms_sns_topic_arns = [dependency.sns.outputs.topic_arn]
  // ...
  allow_connections_from_cidr_blocks     = dependency.vpc.outputs.private_app_subnet_cidr_blocks
  allow_connections_from_security_groups = [dependency.network_bastion.outputs.security_group_id]
}
```

### Wrapping up

We have successfully made our configuration DRY in the following areas:

- Terraform state
- Terraform providers
- Common Terraform module configuration
- Passing of outputs from one module to the inputs of another module

Additionally, because of Terragrunt, we've unlocked further benefits:

- Created a path for developers to copy-paste `terragrunt.hcl` files with safe defaults to self-serve infrastructure.
- Used GitOps to create new infrastructure.
- Created a pattern to merge different versions of the same module into different environments (see the sketch below).
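
To illustrate that last point, here is a minimal sketch of how a `staging` copy of our `demo-vpc` configuration could pin a newer module version while other environments keep the `_envcommon` default. The `v0.0.2` tag is hypothetical and simply stands in for whatever ref your module repository actually publishes; everything else reuses the `_envcommon` pattern shown above.

```hcl
# staging/us-east-2/staging/services/demo-vpc/terragrunt.hcl
# Sketch only: the v0.0.2 tag below is a hypothetical ref used to show how a single
# environment can be promoted to a newer module version.

# Follow the _envcommon pattern and use the configuration located at _envcommon/services/demo-vpc.hcl
include "envcommon" {
  path   = "${dirname(find_in_parent_folders())}/_envcommon/services/demo-vpc.hcl"
  expose = true
}

# Include the root `terragrunt.hcl` for the generated remote state and AWS provider configuration.
include "root" {
  path = find_in_parent_folders()
}

# Override the module version for this environment only. Environments without this
# override keep the ref defined in _envcommon/services/demo-vpc.hcl.
terraform {
  source = "${include.envcommon.locals.source_base_url}?ref=v0.0.2"
}

inputs = {
  cidr_block = "10.223.0.0/18" # Unique to staging; overrides the _envcommon default of 10.0.0.0/16.
}
```

Once that version has proven itself in `staging`, promoting it further is just another pull request: update the ref in the next environment's `terragrunt.hcl` (or in `_envcommon` itself to change the default everywhere) and let Gruntwork Pipelines deploy it, just as it did for `development`.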