Knowledge Base

Why do I need to deploy Gruntwork Pipelines (the ECS deploy runner) in multiple accounts?

Answer

A common question we get is why we deploy the ECS Deploy Runner portion of Gruntwork Pipelines in every single account (dev, stage, prod, shared, etc.). Why not just have a single copy in shared that works across all the accounts?

---

<ins datetime="2023-02-22T10:17:31Z"> <p><a href="https://support.gruntwork.io/hc/requests/109920">Tracked in ticket #109920</a></p> </ins>

Gruntwork Pipelines is highly configurable and can be used for a variety of use cases where a CI / CD pipeline needs to perform an action that requires sensitive (e.g., admin) permissions. How many copies of Gruntwork Pipelines (specifically, of the ECS Deploy Runner, or EDR, component) you need depends on the use case. The most common use cases are:

1. **Building Docker images and AMIs as part of a CI / CD pipeline.** For this use case, it's enough to have a single copy of EDR in one AWS account, as we support configuring your AMIs and Docker images to be accessible from multiple accounts. For the AMIs, you do this by configuring your Packer build to share the AMIs with other accounts: e.g., our Packer builds expose the [`ami_users` variable](https://github.com/gruntwork-io/terraform-aws-service-catalog/blob/master/modules/mgmt/openvpn-server/openvpn-server-ubuntu.pkr.hcl#L50-L54) for this purpose. For Docker images, you have to push them to a Docker registry, such as ECR, and you can configure ECR repos to be accessible from other accounts: e.g., our `ecr-repos` module exposes the [`external_account_ids_with_read_access` variable](https://github.com/gruntwork-io/terraform-aws-service-catalog/blob/master/modules/data-stores/ecr-repos/variables.tf#L16) for this purpose. This way, you can build all your AMIs and Docker images in one account (e.g., shared), and then have all other accounts (e.g., dev, stage, prod) deploy those same images.
2. **Automatically updating, committing, and pushing code to your Git repos as part of a CI / CD pipeline.** For this use case, it's again enough to have a single copy of EDR in one AWS account. You configure that EDR instance with access to the repos it needs, and the secrets to access those repos.
3. **Running `terraform plan/apply/destroy` as part of a CI / CD pipeline.** This is the one use case where you currently need to deploy multiple copies of EDR: namely, one in each AWS account where you want to run `terraform plan/apply/destroy`.

Use case (3) was an intentional design decision. Here are a few of the reasons for it:

1. **Providing separation/isolation between AWS accounts.** The design of Gruntwork Pipelines is to grant sensitive permissions (e.g., admin permissions to deploy arbitrary Terraform changes in your AWS accounts) not to your CI server (which is accessible to all your devs and often vulnerable from a security perspective), but solely to an IAM role, and to ensure that the only thing that can use that IAM role is the ECS Deploy Runner, which limits you to running only specific commands (e.g., `terraform apply`), in specific repos (e.g., `infrastructure-live`), in specific branches/folders, and so on. That dramatically limits the damage a malicious actor can do. However, your CI server does need the ability to trigger the ECS Deploy Runner. If you have a single ECS Deploy Runner with access to all accounts, then granting someone the ability to trigger it in, say, the dev environment also allows them to trigger it in stage and prod and everywhere else. By having separate ECS Deploy Runners in each account, you can limit permissions in a fine-grained way. That was the theory, anyway; in practice, we've found that almost all customers end up setting up a single CI server (e.g., GH Actions) with permissions to trigger all their ECS Deploy Runners, and in that setup, multiple ECS Deploy Runners offer no isolation advantage.
2. **IAM role chaining limitations.** There used to be limitations with IAM roles assuming other IAM roles. E.g., ECS Deploy Runner gets AWS permissions through an IAM role in the same account; if you wanted it to be able to deploy in another AWS account, it would have to assume an IAM role in that other account. In the past, this sort of IAM role chaining had limits: e.g., you could only assume the role for at most 1 hour, whereas there were many Terraform (or more accurately, AWS) operations that could take more than an hour (e.g., deploying RDS, EKS, Elasticsearch, and CloudFront changes). Moreover, to grant permissions to assume an IAM role in another account, _you have to deploy changes in that other account anyway_. So, we went with the model of one ECS Deploy Runner per account, with an IAM role that has permissions just for that account.

   These days, I _think_ the IAM role chaining limits have solutions: e.g., you can set a max session duration when creating an IAM role and specify the session duration (up to that maximum) when assuming the IAM role. That said, more research is needed, as some AWS APIs (e.g., creating IAM resources) can't be called with temporary STS credentials (except with MFA), and I'm not sure if we'd hit that limit with IAM role chaining or not.
3. **Principle of least privilege with IAM permissions.** When you deploy an ECS Deploy Runner, you limit it to only specific commands, repos, branches, etc., as mentioned above. Having multiple ECS Deploy Runners allows you to have different limitations in different environments: e.g., for deployment to stage, you may allow deploying from one branch, whereas for prod, from another; or you may allow access to some IAM permissions in one environment and different IAM permissions in another. Doing this mapping at the ECS Deploy Runner level was easy and flexible, and as it's serverless, it doesn't increase cost. However, it does increase complexity, as there are more ECS Deploy Runners to manage. In theory, we could move the mapping inside ECS Deploy Runner, so it can enforce these limits on a per-IAM-role basis, but that may be a more complex UX. That may also go against the principle of least privilege, as explained in https://github.com/gruntwork-io/knowledge-base/discussions/564.

From the list of reasons above, item (1) isn't relevant any more; item (2) probably isn't relevant any more, but more research is needed; item (3) is the big one to consider. In the future, we may revisit (3) to see if we can deploy a single copy of EDR shared amongst all accounts, but for now, the recommendation is to deploy one copy of EDR per account.

Also, to be clear, even if we succeeded at deploying just a single copy of EDR that is shared across all accounts, you'd still have to deploy IAM roles in all other accounts to grant EDR access to those accounts; this is a hard requirement of the way AWS/IAM are designed, and not something specific to Gruntwork Pipelines. In short, when working with multiple AWS accounts, there's no way to avoid having to make at least _some_ changes in every one of those accounts.
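
To make that last point concrete, here is a minimal sketch of the kind of IAM trust policy that would have to exist on a role in each target account before a runner in another account could assume it. This is an illustration only, not how Gruntwork Pipelines is implemented: the account ID, role name, environment names, and function names are all hypothetical, and the code only builds the policy JSON rather than calling AWS.

```python
import json

# Hypothetical ARN of the IAM role used by a single, shared deploy runner.
# The account ID and role name are placeholders for illustration only.
RUNNER_ROLE_ARN = "arn:aws:iam::111111111111:role/example-deploy-runner"


def cross_account_trust_policy(runner_role_arn: str) -> dict:
    """Build the trust policy for a role in a target account.

    The policy allows exactly one principal (the runner's role in another
    account) to call sts:AssumeRole on the role this policy is attached to.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": runner_role_arn},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def trust_policies_for(accounts: list[str]) -> dict[str, dict]:
    """Return one trust policy per target account name.

    Every account in the list needs its own role carrying this policy,
    which is the per-account change that cannot be avoided.
    """
    return {name: cross_account_trust_policy(RUNNER_ROLE_ARN) for name in accounts}


if __name__ == "__main__":
    # Hypothetical environment names, matching the examples in the text.
    for account, policy in trust_policies_for(["dev", "stage", "prod"]).items():
        print(f"--- role trust policy to deploy in account: {account}")
        print(json.dumps(policy, indent=2))
```

Each of those policies has to be created inside its own account (for example, by an admin applying Terraform there by hand), so even a single shared runner would not remove the need to touch every account at least once.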