Updating ECS Deploy Runner (Github)
We recently tried to update the ECS Deploy Runner image in our reference architecture. We started by running the following script:

```
aws-vault exec shared-admin -- shared/us-east-2/_regional/container_images/build_deploy_runner_image.sh
```

And got the following error:

```
cmd: /bin/sh -- args: [-c if [ -z "$(cat /kaniko/secrets/github-token)" ]; then echo "ERROR: You must pass a GitHub PAT as an environment variable named GITHUB_OAUTH_TOKEN."; exit 1; fi]
Running: [/bin/sh -c if [ -z "$(cat /kaniko/secrets/github-token)" ]; then echo "ERROR: You must pass a GitHub PAT as an environment variable named GITHUB_OAUTH_TOKEN."; exit 1; fi]
cat: can't open '/kaniko/secrets/github-token': No such file or directory
ERROR: You must pass a GitHub PAT as an environment variable named GITHUB_OAUTH_TOKEN.
error building image: error building stage: failed to execute command: waiting for process to exit: exit status 1
ERROR: exit status 1
exit status 1
ERROR: exit status 1
```

I thought it had something to do with the recent buildkit change (https://github.com/gruntwork-io/terraform-aws-ci/pull/495), so I added `--secret 'id=github-token,env=GITHUB_OAUTH_TOKEN'` to the script and got the following error:

```
[infrastructure-deployer] INFO[2022-12-22T13:41:41+02:00] Invoking Lambda function ecs-deploy-runner-invoker to trigger deployment.
ERROR: OptionNotInAllowedOptionsError: Option --secret is not in the provided list of allowed options for the script.
```

Then, we tried to update our `ecs-deploy-runner` module in the `shared` account, adding the following:

```
docker_image_builder_hardcoded_options = {
  "--secret" = ["'id=github-token,env=GITHUB_OAUTH_TOKEN'"]
}
```

Finally, we tried running the script again and it failed. While looking in the ECS logs we saw:

```
Running command: /opt/ecs-deploy-runner/scripts/build-docker-image (args redacted) -- Incorrect Usage.
flag provided but not defined: -secret

Usage: build-docker-image [--repo] [--ref] [--sha] [--idempotent] [--context-path] [--dockerfile-path] [--docker-image-tag] [--build-arg] [--env-secret] [--iam-role] [--no-push] [--help] command [options] [args]

Command to trigger a docker build using kaniko. The intention of the script is to simplify the args so that it reflects the specific use case of the ECS deploy runner.

Commands:
  help, h  Shows a list of commands or help for one command

ERROR: flag provided but not defined: -secret
exit status 1
ERROR: exit status 1
```

It looks like we can't update the image. Are we missing something?

---

<ins datetime="2022-12-22T11:45:00Z">
  <p><a href="https://support.gruntwork.io/hc/requests/109746">Tracked in ticket #109746</a></p>
</ins>

Here is a complete upgrade guide for moving past the breaking buildkit change in terraform-aws-ci. This change was introduced in v0.50.12.

### Prerequisites

- Docker must be installed on your machine at version 18.09 or later
- You have a [GitHub OAuth Token](https://docs.github.com/en/enterprise-server@3.4/authentication/keeping-your-account-and-data-secure/creating-a-personal-access-token) available with access to the gruntwork-io GitHub organization
- You have permission to:
  - Push images to ECR in your Shared AWS account
  - Perform CRUD actions on ECS in all relevant AWS accounts
  - Perform CRUD actions on Lambda in all relevant AWS accounts

---

### Guide

We will need to build two docker images locally. Starting with Kaniko:

1. **Check out terraform-aws-ci** at version v0.50.12 or later (we recommend the [latest release](https://github.com/gruntwork-io/terraform-aws-ci/releases)): [https://github.com/gruntwork-io/terraform-aws-ci](https://github.com/gruntwork-io/terraform-aws-ci/releases)

   ```bash
   cd terraform-aws-ci/modules/ecs-deploy-runner/docker/kaniko
   ```

2. **Make your GitHub token available to the build system.** We recommend using [bitwarden](https://medium.com/gruntwork/how-to-securely-store-secrets-in-bitwarden-cli-and-load-them-into-your-zsh-shell-when-needed-f12d4d040df) or [pass](https://blog.gruntwork.io/a-comprehensive-guide-to-managing-secrets-in-your-terraform-code-1d586955ace1#4df5) for security, but the command below can work in a pinch:

   ```bash
   export GITHUB_OAUTH_TOKEN=your_token
   ```

3. **Build the Kaniko Image**

   Run the docker build tagged to your kaniko ECR repo in your Shared AWS account, with a version matching the terraform-aws-ci version you chose in step 1.
   Your repo tag should have the following format:

   ```bash
   # Replace the following variables with the values for your account and Reference Architecture
   ${account-id}.dkr.ecr.${primary-region}.amazonaws.com/kaniko:${terraform-aws-ci-version}

   # For example:
   123412341234.dkr.ecr.us-east-1.amazonaws.com/kaniko:v0.50.12
   # where the first 12 digits are the account ID of your shared AWS account and us-east-1 is the PrimaryRegion you selected when your Reference Architecture was deployed.
   ```

   Build the Kaniko docker image, passing the `--tag` value that you constructed:

   ```bash
   DOCKER_BUILDKIT=1 docker build \
     --secret id=github-token,env=GITHUB_OAUTH_TOKEN \
     --build-arg module_ci_tag="v0.50.12" \
     --tag $shared_account_id.dkr.ecr.$ecr_repo_region.amazonaws.com/kaniko:v0.50.12 \
     --platform linux/amd64 .
   ```

4. **Sign in to your ECR repo.** We use AWS vault, but any method of exposing credentials is valid:

   ```bash
   aws-vault exec shared -- aws ecr get-login-password --region $ecr_repo_region \
     | docker login -u AWS --password-stdin $shared_account_id.dkr.ecr.$ecr_repo_region.amazonaws.com
   ```

5. **Push the docker image** to your shared repo:

   ```bash
   docker push $shared_account_id.dkr.ecr.$ecr_repo_region.amazonaws.com/kaniko:v0.50.12
   ```

6. **Repeat for the ecs-deploy-runner image**

   ```bash
   # Navigate to terraform-aws-ci/modules/ecs-deploy-runner/docker/deploy-runner
   cd ../deploy-runner

   # Build
   DOCKER_BUILDKIT=1 docker build \
     --secret id=github-token,env=GITHUB_OAUTH_TOKEN \
     --build-arg module_ci_tag="v0.50.12" \
     --tag $shared_account_id.dkr.ecr.$ecr_repo_region.amazonaws.com/deploy-runner:v0.50.12 \
     --platform linux/amd64 .

   # Docker login
   aws-vault exec shared -- aws ecr get-login-password --region $ecr_repo_region \
     | docker login -u AWS --password-stdin $shared_account_id.dkr.ecr.$ecr_repo_region.amazonaws.com

   # Push the tag
   docker push $shared_account_id.dkr.ecr.$ecr_repo_region.amazonaws.com/deploy-runner:v0.50.12
   ```

7. 
   **Update your Terraform to point at the new images**

   1. In your infrastructure-live repo, find `common.hcl` at the repo root and replace `kaniko_container_image_tag` and `deploy_runner_container_image_tag` with the new image tags we just pushed.

      ```hcl
      # On line 21
      deploy_runner_container_image_tag = "v0.50.12"
      # On line 25
      kaniko_container_image_tag = "v0.50.12"
      ```

   2. In `_envcommon/mgmt/ecs-deploy-runner.hcl`, ensure that the version ref is at least `v0.96.3` and that no local environment (dev, stage, prod, shared, security, or logs) is overriding it.
   3. Run `terragrunt apply` in all impacted modules (this can be done using your usual CI/CD process or manually in each `env/us-east-1/mgmt/ecs-deploy-runner` directory).

8. **Update the build scripts to use docker buildkit for the next time you update.** Change `build_deploy_runner_image.sh` and `build_kaniko_image.sh` in `shared/us-east-1/_regional/container_images/` to use the updated versions and buildkit syntax:

   ```bash
   # On line 15, replace:
   readonly DOCKERFILE_REPO_REF="v0.50.6"
   # With:
   readonly DOCKERFILE_REPO_REF="v0.50.12"

   # On line 43, replace:
   --build-arg 'GITHUB_OAUTH_TOKEN' \
   # With:
   --env-secret 'github-token=GITHUB_OAUTH_TOKEN' \
   ```
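
The image references in the build, login, and push steps all follow the same pattern, and the build fails late if the token is missing, so a small shell helper can make the manual run less error-prone. The following is only a sketch under stated assumptions: the function names `require_github_token` and `ecr_image_ref` are hypothetical and are not part of terraform-aws-ci or the Reference Architecture scripts.

```bash
#!/usr/bin/env bash
# Hypothetical helpers (not part of terraform-aws-ci) for the manual build steps.

# Fail early if the variable read by
# `--secret id=github-token,env=GITHUB_OAUTH_TOKEN` is empty or unset.
require_github_token() {
  if [ -z "${GITHUB_OAUTH_TOKEN:-}" ]; then
    echo "ERROR: GITHUB_OAUTH_TOKEN is not set." >&2
    return 1
  fi
}

# Print <account-id>.dkr.ecr.<region>.amazonaws.com/<repo>:<version>
# Arguments: account ID, region, repo name (kaniko or deploy-runner), version tag.
ecr_image_ref() {
  local account_id="${1:-}" region="${2:-}" repo="${3:-}" version="${4:-}"

  # AWS account IDs consist of exactly 12 digits.
  case "$account_id" in
    ''|*[!0-9]*)
      echo "ERROR: account ID must contain only digits: '$account_id'" >&2
      return 1
      ;;
  esac
  if [ "${#account_id}" -ne 12 ]; then
    echo "ERROR: account ID must be 12 digits long: '$account_id'" >&2
    return 1
  fi

  # All remaining parts are required to form a usable image reference.
  if [ -z "$region" ] || [ -z "$repo" ] || [ -z "$version" ]; then
    echo "ERROR: region, repo, and version are all required." >&2
    return 1
  fi

  echo "${account_id}.dkr.ecr.${region}.amazonaws.com/${repo}:${version}"
}
```

After sourcing these functions, you could run `require_github_token` before the build and pass `--tag "$(ecr_image_ref "$shared_account_id" "$ecr_repo_region" kaniko v0.50.12)"` to `docker build`, reusing the same call with `deploy-runner` for the second image.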