Atlantis is a self-hosted Go application that listens for Terraform pull request events via webhooks. I’ve incorporated it in my recent engagement at CriticalStart, and I also use it in my private infrastructure.
I think the idea is great for making the terraform workflow easier for infrastructure teams.
Every terraform change needs to go through a review process. When a PR is created, atlantis automatically runs terraform plan, displaying its output as a comment. Applying is also done by adding a comment. It’s highly configurable.
atlantis lets you complete the whole terraform workflow on the PR page!
I always had the feeling that checking out a branch and running terraform locally is a waste of time.
Now you can just look at the plan in the PR, do the review and continue with other work.
Sounds good? Let’s dig in!
How does atlantis work?
In a nutshell, when a repository webhook is triggered by a matching action (like creating a PR or adding a comment), atlantis receives it and starts a workflow. When the workflow is finished, atlantis comments on the PR with the result.
Note: This implies that your repo provider needs to have access to the atlantis endpoint and vice versa.
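The flow above can be sketched as a small dispatcher. This is only an illustrative Python sketch, not how atlantis is implemented internally: the event dictionary shape and the function name action_for_event are assumptions made up for this example, and real provider payloads are much richer.

```python
from typing import Optional


def action_for_event(event: dict) -> Optional[str]:
    """Map a simplified webhook event to the terraform step to run.

    The event shape ({"type": ..., "action": ..., "body": ...}) is a
    hypothetical simplification, not a real provider payload.
    """
    if event.get("type") == "pull_request" and event.get("action") in ("opened", "updated"):
        # A new or updated PR triggers an automatic plan.
        return "plan"
    if event.get("type") == "comment":
        words = event.get("body", "").strip().split()
        # Comments such as "atlantis plan" or "atlantis apply" drive the workflow.
        if len(words) >= 2 and words[0] == "atlantis" and words[1] in ("plan", "apply"):
            return words[1]
    # Every other event is ignored.
    return None
```

Whatever the function returns is the step that would be executed, and its output is what ends up as a comment on the PR.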
You can use
atlantis with git providers like:
- Azure DevOps
atlantis must work with remote
With custom workflows it’s possible to define exactly how
terraform will be executed - flags, additional commands or even running
I will cover just two topics here which I think are crucial to understanding atlantis - apply requirements and checkout strategies.
Atlantis allows you to require certain conditions be satisfied before an atlantis apply command can be run:
- Approved – requires pull requests to be approved by at least one user
- Mergeable – requires pull requests to be able to be merged
If you decide to use apply requirements then be sure to understand what they mean for your git provider.
atlantis communicates with git providers via API calls, and every git provider has its own unique API.
Apply requirements are used to limit the possibility of a failure due to human error. Running apply when there are git conflicts is silly, and if your team consists of more than one person, it is best practice to also require the PR to be approved before applying.
As an example, here is how the approved condition differs between providers:
- GitHub – Any user with read permissions to the repo can approve a pull request
- GitLab – You can set who is allowed to approve
- Bitbucket – A user can approve their own pull request but Atlantis does not count that as an approval and requires an approval from at least one user that is not the author of the pull request
- Azure DevOps – All builtin groups include the “Contribute to pull requests” permission and can approve a pull request
Each VCS provider also has a different concept of “mergeability”, so be sure to check it in the docs as well.
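To make the idea of an apply gate concrete, here is a minimal Python sketch. It is not atlantis code: the PullRequest class and the function names are hypothetical, and the approval rule shown is the Bitbucket-style one from the list above, where the author's own approval does not count.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PullRequest:
    # Hypothetical, provider-agnostic view of a pull request.
    author: str
    approvers: List[str] = field(default_factory=list)
    mergeable: bool = False


def is_approved(pr: PullRequest, ignore_author: bool = True) -> bool:
    """Approved means at least one approval; optionally skip the author's own."""
    if ignore_author:
        return any(user != pr.author for user in pr.approvers)
    return len(pr.approvers) > 0


def unmet_requirements(pr: PullRequest, requirements: List[str]) -> List[str]:
    """Return the apply requirements that the pull request does not satisfy."""
    checks = {"approved": is_approved(pr), "mergeable": pr.mergeable}
    unmet = []
    for name in requirements:
        if name not in checks:
            raise ValueError(f"unknown apply requirement: {name}")
        if not checks[name]:
            unmet.append(name)
    return unmet
```

An apply would only go ahead when unmet_requirements returns an empty list; otherwise the returned names say what is still missing.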
There are two strategies available:
- branch [default]
- merge
Both of them have valid uses.
The branch strategy checks out the code at the latest branch commit and runs plan there. It assumes that there were no terraform changes on master in the meantime. If there were changes, you will get an unexpected diff in your plan. To mitigate this, make sure your branch is on top of master by either rebasing it or merging master into your branch.
The merge strategy creates a merge commit with master and runs plan there.
I use the branch strategy because my repo forces branches to be on top of master. It saves time on failed plans.
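The difference between the two strategies comes down to whether master's latest commit is part of what gets planned. Below is a minimal Python sketch under stated assumptions: the commit graph is a plain dictionary of commit id to parent ids, and both function names are made up for illustration. It models the idea only and does not call git or atlantis.

```python
from typing import Dict, List


def is_ancestor(ancestor: str, commit: str, parents: Dict[str, List[str]]) -> bool:
    """Return True if `ancestor` is reachable from `commit` by following parents."""
    stack, seen = [commit], set()
    while stack:
        current = stack.pop()
        if current == ancestor:
            return True
        if current in seen:
            continue
        seen.add(current)
        stack.extend(parents.get(current, []))
    return False


def plan_includes_master(strategy: str, branch_head: str, master_head: str,
                         parents: Dict[str, List[str]]) -> bool:
    """Tell whether the planned code contains the latest master commit."""
    if strategy == "merge":
        # The plan runs on a merge commit with master, so master is included
        # (assuming the merge itself succeeds).
        return True
    if strategy == "branch":
        # The plan runs on the branch head; master is only included when the
        # branch sits on top of it.
        return is_ancestor(master_head, branch_head, parents)
    raise ValueError(f"unknown strategy: {strategy}")
```

With the branch strategy, a False result is the situation that produces the unexpected diff described above.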
For atlantis to be functional, a webhook is needed. The webhook and the git provider API are the main communication channels.
In my case I created the github webhook with a CloudPosse module, but for gitlab I had to create it manually. With the atlantis documentation it was a piece of cake:
If you’re using GitLab, navigate to your project’s home page in GitLab
- Click Settings > Integrations in the sidebar
- set URL to http://$URL/events (or https://$URL/events if you’re using SSL) where $URL is where Atlantis is hosted. Be sure to add /events
- double-check you added /events to the end of your URL.
- set Secret Token to the Webhook Secret you generated previously
– NOTE If you’re adding a webhook to multiple repositories, each repository will need to use the same secret.
- check the boxes
– Push events
– Merge Request events
– leave Enable SSL verification checked
- click Add webhook
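The same checklist can be written down as data, which is handy if you later want to script it. The sketch below only builds the settings and sends nothing. The field names follow GitLab's project hooks API as I understand it, so verify them against your GitLab version; the function name and the example host are assumptions for illustration.

```python
def gitlab_webhook_settings(atlantis_host: str, secret: str, use_ssl: bool = True) -> dict:
    """Build webhook settings mirroring the manual steps above."""
    if not secret:
        raise ValueError("a webhook secret is required")
    scheme = "https" if use_ssl else "http"
    host = atlantis_host.strip().rstrip("/")
    return {
        # The URL must end with /events.
        "url": f"{scheme}://{host}/events",
        # Secret Token: the webhook secret generated previously.
        "token": secret,
        # The boxes checked in the list above.
        "push_events": True,
        "merge_requests_events": True,
        # Leave SSL verification enabled.
        "enable_ssl_verification": True,
    }
```

For example, gitlab_webhook_settings("atlantis.example.com", "my-secret") produces a URL of https://atlantis.example.com/events.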
Deployment on ECS
The setup at CriticalStart was based on the atlantis provider, which consists of the following resources:
- Virtual Private Cloud (VPC)
- SSL certificate using AWS Certificate Manager (ACM)
- Application Load Balancer (ALB)
- Domain name using AWS Route53 which points to ALB
- AWS Elastic Container Service (ECS) and AWS Fargate running Atlantis Docker image
- AWS Parameter Store to keep secrets and access them in ECS task natively
The main resource here is the ECS Fargate task running atlantis, which is exposed through the ALB.
There is a bit of overhead in creating an additional VPC, ALB and so on, but it gives better isolation, which at the end of the day gives us a more secure environment.
Enabling it with
Additionally we used CloudPosse module to create github webhook for
Deployment on Kubernetes
In my private k8s cluster I use helm to deploy atlantis. It has its own namespace where only atlantis is deployed.
To prevent committing credentials to the repository where the helm configuration lives, we need to create two secrets:
awsSecretName contains the AWS CLI credentials file.
To keep such secrets in the repository I use sealed secrets, which basically create an encrypted Secret inside a SealedSecret resource. Such resources can only be decrypted with access to the k8s cluster, so it’s safe to have them even in a public repository, although I would advise against that.
Securing atlantis is a complicated topic, as there are multiple ways to exploit it.
- Communication between atlantis and the git provider should use a secure channel - precisely, atlantis shouldn’t be exposed over HTTP; webhooks should hit HTTPS endpoints.
- Webhooks should have a webhook secret so that each request can be validated and rejected if it’s not legitimate.
- Restrict access to the atlantis application - people should not have permission to exec a shell and run commands.
- Provide atlantis credentials (API keys, ssh credentials, etc.) to the application in a secure way.
- Restrict access to the repository which triggers atlantis to just the people who should make infrastructure changes; anyone with access to the repository can possibly exploit it.
- Use the --repo-whitelist flag to define which repositories can trigger atlantis.
- Watch out for PRs created by forks. It’s possible to trigger
- Run atlantis in an isolated environment - separate k8s namespace, different ECS cluster, dedicated EC2 instance, etc.
- Allow communication to atlantis only from specific
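To illustrate the webhook secret point from the list above, here is a minimal Python sketch of how a receiver can validate incoming requests. It relies on two provider behaviours: GitHub signs the request body with HMAC-SHA256 and sends the result in the X-Hub-Signature-256 header, while GitLab sends the secret itself in the X-Gitlab-Token header. The function names are made up for illustration, and this is not atlantis's own code.

```python
import hashlib
import hmac
from typing import Optional


def github_signature(secret: str, body: bytes) -> str:
    """Compute the value GitHub puts in the X-Hub-Signature-256 header."""
    digest = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    return "sha256=" + digest


def is_valid_github_webhook(secret: str, body: bytes, signature_header: Optional[str]) -> bool:
    """Accept the request only if the signature matches the body and secret."""
    expected = github_signature(secret, body).encode("utf-8")
    received = (signature_header or "").encode("utf-8")
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(expected, received)


def is_valid_gitlab_webhook(secret: str, token_header: Optional[str]) -> bool:
    """Accept the request only if the token header equals the shared secret."""
    return hmac.compare_digest(secret.encode("utf-8"), (token_header or "").encode("utf-8"))
```

A request that fails the check should be rejected before any workflow is started.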
I love it!
terraform has never been easier for me.
It needs a lot of love in the beginning to get everything right, but after that it’s a smooth ride.
The main issue for me is security. Giving atlantis a lot of privileges means the blast radius is gigantic.
Anyone who can access the atlantis app, send webhooks, comment on a PR, etc. can break the whole infrastructure that atlantis has access to.
The risk is not just internal: atlantis is exposed, so it also adds another large attack surface.
At the end of the day, if you can “afford” atlantis, it’s definitely worth the effort!