1-888-317-7920 info@2ndwatch.com

In this post, we’ll go over a complete workflow for continuous integration (CI) and continuous delivery (CD) for infrastructure as code (IaC) with just 2 tools: Terraform, and Atlantis.

What is Terraform?

So what is Terraform? According to the Terraform website:

Terraform is a tool for building, changing, and versioning infrastructure safely and efficiently. Terraform can manage existing and popular service providers as well as custom in-house solutions.

In practice, this means that Terraform allows you to declare what you want your infrastructure to look like – in any cloud provider – and will automatically determine the changes necessary to make it so. Because of its simple syntax and cross-cloud compatibility, it’s 2nd Watch’s choice for infrastructure as code.

Pain You May Be Experiencing Working With Terraform

When you have multiple collaborators (individ

uals, teams, etc.) working on a Terraform codebase, some common problems are likely to emerge:

  1. Enforcing peer review becomes difficult. In any codebase, you’ll want to ensure that your code is peer reviewed in order to ensure better quality in accordance with The Second Way of DevOps: Feedback. The role of peer review in IaC codebases is even more important. IaC is a powerful tool, but that tool has a double-edge – we are clearly more productive for using it, but that increased productivity also means that a simple typo could potentially cause a major issue with production infrastructure. In order to minimize the potential for bad code to be deployed, you should require peer review on all proposed changes to a codebase (e.g. GitHub Pull Requests with at least one reviewer required). Terraform’s open source offering has no facility to enforce this rule.
  1. Terraform plan output is not easily integrated in code reviews. In all code reviews, you must examine the source code to ensure that your standards are followed, that the code is readable, that it’s reasonably optimized, etc. In this aspect, reviewing Terraform code is like reviewing any other code. However, Terraform code has the unique requirement that you must also examine the effect the code change will have upon your infrastructure (i.e. you must also review the output of a terraform plan command). When you potentially have multiple feature branches in the review process, it becomes critical that you are assured that the terraform plan output is what will be executed when you run terraform apply. If the state of infrastructure changes between a run of terraform plan and a run of terraform apply, the effect of this difference in state could range from inconvenient (the apply fails) to catastrophic (a significant production outage). Terraform itself offers locking capabilities but does not provide an easy way to integrate locking into a peer review process in its open source product.
  1. Too many sets of privileged credentials. Highly-privileged credentials are often required to perform Terraform actions, and the greater the number principals you have with privileged access, the higher your attack surface area becomes. Therefore, from a security standpoint, we’d like to have fewer sets of admin credentials which can potentially be compromised.

What is Atlantis?

And what is Atlantis? Atlantis is an open source tool that allows safe collaboration on Terraform projects by making sure that proposed changes are reviewed and that the proposed change is the actual change which will be executed on your infrastructure. Atlantis is compatible (at the time of writing) with GitHub and Gitlab, so if you’re not using either of these Git hosting systems, you won’t be able to use Atlantis.

How Atlantis Works With Terraform

Atlantis is deployed as a single binary executable with no system-wide dependencies. An operator adds a GitHub or GitLab token for a repository containing Terraform code. The Atlantis installation process then adds hooks to the repository which allows communication to the Atlantis server during the Pull Request process.

You can run Atlantis in a container or a small virtual machine – the only requirement is that the Terraform instance can communicate with both your version control (e.g. GitHub) and infrastructure (e.g. AWS) you’re changing. Once Atlantis is configured for a repository, the typical workflow is:

  1. A developer creates a feature branch in git, makes some changes, and creates a Pull Request (GitHub) or Merge Request (GitLab).
  2. The developer enters atlantis plan in a PR comment.
  3. Via the installed web hooks, Atlantis locally runs terraform plan. If there are no other Pull Requests in progress, Atlantis adds the resulting plan as a comment to the Merge Request.
    • If there are other Pull Requests in progress, the command fails because we can’t ensure that the plan will be valid once applied.
  4. The developer ensures the plan looks good and adds reviewers to the Merge Request.
  5. Once the PR has been approved, the developer enters atlantis apply in a PR comment. This will trigger Atlantis to run terraform apply and the changes will be deployed to your infrastructure.
    • The command will fail if the Merge Request has not been approved.

The following sequence diagram illustrates the sequence of actions described above:

Atlantis sequence diagram

We can see how our pain points in Terraform collaboration are addressed by Atlantis:

  1. In order to enforce code review, you can launch Atlantis with the –require approvals flag: https://github.com/runatlantis/atlantis#approvals
  2. In order to ensure that your terraform plan accurately reflects the change to your infrastructure that will be made when you run terraform apply, Atlantis performs locking on a project or workspace basis: https://github.com/runatlantis/atlantis#locking
  3. In order to prevent creating multiple sets of privileged credentials, you can deploy Atlantis to run on an EC2 instance with a privileged IAM role in its instance profile (e.g. in AWS). In this way, all of your Terraform commands run through a single set of privileged credentials and obviate the need to distribute multiple sets of privileged credentials: https://github.com/runatlantis/atlantis#aws-credentials

Conclusion

You can see that with minimal additional infrastructure you can establish a safe and reliable CI/CD pipeline for your infrastructure as code, enabling you to get more done safely! To find out how you can deploy a CI/CD pipeline in less than 60 days, Contact Us.

-Josh Kodroff, Associate Cloud Consultant

Facebooktwittergoogle_pluslinkedinmailrss