What does that even mean?
What I am talking about here is the automation of the following:
- AWS Linked Account Creation (the creation of secondary accounts under a single master account)
- Account Initialization and Configuration
- Continuous Compliance
It is commonplace for organizations to manage their AWS assets/resources across a wide range of different AWS accounts. This is nothing new, and we’ve seen some of our customers scale this into the hundreds. This has some pretty obvious implications from an operational, security, and accounting standpoint.
AWS Linked Account Creation
First there is the creation of the linked account itself, which can be a time-consuming and arduous (even if one-time) process. Even if you have a rigid process for this, it is inevitable that human error will introduce some drift or inconsistency at some point. It’s not a matter of if, just a matter of when. There is also the tracking of the root account credentials and everything that goes along with that. Looks like another process that is ripe for some sweet, sweet automation. Until very recently there was no API available for this, but AWS released a beta API to create linked accounts around a year ago that has recently gone to general availability. So score one for automation!
Account Initialization and Configuration
Now you’ve got your shiny new linked account, but for every account you manage you have to ensure that all of your base settings and resources are properly set up (e.g. AWS CloudTrail, AWS Config, IAM password policies, SAML Federation with your central AD, on and on). Not only set up, but set up in a consistent way so that you don’t have drift between accounts. OK, so you could put together a nice CloudFormation template (CFT), manage it in Terraform, or possibly just use a homegrown set of scripts (bash + AWS CLI, Python, Ruby, etc.). Those are all a great start, but you still need to be able to audit those resources to ensure they are what they are supposed to be. You also need to be able to push changes to those resources.
A few examples…
- IT AD Admin: The ADFS servers are updating their XML metadata doc, so we need you to go update the ADFS SAML Federation for our 37 linked accounts.
- IT Security Admin: We need to actively manage our set of IAM Roles that map to ADFS groups and their respective permissions on a regular and ongoing basis. How are we going to quickly and consistently do that across our 37 linked accounts?
- IT Security Admin: Hey, our email address for AWS CloudTrail notifications (SNS subscription) needs to be updated to use a new email address. I need you to get that updated on all of our 37 linked accounts ASAP!
And on and on it goes. Suffice it to say, there is a never-ending need to be able to make modifications across one, several, or all of your linked AWS accounts. You need an approach for handling what would normally be an unwieldy and tedious bit of guaranteed work. The more human intervention required to manage these things, the more likely we are to see inconsistencies, errors, and misses. And we’ve seen enough cautionary tales on failed security practices in the news in the past few years that I don’t need to stress the importance of getting this stuff right. Every time. All the time.
Continuous Compliance
Once you have these things configured, you need a way to audit those resources and settings on an ongoing basis and, ideally, to respond automatically to drift events. This one is a bit trickier than the others because, while you can use tools like CloudFormation or Terraform to set up your initial settings and configurations, the resources they create can be modified afterwards outside of the tool they were created/configured with in the first place. Tools like AWS CloudTrail and AWS Config provide valuable tracking information for helping audit resources but alone don’t solve this puzzle, especially if you are talking about managing this across a few dozen accounts. Something more robust must be employed to collect that data and do something intelligent with it.
How do I escape this sort of multi-account management nightmare you are describing?!!
I’ll take a deeper dive into this in my next blog post, but here is a high-level overview of the architecture and accompanying tools and technologies you can put in place to pull it off.
AWS Linked Account Creation
With the somewhat recent release of the Organizations API this has become a reality. As per the CreateAccount API documentation, you will need to ensure that AWS Organizations is enabled in the master account. But fear not! It probably already is. Specifically, if you are already running multiple accounts under a master account, then it most certainly is. I won’t bore you with details, and AWS already has a very nice article detailing the process required to use Organizations and the API to automate account creation. Pretty spiffy!
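As a rough illustration of the account-creation step, here is a minimal Python sketch. It assumes `org_client` behaves like `boto3.client("organizations")` run from the master account; the function name, polling interval, and timeout are my own illustrative choices, not part of any AWS tooling.

```python
import time


def create_linked_account(org_client, email, account_name,
                          role_name="OrganizationAccountAccessRole",
                          poll_seconds=5, max_polls=60):
    """Request a new linked account and wait until the request completes.

    org_client is assumed to behave like boto3.client("organizations").
    Returns the new account ID; raises RuntimeError on failure or timeout.
    """
    # CreateAccount is asynchronous: it returns a request status, not an account.
    status = org_client.create_account(
        Email=email,
        AccountName=account_name,
        RoleName=role_name,
    )["CreateAccountStatus"]

    polls = 0
    while True:
        if status["State"] == "SUCCEEDED":
            return status["AccountId"]
        if status["State"] == "FAILED":
            reason = status.get("FailureReason", "unknown")
            raise RuntimeError(f"Account creation failed: {reason}")
        if polls >= max_polls:
            raise RuntimeError("Timed out waiting for account creation")
        time.sleep(poll_seconds)
        # Poll the request by its ID until it leaves the IN_PROGRESS state.
        status = org_client.describe_create_account_status(
            CreateAccountRequestId=status["Id"]
        )["CreateAccountStatus"]
        polls += 1
```

With boto3 installed and credentials for the master account, a call might look like `create_linked_account(boto3.client("organizations"), "aws-dev@example.com", "dev")`, where the email address and account name are placeholders.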
Account Initialization and Configuration
Once you have created the linked account using the CreateAccount API, the next step is to apply any and all org-specific initialization and configuration to the new account to get it ready for action. This step and the Continuous Compliance step can be managed by the same tool if that is how you decide to architect it.
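One way to apply the same change to every linked account is to assume a role in each one from the master account and run the same action there. The sketch below assumes `sts_client` behaves like `boto3.client("sts")` and that each linked account has the role that CreateAccount creates by default (`OrganizationAccountAccessRole`); `apply_to_accounts` and the `action` callback are hypothetical names used only for illustration.

```python
def role_arn(account_id, role_name="OrganizationAccountAccessRole"):
    """Build the ARN of the cross-account role in a linked account."""
    return f"arn:aws:iam::{account_id}:role/{role_name}"


def apply_to_accounts(sts_client, account_ids, action,
                      role_name="OrganizationAccountAccessRole",
                      session_name="account-baseline"):
    """Run action(account_id, credentials) in every account.

    sts_client is assumed to behave like boto3.client("sts").
    Returns (results, failures) so one bad account does not stop the rest.
    """
    results, failures = {}, {}
    for account_id in account_ids:
        try:
            assumed = sts_client.assume_role(
                RoleArn=role_arn(account_id, role_name),
                RoleSessionName=session_name,
            )
            # Temporary credentials: AccessKeyId, SecretAccessKey, SessionToken.
            results[account_id] = action(account_id, assumed["Credentials"])
        except Exception as exc:
            # Record the failure and keep going with the remaining accounts.
            failures[account_id] = str(exc)
    return results, failures
```

Inside `action`, the temporary credentials can be passed to `boto3.client(...)` through its `aws_access_key_id`, `aws_secret_access_key`, and `aws_session_token` arguments to update, say, a SAML provider or an IAM password policy in that account.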
The key is that this is where we initialize the account with its base configuration. Whether you do that with custom scripting/code, CloudFormation, Terraform, or some amalgamation of those and/or other tools/services is not of paramount importance. What is important is having a way to track those resources and their state. Make sure you keep that in mind when architecting a solution. One nice thing about CloudFormation is that the state tracking is built right into the service itself. You can easily list all resources within a CloudFormation stack, and you can include stack Outputs to track any custom data you may generate or derive during the CFT stack launch.
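To make the CloudFormation point concrete, here is a minimal sketch of pulling a stack's resources and Outputs through the API. It assumes `cfn_client` behaves like `boto3.client("cloudformation")`; `stack_inventory` and the shape of the returned dictionary are illustrative choices of mine.

```python
def stack_inventory(cfn_client, stack_name):
    """Collect the resources and Outputs of one CloudFormation stack.

    cfn_client is assumed to behave like boto3.client("cloudformation").
    """
    resources = []
    # list_stack_resources is paginated, so walk every page.
    paginator = cfn_client.get_paginator("list_stack_resources")
    for page in paginator.paginate(StackName=stack_name):
        for summary in page["StackResourceSummaries"]:
            resources.append({
                "logical_id": summary["LogicalResourceId"],
                "physical_id": summary.get("PhysicalResourceId"),
                "type": summary["ResourceType"],
                "status": summary["ResourceStatus"],
            })

    # Outputs are optional, so default to an empty list.
    stack = cfn_client.describe_stacks(StackName=stack_name)["Stacks"][0]
    outputs = {o["OutputKey"]: o["OutputValue"] for o in stack.get("Outputs", [])}
    return {"resources": resources, "outputs": outputs}
```

The resulting inventory is the kind of record a compliance service could store as the known state of an account's baseline stack.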
You could do something similar with Terraform through the use of their state files, but it (non-enterprise Terraform) lacks the same API queryability that CloudFormation has built in. Also, it is less transparent to the casual onlooker in the AWS console where resources are originating from. Of course, once you query the resources you will still require a method for determining and tracking the state of those resources. But now we’re getting ahead of ourselves.
Continuous Compliance
This is going to require a service that will allow you to:
- Track the state of resources we care about
- Audit the state of those resources automatically on an ongoing basis
- Report on any configuration drift
- Optionally automatically remediate drift
Using the AWS CloudTrail and AWS Config services gives us the ability to track changes in real time and tie those changes to a specific user/role. But what about services that are not yet supported by AWS Config? In that case you may want to (as we have done) build a suite of services to handle these tasks. Resources and configurations are registered with a service that tracks their known-desired state. Another service is responsible for querying the current state of those items and raising a flag if there is drift. Potentially another service could report on those flagged out-of-compliance resources/settings. Optionally you could deploy a service that remediates drift from your desired configuration state on all out-of-compliance resources, or possibly just a subset.
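The register/query/flag flow described above can be sketched in a few lines of Python. This is only an illustration of the idea, not our implementation: the registry is a plain dictionary, and `fetch_current` is a hypothetical callback that would wrap whatever API calls return a resource's live settings.

```python
MISSING = "<missing>"  # marker for a setting that is absent from the live state


def find_drift(desired, current):
    """Return {setting: {"desired": ..., "current": ...}} for each mismatch."""
    drift = {}
    for key, want in desired.items():
        have = current.get(key, MISSING)
        if have != want:
            drift[key] = {"desired": want, "current": have}
    return drift


def audit(registry, fetch_current):
    """Compare every registered resource against its live state.

    registry: {resource_id: desired settings dict}
    fetch_current: callable taking a resource_id and returning its
                   current settings dict (hypothetical; wraps real API calls).
    Returns only the resources that have drifted.
    """
    report = {}
    for resource_id, desired in registry.items():
        drift = find_drift(desired, fetch_current(resource_id))
        if drift:
            report[resource_id] = drift
    return report
```

Only the settings listed in the registry are checked, so anything you do not care about can change freely without raising a flag. A remediation service could consume the same report and push the `desired` values back.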
At 2nd Watch we’ve architected and built out our own Managed Cloud-specific implementation of Automated Account Creation and Continuous Compliance. If you would rather focus your energy on your business’s core competencies and not on building foundational cloud management tooling, why not come on board and let us empower you to deliver your product and drive shareholder value in the most secure, stable, and cost-effective way possible? We’ve got the tools and the people to make it happen! Contact us to learn more.
–Ryan Kennedy, Principal Cloud Automation Architect, 2nd Watch
-Craig Monson, Sr Automation Architect
-Lars Cromley, Director of Engineering
Without a doubt, AWS has fundamentally changed how modern enterprises deploy IT infrastructure. Their services are flexible, cost-effective, scalable, secure and reliable. And while moving from on-premises data centers to the cloud is, in most cases, the smart move, managing your costs becomes much more complex once you are there.
On-premises costs are straightforward: enterprises purchase servers and amortize their costs over the expected life. Shared services such as internet access, racks, power and cooling are proportionally allocated to the cost of each server. AWS, on the other hand, invoices each usage type separately. For example, if you are running a basic EC2 instance, you will be charged not only for the EC2 box usage but also for the data transfer, EBS storage and associated snapshots. You could end up with as many as 13 line items of cost for a single EC2 instance.
Example: Pricing line items for a single c4.xlarge Linux virtual machine running in the US East Region
When examining the composition of various workload types, the number of line items to manage will vary. A traditional VM-based workload may have 50 cost line items for every $1,000 of spend, while an agile, cloud-native workload may have as many as 500 per $1,000 and a dynamic workload leveraging spot instances may have upwards of 1,200 per $1,000. This “parts bin” approach to pricing makes the job of cost accounting challenging.
To address this complexity and enable accurate cost accounting of your cloud costs, we recommend creating a business-relevant financial tagging schema to organize your resources and associated cost line items based on your specific financial accounting structure.
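To get a feel for the scale, the densities quoted above can be turned into a rough estimate. This is back-of-the-envelope arithmetic only: the figures are the approximate ones from the text (the last two are "as many as" and "upwards of" numbers), and the workload labels are my own.

```python
# Approximate cost line items per $1,000 of spend, taken from the text above.
LINE_ITEMS_PER_1000_USD = {
    "traditional_vm": 50,
    "cloud_native": 500,
    "spot_dynamic": 1200,
}


def estimate_line_items(monthly_spend_usd, workload_type):
    """Rough count of cost line items for a given monthly spend."""
    density = LINE_ITEMS_PER_1000_USD[workload_type]
    return round(monthly_spend_usd / 1000 * density)
```

By this estimate, a cloud-native workload spending $20,000 a month could produce around 10,000 line items, against about 1,000 for a traditional VM-based workload at the same spend.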
Here are some recommended financial management tags you should consider:
The quality of the information your tags provide depends directly on tag data integrity, which in turn depends on the rigor applied in adopting a systematic and disciplined approach to AWS tagging.
Financial Management Tagging – Best Practices
- Create a framework or standard for your enterprise that outlines required tag names, tag formatting rules, and governance of tags.
- Tags should be enforced and automated at startup of the resource via CloudFormation templates or other infrastructure-as-code tools, such as Terraform, to ensure cost accounting details are captured from the time of launch.
- NOTE: Tags are point-in-time based. If a resource is launched without being tagged and then tagged sometime in the future, all hours the resource ran prior to being tagged will not be included in tag reports in the AWS console.
- Manually creating tags and associated values is strongly discouraged, as it leads to mis-tagged and untagged resources and inaccurate cost accounting.
- Select all upper-case or all lower-case keys and values to avoid discrepancies with capitalization.
- NOTE: “Production” and “production” are considered two different tag names or values.
- Monitor resources with AWS Config Rules and alert for newly created resources that are not tagged
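A tagging standard like the one outlined above is easy to check in code before a resource is launched or when it is audited. The sketch below is illustrative only: the required keys are hypothetical stand-ins for whatever your own schema defines, and it takes the all-lower-case convention.

```python
# Hypothetical required keys; substitute the ones from your own tagging schema.
REQUIRED_TAG_KEYS = {"application", "environment", "business_unit"}


def normalize_tags(tags):
    """Trim and lower-case keys and values so "Production" matches "production"."""
    return {key.strip().lower(): value.strip().lower()
            for key, value in tags.items()}


def missing_tags(tags, required=REQUIRED_TAG_KEYS):
    """Return the required keys that are absent or empty, sorted by name."""
    normalized = normalize_tags(tags)
    return sorted(key for key in required if not normalized.get(key))
```

For example, `missing_tags({"Application": "Billing", "Environment": "Production"})` returns `["business_unit"]`, which could block a launch or raise an alert.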
Once your tagging schema is created, automation is in place to tag resources during startup, and alerts are set up to ensure tagging is managed, you can accurately view, track, and report your cost and usage using any of your tagging dimensions.
Financial Management Reporting – Best Practices
- Using your tagging schema, group your resources by workload.
- Apply Reserved Instance discounts to the workloads they were intended for.
- NOTE: 2nd Watch’s CMP Finance Manager tool converts reserved instances into resources so that you can add them to the workload they were intended for.
- Organize your groups to match your specific multi-level financial reporting structure.
- Manage shared resources
- Create groups for shared resources. If you have resources that are shared across multiple workloads, such as a database used by multiple applications or virtual machines with more than one application running on them, create groups to capture these costs and allocate them proportionally to the applications using them.
- Manage un-taggable resources
- Create a group for un-taggable resources. Some AWS resources are not taggable and should be grouped together and their associated costs proportionally allocated to all applications.
- Manage spend to budget
- Create budgets and budget alerts for each group to ensure you stay in budget throughout the year.
- Key alerts
- Forecasted month end cost exceeds alert threshold
- MTD cost is over alert threshold
- Forecasted year end cost exceeds alert threshold
- YTD cost is over alert threshold
- Sign up to receive monthly cost and usage reports for integration into your internal cost accounting system.
- Cost by application, environment, business unit etc.
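Two of the practices above, proportional allocation of shared costs and budget alerts, come down to simple arithmetic. The sketch below is one possible interpretation: the usage weights are whatever measure you choose (requests, storage, CPU hours), and the month-end forecast is a naive straight-line run rate, which is my assumption rather than how any particular tool forecasts.

```python
def allocate_shared_cost(shared_cost, usage_by_app):
    """Split a shared cost across applications in proportion to their usage."""
    total = sum(usage_by_app.values())
    if total <= 0:
        raise ValueError("usage weights must sum to a positive number")
    return {app: shared_cost * weight / total
            for app, weight in usage_by_app.items()}


def budget_alerts(mtd_cost, day_of_month, days_in_month, threshold):
    """Return (forecast, alerts) for one group's month-to-date spend.

    The forecast assumes spend continues at the average daily rate so far.
    """
    forecast = mtd_cost / day_of_month * days_in_month
    alerts = []
    if mtd_cost > threshold:
        alerts.append("MTD cost is over alert threshold")
    if forecast > threshold:
        alerts.append("Forecasted month end cost exceeds alert threshold")
    return forecast, alerts
```

For instance, a $900 shared database used twice as heavily by one application as by another would be split $600 and $300. The same pattern extends to year-to-date and year-end alerts by swapping days for months.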
Even though AWS’ “parts bin” approach to pricing is complicated, following these guidelines will help ensure accurate cost accounting of your cloud spend.
–Timothy Hill, Senior Product Manager, 2nd Watch
Momentum continues to build for companies who are migrating their workloads to the cloud, across all industries, even highly regulated industries such as Financial Services, Health Care, and Government. And it’s not just for small companies and startups. Most of the largest companies in the world – we’re talking Fortune 500 here – are adopting rapid and aggressive strategies for migrating and managing their workloads in the cloud. While the benefits of migrating workloads to the cloud are seemingly obvious (cost savings, of course), the “hidden” benefits exist in the fact that the cloud allows businesses to be more nimble, enabling business users with faster, more powerful, and more scalable business capabilities than they’ve ever had before.
So what do enterprises care about when managing workloads in the cloud? More importantly, what should you care about? Let’s assume, for the sake of argument, that your workloads are already in the cloud – that you’ve adopted a sound methodology for migrating your workloads to the cloud.
Raise your expectations
I would submit that enterprises should raise their expectations from “standard” workload management. Why? Because the cloud provides a more flexible, powerful, and scalable paradigm than the typical application-running-in-a-data-center-on-a-bunch-of-servers model. Once your workloads are in the cloud, the basic requirements for managing them are not dissimilar to what you’d expect today for managing workloads on-premise or in a data center.
The basics include:
- Service Levels: Basic service levels are still just that, basic service levels: availability, response time, capacity, support, monitoring, etc. So what’s different in the cloud world? You should pay particular attention to ensuring your personal data is protected in your hosted cloud service.
- Support: Like any hosting capability, support is very important to consider. Does your provider provide online, call center, dedicated, and/or a combo platter of all of these?
- Security: Ensure that your provider has robust security measures in place and mechanisms to preserve your applications and data.
- Compliance: You should ensure your cloud provider is in compliance with the standards for your specific industry. Privacy, security and quality are principal compliance areas to evaluate and ensure are being provided.
Now what should enterprises expect on top of the “basics?”
- Visibility: When your workloads are in the cloud, you can’t see them anymore. No longer will you be able to walk through the data center and see your racks of servers with blinking lights, and there was a certain comfort in that, right? So when you move to the cloud, you need to be able to see (ideally in a visual paradigm) the services that you’re using to run your critical workloads.
- Be Proactive: It used to be that enterprises only cared if their data center providers/data center guys were just good at being “reactive” (responding to tickets, monitoring apps and servers, escalating issues, etc). But now the cloud allows us to be proactive. How can you optimize your infrastructure so you actually use less, rather than more? Wouldn’t it be great if your IT operations guy came to you and said “Hey, we can decrease our footprint and lessen our spend,” rather than the other way around?
- Partner with the business: Now that your workloads are running in the cloud, your IT ops team can focus more on working with the business/applications teams to understand better how the infrastructure can work for them, again rather than the other way around, and they can educate the business/applications teams on how some of the newest cloud services, like elasticity, big data, unstructured data, auto-scaling, etc., can cause the business to think differently and innovate faster.
Enterprises should be, and are, raising their expectations as they relate to managing their workloads in the cloud. Why? Because the cloud provides a more flexible, powerful, and scalable paradigm than the typical hardware-centric, data center-focused approach.
-Keith Carlson, EVP of Professional and Managed Workload Services
Amazon Web Services™ (AWS) released a new service at re:Invent a few weeks ago that will have operations and security managers smiling. CloudTrail is a web service that records AWS API calls and stores the logs in S3. This gives organizations the visibility they need into their AWS infrastructure to maintain proper governance of changes to their environment.
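To give a sense of what those logs look like, here is a minimal Python sketch that summarizes one CloudTrail log file. It relies on the log files being gzip-compressed JSON with a top-level `Records` list; the function name and the choice to count by service and API call are mine, and downloading the object from S3 is left out.

```python
import gzip
import json
from collections import Counter


def summarize_cloudtrail_log(gzipped_bytes):
    """Count API calls in one CloudTrail log file by (service, call name).

    gzipped_bytes is the raw content of a single log object from S3.
    """
    records = json.loads(gzip.decompress(gzipped_bytes))["Records"]
    return Counter((record["eventSource"], record["eventName"])
                   for record in records)
```

Calling `.most_common(10)` on the result gives a quick picture of which API calls dominate an account's activity; each record also carries a `userIdentity` field that could be used to group the same counts by caller.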
2nd Watch was pleased to announce support for CloudTrail in our launch of our 2W Atlas product. 2W Atlas is a product that organizes and visualizes AWS resources and output data. Enterprise organizations need tools and services built for the cloud to properly manage these new architectures. 2W Atlas provides organizations with a tool that enables their divisions and business units to organize and manage the CloudTrail data for their individual group.
2nd Watch is committed to assisting enterprise organizations with the expertise and tools to make the cloud work for them. The tight integration 2nd Watch has developed with CloudTrail and Atlas is further proof of our expertise in bringing enterprise solutions that our customers demand.
To learn more about 2W Atlas or CloudTrail, Contact Us and let us know how we can help.
-Matt Whitney, Sales Executive