AWS provides an oft overlooked tool available to accounts with “Business” or “Enterprise” level support called Trusted Advisor (TA). Trusted Advisor is a tool that analyzes your current AWS resources for ways to improve your environment in the following categories:
- Cost Optimization
- Fault Tolerance
- Performance
- Security
It rigorously scours your AWS resources for inefficiencies, waste, potential capacity issues, departures from best practices, security holes and much, much more. It provides a very straightforward and easy to use interface for viewing the identified issues.
Trusted Advisor will do everything from detecting EC2 instances that are under-utilized (e.g. using an m3.xlarge for a low traffic NAT instance), to detecting S3 buckets that are good candidates for fronting with a CloudFront distribution, to identifying Security Groups with wide open access to one or more ports, and everything in between.
In Amazon’s own words…
[blockquote]AWS Trusted Advisor inspects your AWS environment and makes recommendations for saving money, improving system performance and reliability, or closing security gaps. Since 2013, customers have viewed over 1.7 million best-practice recommendations in AWS Trusted Advisor in the categories of cost optimization, performance improvement, security, and fault tolerance, and they have realized over $300 million in estimated cost reductions. Currently, Trusted Advisor provides 37 checks; the most popular ones are Low Utilization Amazon EC2 Instances, Amazon EC2 Reserved Instances Optimization, AWS CloudTrail Logging, Amazon EBS Snapshots, and two security group configuration checks.[/blockquote]
This week (7/23/2014) AWS just announced the release of the new Trusted Advisor Console.
Two new features of the TA console I found particularly noteworthy and useful are the Action Links and Access Management.
Action Links allow you to click a hyperlink next to an issue in the TA Console that redirects you to the appropriate place to take action on the issue. Pretty slick… it saves you time jumping between browser tabs or navigating to the correct Console and menus. Action Links also take the guesswork out of hunting down the correct place if you aren’t that familiar with the AWS Console.
Access Management allows you to use AWS IAM (Identity and Access Management) credentials to control access to specific categories and checks within Trusted Advisor. This gives you the ability to have granular access control over which people in your organization can view and act on specific checks.
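As a rough illustration of what that granular control might look like, here is a minimal sketch that builds an IAM policy document granting read-only access to the checks in a single category. The `trustedadvisor:Describe*` action pattern, the ARN layout, the category code and the account ID are all assumptions made for illustration, so verify them against the current IAM documentation for Trusted Advisor before using anything like this.

```python
import json


def trusted_advisor_read_policy(account_id, category_code):
    """Build an IAM policy document for read-only access to one TA category.

    The action pattern and resource ARN layout below are assumptions for
    illustration; confirm both against the AWS documentation.
    """
    resource = "arn:aws:trustedadvisor:*:%s:checks/%s/*" % (
        account_id, category_code)
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "trustedadvisor:Describe*",
                "Resource": resource,
            }
        ],
    }


if __name__ == "__main__":
    # 123456789012 is a placeholder account ID, not a real account.
    policy = trusted_advisor_read_policy("123456789012", "fault_tolerance")
    print(json.dumps(policy, indent=2))
```

A policy like this could be attached to an IAM group so that, for example, an operations team sees only the fault tolerance checks.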
In addition to the console, Trusted Advisor also supports API access. And this wouldn’t be my AWS blog post without some kind of coding example using Python and the boto library. The following example code will print out a nicely formatted list of all the Trusted Advisor categories and each of the checks underneath them in alphabetical order.
from boto import connect_support

# Connect to the AWS Support API, which exposes Trusted Advisor
conn = connect_support()
ta_checks = sorted(conn.describe_trusted_advisor_checks('en')['checks'],
                   key=lambda check: check['category'])
# Print each category as an underlined heading, then its checks by name
for cat in sorted(set([x['category'] for x in ta_checks])):
    print("\n%s\n%s" % (cat, '-' * len(cat)))
    for check in sorted(ta_checks, key=lambda check: check['name']):
        if check['category'] == cat:
            print(" %s" % check['name'])
Here is the resulting output (notice all 37 checks are accounted for):
Amazon EC2 Reserved Instances Optimization
Amazon RDS Idle DB Instances
Amazon Route 53 Latency Resource Record Sets
Idle Load Balancers
Low Utilization Amazon EC2 Instances
Unassociated Elastic IP Addresses
Underutilized Amazon EBS Volumes
Amazon EBS Snapshots
Amazon EC2 Availability Zone Balance
Amazon RDS Backups
Amazon RDS Multi-AZ
Amazon Route 53 Deleted Health Checks
Amazon Route 53 Failover Resource Record Sets
Amazon Route 53 High TTL Resource Record Sets
Amazon Route 53 Name Server Delegations
Amazon S3 Bucket Logging
Auto Scaling Group Health Check
Auto Scaling Group Resources
Load Balancer Optimization
VPN Tunnel Redundancy
Amazon EBS Provisioned IOPS (SSD) Volume Attachment Configuration
Amazon Route 53 Alias Resource Record Sets
CloudFront Content Delivery Optimization
High Utilization Amazon EC2 Instances
Large Number of EC2 Security Group Rules Applied to an Instance
Large Number of Rules in an EC2 Security Group
Overutilized Amazon EBS Magnetic Volumes
AWS CloudTrail Logging
Amazon RDS Security Group Access Risk
Amazon Route 53 MX and SPF Resource Record Sets
Amazon S3 Bucket Permissions
IAM Password Policy
MFA on Root Account
Security Groups - Specific Ports Unrestricted
Security Groups - Unrestricted Access
In addition to the meta-data about categories and checks, actual TA check results and recommendations can also be pulled and refreshed using the API.
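Here is a minimal sketch of what pulling and refreshing results might look like with boto. It assumes the `describe_trusted_advisor_check_result` and `refresh_trusted_advisor_check` calls on boto's Support connection and the response layout of the AWS Support API (`result`, `status`, `flaggedResources`); the helper names `flagged_resources` and `refresh_check` are my own, not part of boto.

```python
def flagged_resources(conn, check_id, language="en"):
    """Return (status, flagged resources) for one Trusted Advisor check."""
    response = conn.describe_trusted_advisor_check_result(
        check_id, language=language)
    result = response["result"]
    # Checks with nothing to report may have no flaggedResources key.
    return result["status"], result.get("flaggedResources", [])


def refresh_check(conn, check_id):
    """Ask Trusted Advisor to re-run a check; return the refresh status."""
    response = conn.refresh_trusted_advisor_check(check_id)
    return response["status"]["status"]


if __name__ == "__main__":
    # Requires boto, AWS credentials and a support plan with API access.
    from boto import connect_support

    conn = connect_support()
    for check in conn.describe_trusted_advisor_checks("en")["checks"]:
        status, flagged = flagged_resources(conn, check["id"])
        print("%-14s %3d flagged  %s" % (status, len(flagged), check["name"]))
```

Both helpers take the connection as an argument, which makes it easy to try them against a stub object before pointing them at a real account.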
While Trusted Advisor is a great tool to quickly scan your AWS environment for inefficiencies, waste, potential cost savings, basic security issues, and best practices, it isn’t a “silver bullet” solution. It takes a specific set of AWS architectural understanding, skills, and experience to look at an entire application stack or ecosystem and ensure it is properly designed, built, and/or tuned to best utilize AWS and its array of complex and powerful building blocks. This is where a company like 2nd Watch can add immense value in providing a true “top down” cloud optimization. Our architects and engineers are the best in the business at ensuring applications and infrastructure are designed and implemented using AWS and cloud computing best practices with a fierce attention to detail and focus on our customers’ success in their business and cloud initiatives.
-Ryan Kennedy, Senior Cloud Architect
Have you seen our new service delivery model? Our Cloud Migration and Management Methodology, or CM3, is an innovative, catalogue-based approach to AWS deployments that reduces infrastructure implementation time for Enterprise businesses and significantly lowers risk.
CM3 uses repeatable “building blocks” of services that we select, deploy and manage for customers based on their specific requirements and needs, so it simplifies options and pricing for companies migrating applications and data to AWS. Each service is priced separately, creating a highly transparent offering to ease budgeting and speed approvals for companies needing to move quickly into the cloud.
In the article “Increasing your Cloud Footprint” we discussed the phased approach of moving a traditional environment to Amazon Web Services (AWS). You start with some of the low risk workloads like archiving and backups, move on to workloads that are a more natural fit for the cloud like disaster recovery or development accounts, and finally create POCs for production workloads. By the time you reach production workloads and work out all the kinks you should be operating full time in the cloud! OK not quite, but you will have the experience and know-how to be comfortable with what works and what doesn’t work for your organization in the cloud. Once the organization gets comfortable with AWS, it is a natural progression for more and more of the environment to migrate to the cloud. Before you know it, you have many workloads in AWS and might even have several accounts. The next big question is, what tools are available to manage the environment?
AWS provides users with several tools to manage their cloud environments. The main tools most people use when getting started are the AWS Console and the AWS CLI. The AWS Console gives you the ability to access and manage most AWS services through an intuitive web-based interface, while the CLI is a command-line tool you can use to manage services and automate actions with scripts. For developers, AWS provides SDKs that simplify using AWS services in applications, with APIs tailored to several programming languages and platforms like Java, .NET, Node.js, PHP, Python, Ruby, Android, and iOS.
Along with the regular tools like the AWS Console, CLI and APIs, AWS provides IDE Toolkits that integrate into your development environment. Both the AWS Toolkit for Eclipse and the AWS Toolkit for Visual Studio make it easier for developers to develop and deploy applications using AWS technologies.
The great thing about the AWS IDE Toolkits is that they are very useful even if you are not a developer. For example, if you manage multiple accounts mainly through the standard AWS console, tasks like switching between accounts can become cumbersome and unwieldy. Either you have to log in and out of each environment through your browser, always checking to make sure you are executing commands in the right environment, or you have to use multiple browsers to separate multiple accounts. Either way the process isn’t optimal. The AWS Toolkit for Visual Studio (or Eclipse) seems to solve this problem and can be handy for any AWS cloud administrator. The AWS Toolkit for Visual Studio is compatible with Visual Studio 2010, 2012, and 2013. To set up a new account, download the AWS Toolkit for Visual Studio here. Once installed, you add a user through the AWS Explorer Profile section seen here:
You can then add an account using a Display Name, Access Key ID, Secret Access Key, and Account number. You can add multiple AWS accounts as long as you have the Access Keys for a user with the ability to access resources. See the Add Account box below:
Once you have the credentials entered for multiple accounts, you will have the ability to manage each account by just pulling down the Account dropdown. As you can see below I have two accounts “2nd Watch Prod” and “2nd Watch Dev”:
Finally, you can manage the resources in the selected account by just dropping down which account you want active and then clicking on the corresponding AWS resource you would like to manage. In the example below we are looking at the Amazon EC2 Instances for the Ireland region for another account called “2nd Watch SandBox”. You can quickly click on the Account drop down to select another account and look at the instances associated with it. Suddenly, switching between accounts is manageable and you can focus on being more productive across all your accounts!
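If you prefer scripts to a GUI, the same idea of named accounts can be sketched with a credentials file that holds one section per account. This is only an illustration: the profile names and keys below are made-up placeholders, `load_profiles` is a hypothetical helper, and the layout assumes the INI-style format used by the AWS shared credentials file.

```python
import configparser

# Placeholder credentials for two hypothetical accounts; never commit real keys.
SAMPLE_CREDENTIALS = """
[2nd-watch-prod]
aws_access_key_id = EXAMPLEKEYPROD
aws_secret_access_key = example-secret-prod

[2nd-watch-dev]
aws_access_key_id = EXAMPLEKEYDEV
aws_secret_access_key = example-secret-dev
"""


def load_profiles(credentials_text):
    """Parse INI-style credentials text into {profile name: settings}."""
    # interpolation=None keeps any '%' characters in values literal.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(credentials_text)
    return {name: dict(parser[name]) for name in parser.sections()}


if __name__ == "__main__":
    profiles = load_profiles(SAMPLE_CREDENTIALS)
    # "Switching accounts" is just choosing a different profile name.
    for name in sorted(profiles):
        print("%s -> %s" % (name, profiles[name]["aws_access_key_id"]))
```

Picking a profile by name plays the same role as the Account dropdown: the rest of the script stays the same and only the selected credentials change.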
The AWS Toolkit for Visual Studio is an extremely powerful tool. Not only is it a great tool for integrating AWS into your development environment, it can also serve as a great way to manage your resources on AWS. There are many services you can manage with the AWS Toolkits, but be warned, they don’t cover them all. For example, working with Auto Scaling groups can be done using the CLI or through the AWS console, but there is no AWS Toolkit support yet. If you are interested in AWS Toolkit for Visual Studio you can see the complete instructions here.
Overall, managing your AWS environment largely depends on how you want to interact with the AWS services. If you like the GUI feel, the console or AWS Toolkits are a great match. However, if you like text-based command-line interfaces, the AWS CLI tools and SDKs are a great way to interact with AWS. Lastly, using each tool takes time to learn, but once you find the best one for your specific needs you should experience an increase in productivity that will make life using AWS that much easier.
– Derek Baltazar, Senior Cloud Engineer
IT infrastructure is the hardware, network, services and software required for enterprise IT. It is the foundation that enables organizations to deliver IT services to their users. Disaster recovery (DR) is preparing for and recovering from natural and people-related disasters that impact IT infrastructure for critical business functions. Natural disasters include earthquakes, fires, etc. People-related disasters include human error, terrorism, etc. Business continuity differs from DR as it involves keeping all aspects of the organization functioning, not just IT infrastructure.
When planning for DR, companies must establish a recovery time objective (RTO) and recovery point objective (RPO) for each critical IT service. RTO is the acceptable amount of time in which an IT service must be restored. RPO is the acceptable amount of data loss measured in time. Companies establish both RTOs and RPOs to mitigate financial and other types of loss to the business. Companies then design and implement DR plans to effectively and efficiently recover the IT infrastructure necessary to run critical business functions.
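To make the two objectives concrete, here is a small sketch that checks a backup schedule against them. It assumes the worst-case data loss equals the time between backups and that you have an estimate of restore time; the function name and the numbers are hypothetical.

```python
from datetime import timedelta


def meets_objectives(backup_interval, estimated_restore_time, rpo, rto):
    """Compare a backup schedule and restore estimate against RPO and RTO.

    Worst case, a failure happens just before the next backup, so the data
    lost equals the full interval between backups.
    """
    return {
        "rpo_met": backup_interval <= rpo,
        "rto_met": estimated_restore_time <= rto,
    }


if __name__ == "__main__":
    # Hypothetical service: nightly backups and a six-hour restore,
    # measured against a four-hour RPO and an eight-hour RTO.
    print(meets_objectives(
        backup_interval=timedelta(hours=24),
        estimated_restore_time=timedelta(hours=6),
        rpo=timedelta(hours=4),
        rto=timedelta(hours=8),
    ))
```

In this example the restore time fits within the RTO, but nightly backups cannot satisfy a four-hour RPO, so the backup frequency would need to increase.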
For companies with corporate datacenters, the traditional approach to DR involves duplicating IT infrastructure at a secondary location to ensure available capacity in a disaster. The key downside is IT infrastructure must be bought, installed and maintained in advance to address anticipated capacity requirements. This often causes IT infrastructure in the secondary location to be over-procured and under-utilized. In contrast, Amazon Web Services (AWS) provides companies with access to enterprise-grade IT infrastructure that can be scaled up or down for DR as needed.
The four most common DR architectures on AWS are:
- Backup and Restore ($) – Companies can use their current backup software to replicate data into AWS. Companies use Amazon S3 for short-term archiving and Amazon Glacier for long-term archiving. In the event of a disaster, data can be made available on AWS infrastructure or restored from the cloud back onto an on-premise server.
- Pilot Light ($$) – While backup and restore are focused on data, pilot light includes applications. Companies only provision core infrastructure needed for critical applications. When disaster strikes, Amazon Machine Images (AMIs) and other automation services are used to quickly provision the remaining environment for production.
- Warm Standby ($$$) – Taking the Pilot Light model one step further, warm standby creates an active/passive cluster. The minimum amount of capacity is provisioned in AWS. When needed, the environment rapidly scales up to meet full production demands. Companies receive (near) 100% uptime and (near) no downtime.
- Hot Standby ($$$$) – Hot standby is an active/active cluster with both cloud and on-premise components to it. Using weighted DNS load-balancing, IT determines how much application traffic to process in-house and on AWS. If a disaster or spike in load occurs, more or all of it can be routed to AWS with auto-scaling.
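The weighted DNS split used in hot standby can be sketched with a little arithmetic: each endpoint receives a share of traffic equal to its weight divided by the sum of all weights. The endpoint names and weights below are hypothetical, and rejecting an all-zero set of weights is a choice made for this sketch rather than a description of any DNS service's behavior.

```python
def traffic_share(weights):
    """Return each endpoint's expected fraction of traffic.

    weights maps an endpoint name to its non-negative DNS record weight.
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one endpoint needs a positive weight")
    return {name: float(weight) / total for name, weight in weights.items()}


if __name__ == "__main__":
    # Normal operation: most traffic is processed in-house.
    print(traffic_share({"on_premises": 3, "aws": 1}))
    # Disaster: the on-premises weight drops to zero, so AWS takes it all.
    print(traffic_share({"on_premises": 0, "aws": 1}))
```

Shifting traffic during a disaster then comes down to changing the weights, with auto-scaling on the AWS side absorbing the extra load.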
In a non-disaster environment, warm standby DR is not scaled for full production, but is fully functional. To help absorb/justify cost, companies can use the DR site for non-production work, such as quality assurance, testing, etc. For hot standby DR, cost is determined by how much production traffic is handled by AWS in normal operation. In the recovery phase, companies only pay for the additional capacity they use, and only for the duration the DR site is at full scale. In hot standby, companies can further reduce the costs of their “always on” AWS servers with Reserved Instances (RIs).
Smart companies know disaster is not a matter of if, but when. According to a study done by the University of Oregon, every dollar spent on hazard mitigation, including DR, saves companies four dollars in recovery and response costs. In addition to cost savings, smart companies also view DR as critical to their survival. For example, 51% of companies that experienced a major data loss closed within two years (Source: Gartner), and 44% of companies that experienced a major fire never re-opened (Source: EBM). Again, disaster is not a matter of if, but when. Be ready.
-Josh Lowry, General Manager – West
In migrating customers to AWS, one of the consistent questions we are asked is, “How do we extend our backup services into the Cloud?” My answer? You don’t. This is often met with incredulous stares where the customer is wondering if I’m joking, crazy, or I just don’t understand IT. After all, backups are fundamental to data centers as well as IT systems in general, so why on Earth would I tell someone not to back up their systems?
The short answer to backups is just not to do it, honestly. The more in depth answer is, of course, more complicated than that. To be clear, I am talking about system backups; those backups typically used for bare metal restores. Backups of databases, of file services – these we’ll tackle separately. For the bulk of systems, however, we’ll leave backups as a relic of on premise data centers.
How? Why? Consider a typical three tiered architecture: web servers, application servers, and database servers. In AWS, ideally your application and web servers are stateless, auto scaled systems. With that in mind, why would you ever want to spend time, money, or resources on backing up and restoring one of these systems? The design should be set so if and when a system fails, the health check/monitoring automatically terminates the instance, which in turn automatically creates an auto scale event to launch a new instance in its place. No painfully long hours working through a restore process.
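The replace-instead-of-restore idea can be sketched as a toy reconciliation loop. This is a simplified model of the behavior described above, not the actual Auto Scaling implementation; the `reconcile` function, the instance names and the `launch` callback are all hypothetical.

```python
import itertools


def reconcile(instances, desired_capacity, launch):
    """Drop unhealthy instances and launch replacements up to capacity.

    instances maps an instance ID to True (healthy) or False (unhealthy).
    launch is a callable that starts a new instance and returns its ID.
    """
    # Unhealthy instances are terminated rather than repaired or restored.
    fleet = {iid: True for iid, healthy in instances.items() if healthy}
    # Stateless servers are interchangeable, so just launch fresh ones.
    while len(fleet) < desired_capacity:
        fleet[launch()] = True
    return fleet


if __name__ == "__main__":
    counter = itertools.count(1)
    fleet = {"web-1": True, "web-2": False, "web-3": True}
    print(reconcile(fleet, 3, lambda: "web-new-%d" % next(counter)))
```

Because no state lives on the web or application servers, nothing has to be restored; the failed instance simply disappears and a new one takes its place.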
Similarly, your database systems can work without large scale backup systems. Yes, by all means run database backups! Database backups are not for server instance failures but for database application corruption or updates/upgrade rollbacks. Unfortunately, the Cloud doesn’t magically make your databases any more immune to human error. For the database servers (assuming non-RDS), however, maintaining a snapshot of the server instance is likely good enough for backups. If and when the database server fails, the instance can be terminated and the standby system can become the live system to maintain system integrity. Launch a new database server based on the snapshot, restore the database and/or configure replication from the live system, depending on database technology, and you’re live.
So yes, in a properly configured AWS environment, the backup and restore you love to loathe from your on premise environment is a thing of the past.
-Keith Homewood, Cloud Architect
Wired Innovation Insights published a blog article written by our own Chris Nolan yesterday. Chris discusses ways you can save money on your AWS cloud deployment in “How to Manage Your Amazon Cloud Deployment to Save Money.” Chris’ top tips include:
- Use CloudFormation or another configuration and orchestration tool.
- Watch out for cloud sprawl.
- Use AWS auto scaling.
- Turn the lights off when you leave the room.
- Use tools to monitor spend.
- Build in redundancy.
- Planning saves money.