Earlier this month, AWS announced that it was integrating its Route 53 DNS service with its CloudWatch management tools. That’s a fantastic development for folks managing EC2 architectures who want monitoring and alerting to accompany their DNS management.
If you’re unfamiliar with DNS, an over-simplified analogy is a phone book. Just as a phone book translates a person’s or business’s name into a phone number, DNS translates Internet domain names (e.g. a website address like www.2ndwatch.com) into IP addresses (184.108.40.206). There’s a lot more to DNS than that, but it’s outside the scope of this article, and there are some very good books on the subject if you’re interested in learning more. For those unfamiliar with Route 53, it is Amazon’s in-house DNS service, designed with high availability, scalability, automation, and ease of use in mind. It can be a great tool for businesses with agile cloud services and basic (or even large-scale) DNS needs. Route 53 lets you build and easily manage your public DNS records, as well as your private internal VPC DNS records if you so choose. In addition to the web-based AWS Management Console, Route 53 has its own API, so you can fully manage your DNS zones and records programmatically by integrating with your existing and future applications/tools. Here is a link to the Route 53 developer tools. There are also a number of free tools that others have written to leverage the Route 53 API. One of the more useful ones I’ve seen is cli53, a tool written in Python using the Boto library. Here’s a good article on it and a link to the GitHub site where you can download the latest version.
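To make that concrete, here’s a minimal sketch of what talking to the Route 53 API from Python can look like. The article mentions the original Boto library; this sketch uses boto3, its modern successor, and the zone data shown is a made-up example, not output from a real account.

```python
def zone_summary(zones):
    """Reduce a list_hosted_zones() response to {zone name: record count}."""
    return {z["Name"]: z["ResourceRecordSetCount"] for z in zones}

def list_zones():
    # Requires AWS credentials; defined here as a sketch but not executed.
    import boto3
    r53 = boto3.client("route53")
    return r53.list_hosted_zones()["HostedZones"]

# Abbreviated shape of the data list_hosted_zones() returns:
sample = [{"Id": "/hostedzone/Z123EXAMPLE", "Name": "2ndwatch.com.",
           "ResourceRecordSetCount": 12}]
print(zone_summary(sample))  # {'2ndwatch.com.': 12}
```

With credentials configured, `list_zones()` gives you the same inventory you’d see under “Hosted Zones” in the console.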
Route 53 is typically operated through the AWS Management Console. Within the management console navigation you can see two main sections: “Hosted Zones” and “Health Checks”. DNS records managed with Route 53 are organized into the “Hosted Zones” section, while the “Health Checks” section houses simple checks that monitor the health of endpoints used in a dynamic routing policy. A hosted zone is simply a DNS zone where you can store the DNS records for a particular domain name. Upon creation of a hosted zone, Route 53 assigns four AWS nameservers as the zone’s “Delegation Set” and the zone apex NS records. At that point you can create, delete, or edit records and begin managing your zone. Bringing the zone into the real world is simply a matter of updating the nameservers registered with your domain’s registrar to the four nameservers assigned to your zone’s Delegation Set.
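The same zone-creation flow can be done through the API. A sketch using boto3 (the domain name is hypothetical; the call itself needs real AWS credentials, so only the small name-normalizing helper runs here):

```python
def fqdn(domain):
    """Route 53 stores zone and record names fully qualified (trailing dot)."""
    return domain if domain.endswith(".") else domain + "."

def create_zone(domain):
    # Requires AWS credentials; shown as a sketch only.
    import uuid
    import boto3
    r53 = boto3.client("route53")
    resp = r53.create_hosted_zone(
        Name=fqdn(domain),
        CallerReference=str(uuid.uuid4()),  # must be unique per request
    )
    # The four assigned nameservers -- the zone's Delegation Set --
    # are what you hand to your registrar:
    return resp["DelegationSet"]["NameServers"]

print(fqdn("example.com"))  # example.com.
```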
If you’ve already got a web site or a zone you are hosting outside of AWS, you can easily import your existing DNS zone(s) and records into Route 53. You can do this either manually (using the AWS Management Console, which works OK if you only have a handful of records) or by using a command-line tool like cli53. Another option is to use Python and the Boto library to access the Route 53 API directly. If the scripting and automation pieces are out of your wheelhouse and you have a bunch of records you need to migrate, don’t worry: 2nd Watch has experienced engineers who can assist you with just this sort of thing! Once you have your zones and records imported into Route 53, all that’s left to do is update your master nameservers with your domain registrar. The master nameservers are the ones assigned as the “Delegation Set” when you create your zone in Route 53; these will now be the authoritative nameservers for your zones.
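If you go the API route, record imports boil down to building a “ChangeBatch” and submitting it. Here’s a hedged sketch in Python/boto3 (the record names and IPs are illustrative; only the batch-building helper runs here, since the actual submission needs credentials and a real hosted zone ID):

```python
def upsert_batch(records):
    """Build a Route 53 ChangeBatch that creates/overwrites simple records.
    `records`: iterable of (name, record_type, ttl, list_of_values)."""
    return {"Changes": [
        {"Action": "UPSERT",
         "ResourceRecordSet": {
             "Name": name,
             "Type": rtype,
             "TTL": ttl,
             "ResourceRecords": [{"Value": v} for v in values]}}
        for name, rtype, ttl, values in records]}

batch = upsert_batch([
    ("www.example.com.", "A", 300, ["192.0.2.10"]),
    ("example.com.", "MX", 3600, ["10 mail.example.com."]),
])
print(len(batch["Changes"]))  # 2

# Submitting it (requires credentials and a real zone ID):
#   boto3.client("route53").change_resource_record_sets(
#       HostedZoneId="Z123EXAMPLE", ChangeBatch=batch)
```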
Route 53 supports the standard DNS record types: A, AAAA, CNAME, MX, NS, PTR, SOA, SPF, SRV, and TXT. In addition, Route 53 includes a special type of resource record called an “alias,” which is an extension to standard DNS functionality. An alias record lives within a Route 53 hosted zone and is similar to a CNAME, but with some important differences. It maps to other AWS resources like CloudFront distributions, ELB load balancers, S3 buckets with static web hosting enabled, or another Route 53 resource record in the same hosted zone. Aliases differ from CNAME records in that they are not visible to resolvers; resolvers only see the A record and the resulting IP address of the target. An alias record can also be created at the zone apex, which standard CNAME records don’t support. This has the advantage of completely masking the somewhat cryptic DNS names associated with CloudFront, S3, ELB, and other resources, allowing you to disguise the fact that you’re utilizing a specific AWS service or resource if you so desire.
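The shape of an alias record makes those differences visible: there’s no TTL of its own and no CNAME target, just an AliasTarget block. A sketch (the ELB DNS name and hosted zone ID below are hypothetical example values, not real resources):

```python
def alias_record(name, target_dns, target_zone_id):
    """An alias resource record set; note there's no TTL and no CNAME --
    resolvers just see the resulting A record."""
    return {"Name": name,
            "Type": "A",
            "AliasTarget": {
                "HostedZoneId": target_zone_id,  # zone ID of the ELB/CloudFront/S3 target
                "DNSName": target_dns,
                "EvaluateTargetHealth": False}}

# Pointing the zone apex at a hypothetical ELB -- something a plain
# CNAME could not do:
rec = alias_record("example.com.",
                   "my-elb-1234567890.us-east-1.elb.amazonaws.com.",
                   "ZEXAMPLEELB")
print("TTL" in rec)  # False -- alias records carry no TTL of their own
```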
One of the more useful features in Route 53 is its support for policy-based routing. You can configure it to answer DNS queries with specific IPs from a group based on the following policies:
- Latency – routing traffic to the region with the lowest latency relative to the client
- Failover – if an IP in the group fails its health check, Route 53 stops answering queries with that IP
- Weighted – using specific ratios to direct more or less traffic to certain IPs in the group
- Simple – standard round-robin DNS, which distributes traffic evenly across all IPs in the group
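As an example of what one of these policies looks like on the wire, weighted routing is just a set of record sets sharing a name, each with a SetIdentifier and a Weight. A sketch (names, IPs, and weights are made up; these record sets would be submitted via `change_resource_record_sets` as shown earlier):

```python
def weighted_records(name, ttl, weighted_ips):
    """One record set per IP; Route 53 answers in proportion to Weight.
    `weighted_ips`: iterable of (set_identifier, ip, weight)."""
    return [{"Name": name,
             "Type": "A",
             "TTL": ttl,
             "SetIdentifier": ident,  # distinguishes members of the group
             "Weight": weight,
             "ResourceRecords": [{"Value": ip}]}
            for ident, ip, weight in weighted_ips]

# Send roughly 75% of traffic to the first IP, 25% to the second:
group = weighted_records("www.example.com.", 60,
                         [("primary", "192.0.2.10", 75),
                          ("secondary", "192.0.2.20", 25)])
print(sum(r["Weight"] for r in group))  # 100
```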
*NOTE: It is important to keep TTL in mind when using routing policies with health checks, as a lower TTL makes your applications more responsive to dynamic changes when a failure is detected. This is especially important if you are using the Failover policy as a traffic switch for HA purposes.
As mentioned at the beginning of this article, Amazon recently announced that it has added Route 53 support to CloudWatch. This means you can use CloudWatch to monitor your Route 53 zones, check health, set threshold alarms and trigger events based on health data returned. Any Route 53 health check you configure gets turned into a CloudWatch metric, so it’s constantly available in the CloudWatch management console and viewable in a graph view as well as the raw metrics.
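Because each health check surfaces as the HealthCheckStatus metric (1 when healthy, 0 when failing) in the AWS/Route53 namespace, you can alarm on it like any other CloudWatch metric. A hedged sketch of the alarm parameters in Python/boto3 (the health check ID and SNS topic ARN are hypothetical placeholders):

```python
def health_alarm(check_id, sns_topic_arn):
    """CloudWatch alarm keyword arguments for a Route 53 health check.
    HealthCheckStatus reports 1 while healthy and 0 while failing."""
    return dict(
        AlarmName="route53-health-" + check_id,
        Namespace="AWS/Route53",
        MetricName="HealthCheckStatus",
        Dimensions=[{"Name": "HealthCheckId", "Value": check_id}],
        Statistic="Minimum",
        ComparisonOperator="LessThanThreshold",
        Threshold=1.0,
        Period=60,
        EvaluationPeriods=1,
        AlarmActions=[sns_topic_arn])

alarm = health_alarm("abcd-1234", "arn:aws:sns:us-east-1:123456789012:ops")
print(alarm["MetricName"])  # HealthCheckStatus

# Creating the alarm (requires credentials):
#   boto3.client("cloudwatch").put_metric_alarm(**alarm)
```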
If you’re running your web site off of an EC2 server farm or are planning on making a foray into AWS you should definitely look into both Route 53 and CloudWatch. This combination not only helps with initial DNS configuration, but the CloudWatch integration now makes it especially useful for monitoring and acting upon events in an automated fashion. Check it out.
-Ryan Kennedy, Senior Cloud Engineer
There is an endless supply of articles talking about “the dangers of the hidden costs of cloud computing”. Week after week a new article from a new source highlights (in the same way) how the movement to cloud won’t help the bottom line of a business because the “true costs” are not fully considered by most until it’s “too late”. Too late for what? These articles are an empty defensive move against the inevitable movement our industry is experiencing toward cloud. Now, to be fair…are some things overlooked by folks? Yes. Do some people jump in too quickly and start deploying before they plan properly? Yes. Is cloud still emerging/evolving, with architecture, deployment and cost models shifting on a quarterly (if not monthly) basis? Yes. But this is what makes cloud so exciting. It’s a chance for us to rethink how we leverage technology, and isn’t that what we’ve done for years in IT? Nobody talks about the hidden savings of cloud, nor do they talk about the unspoken costs of status quo infrastructure.
Before jumping into an organization that was cloud-first, I worked for 13 years, in many roles, at an infrastructure/data center-first organization, and we did very well and helped many people. However, as the years progressed and as cloud went from a gimmick to a fad to a buzzword to now a completely mainstream and enterprise IT computing platform, I saw a pattern developing in that traditional IT data center projects were costing more and more whereas cloud was looking like it cost less. I’ll give you an unnamed customer example.
Four years ago a customer of mine who was growing their virtual infrastructure (VMware) and their storage infrastructure (EMC) deployed a full data center solution of compute, storage and virtualization that cost in the $4 million range. From then until now they added some additional capacity for about another $500K. They also went through a virtual infrastructure platform (software) upgrade as well as software upgrades to the storage and compute platforms. So this is the usual story…they made a large purchase (actually it was an operational lease, ironically much like cloud could be), then added to it, and spent a ton of time and man hours on engineering around the infrastructure just to maintain the status quo. I can quantify the infrastructure but not the man hours, but I’m sure you know what I’m talking about.
Four years later, guess what’s happening – they have to go through it all over again! They need to refresh their SAN and basically redeploy everything – migrate all the data off, validate, etc. And how much is all of this? $6 to $7 million, plus a massive amount of services and about 4 months of project execution. To be fair, they grew over 100%, made some acquisitions, and some of their workloads have to stay within their own data center. However, there are hidden costs here in my opinion. Technology manufacturers have gotten customers into this cycle of doing a refresh every 3 years. How? They bake the support (3 years’ worth) into the initial purchase so there is no operational expense. Then after 3 years, maintenance kicks in and becomes very expensive, and they just run a spreadsheet showing how, if they refresh, they avoid “x” dollars in maintenance and how it’s worth it to just get new technology. Somehow that approach still works. There are massive amounts of professional services to execute the migration, a multi-month disruption to the business, and no innovation from the IT department – it’s maintaining the status quo. The only reduction that can be realized in this regard is hardware and software price decreases over time, which are historically based on Moore’s law. Do you want your IT budget and staff at the mercy of Moore’s law and technology manufacturers that use funky accounting to show you “savings”?
Now let’s look at the other side, and let’s be fair. In cloud there can be hidden costs, but in my mind they exist only if you do one thing: forget about planning. Even with cloud you need to apply the same plan, design, build, migrate, and manage methodology to your IT infrastructure. Just because cloud is easy to deploy doesn’t mean you should forget the steps you normally take. But that isn’t a problem with cloud; it’s a problem with how people deploy into the cloud, and that’s an easy fix. If you control your methodology, there should be no hidden costs because you properly planned, architected and built your cloud infrastructure. In theory this is true, but let’s look at the other side people fail to highlight…the hidden SAVINGS!!
With Amazon Web Services there have been 37 price reductions in the 7 years they have been selling their cloud platform. That’s a little more than 5 per year. Do you get that on an ongoing basis after you spend $5 million on traditional infrastructure? With this approach, once you sign up you are almost guaranteed to see a price cut at some point in the lifecycle of your cloud infrastructure, and those price reductions are not based on Moore’s law. They are based on AWS taking very much the same approach to their cloud as they do to their retail business: Amazon wants to extend the value of its size and scale to customers, and it sets margin limits on its services. Once they are “making too much” on a service or product, they cut the cost. So as they grow, become more efficient and gain more market share with their cloud business, you save more!
Another bonus is that there are no refresh cycles or migration efforts every 3 years. Once you migrate to the cloud, AWS manages all the infrastructure refresh efforts. You don’t have to worry about your storage platform or your virtual infrastructure. Everything from the hypervisor down is AWS’s responsibility, and you manage your operating system and application. What does that mean? You are not paying for massive services engagements every 3-4 years for a 3rd party to help you design/build/migrate your stuff, and you aren’t spending 3-4 months every few years on a disruption to your business while your staff isn’t innovating.
-David Stewart, Solutions Architect
Not long ago, 2nd Watch published an article on Amazon Glacier. In it Caleb provides a great primer on the capabilities of Glacier and the cost benefits. Now that he’s taken the time to explain what it is, let’s talk about possible use cases for Glacier and how to avoid some of the pitfalls. As Amazon says, “Amazon Glacier is optimized for data that is infrequently accessed and for which retrieval times of several hours are suitable.” What immediately comes to mind are backups, but most AWS customers do this through EBS snapshots, which can restore in minutes, while a Glacier recall can take hours. Rather than looking at the obvious, consider these use cases for Glacier Archival storage: compliance (regulatory or internal process), conversion of paper archives, and application retirement.
Compliance often forces organizations to retain records and backups for years; customers often mention a seven-year retention policy based on regulatory compliance. In seven years, a traditional (on-premises) server can be replaced at least once, operating systems are upgraded several times, applications are upgraded or modified, and backup hardware/software is changed. Add to that all the media that would need to be replaced or upgraded, and you have every IT department’s nightmare – needing to either maintain old tape hardware or convert all the old backup tapes to the new hardware format (and hope too many haven’t degraded over the years). Glacier removes the need to worry about the hardware and the media, and the storage fees (currently 1¢ per GB/month in US-East) are tiny compared to the cost of media and storage on-premises. Upload your backup file(s) to S3, set up a lifecycle policy, and you have greatly simplified your archival process while maintaining regulatory compliance.
So how do customers create these lifecycle policies so their data automatically moves to Glacier? From the AWS Management Console, once you have an S3 bucket there is a Property called ‘Lifecycle’ that can manage the migration to Glacier (and possible deletion as well). Add a rule (or rules) to the S3 bucket that can migrate files based on a filename prefix, how long since their creation date, or how long from an effective date (perhaps 1 day from the current date for things you want to move directly to Glacier). For the example above, perhaps customers take backup files, move them to S3, then have them move to Glacier after 30 days and delete after 7 years.
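The same lifecycle rule can be set programmatically. A sketch of the backup example above in Python/boto3 (bucket name and prefix are hypothetical; 2555 days approximates seven years; only the config-building helper runs here, since applying it needs credentials and a real bucket):

```python
def glacier_lifecycle(prefix, to_glacier_days=30, expire_days=2555):
    """S3 lifecycle configuration: transition matching objects to Glacier
    after 30 days and delete them after ~7 years (2555 days)."""
    return {"Rules": [{
        "ID": "archive-" + prefix,
        "Status": "Enabled",
        "Filter": {"Prefix": prefix},
        "Transitions": [{"Days": to_glacier_days,
                         "StorageClass": "GLACIER"}],
        "Expiration": {"Days": expire_days}}]}

cfg = glacier_lifecycle("backups/")
print(cfg["Rules"][0]["Transitions"][0]["StorageClass"])  # GLACIER

# Applying it (requires credentials and an existing bucket):
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="my-backup-bucket", LifecycleConfiguration=cfg)
```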
Before we go too far and set up lifecycles, however, one major point should be highlighted: Amazon charges customers based on GB/month stored in Glacier and a one-time fee for each file moved from S3 to Glacier. Moving a terabyte of data from S3 to Glacier could cost little more than $10/month in storage fees; however, if that data is made up of 1KB log files, the one-time fee for that migration can be more than $50,000! While this is an extreme example, consider data management before archiving. If at all possible, compress the files into a single file (zip/tar/rar), upload those compressed files to S3, and then archive to Glacier.
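The arithmetic behind that warning is worth seeing spelled out. This back-of-the-envelope check uses the prices quoted in this article ($0.01 per GB/month storage, $0.05 per 1,000 requests); actual AWS pricing changes over time:

```python
# Prices as quoted in this article (historical; check current pricing).
STORAGE_PER_GB_MONTH = 0.01
PER_1000_REQUESTS = 0.05

tib_in_gb = 1024
files_of_1kb = 1024 ** 3          # 1 KiB files in 1 TiB: ~1.07 billion

storage_monthly = tib_in_gb * STORAGE_PER_GB_MONTH
transition_fee = files_of_1kb / 1000 * PER_1000_REQUESTS

print(round(storage_monthly, 2))  # 10.24  -> "little more than $10/month"
print(round(transition_fee))      # 53687  -> "more than $50,000"
```

One compressed archive instead of a billion tiny files turns that one-time fee into a rounding error, which is exactly why the compress-first advice matters.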
-Keith Homewood, Cloud Architect
According to IDC, a typical server utilizes an average of 15% of its capacity. That means 85% of a company’s capital investment can be categorized as waste. While virtualization can increase server capacity to as high as 80%, the company is still faced with 20% waste under the best case scenario. The situation gets worse when companies have to forecast demand for specific periods; e.g., the holiday season in December. If they buy too much capacity, they overspend and create waste. If they buy too little, they create customer experience and satisfaction issues.
The elasticity of Amazon Web Services (AWS) removes the need to forecast demand and buy capacity up-front. Companies can scale their infrastructure up and down as needed to match capacity to demand. Common use cases include: a) fast growth (new projects, startups); b) on and off (occasionally need capacity); c) predictable peaks (specific demand at specific times); and d) unpredictable peaks (demand exceeds capacity). Use the elasticity of AWS to eliminate waste and reduce costs over traditional IT, while providing better experiences and performance to users.
-Josh Lowry, General Manager Western U.S.
A lot of companies have been dipping their toes in the proverbial cloud waters for some time, looking at ways to help their businesses be more efficient, agile and innovative. There have been a lot of articles published recently about cloud being overhyped, cloud being the new buzzword for everything IT, or cloud being just a fad. The bottom line is that cloud is enabling a completely new way to conduct business, one that isn’t constrained by the old model but driven by a completely new business paradigm, and it should not be overlooked but leveraged, and leveraged immediately.
- Cyclical Business Demand – We’ve been helping customers architect, build, deploy and manage environments for unpredictable or spiky demand. This has become more prevalent with the proliferation of mobile devices and social media outlets, where you never know when the next surge in traffic will come.
- Datacenter Transformation – Helping customers figure out what can move to the public cloud and what should stay on premise is a typical engagement for us. As the continued migration from on premise technology to cloud computing accelerates, these approaches and best practices are helping customers not just optimize what they have today but also ease the burden of trying to make an all or nothing decision.
- Financial Optimization – Designing a way to help customers understand their cloud finances and then giving them the ability to create financial models for internal chargebacks and billing can sometimes be overlooked upfront. We’ve developed solutions to help customers do both where customers are seeing significant cost savings.
If you haven’t heard of Amazon Glacier, you need to check it out. As its name implies, you can think of Glacier as “frozen” storage. When considering the speed of EBS and S3, Glacier by comparison moves glacially slow. Consider Glacier as essentially a cloud-based archival solution that works similarly to old-style tape backup. In the past, backups first ran to tape, then were stored locally in case of immediate access requirements, and were then taken off-site once a certain date requirement was met (once a week, once a month, etc.). Glacier essentially works as the last stage of that process.
When a snapshot in S3, for instance, gets to be a month old, you can instruct AWS to automatically move that object to Glacier. Writing it to Glacier happens pretty much immediately, though being able to see that object in your Glacier management console can take 3-5 hours. If you need it back, you’ll issue a request, but that can take up to 24 hours to be resolved. Amazon hasn’t released the exact mechanics of how they’re storing the data on their end, but large tape libraries are a good bet since they jibe with one of Glacier’s best features: its price. That’s only $0.01 per gigabyte per month. Its second best feature is 11 nines of “durability” (which refers to data loss) and 4 nines of “reliability” (which refers to data availability). That’s 99.999999999% for those who like the visual.
Configuring Glacier, while a straightforward process, will require some technical savvy on your part. Amazon has done a nice job of representing how Glacier works in an illustration:
As you can see, the first step is to download the Glacier software development kit (SDK), which is available for Java or .NET. Once you’ve got that, you’ll need to create your vault. This is an easy step that starts with accessing your Glacier management console, selecting your service region (Glacier is automatically redundant across availability zones in your region, which is part of the reason for its high durability rating), naming your vault, and hitting the create button. I’m using the sandbox environment that comes with your AWS account to take these screenshots, so the region is pre-selected. In a live environment, this would be a drop-down menu providing you with region options.
The vault is where you’ll store your objects, each of which equates to a single file, like a document or a photo. But instead of proceeding directly to vault creation from the screen above, be sure to set up your vault’s Amazon Simple Notification Service (SNS) parameters.
Notifications can be created for a variety of operations and delivered to systems managers or applications using whatever protocol you need (HTTP for a homegrown web application or email for your sys admin, for example). Once you create the vault from the notifications screen, you’re in your basic Glacier management console:
Uploading and downloading documents is where it gets technical. Currently, the web-based console above doesn’t have tools for managing archive operations like you’d find with S3. Uploading, downloading, deleting or any other operation will require programming in whichever language for which you’ve downloaded the SDK. You can use the AWS Identity and Access Management (IAM) service to attach user permissions to vaults and manage billing through your Account interface, but everything else happens at the code level. However, there are third-party Glacier consoles out there that can handle much of the development stuff in the background while presenting you with a much simpler management interface, such as CloudBerry Explorer 3.6. We’re not going to run through code samples here, but Amazon has plenty of resources for this off its Sample Code & Libraries site.
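Although the official SDKs are Java and .NET, the same operations are exposed in other languages too; here’s a hedged sketch of an archive upload in Python/boto3 (vault name and file path are hypothetical, and the upload function itself isn’t executed since it needs credentials and a real vault):

```python
MAX_ARCHIVE_BYTES = 40 * 1024 ** 4   # 40 TB per-archive ceiling
MIN_ARCHIVE_BYTES = 1                # archives can't be empty

def valid_archive_size(n_bytes):
    """Glacier archives must be between 1 byte and 40 terabytes."""
    return MIN_ARCHIVE_BYTES <= n_bytes <= MAX_ARCHIVE_BYTES

def upload(vault_name, path):
    # Requires AWS credentials; sketch only. boto3 computes the
    # required tree-hash checksum for you.
    import boto3
    glacier = boto3.client("glacier")
    with open(path, "rb") as f:
        resp = glacier.upload_archive(
            vaultName=vault_name,
            archiveDescription=path,
            body=f)
    return resp["archiveId"]  # keep this -- retrieval requests need it

print(valid_archive_size(0), valid_archive_size(1024))  # False True
```

Note that Glacier returns an archive ID rather than letting you browse by filename, which is why tracking those IDs (or using a third-party console) matters for later retrieval.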
On the upside, while programming for Glacier operations is difficult for non-programmers, if you’ve got the skills, it provides a lot of flexibility in designing your own archive and backup processes. You can assign vaults to any of the various backup operations being run by your business and define your own archive schedules. Essentially, that means you can configure a hierarchical storage management (HSM) architecture that natively incorporates AWS.
For example, imagine a typical server farm running in EC2. At the first tier, it’s using EBS for immediate, current data transactions, similar to a hard disk or SAN LUN. When files in your EBS store have been unused for a period of time or if you’ve scheduled them to move at a recurring time (like with server snapshots), those files can be automatically moved to S3. Access between your EC2 servers and S3 isn’t quite as fast as EBS, but it’s still a nearline return on data requests. Once those files have lived on S3 for a time, you can give them a time to live (TTL) parameter after which they are automatically archived on Glacier. It’ll take some programming work, but unlike with standard on-premises archival solutions, which are usually based on a proprietary architecture, using Java or .NET means you can configure your storage management any way you like – for different geographic locations, different departments, different applications, or even different kinds of data.
And this kind of HSM design doesn’t have to be entirely cloud-based. Glacier works just as well with on-premises data, applications, or server management. There is no minimum or maximum amount of data you can archive with Glacier, though individual archives can’t be less than 1 byte or larger than 40 terabytes. To help you observe regulatory compliance issues, Glacier uses secure protocols for data transfer and encrypts all data on the server side using key management and 256-bit encryption.
Pricing is extremely low and simple to calculate. Data stored in Glacier is $0.01 per gigabyte. Upload and retrieval operations run only $0.05 per 1000 requests, and there is a pro-rated charge of $0.03 per gigabyte if you delete objects prior to 90 days of storage. Like everything else in AWS, Glacier is a powerful solution that provides highly customizable functionality for which you only pay for what you use. This service is definitely worth a very close look.
-Caleb Carter, Solutions Architect