Batch computing isn’t necessarily the most difficult thing to design a solution around, but there are a lot of moving parts to manage, and building in elasticity to handle fluctuations in demand certainly cranks up the complexity. It might not be particularly exciting, but it is one of those things that almost every business has to deal with in some form or another.
The on-demand and ephemeral nature of the Cloud makes batch computing a pretty logical use of the technology, but how do you best architect a solution that will take care of this? Thankfully, AWS has a number of services geared toward just that. Amazon SQS (Simple Queue Service) and SWF (Simple Workflow Service) are both very good tools for managing batch processing jobs in the Cloud. Elastic Transcoder is another tool, geared specifically toward transcoding media files. If your workload leans more toward analytics and processing petabyte-scale big data, then tools like EMR (Elastic MapReduce) and Kinesis could be right up your alley (we’ll cover those in another blog). In addition to not having to manage any of the infrastructure these services ride on, you also benefit from streamlined integration with other AWS services like IAM for access control, S3, SNS, DynamoDB, etc.
For this article, we’re going to take a closer look at using SQS and SWF to handle typical batch computing demands.
Simple Queue Service (SQS), as the name suggests, is relatively simple. It provides a queuing system that allows you to reliably populate and consume queues of data. Queued items in SQS are called messages and can contain string, number, or binary values. Messages are variable in size but can be no larger than 256KB (at the time of this writing). If you need to queue data larger than 256KB, the best practice is to store the data elsewhere (e.g. S3, DynamoDB, Redis, MySQL) and use the message body as a pointer to the actual data. Messages are stored redundantly by the SQS service, providing fault tolerance and guaranteed delivery. SQS doesn’t guarantee delivery order, or that a message will be delivered only once. That could be problematic, except that SQS provides something called a Visibility Timeout, which ensures that once a message has been retrieved it will not be resent for a given period of time. You (well, your application really) have to tell SQS when you have consumed a message by issuing a delete on that message. The important thing is to do this within the Visibility Timeout; otherwise you may end up processing single messages multiple times. The reason SQS doesn’t simply delete a message once it has been read from the queue is that SQS has no visibility into your application and whether the message was actually processed completely, or even just successfully read for that matter.
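To make the receive/delete cycle and Visibility Timeout concrete, here is a toy in-memory queue that only mimics the SQS semantics described above (it is not the AWS API; a real application would talk to SQS through an SDK):

```python
import time
import uuid

class ToyQueue:
    """In-memory sketch of SQS-style semantics: receiving a message hides
    it for `visibility_timeout` seconds; only an explicit delete removes it."""

    def __init__(self, visibility_timeout=30):
        self.visibility_timeout = visibility_timeout
        self.messages = {}          # message_id -> body
        self.invisible_until = {}   # message_id -> timestamp when visible again

    def send(self, body):
        message_id = str(uuid.uuid4())
        self.messages[message_id] = body
        return message_id

    def receive(self, now=None):
        now = time.time() if now is None else now
        for message_id, body in self.messages.items():
            if self.invisible_until.get(message_id, 0) <= now:
                # Hide the message rather than deleting it -- if the
                # consumer crashes, the message becomes visible again.
                self.invisible_until[message_id] = now + self.visibility_timeout
                return message_id, body
        return None

    def delete(self, message_id):
        # The consumer must call this within the visibility timeout,
        # or the message will be redelivered to another receive call.
        self.messages.pop(message_id, None)
        self.invisible_until.pop(message_id, None)
```

Passing `now` explicitly makes the redelivery behavior easy to demonstrate: a message received at time 0 with a 30-second timeout is hidden at time 10 but delivered again at time 31 if it was never deleted.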
Where SQS is designed to be data-centric and remove the burden of managing a queuing application and infrastructure, Simple Workflow Service (SWF) takes it a step further and allows you to better manage the entire workflow around the data. While SWF implies simplicity in its name, it is a bit more complex than SQS (though that added complexity buys you a lot). With SQS you are responsible for managing the state of your workflow and processing of the messages in the queue, but with SWF, the workflow state and much of its management are abstracted away from the infrastructure and application you have to manage. The initiators, workers, and deciders have to interface with the SWF API to trigger state changes, but the state and logical flow are all stored and managed on the backend by SWF. SWF is quite flexible, too, in that you can use it to work with AWS infrastructure, other public and private cloud providers, or even traditional on-premises infrastructure. SWF supports both sequential and parallel processing of workflow tasks.
Note: if you are familiar with or are already using JMS (Java Message Service), you may be interested to know that SQS provides a JMS interface through its Java messaging library.
One major thing SWF buys you over SQS is that the execution state of the entire workflow is stored by SWF, separate from the initiators, workers, and deciders. Not only do you not have to concern yourself with maintaining the workflow execution state, it is completely abstracted away from your infrastructure. This makes the SWF architecture highly scalable in nature and inherently very fault-tolerant.
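A decider's job can be sketched in miniature: given the history of completed activities (which SWF stores for you), decide what to schedule next. The three-step sequential workflow below is a hypothetical illustration of the pattern, not an SWF API call:

```python
# Hypothetical sequential workflow. The step list is the decider's only
# logic; the execution history itself would live in SWF, not in a worker.
WORKFLOW_STEPS = ["validate_order", "charge_payment", "ship_order"]

def decide(completed_activities):
    """Return the next activity to schedule, or None when the workflow
    is complete. `completed_activities` stands in for the event history
    SWF would hand a real decider when it polls for a decision task."""
    for step in WORKFLOW_STEPS:
        if step not in completed_activities:
            return step
    return None  # all steps done -> the decider would close the execution
```

Because the decider is a pure function of the history, any number of decider processes can poll for decision tasks without coordinating with each other, which is where the scalability and fault tolerance come from.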
There are a number of good SWF examples and use-cases available on the web. The SWF Developer Guide uses a classic e-commerce customer order workflow (i.e. place order, process payment, ship order, record completed order). The SWF console also has a built-in demo workflow that processes an image and converts it to either grayscale or sepia (requires AWS account login). Either of these is a good example to walk through to gain a better understanding of how SWF is designed to work.
Contact 2nd Watch today to get started with your batch computing workloads in the cloud.
-Ryan Kennedy, Sr. Cloud Architect
If you’re an old hand in IT (that is, someone with at least 5 years under your belt), making a switch to the cloud could be the best decision you’ll ever make. The question is, do you have what it takes?
Job openings in the cloud computing industry are everywhere – Amazon alone lists 17 different categories of positions related to AWS on its website, and by one estimate there are nearly four million cloud computing jobs just in the United States. But cloud experts, especially architects, are a rare breed. Cloud is a relatively new platform, which limits the number of qualified personnel. Finding people with the right skillsets can be hard.
Cloud architects are experts in designing, running and managing applications in the cloud, leading DevOps teams and working with multiple cloud vendors. A senior cloud architect oversees the broad strategy of a company’s investment in the cloud, and is responsible for managing and delivering ROI from cloud investments and continually aligning with business objectives.
Yet being a cloud architect is not simply about understanding new technologies and features as they come off the conveyor belt. Beyond dealing with rapid technological change, you’ve got to have some creativity and business acumen. If you are fiercely independent and don’t enjoy a little schmoozing with business colleagues and chatting up vendors, this probably is not a good career choice for you. If you don’t like things changing frequently and problem-solving, you may suffer from recurring anxiety attacks.
In talking to customers, we’ve come up with a list of the top non-techie skills that every cloud architect should have. Here are the top 10:
- Strategic problem-solving skills
- Security & compliance experience
- Ability to balance trade-offs with agility
- Business and accounting knowledge
- Customer experience focus
- Deploy & destroy mentality
- Adept negotiation and communications skills
- Ability to solve problems with an eye for the future
- Understanding of platform integrations
- Ability to evolve with the business
In short, cloud architects are like great companions: once you have one, hold on and never let him or her go. Check out the infographic for a complete mapping of the perfect cloud architect!
-Jeff Aden, EVP Business Development & Marketing
With the New Year come the resolutions. When the clock struck midnight on January 1st, 2015, many people turned the page on 2014 and made a promise to do an act of self-improvement. Often it’s eating healthier or going to the gym more regularly. With the New Year, I thought I could put a spin on a typical New Year’s Resolution and make it about AWS.
How could you improve on your AWS environment? Without getting too overzealous, let’s focus on the fundamental AWS network infrastructure, specifically an AWS Virtual Private Cloud (VPC). An AWS VPC is a logically isolated, user controlled, piece of the AWS Cloud where you can launch and use other AWS resources. You can think of it as your own slice of AWS network infrastructure that you can fully customize and tailor to your needs. So let’s talk about VPCs and how you can improve on yours.
- Make sure you’re using VPCs! The simple act of implementing a VPC can put you way ahead of the game. VPCs provide a ton of customization options from defining your own VPC size via IP addressing; to controlling subnets, route tables, and gateways for controlling network flow between your resources; to even defining fine-grained security using security groups and network ACLs. With VPCs you can control things that simply can’t be done when using EC2-Classic.
- Are you using multiple Availability Zones (AZs)? An AZ is a distinct, isolated location engineered to be insulated from failures in other AZs. Make sure you take advantage of multiple AZs within your VPC. Often, instances are just launched into a VPC with no rhyme or reason. It is great practice to use the low-latency nature and engineered isolation of AZs to facilitate high availability or disaster recovery scenarios.
- Are you using VPC security groups? “Of course I am.” Are you using network ACLs? “I know they are available, but I don’t use them.” Are you using AWS Identity and Access Management (IAM) to secure access to your VPCs? “Huh, what’s an IAM?!” Don’t fret, most environments don’t take advantage of all the tools available for securing a VPC; however, now is the time to reevaluate your VPC and see if you can, or even should, use these security options. Security groups are ingress and egress firewall rules you place on individual AWS resources in your VPC and one of the fundamental building blocks of an environment. Now may be a good time to audit your security groups to make sure you’re following the principle of least privilege: not allowing any access or rules that are not absolutely needed. Network ACLs work at the subnet level and may be useful in some cases. In larger environments IAM may be a good idea if you want more control over how resources interact with your VPC. In any case, there is never a bad time to reevaluate the security of your environment, particularly your VPC.
- Clean up your VPC! One of the most common issues in AWS environments is resources that are not being used. Now may be a good time to audit your VPC, take note of what instances you have out there, and make sure you don’t have resources racking up unnecessary charges. It’s a good idea to account for all instances, leftover EBS volumes, and even old AMIs that may be sitting in your account. There are also things like extra EIPs, security groups, and subnets that can be cleaned up. One great tool to use is AWS Trusted Advisor. Per the AWS service page, “Trusted Advisor inspects your AWS environment and finds opportunities to save money, improve system performance and reliability, or help close security gaps.”
- Bring your VPC home. AWS, being a public cloud provider, allows you to create VPCs that are isolated from everything, including your on-premises LAN or datacenter. Because of this isolation, all network activity between the user and their VPC happens over the internet. One of the great things about VPCs is the many types of connectivity options they provide. Now is the time to reevaluate how you use VPCs in conjunction with your local LAN environment. Maybe it is time to set up a VPN and turn your environment into a hybrid of cloud and physical infrastructure, allowing all communication to pass over a private network. You can even take it one step further by incorporating AWS Direct Connect, a service that allows you to establish private connectivity between AWS and your datacenter, office, or colocation environment. This can help reduce your network costs, increase bandwidth throughput, and provide a more consistent overall network experience.
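When laying out a VPC across multiple AZs, it helps to carve the VPC CIDR into evenly sized per-AZ subnets up front rather than ad hoc. A small sketch using Python's standard ipaddress module; the AZ names are illustrative placeholders:

```python
import ipaddress

def plan_subnets(vpc_cidr, availability_zones, new_prefix=24):
    """Split a VPC CIDR block into one subnet per Availability Zone.
    AZ names are placeholders; substitute the AZs in your region."""
    subnets = ipaddress.ip_network(vpc_cidr).subnets(new_prefix=new_prefix)
    return {az: str(subnet) for az, subnet in zip(availability_zones, subnets)}

plan = plan_subnets("10.0.0.0/16", ["us-east-1a", "us-east-1b", "us-east-1c"])
# Each AZ gets its own non-overlapping /24 out of the /16.
```

Planning the addressing this way leaves room to add subnets (public/private tiers, more AZs) later without renumbering, which is much harder once instances are running.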
These are just a few things you can do when reevaluating your AWS VPC for the New Year. By following these guidelines you can gain efficiencies in your environment you didn’t have before and can rest assured your environment is in the best shape possible for all your new AWS goals of 2015.
-Derek Baltazar, Senior Cloud Engineer
With every new year comes a new beginning. The holidays give us a chance to reflect on our achievements from the previous year, as well as set a benchmark for what we want to accomplish in the following year. For most individuals, weight loss, quitting a bad habit, or even saving money top the list of resolutions. For companies, the goals are a little more straightforward and quantitative. Revenue goals are set, quotas are established, and business objectives are defined. The success of a company is entrenched in these goals, which will determine, positively or negatively, how the company is valued.
Today’s businesses are more complex than ever, and we can thank new technologies, emerging markets, and the ease of globalization for driving these new trends. One of the most impactful and fastest-adopted technologies helping businesses in 2015 is the public cloud.
What’s amazing, though, is that how best to plan for the adoption of public cloud is still unknown to most businesses. Common questions such as “Is my team staffed accordingly to handle this technology change?” or “How do I know if I’m architecting correctly for my business?” come up often. These questions are extremely common with new technologies, but the transition doesn’t have to be difficult if you take these simple steps:
- Plan Ahead: Guide your leadership to see that now is the time to review the current technology inventory being utilized by the company and strategically outline what it will take to help the company become more agile, cost effective, and leverage the most robust technologies in the New Year.
- Over communicate: By talking with all the necessary parties, you will turn an unknown topic into water cooler conversation. Involve as many people as possible and ask for feedback constantly. This way, if there is anyone that is not on-board with this technology shift, you will have advocates across the organization helping your cause.
- Track your progress: Keep an active log of your adoption process, pitfalls, to-dos, and active contributors. Establish a weekly cadence to review past success and upcoming agendas. Focus on small wins, and after a while you will see amazing results for your achievements.
- Handle problems with positivity: Technology changes are never easy for an organization, but take each problem as an opportunity to learn. If something isn’t working, it’s probably for good reason. Review what went wrong, learn from the mistakes, and make sure they don’t repeat themselves. Each problem should be welcomed, addressed and reviewed accordingly.
- Stay diligent: Rome wasn’t built in a day, and neither will your new public-cloud datacenter be. Review your plan, do constant check points against your cloud strategy, follow your roadmap and address problems as soon as they come up. By staying focused and tenacious you will be successful in your endeavors.
Happy 2015, and let’s make it a great year.
-Blake Diers, Alliance Manager
Momentum continues to build for companies migrating their workloads to the cloud, across all industries, even highly regulated industries such as Financial Services, Health Care, and Government. And it’s not just small companies and startups. Most of the largest companies in the world – we’re talking Fortune 500 here – are adopting rapid and aggressive strategies for migrating and managing their workloads in the cloud. While the benefits of migrating workloads to the cloud are seemingly obvious (cost savings, of course), the “hidden” benefit is that the cloud allows businesses to be more nimble, enabling business users with faster, more powerful, and more scalable business capabilities than they’ve ever had before.
So what do enterprises care about when managing workloads in the cloud? More importantly, what should you care about? Let’s assume, for the sake of argument, that your workloads are already in the cloud – that you’ve adopted a sound methodology for migrating your workloads to the cloud.
Raise your expectations
I would submit that enterprises should raise their expectations beyond “standard” workload management. Why? Because the cloud provides a more flexible, powerful, and scalable paradigm than the typical application-running-in-a-data-center-on-a-bunch-of-servers model. Once your workloads are in the cloud, the basic requirements for managing them are not dissimilar to what you’d expect today for managing workloads on-premises or in a data center.
The basics include:
- Service Levels: Basic service levels are still just that – availability, response time, capacity, support, monitoring, etc. So what’s different in the cloud world? Pay particular attention to ensuring your personal data is protected in your hosted cloud service.
- Support: Like any hosting capability, support is very important to consider. Does your provider offer online, call center, dedicated, and/or a combo platter of all of these?
- Security: Ensure that your provider has robust security measures in place and mechanisms to protect your applications and data.
- Compliance: Ensure your cloud provider is in compliance with the standards for your specific industry. Privacy, security, and quality are the principal compliance areas to evaluate.
Now what should enterprises expect on top of the “basics?”
- Visibility: When your workloads are in the cloud, you can’t see them anymore. No longer will you be able to walk through the data center and see your racks of servers with blinking lights – but there’s a certain comfort in that, right? So when you move to the cloud, you need to be able to see (ideally in a visual paradigm) the services you’re using to run your critical workloads.
- Be Proactive: It used to be that enterprises only cared if their data center providers/data center guys were just good at being “reactive” (responding to tickets, monitoring apps and servers, escalating issues, etc). But now the cloud allows us to be proactive. How can you optimize your infrastructure so you actually use less, rather than more? Wouldn’t it be great if your IT operations guy came to you and said “Hey, we can decrease our footprint and lessen our spend,” rather than the other way around?
- Partner with the business: Now that your workloads are running in the cloud, your IT ops team can focus more on working with the business/applications teams to understand better how the infrastructure can work for them, again rather than the other way around, and they can educate the business/applications teams on how some of the newest cloud services, like elasticity, big data, unstructured data, auto-scaling, etc., can cause the business to think differently and innovate faster.
Enterprises should – and are – raising their expectations as they relate to managing their workloads in the cloud. Why? Because the cloud provides a more flexible, powerful, and scalable paradigm than the typical hardware-centric, data center-focused approach.
-Keith Carlson, EVP of Professional and Managed Workload Services
Implementing Cloud Infrastructure in the Enterprise is not easy. An organization needs to think about scale, integration, security, compliance, results, reliability and many other factors. The pace of change pushes us to stay on top of these topics to help our organization realize the many benefits of Cloud Infrastructure.
Think about this in terms of running a race. The race has not changed – there are still hurdles to be cleared – hurdles before the race in practice and hurdles on the track during prime time. We bucket these hurdles into two classes: pre-adoption and operational.
Pre-adoption hurdles come in the form of all things required to make Cloud Infrastructure a standard in your enterprise. A big hurdle we often see is the lack of a clear roadmap and strategy around Cloud. What applications will be moving and when? When will new applications be built on the Cloud? What can we move without refactoring? Another common hurdle is standards: how do you ensure your enterprise can order the same thing over and over, blessed by Enterprise Architecture, Security, and your lawyers? Let’s examine these two major pre-adoption hurdles.
Having a clear IT strategy around Cloud Computing is key to getting effective enterprise adoption. Everyone from the CIO to the System Admin should be able to tell you how your organization will be consuming Cloud and what their role in the journey will be. In our experience at 2nd Watch, this typically involves a specific effort to analyze your current application portfolio for benefits and compatibility in the Cloud. We often help our customers define a classification matrix of applications and workloads that can move to the Cloud and categorize them into classes of applications based on the effort and benefits received from moving workloads to the Cloud. Whether you have a “Cloud First,” “Cloud Only” or another strategy for leveraging Cloud, the important thing is that your organization understands the strategy and is empowered to make the changes required to move forward.
Standardization is a challenge when it comes to implementing Cloud Computing. There are plenty of Cloud Service Providers, and there are no common standards for implementations. The good news is that AWS is quickly becoming the de facto standard for Cloud Infrastructure, and other providers are starting to follow suit.
2nd Watch works closely with our customers to define standards we call “Reference Architectures” to enable consistency in Cloud usage across business units, regions, etc. Our approach is powered by CloudFormation and made real by CloudTrail, enabling us to deploy standard infrastructure and be notified when someone makes a change to the standard in production (or Test/Dev, etc.). This is where the power of AWS really shines.
Imagine a service catalog of all the different application or technology stacks that you need to deploy in your enterprise – now imagine having an automated way to deploy those standards quickly and easily, in minutes instead of days, weeks, or months. Standards will pay dividends in helping your organization consume Cloud and maintain existing compliance and security requirements.
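The "standard stack, deployed on demand" idea boils down to keeping your reference architecture as a versioned template. A minimal sketch that builds a CloudFormation-style template as plain data (the instance type and AMI ID are placeholders, not recommendations):

```python
import json

def reference_architecture(instance_type="t2.micro", ami_id="ami-12345678"):
    """Render a minimal CloudFormation-style template as a dict.
    The AMI ID is a placeholder; a real template would parameterize it
    and add networking, security groups, tags, and outputs."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Reference architecture: one standard web server",
        "Resources": {
            "WebServer": {
                "Type": "AWS::EC2::Instance",
                "Properties": {
                    "InstanceType": instance_type,
                    "ImageId": ami_id,
                },
            }
        },
    }

# The serialized JSON body is what you would hand to CloudFormation
# to create a stack from this standard.
template_body = json.dumps(reference_architecture(), indent=2)
```

Because the template is data, it can live in version control, be reviewed by Enterprise Architecture and Security once, and then be stamped out identically across business units and regions.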
Operational hurdles for Cloud Computing come about due to the different types of people, processes and technology. Do the people who support your IT infrastructure understand the new technology involved in managing Cloud infrastructure? Do you have the right operational processes in place to deal with incidents involving Cloud infrastructure? Do you have the right technology to help you manage your cloud infrastructure at enterprise scale?
Here are some people related questions to ask yourself when you are looking to put Cloud infrastructure to work in your enterprise:
- How does my IT organization have to change when I move to the cloud?
- What new IT roles are going to be required as I move to the cloud?
- What type of training should be scheduled and who should attend?
- Who will manage the applications after they are moved to the cloud?
People are critical to the IT equation, and the Cloud requires IT skills and expertise. It has been our experience that organizations that take the people component seriously have a much more effective and efficient Cloud experience than those who might address it after the fact or with less purpose. Focus on your people – make sure they have the training and support they need to ensure success once you are live in the Cloud.
Cloud infrastructure uses some of the same technology your enterprise deploys today – virtualization, hypervisors, hardware, network, etc. The difference is that the experts are managing the core components and letting you build on top. This is a different approach to infrastructure and requires enterprise IT shops to consider what changes will need to be made to their process to ensure they can operationalize Cloud computing. An example: How will your process deal with host management issues like needing to reboot a group of servers if the incident originates from a provider instead of your own equipment?
Finally, technology plays a big role in ensuring a successful Cloud infrastructure implementation. As users request new features and IT responds with new technology, thought needs to be given to how the enterprise will manage that technology. How will your existing management and monitoring tools connect to your Cloud infrastructure? To what pieces of the datacenter will you be unable to attach? When will you have to use Cloud Service Provider plugins vs. your existing toolset? What can you manage with your existing tools? How do you take advantage of the new infrastructure, including batch scheduling, auto-scaling, reference architectures, etc.? Picking the right management tools and technology will go a long way to providing some of the real benefits of Cloud Infrastructure.
At 2nd Watch we believe that Enterprise Architecture (in a broad sense) is relevant regardless of the underlying technology platform. It is true that moving from on-premises infrastructure to the Cloud enables us to reduce the number of things demanding our focus – Amazon Web Services vs. Cisco, Juniper, F5, IBM, HP, Dell, EMC, NetApp, etc.
This is the simplicity of it – the number of vendors and platforms to deal with as an IT person is shrinking, and thank goodness! But, we still need to think about how to best leverage the technology at hand. Cloud adoption will have hurdles. The great news is that together we can train ourselves to clear them and move our businesses forward.
-Kris Bliesner, CTO