Midmarket and enterprise companies looking to transform their IT operations to new models based on the public cloud and Agile/DevOps have a long, arduous journey. Moving from internally managed IT departments with predictable needs to ones that must be flexible and run on demand is one of the greatest paradigm shifts facing CIOs today.
It requires new skills and new ways of working, including a fundamental reorganisation of IT organisations. Meanwhile, IT must continue with business as usual, supporting core systems and processes for productivity and operations.
Many companies can’t get there fast enough, which is why the market for service providers specialising in public cloud infrastructure and DevOps is growing. A new crop of MSPs that focus specifically on public cloud infrastructure has appeared in the last few years to address the specific needs of public cloud as it relates to migration, legacy systems, integration, provisioning and configuration, security and financial management.
IT organisations are moving on from testing and prototyping to launching production applications in the cloud; there is often not enough time to ramp up quickly on the new capabilities needed for success. Here’s a look at how MSPs can ease the pain of enterprise public cloud and DevOps initiatives:
In the public cloud, network management is handled differently due to differences in the underlying network environment and the fact that you simply don’t have the same level of visibility and control as you do in your own data center. MSPs can help by lending their expertise in building and managing secure networks in the public cloud.
Without the help of an MSP, your business will need to do its own homework on how the network works and which tools are effective on the security side of things: heads up, it’s a much different list than in the traditional data center. What do you get out of the box? When do you need third-party software to help? MSPs have years of experience running production workloads in the public cloud and can help you make the right decisions the first time without going through an exhaustive discovery phase.
Design and architecture:
Deploying systems into the cloud requires a mental shift due to the elastic nature of virtual resources. This reinvents infrastructure design, since instances come and go according to demand and performance needs. IT needs to understand how to automate infrastructure changes according to shifting requirements and risks, such as hardware failures and security configurations. Experienced service providers that have helped companies migrate to the cloud again and again can deliver best practices and reduce risk.
Cloud and DevOps go hand in hand due to the joint requirements of frequent iteration, rapid change and continuous integration/continuous delivery. The processes and tools for CI and CD are still emerging. Doing this well requires not only new, collaborative workflows but also working with unfamiliar technologies such as containers.
While AWS has released a new service for managing containers, that’s just one piece of the puzzle. Many companies moving toward DevOps benefit from outside help in training, planning, measuring results and navigating internal barriers to change. Lastly, the automation infrastructure itself (Puppet, Chef, others) requires maintenance and is critical in the security landscape. An MSP can help build and manage this infrastructure so that you can focus on your code.
Security in the cloud is a shared responsibility. Many customers incorrectly assume that because public cloud providers have excellent security records and deep compliance frameworks for PCI and other regulations, their infrastructure is secure by default. The reality is that providers do an excellent job of securing the underlying infrastructure, but that is where things stop for them and begin for you as a customer.
Most security issues found in the public cloud today relate to misconfiguration; you need to track configuration changes and validate architectural designs against them. In DevOps, rapid development processes may inadvertently trump security, and using containers and microservices to speed deployment also introduces security risks. Missteps in the area of security can be long and costly to fix later; an MSP can help mitigate that risk through upfront design and ongoing monitoring and management.
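As a simple illustration of the kind of configuration checking involved, the sketch below uses boto3 to flag security groups that allow SSH from anywhere. The port and CIDR checked here are just one common misconfiguration used as an example, not a complete audit of your environment.

```python
import boto3

ec2 = boto3.client("ec2")

# Flag security groups that allow inbound SSH (port 22) from 0.0.0.0/0,
# one of the most common public cloud misconfigurations.
for sg in ec2.describe_security_groups()["SecurityGroups"]:
    for rule in sg.get("IpPermissions", []):
        open_to_world = any(
            r.get("CidrIp") == "0.0.0.0/0" for r in rule.get("IpRanges", [])
        )
        if rule.get("FromPort") == 22 and open_to_world:
            print("Open SSH: {} ({})".format(sg["GroupId"], sg["GroupName"]))
```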
Provisioning and cost management:
Virtual sprawl is no myth. IT teams that for years have used over-provisioning as a stopgap measure to ensure uptime may struggle to adapt to a different approach using on-demand infrastructure. Experts can help make that transition through proper provisioning at the outset as well as applying spend management tools built for the cloud to monitor and predict usage.
One of the best features of public cloud providers is high elasticity: the ability to spin up large numbers of virtual instances at a moment’s notice and then shut them off when you are done using them. The trick is remembering to shut them off: many development teams claim to work 24×7, but the reality is usually much different. An MSP can set up cost alerting and monitoring and can even leverage tools to help you allocate costs to your heavy users or business units.
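As a rough sketch of the “remember to shut them off” point, the snippet below stops running instances that carry a hypothetical Environment=dev tag. The tag name and value are assumptions for the example, not a prescribed convention; in practice this kind of job would run on a schedule outside working hours.

```python
import boto3

ec2 = boto3.client("ec2")

# Find running instances tagged as development environments
# (Environment=dev is an assumed tagging convention for this example).
reservations = ec2.describe_instances(
    Filters=[
        {"Name": "tag:Environment", "Values": ["dev"]},
        {"Name": "instance-state-name", "Values": ["running"]},
    ]
)["Reservations"]

instance_ids = [i["InstanceId"] for r in reservations for i in r["Instances"]]

if instance_ids:
    # Stop (not terminate) them so they can be restarted the next morning.
    ec2.stop_instances(InstanceIds=instance_ids)
    print("Stopped:", instance_ids)
```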
Large companies often want to move legacy systems to the public cloud to reduce the costly overhead of storage and maintenance. Yet no CIO wants to be accountable for migrating a mission-critical legacy system which later doesn’t perform well or is out of compliance.
Service providers can help evaluate whether a system can be migrated as is, “lift and shift,” or needs to be reconfigured to run in the cloud. CIOs may lean toward handling this task with their internal teams, yet doing so will likely take longer and require significant retraining of staff. There’s also the need to pay close attention to compliance. Experienced MSPs can help navigate financial regulations (Sarbanes-Oxley, PCI), privacy laws (HIPAA) and data management regulations in some sectors that go against the grain of DevOps.
Most IT infrastructure managers who have been around for a while are well-versed in VMware-specific tools such as vSphere. Unfortunately, most of those operational tools built to support virtualisation software don’t work well, or at all, in the public cloud. There are some cloud-native management tools available now, including those from AWS, but none are clear winners yet.
IT departments are stuck with patching together their own toolsets or developing them from scratch, as Netflix has done. That’s not always the best use of time and money, depending on your sector. MSPs can take over the operations management function altogether. Customers benefit from the continual learning about industry best practices that the service provider must undertake to effectively manage dozens or hundreds of customers.
As with any disruptive technology, people are the biggest barrier to change. While human beings are highly adaptable, many of us simply are not comfortable with change. Take a hard look at not just skills but your culture. Do you have the type of organisation where people are willing and able to adapt without threatening to quit? If not, using the services of an MSP might be the path of least friction. Some organisations simply want the benefits of new technologies without needing to understand or manage every nook and cranny.
Beyond all the above advantages, the MSP partner helps IT organisations move faster by serving as a knowledgeable extension of the IT department. CIOs and their teams can focus on serving the business and its evolving requirements, while the MSP helps ensure that those requirements transition well to the public cloud. Executives who have decided that the public cloud is their future and that DevOps is the way to get there are progressive thinkers who are unafraid to take risks.
Yet that doesn’t mean they should go it alone. Find a partner you can trust, and move toward your future with an experienced team propping you up all the way. The old adage of “you won’t get fired for hiring <name legacy service provider>” has now changed to “My new MSP got me promoted.”
-Kris Bliesner, CTO
This article was first published on ITProPortal on 11/25/15.
Last week, we kicked off a four-part blog series with our strategic partner, Alert Logic, focused on the importance of cloud security for digital businesses. This week, Alert Logic has contributed the following blog post as a guide to help digital businesses prepare for—and respond to—cyber incidents.
Evaluating your organization’s cyber security incident response readiness is an important part of your overall security program. But responding to a cyber security incident effectively and efficiently can be a tremendous challenge for most organizations. In most cases, the struggle to keep up during an incident is due to one or both of the following:
- The cyber incident response plan has been “shelf-ware” for too long
- The plan hasn’t been practiced by the incident response team.
Unfortunately, most organizations view cyber incident response as a technical issue—they assume that if a cyber incident response plan is in place and has been reviewed by the “techies,” then the plan is complete. In reality, all these organizations have is a theoretical cyber incident response plan, one with no testing or validation. Cyber incident response plans are much more than a technical issue. In the end, they are about people, process, communication, and even brand protection.
How to ensure your cyber incident response plan works
The key to ensuring your cyber incident response plan works is to practice it. You must dedicate time and resources to properly test the plan. Cyber incident response is a “use it or lose it” skill that requires practice. It’s similar to an athlete mastering a specific skill; the athlete must complete numerous repetitions to develop the muscle memory that enhances performance. In the same way, the practice (repetitions) of testing your cyber incident response plan will enhance your team’s performance during a real incident.
Steps for testing your plan effectively
Step 1: Self-Assessment and Basic Walk-Through
An effective methodology for testing your cyber incident response plan begins with a self-assessment and a simple walk-through of the plan with a limited set of team members. Steps should include:
- The incident response manager reads through the plan, using the details of a recent data breach to follow the plan. The manager also identifies how the incident was discovered as well as notification processes.
- The team follows the triage, containment, eradication, and forensics stages of the plan, identifying any gaps.
- The incident response manager walks through the communications process along the way, including recovery and steady-state operations.
- The team documents possible modifications, follow-up questions, and clarifications that should be added to the plan.
Step 2: All Hands Walk-Through
The next step after the self-assessment is a walk-through with the entire incident response team. This requires an organized meeting in a conference room and can take between two and four hours, in which a scenario (a recent breach) is used to walk through the incident response document. These working sessions are ideal for filling in gaps and clarifying expectations for things like detection, analysis, required tools, and resources. Organizations with successful incident response plans also include their executive teams in this type of test. Executive participation highlights priorities from a business and resource perspective and is less focused on the technical aspects of the incident.
Step 3: Live Exercise
The most important step in evaluating your incident response plan is to conduct a live exercise. A live exercise is a customized training event for the purpose of sharpening your incident response team’s skills in a safe, non-production environment. It isn’t a penetration test; it’s an incident response exercise designed to test your team’s ability to adapt and execute the plan during a live cyber attack. It’s essentially the equivalent of a pre-season game—the team participates, but it doesn’t count in the win/loss column. The value of a live exercise is the plan evaluation and team experience. The lessons learned usually prove to be the most valuable input to the maturation of your cyber incident response plan.
Ultimately, preparedness is not just about having an incident response plan; it’s about knowing the plan, practicing the plan, and understanding that it’s a work in progress. The development of an excellent incident response plan includes involvement and validation from the incident response team as well as a commitment to a repetitive cycle of practice and refinement.
Learn more about 2W Managed Cloud Security and how our partnership with Alert Logic can ensure your environment’s security.
Article contributed by Alert Logic
On Friday our day was packed with unpacking the supplies we brought from 2W, setting up the laptops we donated, greeting women, and holding babies! We brought 4 suitcases packed with supplies for the school and children’s home and 4 new computers. We spent time setting up the computers with parental controls and showing them how to use the computers. We also took new books to the library in the school and spent time speaking English to the kids in the school.
The women’s conference kicked off and we welcomed 90 women. They worshiped and learned about the Lord and had amazing fellowship. Six of us were responsible for all 50 kids while the Mamas went to the women’s conference. Man was that an adventure! These kids have so much energy! We took them to the open-air covered sports arena on the grounds and played soccer (futbol) and tag. We ended the night with a traditional Guatemalan meal together with the women and let off lanterns over the mountain and lit sparklers!! It was the perfect ending to a perfect day!
Saturday was the final day of the women’s conference. We continued with more messages, workshops, testimonies, & worship. Again, we helped with the children and allowed the Mamas to have a much-needed/well-deserved break. We played futbol & basketball with the big kids and painted the little girls’ nails.
Today is our final day at Eagle’s Nest. We will attend their church and then start our journey down to Pana, 15 minutes and 2,000 feet below Eagle’s Nest, for some zip lining and a boat tour on Lake Atitlan.
We are very sad to leave our new friends and the Eagle’s Nest family.
A few months back we published a blog article titled Introducing Amazon Aurora, which described Amazon’s latest RDS database engine offering, Aurora. AWS Aurora is Amazon’s own internally developed, MySQL 5.6-compatible DBMS.
Let’s review what we learned from the last article:
- MySQL “drop-in” compatibility
- Roughly 5x performance increase over traditional MySQL
- Moves from a monolithic to service-oriented approach
- Dynamically expandable storage (up to 64TB) with zero downtime or performance degradation
- Data storage and IO utilization is “only pay for what you use” ($0.10/GB/mo., $0.20/million IO)
- High performance SSD backed storage
- Data is automatically replicated (two copies) across three availability zones
- Uses quorum writes (4 of 6) to increase write performance
- Self-healing instant recovery through new parallel, distributed, and asynchronous redo logs
- Cache remains warmed across DB restarts by decoupling the cache from the DB process
- Up to 15 read replicas for scaling reads horizontally
- Any read replica can be promoted to the DB master nearly instantly
- Simulation of node, disk, or networking failure for testing/HA
In addition to those, I’d like to point out a few more features of Aurora:
- Designed for 99.99% availability
- Automatic recovery from instance and storage failures
- On-volume instant snapshots
- Continuous incremental off-volume snapshots to S3
- Automatic restriping, mirror repair, hot spot management, and encryption
- Backups introduce zero load on the DB
- 400x (yes, times, not percent!) lower read replica lag than MySQL
- Much improved concurrency handling
That is a pretty impressive list of features/improvements that Aurora buys us over standard MySQL! Even with the slight increase in Aurora’s RDS run rate, the performance gains more than offset the added cost in typical use cases.
So what does that list of features and enhancements translate to in the real world? What does it ACTUALLY MEAN for the typical AWS customer? Having worked with a fair number of DBMS platforms over the past 15+ years, I can tell you that this is a pretty significant leap forward. The small cost increase over MySQL is nominal in light of the performance gains and added benefits. In fact, the price-to-performance ratio (akin to horsepower-to-weight ratio for any gear heads out there) is advertised as being 4x that of standard MySQL RDS. This means you will be able to gain similar performance on a significantly smaller instance size. Combine that with only having to pay for the storage your database is actually consuming (not a pre-allocated chunk) and all of the new features, and choosing Aurora is nearly always going to be your best option.
You should definitely consider using Aurora to replace any of your MySQL or MySQL-derivative databases (Oracle MySQL, Percona, MariaDB). It’s designed using modern architectural principles, for the cloud, and with high scalability and fault tolerance in mind. Whether you are currently running, or are considering running, your own MySQL DBMS solution on EC2 instances or are using RDS to manage it, you should strongly consider Aurora. The only exception may be if you are running MySQL on a lower-end RDS, EC2, Docker, etc. instance due to lower performance needs and cost considerations.
Because Aurora has some specific performance requirements, it requires db.r3.large or larger instances. In some cases people choose to run smaller instances for their MySQL because they have lower performance and availability needs than what Aurora provides and would prefer the cost savings. Also, there is no way to run “Aurora” outside of RDS (as it is a platform and not simply an overhauled DBMS), which could be a consideration for those wanting to run it in a dev/test context on micro instances or containers. However, running a 5.6-compatible version of MySQL would provide application congruency between the RDS Aurora instance and the one-off (e.g. a developer’s MySQL 5.6 DB running in a Docker container).
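For reference, standing up an Aurora cluster in RDS is essentially a two-step call: create the cluster, then add at least one db.r3.large (or larger) instance to it. The sketch below uses boto3 with placeholder identifiers and credentials; it is a minimal illustration, not a production configuration.

```python
import boto3

rds = boto3.client("rds")

# Create the Aurora cluster (the shared storage and replication layer).
rds.create_db_cluster(
    DBClusterIdentifier="example-aurora-cluster",  # placeholder name
    Engine="aurora",                               # MySQL 5.6-compatible Aurora
    MasterUsername="admin",
    MasterUserPassword="change-me-please",         # placeholder credential
)

# Add a writer instance; Aurora requires db.r3.large or larger.
rds.create_db_instance(
    DBInstanceIdentifier="example-aurora-node1",   # placeholder name
    DBInstanceClass="db.r3.large",
    Engine="aurora",
    DBClusterIdentifier="example-aurora-cluster",
)
```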
In addition to being an instantly pluggable MySQL replacement, Aurora can be a great option for replacing high-end, expensive commercial DBMS solutions like Oracle. Switching to Aurora could provide massive cost savings while delivering a simplified, powerful, and highly scalable solution. The price-to-performance ratio of Aurora is really in a class by itself, and it provides all of the features and performance today’s critical business applications demand from a relational database. Aurora gives you on-par, or even better, performance, manageability, reliability, and security for around 1/10th of the cost!
To learn how the experts at 2nd Watch can help you get the most out of your cloud database architecture, contact us today.
-Ryan Kennedy, Senior Cloud Architect
CloudWatch is a tool for monitoring Amazon Web Services (AWS) cloud resources. With CloudWatch you can gather and monitor metrics for many of your AWS assets. CloudWatch for AWS EC2 provides 10 pre-selected metrics that are polled at five-minute intervals. These pre-selected metrics include CPU Utilization, Disk Reads, Disk Read Operations, Disk Writes, Disk Write Operations, Network-In, Network-Out, Status Check Failed (Any), Status Check Failed (Instance), and Status Check Failed (System). These metrics are designed to give you the most relevant information to help keep your environment running smoothly. CloudWatch goes one step further and offers seven pre-selected metrics that poll at an increased frequency of one-minute intervals for an additional charge. With CloudWatch you can set alarms based on thresholds on any of your metrics. The alarms can trigger status notifications or have the environment take automated action. For example, you can set an alarm to notify you if one of your instances is experiencing high CPU load. As you can see from the graph below, we’re using CloudWatch to gain insight into an instance’s average CPU Utilization over a period of 1 hour at 5-minute intervals:
You can clearly see that at 19:10 CPU utilization is at zero, then spikes over the next 35 minutes to 100%, where it stays for longer than 10 minutes. Without any monitoring this could be a real problem, as the CPU of the system is being completely taxed and performance would undoubtedly become sluggish. If this were a web server, users would experience dropped connections, timeouts, or very slow response times. In this example it doesn’t matter what is causing the CPU spike; it matters how you would deal with it. If this happened in the middle of the night you would experience downtime and a disruption to your business. With a lot riding on uninterrupted 24×7 operations, processes must be in place to withstand unexpected events like this. With CloudWatch, AWS makes monitoring a little easier and setting alarms based on resource thresholds simple. Here is one way to do it for our previous CPU Utilization example:
1. Go to https://console.aws.amazon.com/cloudwatch/
2. In the Dashboard go to Metrics and select the instance and metric name in question. On the right side of the screen you should also see a button that says Create Alarm. (See figure below)
3. Once you hit Create Alarm, the page will allow you to set an Alarm Threshold based on parameters that you choose. We’ll call our threshold “High CPU” and give it a description “Alarm when CPU is 85% for 10 minutes or more”.
4. Additionally you have to set the parameters to trigger the alarm. We choose “Whenever CPU Utilization is 85% for 2 consecutive periods” (remember our periods are 5 minutes each). This means after 10 minutes in an alarm state our action will take place.
5. For Actions we select “Whenever this alarm: State is ALARM” send notification to our SNS Topic MyHighCPU and send an email. This will cause the trigger to send an email to an email address or distribution list. (See the figure below)
6. Finally we hit Create Alarm, and we get the following:
7. Finally, you have to go to the email account of the address you entered and confirm the SNS notification subscription. You should see a message that says: “You have chosen to subscribe to the topic: arn:aws:sns:us-west-1:xxxxxxxxxxxxx:MyHighCPU. To confirm this subscription, click or visit the link below (If this was in error no action is necessary). Confirm subscription.”
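The same alarm can also be created programmatically. Below is a minimal boto3 sketch that mirrors the console walk-through above; the SNS topic name and alarm settings come from the steps, while the instance ID and email address are placeholders.

```python
import boto3

cloudwatch = boto3.client("cloudwatch")
sns = boto3.client("sns")

# Create (or reuse) the notification topic and subscribe an email address.
topic_arn = sns.create_topic(Name="MyHighCPU")["TopicArn"]
sns.subscribe(TopicArn=topic_arn, Protocol="email", Endpoint="ops@example.com")

# Alarm when average CPU is >= 85% for 2 consecutive 5-minute periods.
cloudwatch.put_metric_alarm(
    AlarmName="High CPU",
    AlarmDescription="Alarm when CPU is 85% for 10 minutes or more",
    Namespace="AWS/EC2",
    MetricName="CPUUtilization",
    Dimensions=[{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],
    Statistic="Average",
    Period=300,               # 5-minute periods
    EvaluationPeriods=2,      # 2 consecutive periods = 10 minutes
    Threshold=85.0,
    ComparisonOperator="GreaterThanOrEqualToThreshold",
    AlarmActions=[topic_arn],
)
```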
Overall, the process of creating alarms for a couple of metrics is pretty straightforward and simple. It can get more complex when you incorporate more complex logic. For example, you could have a couple of EC2 instances in an Auto Scaling group behind an Elastic Load Balancer, and if CPU spiked over 85% for 10 minutes you could have the Auto Scaling group take immediate, automated action to spin up additional EC2 instances to take on the increased load. When the presumed web traffic that was causing the CPU spike subsides, you can have a trigger that scales back instances so you are no longer paying for them. With the power of CloudWatch, managing your AWS systems can become completely automated, and you can react immediately to any problems or changing conditions.
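As a rough sketch of that Auto Scaling scenario (the group name and policy name below are hypothetical), the alarm’s action points at a scaling policy instead of an SNS topic:

```python
import boto3

autoscaling = boto3.client("autoscaling")
cloudwatch = boto3.client("cloudwatch")

# Scaling policy that adds one instance to a hypothetical Auto Scaling group.
policy = autoscaling.put_scaling_policy(
    AutoScalingGroupName="example-web-asg",  # hypothetical group name
    PolicyName="scale-out-on-high-cpu",
    AdjustmentType="ChangeInCapacity",
    ScalingAdjustment=1,
    Cooldown=300,
)

# Same style of alarm as before, but aggregated across the group and
# wired to the scaling policy rather than an email notification.
cloudwatch.put_metric_alarm(
    AlarmName="asg-high-cpu",
    Namespace="AWS/EC2",
    MetricName="CPUUtilization",
    Dimensions=[{"Name": "AutoScalingGroupName", "Value": "example-web-asg"}],
    Statistic="Average",
    Period=300,
    EvaluationPeriods=2,
    Threshold=85.0,
    ComparisonOperator="GreaterThanOrEqualToThreshold",
    AlarmActions=[policy["PolicyARN"]],
)
```

A symmetrical scale-in policy with a low-CPU alarm would handle releasing instances once the traffic subsides.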
In many environments the act of monitoring and managing systems can become complicated and burdensome, leaving you little time for developing your website or application. At 2nd Watch we provide a suite of managed services (http://2ndwatch.com/cloud-services/managed-cloud-platform/) to help you free up time for more important aspects of your business. We can put much of this complex logic in place for you to help minimize your administrative cloud burden. We take a lot of the headache out of managing your own systems and ensure that your operations are secure, reliable, and compliant at the lowest possible cost.
-Derek Baltazar, Senior Cloud Engineer
Managed Services is the practice of outsourcing day-to-day management responsibilities of cloud infrastructure as a strategic method for improving IT operations and reducing expenses. Managed Services removes the customer’s burden of nuts-and-bolts, undifferentiated work and enables them to focus more on value creating activities that directly impact their core business. In accordance with its expertise in Amazon Web Services (AWS) and enterprise IT operations, 2nd Watch has built a best-in-class Managed Services practice with the following capabilities.
* Escalation Management – Collaboration between MS and professional services on problem resolution.
* Incident Management – Resolution of day-to-day tickets with response times based on severity.
* Problem Management – Root cause analysis and resolution for recurring incidents.
* Alarming Services – CPU utilization, data backup, space availability thresholds, etc.
* Reporting Services – Cost optimization, incidents, SLA, etc.
* System Reliability – 24×7 monitoring at a world-class Network Operations Center (NOC).
* Audit Support – Data pulls, process documents, staff interviews, etc.
* Change Management – Software deployment, patching, testing, etc.
* Service-Level Agreement – Availability and uptime guarantees, including 99.9% and 99.99%.
2nd Watch’s Managed Services practice provides a single support organization for customers running their cloud infrastructure on AWS. Customers benefit from improved IT operations and reduced expenses. Customers also benefit from Managed Services identifying problems before they become critical business issues impacting availability and uptime. Costs generally involve a setup or transition fee and an ongoing, fixed monthly fee, making support costs predictable. Shift your focus to value creation and let Managed Services do the undifferentiated heavy lifting.
-Josh Lowry, General Manager West