
Cloud Autonomics and Automated Management and Optimization

Autonomic systems are an exciting arena within cloud computing, although the underlying technology is not new by any means. Automation, orchestration and optimization have been alive and well in the datacenter for almost a decade. Companies like Microsoft with System Center, IBM with Tivoli and ServiceNow are just a few examples of platforms that can collect, analyze and act on sensor data derived from physical and virtual infrastructure and appliances.

Autonomic capabilities are lighting up quickly across the cloud ecosystem. These systems monitor infrastructure, services and applications and make decisions to support remediation and healing, failover and failback, and snapshot and recovery. These abilities come from workflow creation and runbook and playbook development, which support a broad range of insight-driven action and corrective policy enforcement.

In the compliance world, we are seeing many great companies enter the mix to bring autonomic-style functionality to security and compliance.

Evident is a great example of a technology that functions with autonomic-type capabilities. The product can do some amazing things in terms of automation and action. It provides visibility across the entire cloud platform and identifies and manages risk associated with the operation of core cloud infrastructure and applications within an organization.

Using signatures and control insight, as well as custom-defined controls, it can identify exploitable and vulnerable systems at scale and report the current state of risk within an organization. On face value that is not autonomic; however, the next phase it performs is why it is such a great example of autonomics in action.

After analyzing the vulnerability and risk landscape, it reports the current state of risk and derives a set of guided remediations. These can either be performed manually against the infrastructure in question or automated, ensuring a proactive, hands-off response so that security and compliance can always be maintained.

Beyond Evident, the focus going forward is a marriage of technologies to increase system capabilities and enhance autonomic cloud operations. Cloud operations management systems will light up advanced artificial intelligence and machine learning capabilities that take in large amounts of sensor data across many cloud-based technologies and services and derive analysis, insight and proactive remediation, not just for security compliance but across the board in terms of cloud stabilization, core operations and optimization.

CloudHealth Technologies and many others in the cloud management platform space are looking deeply into how to turn that sensor data into core cloud optimization via automation.

AIOps is a term growing in use year over year, and it fits well to describe how autonomic systems have evolved from the datacenter to the cloud. Gartner is looking deeply into this space, and we at 2nd Watch see promising advancements coming from companies like Palo Alto Networks, with its native security platform capabilities, along with Evident for continuous compliance and security.

MoogSoft is bringing a next-generation IT incident management platform to the cloud. Its artificial intelligence capabilities for IT operations help DevOps teams operate smarter, faster and more effectively by automating traditional IT operations tasks, freeing engineers to work on the business-level needs of the organization rather than day-to-day operations. By bringing intelligence to the response to systems issues, IT operations teams can become more agile and more capable of solving mission-critical problems and maintaining a proactive, highly optimized enterprise cloud.

As we move forward, expect to see more AI- and ML-based functionality move into the core cloud management platforms. Cloud ISVs will leverage ever more sensor data to determine response, action and resolution, and this will become tightly coupled to the virtual machine topology and the cloud-native services underlying all cloud providers.

It is an exciting time for autonomic systems capabilities in the cloud, and we are excited to help customers realize the many capabilities and benefits that can help automate, orchestrate and proactively maintain and optimize core cloud infrastructure.

To learn more about autonomic systems and capabilities, check out Gartner’s AIOps research and reach out to 2nd Watch. We would love to help you realize the potential of these technologies in your cloud environment today!

-Peter Meister, Sr Director of Product Management


Logging and Monitoring in the era of Serverless – Part 1

Figuring out monitoring in a holistic sense is still a challenge for many companies, whether with conventional infrastructure or newer platforms like serverless and containers.

In most applications there are two aspects of monitoring an application:

  • System metrics, such as errors, invocations, latency, and memory and CPU usage
  • Business analytics, such as number of signups, number of emails sent, transactions processed, etc.

The former is fairly universal and applicable to almost any stack to varying degrees. It is what I would call the undifferentiated aspect of monitoring an application. The ability to detect errors and track performance metrics is absolutely necessary to operate an application.

Everything that is old is new again. I am a huge fan of the Twelve-Factor App. If you aren’t familiar with it, I highly suggest taking a look. Drafted in 2011 by developers at Heroku, the Twelve-Factor App is a methodology and set of best practices designed to enable applications to be built with portability and resiliency when deployed to the web.

In the Twelve-Factor App manifesto, it is stated that applications should produce “logs as event streams” and leave it up to the execution environment to aggregate them. If we are to gather information from our application, why not make that present in the logs? We can use our event stream (i.e. application log) to create time-series metrics. Time-series metrics are just datapoints that have been sampled and aggregated over time, which enable developers and engineers to track performance. They allow us to make correlations with events at a specific time.

AWS Lambda works almost exactly this way by default, aggregating its logs via Amazon CloudWatch. CloudWatch organizes logs based on function, version, and container, while Lambda adds metadata for each invocation. It is up to the developer to add application-specific logging to their function. CloudWatch, however, will only get you so far. If we want to track more than invocations, latency, or memory utilization, we need to analyze the logs more deeply. This is where tools like Splunk or Kibana come into play.
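As a minimal sketch of that developer-side logging (in Go, assuming the aws-lambda-go library; the event shape here is illustrative), anything written to standard output or standard error from the handler ends up in the function’s CloudWatch log group:

package main

import (
	"context"
	"log"

	"github.com/aws/aws-lambda-go/lambda"
)

// handler receives a raw event payload; anything written via the log
// package goes to stdout/stderr, which Lambda forwards to CloudWatch Logs.
func handler(ctx context.Context, event map[string]interface{}) (string, error) {
	log.Printf("invocation received with %d top-level keys", len(event))
	return "ok", nil
}

func main() {
	lambda.Start(handler)
}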

In order to get to the meat of our application and the value it is delivering, we need to ensure we have additional information (telemetry) going to the logs as well, for example:

  • Timeouts
  • Configuration failures
  • Stack traces
  • Event objects

Logging these types of events and information will enable tools with rich query languages to create dashboards with just about anything we want on them.
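As a rough sketch (the event fields and the key=value format here are hypothetical choices, not a required convention), such telemetry can be emitted as structured log lines that query tools can later parse:

package main

import (
	"encoding/json"
	"log"
	"time"
)

func main() {
	// Hypothetical event object the application received.
	event := map[string]interface{}{"orderId": "1234", "action": "create"}

	// Serialize the whole event so log tooling can parse its fields later.
	if payload, err := json.Marshal(event); err == nil {
		log.Printf("event=%s", payload)
	}

	// Record a timeout as a key=value line that is easy to query.
	log.Printf("timeout=true deadline=%s operation=create", 250*time.Millisecond)
}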

For instance, let’s say we added the following line of code to our application to track an event that was happening from a specific invocation and pull out additional information about execution:

log.Println(fmt.Sprintf("-metrics.%s.blob.%s", environment, method))

In a system that tracks time-series metrics in logs (e.g. SumoLogic), we could build a query like this:

"-metrics.prod.blob." | parse "-metrics.prod.blob.*" as method | timeslice 5m | count(method) group by _timeslice, method | transpose row _timeslice column method

This would give us a nice breakdown of the different methods used in a CRUD or RESTful service, which can then be visualized in the very same tool.

While visualization is nice, particularly when taking a closer look at a problem, it might not be immediately apparent where there is a problem. For that we need some way to grab the attention of the engineers or developers working on the application. Many of the tools mentioned here support some level of monitoring and alerting.

In the next installment of this series we will talk about increasing visibility into your operations and battling dashboard apathy! Check back next week.

-Lars Cromley, Director, Cloud Advocacy and Innovation


Managing Azure Cloud Governance with Resource Policies

I love an all-you-can-eat buffet. You get a ton of value and a lot to choose from, and you can eat as much, or as little, as you want for a fixed price.

In the same regard, I love the freedom and vast array of technologies the cloud allows you; a technological all-you-can-eat buffet, if you will. However, there is no fixed price when it comes to the cloud. You pay for every resource! And as you can imagine, it can become quite costly if you are not mindful.

So, how do organizations govern and ensure that their cloud spend is managed efficiently? Well, in Microsoft’s Azure cloud you can mitigate this issue using Azure resource policies.

Azure resource policies allow you to define what, where or how resources are provisioned, thus allowing an organization to set restrictions and enable some granular control over their cloud spend.

Azure resource policies allow an organization to control things like:

  • Where resources are deployed – Azure has more than 20 regions around the world. Resource policies can dictate which regions deployments must remain within.
  • Virtual machine SKUs – Resource policies can restrict deployments to only the VM sizes the organization allows.
  • Azure resources – Resource policies can define the specific resources that fall within an organization’s supportable technologies and restrict those outside the standards. For instance, if your organization supports SQL and Oracle databases but not Cosmos DB or MySQL, resource policies can enforce those standards.
  • OS types – Resource policies can define which OS flavors and versions are deployable in an organization’s environment. No longer support Windows Server 2008, or want to limit Linux distros to a small handful? Resource policies can assist.

Azure resource policies are applied at the resource group or subscription level, which allows granular control of policy assignments. For instance, in a non-production subscription you may want to allow non-standard and unsupported resources so development teams can test and vet new technologies without hampering innovation, while in a production environment, where standards and supportability are of the utmost importance, deployments should be highly controlled. Resources can also be excluded from a policy’s scope. For instance, an application that requires a non-standard resource can be excluded at the resource level from the subscription policy to allow the exception.

A number of pre-defined Azure resource policies are available for your use, including:

  • Allowed locations – Used to enforce geo-location requirements by restricting the regions in which resources can be deployed (see the sketch after this list).
  • Allowed virtual machine SKUs – Restricts the virtual machine sizes/SKUs that can be deployed to a predefined set. Useful for controlling the cost of virtual machine resources.
  • Enforce tag and its value – Requires resources to be tagged. This is useful for tracking resource costs for purposes of department chargebacks.
  • Not allowed resource types – Identifies resource types that cannot be deployed. For example, you may want to prevent a costly HDInsight cluster deployment if you know your group would never need one.
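To make the first of these concrete, the rule portion of a location restriction might look something like the following simplified sketch (the actual built-in policy takes the allowed regions as a parameter rather than hard-coding them as shown here):

{
  "if": {
    "not": {
      "field": "location",
      "in": ["eastus", "westus2"]
    }
  },
  "then": {
    "effect": "deny"
  }
}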

Azure also allows custom resource policies when you need a restriction not covered by a built-in policy. A policy definition is described in JSON and includes a policy rule.

This JSON example denies the creation of a storage account that does not have blob encryption enabled:

{
  "if": {
    "allOf": [
      {
        "field": "type",
        "equals": "Microsoft.Storage/storageAccounts"
      },
      {
        "field": "Microsoft.Storage/storageAccounts/enableBlobEncryption",
        "equals": "false"
      }
    ]
  },
  "then": {
    "effect": "deny"
  }
}

The use of Azure resource policies can go a long way toward ensuring that your organization’s Azure deployments meet your governance and compliance goals. For more information on Azure resource policies, visit https://docs.microsoft.com/en-us/azure/azure-policy/azure-policy-introduction.

For help in getting started with Azure resource policies, contact us.

-David Muxo, Sr Cloud Consultant


Why buy Amazon Web Services through a Partner and who “owns” the account?

As an AWS Premier Partner and audited, authorized APN Managed Service Provider (MSP), 2nd Watch offers comprehensive services to help customers accelerate their journey to the cloud. For many of our customers, we not only provide robust Managed Cloud Services, we also resell Amazon products and services. What are the advantages for customers who purchase from a value-added AWS reseller? Why would a customer do this? With an AWS reseller, who owns the account? These are all great questions and the subject of today’s blog post.

I am going to take these questions in reverse order and deal with the ownership issue first, as it is the most commonly misconstrued part of the arrangement.  Let me be clear – when 2nd Watch resells Amazon Web Services as an AWS reseller, our customer “owns” the account.  At 2nd Watch we work hard every day to earn our customers’ trust and confidence and thereby, their business.  Our pricing model for Managed Cloud Services is designed to leverage the advantages of cloud computing’s consumption model – pay for what you use.  2nd Watch customers who purchase AWS through us have the right to move their account to another MSP or purchase direct from AWS if they are unhappy with our services.

I put the word “own” in quotes above because I think it is worth digressing for a minute on how different audiences interpret that word.  Some people see the ownership issue as a vendor lock-in issue, some as an intellectual property concern and still others a liability and security requirement.  For all of these reasons it is important we are specific and precise with our language.

With 2nd Watch’s Managed Cloud Services consumption model, you are not locked in to 2nd Watch as your AWS reseller or MSP. AWS accounts and usage purchased through us belong to the customer, not 2nd Watch, and therefore any intellectual property contained therein is the responsibility and property of the customer. Additionally, because the customer owns the account, their AWS accounts operate under a shared responsibility model. With regard to liability and security, however, our role as an MSP can be a major benefit.

Often MSPs “govern or manage” the IAM credentials for an AWS account to ensure consistency, security and governance. I use the words govern or manage, and not “own,” because I want to be clear that the customer retains the right to take back the credentials and overall responsibility for managing each account, which is the opposite of lock-in. So why would a customer want their MSP to manage their credentials? The reason is pretty simple: as with a managed data center or colocation facility, you own the infrastructure, but you hire experts to handle the day-to-day management in exchange for increased limits of liability, security and enhanced SLAs.

Simply put, if you, as a customer, want your MSP to carry the responsibility for your AWS account and provide service level agreements (complete with financial repercussions), you are going to want to make sure administrative access to the environment is limited in terms of who can make changes that may impact stability or performance. As a 2nd Watch Managed Cloud Services customer, allowing us to manage IAM credentials also comes with the benefit of our secure, SOC 2 Type 2 audited systems and processes. Often our security controls exceed the capabilities of our customers.

Also worth noting: as we onboard a Managed Cloud Services customer, we often audit their environment and provide best-practice recommendations. These recommendations are aligned with the excellent AWS Well-Architected Framework and help customers achieve greater stability, performance, security and cost optimization. Our customers have the option of completing the remediation themselves or having 2nd Watch perform it. Implementing best practices for managing user access, along with leveraging cutting-edge technology, results in a streamlined journey to the cloud.

So now we have addressed the question of who owns the account, but we haven’t addressed why a customer would want to procure AWS through a partner.  First, see my earlier blog post regarding Cloud Cost Complexity for some background.  Second, buying AWS through 2nd Watch as an AWS partner, or AWS reseller, comes with several immediate advantages:

  • All services are provided at AWS market rates or better
  • Pass-through of all AWS volume tier discounts and pricing
  • Pass-through of AWS Enterprise Agreement terms, if one exists
  • Solution-based and enhanced SLAs (above and beyond what AWS provides) shaped around your business requirements
  • Familiarity with your account – our two U.S.-based NOCs are staffed 24x7x365 and have access to a comprehensive history of your account and governance policies
  • Access to enterprise-class support, including 2nd Watch’s multiple dedicated AWS Technical Account Managers, with Managed Cloud Services agreements
  • Consolidated usage across many AWS accounts (see AWS volume discount tiers above)
  • Consolidated billing for both Managed Cloud Services and AWS capacity
  • Access to our Cloud Management Platform, a web-based console that greatly simplifies the management and analysis of AWS usage
    • Support for complex show-back or charge-back bills for different business units or departments, as well as enterprise-wide roll-ups for a global view
    • Ability to allocate volume and Reserved Instance discounts to business units per your requirements
    • Budgets with alerts, trend analysis, tag reporting, etc.
  • Reserved Instance recommendations and management services
    • Helps improve utilization and prevent spoilage
  • Your choice of service level for Managed Cloud Services on any or all accounts – you can consolidate your purchasing without requiring services you don’t need
  • Assistance with AWS account provisioning and governance – we adhere to your corporate standards (and make proactive recommendations)

In short, buying your AWS capacity through 2nd Watch as your MSP is an excellent value that will help you accelerate your cloud adoption. We provide the best of AWS with our own services layered on top to enhance the overall offering. Please contact us for more information about our Managed Cloud Services, including Managed AWS Capacity, and about 2nd Watch as an AWS Reseller Partner.

-Marc Kagan, Director, Account Management


Managing the Unexpected Complexity of the Cloud

When people first hear about the cloud, they typically envision some nebulous server in the sky. Moving apps to the cloud should be a piece of cake, they think. Simply pick them up, stick them in the cloud, and you’re done.

Reality, of course, is quite different. True, for simple, monolithic applications, you could provision a single cloud instance and simply port the code over.

The problem is, today’s applications are far from simple and rarely monolithic. Even a simple web app has multiple pieces, ranging from front-end web server code interacting with application code on the middle tier, which in turn talks to the database underneath.

However, in the enterprise context, even these multi-tier web apps are more the exception than the rule. Older enterprise applications like ERP run on multiple servers, leveraging various data sources and user interfaces, communicating via some type of middleware.

Migrating such an application to the cloud is a multifaceted, complex task that goes well beyond picking it up and putting it in the cloud. In practice, some components typically remain on premises while others move to the cloud, creating a hybrid cloud scenario.

Furthermore, quite often developers must rewrite those elements that move to the cloud in order to leverage its advantages. After all, the cloud promises to provide horizontal scalability, elasticity, and automated recovery from failure, among other benefits. It’s essential to architect and build applications appropriately to take advantage of these characteristics.

However, not all enterprise cloud challenges necessarily involve migrating older applications to cloud environments. For many organizations, digital transformation is the driving force, as customer preferences and behavior drive their technology decisions – and thus digital often begins with the customer interface.

When digital is the priority, enterprises cannot simply build a web site and call it a day, as they may have done in the 1990s. Even adding mobile interfaces doesn’t address customer digital demands. Instead, digital represents an end-to-end rethink of what it means to put an application into the hands of customers.

Today’s modern digital application typically includes multiple third-party applications, from the widgets, plugins, and tags that all modern enterprise web pages include, to the diversity of third-party SaaS cloud apps that support the fabric of modern IT.

With this dynamic complexity of today’s applications, the boundaries of the cloud itself are becoming unclear. Code may change at any time. And there is no central, automated command and control that encompasses the full breadth of such applications.

Instead, management of modern cloud-based, digital applications involves a never-ending, adaptive approach to management that maintains the performance and security of these complex enterprise applications.

Without such proactive, adaptive management, the customer experience will suffer – and with it the bottom line. Furthermore, security and compliance breaches become increasingly likely as the complexity of these applications grows.

It’s easy to spot the irony here. The cloud promised greater automation of the operational environment, and with increased automation we expected simpler management. But instead, complexity exploded, thus leading to the need for more sophisticated, adaptive management. But in the end, we’re able to deliver greater customer value – as long as we properly manage today’s end-to-end, cloud-centric digital applications.

-Jason Bloomberg, President, Intellyx

Copyright © Intellyx LLC. 2nd Watch is an Intellyx client. Intellyx retains final editorial control of this article.
