
Azure Cloud Shell is a Hidden Gem

The simple way to describe Azure Cloud Shell is an on-demand Linux VM with a managed toolset that is accessible from virtually anywhere. You can access it via the Azure Portal, shell.azure.com, the Azure Mobile App, and Visual Studio Code. Pricing is simple: you only pay for the storage used to persist your files between Cloud Shell sessions. Finally, Cloud Shell offers two shell experiences – Bash and PowerShell – however, you can access PowerShell from Bash and Bash from PowerShell, so just choose whichever you are most comfortable with. 

Cloud Shell contains the following tools: 

  • Linux Tools – bash, zsh, sh, tmux, dig
  • Azure Tools – Azure CLI, AzCopy, Service Fabric CLI
  • Programming Languages – .NET Core, Go, Java, Node.js, PowerShell, Python
  • Editors – vim, nano, emacs, code
  • Source Control – git
  • Build Tools – make, maven, npm, pip
  • Containers – Docker CLI / Docker Machine, kubectl, Helm, DC/OS CLI
  • Databases – MySQL client, PostgreSQL client, sqlcmd utility, mssql-scripter
  • Other – IPython client, Cloud Foundry CLI, Terraform, Ansible, Chef InSpec

You are probably thinking to yourself, that’s great, but what can I use it for? Good question… 

Got a bunch of Azure management scripts that you have developed and need to be able to run? Cloud Shell is a great way to run and manage those scripts. You can leverage git for version control and run PowerShell, Bash, or Python scripts whenever and wherever you are. For example, you are grabbing some lunch and the boss sends you an email asking how many VMs are currently running in your environment and wants the answer right now. Being that this isn’t the first time that the boss has asked this question, you have already created a script that will send a report with how many VMs are currently running. So, you load the Azure Mobile App on your phone, connect to Cloud Shell to run the script and get back to your lunch without having to run back to the office. 
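
If your script is Bash-based, the heart of that report could be as small as the one-liner below (a sketch – adjust the query and output format to suit your own reporting script):

# Count the VMs currently in the "VM running" power state across the subscription
az vm list -d --query "[?powerState=='VM running'].name" -o tsv | wc -l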

Are you an Azure CLI master? Cloud Shell has you covered! Cloud Shell always has the latest version of the Azure CLI without you ever having to maintain a VM or update your local installation. 

Need to deploy an agent to a bunch of VMs but don’t want to manage a Configuration Management tool? Once again, Cloud Shell has you covered. Use the built-in Ansible to run a playbook that deploys the agent you need installed. 
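
Since Ansible comes pre-installed in Cloud Shell, a rough sketch of that workflow might be as simple as the following (the inventory and playbook names are hypothetical):

# Verify connectivity to the target hosts, then roll out the agent
ansible all -i inventory -m ping
ansible-playbook -i inventory deploy-agent.yml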

Do you run a multi-cloud shop? Need to deploy things to both Azure and AWS? Then you are in luck! With Cloud Shell you can use Terraform to deploy both Azure and AWS resources. Another multi-cloud idea would be to install the AWSPowerShell.NetCore PowerShell module to be able to perform day-to-day tasks and automation of AWS. 
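
As a rough illustration of the multi-cloud angle, a single Terraform configuration run from Cloud Shell can declare providers and resources for both clouds. The resource names and regions below are made up, and authentication is assumed to come from your signed-in Azure CLI session and standard AWS credentials:

provider "azurerm" {
  features {}   # required by newer versions of the azurerm provider
}

provider "aws" {
  region = "us-west-2"
}

# A single terraform apply creates resources on both platforms
resource "azurerm_resource_group" "demo" {
  name     = "rg-multicloud-demo"
  location = "West US 2"
}

resource "aws_s3_bucket" "demo" {
  bucket = "multicloud-demo-artifacts"
}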

There are some limitations of Cloud Shell, such as your Cloud Shell session being temporary: it is recycled after 20 minutes of inactivity.

The pricing for Azure Cloud Shell is great. Like I mentioned before, you only pay for storage. Storage is used to persist data between instances of Cloud Shell. If you install a PowerShell module or use git to clone a repo, the next time you fire up Cloud Shell, those files are still there. 

Azure Cloud Shell can help with a lot of different use cases and requires very little management. For more information on Azure Cloud Shell visit https://docs.microsoft.com/en-us/azure/cloud-shell/overview or for help getting started with Azure, contact us. 

-Russell Slater, Senior Cloud Consultant


ICYMI: 6 Key Announcements from VMworld for VMware on AWS

It’s been about one month since VMworld 2018, and the focus was heavy on VMware on AWS.  Let’s review 6 of the major announcements around the offering and what’s coming next.

  1. NSX Upgrades

If you’re familiar with NSX, we’re looking at the upgrade of NSX-V to NSX-T inside of the VMware on AWS environment.  This is going to open a lot of new functionality for users as it’s a “cloud-ready” version of the product.  We saw this with the announcement of NSX micro-segmentation and security upgrades (Distributed Firewall) and with the changes to the Direct Connect to allow NSX to pass both management and compute traffic across the private link.  We’re excited to see the NSX-T load balancing options on the roadmap and look forward to testing those out.

  2. Node Counts and Discounts

The minimum number of nodes to run in the SDDC was reduced from 4 to 3, effectively reducing the price to get in the door by 25%.  They further offered to only charge you for 2 of the 3 nodes for 90 days.  This effectively gets you down to half price.  For clients looking to use SDDC for smaller datacenters or as a pilot light to DR, this is very good news.  But let’s be honest, 4 nodes were still cheaper than your physical DR datacenter.  Note that a two-host SDDC cluster is on the roadmap, so look for that entry price point to be even cheaper.

  3. New Instance and Storage Options

VMware on AWS now has the option to choose the R5.metal instance type instead of the i3.metal instance type. With this instance type there are a number of important changes. First, the hosts are 50% bigger than the i3 instance type. Secondly, you can only get EBS-based storage, which comes in sizes between 15TB and 35TB (in 5TB increments). These EBS disks for the R5 are accessed over iSCSI network paths, as opposed to the directly attached storage of the i3. There may be cases where performance dictates one or the other. We hope to see more instance types in the future and, on the storage front, are excited about shared disks on the roadmap so we can run our classic active/passive SQL clusters and cut our SQL licensing bill in half.

  4. Speaking of Licensing, Custom Core Counts

When enterprise software is licensed by core (*cough* Oracle *cough*), having the flexibility to choose or limit core counts can save a lot of money.

  5. HCX or should I say “NSX Hybrid Connect”?

HCX got a lot of love and a rebranding. With the new VMware Cloud Motion with vSphere Replication feature for HCX, you can reliably live migrate thousands of VMs. Basically, you schedule your migration, and the data is pre-migrated and ready for the final move when you are. VMware HCX was rebranded as NSX Hybrid Connect.

  6. Sydney

VMware announced its new region in APAC and is continuing to push for new regions on an aggressive release cycle. The next regions on the roadmap are Tokyo, Ohio, Northern California, and Dublin. We hope Tokyo arrives soon so that APAC gains a pairing for regional active/passive failover strategies.

That’s our list of 6 major VMware on AWS announcements from VMworld and a review of the roadmap for features coming down the pipe. If you’re interested in learning more about VMware on AWS, contact us.

-Coin Graham, Principal Consultant


Why VMware Cloud on AWS?

By now you’ve likely heard of VMware Cloud on AWS, either from the first announcement of the offering or, more recently, as activity in the space has been heating up since the product reached a state of maturity.  On-premises, we loved what VMware could do for us in terms of ease of management and full utilization of hardware resources.  However, in the cloud the push for native services is ever present, and many first reactions to VMC are “Why would you do that?”  This is certainly the elephant in the room whenever the topic arises.  Previous experience with manually deployed VMware in the AWS cloud required nested virtualization and nearly the same care and feeding as on-premises, which further adds to that initial reaction.  Common sense would dictate, however, that if the two 800-pound gorillas come together, they may be able to take on that elephant!  As features have been added to the product and customer feedback implemented, it has become more and more compelling for the enormous installed base of VMware to take advantage of the offering.

What are the best features of VMware Cloud on AWS?

Some of the most attractive features of the cloud are the managed services, which reduce the administrative overhead normally required to maintain reliable and secure operations.  Let’s say you want to use SQL Server in AWS.  Moving to the RDS service, where there is no maintenance, configuration or patching of the underlying server, is an easy decision.  After some time, the thought of configuring a server and installing/maintaining an RDBMS seems archaic and troublesome. You can now have your DBA focus on the business value that the database provides.  VMware Cloud on AWS is no different.  The underlying software and physical hardware are no longer a concern.  One can always be on the optimum version of the platform with no effort, and additional hardware can be added to a cluster at the press of a button.

So, what software/service helps manage and control the entirety of your IT estate?

There are many third-party software solutions, managed service providers, and up and coming native services like Simple Systems Manager.  Now imagine a cloud based managed service that works for on-premises and cloud resources, and has an existing, mature ecosystem where nearly everyone in Enterprise IT has basic to advanced knowledge.  Sounds attractive, doesn’t it?  That is the idea behind VMware Cloud on AWS.

The architecture of VMC is based on dedicated bare metal systems that are physically located in AWS datacenters.  VMware Cloud on AWS Software Defined Datacenters (SDDCs) are deployed with a fully configured vSAN running on NVMe flash storage local to the cluster, which currently can expand up to 32 nodes.  You are free to provision the hosts any way you see fit.  This arrangement also allows full access to AWS services and keeps resources in the same low-latency network.  There is also a connector between the customer’s AWS account and the VMC SDDC, allowing direct low-latency access to existing AWS resources in a client VPC.  For management, hybrid linked mode gives a single logical view spanning both on-premises and VMC vCenter servers.  This allows control of the complete hybrid environment with vCenter and the familiar web console.

Figure 1.  VMware Cloud on AWS Overview

Below are some selected capabilities, benefits, and general information on the VMware Cloud on AWS:

  • There is no immediate requirement for refactoring of existing applications, but access to AWS services allows for future modernization.
  • Very little retraining of personnel is required. Existing scripts, tools and workflows are reusable.
  • Easy expansion of resource footprint without deploying more physical infrastructure.
  • Easy migration of VMs across specific geographies or between cloud/premises for compliance and latency reasons.
  • VMware native resiliency and availability features are fully supported: including DRS for workload distribution, shared storage for clustered application support, and automatic VM restart after node failure.
  • DR as a service with Site Recovery is supported, including the creation of stretched clusters. This can provide zero RPO between AZs within an AWS Region.  This service takes advantage of the AWS infrastructure, which is already designed with high availability in mind.
  • VMware Horizon 7 is fully supported. This can extend on-premises desktop services without buying additional hardware and enables placement of virtual desktops near latency-sensitive applications in the cloud.
  • The service has GDPR, HIPAA, ISO, and SOC attestations to enable the creation of compliant solutions.
  • Region expansion is underway and two new regions have recently come online in Europe.
  • Discounts are available based on existing product consumption and licensing.
  • Integration with CloudFormation for automated deployment is available.

Figure 2:  VMware Cloud on AWS Target use cases

So, for those currently using VMware and considering a move to the cloud and/or a hybrid architecture, VMware Cloud on AWS offers the most straightforward gateway into this space.  The service brings the hundreds of services in the AWS ecosystem into play, along with a consistent operational model and the ability to retain familiar VMware tools, policies, management practices, and investments in third-party tooling.  So instead of planning and executing your next hardware refresh and VMware version upgrade, consider migrating to VMware Cloud on AWS!

For help getting started migrating to VMware Cloud on AWS, contact us.

-Eric Deehr, Cloud Solutions Architect & Technical Product Manager


Cloud Autonomics and Automated Management and Optimization

Autonomic systems are an exciting new arena within cloud computing, although the underlying technology is not new by any means. Automation, orchestration and optimization have been alive and well in the datacenter for almost a decade now. Companies like Microsoft with System Center, IBM with Tivoli, and ServiceNow are just a few examples of platforms that harness the ability to collect, analyze and make decisions on how to act on sensor data derived from physical/virtual infrastructure and appliances.

Autonomic cloud capabilities are lighting up quickly across the cloud ecosystem. These systems can monitor infrastructure, services and systems and make decisions to support remediation and healing, failover and failback, and snapshot and recovery. These abilities come from workflow creation and runbook/playbook development, which support a broad range of insight-driven action and corrective policy enforcement.

In the compliance space, we are seeing many great companies come into the mix to bring autonomic-type functionality to the world of security and compliance.

Evident is a great example of a technology that functions with autonomic-type capabilities. The product can do some amazing things in terms of automation and action. It provides visibility across the entire cloud platform and identifies and manages risk associated with the operation of core cloud infrastructure and applications within an organization.

Using signatures and control insight as well as custom-defined controls, it can determine exploitable and vulnerable systems at scale and report the current state of risk within an organization. That, on face value, is not autonomic; however, the next phase it performs is critical to why it is a great example of autonomics in action.

After analyzing the current state of the vulnerability and risk landscape, it reports the current risk and vulnerability state and derives a set of guided remediations that can either be performed manually against the infrastructure in question or automated, ensuring a proactive, hands-off response so that vulnerabilities are addressed and security compliance is always maintained.

Moving away from Evident, the focus going forward is a marriage of many things to increase systems capabilities and enhance autonomic cloud operations. Operations management systems in the cloud will light up advanced artificial intelligence and machine learning-based capabilities, which will take in large amounts of sensor data across many cloud-based technologies and services and derive analysis, insight and proactive remediation – not just for security compliance, but across the board in terms of cloud stabilization, core operations and optimization.

CloudHealth Technologies and many others in the cloud management platform space are looking deeply into how to turn the sensor data they gather into core cloud optimization via automation.

AIOps is a term growing year over year, and it fits well to describe how autonomic systems have evolved from the datacenter to the cloud. Gartner is looking deeply into this space, and we at 2nd Watch see promising advancement coming from companies like Palo Alto Networks with their native security platform capabilities along with Evident for continuous compliance and security.

MoogSoft is bringing a next-generation platform for IT incident management to the cloud, and its artificial intelligence capabilities for IT operations are helping DevOps teams operate smarter, faster and more effectively by automating traditional IT operations tasks and freeing up IT engineers to work on the important business-level needs of the organization rather than day-to-day IT operations. By providing intelligence in the response to systems issues and challenges, IT operations teams can become more agile and more capable of solving mission-critical problems and maintaining a proactive, highly optimized enterprise cloud.

As we move forward, expect to see more and more AI and ML-based functionality move into the core cloud management platforms. Cloud ISVs will leverage more and more sensor data to determine response, action and resolution, and this will become tightly coupled to the virtual machine topology and the cloud-native services underlying all cloud providers.

It is an exciting time for autonomic systems capabilities in the cloud, and we are excited to help customers realize the many potential capabilities and benefits which can help automate, orchestrate and proactively maintain and optimize your core cloud infrastructure.

To learn more about autonomic systems and capabilities, check out Gartner’s AIOps research and reach out to 2nd Watch. We would love to help you realize the potential of these technologies in your cloud environment today!

-Peter Meister, Sr Director of Product Management


Continuous Compliance – Automatically Detect and Report Vulnerabilities in your Cloud Enterprise

Customers are wrestling with many challenges in managing security at scale across the enterprise. As customers embrace more and more cloud capabilities across more providers, it becomes daunting to manage compliance.

The landscape of tools and providers is endless, and customers are utilizing a mix of traditional enterprise tools from the past along with cloud tools to try to achieve security baselines within their enterprise.

At 2nd Watch we have a strong partnership with Palo Alto Networks, which provides truly enterprise-grade security to our customers across a very diverse enterprise landscape – datacenter, private cloud, public cloud and hybrid – across AWS, Azure and Google Cloud Platform.

Palo Alto Networks recently acquired a brilliant company – Evident.io. Evident.io is well known for providing monitoring, compliance and security posture management to organizations across the globe. Evident.io provides continuous compliance across AWS and Azure and brings strong compliance vehicles around HIPAA, ISO 27001, NIST 800-53, NIST 800-171, PCI and SOC 2.

The key to continuous compliance lies in the ability to centralize monitoring and reporting as well as insight into one console dashboard where you can see, in real time, the core health and state of your cloud enterprise.

This starts with gaining core knowledge of your environment’s current health state. You must audit, assess and report on where you currently stand in terms of overall health. Knowing the current state allows you to see the areas you need to correct and also opens insight into compliance challenges. Evident.io automates this process, providing automated, continuous visibility and control of infrastructure security while allowing for customized workflow and orchestration, so clients can tune the solution to fit specific organizational needs and requirements easily and effectively.

After achieving core insight into the current state of compliance, you must then work on ways to remediate and efficiently maintain compliance moving forward. Evident.io provides a rich set of real-time alerting and workflow functionality that allows clients to achieve automated alerting, automated remediation and automated enforcement. Evident.io employs continuous security monitoring and stores the collected data in the Evident Security Platform, which allows our clients to eliminate manual review and build rich reporting and insight into current and future state. Evident.io ships a rich set of reporting capabilities out of the box, across a broad range of compliance areas, which helps organizations report on compliance quickly, address existing gaps, and reduce and mitigate risk moving forward.

Evident.io works through APIs on AWS and Azure in a read-only posture. This provides a non-intrusive and effective approach to core system and resource insight without the burden of heavy agent deployment and configuration. The Evident Security Platform acquires this data securely through those APIs and analyzes it against core compliance baselines and security best practices to ensure gaps in enterprise security are corrected and risk is reduced.

Continuous compliance requires continuous delivery. As clients embrace the cloud and the capabilities cloud providers offer, it becomes more important than ever to institute solutions that help us manage continuous software utilization and delivery. The speed of the cloud requires a new approach to core security and compliance, one that provides automation, orchestration and rich reporting to reduce the overall day-to-day burden of managing toward compliance at scale in your cloud enterprise.

If you are not familiar with Evident.io, check them out at http://evident.io, and reach out to us at 2nd Watch for help realizing your potential for continuous compliance in your organization.

-Peter Meister, Sr Director of Product Management


Logging and Monitoring in the era of Serverless – Part 1

Figuring out monitoring in a holistic sense is still a challenge for many companies, whether with conventional infrastructure or newer platforms like serverless and containers.

For most applications, there are two aspects to monitoring:

  • System metrics, such as errors, invocations, latency, and memory and CPU usage
  • Business analytics, such as number of signups, number of emails sent, transactions processed, etc.

The former is fairly universal and generally applicable in any stack to a varying degree. This is what I would call the undifferentiated aspect of monitoring an application. The abilities to perform error detection and track performance metrics are absolutely necessary to operate an application.

Everything that is old is new again. I am a huge fan of the Twelve-Factor App. If you aren’t familiar, I highly suggest taking a look at it. Drafted in 2011 by developers at Heroku, the Twelve-Factor App is a methodology and set of best practices designed to enable applications to be built with portability and resiliency when deployed to the web.

In the Twelve-Factor App manifesto, it is stated that applications should produce “logs as event streams” and leave it up to the execution environment to aggregate them. If we are to gather information from our application, why not make that present in the logs? We can use our event stream (i.e. application log) to create time-series metrics. Time-series metrics are just datapoints that have been sampled and aggregated over time, which enable developers and engineers to track performance. They allow us to make correlations with events at a specific time.

AWS Lambda works almost exactly in this way by default, aggregating its logs via AWS CloudWatch. CloudWatch organizes logs based on function, version, and containers while Lambda adds metadata for each invocation. And it is up to the developer to add application-specific logging to their function. CloudWatch, however, will only get you so far. If we want to track more information than just invocation, latency, or memory utilization, we need to analyze the logs deeper. This is where something like Splunk, Kibana, or other tools come into play.

In order to get to the meat of our application and the value it is delivering we need to ensure that we have additional information (telemetry) going to the logs as well:

  • Timeouts
  • Configuration failures
  • Stack traces
  • Event objects

Logging out these types of events or information will enable those other tools with rich query languages to create a dashboard with just about anything we want on them.

For instance, let’s say we added the following line of code to our application to track an event that was happening from a specific invocation and pull out additional information about execution:

log.Println(fmt.Sprintf("-metrics.%s.blob.%s", environment, method))

In a system that tracks time-series metrics in logs (e.g. SumoLogic), we could build a query like this:

"-metrics.prod.blob." | parse "-metrics.prod.blob.*" as method | timeslice 5m | count(method) group by _timeslice, method | transpose row _timeslice column method

This would give us a nice breakdown of the different methods used in a CRUD or RESTful service and can then be visualized in the very same tool.

While visualization is nice, particularly when taking a closer look at a problem, it might not be immediately apparent where there is a problem. For that we need some way to grab the attention of the engineers or developers working on the application. Many of the tools mentioned here support some level of monitoring and alerting.

In the next installment of this series we will talk about increasing visibility into your operations and battling dashboard apathy! Check back next week.

-Lars Cromley, Director, Cloud Advocacy and Innovation


10 Ways Migrating to the Cloud Can Improve Your IT Organization

While at 2nd Watch, I’ve had the opportunity to work with a plethora of CIOs on their journey to the cloud.  Some focused on application-specific migrations, while others focused on building a foundation. Regardless of where they started, their journey began out of a need for greater agility, flexibility, extensibility and standardization.

Moving to the cloud not only provides you with agility, flexibility and extensibility – it actually improves your IT organization. How? In this post I will outline 10 ways migrating to the cloud will improve your IT organization.

  1. CI/CD: IT organizations require speed and agility when responding to development and infrastructure requests. Today’s development processes encourage continuous integration, summarized as continuously releasing code utilizing release automation. Using these processes, an IT organization is able to continually produce minimally viable products – faster.
  2. Organizational Streamlining: In order to implement continuous integration, an organization’s processes must be connected and streamlined – from resource provisioning to coding productivity. Moving to the cloud enables the IT organization to create sustainable processes: processes that track requests for resources and the provisioning of those resources, streamline communication, and facilitate business unit chargeback, in addition to the general benefit of working more efficiently. For example, the provisioning process of one customer took 15 days – from requirement gathering to approval to finally provisioning resources. By working with the 2nd Watch team, we were able to automate the entire provisioning process, including several approval gates. The new automated process now deploys the requested systems in minutes compared to days.
  3. Work More Efficiently: Moving to the cloud returns the IT organization’s focus to where it belongs so your team can focus on the jobs they were hired to do. No longer focused on resource provisioning, patching and configuration, they are now working on the core functions of their role, such as aligning new IT service offerings to business needs.
  4. New Capabilities: IT organizations can focus on developing new capabilities and capitalize on new opportunities for the business. More importantly, IT departments can focus on projects that more closely align to business strategy.
  5. An Actual Dev/Test: Organizations can now create true Dev/Test environments in the cloud that enable self-service provisioning and de-provisioning of testing servers with significantly lower cost and overhead. Something that was previously expensive, inefficient and hard to maintain on-prem can now be deployed in a way that is easy, flexible and cost efficient.
  6. Dedicated CIO Leadership: Moving to and operating within a cloud-based environment requires strong IT leadership. The CIO is now more easily able to focus on key strategic initiatives that deliver value to the business. With fewer distractions, the CIO can define and drive the overall strategy and planning of the organization – IT policy, capacity planning, compliance and security – and lead the charge on innovation when working with the business.
  7. Foster Stronger IT and Business Relationships: Moving to the cloud creates stronger relationships between IT and the business. No longer is IT relegated to just determining requirements, selecting services and implementing the chosen solution. IT can now participate in collaborative discussions with the business to help define what is possible. Moving to the cloud fosters collaboration between IT and business leaders to promote a cohesive and inclusive cloud strategy that meets IT’s governance requirements while also enabling the agility the business needs to stay competitive.
  8. Creation of a CCoE: Migrating to the cloud offers the IT organization an opportunity to create a Cloud Center of Excellence (CCoE). Ideally, the CCoE should be designed as a custom, turn-key operation staffed with your enterprise’s existing IT engineers as its core level of expertise. It is a team dedicated to creating, evangelizing, and institutionalizing best practices, frameworks, and governance for evolving technology operations, which are increasingly implemented using the cloud. The CCoE develops a point of view for how cloud technology is implemented at scale for an organization. Moreover, a CCoE can help break down silos and create a single-pane-of-glass view of cloud technology, from creating a standard for machine images and infrastructure builds to managing cloud costs.
  9. New Training Opportunities: Evolving the technical breadth already present in the organization and working through the cultural changes required to bring the skeptics along is a great opportunity to bring your team closer together while simultaneously expanding your capabilities. The more knowledge your teams have on cloud technologies, the smoother the transition will be for the organization. As a result, you will develop more internal evangelists and ease the fear, uncertainty and doubt often felt by IT professionals when making the transition to the cloud. The importance of investing in training and growth of employees cannot be stressed enough as, based on our experience, there is a strong correlation between investments in training and successful moves to the cloud. Continued education is part of the “Cloud Way” that pays off while preserving much of the tribal knowledge that exists within your organization.
  10. Flexibility, Elasticity and Functionality: Cloud computing allows your IT organization to adapt more quickly with flexibility that is not available when working with on-prem solutions. Moving to a cloud platform enables quick response to internal capacity demands. No more over-provisioning! With cloud computing, you can pay as you go – spin up what you need when you need it, and spin it down when demand drops.

As a whole, IT organizations need to be prepared to set aside the old and welcome new approaches to delivering cloud services. The journey to the cloud not only brings efficiencies but also fosters more collaboration within your organization and enhances your IT organization to becoming a well-oiled machine that develops best practices and quickly responds to your business cloud needs. Ready to get started on your cloud journey? Contact us to get started with a Cloud Readiness Assessment.

-Yvette Schmitter, Senior Manager

 


2nd Watch Earns AWS Certification Distinction for Achieving 200 Certifications

Today we are excited to announce we have earned an AWS Certification Distinction for achieving more than 200 active AWS Certifications! We have invested significant time and resources in education to validate the technical expertise of our staff with AWS certifications, enabling our cloud experts to better serve our clients across the US. This distinction assures our customers that they are working with a well-qualified AWS partner, since AWS Certifications recognize IT professionals with the technical skills and expertise to design, deploy, and operate applications and infrastructure on AWS.

The value 2nd Watch gains from achieving AWS Certifications goes beyond passing an exam and is passed on to our customers: our technical teams have nearly a decade of hands-on, practical experience, surpassing many providers. This designation is more than just an AWS Certification in that it highlights our company’s culture and commitment to staffing every project with highly qualified individuals, giving customers a high-quality, consistent experience in every engagement.

“Being an APN Partner provides us with the right resources, accreditation and training we need to serve our customers, and these certifications validate our AWS expertise and customer obsession,” says 2nd Watch co-founder and EVP of Marketing and Business Development, Jeff Aden. “AWS Certifications benefit our team and extend to our customers in a high-quality experience. We’ll continue to invest in our team’s education and training, to ensure we’re at the forefront of AWS’ innovation.”

AWS Certifications are recognized industry-wide as a credential that demonstrates expertise in AWS cloud infrastructure, and they have been listed among the top 10 cloud certifications for partners. Historically, AWS Certifications have been an individual achievement, but APN Certification Distinctions now showcase APN Partners that have achieved 50 or more AWS Certifications.

-Nicole Maus, Marketing Manager


Fully Coded And Automated CI/CD Pipelines: The Weeds

The Why

In my last post we went over why we’d want to go the CI/CD/automated route and the more cultural reasons why it is so beneficial. In this post, we’re going to delve a little bit deeper and examine the technical side of tooling. Remember, a primary point of doing a release is mitigating risk. CI/CD is all about mitigating risk… fast.

There’s a Process

The previous article noted that you can’t do CI/CD without building on a set of steps, and I’m going to take this approach here as well. Unsurprisingly, we’ll follow the steps we laid out in the “Why” article, and tackle each in turn.

Step I: Automated Testing

You must automate your testing. There is no other way to describe this. In this particular step, however, we can concentrate on unit testing: testing the small chunks of code you produce (usually functions or methods). There’s some chatter about TDD (Test Driven Development) vs BDD (Behavior Driven Development) in the development community, but I don’t think it really matters, just so long as you are writing test code alongside your production code. On our team, we prefer the BDD-style testing paradigm. I’ve always liked the semantically descriptive nature of BDD tests over strictly code-driven ones. However, it should be said that both are effective, and either is better than none, so this is more of a personal preference. On our team we’ve been coding in golang, and our BDD framework of choice is the Ginkgo/Gomega combo.

Here’s a snippet of one of our tests that’s not entirely simple:

Describe("IsValidFormat", func() {
  for _, check := range AvailableFormats {
    Context("when checking "+check, func() {
      It("should return true", func() {
        Ω(IsValidFormat(check)).To(BeTrue())
      })
    })
  }
 
  Context("when checking foo", func() {
    It("should return false", func() {
      Ω(IsValidFormat("foo")).To(BeFalse())
    })
  })
})

So as you can see, the Ginkgo (i.e. BDD) formatting is pretty descriptive about what’s happening. I can instantly understand what’s expected: the function IsValidFormat should return true for each format in the range (list) of AvailableFormats, and a format of foo (which is not a valid format) should return false. It’s both tested and understandable to the future change agent (me or someone else).

Step II: Continuous Integration

Continuous Integration takes Step 1 further, in that it brings all the changes to your codebase to a singular point and builds an artifact for deployment. This means you’ll need an external system to automatically handle merges and pushes. We use Jenkins as our automation server, running it in Kubernetes using the Pipeline style of job description. I’ll get into the way we do our builds using Make in a bit, but the fact that we can include our build code with our projects is a huge win.

Here’s a (modified) Jenkinsfile we use for one of our CI jobs:

def notifyFailed() {
  slackSend (color: '#FF0000', message: "FAILED: '${env.JOB_NAME} [${env.BUILD_NUMBER}]' (${env.BUILD_URL})")
}
 
podTemplate(
  label: 'fooProject-build',
  containers: [
    containerTemplate(
      name: 'jnlp',
      image: 'some.link.to.a.container:latest',
      args: '${computer.jnlpmac} ${computer.name}',
      alwaysPullImage: true,
    ),
    containerTemplate(
      name: 'image-builder',
      image: 'some.link.to.another.container:latest',
      ttyEnabled: true,
      alwaysPullImage: true,
      command: 'cat'
    ),
  ],
  volumes: [
    hostPathVolume(
      hostPath: '/var/run/docker.sock',
      mountPath: '/var/run/docker.sock'
    ),
    hostPathVolume(
      hostPath: '/home/jenkins/workspace/fooProject',
      mountPath: '/home/jenkins/workspace/fooProject'
    ),
    secretVolume(
      secretName: 'jenkins-creds-for-aws',
      mountPath: '/home/jenkins/.aws-jenkins'
    ),
    hostPathVolume(
      hostPath: '/home/jenkins/.aws',
      mountPath: '/home/jenkins/.aws'
    )
  ]
)
{
  node ('fooProject-build') {
    try {
      checkout scm
 
      wrap([$class: 'AnsiColorBuildWrapper', 'colorMapName': 'XTerm']) {
        container('image-builder'){
          stage('Prep') {
            sh '''
              cp /home/jenkins/.aws-jenkins/config /home/jenkins/.aws/.
              cp /home/jenkins/.aws-jenkins/credentials /home/jenkins/.aws/.
              make get_images
            '''
          }
 
          stage('Unit Test'){
            sh '''
              make test
              make profile
            '''
          }
 
          step([
            $class:              'CoberturaPublisher',
            autoUpdateHealth:    false,
            autoUpdateStability: false,
            coberturaReportFile: 'report.xml',
            failUnhealthy:       false,
            failUnstable:        false,
            maxNumberOfBuilds:   0,
            sourceEncoding:      'ASCII',
            zoomCoverageChart:   false
          ])
 
          stage('Build and Push Container'){
            sh '''
              make push
            '''
          }
        }
      }
 
      stage('Integration'){
        container('image-builder') {
          sh '''
            make deploy_integration
            make toggle_integration_service
          '''
        }
        try {
          wrap([$class: 'AnsiColorBuildWrapper', 'colorMapName': 'XTerm']) {
            container('image-builder') {
              sh '''
                sleep 45
                export KUBE_INTEGRATION=https://fooProject-integration
                export SKIP_TEST_SERVER=true
                make integration
              '''
            }
          }
        } catch(e) {
          container('image-builder'){
            sh '''
              make clean
            '''
          }
          throw(e)
        }
      }
 
      stage('Deploy to Production'){
        container('image-builder') {
          sh '''
            make clean
            make deploy_dev
          '''
        }
      }
    } catch(e) {
      container('image-builder'){
        sh '''
          make clean
        '''
      }
      currentBuild.result = 'FAILED'
      notifyFailed()
      throw(e)
    }
  }
}

There’s a lot going on here, but the important part to notice is that I grabbed this from the project repo. The build instructions are included with the project itself. It’s creating an artifact, running our tests, etc. But it’s all part of our project code base. It’s checked into git. It’s code like all the other code we mess with. The individual steps are somewhat inconsequential at this level of discussion, but it works. We also have it set up to run when there’s a push to GitHub (AND nightly). This ensures that we are continuously running this build and integrating everything that’s happened to the repo in a day. It helps us keep on top of all the possible changes to the repo as well as our environment.

Hey… what’s all that make crap?

Make

Our team uses a lot of tools. We subscribe to the maxim: use what’s best for the particular situation. I can’t remember every tool we use. Neither can my teammates. Neither can 90% of the people that “do the devops.” I’ve heard a lot of folks say, “No! We must solidify on our toolset!” Let your teams use what they need to get the job done the right way. Now, the fear of experiencing tool “overload” seems like a legitimate one in this scenario, but the problem isn’t the number of tools… it’s how you manage and use them.

Enter Makefiles! (aka: make)

Make has been a mainstay in the UNIX world for a long time (especially in the C world). It is a build tool that’s utilized to help satisfy dependencies, create system-specific configurations, and compile code from various sources independent of platform. This is fantastic, except, we couldn’t care less about that in the context of our CI/CD Pipelines. We use it because it’s great at running “buildy” commands.

Make is our unifier. It links our Jenkins CI/CD build functionality with our dev functionality. Specifically, opening up the Docker socket here in the Jenkinsfile:

volumes: [
  hostPathVolume(
    hostPath: '/var/run/docker.sock',
    mountPath: '/var/run/docker.sock'
  ),

…allows us to run THE SAME COMMANDS WHEN WE’RE DEVELOPING AS WE DO IN OUR CI/CD PROCESS. This socket allows us to run containers from containers, and since Jenkins is running on a container, this allows us to run our toolset containers in Jenkins, using the same commands we’d use in our local dev environment. On our local dev machines, we use docker nearly exclusively as a wrapper to our tools. This ensures we have library, version, and platform consistency on all of our dev environments as well as our build system. We use containers for our prod microservices so production is part of that “chain of consistency” as well. It ensures that we see consistent behavior across the horizon of application development through production. It’s a beautiful thing! We use the Makefile as the means to consistently interface with the docker “tool” across differing environments.

Ok, I know your interest is piqued at this point. (Or at least I really hope it is!)
So here’s a generic makefile we use for many of our projects:

CONTAINER=$(shell basename $$PWD | sed -E 's/^ia-image-//')
.PHONY: install install_exe install_test_exe deploy test
 
install:
    docker pull sweet.path.to.a.repo/$(CONTAINER)
    docker tag sweet.path.to.a.repo/$(CONTAINER):latest $(CONTAINER):latest
 
install_exe:
    if [[ ! -d $(HOME)/bin ]]; then mkdir -p $(HOME)/bin; fi
    echo "docker run -itP -v \$$PWD:/root $(CONTAINER) \"\$$@\"" > $(HOME)/bin/$(CONTAINER)
    chmod u+x $(HOME)/bin/$(CONTAINER)
 
install_test_exe:
    if [[ ! -d $(HOME)/bin ]]; then mkdir -p $(HOME)/bin; fi
    echo "docker run -itP -v \$$PWD:/root $(CONTAINER)-test \"\$$@\"" > $(HOME)/bin/$(CONTAINER)
    chmod u+x $(HOME)/bin/$(CONTAINER)
 
test:
    docker build -t $(CONTAINER)-test .
 
deploy:
    captain push

This is a Makefile we use to build our tooling images. It’s much simpler than our project Makefiles, but I think this illustrates how you can use Make to wrap EVERYTHING you use in your development workflow. This also allows us to settle on similar/consistent terminology between different projects. %> make test? That’ll run the tests regardless of whether we are working on a golang project or a python lambda project or, in this case, building a test container and tagging it as whatever-test. Make unifies “all the things.”

This also codifies how to execute the commands, i.e. what arguments to pass, what inputs to provide, etc. If I can’t even remember the name of the command, I’m not going to remember the arguments. To remedy that, I just open up the Makefile and can instantly see.

Step III: Continuous Deployment

After the last post (you read it, right?), some might have noticed that I skipped the “Delivery” portion of the “CD” pipeline. As far as I’m concerned, there is no “Delivery” in a “Deployment” pipeline. The “Delivery” is the actual deployment of your artifact. Since the ultimate goal should be Deployment, I’ve just skipped over that intermediate step.

Okay, sure, if you want to hold off on deploying automatically to Prod, then have that gate. But Dev, Int, QA, etc? Deployment to those non-prod environments should be automated just like the rest of your code.

If you guessed we use make to deploy our code, you’d be right! We put all our deployment code with the project itself, just like the rest of the code concerning that particular object. For services, we use a Dockerfile that describes the service container and several yaml files (e.g. deployment_<env>.yaml) that describe the configurations (e.g. ingress, services, deployments) we use to configure and deploy to our Kubernetes cluster.

Here’s an example:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  labels:
    app: sweet-aws-service
    stage: dev
  name: sweet-aws-service-dev
  namespace: sweet-service-namespace
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: sweet-aws-service
      name: sweet-aws-service
    spec:
      containers:
      - name: sweet-aws-service
        image: path.to.repo.for/sweet-aws-service:latest
        imagePullPolicy: Always
        env:
          - name: PORT
            value: "50000"
          - name: TLS_KEY
            valueFrom:
              secretKeyRef:
                name: grpc-tls
                key: key
          - name: TLS_CERT
            valueFrom:
              secretKeyRef:
                name: grpc-tls
                key: cert

This is an example of a deployment into Kubernetes for dev. That %> make deploy_dev from the Jenkinsfile above? That’s pushing this to our Kubernetes cluster.
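
For reference, a target like deploy_dev often boils down to little more than the following (a simplified sketch – the real target may also handle image tags and kubectl context selection):

# Apply the dev manifest to the current Kubernetes context
kubectl apply -f deployment_dev.yaml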

Conclusion

There is a lot of information to take in here, but there are two points to really take home:

  1. It is totally possible.
  2. Use a unifying tool to… unify your tools. (“one tool to rule them all”)

For us, Point 1 is moot… it’s what we do. For Point 2, we use Make, and we use Make THROUGH THE ENTIRE PROCESS. I use Make locally in dev and on our build server. It ensures we’re using the same commands, the same containers, the same tools to do the same things. Test, integrate (test), and deploy. It’s not just about writing functional code anymore. It’s about writing a functional process to get that code, that value, to your customers!

And remember, as with anything, this stuff gets easier with practice. Once you start doing it, you will get the hang of it, and life becomes easier and better. If you’d like some help getting started, download our datasheet to learn about our Modern CI/CD Pipeline.

-Craig Monson, Sr Automation Architect

 


How We Organize Terraform Code at 2nd Watch

When IT organizations adopt infrastructure as code (IaC), the benefits in productivity, quality, and ability to function at scale are manifold. However, the first few steps on the journey to full automation and immutable infrastructure bliss can be a major disruption to a more traditional IT operations team’s established ways of working. One of the common problems faced in adopting infrastructure as code is how to structure the files within a repository in a consistent, intuitive, and scalable manner. Even IT operations teams whose members have development skills will still face this anxiety-inducing challenge, simply because adopting IaC involves new tools whose conventions differ somewhat from more familiar languages and frameworks.

In this blog post, we’ll go over how we structure our IaC repositories within 2nd Watch professional services and managed services engagements with a particular focus on Terraform, an open-source tool by Hashicorp for provisioning infrastructure across multiple cloud providers with a single interface.

First Things First: README.md and .gitignore

The first task in any new repository is to create a README file. Many git repositories (especially on GitHub) have adopted Markdown as a de facto standard format for README files. A good README file will include the following information:

  1. Overview: A brief description of the infrastructure the repo builds. A high-level diagram is often an effective method of expressing this information. 2nd Watch uses LucidChart for general diagrams (exported to PNG or a similar format) and mscgen_js for sequence diagrams.
  2. Pre-requisites: Installation instructions (or links thereto) for any software that must be installed before building or changing the code.
  3. Building The Code: What commands to run in order to build the infrastructure and/or run the tests when applicable. 2nd Watch uses Make in order to provide a single tool with a consistent interface to build all codebases, regardless of language or toolset. If using Make in Windows environments, Windows Subsystem for Linux is recommended for Windows 10 in order to avoid having to write two sets of commands in Makefiles: Bash, and PowerShell.

It’s important that you do not neglect this basic documentation for two reasons (even if you think you’re the only one who will work on the codebase):

  1. The obvious: Writing this critical information down in an easily viewable place makes it easier for other members of your organization to onboard onto your project and will prevent the need for a panicked knowledge transfer when projects change hands.
  2. The not-so-obvious: The act of writing a description of the design clarifies your intent to yourself and will result in a cleaner design and a more coherent repository.

All repositories should also include a .gitignore file with the appropriate settings for Terraform. GitHub’s default Terraform .gitignore is a decent starting point, but in most cases you will not want to ignore .tfvars files because they often contain environment-specific parameters that allow for greater code reuse as we will see later.
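
As a point of reference, a minimal Terraform .gitignore along those lines might look like this (note that *.tfvars is deliberately absent from the ignore list):

# Local Terraform working directories and state
.terraform/
*.tfstate
*.tfstate.*
crash.log

# Deliberately NOT ignored: env/*.tfvars files hold per-environment
# parameters that belong in version control.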

Terraform Roots and Multiple Environments

A Terraform root is the unit of work for a single terraform apply command. We group our infrastructure into multiple terraform roots in order to limit our “blast radius” (the amount of damage a single errant terraform apply can cause).

  • Repositories with multiple roots should contain a roots/ directory with a subdirectory for each root (e.g. VPC, one per application), each with a main.tf file as the primary entry point.
  • Note that the roots/ directory is optional for repositories that only contain a single root, e.g. infrastructure for an application team which includes only a few resources that should be deployed in concert. In this case, modules/ may be placed in the same directory as main.tf.
  • Roots which are deployed into multiple environments should include an env/ subdirectory at the same level as main.tf. Each environment corresponds to a .tfvars file under env/ named after the environment, e.g. staging.tfvars. Each .tfvars file contains parameters appropriate for its environment, e.g. EC2 instance sizes. (See the example just below this list for how these files are consumed.)
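
To make the env/ convention concrete, here is a sketch of how a root gets planned and applied for a given environment (the root and environment names are illustrative, and in practice these commands are usually wrapped in a Make target):

cd roots/app1
terraform init
terraform plan -var-file=env/staging.tfvars -out=staging.tfplan
terraform apply staging.tfplan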

Here’s what our roots directory might look like for a sample with a VPC and 2 application stacks, and 3 environments (QA, Staging, and Production):
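
(The directory and file names below are illustrative.)

roots/
  vpc/
    main.tf
    env/
      qa.tfvars
      staging.tfvars
      prod.tfvars
  app1/
    main.tf
    env/
      qa.tfvars
      staging.tfvars
      prod.tfvars
  app2/
    main.tf
    env/
      qa.tfvars
      staging.tfvars
      prod.tfvars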

Terraform modules

Terraform modules are self-contained packages of Terraform configurations that are managed as a group. Modules are used to create reusable components, improve organization, and to treat pieces of infrastructure as a black box. In short, they are the Terraform equivalent of functions or reusable code libraries.

Terraform modules come in two flavors:

  1. Internal modules, whose source code is consumed by roots that live in the same repository as the module.
  2. External modules, whose source code is consumed by roots in multiple repositories. The source code for external modules lives in its own repository, separate from any consumers and separate from other modules to ensure we can version the module correctly.

In this post, we’ll only be covering internal modules.

  • Each internal module should be placed within a subdirectory under modules/.
  • Module subdirectories/repositories should follow the standard module structure per the Terraform docs.
  • External modules should always be pinned at a version: a git revision or a version number. This practice allows for reliable and repeatable builds. Failing to pin module versions may cause a module to be updated between builds, breaking the build without any obvious changes in our code. Even worse, failing to pin our module versions might cause a plan to be generated with changes we did not anticipate.

Here’s what our modules directory might look like:
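
(The module names below are just examples; each module follows the standard module structure.)

modules/
  network/
    main.tf
    variables.tf
    outputs.tf
  ecs_service/
    main.tf
    variables.tf
    outputs.tf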

Terraform and Other Tools

Terraform is often used alongside other automation tools within the same repository. Some frequent collaborators include Ansible for configuration management and Packer for compiling identical machine images across multiple virtualization platforms or cloud providers. When using Terraform in conjunction with other tools within the same repo, 2nd Watch creates a directory per tool from the root of the repo:
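
A sketch of that layout (the tools shown are examples):

ansible/
packer/
terraform/
  modules/
  roots/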

Putting it all together

The following illustrates a sample Terraform repository structure with all of the concepts outlined above:
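
(All names below are illustrative.)

README.md
.gitignore
Makefile
ansible/
packer/
terraform/
  modules/
    network/
      main.tf
      variables.tf
      outputs.tf
  roots/
    vpc/
      main.tf
      env/
        qa.tfvars
        staging.tfvars
        prod.tfvars
    app1/
      main.tf
      env/
        qa.tfvars
        staging.tfvars
        prod.tfvars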

Conclusion

There’s no single repository format that’s optimal, but we’ve found that this standard works for the majority of our use cases in our extensive use of Terraform on dozens of projects. That said, if you find a tweak that works better for your organization – go for it! The structure described in this post will give you a solid and battle-tested starting point to keep your Terraform code organized so your team can stay productive.

Additional resources

  • The Terraform Book by James Turnbull provides an excellent introduction to Terraform all the way through repository structure and collaboration techniques.
  • The Hashicorp AWS VPC Module is one of the most popular modules in the Terraform Registry and is an excellent example of a well-written Terraform module.
  • The source code for James Nugent’s Hashidays NYC 2017 talk is an exemplary Terraform repository. Although it’s based on an older version of Terraform (before providers were broken out from the main Terraform executable), the code structure, formatting, and use of Makefiles are still current.

For help getting started adopting Infrastructure as Code, contact us.

-Josh Kodroff, Associate Cloud Consultant
