How to avoid cloud project roadblocks


By Jason Deck, VP Strategic Development, Logicworks

Nearly half of organisations meet major roadblocks after the initial stage of a digital project, according to a recent survey. These early failures are potentially disastrous for a project; when IT teams fail to deliver early successes, the entire endeavour starts to look like a bad idea.

A misstep in the early phases of a cloud project can take all the momentum out of cloud adoption. Failure or strain in proof of concept migrations not only threatens the success of current cloud projects, but also dramatically reduces the likelihood that the organisation will continue to invest in cloud technology. This is bad news for IT modernisation efforts and the long-term transformation of enterprise infrastructure.

What causes these early failures? Is it poor strategic planning, insufficient training, or bad technology? The solution requires enterprises to adopt new risk management strategies — and leave behind strategies that were effective when IT was a static, monolithic back office provider.  

Cloud automation

The traditional IT service model says that infrastructure needs to be updated every five years and manually maintained in between. But the cloud finally gives IT the opportunity to design infrastructure to evolve continually.

When you treat your cloud like a system that needs to be manually maintained and upgraded, you quickly reach the point where new rapid product development cycles clash with static infrastructure. IT tries to make changes to the cloud manually to keep up with demand, and because there is no central change management and everything is “custom”, phase two cloud projects stall.

The answer to early stage failure is to stop treating your infrastructure as a system that must be manually maintained, and instead equip your team with tools that allow them to centrally maintain — and make changes to — the cloud with minimal manual effort. The answer is cloud automation.

Cloud infrastructure must be built from day one to evolve. It must be built so that engineers only touch the abstraction layer above server-level maintenance, so that changes are centralised and immediately system-wide. This dramatically accelerates cloud maintenance and ensures that the system has the proper controls to validate, track, and version changes to infrastructure.

Cloud automation makes it possible to spin up fully-configured resources in minutes. It guarantees compliance and governance across the enterprise by automatically and consistently checking on the status of logs, installing monitoring tools, and shipping data to cold storage.
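
As a concrete illustration (not Logicworks' actual tooling), the sketch below uses Python and boto3 to run two of the checks just described: confirming that every CloudTrail trail is still logging, and attaching a lifecycle rule that ships data in a hypothetical log bucket to Glacier cold storage after 90 days.

```python
# A minimal boto3 sketch: verify CloudTrail is still logging, and make sure a
# hypothetical audit-log bucket ships old objects to Glacier cold storage.
import boto3

cloudtrail = boto3.client("cloudtrail")
s3 = boto3.client("s3")

# Flag any trail that has silently stopped delivering logs.
for trail in cloudtrail.describe_trails()["trailList"]:
    status = cloudtrail.get_trail_status(Name=trail["TrailARN"])
    if not status["IsLogging"]:
        print(f"WARNING: trail {trail['Name']} has stopped logging")

# Transition audit logs older than 90 days to cold storage.
s3.put_bucket_lifecycle_configuration(
    Bucket="example-audit-logs",  # hypothetical bucket name
    LifecycleConfiguration={
        "Rules": [{
            "ID": "archive-old-logs",
            "Status": "Enabled",
            "Filter": {"Prefix": ""},  # apply to the whole bucket
            "Transitions": [{"Days": 90, "StorageClass": "GLACIER"}],
        }]
    },
)
```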

This automation reduces friction between development and operations teams, allowing developers to keep up with business demands and systems engineers to focus on improving what matters rather than on manual patches and reconfigurations. Automation reduces human effort, streamlines processes, and clears away the disjointed, process-less quagmire that usually plagues new cloud teams.

However, there is a potential pitfall here. Some enterprises believe they can build their cloud environments from scratch and then automate “later”, when their team’s experience grows. But cloud automation is a difficult task that requires a specialised set of skills and a lot of time. Unfortunately, by the time that team gets around to figuring out AWS CloudFormation, configuration management, and deployment automation, they are already knee-deep in cloud issues.

Successful cloud project leaders realise that they cannot wait six to 12 months for internal staff to write custom automation scripts. Instead, they turn to an external provider to integrate and automate their cloud environment from day one. This requires more than a consulting engagement: it usually means bringing in a managed service provider like Logicworks during the early stages of a project to create automation scripts, and then to continue managing that automation after migration.

As I have written about before, cloud automation should be the core service offering of every MSP. In the coming years, enterprises will clarify the support they need — and find that of all the many tasks in a cloud adoption process, cloud automation makes the most sense to outsource. MSPs will develop a core library of automation scripts, and this intellectual property will be the foundation of their success. MSPs will accelerate cloud adoption and allow enterprises to run on top of more secure, more highly available clouds.

Enterprise support

Even when clouds are fully automated, enterprises rarely cut system support teams. Instead, they use them with greater effectiveness, focusing them on supporting a greater volume and velocity of product development.

But it takes enterprise teams some time to adapt to a new cloud process. At early stages, having your main compute cluster fail at 11am on a Tuesday could spell disaster for the entire project. To mitigate these risks, enterprises with successful cloud projects usually invest in AWS Enterprise Support.

AWS Enterprise Support is fast, reliable, expert service. We have seen detailed email responses to technical issues from experts in less than two minutes and resolution in 10. In fact, the only problem with AWS Enterprise Support is that more businesses want it than have the budget for it.

As you might expect, these time-intensive services come with a price; it costs a minimum of $15,000 a month or 10% of your usage bill, whichever is greater. A selection of these services is available at their Business Tier starting at $10,000 a month. For many SMBs, AWS Premium Support is simply not an option — especially when they are experimenting with a limited number of POCs and a limited budget.

The answer? Find an MSP that gives every client, even the small ones, access to AWS Enterprise Support, included in the regular cost of maintenance.

How is a $15,000 service ‘included’? You can take advantage of the fact that the AWS MSP Partner has other clients; your collective budget gives your service provider access to AWS support engineers. The little-known benefit is that, for roughly the same cost as buying AWS Enterprise Support on your own, you can get access to AWS Enterprise Support and fund your MSP.

Set up your team for early wins

Pairing AWS Enterprise Support with automation significantly de-risks cloud adoption.  

When you bring in a strong, automation-focused partner and AWS support early, you have a better experience and derive greater value from the cloud faster than if you rely on internal teams alone. A good MSP will not only ensure the success of early projects, but will use those projects to establish migration patterns that facilitate future migrations. They maintain early momentum, accelerating all-in cloud adoption.

It is easier to avoid early cloud issues than to climb out of them later. The usual strategies — hiring more short-term consultants and giving workshops on DevOps philosophy — are often more expensive in the long term than implementing the real structural changes necessary to automate and control your cloud.

Why cloud MSPs are software companies


By Jason Deck, vice president of strategic development, Logicworks

When your infrastructure is code, the art of developing great software applications and the art of building great infrastructure systems start to look similar.

Many of the best practices of software development — continuous integration, versioning, integration testing — are now the best practices of systems engineers. In enterprises that have aggressively virtualised data centres or moved to the public cloud, servers, switches, and hypervisors are now strings and brackets in JSON.

The scripts that spin up a server (instance) or configure a network can be standardised, modified over time, and reused. These scripts are essentially software applications that build infrastructure and are maintained much like a piece of software. They are versioned in GitHub and engineers patch the scripts or containers, not the hardware, and test those scripts again and again on multiple projects until they are perfected.

Providers that build and manage cloud infrastructure for clients accumulate entire libraries of these scripts. These libraries are a cloud MSP’s core intellectual property, and what differentiates one MSP from another. And this software is a main reason why enterprises need MSPs in the first place.

Why automation software?

Engineers can easily be trained to perform the basics of cloud management manually. Anyone can launch an EC2 instance from the AWS console. But when you need to deploy an instance that has fully configured security groups, subnets, databases, etc. and is HIPAA compliant, it becomes harder to find and/or train that talent. Even fewer enterprises have in-house staff capable of writing the scripts that let developers or systems staff launch such instances automatically.

It would take multiple senior-level automation engineers working for months to develop from scratch a script that spins up perfectly configured instances for a variety of applications. If a cloud MSP can run a script, created beforehand, to spin up a new environment in days, that is a huge value add for enterprises.
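
For illustration, here is a minimal Python/boto3 sketch of the kind of pre-built launch script described above. The AMI, subnet, security group, and instance profile identifiers are placeholders; a real script would pull them from a template or a configuration management system.

```python
# A minimal boto3 sketch of a pre-built, fully configured instance launch.
import boto3

ec2 = boto3.client("ec2")

def launch_app_instance():
    response = ec2.run_instances(
        ImageId="ami-0123456789abcdef0",            # hardened, pre-baked AMI (placeholder)
        InstanceType="m4.large",
        MinCount=1,
        MaxCount=1,
        SubnetId="subnet-0123456789abcdef0",        # private subnet (placeholder)
        SecurityGroupIds=["sg-0123456789abcdef0"],  # locked-down security group (placeholder)
        IamInstanceProfile={"Name": "app-server-role"},  # hypothetical instance profile
        TagSpecifications=[{
            "ResourceType": "instance",
            "Tags": [{"Key": "Environment", "Value": "staging"}],
        }],
    )
    return response["Instances"][0]["InstanceId"]

print("Launched", launch_app_instance())
```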

Why not just train engineers on the lower-level manual cloud configuration tasks? Because enterprises do not just want to turn manual data centre work into manual cloud console work. An infrastructure as code transition is also usually accompanied by a transition to a DevOps culture, part of which usually means helping developers to work faster. Business leaders do not want developers to wait for systems engineers to push new features. They want more experimentation and faster product lifecycles. Ideally, they want developers to have tools at their fingertips to deploy code without having to worry about the infrastructure at all.

Infrastructure automation makes a DevOps transition possible. Developers can get their code tested and in production in minutes. Automation empowers cost-effective experimentation. Tools like containerisation create a common language for both systems engineers and developers to communicate.

One caveat, however: cloud MSPs build up a library of automation scripts over time, but it is not a plug and play system. Automation scripts have to be customised for the packages required to run each client’s applications, and tweaked to meet each client’s unique security demands. That is just the effort required to get the infrastructure build-out process automated; a great cloud MSP will also orchestrate multiple automated systems across multiple environments, which usually requires a combination of custom jobs and cloud orchestration software.

Cloud orchestration

As we have discussed previously, infrastructure orchestration is a key missing element on many cloud projects. CTOs and CIOs struggle to coordinate projects across hundreds of data centres and multiple clouds, and compliance officers struggle to monitor and audit them. Lengthy governance processes are getting cut, but enterprises still have serious compliance and governance obligations that now must be met in even more complex systems.

Unfortunately, cloud orchestration software is not yet mature. MSPs are the only vendors that have developed true cloud orchestration software through a combination of custom programming and third party APIs.

Here is an example of one such application. When an engineer logs into their MSP-provided cloud software, they can:

  • See a complete overview of what is running across multiple environments
  • Monitor cloud costs (through a 3rd party application like CloudCheckr)
  • Monitor security alerts
  • Spin up new instances in a few clicks (perhaps using Docker)
  • File repair/upgrade tickets
  • See status of current tickets
  • Read Wikis
  • Pay bills
  • Integrate with existing security monitoring software, antivirus, and logging systems, and even with monitoring of their on-premises virtualised environments.  

Developing such systems requires quite a bit of complex integration and software development. This is usually custom per-client. But this level of transparency across environments is exactly what enterprises need.

Next-generation cloud software

The services AWS and Google offer are constantly changing. AWS has released hundreds of major updates to its services in 2015. It is experimenting with mobile development software. It has created services that make complex cloud automation tasks easier.

Every six months, AWS releases services that have the potential to disrupt entire software industries. At the same time, AWS Marketplace has become the destination for cloud ISVs developing cloud-based software, including automation and other functions.

In other words, there is a fertile and rapidly shifting cloud ecosystem that appears to be swallowing up traditional software market channels. This has allowed newcomers to shape entire industries and forced old players to adapt.

We predict that the next wave of software development will include the services that an MSP currently performs more or less manually, and that have limited or no software options on the market, including the following:

  • Predictive, big data-crunching software that determines you are going to need more infrastructure before something like AWS CloudWatch knows
  • Fully automated backups and disaster recovery
  • Automated destructive testing that empirically proves the infrastructure is highly available, currently done manually with something like Netflix’s Simian Army
  • More sophisticated multi-cloud integration software between AWS and other clouds
  • More sophisticated hybrid cloud identity and access management, currently managed somewhat manually with AWS IAM and Active Directory
  • More Docker-like packages of code with “built in” compliance, like HIPAA and PCI

Together, cloud automation and orchestration software have the potential to drastically reduce the effort in migrating to the cloud, reduce the risk of human error, guarantee that developers maintain compliance, and increase speed of development. Cloud MSPs will become software companies as well as curators of a dizzyingly complex software marketplace, and help enterprises put the latest cloud innovations to work.

Why AWS Marketplace is changing the software industry


The launch of Amazon Web Services (AWS) in 2006 revolutionised the concept of technology infrastructure. AWS Marketplace is marching along the same path to transform the way software is sold and deployed.

As Amazon consistently pushes out innovative products on AWS, the enterprise software world sees the AWS Marketplace as a thriving environment for their customer deployments. Marketplace has improved the software search and implementation process for consumers and will likely change the way we think about software in the years to come.

How AWS Marketplace became a $1bn platform

There are a number of reasons why Marketplace became the App Store of infrastructure as code. Most importantly, the Amazon Machine Image (AMI) standard enabled software vendors to publish pre-built EC2 images into the AWS Marketplace with optimised environments for their products to run smoothly and efficiently. For more complex deployments, AWS Marketplace vendors can also leverage CloudFormation templates to deploy servers, network components, databases and storage to support their software.
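
As a rough illustration of how a Marketplace AMI is consumed programmatically, the Python/boto3 sketch below looks up the latest image a vendor has published under the aws-marketplace owner alias; the product name filter is hypothetical.

```python
# A minimal boto3 sketch: find the newest AMI a vendor has published to the
# AWS Marketplace. The product name filter is a placeholder.
import boto3

ec2 = boto3.client("ec2")

images = ec2.describe_images(
    Owners=["aws-marketplace"],  # AMIs published through the Marketplace
    Filters=[{"Name": "name", "Values": ["example-vendor-appliance-*"]}],
)["Images"]

if images:
    # Pick the most recently published image for that product.
    latest = max(images, key=lambda img: img["CreationDate"])
    print(latest["ImageId"], latest["Name"])
else:
    print("No matching Marketplace images found")
```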

This delivery model means software on Marketplace is fundamentally different from traditional software: no software installs, no lengthy configurations, and of course no custom hardware to support it. It is software in line with how engineers consume resources on the cloud.

Software on Marketplace is also less risky to adopt from a business standpoint. AWS Partner Orbitera worked with Amazon to offer AWS Test Drive, a combination of AWS infrastructure, software installation and licensing, and free usage time that allows customers to experience solutions offered by ISVs and AWS Consulting Partners before actually buying them. Test drive packages range from simple content management systems to complex, highly secure solutions comprised of products from several vendors. Solutions are made available as AMIs, CloudFormation templates, or instant connections to a SaaS service hosted on AWS. Test drives are browseable by industry, software vendor, and by specific AWS regions offering the test drive.

The AWS Marketplace, along with Test Drive, has turned the often labyrinthine process of purchasing and licensing enterprise software into something far closer to Amazon’s roots in e-commerce.

Marketplace has also required software vendors to simplify licensing, a welcome change for most IT consumers. To showcase their wares in the Marketplace, software vendors must sign contracts with Amazon to sell their products at set prices, without the confusing EULAs and SPLAs of the pre-cloud era.

There are over 2,200 products in the Marketplace now, and that number is increasing at a rapid rate. This past year, AWS expanded the reach of the Marketplace from data centre products to a suite of managed desktop products called WorkSpaces. With WorkSpaces, businesses of all sizes can access secure, managed desktop computing tools, including hosted email and popular business software, from a wide variety of hardware platforms.

At the 2015 AWS Summit in San Francisco, Andy Jassy, head of AWS, boasted that AWS is now not only the fastest growing enterprise IT company in the world, but that it is the only company showing double-digit year over year growth. The AWS Marketplace may be the fastest growing part of the overall offering at this point, contributing about a billion dollars to the AWS revenue stream.

Innovators on the marketplace

Innovation on the Marketplace is driven by AWS Partners. It is likely that these partners will be the ones to help large Fortune 1000 enterprises move to AWS.

For example, look at Logicworks’ partner New Relic, which offers its software in the Marketplace with two different methods of delivery. Existing New Relic customers are offered an AMI that can be activated with pre-configured dashboards, ready to monitor their AWS environment immediately. New customers can set up an account on New Relic’s SaaS portal to begin monitoring their deployed applications.

CloudEndure offers AWS Migration and Disaster Recovery tools in the Marketplace with SaaS-based delivery. The fascinating thing about CloudEndure is that it can literally take a virtual machine and clone it into AWS with no impact on the source machine. It is one-click DR. This will be revolutionary for enterprises looking to migrate to the cloud.

While the Marketplace contains great products, the sheer quantity of vendors on the Marketplace can be overwhelming for consumers. That is why Consulting Partners have become increasingly valuable for enterprises. Last month, Amazon added Consulting Partners to the Marketplace alongside software vendors and resellers, further strengthening the Marketplace as an enterprise-grade e-commerce channel.

As AWS continues to innovate and launch exciting new computing products for its customers, the software partners featured in the Marketplace will continue to build the best possible products to run in the AWS public cloud.

Amazon is focusing on innovating in the right areas, especially mobile development (they announced AWS Device Farm last week) and databases. They wisely leave specialised migration, DR, monitoring, and similar tools to partners. Together, AWS and the AWS Marketplace will shape what we expect from software vendors – and what we get from them.

Security and advanced automation in the enterprise: Getting it right


Complexity is a huge security risk for the enterprise.

While security is always a top priority during the initial build phase of a cloud project, over time security tends to slip. As systems evolve, stacks change, and engineers come and go, it’s very easy to end up with a mash-up of legacy and cloud security policies piled on top of custom code that only a few engineers know how to work with.

The security of your system should not depend on the manual labour — or the memory — of your engineers. They shouldn’t have to remember to close XYZ security loophole when deploying a new environment. They don’t have time to manually ensure that every historical vulnerability is patched on every system across multiple clouds.

Security automation is the only long-term solution.

Automation significantly improves an engineer’s ability to “guarantee” that security policies are not only instituted, but maintained throughout the lifecycle of the infrastructure. Automated security policies encourage the adoption of evolving standards. And as vulnerabilities are exposed, changes can be implemented across hundreds or even thousands of complex systems, often simultaneously.

Why security automation?

No one can remember everything: The #1 reason to automate security is that human memory is limited. The bigger the infrastructure, the easier it is to forget to close XYZ security loophole when launching a new environment, or to remember to require MFA, and so on. Engineers are a smart group, but automation created by expert engineers is smarter.

Code it once and maintain templates, not instances: Manual security work is not only risky, but also extremely time-consuming. It is much wiser to focus engineering time on building and maintaining the automation scripts that ensure security than on the manual work required to hunt down, patch, and upgrade individual components on each of your virtual servers.

Standard naming conventions: Inconsistent or sloppy naming is a bigger security risk than most people think. Imagine an engineer tasked with opening a port on one of several near-identically named security groups, like those in the sketch below; it would be fairly easy to mistake one security group for another.
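
To make the point concrete, here is a small Python/boto3 sketch (the group names in the comment are hypothetical) that lists security groups and flags pairs of names similar enough to be confused with one another.

```python
# A minimal boto3 sketch that flags near-duplicate security group names.
import boto3
from difflib import SequenceMatcher

ec2 = boto3.client("ec2")
groups = ec2.describe_security_groups()["SecurityGroups"]
names = [g["GroupName"] for g in groups]  # e.g. "prod-web-sg1", "prod-web-sg-1", "prod-websg1"

# Report pairs of names that are suspiciously similar to one another.
for i, a in enumerate(names):
    for b in names[i + 1:]:
        if SequenceMatcher(None, a, b).ratio() > 0.9:
            print(f"Possible naming collision: {a!r} vs {b!r}")
```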

Ensure historical vulnerabilities continue to be patched: When a security vulnerability is identified, engineers must manually patch the vulnerability in the right places, across hundreds or even thousands of separate systems. No human can ensure that the patch is in place across all of these systems.

When something like Heartbleed happens, the engineer can:

  1. Update SSL (or the affected package) in a single configuration script
  2. Use a configuration management tool like Puppet to declaratively update all running and future instances, without human intervention
  3. See at a glance which instances are meeting core security objectives (see the sketch after this list)
  4. Guarantee that any new instances, whether created during an Auto Scaling event or due to failover, are protected against all historical vulnerabilities
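
As a complement to the Puppet workflow above (not a replacement for it), the following Python/boto3 sketch uses the SSM Run Command API to report the OpenSSL build on every tagged, SSM-managed instance, which is one way to see at a glance whether a patch has actually landed everywhere. The Environment tag is an assumption.

```python
# A minimal boto3/SSM sketch: check which managed instances report which
# OpenSSL build after a vulnerability like Heartbleed.
import time
import boto3

ssm = boto3.client("ssm")

# Run `openssl version` on every SSM-managed instance carrying the tag.
command = ssm.send_command(
    Targets=[{"Key": "tag:Environment", "Values": ["production"]}],  # hypothetical tag
    DocumentName="AWS-RunShellScript",
    Parameters={"commands": ["openssl version"]},
)["Command"]

time.sleep(30)  # crude wait; a real script would poll until invocations finish

for inv in ssm.list_command_invocations(
    CommandId=command["CommandId"], Details=True
)["CommandInvocations"]:
    output = inv["CommandPlugins"][0]["Output"]
    print(inv["InstanceId"], output.strip())
```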

Limit custom configurations: When different environments are built by different engineering teams at different times, manual security configurations often mean custom configurations. This makes it very difficult to gauge the impact of a feature change on security. A single or limited number of custom configurations not only reduces the risk of unexpected security implications, but also means your team is not relying on the memory of the one or two engineers who built the application’s infrastructure.

Our security automation tools

Infrastructure build out: AWS CloudFormation

Infrastructure build out should be the first thing an IT team automates. This includes networking, security groups, subnets, and network ACLs. At Logicworks, we use AWS CloudFormation to create templates of the foundational architecture of an environment.

CloudFormation allows us to spin up completely new environments in hours. This means no manual security group configuration and no AWS Identity and Access Management (IAM) role configuration. Because configuration is consistent across multiple environments, updates / security patches are near-simultaneous. It also ensures that the templated architecture meets compliance standards, which are usually crucial in the enterprise.

There have been a number of tools released in the last year to build out templates of AWS resources. Our opinion is that CloudFormation is the best tool available, despite certain limitations.

Here are a few tasks that CloudFormation performs:

  • Build network foundation
  • Configure gateways and access points
  • Install management services, like Puppet
  • Allocate Amazon S3 buckets
  • Attach encrypted volumes
  • Control and manage access through IAM
  • Register DNS names with Amazon Route 53
  • Configure log shipping and retention
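
Building on the list above, here is a minimal Python/boto3 sketch of driving such a template programmatically; the template URL, stack name, and parameter are hypothetical, and this is not Logicworks’ actual tooling.

```python
# A minimal boto3 sketch: validate and launch a versioned CloudFormation
# foundation template, then wait for it to finish.
import boto3

cfn = boto3.client("cloudformation")
template_url = "https://s3.amazonaws.com/example-templates/foundation.json"  # hypothetical

# Catch syntax problems before anything is provisioned.
cfn.validate_template(TemplateURL=template_url)

cfn.create_stack(
    StackName="foundation-prod",
    TemplateURL=template_url,
    Capabilities=["CAPABILITY_IAM"],  # the template creates IAM roles
    Parameters=[{"ParameterKey": "Environment", "ParameterValue": "prod"}],  # hypothetical parameter
)

# Block until the network, security groups, IAM roles, etc. are all in place.
cfn.get_waiter("stack_create_complete").wait(StackName="foundation-prod")
print("Stack created")
```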

Configuration management: Puppet

Boot time is arguably the most crucial part of an instance’s lifetime. Puppet, or another configuration management tool like Chef or Ansible, not only simplifies and speeds up the bootstrap process but also, for security purposes, continually checks in on instances and rolls back unauthorised changes. Puppet manifests are therefore a living single source of truth for instance configuration across the environment. This means that engineers can ensure that no (permanent) changes are made at the instance level that compromise security.

Puppet is also used to install various security features on an instance, such as Intrusion Detection System agents, log shipping, and monitoring software, as well as to require MFA and bind the instance to central authentication.

If there is more experience on an IT team with tools like Chef or Ansible, these are equally powerful solutions for configuration management.

Iterative deployment process: AWS CodeDeploy / Jenkins

Ideally, enterprises want to get to a place where deployment is fully automated. This not only maintains high availability by reducing human error, but it also makes it possible for an organisation to respond to security threats quickly.

AWS CodeDeploy is one of the best tools for achieving automated deployments. Unlike Jenkins, which requires a bit more custom configuration, CodeDeploy can be used across multiple environments simultaneously. Every piece of custom work removed frees engineering time for more important features — whether that’s developing new code or maintaining the scripts that make security automation possible.
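
For illustration, a minimal Python/boto3 sketch of triggering a CodeDeploy deployment from a revision stored in S3 might look like the following; the application, deployment group, and bucket names are placeholders.

```python
# A minimal boto3 sketch: kick off a CodeDeploy deployment from an S3 bundle
# and wait for it to roll out across the deployment group.
import boto3

codedeploy = boto3.client("codedeploy")

deployment = codedeploy.create_deployment(
    applicationName="example-app",                 # placeholder
    deploymentGroupName="example-app-production",  # placeholder
    revision={
        "revisionType": "S3",
        "s3Location": {
            "bucket": "example-deploy-artifacts",  # placeholder
            "key": "example-app/release-1.2.3.zip",
            "bundleType": "zip",
        },
    },
    description="Automated release triggered by CI",
)

# Block until the rollout succeeds (or raise if it fails).
codedeploy.get_waiter("deployment_successful").wait(
    deploymentId=deployment["deploymentId"]
)
```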

Monitoring: EM7, Alert Logic, CloudCheckr

By choosing the right third party monitoring tools, you can bake automated security monitoring into every deploy. ScienceLogic’s EM7 is the best tool we’ve found for automated reporting and trend analysis, while Alert Logic provides the most sophisticated intrusion detection tools. CloudCheckr not only provides excellent cost analysis, but it also has introduced governance features that help enterprises stay compliant. Enterprises are usually quite familiar with these tools, and they can function across public clouds and on-premises environments.

Coming soon to the enterprise?

Security automation is not easy.

In fact, for some enterprises, it may be more cost-effective in the short term to configure security manually; CloudFormation and Puppet take several weeks or even months to learn, and it may take a consulting engagement with a third party cloud expert to even understand the foundational security policies in place across different systems.

However, we expect that a manual security approach will be impossible in five years. Enterprises are already spanning on-premises data centres, on-premises virtualised data centres, colocation centres, some public clouds, etc. As the enterprise moves towards hybrid cloud on an application-by-application basis, this means even more complexity.

But complexity does not have to mean custom configuration. Security automation tools, combined with tools like containers, mean that engineers can escape manual configuration work on individual servers. As security is abstracted away from the underlying infrastructure, we have the opportunity to improve our overall security posture.

This is the next frontier: security as code.

Why enterprises need containers and Docker


At DockerCon 2015 last week, it was very clear that Docker is poised to transform enterprise IT.

While it traditionally takes years for a software innovation — and especially an open source one — to reach the enterprise, Docker is defying all the rules. Analysts expect Docker will be the norm in enterprises by 2016, less than two years after its 1.0 release.

Why are Yelp, Goldman Sachs, and other enterprises using Docker? Because in many ways, enterprises have been unable to take full advantage of revolutions in virtualisation and cloud computing without containerisation.

Docker, standard containers, and the hybrid cloud

If there ever was a container battle among vendors, Docker has won — and is now nearly synonymous with container technology.

Most already understand what containers do: describe and deploy the template of a system in seconds, with all infrastructure-as-code, libraries, configs, and internal dependencies in a single package, so that the Docker image can be deployed on virtually any system.

But the leaders of the open-source project wisely understand that in order to work in enterprises, there needs to be a “standard” container that works across more traditional vendors like VMware and Cisco, and across newer public cloud platforms like Amazon Web Services. At DockerCon, Docker and CoreOS announced that they were joining a Linux Foundation initiative called the Open Container Project, under which everyone agrees on a standard container image format and runtime.

This is big news for enterprises looking to adopt container technology. First, in a market that is becoming increasingly skittish about “vendor lock-in”, container vendors have removed one more hurdle to moving containers across AWS, VMware, Cisco, etc. But more importantly for many IT leaders, this container standardisation makes it that much easier to move across internal clouds operated by multiple vendors or across testing and production environments.

A survey of 745 IT professionals found that the top reason IT organizations are adopting Docker containers is to build a hybrid cloud. Despite the promises of the flexibility of hybrid clouds, it is actually quite a difficult engineering feat to build cloud bursting systems (where load is balanced across multiple environments), and there is no such thing as a “seamless” transition across clouds. Vendors that claim to facilitate this often do so by compromising feature sets or by building applications to the lowest common denominator, which often means not taking full advantage of the cost savings or scalability of public clouds.

By building in dependencies, Docker containers all but eliminate these interoperability concerns. Apps that run well in test environments built on AWS will run exactly the same in production environments in on-premises clouds.

Docker also announced major upgrades in networking that allow containers to communicate with each other across hosts. Since acquiring SocketPlane six months ago, Docker has had the SocketPlane team working to complete a set of networking APIs; Docker is clearly hard at work making networking enterprise-grade, so that developers are guaranteed application portability throughout the application lifecycle. Read all the updates from DockerCon 2015 here.

Reducing complexity and managing risk

Docker does add another level of complexity when engineers are setting up the environment. On top of virtualisation software, auto scaling, and all of the moving parts of automation and orchestration now in place in most enterprises, Docker may initially seem like an unnecessary layer.

But once Docker is in place, it drastically simplifies and de-risks the deploy process. Developers can focus on their applications, knowing that once code is packaged as a Docker image, it will run on their server. They can build an app on a laptop, package it as a Docker image, and run a single command to deploy it to production. On AWS, using ECS (EC2 Container Service) with Docker removes some of the configuration work that Docker otherwise requires. You can build workflows in which Jenkins or another continuous integration tool runs tests and AWS CloudFormation scales up an environment, all in minutes.
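
As a rough sketch of that ECS workflow, the Python/boto3 example below registers a task definition for a hypothetical Docker image and runs one copy of it on an existing cluster (it assumes the cluster already has container instances registered).

```python
# A minimal boto3 sketch: register an ECS task definition for a Docker image
# and run one copy of it on a hypothetical existing cluster.
import boto3

ecs = boto3.client("ecs")

# Describe the container once; ECS reuses this task definition for every deploy.
task_def = ecs.register_task_definition(
    family="example-web",
    containerDefinitions=[{
        "name": "web",
        "image": "example/web-app:1.0",  # hypothetical Docker image
        "cpu": 256,
        "memory": 512,
        "essential": True,
        "portMappings": [{"containerPort": 80, "hostPort": 80}],
    }],
)["taskDefinition"]

# Run one copy of the task on the (hypothetical) cluster.
ecs.run_task(
    cluster="example-cluster",
    taskDefinition=task_def["taskDefinitionArn"],
    count=1,
)
```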

This simplified (and shortened) deployment cycle is even more useful in complex environments, where developers often must “remember” to account for varying system and infrastructure requirements during the deploy process. In other words, the deploy process happens faster with fewer errors, so developers can focus on doing their jobs. System engineers do not have to jump through the same hoops to make sure an application runs on infrastructure it was not configured for.

Many large start-ups, like Netflix, have developed workarounds and custom solutions to simplify and coordinate hundreds of deploys a day across multiple teams. But as enterprises are in the nascent stages of continuous delivery, Docker has come at a perfect time to eliminate the pain of complex deploys before they have to develop their own workarounds.

Caveat: Docker in hybrid environments is not “easy”

We mentioned it above, but it is important to note that setting up Docker is a specialised skill. It has even taken the senior automation engineers at Logicworks quite some time to get used to. No wonder it was announced at DockerCon that the number of Docker-related job listings went from 2,500 to 43,000 in 2015, an increase of 1,720 percent.

In addition, Docker works best in environments that have already developed sophisticated configuration automation practices (using Puppet or Chef), where engineers have invested time in developing templates to describe cloud resources (CloudFormation). Docker also requires that these scripts and templates change. Most enterprises will either have to hire several engineers to implement Docker or hire a managed service provider with expertise in container technology.

On top of this, there are lingering concerns over the security of Docker in production — and rightly so. While many enterprises, like Yelp and Goldman Sachs, have used Docker in production, there are certain measures one can take to protect these assets for applications carrying sensitive data or compliance obligations.

Docker did announce the launch of Docker Trusted Registry last week, which is a piece of software that securely stores container images. It also comes with management features and support, which meets Docker’s paid support business objectives. This announcement is specifically targeted at the enterprise market, which has traditionally been skittish of open source projects without signatures and support (e.g., Linux vs. Red Hat). AWS and other cloud platforms have already agreed to resell the technology.

Over the next 12 months, best practices and security protocols around containers will become more standardised. And as they do, enterprises and start-ups will benefit from Docker to create IT departments that function as smoothly as container terminals.

Government cloud on the rise: NSA and DOJ move to Amazon Web Services


At the Amazon Public Sector Symposium last week, the NSA announced that it will be moving some of its IT infrastructure to AWS. The NSA follows several other federal agencies, including the Department of Defense and the National Geospatial-Intelligence Agency (NGA), in joining the CIA in the Amazon cloud in the last 9 months.

“The infrastructure as a service which Amazon provides has shown us significant IT efficiencies,” said Alex Voultepsis, chief of engineering for the NSA’s Intelligence Community Special Operations Group, at a panel last week. Voultepsis estimated that the agency will save 50-55% on infrastructure costs alone by moving to AWS.

The state of the government cloud

In 2010, the CIO of the U.S. government, Vivek Kundra, famously declared that the federal government must move to a “cloud first” policy. It has taken five years for the first federal agencies to get on board, but there is a long way to go. According to a report released by the U.S. Government Accountability Office, an average of 2% of IT spend went towards cloud computing in 2014. The seven largest federal agencies did not even consider cloud computing services for 67% of their projects.

The Cloud Computing Caucus Advisory Group (CCCAG), a nonprofit that builds awareness of the role of cloud computing in society, industry and government, regularly speaks with government IT leaders to discuss the cloud. According to a report in Forbes, CCCAG’s leaders frequently hear concerns such as “the costs savings aren’t real, the technologies aren’t proven” or “there’s not enough data security.” Research conducted by the Congressional Research Service similarly cites security, network infrastructure requirements, and compliance as the largest challenges.

Despite resistance, analysts believe the federal government is at a tipping point in cloud adoption. The IDG estimates that in 2014, federal government spending on private cloud was $1.7 billion, with just $118.3 million on public cloud; they expect this number will double by the end of 2015.

The reason for these sunny projections? Analysts anticipate that adoption of the cloud by the Department of Defense and other early adopters on the state level — including health and security agencies that obviously have very sophisticated security and compliance requirements — will push hesitant federal agencies out of the pilot phase.

Healthcare and defence on AWS

Federal and state Health and Human Services Departments must adhere to very restrictive and punitive data hosting standards. However, special federal and state regulations and incentives have strongly encouraged state HHS departments to develop Electronic Health Record (EHR), Health Insurance Exchange (HIX), and Health Information Exchange (HIE) systems on the cloud. The success of these projects should serve as a model to cloud-averse federal agencies.

Massachusetts Executive Office of Health and Human Services (EOHHS) is a prime example of an early cloud adopter. In 2014, Massachusetts EOHHS launched Virtual Gateway (VG), a platform that connects more than fifteen state agencies, on Logicworks’ hosted private cloud. VG provides over forty software applications used by eighty thousand users including state workers, health care providers, and the public. Healthcare providers may use VG applications to send claims to the EOHHS or to submit disease information to the Department of Public Health, reducing redundancy and improving reliability of services.

“By consolidating information and online services in a single location on the Internet, the Virtual Gateway, a critical computing infrastructure platform, simplifies the process of connecting people to critical health and human services programs and information,” said Manu Tandon, Chief Information Officer for Massachusetts EOHHS.

The rapid progress of Health Information Exchanges (HIEs) across most states should also serve as a model for federal agencies. Largely driven by the HITECH Act, states and providers are required to consolidate and secure health records across multiple local agencies and hospitals, in order to improve interoperability of systems and quality of care for patients. In many cases, these organizations are enabled and supported financially by statewide health information exchange grants from the Office of the National Coordinator for Health Information Technology. The HIEs for 30 states are currently hosted on a private cloud, and California’s HIE, CalIndex, is hosted on Amazon Web Services.

Consolidating IT resources across agencies

Regulations in this vertical were specifically oriented towards the interoperability of data across multiple agencies, hospitals, and providers. This is a challenge that the cloud can meet more simply and inexpensively than traditional hosting. The ability to share cloud resources across multiple agencies is also a clear benefit of cloud-based hosting for the federal government, and the success of state-run agencies and of large, complex federal departments like the Department of Defense will recommend the cloud as a key interoperability solution.

A similar need existed among multiple defense agencies. Commercial Cloud Services (C2S), the Amazon cloud region established by the Central Intelligence Agency for classified data, is now open to all 17 federal intelligence agencies, according to TechTarget reporting.

“We cannot continue to operate in the silo mentality of each agency not talking to each other…we’re leveraging this initiative to start working together,” said Jason Hess, cloud security manager for The National Geospatial-Intelligence Agency (NGA).

In the next 6-18 months, SaaS and IaaS offerings for the government cloud will likely become more robust to meet the growing demand of federal agencies. Lessons from the cloud deployments of the most highly regulated federal and state agencies will accelerate cloud adoption. Hopefully, agencies beyond defense and healthcare will soon be able to pass on cost savings — and improved service quality — to taxpayers.

Real use cases: Why 50% of enterprises are choosing hybrid cloud


Enterprises are shaping the cloud to fit their needs, and the result is overwhelmingly a hybrid of public, private, and on-premises clouds. Today, 19% of organizations manage hybrid clouds and an additional 60% plan to deploy them. Gartner estimates that hybrid cloud adoption will near 50% by 2017.

Why are enterprises going hybrid? Below are four use cases that demonstrate how enterprises determine what, when and how to move to the cloud.

Case 1: Testing environments

Migrating testing/staging environments to the public cloud is a compelling business case for many enterprises, especially when business leaders are still skittish about trusting production environments to multi-tenant clouds. Enterprises get a 20-30% cost reduction on non-critical infrastructure, and developers have the opportunity to get familiar with cloud architecture without the risk of an inexperienced cloud engineer bringing down their applications.

A large software service provider builds discrete environments for each of their clients. Their data is highly regulated, and many of their clients are not comfortable using the public cloud to host sensitive information. However, compelled by the cost savings of Amazon Web Services, they decided to host the development and staging environments for their customer-facing applications on AWS. These development and staging environments do not host sensitive data. Low-latency connections from their AWS environment to their hosted private cloud over AWS Direct Connect allow them to maintain high deployment velocity. AWS Storage Gateway allows them to move data between clouds easily, if necessary, for a nominal fee, and AWS CodeDeploy allows them to coordinate code pushes to their production servers.

The company has reduced their hosting costs, and early success has led to greater department-wide acceptance of cloud hosting. If the project continues to go well, they plan to migrate the production environment to AWS in 6-12 months.

Case 2: Disaster recovery

Enterprises spend millions maintaining backup systems that are largely unused for 95% of the year. The public cloud allows companies to pay for disaster recovery when they need it — and not pay when they don’t. The public cloud also has greater geographic diversity of datacenters than even the largest enterprises can afford, and enterprises can cheaply ship and store machine images in AWS.

However, migrating backups to the cloud is not a simple proposition. Many enterprises have highly regulated backup procedures around location of backups, length of data storage, and data security. Enterprises often do not have the internal experience in cloud database and storage to meet compliance standards.

A research company maintains critical intellectual property in on-premises and colocated data centers. Their business leaders are adamant about not hosting intellectual property in the public cloud, yet they want to explore the public cloud for cost savings. When evaluating their disaster recovery procedures, they realized that while their backups were geographically dispersed, each backup site was highly prone to earthquakes.

The research company has decided to maintain backups in AWS, including vast quantities of data from research trials. This will allow them to repurpose hardware currently dedicated to backups for front-line data processing, saving on both disaster recovery and hardware provisioning costs. They plan to use pilot light disaster recovery to maintain a mirrored DB server while keeping their other application and caching servers off. In the case of a disaster, they will be able to start up these instances in under 30 minutes.
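
A pilot light failover of this kind can be scripted in a few lines. The Python/boto3 sketch below assumes the stopped application and caching instances carry a hypothetical DR-Role=pilot-light tag in a hypothetical DR region.

```python
# A minimal boto3 sketch of a pilot-light failover step: start the stopped
# instances reserved for disaster recovery and wait until they are running.
import boto3

ec2 = boto3.client("ec2", region_name="us-west-2")  # hypothetical DR region

def start_pilot_light_instances():
    # Find the stopped application/caching instances reserved for failover.
    reservations = ec2.describe_instances(
        Filters=[
            {"Name": "tag:DR-Role", "Values": ["pilot-light"]},  # hypothetical tag
            {"Name": "instance-state-name", "Values": ["stopped"]},
        ]
    )["Reservations"]
    instance_ids = [
        i["InstanceId"] for r in reservations for i in r["Instances"]
    ]
    if instance_ids:
        # Boot everything except the already-running mirrored database.
        ec2.start_instances(InstanceIds=instance_ids)
        ec2.get_waiter("instance_running").wait(InstanceIds=instance_ids)
    return instance_ids

if __name__ == "__main__":
    print("Started:", start_pilot_light_instances())
```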

Case 3: Legacy systems

In many organizations, complex lines of application dependency mean that some components of an application can move to the cloud while others, usually those tied to legacy systems, cannot. These legacy systems are hosted on-premises or in private clouds while other components are moved to the public cloud.

A large SaaS provider maintains Oracle RAC as a major database system supporting a critical piece of several applications’ environments. Oracle RAC is a shared cache clustered database architecture that utilizes Oracle Grid Infrastructure to enable the sharing of server and storage resources. Automatic, instantaneous failover to other nodes enables an extremely high degree of scalability, availability, and performance. The provider wanted to move other components of the application onto AWS, but neither Amazon EC2 nor RDS provides native support for RAC.

The high performance capability of RAC meant they did not want to look for another solution on AWS. Instead, they decided to host RAC on bare metal servers and use AWS Direct Connect to provide low-latency connections to the other application tiers on AWS. They were able to successfully maintain the high performance of RAC while still gaining the scalability and low cost of AWS compute resources.

Case 4: Cloud bursting

The idea of cloud bursting appeared several years ago, but the “own the base, rent the spike” model has never taken off in the enterprise. This is partly because it is technically difficult to accomplish and often requires that applications be built for cross-cloud interoperability from the get-go. Systems often cannot talk to each other and often require handcrafting by developers and administrators. Building the automation scripts to perform scaling without human intervention is a challenge for even the most advanced automation engineers. Only in the last 6-12 months have we seen tools appear on the market that might facilitate bursting for enterprise-grade applications, such as Amazon resources now appearing in VMware’s vCenter management console.

We have yet to hear of a case where a large-scale enterprise employs cloud bursting. We expect this to become more common only as hybrid cloud tools mature. Vendors that claim to have the built-in ability to do cloud bursting are often limited in other ways, such as breadth of services and compliance. Furthermore, enterprises are at a far greater risk of vendor lock-in with these smaller clouds than with a system built modularly for scalability, as expert engineers can achieve in AWS.

In the next 12-24 months, many enterprises will be in the application evaluation and planning phase of their hybrid cloud deployments. Some will choose to experiment with the public cloud in highly controlled ways, as in Cases 1 and 2, while other enterprises — usually smaller ones — will take a more aggressive approach and migrate production applications, as in Cases 3 and 4. Although most hybrid deployments add complexity to enterprise infrastructure, the success of this planning phase will turn what could become a series of unwieldy mash-ups into the ultimate tool for business agility.

Is your cloud provider HIPAA compliant? An 11 point checklist


Healthcare organisations frequently turn to managed service providers (MSPs) to deploy and manage private, hybrid or public cloud solutions. MSPs play a crucial role in ensuring that healthcare organisations maintain secure and HIPAA compliant infrastructure.

Although most MSPs offer the same basic services – cloud design, migration, and maintenance – the MSP’s security expertise and their ability to build compliant solutions on both private and public clouds can vary widely.

Hospitals, healthcare ISVs, and SaaS providers need an MSP that meets and exceeds the administrative, technical, and physical safeguards established in the HIPAA Security Rule. The following criteria either must or should be met by an MSP:

1. Must offer business associate agreements

An MSP must offer a Business Associate Agreement (BAA) if it hopes to attract healthcare business. When a Business Associate is under a BAA, it is subject to audits by the Office for Civil Rights (OCR) and can be held accountable for a data breach and fined for noncompliance.

According to HHS, covered entities are not required to monitor or oversee how their Business Associates carry out privacy safeguards, or in what ways MSPs abide by the privacy requirements of the contract. Furthermore, HHS has stated that a healthcare organisation is not liable for the actions of an MSP under BAA unless otherwise specified.

An MSP should be able to provide a detailed responsibility matrix that outlines which aspects of compliance are the responsibility of whom. Overall, while an MSP allows healthcare organisations to outsource a significant amount of both the technical effort and the risk of HIPAA compliance, organisations should still play an active role in monitoring MSPs. After all, an OCR fine is often the least of an organisation’s worries in the event of a security breach; negative publicity is potentially even more damaging.

2. Should maintain credentials

There is no “seal of approval” for HIPAA compliance that an MSP can earn. The OCR grants no such qualifications. However, any hosting provider offering HIPAA compliant hosting should have had their offering audited by a reputable auditor against the HIPAA requirements as defined by HHS.

In addition, the presence of other certifications can assist healthcare organisations in choosing an MSP that takes security and compliance concerns very seriously. A well-qualified MSP will maintain the following certifications:

  • SSAE-16
  • SAS70 Type II
  • SOX Compliance
  • PCI DSS Compliance

While these certifications are by no means required for HIPAA compliance, the ability to earn such qualifications indicates a high level of security and compliance expertise. They require extensive (and expensive) investigations of physical infrastructure and team practices by third-party auditors.

3. Should offer guaranteed response times

Providers should indicate guaranteed response times within their Service Level Agreement. While 24/7/365 NOC support is crucial, the mere existence of a NOC team is not sufficient for mission-critical applications; healthcare organisations need a guarantee that the MSP’s NOC and security teams will respond to routine changes and to security threats in a timely manner.  Every enterprise should have guaranteed response times for non-critical additions and changes, as well.

How such changes and threats are prioritised, and what response is appropriate for each, should be the subject of intense scrutiny by healthcare organisations, which also have HIPAA-regulated obligations to notify authorities of security breaches.

4. Must meet data encryption standards

The right MSP will create infrastructure that is highly secure by default, meaning that the highest security measures should be applied to any component where such measures do not interfere with the function of the application. In the case of data encryption, while HIPAA’s Security Rule only requires encryption for data in transit, data should reasonably be encrypted everywhere by default, including at rest and in transit.

When MSPs and healthcare organisations encrypt PHI, they are within the “encryption safe harbor.” Unauthorised disclosure will not be considered a breach and will not necessitate a breach notification if the disclosed PHI is encrypted.

Strong encryption policies are particularly important in public cloud deployments. The MSP should be familiar with best practices for encrypting data both within the AWS environment and in transit between AWS and on-site back-ups or co-location facilities. We discuss data encryption best practices for HIPAA compliant hosting on AWS here.

It is important to note that not all encryption is created equal; look for an MSP that guarantees at least AES-256 encryption, the level enforced by federal agencies. It is useful to note that AWS’s check-box encryption of EBS volumes meets this standard.
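
For reference, enabling that built-in EBS encryption from code is a one-line flag. The Python/boto3 sketch below creates and attaches an encrypted volume; the availability zone, size, and instance ID are placeholders.

```python
# A minimal boto3 sketch: create an encrypted EBS volume and attach it to a
# hypothetical instance. Key management defaults to the account's EBS KMS key
# unless KmsKeyId is supplied.
import boto3

ec2 = boto3.client("ec2")

volume = ec2.create_volume(
    AvailabilityZone="us-east-1a",  # placeholder
    Size=100,                       # GiB
    VolumeType="gp2",
    Encrypted=True,                 # data at rest is encrypted with AES-256
)

ec2.get_waiter("volume_available").wait(VolumeIds=[volume["VolumeId"]])

ec2.attach_volume(
    VolumeId=volume["VolumeId"],
    InstanceId="i-0123456789abcdef0",  # hypothetical instance ID
    Device="/dev/sdf",
)
```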

5. Should have “traditional IT” and cloud expertise

Major healthcare organisations have begun to explore public cloud solutions. However, maintaining security in public clouds and in hybrid environments across on-premises and cloud infrastructure is a specialty few MSPs have learned. “Born in the Cloud” providers, whose businesses started recently and are made up exclusively of cloud experts, are quite simply lacking the necessary experience in complex, traditional database and networking that would enable them to migrate legacy healthcare applications and aging EHR systems onto the public cloud without either a) over-provisioning or b) exposing not-fully-understood components to security threats.

No matter the marketing hype around “Born in the Cloud” providers, it certainly is possible to have best-in-class DevOps and cloud security expertise and a strong background in traditional database and networking. In fact, this is what any enterprise with legacy applications should expect.

Hiring an MSP that provides private cloud, bare metal hosting, database migrations, legacy application hosting, and also has a dedicated senior cloud team is optimal. This ensures that the team is aware of the unique features of the custom hardware that currently supports the infrastructure, and will not expose the application to security risks by running the application using their “standard” instance configuration.

6. Must provide ongoing auditing and reporting

The HIPAA Security Rule requires that covered entities “regularly” audit their own environments for security threats. It does not, however, define “regularly,” so healthcare organisations should request the following from their MSPs:

  • Monthly or quarterly engineering reviews, both for security concerns and cost effectiveness
  • Annual 3rd party audits
  • Regular IAM reports. A credential report, which lists all of the organisation’s users and access keys, can be generated every four hours (see the sketch after this list)
  • Monthly re-certification of staff’s IAM roles
  • Weekly or daily reports from 3rd party security providers, like Alert Logic or New Relic
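
The IAM credential report mentioned in the list above can be pulled with a few lines of Python/boto3; the sketch below regenerates the report and prints a few columns for each user.

```python
# A minimal boto3 sketch: regenerate the IAM credential report and print the
# status of each user's console password and first access key.
import csv
import io
import time

import boto3

iam = boto3.client("iam")

# AWS regenerates the report at most once every four hours; poll until ready.
while iam.generate_credential_report()["State"] != "COMPLETE":
    time.sleep(2)

report_csv = iam.get_credential_report()["Content"].decode("utf-8")
for row in csv.DictReader(io.StringIO(report_csv)):
    print(row["user"], row["password_enabled"], row["access_key_1_active"])
```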

7. Must maintain compliant staffers and staffing procedures

HIPAA requires organisations to provide training for new workforce members as well as periodic reminder training. As a business associate, the MSP has certain obligations for training their own technical and non-technical staff in HIPAA compliance. There are also certain staff controls and procedures that must be in place and others that are strongly advisable. A covered entity should ask the MSP the following questions:

  • What formal sanctions exist against employees who fail to comply with security procedures?
  • What supervision exists of employees who deal with PHI?
  • What is the approval process for internal collaboration software or cloud technologies?
  • How do employees gain access to your office? Is a key fob required?
  • What is your email encryption policy?
  • How will your staff inform our internal IT staff of newly deployed instances/servers? How will keys be communicated, if necessary?
  • Is there a central authorisation hub such as Active Directory for the rapid decommissioning of employees?
  • Can you provide us with your staff’s HIPAA training documents?
  • Do you provide security threat updates to staff?
  • What are internal policies for password rotation?
  • (For Public Cloud) How are root account keys stored?
  • (For Public Cloud) How many staff members have Administrative access to our account?
  • (For Public Cloud) What logging is in place for employee access to the account? Is it distinct by employee, and if federated access is employed, where is this information logged?

While the answers to some of these questions will not confirm or deny an MSP’s degree of HIPAA compliance, they can help distinguish a new company that simply wants to attract lucrative healthcare business from a company already well versed in such procedures.

8. Must secure physical access to servers

In the case of a public cloud MSP, the MSP should be able to communicate why their cloud platform of choice maintains physical data centres that meet HIPAA standards. To review AWS’s physical data centre security measures, see their white paper on the subject. If a hybrid or private cloud is also maintained with the MSP, they should provide a list of global security standards for their data centres, including ISO 27001, SOC, FIPS 140-2, FISMA, and DoD CSM Levels 1-5, among others. The specific best practices for physical data centre security that healthcare organisations should look out for are well covered in ISO 27001 documentation.

9. Should conduct risk analysis in accordance with NIST guidelines

The National Institute of Standards and Technology, or NIST, is a non-regulatory federal agency under the Department of Commerce. NIST develops information security standards that set the minimum requirements for any information technology system used by the federal government.

NIST produces Special Publications that outline recommended security practices, and its Guide for Conducting Risk Assessments (SP 800-30) provides guidance on how to prepare for, conduct, communicate, and maintain a risk assessment, as well as how to identify and monitor specific risk factors. The NIST 800 series has become a foundational set of documents for service providers and organisations in the information systems industry.

An MSP should be able to provide a report that communicates the results of the most recent risk assessment, as well as the procedure by which the assessment was accomplished and the frequency of risk assessments.

Organisations can also undergo an independent assessment against the NIST 800-53 control catalogue as a further qualification of their security procedures. While again this is not required of HIPAA business associates, it indicates a sophisticated risk management procedure and is a much more powerful piece of evidence than standard marketing material around disaster recovery and security auditing.

10. Must develop a disaster recovery plan and business continuity plan

The HIPAA Contingency Plan standard requires the implementation of a disaster recovery plan. This plan must anticipate how natural disasters, security attacks, and other events could impact systems that contain PHI, and must set out policies and procedures for responding to such situations.

An MSP must be able to provide their disaster recovery plan to a healthcare organisation, which should include answers to questions like these:

  • Where is backup data hosted? What procedure maintains retrievable copies of ePHI?
  • What procedures identify suspected security incidents?
  • Who must be notified in the event of a security incident? How are such incidents documented?
  • What procedure documents the loss of ePHI and restores it?
  • What is the business continuity plan for maintaining operations during a security incident?
  • How often is the disaster recovery plan tested?

11. Should already provide service to large, complex healthcare clients

Although the qualifications listed above are more valuable evidence of HIPAA compliance, a roster of clients with large, complex, HIPAA-compliant deployments provides extra assurance. This pedigree is particularly useful in vendor decision discussions with non-technical business executives. The MSP’s ability to retain healthcare clients over the long term (2-3+ years) is also worth considering.


Analysing the differences between cloud orchestration and cloud automation

(c)iStock.com/TARIK KIZILKAYA

What is the difference between cloud orchestration and cloud automation? An exploration of these two terms is more than a vocabulary exercise; it highlights a key challenge for teams looking to improve IT processes.

In most situations, cloud automation describes a task or function accomplished without human intervention. Cloud orchestration describes the arranging and coordination of automated tasks, ultimately resulting in a consolidated process or workflow.

It is simplest to see this in an example. To create a standard process for spinning up an environment to host a new application, IT teams need to orchestrate several automated tasks: Auto Scaling groups, Elastic Load Balancers, and alarms automate the addition of new instances during a scaling event; a deployment automation tool like AWS CodeDeploy might automate releases; Puppet scripts might automate the configuration of the OS. Each of these functions is a cloud automation process, and one such task is sketched below.
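
As a hedged illustration of a single automation task (names and thresholds are hypothetical), the sketch below wires a CloudWatch alarm to an Auto Scaling policy using Python and boto3:

    # Hypothetical sketch: scale out an existing Auto Scaling group when
    # CPU runs hot. Group name and thresholds are illustrative.
    import boto3

    autoscaling = boto3.client("autoscaling", region_name="us-east-1")
    cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")

    # A scaling policy that adds two instances when triggered.
    policy = autoscaling.put_scaling_policy(
        AutoScalingGroupName="web-asg",
        PolicyName="scale-out-on-cpu",
        AdjustmentType="ChangeInCapacity",
        ScalingAdjustment=2,
    )

    # A CloudWatch alarm that fires the policy at sustained 70% CPU.
    cloudwatch.put_metric_alarm(
        AlarmName="web-asg-high-cpu",
        Namespace="AWS/EC2",
        MetricName="CPUUtilization",
        Statistic="Average",
        Period=300,
        EvaluationPeriods=2,
        Threshold=70.0,
        ComparisonOperator="GreaterThanThreshold",
        Dimensions=[{"Name": "AutoScalingGroupName", "Value": "web-asg"}],
        AlarmActions=[policy["PolicyARN"]],
    )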

Tasks like these must run in a particular order, within the right security groups and tooling, and with the appropriate roles and permissions. In other words, engineers must still complete hundreds of manual steps to deliver the new environment, even when its building blocks are automated. This is where cloud orchestration is key.

Orchestration tools, whether native to the IaaS platform or 3rd party software tools, enumerate the resources, instance types, IAM roles, etc. that are required, as well as the configuration of those resources and the interconnections between them. Engineers can use tools like AWS CloudFormation or VMware’s vRealize Orchestrator to create declarative templates that orchestrate these processes into a single workflow, so that the “new environment” workflow described above becomes a single API call.
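
To make that “single API call” concrete, here is a minimal sketch using Python and boto3; the stack name, template URL, and parameters are hypothetical, and the real template would define the VPC, Auto Scaling groups, load balancers, and so on:

    # Hypothetical sketch: launching a whole pre-orchestrated environment
    # with one CloudFormation API call. Template URL and values are illustrative.
    import boto3

    cfn = boto3.client("cloudformation", region_name="us-east-1")

    cfn.create_stack(
        StackName="new-app-environment",
        TemplateURL="https://s3.amazonaws.com/example-bucket/new-env.template",  # hypothetical
        Parameters=[
            {"ParameterKey": "InstanceType", "ParameterValue": "m4.large"},
            {"ParameterKey": "Environment", "ParameterValue": "staging"},
        ],
        Capabilities=["CAPABILITY_IAM"],  # required when the template creates IAM roles
    )

    # Block until the networking, scaling groups, load balancers, etc. exist.
    cfn.get_waiter("stack_create_complete").wait(StackName="new-app-environment")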


The creation of these templates is time-consuming and challenging. However, whether the IT team is small and needs such tools to multiply and preserve manpower, or the IT team is very large and needs to maintain a single source of truth, security configuration, and approximate cost per deployment across multiple teams, orchestration tools both simplify and de-risk complex IT processes.

How does orchestration relate to DevOps? Essentially, well-orchestrated IT processes enable and empower continuous integration and continuous delivery, uniting teams in the creation of a set of templates that meet developer requirements. Such templates are in many ways living documents that embody DevOps philosophy. Automation is a technical task, orchestration is an IT workflow composed of tasks, and DevOps is a philosophy that empowers and is powered by sophisticated, orchestrated processes.

As these examples suggest, orchestration has the potential to lower overall IT costs, free up engineering time for new projects, improve delivery times, and reduce friction between systems and development teams. However, every enterprise is at a different stage of implementing the tools and the philosophy orchestration implies. Some organisations have only begun the cloud automation process, and smaller organisations may still rely on a single individual or team to be the orchestration “brain” that coordinates IT processes. (One can imagine what happens when this individual or team leaves the organisation.) At the other end of the spectrum, organisations that orchestrate automation tasks into standard but flexible IT workflows under a single monitoring and orchestration software interface are true DevOps shops.


Five ways to monitor and control AWS cloud costs

(c)iStock.com/surpasspro

Many IT teams find that their AWS cloud costs grow less efficient as “clutter” builds up in their accounts. The good news is that both AWS and a small army of third party providers have developed tools to help engineers discover the cause(s) of these inefficiencies.

While there are several “easier” fixes, such as Reserved Instances and eliminating unused resources, the real issue is usually far more complex. Unplanned costs are frequently the result of nonstandard deployments that come from an unclear or absent development process, poor organisation, or the absence of automated deployment and configuration tools.

Controlling AWS costs is no simple task in enterprises with highly distributed teams, unpredictable legacy applications, and complex lines of dependency. Here are some strategies Logicworks engineers use to keep our clients’ costs down:

1. CloudCheckr and Trusted Advisor

The first step in controlling AWS costs is to gather historical cost/usage data and set up an interface where this data can be viewed easily.

There are many third party and native AWS resources that provide consolidated monitoring as well as recommendations for potential cost savings, using techniques like scheduled runtimes and parking calendars to take advantage of the best prices for On-Demand instances. A minimal example of this kind of scheduling is sketched below.
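
As a rough sketch of what “parking” looks like under the hood (the Schedule tag and its value are illustrative conventions, and a scheduler such as cron or Lambda would invoke the script), non-production instances can be stopped outside business hours:

    # Hypothetical sketch: "parking" non-production instances overnight.
    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")

    # Find running instances tagged for office-hours-only operation.
    reservations = ec2.describe_instances(
        Filters=[
            {"Name": "tag:Schedule", "Values": ["office-hours"]},
            {"Name": "instance-state-name", "Values": ["running"]},
        ]
    )["Reservations"]

    instance_ids = [
        inst["InstanceId"] for res in reservations for inst in res["Instances"]
    ]

    if instance_ids:
        ec2.stop_instances(InstanceIds=instance_ids)  # run nightly; start again each morning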

CloudCheckr is a sophisticated cloud management tool that is especially useful in enforcing standard policies and alerting developers if any resources are launched outside of that configuration. It also has features like cost heat maps and detailed billing analysis to give managers full visibility into their environments. When unusual costs appear in an AWS bill, CloudCheckr is the first place to look.

Trusted Advisor is a native AWS resource available with Business-level support. Its primary function is to recommend cost-saving opportunities, and, like CloudCheckr, it also provides availability, security, and fault tolerance recommendations. Even simple tuning of CPU usage and provisioned IOPS can add up to significant savings; Oscar Health recently reported that it saw 20% savings after using Trusted Advisor for just one hour.

Last year, Amazon also launched the Cost Explorer tool, a simple graphical interface displaying the most common cost queries: monthly cost by service, monthly cost by linked account, and daily spend. This level of detail is well suited to upper management and finance teams, as it does not drill into particularly detailed technical data.
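
For teams that want the same numbers programmatically, the sketch below assumes the Cost Explorer API is enabled on the account; the date range and formatting are arbitrary:

    # Hypothetical sketch: monthly cost by service via the Cost Explorer API.
    import boto3

    ce = boto3.client("ce", region_name="us-east-1")

    response = ce.get_cost_and_usage(
        TimePeriod={"Start": "2016-01-01", "End": "2016-04-01"},
        Granularity="MONTHLY",
        Metrics=["UnblendedCost"],
        GroupBy=[{"Type": "DIMENSION", "Key": "SERVICE"}],
    )

    for period in response["ResultsByTime"]:
        print(period["TimePeriod"]["Start"])
        for group in period["Groups"]:
            service = group["Keys"][0]
            amount = float(group["Metrics"]["UnblendedCost"]["Amount"])
            print(f"  {service}: ${amount:,.2f}")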

2. Reserved instances

The most obvious way to control compute cost is to purchase reserved EC2 instances for a period of one or three years, paid either all upfront, partially upfront, or with nothing upfront. Customers can see savings of over 50% on reserved instances versus on-demand instances.

However, reserved instances have several complications. First, it is not a simple matter to predict one or three years of usage when an enterprise has been on AWS for that long or less; second, businesses attracted to the pay-as-you-go cloud model are wary of capital costs that hark back to long-term contracts and sunk costs. It can also be difficult to find extra capacity of certain instance types on the Reserved Instance Marketplace, and enterprises might find using it complicated and costly in any case.

Companies can still get value out of reserved instances by following certain best practices:

  • Buy reserved capacity to meet the minimum or average sustained usage for the minimum number of instances necessary to keep the application running, or instances that are historically always running.
  • To figure out average sustained usage, use tools like CloudCheckr and Trusted Advisor (explored above) to audit your usage history. CloudCheckr will recommend reserved instance purchases based on those figures, which can be especially helpful if you do not want to comb through years of data across multiple applications.
  • Focus first on what will achieve the highest savings with rapid ROI; this lowers the potential impact of future unused resources. The best use-cases for reserved instances are applications with very stable usage patterns.
  • For larger enterprises, use a single individual, financial team, and AWS account to purchase reserved instances across the entire organisation. This allows for a centralised reserved instance hub, so that reservations that are not used by one application/team can be taken up by other projects internally.
  • Consolidated accounts can purchase reserved instances more effectively when instance families are also consolidated. Reservations cannot be moved between accounts, but they can be moved within an instance family: a reservation can be changed at any time from one size to another within the same family. The fewer families maintained, the more ways an RI can be applied. However, as explored below, the cost efficiencies gained by choosing a more recently released, more specialised instance type could outweigh the benefits of consolidating families to make the RI process smoother.
  • Many EC2 instances are underutilised. Experiment with a small number of RIs on stable applications, but you may find better value, without upfront costs, in smaller instance sizes and better scheduling of On-Demand instances. A rough way to gauge sustained usage directly from CloudWatch is sketched after this list.
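
As a hand-rolled alternative to the tools above (the instance ID is a placeholder), sustained usage for a single instance can be estimated from CloudWatch history:

    # Hypothetical sketch: 30-day average CPU as a rough input to RI and
    # right-sizing decisions.
    from datetime import datetime, timedelta

    import boto3

    cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")

    end = datetime.utcnow()
    start = end - timedelta(days=30)

    stats = cloudwatch.get_metric_statistics(
        Namespace="AWS/EC2",
        MetricName="CPUUtilization",
        Dimensions=[{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],
        StartTime=start,
        EndTime=end,
        Period=3600,          # one datapoint per hour
        Statistics=["Average"],
    )

    datapoints = stats["Datapoints"]
    if datapoints:
        avg_cpu = sum(dp["Average"] for dp in datapoints) / len(datapoints)
        print(f"30-day average CPU: {avg_cpu:.1f}%")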

3. Spot instances

Spot instances allow customers to set the maximum price for compute on EC2. This is great for running background jobs more cheaply, processing large data loads in off-peak times, etc. Those familiar with certain CPC bid rules in advertising may recognise the model.

The issue is that a spot instance might be terminated when you are 90% of the way through a job if the price for that instance rises above your bid. An architecture that does not plan for this can see the cost of the spot instance wasted. Bid prices need to change dynamically, but without exceeding on-demand prices. Best practice is to set up an Auto Scaling group that contains only spot instances; CloudWatch can watch the current rate and, when the price meets a bid, scale up the group as long as it stays within the parameters of the request. Then create a second Auto Scaling group with on-demand instances (the minimum to keep the lights on), and place an ELB in front of both so that requests are served either by the spot group or the on-demand group. If the on-demand price is greater than the bid price, create a new launch configuration that sets the min_size of the spot instance Auto Scaling group to 0. Sanket Dangi outlines this process here. A rough sketch of the spot-only group follows.
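
A minimal sketch of the spot-only group, using Python and boto3, might look like the following; the AMI, subnets, ELB name, and bid price are placeholders, and the on-demand group and fallback logic are omitted:

    # Hypothetical sketch: a spot-only Auto Scaling group behind the same ELB
    # as a small on-demand group (not shown). Names and values are illustrative.
    import boto3

    autoscaling = boto3.client("autoscaling", region_name="us-east-1")

    # Launch configuration that bids for spot capacity.
    autoscaling.create_launch_configuration(
        LaunchConfigurationName="workers-spot",
        ImageId="ami-0123456789abcdef0",
        InstanceType="c4.large",
        SpotPrice="0.05",        # maximum hourly bid in USD
    )

    # The spot group scales out only while the market price stays under the bid.
    autoscaling.create_auto_scaling_group(
        AutoScalingGroupName="workers-spot-asg",
        LaunchConfigurationName="workers-spot",
        MinSize=0,
        MaxSize=10,
        VPCZoneIdentifier="subnet-aaaa1111,subnet-bbbb2222",  # hypothetical subnets
        LoadBalancerNames=["workers-elb"],
    )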

Engineers can also use this process to make background jobs run faster, so that spot instances supplement a scheduled runtime whenever the bid price is below a certain figure, thus minimising the impact on end users and potentially saving cost versus reserved and on-demand instances.

For those not interested in writing custom scripts, Amazon recently acquired ClusterK, which reallocates workloads to on-demand resources when spot instances terminate and “opportunistically rebalances” to spot instances when the price fits. This dramatically expands the use case for spot instances beyond background applications to mission-critical apps and services.

4. Organise and automate

As IT teams evolve to a more service-oriented structure, highly distributed teams will increasingly have more autonomy over provisioning resources without the red tape and extensive time delay of traditional IT environments. While this is a crucial characteristic of any DevOps team, if it is implemented without the accompanying automation and process best practices, decentralised teams have the potential to produce convoluted and non-standard security rules, configurations, storage volumes, etc. and therefore drive up costs.

The answer to many of these concerns is CloudFormation. The more time an IT team spends in AWS, the more crucial it is that the team use CloudFormation. Enterprises deploying on AWS without CloudFormation are not truly taking advantage of all the features of AWS, and are exposing themselves to both security and cost risks as multiple developers deploy nonstandard code that is forgotten about or never updated.

CloudFormation allows infrastructure staff to bake in security, network, and instance family/size configurations, so that the process of deploying instances is not only faster but also less risky. Used in combination with a configuration management tool like Puppet, it becomes possible to bring up instances that are ready to go in a matter of minutes. Puppet manifests also provide canonical reference points if anything does not go as planned, and Puppet maintains the correct configuration even if that means reverting to an earlier version. For example, a custom fact can report which security groups an instance is running in, while a manifest automatically associates the instance with specific groups as needed. This can significantly lower the risk of downtime associated with faulty deploys. CloudFormation can also dictate which families of instances should be used, if it is important to leverage previously purchased RIs or to preserve the flexibility to do so at a later point.
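
As a hedged example of dictating instance families (the AMI and values are placeholders, and this is not a production-ready template), a CloudFormation parameter can restrict deployments to an approved list:

    # Hypothetical sketch: a template fragment that pins deployments to an
    # approved instance family, passed to CloudFormation as the template body.
    import json

    import boto3

    template = {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Parameters": {
            "InstanceType": {
                "Type": "String",
                "Default": "m4.large",
                # Developers may only pick from the family covered by RIs.
                "AllowedValues": ["m4.large", "m4.xlarge", "m4.2xlarge"],
            }
        },
        "Resources": {
            "AppInstance": {
                "Type": "AWS::EC2::Instance",
                "Properties": {
                    "ImageId": "ami-0123456789abcdef0",   # hypothetical AMI
                    "InstanceType": {"Ref": "InstanceType"},
                },
            }
        },
    }

    boto3.client("cloudformation").create_stack(
        StackName="app-standard-deploy",
        TemplateBody=json.dumps(template),
    )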

Granted, maintenance of these templates requires a significant amount of staff time and can initially feel like a step backwards in terms of cost efficiency; CloudFormation takes some time to learn. But investing the time and resources will have an enormous impact on a team’s ability to deploy quickly and will encourage consistency within an AWS account. Clutter builds up in any environment, but it can be significantly reduced when a team automates configuration and deployment.

5. Instance types and resource optimisation

Amazon is constantly delivering new products and services. Most of these have been added as a direct result of customer comments about cost or resource efficiencies, and it is well worth keeping on top of these releases to discover if the cost savings outweigh the cost of implementing a new solution. If the team is using CloudFormation, this may be easier.

New instance types often have cost-saving potential. For instance, last year Amazon launched the T2 instance family, which provides low-cost, stable baseline processing power plus the ability to build up “CPU credits” during quiet periods that are spent automatically during busy times. This is particularly convenient for bursty applications with rare spikes, like small databases and development tools.

A number of Amazon’s new features over the last year have related to price transparency, including the pricing tiers of reserved instances, so it appears safe to expect more services that offer additional cost efficiencies in the next several years.
