Around six years ago, Amazon introduced a very low-cost EC2 usage option: Spot Instances. With EC2 Spot Instances, you can save money, but there’s a catch – they won’t necessarily be available when you need or want to use them. Managing and strategising around such temperamental instances can be challenging, and they can be unsuitable for AWS customers who need to provide stable online end user experiences.
Amazon created this type of resource to optimise data centre utilisation by selling its data centres’ spare capacity. However, if a customer on a higher-priced On-Demand or Reserved capacity plan requires an instance, Amazon will reclaim it from the Spot Instance pool and provide it to the higher-paying customer. This article presents several use cases that show how this type of “unreliable” resource can be used to improve cost-effectiveness without putting your system at risk.
How Spot Instance pricing works
Before we jump into the use cases, let’s take a look at the ins and outs of Spot Instance pricing. As shown in the diagram below, AWS users can generate a Spot Instance request by bidding the maximum price that they are willing to pay per instance per hour. If the bid is greater than the current spot market price, the user gets the Spot Instance and pays the current spot price rather than the bid. It’s recommended to check the Spot Instance pricing history in the AWS console in order to select a bid price that works for you.
Due to the unpredictable nature of Spot Instances and the service interruptions they may bring, you should look into using them for workloads or applications that are not mission critical. In other words, application or workload downtime, or even failure, should not impact business operations.
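The bidding rule described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in this article, not AWS's actual allocation logic: the function names `evaluate_bid` and `simulate` and the hourly prices are hypothetical, and the sketch assumes the instance simply does not run during any hour in which the spot price is at or above the bid.

```python
def evaluate_bid(bid, market_price):
    """Apply the rule from the text: a bid above the current spot
    price is fulfilled, and the user pays the spot price, not the bid.

    Returns (fulfilled, hourly_price_paid).
    """
    if bid > market_price:
        return True, market_price
    return False, None


def simulate(bid, hourly_prices):
    """Replay a (made-up) price history against a fixed bid.

    Assumption: in hours where the bid is not above the spot price,
    the instance is unavailable and nothing is charged.
    Returns (hours_run, total_cost).
    """
    hours_run = 0
    total_cost = 0.0
    for price in hourly_prices:
        fulfilled, paid = evaluate_bid(bid, price)
        if fulfilled:
            hours_run += 1
            total_cost += paid
    return hours_run, total_cost


if __name__ == "__main__":
    history = [0.03, 0.04, 0.06, 0.03]  # illustrative prices, USD per hour
    hours, cost = simulate(0.05, history)
    print(f"ran {hours} of {len(history)} hours for ${cost:.2f}")
```

Replaying a real pricing history with a few candidate bids in this way is one rough way to see the trade-off between a lower bid and more interrupted hours.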
Batch processing
Batch jobs are used to upload information at the end of a business day, generate reports, process documents, and perform other non-interactive operations. The main requirement is processing a large amount of data at one time. These applications rely heavily on the cloud’s large number of processors and are mostly stateless. This means that an instance could go down at any point and lose its state, but once a new instance is brought up, the system continues to function. Other examples of batch jobs that are suitable for Spot Instances are systems that convert file formats, such as video encoding. AWS has reported cases of up to 70% cost savings for this type of use case with Spot Instances when compared to On-Demand EC2 Instance pricing.
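To illustrate why stateless batch work tolerates interruptions, here is a minimal sketch of a work queue that puts a task back whenever the worker handling it is lost. The names `run_batch`, `process` and `was_interrupted` are hypothetical; in a real system the queue would live outside the instances (in a managed queue service, for example) and the interruption would come from AWS rather than from a callback.

```python
from collections import deque


def run_batch(tasks, process, was_interrupted):
    """Process every task, re-queuing any task whose worker was lost.

    tasks           -- iterable of hashable task identifiers
    process         -- function that turns one task into a result
    was_interrupted -- function(attempt_number) -> bool, standing in
                       for a Spot Instance being reclaimed mid-task
    Returns (results_by_task, total_attempts).

    Note: the loop only ends if interruptions eventually stop.
    """
    pending = deque(tasks)
    results = {}
    attempts = 0
    while pending:
        task = pending.popleft()
        attempts += 1
        if was_interrupted(attempts):
            # No state is kept on the worker, so the task can simply
            # go back on the queue for a replacement instance.
            pending.append(task)
            continue
        results[task] = process(task)
    return results, attempts


if __name__ == "__main__":
    videos = ["a.mp4", "b.mp4", "c.mp4"]
    # Pretend the second attempt is cut short by a reclaimed instance.
    done, tries = run_batch(videos, str.upper, lambda n: n == 2)
    print(done, tries)
```

Because no task depends on state held by a particular worker, an interruption costs only the repeated attempt, not the whole job.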
High performance computing
The second use case is high performance computing, which is particularly relevant to the health care and scientific research industries. Workloads such as processing complex scans of body organs or genome mapping require a large amount of compute capacity. Traditionally, with physical data centres, scientists would settle for a small amount of capacity. Nowadays, however, with the public cloud and cheap Spot Instances, scientists can provision a cluster of 1,000 servers (rather than five), for example, cutting processing times down from months to hours. Calculations and data processing jobs that traditionally took months or even years to complete can now be distributed among compute nodes for faster processing. Provisioning a cluster of 1,000 Spot Instances is relatively safe, considering that removing 10, 20, or even 50 servers from the cluster costs at most 5% of its capacity and doesn’t significantly impact performance.
According to AWS, a biotechnology platform company decided to concentrate on making their drug design process more data-driven, efficient and predictable. After just five days of engineering effort, they found that Amazon EC2 Spot Instances saved them 50%.
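The arithmetic behind that claim can be checked with a short sketch. It assumes an idealised, perfectly parallel workload in which every server contributes equally; the function names `remaining_capacity` and `slowdown` are illustrative only.

```python
def remaining_capacity(cluster_size, reclaimed):
    """Fraction of the cluster still running after servers are reclaimed."""
    return (cluster_size - reclaimed) / cluster_size


def slowdown(cluster_size, reclaimed):
    """Factor by which the job takes longer, assuming the work divides
    evenly across the remaining servers (an idealised assumption)."""
    return cluster_size / (cluster_size - reclaimed)


if __name__ == "__main__":
    for lost in (10, 20, 50):
        print(
            f"losing {lost} of 1000 servers: "
            f"{remaining_capacity(1000, lost):.1%} capacity left, "
            f"job takes {slowdown(1000, lost):.3f}x as long"
        )
    # By contrast, losing one server from a five-server cluster:
    print(f"losing 1 of 5 servers: {remaining_capacity(5, 1):.0%} capacity left")
```

Under these assumptions, losing 50 of 1,000 servers leaves 95% of capacity, whereas losing a single server from a five-server cluster leaves only 80%.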
Development/testing
The third use case is large scale testing. Test and development accounts for more than 60% of independent software vendors’ (ISV) and enterprises’ IT environments, since teams can be required to spin up multiple test environments at the same time. These processes are typically scheduled automatically and run infrequently. Spot Instances can be a very good way to reduce costs in this situation.
Although they are an appealing way to decrease costs, Spot Instances are not always an effective option for your development team. For instance, if a developer is working on something and an instance suddenly goes down, the work can be lost. However, if mechanisms are in place for backup and recovery when an application fails, the developer should be able to recover the system in a matter of minutes. Ideally, Spot Instances are used for testing purposes, such as load testing and monitoring of websites for companies that need to ensure online performance.
Next steps with AWS Spot Instances
Just a few months ago, Amazon announced that it had acquired ClusterK, a company that helps Spot Instances fail over to On-Demand or Reserved Instances. Amazon continues to invest in this capability by releasing more features that help customers utilise Spot Instances more effectively. For example, Spot fleets reduce the development costs of using Spot Instances: they enable admins to manage an entire set of Spot Instances as a unit instead of dealing with each one separately.
Another useful Amazon feature is its alert capability. When a Spot Instance is about to be automatically terminated, Amazon sends an alert notifying you before the termination is carried out. Spot Instances are an important milestone in the cloud roadmap, and even though they are not hugely popular, Amazon is still investing in developing this area of its services.
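One way to act on the termination alert is to poll the instance metadata service from inside the Spot Instance. The sketch below assumes the `spot/termination-time` metadata path, which returns HTTP 404 until the instance is marked for termination and a UTC timestamp afterwards. The helper names are hypothetical, and instances that enforce IMDSv2 would additionally need a session token, which is omitted here.

```python
import urllib.error
import urllib.request
from datetime import datetime, timezone

# Instance metadata path for the Spot termination notice (only
# reachable from inside an EC2 instance).
TERMINATION_URL = "http://169.254.169.254/latest/meta-data/spot/termination-time"


def fetch_termination_body(url=TERMINATION_URL, timeout=2):
    """Return the raw notice text, or None if no termination is scheduled."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        if err.code == 404:  # no notice has been issued yet
            return None
        raise


def parse_termination_time(body):
    """Parse a timestamp such as '2015-01-05T18:02:00Z'; None if absent or malformed."""
    if not body:
        return None
    try:
        parsed = datetime.strptime(body.strip(), "%Y-%m-%dT%H:%M:%SZ")
    except ValueError:
        return None
    return parsed.replace(tzinfo=timezone.utc)


def seconds_until_termination(body, now):
    """Seconds left before termination, or None if none is scheduled."""
    when = parse_termination_time(body)
    if when is None:
        return None
    return (when - now).total_seconds()
```

A worker could call `fetch_termination_body()` every few seconds and, once `seconds_until_termination` returns a number, stop accepting new work and save or hand back whatever it is processing.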
AWS is not the only player in the Spot Instance domain. Google recently announced its own offering, Google Preemptible VMs, which, like Spot Instances, can be terminated at any time; however, their price is fixed and published on GCP’s pricing page. The volatility of Spot Instance prices exemplifies the complexity of a cloud environment. Therefore, continuously tracking usage and monitoring costs is a must when it comes to decision making. This is where Cloudyn comes into the picture.