At the AWS Summit in New York, Amazon Web Services focused predominantly on enterprise migration, launching two new products aimed at taking the difficulty out of data analysis and transfer.
The cloud infrastructure giant announced the launch of AWS Migration Hub, a tool which aims to help organisations migrate their assets from on-prem data centres to Amazon’s cloud, as well as the general availability of AWS Glue, a product first announced in December last year which eases the process of moving data between data stores.
“Companies want to be able to fly from some of the constraints and break free from lock-in, and some of the relationships they have,” Adrian Cockcroft, AWS VP of cloud architecture, told attendees. “What we’ve been hearing from our customers is they want the freedom to build things quickly, unshackle from current database vendors, drive costs down, and have good ways to migrate out.”
This led to a discussion of relational database engine Aurora, AWS’ fastest-growing product, which was launched in 2014. This time last year, AWS managed services partner Logicworks, writing for this publication, explained the reason for its success. “As cloud adoption matures, expect more companies to make a (slow) migration over to cloud-native systems,” the company wrote. “Because in the end, it is not just about licensing costs. It is about removing management burden from IT – and choosing to focus engineering talent on what really matters.”
More than 34,000 databases (below) have been migrated since the product’s launch; Cockcroft mentioned that he had given this talk a few times this year, and the number was continually being updated. An example of a company using Aurora to its advantage is Expedia, which performs 300 million writes a day on the engine, tracking how many hotel rooms are available across every hotel in the world.
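Taking the quoted figure at face value, 300 million writes a day works out to a sustained rate of several thousand writes a second – a quick back-of-the-envelope check:

```python
# Back-of-the-envelope arithmetic on Expedia's quoted Aurora workload.
writes_per_day = 300_000_000
seconds_per_day = 24 * 60 * 60                 # 86,400
writes_per_second = writes_per_day / seconds_per_day
print(round(writes_per_second))                # roughly 3,500 writes a second, sustained
```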
For organisations looking to take everything out of a data centre – not just the greenfield apps, but the mission-critical ones too – AWS Migration Hub is the answer, Cockcroft added. The product is generally available today, hosted in AWS’ US West 2 region in Oregon, but with a global reach.
Glue, on the other hand, is positioned as a fully managed data catalogue and ETL (extract, transform, load) service to take the fuss out of those “ubiquitous, and extremely tedious” workloads, as Dr. Matt Wood, general manager for artificial intelligence at AWS, put it.
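Glue generates and manages this kind of code automatically; purely as a hypothetical illustration of the extract-transform-load pattern it takes off developers’ hands – not Glue’s actual generated output or API – a hand-rolled version might look like:

```python
# A minimal, hypothetical ETL sketch illustrating the pattern AWS Glue
# automates. The functions, field names, and in-memory "stores" here are
# illustrative only; Glue's real jobs run serverlessly against its Data Catalog.

def extract(source):
    """Pull raw records from the source data store."""
    return list(source)

def transform(records):
    """Normalise field values and drop incomplete rows."""
    cleaned = []
    for row in records:
        if row.get("city"):
            cleaned.append({"city": row["city"].strip().title(),
                            "rooms": int(row.get("rooms", 0))})
    return cleaned

def load(records, target):
    """Write the transformed records to the target store."""
    target.extend(records)
    return len(records)

source = [{"city": " new york ", "rooms": "120"}, {"city": ""}]
warehouse = []
loaded = load(transform(extract(source)), warehouse)
print(loaded)  # one clean row survives the transform
```

The tedium Wood describes lives in the `transform` step – endless per-source cleanup scripts like this one – which is precisely what Glue aims to generate and run for you.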
Wood first riffed on the importance of AWS’ plethora of data handling tools, from the previously mentioned Aurora, to ElastiCache, to Redshift. “This approach, where we have a broad set of tools, each with a deep set of functionality, allows you to find the right tool for the job,” he said. “You don’t see Formula 1 engineers try and fix Formula 1 cars with Swiss Army knives.”
An example from Redshift Spectrum, which enables SQL queries to be run against exabytes of data in Amazon S3, was presented to the audience (below). Running a complex query against an exabyte dataset would have taken Hive, running on a 1,000-node cluster, five years – an estimate, naturally, rather than a measured run – whereas Spectrum took just over two and a half minutes.
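Taking the quoted figures at face value, that is roughly a million-fold speedup – a quick sanity check on the numbers:

```python
# Back-of-the-envelope check on the quoted Hive vs Redshift Spectrum figures.
hive_minutes = 5 * 365.25 * 24 * 60    # five years, expressed in minutes
spectrum_minutes = 2.5                 # "just over" two and a half minutes
speedup = hive_minutes / spectrum_minutes
print(round(speedup))                  # on the order of a million times faster
```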
Wood said that up to three quarters of data scientists’ and data warehouse managers’ time was spent running ETL workloads. “Nobody goes to work in the morning and wants to write another ETL script,” he added. Through Glue, and its entirely serverless system and what Wood described as “by far the simplest UI [he had] ever shown to an audience of this size”, AWS aims for that to be a thing of the past.
Wood also discussed the machine learning projects being undertaken on AWS’ infrastructure. “The reason [machine learning] has started to stick in this iteration is that the cloud has enabled machine learning and customers to overcome the single largest point of friction, which is almost always around scale,” he explained. “Much like we did in the early days of AWS… we want to put this magical technology into the hands of every developer.”
Among the most interesting examples of the many companies exploring machine learning (below) were Stanford, which trained a deep learning model to help prevent diabetic blindness; Arterys, which has put together the first FDA-approved use of neural networks in medical imaging; and Wolfram Alpha. The latter, best known as the company to which Siri refers if she is stumped by a question, uses machine learning on AWS to build a computational knowledge engine. “When we’re talking about the challenges of handling inference at scale, with complicated deep learning models, this is the sort of scale you can achieve today through AWS,” said Wood.
Elsewhere, AWS announced a new customer in the shape of Hulu. The media company is moving away from its previous strategy of managing its own infrastructure and data centres – “everything we have, you name it, we built it” as the company put it to attendees – to help cover its various bets, from streaming content, to subscription systems, to live television.
“While we’ve experimented with cloud before, this became our first large scale production deployment,” said Rafael Soltanovich, VP of software development at Hulu.
“Building live TV is really hard, especially when you’re trying to do it in a radically different way,” he added, giving an example of just one of the issues Hulu had to sort out when rebuilding its entire tech stack. Take The Avengers: the film series based on the Marvel comic book characters, and the unrelated 1998 film and 1960s UK TV series of the same name. Having the right name, and the right image, for each product is vital to capture the attention of the viewer, Soltanovich said.
The recent Game of Thrones premiere was another example of Hulu’s nimble infrastructure in action; balancing between video on demand and live streams, between data centre and cloud, the company was able to normalise the load on its infrastructure to keep up with ‘massive’ user demand.
You can find out more about AWS Migration Hub here.
Picture credits: AWS/Screenshots