All posts by kelly.stirman

A trillion tiny robots in the cloud: The future of AI in an algorithm world

(c)iStock.com/davincidig

It’s not words, but how we say them that speaks volumes. By analysing your tone of voice, a new computer algorithm can predict whether your relationship will last, and it arguably does so more accurately than professional therapists.

Now, I’m not here to advise you on couples therapy. What I’m interested in is how algorithms like this one are going to change the way we live and provide a massive opportunity for the cloud industry.

Data + machine learning algorithm = AI

Popular culture leads us to believe that the future of artificial intelligence (AI) will be a single, magical supercomputer. Think HAL 9000 from 2001: A Space Odyssey, or the ship’s computer on Star Trek. But we’re starting to understand now that we’ve been looking at it all wrong.

The future of AI isn’t about one giant super-intelligence. Instead, it’s about many small, dedicated agents that know you intimately and work on your behalf to improve your everyday life. That could be helping you shop, get to work or, even, find a partner. Each is focused on a discrete task, and each gets better over time and adapts to your needs as they evolve.

This kind of smart software isn’t new. It’s been almost 20 years since chess Grandmaster Garry Kasparov lost to IBM’s Deep Blue in a chess match.

Amazon has had machine learning for many years too. Every time the giant retailer serves up a recommendation, AI has decided what would be the best option for you on that particular day.

But if Amazon can have AI working to sell you more things, shouldn’t you have your own AI working to find better deals from other vendors, looking for reliable reviews, or keeping you ahead of the latest trends?

Of course, algorithms are also nothing new, but it’s become vastly easier to write and use them in recent years. The main drivers behind this have been cheap and ubiquitous computing, an abundance of data, and a platform that brings these elements together: cloud computing.

For AI to be useful it needs all three: a good algorithm, millions of relevant data points, and the computing power to process them quickly enough to drive actions in real time. Lose any one of these and it’s not nearly as useful in the modern world.

The point is this: your business is going to get disrupted by AI, but not in the way you might have thought. Rather than one monolithic, all-knowing consciousness, it’s going to be a trillion tiny agents all focused on specific tasks, and all of them hungry for data.

Finding the time

One example of this focused approach is x.ai. The New York City startup has created an AI-powered personal assistant that schedules meetings for you. It’s simple enough to use. You connect your calendar to x.ai and then CC in amy@x.ai whenever a discussion starts about scheduling. Once you copy in Amy, she takes over the thread, finds a mutually agreeable time and place, and sets up the meeting for you. The person at the other end has no idea she’s not a human.

How brilliant is that? In the US alone, there are 87 million knowledge workers who spend nearly five hours per week scheduling meetings. I don’t imagine many of us enjoy the process, and most would be more than happy to delegate it to a virtual assistant instead.

It’s not such a simple fix though. The technology that powers the virtual assistant is complex. Amy passes each email through natural language processing and supervised learning engines that understand the context of the information. The data is then enriched and stored in MongoDB where it is combined with other information such as the user’s preferred working hours and their current time zone. Based on these inputs Amy determines the appropriate course of action and crafts a response. There’s no app to install. Amy exists only in the cloud.
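To make that last step a little more concrete, here is a minimal sketch in Python of how an assistant might combine two people’s preferred working hours and time zones to propose a slot. It is an illustration only, not x.ai’s implementation: the Preferences class, its fields, the first_shared_slot function and the fixed UTC offsets are all assumptions made for the example, and it ignores weekends, daylight saving and existing calendar entries.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Preferences:
    """Hypothetical stored profile: working hours in the user's local time."""
    utc_offset_hours: int  # fixed offset from UTC (an assumption; ignores DST)
    work_start: int        # first working hour, local time, inclusive
    work_end: int          # end of the working day, local time, exclusive


def is_working_hour(prefs, moment_utc):
    """Return True if the given UTC moment falls inside the user's working day."""
    local_zone = timezone(timedelta(hours=prefs.utc_offset_hours))
    local = moment_utc.astimezone(local_zone)
    return prefs.work_start <= local.hour < prefs.work_end


def first_shared_slot(a, b, start_utc, hours_to_search=48):
    """Walk forward hour by hour and return the first hour that suits both people."""
    for step in range(hours_to_search):
        candidate = start_utc + timedelta(hours=step)
        if is_working_hour(a, candidate) and is_working_hour(b, candidate):
            return candidate
    return None  # no overlap found in the search window


new_york = Preferences(utc_offset_hours=-5, work_start=9, work_end=17)
london = Preferences(utc_offset_hours=0, work_start=9, work_end=17)
start = datetime(2024, 1, 15, 0, 0, tzinfo=timezone.utc)

# 09:00 in New York is 14:00 UTC, which is still inside London's working day.
slot = first_shared_slot(new_york, london, start)
print(slot)
```

The hard part of a real assistant is everything before this step: working out from free-form email what the people involved actually want. Once that has been reduced to structured preferences, the scheduling decision itself can be quite small.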

This is only one example of how algorithms are changing our lives. Cities are starting to automatically adjust traffic flow based on weather, construction, congestion, events, and other real-time factors. The ads you see while browsing your favourite sites are chosen by algorithms that run over your data and match it against calculated preferences about you to serve up something highly relevant.

Many of the most popular cloud and data technologies are already responding to this trend. Apache Spark is full of machine learning libraries that come built into the framework. Google released TensorFlow as an open source project, which makes the machine learning technology behind Google Translate and many other products freely available to anyone.

With these tools easily accessible to developers, it’s easy to see how many different tasks could be quickly re-imagined as algorithms that are delivered as convenient services. In fact, Peter Sondergaard at Gartner is predicting a whole new Algorithm Economy.

Things you can’t algorithm

Cloud computing cleared the two biggest hurdles for AI by providing abundant, low-cost computing and a way to leverage massive volumes of data. However, a number of challenges remain. Chief among those challenges is the one affecting the whole industry: skills.

While open source libraries make it easy to get started, for genuinely powerful AI you need actual data scientists: people with strong programming backgrounds, a deep understanding of mathematics and statistics, and business domain knowledge. Needless to say, those people are rare.

The other challenges will mainly be around the data. Most modern data is inherently unstructured – it’s geographic data, sensor data, and social data. If your stack is built on decades-old relational technology you are going to struggle to feed modern algorithms running in the cloud.

Despite the challenges, the main lesson is this: small, focused, cloud-based algorithms are going to be the AI that changes our lives over the next decade. It’s better to solve one problem really well, than it is to solve 100 problems poorly. Today’s markets reward companies that maintain their focus.

To take advantage of these trillion robots in the cloud, you’re going to need a thoroughly modern infrastructure.

Why the Tomb Raider publishers created their own database as a service

Picture credit: Flickr/Anthony Jauneaud

In the past 25 years, one of the most proprietary technologies that has come to market is cloud computing. That’s the claim I made to the editor of this very publication back in July. The cloud’s promise of flexibility may prove to be a Trojan horse of vendor lock-in as you move up each layer of the vendor’s stack, consuming not just infrastructure, but also software and services.

In this article, I’d like to explain why there’s a risk of cloud lock-in and share one robust tactic for avoiding it.

In the beginning

All the major cloud vendors began with infrastructure as a service (IaaS) offerings with two irresistible features: dramatically reduced infrastructure provisioning time, and the advantage of a pay-as-you-go elastic pricing model. This was incredibly well received by the market, and today it’s hard to imagine that most enterprise workloads won’t eventually be deployed on these offerings.

With a captive audience, these same vendors realised they could simply move up the stack, putting an ‘aaS’ on every layer. The most valued and most critical software component of all, the database, is very much the end game here as a database as a service (DBaaS). Amazon, Microsoft, and Google, among others, have developed wonderfully simple DBaaS offerings that eliminate much of the complexity and headache from running your own deployment in the cloud. The challenge is that the data always outlives the applications.

There is nothing wrong with the idea of DBaaS. Your business is probably using some of it right now. Most organisations are resource constrained, especially when it comes to database admins. They are happy to give up some control and flexibility for convenience. In some cases their choice may be as stark as to either build their application on a DBaaS or not to build at all.

Many organisations are just recovering from an era when vendors used punitive and rigid licensing to force inflexible and outdated products on people. In the past 15 years we’ve seen the unstoppable march of Linux, as well as open source alternatives for every layer of the technology stack. While these options were initially viewed as inferior to their proprietary competitors, today open source is not only legitimate, it has become the innovator in many categories of technology.

Cloud vendors developed most of their offerings on an open source stack, and for good reason. It would be easy to view this as a continuation of the move away from vendor lock-in, but the truth is if you take a closer look at the pricing models, the egress charges, the interfaces, the absence of source code and so on, you’ll notice a familiar whiff coming from many of the cloud contracts. Prediction: DBaaS is going to be the new lock-in that everyone complains viciously about.

So here’s your challenge: you want to offer your team of developers the convenience of a DBaaS, but you want to keep complete control of your stack to avoid lock-in and maximise flexibility. You also want to avoid having an unsightly invoice from *insert cloud giant here* stapled to your forehead every month. What do you do?

The third way

Square Enix is one of the world’s leading providers of gaming experiences, publishing iconic titles like Tomb Raider and Final Fantasy. Collectively, Square Enix games have sold hundreds of millions of units worldwide. It’s not just in gaming distribution that Square Enix is an innovator though. The operations team has also taken a progressive approach to delivering infrastructure to its army of designers and developers.

Every game has its own set of functionality, so each team of developers uses dedicated infrastructure in a public cloud to store unique data sets for their game. Some functions are used across games, such as leaderboards, but most functions are specific to a given title. For example, Hitman Absolution introduced the ability for players to create their own contracts and share those with other players.

As the number and complexity of online games grew, Square Enix found it could not scale its infrastructure, which at that time was based on a relational database. The operations team needed to overcome that scaling issue and provide all the gaming studios with access to a high performance database. To do this, they migrated to a non-relational database and built a multi-tenant platform they call Online Suite. Online Suite is deployed as one instance of infrastructure that is shared across the company and studios. Essentially, the ops team built their own MongoDB as a service (MDaaS) which is delivered to all of Square Enix’s studios and developers.

The Online Suite provides an API that allows the studios to use MDaaS to store and manage metrics, player profiles, info cast information, leaderboards and competitions. The MDaaS is also used to enable players to share messages across all supported platforms, such as PlayStation, Xbox, PC, web, iOS, and Android. Essentially, the Online Suite supports any functionality that is shared across multiple games.
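As a rough illustration of what a shared, multi-tenant service of this kind looks like from a studio’s point of view, here is a minimal Python sketch of a leaderboard API in which every call is scoped to a tenant (one game). This is not Square Enix’s Online Suite: the SharedLeaderboards class, its method names and the in-memory dictionary standing in for the database are all assumptions made for the example.

```python
from collections import defaultdict


class SharedLeaderboards:
    """Hypothetical in-memory stand-in for a shared, multi-tenant leaderboard service."""

    def __init__(self):
        # One deployment for everyone: tenant (game) -> {player: best score}.
        self._scores = defaultdict(dict)

    def submit(self, tenant, player, score):
        """Record a score for a player, keeping only that player's best result."""
        best = self._scores[tenant].get(player)
        if best is None or score > best:
            self._scores[tenant][player] = score

    def top(self, tenant, n=3):
        """Return the top n (player, score) pairs for one tenant only."""
        entries = self._scores.get(tenant, {}).items()
        ranked = sorted(entries, key=lambda item: (-item[1], item[0]))
        return ranked[:n]


board = SharedLeaderboards()
board.submit("game_a", "ana", 120)
board.submit("game_a", "ben", 95)
board.submit("game_a", "ana", 80)   # lower than ana's best, so ignored
board.submit("game_b", "cho", 300)  # a different game sharing the same service

print(board.top("game_a"))
print(board.top("game_b"))
```

The design point is that the studios only ever see the small API, while the operations team runs and scales a single shared deployment behind it.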

This gives them the best of both worlds: control and convenience. They are able to maintain full control of their self-managed environment, with the convenience that comes from a management platform consumed as a service from the cloud.

Square Enix can now scale dozens of database clusters on-demand and deliver 24×7 availability to its developers around the world, all with a single operations staffer. By adopting a multi-tenant DBaaS, Square Enix has been able to consolidate its database instances. This has improved performance and reliability while simplifying life for developers.

The way forward

Crucially, Square Enix has not lost any control. The ops team can still access the code throughout the stack, but they’ve hidden that complexity from their users. As far as the developers are concerned, they have the irresistible cloud experience that is flexible and elastic, but Square Enix has protected itself from lock-in by keeping ownership of the stack.

I’m not crazy. I don’t think this approach would work in every single organisation. I do hope that the example is instructive though. It’s not always a simple dichotomy between the burden of running your own stack and losing control and getting locked in.

Cloud computing is dramatically changing the way we create services and products. It is a great tool but it’s also a Siren’s call of flexibility and cost savings which has the potential to trap you and to limit your options. But if you learn the lessons from our friends behind Tomb Raider, you might just be able to navigate a course out of cloud cuckoo land.

Why the cloud wars are good for you: Google’s NoSQL database and the battle with Microsoft and Amazon

(c)iStock.com/TARIK KIZILKAYA

Google is creeping up your data stack. In response to Microsoft’s recent Azure DocumentDB announcement, Google has released Cloud BigTable into the wild. Cloud BigTable is a managed NoSQL database service based on a version of BigTable used internally for more than a decade. Last week Google announced the database would be made available to the masses, and could even be accessed through the Apache HBase API (take that, Hadoop vendors!). It’s a big play in the war for the control of computing workloads running in the cloud.

Sure, Google’s announcement could be viewed as yet another volley in the cloudy game of thrones but it’s more than that. There are two reasons it’s interesting:

  1. Shifting battlefields – big players are moving up the stack to provide greater value and chase higher margins
  2. Full circle – this is NoSQL coming full circle, from research paper to a full service offering

Making stacks from the stack

There are only three companies that have a chance of long-term success with a mass-market cloud infrastructure business. No prizes for guessing the names: Amazon, Microsoft and Google. Amazon is the clear leader. Microsoft is making huge investments which, so far, have the Redmond-based giant out ahead of Google too. The bets are big and the stakes are high. The reality is that most companies are moving to the cloud. It’s only a matter of time and which infrastructure player they choose to invest with.

Nobody generates their own electricity in their house; it’s a utility. Cloud infrastructure should be the same. As profit margins flatten for cloud offerings, the major players are looking elsewhere for big data dollars. That’s what Google’s announcement is all about. The search behemoth wants to gobble up more of the big data stack.

In the beginning, cloud was just the basic physical infrastructure. In recent years, vendors have been adding more and more of what you need to run an application. If you want to run infrastructure on Google, Amazon or Microsoft today, there’s less you need to do for that to become a reality.

So how does this arms-race impact our friendly neighbourhood IT decision maker? Right now it’s all good. There are more options and the fierce competition is forcing down prices. However, buyer beware – many of the services and platforms are far more niche than the providers would have you believe (see below), while at the same time locking you into the vendor’s technology stack.

Full circle: From research paper to product

Many of the important software innovations of the past decade are based on published papers describing Google’s infrastructure. Hadoop is based on two key pieces of research Google published in 2003 and 2004 on its file system (GFS) and MapReduce implementation. Other examples of research that spawned popular open source software projects include Chubby (ZooKeeper), Dremel (Drill), and BigTable (HBase and Cassandra).

HBase was initially developed at Powerset, a company that was later acquired by Microsoft, to power a natural language search system. Facebook built Cassandra to power its Inbox search feature. Both HBase and Cassandra use a data model inspired by BigTable, which is why they are being compared to Google’s new offering.

Fast forward seven years and the thing that inspired people to build these open source software projects is now a service you can use. And to take advantage of it you don’t need to build the software that Google uses. In fact you don’t even have to run a product that emulates it. You can use Google’s BigTable itself to power your own applications.

As my friend and former colleague Matt Asay pointed out: “Google has finally given enterprises a clear reason to prefer Google over its cloudy alternatives: The chance to scale and run like Google.”

Are you going to need a BiggerTable?

Organisations that are interested in Google Cloud BigTable have already decided this type of data model is right for their application. This offering is competitive with products from DataStax and the Hadoop distribution vendors that support HBase. While some advanced customers will choose to manage their own infrastructure, many will be happy to let someone else take care of the details, especially if that someone is Google.

Cloud BigTable is a database with a very narrow set of features. It is a wide column store with a simple key-value query model. Like Cassandra and HBase, Cloud BigTable is limited by:

  • A complex data model which presents a steep learning curve to developers, slowing the rate of new application development
  • Lack of features such as an expressive query language (key-value only), integrated text search, native secondary indexes, aggregations and more. Collectively, these enable organisations to build more functional applications faster
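To show what a key-value query model means in practice, here is a toy wide column table in Python. It is a sketch under stated assumptions, not the Cloud BigTable, HBase or Cassandra API: the WideColumnTable class and its put, get, scan and find_by_value methods are hypothetical names chosen for the example. Lookups by row key, and range scans over sorted row keys, are natural; a question about a column’s value has to examine every row because there is no secondary index.

```python
class WideColumnTable:
    """Toy wide column store: row key -> {"family:qualifier": value}."""

    def __init__(self):
        self._rows = {}

    def put(self, row_key, column, value):
        """Write one cell; rows may each have a different set of columns."""
        self._rows.setdefault(row_key, {})[column] = value

    def get(self, row_key):
        """Key-value lookup: fetch all columns stored under one row key."""
        return dict(self._rows.get(row_key, {}))

    def scan(self, start_key, end_key):
        """Range scan over row keys in sorted order, end key exclusive."""
        keys = sorted(k for k in self._rows if start_key <= k < end_key)
        return [(k, dict(self._rows[k])) for k in keys]

    def find_by_value(self, column, value):
        """Without a secondary index, every row has to be examined."""
        return sorted(k for k, cols in self._rows.items() if cols.get(column) == value)


table = WideColumnTable()
table.put("user#001", "profile:country", "UK")
table.put("user#001", "profile:name", "Ana")
table.put("user#002", "profile:country", "US")
table.put("user#003", "profile:country", "UK")
table.put("user#003", "stats:logins", 7)  # a column the other rows do not have

print(table.get("user#001"))
print(table.scan("user#002", "user#004"))
print(table.find_by_value("profile:country", "UK"))
```

This is why so much of the design effort with these databases goes into choosing row keys: the access patterns have to be known up front, because the store will not answer ad hoc questions efficiently.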

Competition conquers complexity

This is a story about cloud infrastructure warfare and, in a way, we all win. In the insanely competitive cloud market the prices are dropping as quickly as the capabilities are expanding. As we’ve seen in the mobile industry over the past decade, incredible competition drives incredible innovation.

It’s clear the future of databases is primarily going to be in the cloud. MongoDB is designed for cloud deployments and is incredibly popular on AWS, and Google Cloud Platform already offers hosted MongoDB. We also think that a big part of removing complexity is finding software an organisation can standardise on. No one wants to deal with half a dozen databases. They want standards that have the best parts of the various niche data tools.

To achieve this, the big players are throwing huge money at infrastructure and services. Google, Amazon and Microsoft will continue to search for more areas of big data where they can provide value in the market. Ultimately this will lower barriers to entry for new products and services.

Before the year is out, I’d expect there will be even more vendors trying to creep up your big data stack. That’s good for all of us.