Back in 1944 Wesleyan University Librarian Fremont Rider wrote a paper which estimated American university libraries were doubling in size every sixteen years meaning the Yale Library in 2040 would occupy over 6,000 miles of shelves. This is not big data as most people would know it, but the vast and violent increase in the quantity and variety of information in the Yale library is the same principle.
The concept was not known as big data back then, but technologists today are also facing a challenge on how to handle such a vast amount of information. Not necessarily on how to store it, but how to make use of it. The promise of big data, and data analytics more generically, is to provide intelligence, insight and predictability but only now are we getting to a stage where technology is advanced enough to capitalise on the vast amount of information which we have available to us.
Back in 2003 Google wrote a paper on its MapReduce and Google File System which has generally been attributed to the beginning of the Apache Hadoop platform. At this point, few people could anticipate the explosion of technology which we’ve witnessed, Cloudera Chairman and CSO Mike Olson is one of these people, but he is also leading a company which has been regularly attributed as one of the go-to organizations for the Apache Hadoop platform.
“We’re seeing innovation in CPUs, in optical networking all the way to the chip, in solid state, highly affordable, high performance memory systems, we’re seeing dramatic changes in storage capabilities generally. Those changes are going to force us to adapt the software and change the way it operates,” said Olson, speaking at the Strata + Hadoop event in London. “Apache Hadoop has come a long way in 10 years; the road in front of it is exciting but is going to require an awful lot of work.”
Analytics was previously seen as an opportunity for companies to look back at its performance over a defined period, and develop lessons for employees on how future performance can be improved. Today the application of advanced analytics is improvements in real-time performance. A company can react in real-time to shift the focus of a marketing campaign, or alter a production line to improve the outcome. The promise of big data and IoT is predictability and data defined decision making, which can shift a business from a reactionary position through to a predictive. Understanding trends can create proactive business models which advice decision makers on how to steer a company. But what comes next?
For Olsen, machine learning and artificial intelligence is where the industry is heading. We’re at a stage where big data and analytics can be used to automate processes and replace humans for simple tasks. In a short period of time, we’ve seen some significant advances in the applications of the technology, most notably Google’s AlphaGo beating World Go champion Lee Se-dol and Facebook’s use of AI in picture recognition.
Although computers taking on humans in games of strategy would not be considered a new PR stunt, IBM’s Deep Blue defeated chess world champion Garry Kasparov in 1997, this is a very different proposition. While chess is a game which relies on strategy, go is another beast. Due to the vast number of permutations available, strategies within the game rely on intuition and feel, a complex task for the Google team. The fact AlphaGo won the match demonstrates how far researchers have progressed in making machine-learning and artificial intelligence a reality.
“In narrow but very interesting domains, computers have become better than humans at vision and we’re going to see that piece of innovation absolutely continue,” said Olsen. “Big Data is going to drive innovation here.”
This may be difficult for a number of people to comprehend, but big data has entered the business world; true AI and automated, data-driven decision may not be too far behind. Data is driving the direction of businesses through a better understanding of the customer, increase the security of an organization or gaining a better understanding of the risk associated with any business decision. Big data is no longer a theory, but an accomplished business strategy.
Olsen is not saying computers will replace humans, but the number of and variety of processes which can be replaced by machines is certainly growing, and growing faster every day.