Category Archives: Big Data

Hitachi launches customer-centric predictive analytics for telcos

Mobile operators, telcos and service providers could soon stem the tide of subscriber defections thanks to a new cloud-based predictive analytics service from Hitachi Data Systems (HDS). By forecasting customer behaviour, HDS aims to improve subscriber satisfaction and reduce churn for its clients.

The new offering, announced at the 2016 Mobile World Congress (MWC) in Barcelona, will run as the Hitachi Unified Compute Platform (UCP) 6000 for Predictive Analytics. It uses the latest analytics software to find the patterns characteristic of unhappy customers and to predict customer attrition. The system uses predictive scoring – based on events such as repeated calls to the help desk and network failures – to give support staff the information they need for real-time decision making. Once identified, churn-prone subscribers can be targeted with compensatory offers before they defect.
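Purely as an illustration of the predictive-scoring idea described above (and not HDS’s or SAP’s actual pipeline), the sketch below trains a simple churn model on synthetic per-subscriber event counts; the feature names and thresholds are assumptions drawn from the article.

```python
# A hypothetical sketch of event-based churn scoring, not HDS's or SAP's
# actual pipeline. The features (help-desk calls, network failures, dropped
# calls) come from the article; the data below is synthetic.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(42)
n = 5000

# Synthetic per-subscriber event counts for the last 30 days.
helpdesk_calls = rng.poisson(1.0, n)
network_failures = rng.poisson(0.5, n)
dropped_calls = rng.poisson(2.0, n)

# Synthetic ground truth: churn likelihood rises with negative events.
logits = -3.0 + 0.8 * helpdesk_calls + 1.1 * network_failures + 0.3 * dropped_calls
churned = rng.random(n) < 1.0 / (1.0 + np.exp(-logits))

X = np.column_stack([helpdesk_calls, network_failures, dropped_calls])
model = LogisticRegression().fit(X, churned)

# Score one subscriber: four help-desk calls, two outages, five dropped calls.
score = model.predict_proba([[4, 2, 5]])[0, 1]
print(f"churn propensity: {score:.2f}")  # a high score would trigger a retention offer
```

In a real deployment the score would be recomputed as new events arrive, so support staff see an up-to-date churn propensity for each subscriber.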

The Hitachi UCP 6000 for Predictive Analytics is built on SAP HANA converged infrastructure, which can run in-memory interrogations of big data. HDS claims its UCP 6000 for SAP HANA simplifies the deployment of SAP solutions for telcos, which in turn helps them minimise IT infrastructure disruption and maximise application performance.

As part of the solution, SAP HANA and SAP Predictive Analytics will allow users to run predictive models on massive amounts of data drawn from external data points. Because the data is crunched in memory, clients get their insights in seconds and can nip customer dissatisfaction in the bud. SAP’s Predictive Analytics software automatically handles the wide dataset and makes predictive modelling more effective, according to HDS.

HDS described the churn-busting service as an ‘immense opportunity’ to translate data into tangible business outcomes.

IBM Watson Health buys Truven Health Analytics for $2.6B

IBM Watson Health has announced an agreement to acquire cloud-based big data specialist Truven Health Analytics. The deal, valued at $2.6 billion, will give the IBM Watson Health portfolio an additional 8,500 clients and information on 215 million new patients once the merger is concluded. Upon completion of due diligence, IBM will buy Truven from its current owner, Veritas Capital.

Truven Health Analytics has a client list that includes US federal and state government agencies, employers, health plans, hospitals, clinicians and life sciences companies. Its 215 million records of patient lives will be added to data from IBM Watson Health’s previous big data acquisitions: 50 million patient case histories that came with cloud-based health care intelligence company Explorys, and 45 million records owned by population health analyser Phytel. IBM Watson Health has also bought medical imaging specialist Merge Healthcare. In total, IBM Watson Health now holds 310 million records of ‘patient lives’ which, IBM claims, gives it a health cloud housing ‘one of the world’s largest and most diverse collections of health-related data’.

In September BCN reported how two new cloud services, IBM Watson Health Cloud for Life Sciences Compliance and IBM Watson Care Manager, had been created to unblock the big data bottlenecks in clinical research. The first service helps biomedical companies bring their inventions to market more efficiently, while the Care Manager system gives medical professionals a wider perspective on the factors they need to consider for personalised patient engagement programmes.

According to IBM, it has now invested over $4 billion in acquiring health data and systems and will have 5,000 staff in its Watson Health division, including clinicians, epidemiologists, statisticians, healthcare administrators, policy experts and consultants.

Truven’s cloud-based technology, systems and health claims data, currently housed in offices and data centres in Michigan, Denver, Chicago, Carolina and India, are to be integrated with the Watson Health Cloud.

IBM has invited partners to build text, speech and image recognition capacity into their software and systems, and 100 ecosystem partners have launched their own Watson-based apps. IBM opened a San Francisco office for its Watson developer cloud in September 2015 and is also building a new Watson data centre there, due to open in early 2016.

Cohesity claims data silo fragmentation solution

Santa Clara-based start-up Cohesity claims it can drastically reduce the escalating costs of secondary storage.

The new Cohesity Data Platform achieves this, it reckons, by consolidating all the diverse backup, archive, testing, development and replication systems onto a single, scalable entity.

In response to feedback from early adopters, it has now added site-to-site replication, cloud archive, and hardware-accelerated, 256-bit encryption to version 2.0 of the Data Platform (DP).

The system tackles one of the by-products of the proliferation of cloud systems: the creation of fragmented data silos. These are the after-effects of the rapid, unstructured growth of IT, which led to the adoption of endless varieties of individual systems for handling backup, file services, analytics and other secondary storage use cases. By unifying them, Cohesity claims it can cut the storage footprint of a data centre by 80%, and it promises an immediate, tangible return on investment by obviating the need for separate backup products.

Among the time-saving features added to the system are automated virtual machine cloning for testing and development, and a newly added public cloud archival tier. The latter gives enterprise users the option of spilling over their least-used data to Google Cloud Storage Nearline, Microsoft Azure, or Amazon S3 and Glacier in order to cut costs. The Cohesity Data Platform 2.0 also provides ‘adaptive throttling of backup streams’, which minimises the burden that storage places on the production infrastructure.
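To make the archival-tier idea concrete, here is a deliberately simplified sketch of a least-used-data spill-over policy. It is not Cohesity’s implementation: the 90-day threshold and the archive_to_cloud() helper are hypothetical.

```python
# An illustrative sketch of a least-used-data spill-over policy, not
# Cohesity's implementation. The 90-day threshold and the archive_to_cloud()
# helper are hypothetical.
import time
from dataclasses import dataclass

COLD_AFTER_DAYS = 90  # assumed policy: data untouched for 90 days is archived


@dataclass
class StoredObject:
    path: str
    last_access: float  # epoch seconds
    size_bytes: int


def archive_to_cloud(obj: StoredObject, target: str) -> None:
    # Placeholder for an upload to Nearline, Azure or S3/Glacier.
    print(f"archiving {obj.path} ({obj.size_bytes} bytes) to {target}")


def tier_cold_data(objects: list, target: str = "gcs-nearline") -> int:
    """Move objects that have gone cold to the archive tier; return bytes moved."""
    cutoff = time.time() - COLD_AFTER_DAYS * 86400
    moved = 0
    for obj in objects:
        if obj.last_access < cutoff:
            archive_to_cloud(obj, target)
            moved += obj.size_bytes
    return moved


if __name__ == "__main__":
    now = time.time()
    catalogue = [
        StoredObject("/backups/vm-001.img", now - 200 * 86400, 40 * 2**30),
        StoredObject("/backups/vm-002.img", now - 10 * 86400, 25 * 2**30),
    ]
    print(f"moved {tier_cold_data(catalogue) // 2**30} GiB to the archive tier")
```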

“We manage data sprawl with a hyperconverged solution that uses flash, compute and policy-based quality of service,” said Cohesity CEO Mohit Aron.

MapR gets converged data platform patented

California-based open source big data specialist MapR Technologies has been granted patent protection for its technique for converging open source, enterprise storage, NoSQL and other event streams.

The United States Patent and Trademark Office recognised the differentiation of the Hadoop specialist’s work within the free, Java-based Hadoop programming framework. Though the work builds on technology created by the open source oriented Apache Software Foundation, the patent office judged that MapR’s performance, data protection, disaster recovery and multi-tenancy features merit a recognisable level of differentiation.

The key components of the patent claims include a design based on containers: self-contained, autonomous units with their own operating system and application software. Containers can ring-fence data against loss, optimise replication techniques and create a system that can tolerate multiple node failures in a cluster.

Other vital components of the system are transactional read-write-update semantics with cluster-wide consistency, recovery techniques and update techniques. The recovery features can reconcile the divergence of replicated data after node failure, even while transactional updates are continuously being added. The update techniques allow for extreme variations of performance and scale while supporting familiar application programming interfaces (APIs).

MapR claims its Converged Data Platform allows clients to innovate with open source, provides a foundation for analytics (by converging all the data), creates enterprise grade reliability in one open source platform and makes instant, continuous data processing possible.

It is the differentiation of the core combined with standard APIs that makes it stand out from other Apache projects, MapR claims. Meanwhile, the system’s ability to use a single cluster that can handle converged workloads makes it easier to manage and secure, the company adds.

“The patent details how our platform gives us an advantage in the big data market. Some of the most demanding enterprises in the world are solving their business challenges using MapR,” said Anil Gadre, MapR Technologies’ senior VP of product management.

AliCloud launches 20 services under brand name Big Data Platform

Alibaba Cloud Computing (AliCloud) is to launch 20 new online services to the Chinese market under the brand name Big Data Platform.

The new application service range caters for activities in the data development chain, including processing, analysis, computing, machine learning and big data hosting. Around 1,000 developers are expected to be developing services with AliCloud in the next three years.

The plan is to use all the data-processing capacity and data-security skills that the Alibaba Group has accumulated in ten years of running the world’s biggest ecommerce platform, AliCloud president Simon Hu told reporters at the launch. “That data becomes a resource and a service that we can provide our clients,” said Hu.

Meanwhile, AliCloud is working with US chip specialist Nvidia to develop China’s first GPU-based, high-performance computing cloud platform. Along with offering clients GPU-accelerated computing services, AliCloud aims to remove the data bottlenecks that handicap many Chinese companies, according to Hu.

The Nvidia GPU-based services could also improve the computing capacity of many of Alibaba’s typical users in China, such as manufacturers and distributors, said Hu. “Right now, AliCloud mainly serves internet companies, but our next step will be to also provide cloud computing services to traditional industries such as manufacturing to remove the computing limitations that these companies may face,” said Hu.

The new launch puts AliCloud in direct contention with big data service supplier Data Mall, a start-up that recently launched an online mall for big data assets. The Data Mall cloud offering helps service providers and independent researchers trade intelligence and market information. Consulting firm Guan Zheng Hang Seng says the Data Mall service, owned by Beijing Datatang, now has 460,000 users supplying raw data to its platform.

A study by Forrester Research forecast that the enterprise cloud service market in China will be worth $3.8 billion by 2020, more than double its estimated size of $1.8 billion last year. According to Forrester analyst Charlie Dai, AliCloud now has the Chinese market’s biggest range of public cloud services and alliances with service providers.

Microsoft acquires Metanautix with Quest for intelligent cloud

Microsoft has bought Californian start-up Metanautix for an undisclosed fee in a bid to improve the flow of analytics data as part of its ‘intelligent cloud’ strategy.

The Palo Alto vendor was launched by Theo Vassilakis and Toli Lerios in 2014 with $7 million in funding. The Google and Facebook veterans had impressed venture capitalists with their plans for deeper analysis of disparate data. Their strategy was to integrate enterprises’ data supply chains by building a data computing engine, Quest, that provided scalable SQL access to any data.

Modern corporations aspire to data-driven strategies but have far too much information to deal with, according to Metanautix. With so many sources of data, only a fraction can be analysed, often because too many information silos are impervious to query tools.

Metanautix uses SQL, the most popular query language, to interrogate sources as diverse as data warehouses, open source databases, business systems and on-premises systems. The upshot is that all data is equally accessible, whether it comes from Salesforce or SQL Server, Teradata or MongoDB.
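As a toy illustration of that federated-SQL idea (not the Metanautix Quest engine itself), the sketch below joins rows that, in a real deployment, would be pulled by connectors from two different systems; here the data is stubbed in an in-memory SQLite database so the example runs standalone.

```python
# A toy illustration of the federated-SQL idea, not the Metanautix Quest
# engine itself. The rows are stubbed in memory; in practice they would be
# fetched by connectors from systems such as Salesforce or MongoDB.
import sqlite3

# Pretend these rows were pulled from two different systems.
crm_accounts = [("ACME", "Salesforce"), ("Globex", "Salesforce")]
warehouse_orders = [("ACME", 120000), ("Globex", 45000), ("Initech", 9000)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT, source TEXT)")
conn.execute("CREATE TABLE orders (account TEXT, revenue INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", crm_accounts)
conn.executemany("INSERT INTO orders VALUES (?, ?)", warehouse_orders)

# One SQL statement spanning both 'systems'.
query = """
    SELECT a.name, a.source, o.revenue
    FROM accounts AS a
    JOIN orders AS o ON o.account = a.name
    ORDER BY o.revenue DESC
"""
for name, source, revenue in conn.execute(query):
    print(name, source, revenue)
```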

“As someone who has led complex, large-scale data warehousing projects myself, I am excited about building the intelligent cloud and helping to realize the full value of data,” said Joseph Sirosh, corporate VP of Microsoft’s Data Group, announcing the takeover on the company website.

Metanautix’s technology, which promises to connect to all data regardless of type, size or location, will no longer be available as a branded product or service. Microsoft will initially integrate it into its SQL Server and Cortana Analytics systems, with details of integration with the rest of Microsoft’s service portfolio to be announced in the coming months, Sirosh said.

The blog posting from Metanautix CEO Theo Vassilakis hinted at further developments. “We look forward to being part of Microsoft’s important efforts with Azure and SQL Server to give enterprise customers a unified view of all of their data across cloud and on-premises systems,” he said.

Cloud is growing up: from cost saving to competitive advantage

The last decade witnessed one of the most transformational waves of technological change ever to break on the shores of IT, if not the most transformational of all: cloud computing. Companies vied to position themselves as the key holders to the cloud universe, and customers, too, competed for the honour of being first to market in their use of, and migration to, the various cloud models.

The first phase of cloud was characterised by the migration of business to the cloud. This phase is still happening, with companies of all shapes and sizes at varying stages along the migration path.

The initial catalyst for cloud adoption was, broadly speaking, cost and efficiency. Amid the global economic fluctuations and downturn of the mid-noughties, the cloud model of IT promised considerable efficiencies and thus cost savings. For the early migrators, however, cloud has moved beyond simple cost efficiencies to the next phase of maturity: competitive advantage.

IDC reported earlier in the year that 80% of cloud applications in the future will be data-intensive; therefore, industry know-how and data are the true benefits of the cloud.

The brokerage of valuable data (be it a client’s own proprietary information about inventory or customer behaviour, or wider industry data) and the delivery of this critical information as a service is where the competitive advantage can truly be found – it is now almost a case of ‘Innovation as a Service’.

The changing modus operandi of cloud has largely been driven by the increasing types, variety and volumes of data streams businesses now require to stay competitive. The roll-out of cognitive and analytics capabilities within cloud environments is now as important to achieving business goals and competitive advantage as the cloud infrastructure itself.

There is almost no better example of this than the symbiotic relationship between Weather.com and its use of the cloud. For a company like Weather.com, extracting maximum value from global weather data was paramount not only to producing accurate forecasts but also, through advanced analytics, to managing its data globally.

Through IoT deployments and cloud computing, Weather.com collects data from more than 100,000 weather sensors, aircraft and drones, millions of smartphones, buildings and even moving vehicles. The forecasting system itself ingests and processes data from thousands of sources, resulting in approximately 2.2 billion unique forecast points worldwide and more than 26 billion forecasts delivered a day.

By integrating real-time weather insights, Weather.com has been able to improve operational performance and decision-making. By shifting its hugely data-intensive services to the cloud and integrating them with advanced analytics, it was not only able to deliver billions of highly accurate forecasts, it was also able to derive added value from a previously untapped resource, creating new value-added services and revenue streams.

Another great example is Shop Direct. As one of the UK’s largest online retailers, delivering more than 48 million products a year and welcoming over a million daily visitors across a variety of online and mobile platforms, its move to a hybrid cloud model increased flexibility and meant it could respond more quickly to changes in demand as it continues to grow.

With a number of digital department stores, including the £800m flagship brand Very.co.uk, the cloud underpins a variety of analytics, mobile, social and security offerings that enable Shop Direct to improve its customers’ online shopping experience while empowering its workforce to collaborate more easily too.

Smart use of cloud has allowed Shop Direct to continue building a pre-eminent position in the digital and mobile world, and it has been able to innovate and be better prepared to tackle challenges such as high site traffic around Black Friday and the Christmas period.

In the non-conformist, shifting and disruptive landscape of today’s businesses, innovation is the only sure way of maintaining a pre-eminent position and setting a company apart from its competitors – as such, the place of the cloud as the marketplace for this innovation is assured.

Developments in big data, analytics and IoT highlight the pivotal importance of cloud environments as enablers of innovation, while cognitive capabilities like Watson, in conjunction with analytics engines, add informed intelligence to business processes, applications and customer touch points along every step of the business journey.

While many companies recognise that migration to the cloud is now a necessity, it is more important to be aware that true, long-term business value can only be derived from what you actually operate in the cloud, and this is the real challenge for businesses and their IT departments as we look towards 2016 and beyond.

Written by Sebastian Krause, VP IBM Cloud Europe

Google upgrades Cloud SQL, promises managed MySQL offerings

Google has announced the beta availability of a new, improved Cloud SQL for Google Cloud Platform – and an alpha version of its much-anticipated Content Delivery Network offering.

In a blog post, Brett Hesterberg, Product Manager for Google Cloud Platform, said the second generation of Cloud SQL aims to give better performance and more ‘scalability per dollar’.

In Google’s internal testing, the second generation Cloud SQL proved seven times faster than the first generation and it now scales to 10TB of data, 15,000 IOPS and 104GB of RAM per instance, Hesterberg said.

The upshot is that transactional databases now have a flexibility that was unachievable with traditional relational databases. “With Cloud SQL we’ve changed that,” Hesterberg said. “Flexibility means easily scaling a database up and down.”

Databases can now ramp up and down in size and the number of queries per day. The allocation of resources like CPU cores and RAM can be more skilfully adapted with Cloud SQL, using a variety of tools such as MySQL Workbench, Toad and the MySQL command-line. Another promised improvement is that any client can be used for access, including Compute Engine, Managed VMs, Container Engine and workstations.

In the new cloud environment, databases that are only used for brief or infrequent tasks need to be easy to stop and restart, according to Hesterberg. Cloud SQL now caters for these increasingly common applications through the Cloud Console, the command line within Google’s gcloud SDK, or a RESTful API. This makes administration scriptable and minimises costs by only running databases when necessary.
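As a minimal sketch of that kind of scripting, assuming the gcloud SDK is installed and using a hypothetical instance name (check the current Cloud SQL documentation for the exact flags), starting and stopping an instance around an infrequent job might look like this:

```python
# A minimal sketch of scripting Cloud SQL from Python via the gcloud CLI.
# Assumes the Google Cloud SDK is installed and authenticated; the instance
# name "reporting-db" is hypothetical, and current flags should be checked
# against the Cloud SQL documentation.
import subprocess

INSTANCE = "reporting-db"


def set_activation_policy(policy: str) -> None:
    """policy is 'ALWAYS' (instance runs) or 'NEVER' (instance is stopped)."""
    subprocess.run(
        ["gcloud", "sql", "instances", "patch", INSTANCE,
         f"--activation-policy={policy}"],
        check=True,
    )


# Spin the database up for an infrequent reporting job, then shut it down.
set_activation_policy("ALWAYS")
# ... run the occasional workload here ...
set_activation_policy("NEVER")
```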

Cloud SQL will make MySQL databases more manageable, claims Hesterberg, since Google will apply patches and updates to MySQL, manage backups, configure replication and provide automatic failover for high availability (HA) in the event of a zone outage. “It means you get Google’s operational expertise for your MySQL database,” says Hesterberg. Subscribers signed up to Google Cloud Platform can now get a $300 credit to test-drive Cloud SQL, the company announced.

Meanwhile, in another blog post, Google announced an alpha release of its own content delivery network, Google Cloud CDN. The service may not be consistent and is not recommended for production use, Google warned.

Google Cloud CDN will speed up its cloud services by using distributed edge caches to bring content closer to users, in a bid to compensate for its relatively sparse global data centre coverage compared with rivals AWS and Azure.

Red Hat helps eMedLab share supercomputer in the cloud

Red Hat has harmonised a cloud of bioinformatics systems to create ‘virtual supercomputers’ that can be shared by the eMedLab collective of research institutes.

The upshot is that researchers at institutes such as the Wellcome Trust Sanger, UCL and King’s College London can carry out much more powerful data analysis when researching cancers, cardio-vascular conditions and rare diseases.

Since 2014, hundreds of researchers across eMedLab have been able to use a high-performance computing (HPC) facility with 6,000 cores of processing power and 6 petabytes of storage from their own locations. The cloud environment now created by technology partners Red Hat, Lenovo, IBM and Mellanox, along with supercomputing integrator OCF, means none of the users has to move their data to the computer. Each of the seven institutes can configure its share of the HPC facility according to its needs, self-selecting the memory, processors and storage required.

The new HPC cloud environment uses the Red Hat Enterprise Linux OpenStack Platform with Lenovo Flex hardware to create virtual HPC clusters bespoke to each individual researcher’s requirements. The system was designed and configured by OCF, working with partners Red Hat, Lenovo, Mellanox and eMedLab’s research technologists.
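To illustrate what self-selecting memory, processors and storage can look like on an OpenStack cloud, here is a hedged sketch using the openstacksdk library; it is not eMedLab’s actual tooling, and the cloud, flavour, image and network names are hypothetical.

```python
# A hedged sketch (not eMedLab's actual tooling) of self-selecting CPU, memory
# and storage on an OpenStack cloud with the openstacksdk library. The cloud
# name, flavour, image and network names below are all hypothetical.
import openstack

conn = openstack.connect(cloud="emedlab")  # credentials come from clouds.yaml

flavor = conn.compute.find_flavor("hpc.16cpu.128gb")     # CPU/RAM choice
image = conn.compute.find_image("bioinformatics-node")   # pre-built node image
network = conn.network.find_network("project-private")

# Launch a small virtual cluster of identical worker nodes.
for i in range(4):
    server = conn.compute.create_server(
        name=f"genomics-worker-{i}",
        flavor_id=flavor.id,
        image_id=image.id,
        networks=[{"uuid": network.id}],
    )
    conn.compute.wait_for_server(server)  # block until the node is ACTIVE
```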

With the HPC hosted at a shared data centre for education and research, the cloud configuration has made it possible to run a variety of research projects concurrently. The facility, aimed solely at the biomedical research sector, changes the way data sets are shared between leading scientific institutions internationally.

The eMedLab partnership was formed in 2014 with funding from the Medical Research Council. Original members University College London, Queen Mary University of London, London School of Hygiene & Tropical Medicine, the Francis Crick Institute, the Wellcome Trust Sanger Institute and the EMBL European Bioinformatics Institute have been joined recently by King’s College London.

“Bioinformatics is a very data-intensive discipline,” says Jacky Pallas, Director of Research Platforms at University College London. “We study a lot of de-identified, anonymous human data. It’s not practical for scientists to replicate the same datasets across their own, separate physical HPC resources, so we’re creating a single store for up to 6 petabytes of data and a shared HPC environment within which researchers can build their own virtual clusters to support their work.”

In other news, Red Hat has announced an upgrade of CloudForms that improves hybrid cloud management with added support for Microsoft Azure, advanced container management and improvements to its self-service features.

MapR claims world’s first converged data platform with Streams

Apache Hadoop specialist MapR Technologies claims it has invented a new system to make sense of all the disjointed streams of real-time information flooding into big data platforms. The new MapR Streams system will, it says, blend everything from system logs to sensor feeds to social media streams, whether transactional or tracking data, and manage it all under one converged platform.

Streams is described as a stream processing tool built for real-time event handling and high scalability. When combined with other MapR offerings, it can harmonise existing storage data and NoSQL tools to create a converged data platform – the first of its kind in the cloud industry, the company says.

Starting from early 2016, when the technology becomes available, cloud operators can combine Streams with MapR-FS for storage and the MapR-DB in-Hadoop NoSQL database to build the MapR Converged Data Platform. This will free users from having to separately monitor streams, file storage, databases and analytics, the vendor says.

Since it can handle billions of messages per second and join clusters from separate data centres across the globe, the tool could be of particular interest to cloud operators, according to Michael Brown, CTO at comScore. “Our system analyses over 65 billion new events a day, and MapR Streams is built to ingest and process these events in real time, opening the doors to a new level of product offerings for our customers,” he said.

While traditional workloads are being optimised, new workloads from emerging IoT dataflows present far greater challenges that need to be solved in a fraction of the time, claims MapR. MapR Streams will help companies deal with the volume, variety and speed at which data has to be analysed while simplifying the multiple layers of hardware stacks, networking and data processing systems, according to the company. Blending MapR Streams into a converged data system eliminates the separate silos of data for streaming, analytics and traditional systems of record, MapR claimed.

MapR Streams supports standard application programming interfaces (APIs) and integrates with other popular stream processors such as Spark Streaming, Storm, Flink and Apex. When available, the MapR Converged Data Platform will be offered as a free-to-use Community Edition to encourage developers to experiment.
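To give a feel for the publish/subscribe pattern this kind of platform targets, here is a hedged sketch written against the open source kafka-python client; MapR’s own client libraries and topic naming differ in detail, and the broker address and topic name below are placeholders.

```python
# A hedged sketch of the publish/subscribe pattern this kind of platform
# targets, written against the open source kafka-python client. MapR's own
# client libraries and topic naming differ in detail; the broker address and
# topic name are placeholders.
from kafka import KafkaConsumer, KafkaProducer

producer = KafkaProducer(bootstrap_servers="localhost:9092")
# Publish one sensor reading as an event on a topic.
producer.send("sensor-readings", b'{"sensor": "temp-42", "value": 21.7}')
producer.flush()

# Elsewhere, a consumer (or a Spark Streaming / Flink job) reads the stream.
consumer = KafkaConsumer(
    "sensor-readings",
    bootstrap_servers="localhost:9092",
    auto_offset_reset="earliest",
    consumer_timeout_ms=5000,  # stop iterating after 5s with no new messages
)
for message in consumer:
    print(message.value)
```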