Category Archives: Big Data

PwC, Rosslyn partner on cloudy big data

PwC is teaming up with Rosslyn to help bring analytics-based insights to clients

PricewaterhouseCoopers (PwC) announced a partnership with Rosslyn Analytics that will see the two firms jointly develop and offer cloud-based big data services to clients.

The two companies said they plan to use Rosslyn’s suite of cloud-enabled data technologies when advising clients on supply chain risk reduction, productivity optimisation and cost reduction, with PwC bringing its deep knowledge of different verticals to the table.

“For our clients, acquiring the knowledge most important to their operations, securing that information and using it optimally are critical – now more than ever before. We are delighted to be teaming up with Rosslyn to offer our joint knowledge and capabilities to clients – giving them one place to go, maximizing experience and assets from both organizations,” said Yann Bonduelle, PwC partner and head of data analytics.

“In our most recent survey of business leaders, 75 per cent of UK CEOs say that data and data analytics are proving valuable to them, whilst 79 per cent see data mining and analysis as one of the top digital technologies. This highlights how important it is to our clients to embrace the technology available to give them greater competitive advantage,” Bonduelle added.

Charles Clark, chief executive of Rosslyn Analytics, said: “Our collaboration is about helping clients to embrace their journey in analytics, and transform their organisations to thrive and maintain relevance in a rapidly changing world. An increasing number of companies, large and small, look to our data technologies to help them reduce costs and risks, and improve their revenue and productivity across their businesses.”

Like KPMG and others in the Big Four, PwC has struck several deals with cloud and data services providers in a bid to add more value to its client offerings. The company most recently struck a deal with Google that has seen it work closely with clients to tailor Google Apps for Work to their specific business processes and needs, and help them optimise their operations.

Microsoft jumps into the data lake

At the company’s annual Build conference this week Microsoft unveiled, among other things, an Azure Data Lake service, which the company is pitching as a hyperscale big data repository for all kinds of data.

The data lake concept is a fairly new one, the gist of it being that data of varying types and structures is created at such a high velocity and in such large volumes that it’s prompting a necessary evolution in the applications and platforms required to handle that data.

It’s really about being able to store all that data in a volume-optimised (and cost-efficient) way that maintains the integrity of that information when you move it somewhere else, whether that be an application, an analytics engine or a data warehouse.

“While the potential of the data lake can be profound, it has yet to be fully realized. Limits to storage capacity, hardware acquisition, scalability, performance and cost are all potential reasons why customers haven’t been able to implement a data lake,” explained Oliver Chiu, Microsoft’s product marketing manager for Hadoop, big data and data warehousing.

The company is pitching the Azure Data Lake service as a means of running Hadoop and advanced analytics using Microsoft’s own Azure HDInsight, as well as Revolution R Enterprise and Hadoop distributions from Hortonworks and Cloudera.

It’s built to support “massively parallel queries” so information is discoverable in a timely fashion, and to handle high volumes of small writes, which the company said makes the service ideal for Internet of Things applications.

“Microsoft has been on a journey for broad big data adoption with a suite of big data and advanced analytics solutions like Azure HDInsight, Azure Data Factory, Revolution R Enterprise and Azure Machine Learning. We are excited for what Azure Data Lake will bring to this ecosystem, and when our customers can run all of their analysis on exabytes of data,” Chiu explained.

Pivotal is also among a handful of vendors that have bought seriously into the concept of data lakes. However, although Chiu alluded to cost and performance issues associated with the data lake approach, many enterprises aren’t yet at a stage where the variety, velocity and volume of data their systems ingest are prompting a conceptual change in how that data is perceived, stored or curated; in a nutshell, many enterprises are still too siloed, not least in how they treat data.

Taipei Computer Association, Government launch Big Data Alliance

TCA, government officials launching the Big Data Alliance in Taipei

The Taipei Computer Association and Taiwanese government-sponsored institutions have jointly launched the Big Data Alliance, aimed at driving the use of analytics and open data in academia, industry and the public sector.

The Alliance plans to drive the use of analytics and open data throughout industry and government to “transform and optimise services, and create business opportunities,” and hopes big data can be used to improve public policy – everything from financial management to transportation optimisation – and create a large commercial ecosystem for new applications.

The group also wants to help foster more big data skills among the domestic workforce, and plans to work with major local universities to train more data and information scientists. Alliance stakeholders include National Taiwan University and National Taiwan University of Science, as well as firms like IBM, Far EasTone Telecommunications and Asus, though any data owners, analysts and domain experts are free to join the Alliance.

Taiwanese universities have been fairly active in partnering with large incumbents to help accelerate the use of big data services. Last year National Cheng Kung University (NCKU) in southern Taiwan signed a memorandum of understanding with Japanese technology provider Fujitsu under which the two organisations partnered to build out a big data analytics platform and nurture big data skills in academia.

Google boosts cloud-based big data services

Google is bolstering its big data services

Google announced a series of big data service updates to its cloud platform this week in a bid to strengthen its growing portfolio of data services.

The company announced the beta launch of Google Cloud Dataflow, a Java-based service that lets users build, deploy and run data processing pipelines for applications such as ETL, analytics, real-time computation and process orchestration, while abstracting away infrastructure concerns like cluster management.

The service is integrated with Google’s monitoring tools and the company said it’s built from the ground up for fault-tolerance.
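
To give a sense of the programming model, the sketch below builds a simple word-count pipeline with the Dataflow Java SDK; it mirrors the SDK's canonical example rather than any specific pipeline described here, and the bucket paths and runner settings are placeholders.

```java
// A minimal word-count pipeline using the Cloud Dataflow Java SDK of the time.
// The GCS paths are placeholders; pass --runner and --project flags as needed.
import com.google.cloud.dataflow.sdk.Pipeline;
import com.google.cloud.dataflow.sdk.io.TextIO;
import com.google.cloud.dataflow.sdk.options.PipelineOptionsFactory;
import com.google.cloud.dataflow.sdk.transforms.Count;
import com.google.cloud.dataflow.sdk.transforms.DoFn;
import com.google.cloud.dataflow.sdk.transforms.ParDo;
import com.google.cloud.dataflow.sdk.values.KV;

public class MinimalWordCount {
  public static void main(String[] args) {
    Pipeline p = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());

    p.apply(TextIO.Read.from("gs://example-bucket/input.txt"))      // read lines of text
     .apply(ParDo.of(new DoFn<String, String>() {                   // split lines into words
         @Override
         public void processElement(ProcessContext c) {
           for (String word : c.element().split("[^a-zA-Z']+")) {
             if (!word.isEmpty()) {
               c.output(word);
             }
           }
         }
       }))
     .apply(Count.<String>perElement())                             // count each distinct word
     .apply(ParDo.of(new DoFn<KV<String, Long>, String>() {         // format the counts
         @Override
         public void processElement(ProcessContext c) {
           c.output(c.element().getKey() + ": " + c.element().getValue());
         }
       }))
     .apply(TextIO.Write.to("gs://example-bucket/word-counts"));    // write results

    p.run();  // the service handles provisioning and teardown of workers
  }
}
```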

“We’ve been tackling challenging big data problems for more than a decade and are well aware of the difference that simple yet powerful data processing tools make. We have translated our experience from MapReduce, FlumeJava, and MillWheel into a single product, Google Cloud Dataflow,” the company explained in a recent blog post.

“It’s designed to reduce operational overhead and make programming and data analysis your only job, whether you’re a data scientist, data analyst or data-centric software developer. Along with other Google Cloud Platform big data services, Cloud Dataflow embodies the kind of highly productive and fully managed services designed to use big data, the cloud way.”

The company also added a number of features to BigQuery, Google’s cloud SQL analytics service, including row-level permissions for data protection, faster streaming ingestion (the limit was raised to 100,000 rows per second) and availability in Europe.
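
For readers wondering what that streaming ingestion path looks like from code, the hedged sketch below streams a single row into a BigQuery table via the Java API client; the project, dataset, table and column names are hypothetical placeholders.

```java
// Stream one row into a BigQuery table using the Java API client.
// Project, dataset, table and column names below are placeholders.
import com.google.api.client.googleapis.auth.oauth2.GoogleCredential;
import com.google.api.client.googleapis.javanet.GoogleNetHttpTransport;
import com.google.api.client.json.jackson2.JacksonFactory;
import com.google.api.services.bigquery.Bigquery;
import com.google.api.services.bigquery.BigqueryScopes;
import com.google.api.services.bigquery.model.TableDataInsertAllRequest;
import com.google.api.services.bigquery.model.TableDataInsertAllResponse;

import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class StreamRowExample {
  public static void main(String[] args) throws Exception {
    // Build an authenticated BigQuery client from application default credentials.
    GoogleCredential credential = GoogleCredential.getApplicationDefault()
        .createScoped(BigqueryScopes.all());
    Bigquery bigquery = new Bigquery.Builder(
        GoogleNetHttpTransport.newTrustedTransport(),
        JacksonFactory.getDefaultInstance(),
        credential)
        .setApplicationName("streaming-example")
        .build();

    // One row of event data; field names must match the destination table schema.
    Map<String, Object> row = new HashMap<>();
    row.put("user_id", "u-123");        // hypothetical column
    row.put("event", "page_view");      // hypothetical column

    TableDataInsertAllRequest request = new TableDataInsertAllRequest()
        .setRows(Collections.singletonList(
            new TableDataInsertAllRequest.Rows().setJson(row)));

    // Stream the row into my-project:my_dataset.events (placeholder identifiers).
    TableDataInsertAllResponse response = bigquery.tabledata()
        .insertAll("my-project", "my_dataset", "events", request)
        .execute();

    if (response.getInsertErrors() != null) {
      System.err.println("Insert errors: " + response.getInsertErrors());
    }
  }
}
```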

Google has largely focused its attention on other areas of the stack of late. The company has been driving its container scheduling and deployment project Kubernetes quite hard, as well as its hybrid cloud initiatives (Mirantis, VMware). It also recently introduced a log analysis service for Google Cloud Platform and App Engine users.

Pivotal punts Geode to ASF to consolidate leadership in open source big data

Pivotal is looking to position itself as a front runner in open source big data

Pivotal has proposed “Project Geode” for incubation by the Apache Software Foundation, which would focus on developing the Geode in-memory database technology – the technology at the core of Pivotal’s GemFire offering.

Geode can support ACID transactions for large-scale applications such as those used for stock trading, financial payments and ticket sales, and the company said the technology is already proven in customer deployments handling more than 10 million user transactions a day.
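
As a rough illustration of that ACID behaviour, the sketch below wraps two region updates in a single Geode client transaction so they commit or roll back together; package names follow the open source Apache Geode releases (the equivalent GemFire APIs use different prefixes), and the locator address, region name, keys and values are hypothetical.

```java
// Minimal Geode client sketch: two updates applied atomically in one transaction.
import org.apache.geode.cache.CacheTransactionManager;
import org.apache.geode.cache.Region;
import org.apache.geode.cache.client.ClientCache;
import org.apache.geode.cache.client.ClientCacheFactory;
import org.apache.geode.cache.client.ClientRegionShortcut;

public class GeodeTransactionSketch {
  public static void main(String[] args) {
    // Connect to a locator (placeholder host and port).
    ClientCache cache = new ClientCacheFactory()
        .addPoolLocator("localhost", 10334)
        .create();

    // A PROXY region keeps all data on the servers; the client holds no local copies.
    Region<String, Double> accounts = cache
        .<String, Double>createClientRegionFactory(ClientRegionShortcut.PROXY)
        .create("accounts");

    // Group two updates into a single atomic unit, e.g. moving funds between accounts.
    CacheTransactionManager tx = cache.getCacheTransactionManager();
    tx.begin();
    try {
      accounts.put("alice", 900.0);
      accounts.put("bob", 1100.0);
      tx.commit();                 // both writes become visible together
    } catch (RuntimeException e) {
      if (tx.exists()) {
        tx.rollback();             // neither write is applied on failure
      }
      throw e;
    }

    cache.close();
  }
}
```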

In February Pivotal announced it would open source much of its big data suite including GemFire, which the company will continue to support commercially. The move is part of a broader plan to consolidate its leadership in the open source big data ecosystem, where companies like Hortonworks are also trying to make waves.

The company also recently helped launch the Open Data Platform, which seeks to promote big data tech standardisation, and combat fragmentation around how Hadoop is deployed in enterprises and built upon by ISVs.

In the meantime, while the company said it would wait for the ASF’s decision, Pivotal has already put out a call to developers as it seeks early contributions to ensure the project gets a head start.

“The open sourcing of core components of products in the Pivotal Big Data Suite heralds a new era of how big data is done in the enterprise. Starting with core code in Pivotal GemFire, the components we intend to contribute to the open source community are already performing in the most hardened and demanding enterprise environments,” said Sundeep Madra, vice president, Data Product Group at Pivotal.

“Geode is an important part of building solutions for next generation data infrastructures and we welcome the community to join us in furthering Geode’s already compelling capabilities,” Madra said.

Hortonworks buys SequenceIQ to speed up cloud deployment of Hadoop

SequenceIQ will help boost Hortonworks’ position in the Hadoop ecosystem

Hortonworks has acquired SequenceIQ, a Hungary-based startup delivering infrastructure-agnostic tools to improve Hadoop deployments. The company said the move will bolster its ability to offer speedy cloud deployments of Hadoop.

SequenceIQ’s flagship offering, Cloudbreak, is a Hadoop as a Service API for multi-tenant clusters that applies some of the capabilities of Blueprint (which lets you create a Hadoop cluster without having to use the Ambari Cluster Install Wizard) and Periscope (autoscaling for Hadoop YARN) to help speed up deployment of Hadoop on different cloud infrastructures.

The two companies have partnered extensively in the Hadoop community, and Hortonworks said the move will enhance its position among a growing number of Hadoop incumbents.

“This acquisition enriches our leadership position by providing technology that automates the launching of elastic Hadoop clusters with policy-based auto-scaling on the major cloud infrastructure platforms including Microsoft Azure, Amazon Web Services, Google Cloud Platform, and OpenStack, as well as platforms that support Docker containers. Put simply, we now provide our customers and partners with both the broadest set of deployment choices for Hadoop and quickest and easiest automation steps,” Tim Hall, vice president of product management at Hortonworks, explained.

“As Hortonworks continues to expand globally, the SequenceIQ team further expands our European presence and firmly establishes an engineering beachhead in Budapest. We are thrilled to have them join the Hortonworks team.”

Hall said the company also plans to contribute the Cloudbreak code back to the Apache Software Foundation sometime this year, though whether it will do so as part of an existing project or as a standalone one has yet to be decided.

Hortonworks’ bread and butter is in supporting enterprise adoption of Hadoop and bringing the services component to the table, but it’s interesting to see the company commit to feeding the Cloudbreak code – which could, at least temporarily, give it a competitive edge – back into the ecosystem.

“This move is in line with our belief that the fastest path to innovation is through open source developed within an open community,” Hall explained.

The big data M&A space has seen more consolidation over the past few months, with Hitachi Data Systems acquiring big data and analytics specialist Pentaho and Infosys buying Panaya for $200m.

IBM goes after healthcare with acquisitions, Apple HealthKit partnership, new business unit

IBM is pushing hard to bring Watson to the healthcare sector

IBM announced a slew of moves aimed at strengthening its presence in the healthcare sector including two strategic acquisitions, a HealthKit-focused partnership with Apple, and the creation of a new Watson and cloud-centric healthcare business unit.

IBM announced it has reached an agreement to acquire Explorys, which deploys cognitive cloud-based analytics on datasets derived from numerous and diverse financial, operational and medical record systems, and Phytel, which provides cloud-based software that helps healthcare providers and care teams coordinate activities across medical facilities by automating certain aspects of patient care.

The company said the acquisitions would bolster IBM’s efforts to sell advanced analytics and cognitive computing to primary care providers, large hospital systems and physician networks.

“As healthcare providers, health plans and life sciences companies face a deluge of data, they need a secure, reliable and dynamic way to share that data for new insight to deliver quality, effective healthcare for the individual,” said Mike Rhodin, senior vice president, IBM Watson. “To address this opportunity, IBM is building a holistic platform to enable the aggregation and discovery of health data to share it with those who can make a difference.”

That ‘holistic platform’ is being developed by the recently announced Watson Health unit, which as the name suggests will put IBM’s cognitive compute cloud service Watson at the heart of a number of healthcare-focused cloud storage and analytics solutions. The unit has also developed the Watson Health Cloud platform, which allows the medical data it collects to be anonymized, shared and combined with a constantly-growing aggregated set of clinical, research and social health data.

“All this data can be overwhelming for providers and patients alike, but it also presents an unprecedented opportunity to transform the ways in which we manage our health,” said John E. Kelly III, IBM senior vice president, solutions portfolio and research. “We need better ways to tap into and analyze all of this information in real-time to benefit patients and to improve wellness globally.”

Lastly, IBM announced an expanded partnership with Apple that will see IBM offer its Watson Health Cloud platform as a storage and analytics service for HealthKit data aggregated from iOS devices, and open the platform up for health and fitness app developers as well as medical researchers.

Many of IBM’s core technologies that have since found their way into Watson (natural language processing and proprietary algorithms, for instance) are already in use by a number of pioneering medical facilities globally, so it makes sense for IBM to pitch its cognitive compute capabilities to the healthcare sector – particularly in the US, where facilities are legally incentivised to use new technologies to reduce the cost of patient care while keeping quality of service high. Commercial deals around Watson have so far been scarce, but it’s clear the company is keen to do what it can to create a market for cloud-based cognitive computing.

Twitter nixes firehose partnership with DataSift

Twitter is consolidating its grip on data analytics and resellers using its data in real-time

Twitter has suspended negotiations over the future use of the social media giant’s data with big data analytics provider DataSift, sparking concerns the firm plans to shut out others in the ecosystem of data analytics providers it enables.

In a recent blog post penned by DataSift’s chief executive and founder, Nick Halstead, the company aimed to reaffirm to customers that its business model “never relied on access to Twitter data” and that it is extending its reach into “business-owned data.”

But the company still attacked the social media giant for damaging the ecosystem it enables.

“Our goal has always been to provide a one-stop shop for our customers to access all the types of data from a variety of networks and be able to consume it in the most efficient way. Less noise, more actionable results. This is what truly matters to companies that deal with social data,” Halstead explained.

“The bottom line: Twitter has seriously damaged the ecosystem this week. 80% of our customers use technology that can’t be replaced by Twitter. At the end of the day, Twitter is providing data licensing, not processing data to enable analysis.”

“Twitter also demonstrated that it doesn’t understand the basic rules of this market: social networks make money from engagement and advertising. Revenue from data should be a secondary concern to distribution and it should occur only in a privacy-safe way. Better understanding of their audiences means more engagement and more ad spend from brands. More noise = less ad spend.”

DataSift was one of three data resellers that enjoyed privileged real-time access to Twitter’s data, with Gnip, which is now owned by Twitter, and NTT Data being the other two.

The move to strengthen its grip over the analytics ecosystem seems aimed at bolstering Gnip’s business. A similarly-timed post on Gnip’s blog by Twitter’s Zach Hofer-Shall more or less explained that the Gnip acquisition was a “first step” towards developing a more direct relationship with data customers, which suggests other firehose-related negotiations may sour in the coming months if they haven’t already (BCN reached out to NTT Data for comment).

Some have, reasonably, hit out at Twitter for effectively eating its own ecosystem and shutting down third-party innovation. For instance, Steven Willmott, chief executive of API services vendor 3Scale, said shutting down firehose access will result in niche verticals being underserved.

“While it makes sense at some level to want to be closer to the consumers of data (that’s valuable and laudable from a product perspective), removing other channels is an innovation bust. Twitter will no doubt do a great job on a range of use-cases but it’s severely damaging not to have a means to enable full firehose access for others. Twitter should really be expanding firehose access, not restricting it.”

Julien Genestoux, founder of data feed service provider Superfeedr, said the recent move to cut off firehose access is not very different from what Twitter did a couple of years ago when it started limiting third-party clients’ API access, and that Facebook often does much the same with partners it claims to give full data access to.

“The problem isn’t the company. The problem is the pattern. When using an API, developers are completely surrendering any kind of bargain power they have. There’s a reason we talk about slave and master in computer science. API’s are whips for web companies. This is the very tool they use to enforce a strong coupling and dependence to their platform,” he said.

While Twitter seems to be severely restricting the data reseller ecosystem it’s also redoubling its efforts to capture the hearts and minds of the enterprise developer, with coveted access to its data being placed front and centre. Twitter is working with IBM to make its data stream available to Big Blue’s clients, and in March this year IBM said it has over 100 pilots in place that see the company working with enterprises in a range of verticals to create cloud-based services integrating Twitter data and Watson analytics.

Close to half of manufacturers look to cloud for operational efficiency, survey reveals

Manufacturers are flocking to cloud services to reap operational benefits

About half of all large manufacturers globally are using or planning to use IT services based on public cloud platforms in a bid to drive operational efficiency, an IDC survey reveals.

A recently published IDC survey which polled 437 IT decision makers at large manufacturing firms globally suggests manufacturers are looking to cloud services primarily to simplify their operations.

A majority of manufacturers worldwide are currently using public (66 per cent) or private cloud (68 per cent) for more than two applications, and nearly 50 per cent of European manufacturers have adopted or intend to adopt ERP in the public cloud.

But only 30 to 35 per cent of respondents said operations, supply chain and logistics, sales, or engineering were likely to benefit from adoption.

“Manufacturers are in the midst of a digital transformation, in which 3rd platform technologies are absolutely essential to the way they do business and in the products and services they provide to their customers. Consequently, a strategic approach to adopting cloud is absolutely essential,” said Kimberly Knickle, research director, IDC Manufacturing Insights.

“Because of cloud’s tremendous value in making IT resources available to the business based on business terms – speed, cost, and accessibility – manufacturers must ensure that the line of business and IT management work together in defining their requirements,” Knickle said.

The firm said manufacturers are likely to opt for private cloud platforms in the near term in a bid to expand their IT estates to the cloud, but that capacity requirements will likely eventually shift those workloads onto larger public cloud platforms. A big driver for this will be the Internet of Things, with cloud a key component in allowing manufacturers to more easily make use of the data collected from sensors throughout manufacturing operations.

CIO Focus Interview: Isaac Sacolick, Greenwich Associates

For this CIO Focus Interview, I had the pleasure of interviewing Isaac Sacolick. Isaac is the Global CIO and a Managing Director at Greenwich Associates and is recognized as an industry-leading, innovative CIO. In 2013, he received TechTarget’s CIO award for Technology Advancement. For the past two years, he has been on the Huffington Post’s Top 100 Most Social CIOs list. I would highly recommend reading his blog, Social, Agile and Transformation, and following him on Twitter (@nyike).

Ben: Could you give me some background on your career?

Isaac: My career began in start-ups, and I have never lost that start-up DNA. My past few jobs have been taking the way start-ups work and applying that mentality and framework to traditional businesses that need to transform.

Ben: Could you give me some background on your company and your current role within the company?

Isaac: Greenwich is a provider of global market intelligence and advisory services to the financial services industry. I’m the CIO and am leading our Business Transformation Initiative. I’ve been focused on a couple of key areas in my role. These include creating agile practices and a core competency in software development, as well as building and standardizing our Business Intelligence platforms.

Ben: You recently started at Greenwich. As a CIO in this day and age, what are some of the challenges of starting a new role?

Isaac: When starting a new role, you’re constantly switching hats. You need your learning hat to be able to digest things that you know very little about. You need your listening hat to hear where a pain point or opportunity is so you can understand and apply your forces in the right places. It’s important to look for some quick wins while taking baby steps towards implementing changes and transformations you think are necessary. It’s like a clown picture with 7 or 8 different wheels spinning at the same time. I had to learn how our business operated and to work with the IT team to transition from one way of operating to another way of operating. An important piece is to learn the cultural dynamics of the company. That’s been my first three months here.

Ben: What projects have you been able to work on with all the chaos?

Isaac: I’ve instrumented some tangible results while getting situated. We now have an agile practice. It was one of those things that had been talked about in the past, but now we have four programs running with four different teams, each in different states of maturity. We’ve also changed our approach with our developers. They were operating in support mode and taking requests to address break fix things, etc. Now, we’ve put the brakes on some of the marginal work and have freed some of their time so some of them can be tech leads on agile projects. This has helped us make great progress on building new products. We’re a tech team focused on more strategic initiatives.

I’ve been doing similar work with DevOps by giving them an expanded view of support beyond the service desk and having them look at the areas of our organization that need support around applications. We’re trying to get into the mindset that we can respond to application requests as needed. We’ve gone from a help desk and infrastructure model to one that adds more focus on supporting applications.

Ben: Which areas of IT do you think are having the biggest impact on businesses?

Isaac: I would say self-service BI programs. If you roll the clock back 3-4 years, the tools for data analytics most organizations were using could be split into two camps. You either operated out of do-it-yourself tools like Microsoft Excel and Access, or you deployed an enterprise BI solution. The enterprise BI solution cost a lot of money and required extensive training. Over the last 3 years, there has been an emergence of tools that fit in that middle ground. Users can now do more analytics in a much more effective and productive fashion. The business becomes more self-serving, and this changes the role of the IT department in regards to how data is stored and interpreted. There is also a lot of governance and documentation involved that needs to be accounted for. These new self-service BI programs have taken a specialized skill set and made it much more democratic and scalable so that individual departments can look at data to see how they can do their jobs better.

Ben: What’s the area of IT that interests you the most?

Isaac: I would have to say the Internet of Things. The large volumes of data and the integration of the physical world and virtual world are fascinating. The Internet of Things has capabilities to really enrich our lives by simplifying things and giving us access to data that used to be difficult to capture in real time. Take wearables for example. The Apple Watch came out and then there will be many more things just like it. I’m really interested to see the form and functionality wearables take moving forward, as well as who will adopt them.

Ben: What sorts of predictions did you have coming into 2015?

Isaac: I actually wrote a blog post back in January with my 5 predictions for 2015. One was that big data investments may be the big bubble for some CIOs. To avoid overspending and underachieving on big data promises, CIOs are going to have to close the skills gap and champion analytics programs. Another was that Boards are likely to start requesting their CIOs to formally present security risks, options and a roadmap as companies become more active to address information security issues.

 

By Ben Stephenson, Emerging Media Specialist