Category Archives: Data

MapR claims world’s first converged data platform with Streams

Apache Hadoop specialist MapR Technologies claims it has invented a new system to make sense of all the disjointed streams of real-time information flooding into big data platforms. The new MapR Streams system will, it says, blend everything from system logs to sensor feeds to social media, whether transactional or tracking data, and manage it all under one converged platform.

Streams is described as a stream-processing tool built for real-time event handling at high scale. Combined with other MapR offerings, it can harmonise existing storage and NoSQL tools into a converged data platform which, the company says, is the first of its kind in the cloud industry.

Starting in early 2016, when the technology becomes available, cloud operators will be able to combine Streams with MapR-FS for storage and the MapR-DB in-Hadoop NoSQL database to build the MapR Converged Data Platform. This will free users from having to separately monitor information from streams, file storage, databases and analytics, the vendor says.

Since it can handle billions of messages per second and join clusters from separate datacentres across the globe, the tool could be of particular interest to cloud operators, according to Michael Brown, CTO at comScore. “Our system analyses over 65 billion new events a day, and MapR Streams is built to ingest and process these events in real time, opening the doors to a new level of product offerings for our customers,” he said.

While traditional workloads are being optimised, new workloads from emerging IoT dataflows present far greater challenges that must be solved in a fraction of the time, MapR claims. MapR Streams will help companies deal with the volume, variety and speed at which data has to be analysed, while simplifying the multiple layers of hardware, networking and data-processing systems. Blending Streams into a converged data system eliminates separate silos of data for streaming, analytics and traditional systems of record, the company added.

MapR Streams supports standard application programming interfaces (APIs) and integrates with other popular stream processors such as Spark Streaming, Storm, Flink and Apex. When available, the MapR Converged Data Platform will be offered as a free-to-use Community Edition to encourage developers to experiment.
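As a rough illustration of what publishing to Streams might look like, here is a minimal sketch using the standard Kafka producer API, which MapR has positioned Streams as compatible with; the stream path, topic name and sensor payload are hypothetical, and the exact client configuration depends on MapR's own client libraries.

import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class SensorEventProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Plain string key/value serialisers are enough for this sketch.
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        // Against a stock Apache Kafka cluster you would point this at the
        // brokers; MapR's client libraries resolve the cluster from the
        // stream path instead.
        props.put("bootstrap.servers", "localhost:9092");

        // Hypothetical topic: "/streams/iot" would be a stream created in
        // MapR-FS, "sensor-readings" a topic inside it. Consumers such as
        // Spark Streaming, Storm, Flink or Apex subscribe to the same name.
        String topic = "/streams/iot:sensor-readings";

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>(topic, "sensor-42",
                    "{\"temperature\": 21.5, \"ts\": 1449100800}"));
            producer.flush();
        }
    }
}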

Veritas warns of ‘databerg’ hidden dangers

Backup specialist Veritas Technologies claims European businesses waste billions of euros on huge stores of useless information, which are growing every year. By 2020, it claims, the damage caused by this excessive data will cost over half a trillion pounds (£576bn) a year.

According to the Veritas Databerg Report 2015, 59% of data stored and processed by UK organisations is invisible and could contain hidden dangers. From this it estimates that the average mid-sized UK organisation holding 1,000 terabytes of information spends £435k annually on Redundant, Obsolete or Trivial (ROT) data. By its estimate, just 12% of the cost of data storage is justifiably spent on business-critical intelligence.
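To make those percentages concrete, the sketch below applies them to a hypothetical annual storage budget. The 12% business-critical share and the £435k ROT figure come from the report as quoted above; the budget itself is an illustrative assumption, not a figure from Veritas.

public class DatabergCostSketch {
    public static void main(String[] args) {
        // Hypothetical annual storage spend for a mid-sized organisation (GBP).
        double annualStorageSpend = 1_000_000.0;

        // Figures quoted from the Veritas Databerg Report 2015.
        double businessCriticalShare = 0.12;        // share of spend that is justified
        double reportedRotCostAt1000Tb = 435_000.0; // annual ROT cost at 1,000 TB

        double justifiedSpend = annualStorageSpend * businessCriticalShare;
        double unjustifiedSpend = annualStorageSpend - justifiedSpend;

        System.out.printf("Justified, business-critical spend: £%,.0f%n", justifiedSpend);
        System.out.printf("Spend not tied to business value:   £%,.0f%n", unjustifiedSpend);
        System.out.printf("Reported ROT cost at 1,000 TB:      £%,.0f%n", reportedRotCostAt1000Tb);
    }
}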

The report blames employees and management for the waste. The first group treats corporate IT systems as their own personal infrastructure, while management are too reliant on cloud storage, which leaves them open to compliance violations and a higher risk of data loss.

The survey identified three major causes of Databerg growth: data volumes, vendor hype and the values of modern users. These root causes lead to IT strategies based on data volumes rather than business value; vendor hype, in turn, has convinced users to become increasingly reliant on free storage in the cloud, and this consumerisation has led to a growing disregard for corporate data policies, according to the report’s authors.

As a result, big data and cloud computing could lead corporations to hit the databerg and incur massive losses. They could also sink under prosecution for compliance failings, according to the key findings of the report.

It is time to stop the waste, said Matthew Ellard, senior VP for EMEA at Veritas. “Companies invest a significant amount of resources to maintain data that is totally redundant, obsolete and trivial.” This ROT costs a typical mid-sized UK company, which can expect to hold 500 terabytes of data, nearly a million pounds a year on photos, personal ID documents, music and videos.

The study was based on a survey answered by 1,475 respondents in 14 countries, including 200 in the UK.

Cloud industry shaken by European Safe Harbour ruling

The Court of Justice of the European Union has ruled that the Safe Harbour agreement between Europe and the US, which provided blanket permission for data transfers between the two, is invalid.

Companies looking to move data from Europe to the US will now need to negotiate specific rules of engagement with each country, which is likely to have a significant impact on all businesses, but especially those heavily reliant on the cloud.

The ruling came about after Austrian privacy campaigner Max Schrems asked to find out what data Facebook was passing on to US intelligence agencies in the wake of the Snowden revelations. When his request was declined on the grounds that the Safe Harbour agreement guaranteed his protection, he contested the decision and the case was referred to the Court of Justice.

This decision had been anticipated, and on top of any legal contingencies already made, large players such as Facebook, Google and Amazon are afforded some protection by the fact that they have datacentres within Europe. However, the legal and logistical strain will be felt by all, especially smaller companies that rely on US-based cloud players.

“The ability to transfer data easily and securely between Europe and the US is critical for businesses in our modern data-driven digital economy,” said Matthew Fell, CBI Director for Competitive Markets. “Businesses will want to see clarity on the immediate implications of the ECJ’s decision, together with fast action from the Commission to agree a new framework. Getting this right will be important to the future of Europe’s digital agenda, as well as doing business with our largest trading partner.”

“The ruling invalidating Safe Harbour is seismic,” said Andy Hardy, EMEA MD at Code42, which recently secured $85 million in Series B funding. “This decision will affect big businesses as well as small ones. But it need not be the end of business as we know it, in terms of data handling. What businesses need to do now is safeguard data. They need to find solutions that keep their, and their customers’, data private – even when backed up into public cloud.”

“Symantec respects the decision of the EU Court of Justice,” said Ilias Chantzos, Senior Director of Government Affairs EMEA at Symantec. “However, we encourage further discussion in order to create a strengthened agreement with the safeguards expected by the EU Court of Justice. We believe that the recent ruling will create considerable disruption and uncertainty for those companies that have relied solely on Safe Harbour as a means of transferring data to the United States.”

“The issues are highly complex, and there are real tensions between the need for international trade, and ensuring European citizen data is treated safely and in accordance with data protection law,” said Nicky Stewart, commercial director of Skyscape Cloud Services. “We would urge potential cloud consumers not to use this ruling as a reason not to adopt cloud. There are very many European cloud providers which operate solely within the bounds of the European Union, or even within a single jurisdiction within Europe, therefore the complex challenges of the Safe Harbor agreement simply don’t apply.”

These were just some of the views offered to BCN as soon as the ruling was announced, and the public hand-wringing is likely to continue for some time. From a business cloud perspective, one man’s problem is another’s opportunity, and companies will be queuing up to offer localised cloud services, encryption solutions and the like. In announcing a couple of new European datacentres today, NetSuite was already making reference to the ruling. This seems like a positive step for privacy, but only time will tell what it means for the cloud industry.
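One practical reading of the advice to keep data private even when it is backed up into public cloud is client-side encryption, where data is encrypted before it leaves the organisation and the provider only ever holds ciphertext. Below is a minimal sketch using the JDK’s built-in AES-GCM support; the key handling is deliberately simplified, and in practice keys would sit in a key-management system under the customer’s own control.

import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;

import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

public class EncryptBeforeUpload {
    public static void main(String[] args) throws Exception {
        // Generate a data-encryption key. In a real deployment this key stays
        // with the customer (e.g. in an HSM or key-management service), never
        // with the cloud provider.
        KeyGenerator keyGen = KeyGenerator.getInstance("AES");
        keyGen.init(256);
        SecretKey key = keyGen.generateKey();

        byte[] plaintext = "customer record to back up".getBytes(StandardCharsets.UTF_8);

        // AES-GCM needs a fresh 12-byte nonce for every encryption.
        byte[] nonce = new byte[12];
        new SecureRandom().nextBytes(nonce);

        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(128, nonce));
        byte[] ciphertext = cipher.doFinal(plaintext);

        // Only the ciphertext and nonce are uploaded; the provider never sees
        // the key or the plaintext.
        System.out.println("Encrypted " + plaintext.length + " bytes into "
                + ciphertext.length + " bytes for upload");
    }
}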

Twitter nixes firehose partnership with DataSift

Twitter is consolidating its grip on data analytics and resellers using its data in real-time


Twitter has suspended negotiations over the future use of the social media giant’s data with big data analytics provider DataSift, sparking concerns the firm plans to shut out others in the ecosystem of data analytics providers it enables.

In a recent blog post penned by DataSift’s chief executive and founder, Nick Halstead, the company aimed to reassure customers that its business model “never relied on access to Twitter data” and that it is extending its reach into “business-owned data.”

But the company still criticised the social media giant for damaging the ecosystem it enables.

“Our goal has always been to provide a one-stop shop for our customers to access all the types of data from a variety of networks and be able to consume it in the most efficient way. Less noise, more actionable results. This is what truly matters to companies that deal with social data,” Halstead explained.

“The bottom line: Twitter has seriously damaged the ecosystem this week. 80% of our customers use technology that can’t be replaced by Twitter. At the end of the day, Twitter is providing data licensing, not processing data to enable analysis.”

“Twitter also demonstrated that it doesn’t understand the basic rules of this market: social networks make money from engagement and advertising. Revenue from data should be a secondary concern to distribution and it should occur only in a privacy-safe way. Better understanding of their audiences means more engagement and more ad spend from brands. More noise = less ad spend.”

DataSift was one of three data resellers that enjoyed privileged real-time access to Twitter’s data – Gnip, which is now owned by Twitter, and NTT Data being the other two.

The move to strengthen its grip over the analysis ecosystem seems aimed at bolstering Gnip’s business. A similarly timed post on Gnip’s blog by Zach Hofer-Shall, head of Twitter’s ecosystem, more or less explained that the Gnip acquisition was a “first step” towards developing a more direct relationship with data customers, which suggests other firehose-related negotiations may sour in the coming months if they haven’t already (BCN reached out to NTT Data for comment).

Some have, reasonably, hit out at Twitter for effectively eating its own ecosystem and shutting down third-party innovation. Steven Willmott, chief executive of API services vendor 3Scale, for instance, said shutting down firehose access will leave niche verticals underserved.

“While it makes sense at some level to want to be closer to the consumers of data (that’s valuable and laudable from a product perspective), removing other channels is an innovation bust. Twitter will no doubt do a great job on a range of use-cases but it’s severely damaging not to have a means to enable full firehose access for others. Twitter should really be expanding firehose access, not restricting it.”

Julien Genestoux, founder of data feed service provider Superfeedr, said the recent move to cut off firehose access is not very different from what Twitter did a couple of years ago when it started limiting third-party clients’ API access, and that Facebook often does much the same with partners it claims to give full data access to.

“The problem isn’t the company. The problem is the pattern. When using an API, developers are completely surrendering any kind of bargaining power they have. There’s a reason we talk about slave and master in computer science. APIs are whips for web companies. This is the very tool they use to enforce a strong coupling and dependence to their platform,” he said.
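One common way to blunt the coupling Genestoux describes is to put a thin abstraction between application code and any single provider’s API, so a data supplier can be swapped without rewriting the analytics built on top. The sketch below illustrates the pattern only; the interface and class names are hypothetical and do not reflect any vendor’s actual API.

import java.util.List;

/** What the application actually needs from any social-data source. */
interface SocialStreamSource {
    List<String> fetchRecentPosts(String query);
}

/** One concrete provider behind the abstraction; the call is stubbed out. */
class FirehoseSource implements SocialStreamSource {
    @Override
    public List<String> fetchRecentPosts(String query) {
        // Only this class would know the vendor's endpoints, auth scheme and
        // rate limits; swapping vendors means swapping this class alone.
        return List.of("post matching " + query);
    }
}

/** Application code depends on the interface, not on any single vendor. */
class EngagementAnalyzer {
    private final SocialStreamSource source;

    EngagementAnalyzer(SocialStreamSource source) {
        this.source = source;
    }

    long countMentions(String brand) {
        return source.fetchRecentPosts(brand).size();
    }
}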

While Twitter seems to be severely restricting the data reseller ecosystem, it is also redoubling its efforts to capture the hearts and minds of the enterprise developer, with coveted access to its data placed front and centre. Twitter is working with IBM to make its data stream available to Big Blue’s clients, and in March this year IBM said it had over 100 pilots in place that see the company working with enterprises in a range of verticals to create cloud-based services integrating Twitter data and Watson analytics.

Storage Has Evolved – It Now Provides the Context & Management of Data

 

Information infrastructure takes storage, a fundamental part of any datacentre, and puts context around it, adding value to what has typically been treated as a commodity.

Bits in and of themselves have little value. Add context to them and assign value to that information, and they become an information infrastructure. Organisations should seek to add value to their datacentre environments by leveraging advanced technologies that have become part of the landscape, including software-defined storage, solid-state storage and cloud-based storage. Essentially, there is a new way to deliver a datacentre application data infrastructure.
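In code terms, adding context to bits can be as simple as carrying business metadata alongside the raw payload so that policy engines can tier, protect or expire an object based on its value. The sketch below is purely illustrative; the field names and tags are assumptions rather than any particular product’s schema.

import java.util.Map;

/** Raw bytes plus the business context that turns them into information. */
record StoredObject(String id, byte[] payload, Map<String, String> context) {}

public class ContextDemo {
    public static void main(String[] args) {
        byte[] bits = new byte[] {0x1, 0x2, 0x3}; // the "commodity" part

        // The attached context is what software-defined storage policies act
        // on: who owns the data, how long it must be kept, where it should live.
        StoredObject invoiceScan = new StoredObject(
                "obj-0001",
                bits,
                Map.of("owner", "finance",
                       "retention", "7-years",
                       "tier", "solid-state"));

        System.out.println("Stored " + invoiceScan.payload().length
                + " bytes with context " + invoiceScan.context());
    }
}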

Storage has evolved

 

http://www.youtube.com/watch?v=yzbwG0g-Y7c


By Randy Weis, Practice Manager – Information Infrastructure

The Fracturing of the Enterprise Brain

Never mind BYOD (bring your own device): employee use of non-corporate online storage services could weaken an enterprise’s ability to access its own data and intellectual property. In the worst-case scenario, companies could lose information forever.

A post by Brian Proffitt at ReadWrite Enterprise explains:

Employees are the keepers of knowledge within a company. Want to run the monthly payroll? The 20-year-veteran in accounting knows how to manage that. Building the new company logo? The superstar designer down in the art department is your gal. When such employees leave the company, it can be a bumpy transition, but usually not impossible, because the data they’ve been using lies on the corporate file server and can be used to piece together the work that’s been done.

Of course, that’s based on the premise that, for the past couple of decades or so, data has essentially been stored in one of two places: on the file servers or the employee’s local computer.

Today, though, people store data in a variety of places, not all of it under the direct control of IT. Gmail, Dropbox, Google Drive or a company’s cloud on Amazon Web Services…

Read the article.