Category Archives: Big Data

Aspera Drive Offers Sharing, Collaboration Platform For Big Data

Aspera, Inc. today announced the beta availability of Aspera Drive, its new unified sharing and collaboration platform for big data, combining complete desktop file-explorer integration with high performance and ease of use, transparent support for on-premise and cloud storage, and centralized security, management and access control.

The Aspera platform allows for transfer and synchronization of file sets of any size and any number, with maximum speed and robustness at any distance, and with the full access control, privacy and security of Aspera technology. Its architecture allows the platform to be deployed on-premise, in the cloud, or in a hybrid model.

Aspera Drive brings remote file browsing, transfer, synchronization, and package sending and receiving to the desktop, browser and mobile device. A backend architecture and API allow for fine-grained, centralized control over content access, security and bandwidth, regardless of content storage location – on-premise or in the cloud.

SAP Pilots Service to Unlock Value of Mobile Data

SAP today announced the SAP Consumer Insight 365 mobile service, a pilot initiative of a new cloud-based offering aiming to unlock the value of big data. The service will be powered by the SAP HANA platform and will allow enterprises to gain insight from the analysis of massive amounts of aggregated and anonymized consumer data residing in operator networks in real time. This market intelligence will ultimately allow brands to strengthen relationships with consumers through more targeted and context-specific marketing efforts.

“The rise of the always connected mobile world is creating a new source of data that has the potential to provide deeper insight into consumer behavior,” said Guy Rolfe, global mobile practice leader at Kantar Mobile. “The challenge is that this data is on a scale not seen before, therefore any service that can address this will create a new empirical data source that will not only complement existing research methodologies, but will also enable brands to better connect to consumers.”

There are more mobile devices in the world than there are people. The proliferation of mobile devices has significantly changed the way people communicate, live and engage with each other at work and in their personal lives. As more consumers get connected around the world through mobile devices, smartphones and the Internet, all of these interactions create massive amounts of data. The sheer volume and scale of this data has made analysis difficult.

With SAP Consumer Insight 365, data from operator networks will be analyzed with advanced analytics, providing population-level insight as well as high-definition detail through an intuitive Web portal, without drilling down into user-specific information. All mobile network operator data will be stored discretely and individually partitioned within a global network of SAP data centers.

Big Data Without Security = Big Risk

Guest Post by C.J. Radford, VP of Cloud for Vormetric

Big Data initiatives are heating up. From financial services and government to healthcare, retail and manufacturing, organizations across most verticals are investing in Big Data to improve the quality and speed of decision making as well as enable better planning, forecasting, marketing and customer service. It’s clear to virtually everyone that Big Data represents a tremendous opportunity for organizations to increase both their productivity and financial performance.

According to Wipro, the leading regions taking on Big Data implementations are North America, Europe and Asia. To date, organizations in North America have amassed over 3,500 petabytes (PBs) of Big Data, organizations in Europe over 2,000 PBs, and organizations in Asia over 800 PBs. And we are still in the early days of Big Data – last year was all about investigation and this year is about execution; given this, it’s widely expected that the global stockpile of data used for Big Data will continue to grow exponentially.

Despite all the goodness that can stem from Big Data, one has to consider the risks as well. Big Data confers enormous competitive advantage on organizations able to quickly analyze vast data sets and turn them into business value, yet it can also put sensitive data at risk of a breach or of violating privacy and compliance requirements. Big Data security is fast becoming a front-burner issue for organizations of all sizes. Why? Because Big Data without security = Big Risk.

The fact is, today’s cyber attacks are getting more sophisticated and attackers are changing their tactics in real time to get access to sensitive data in organizations around the globe. The barbarians have already breached your perimeter defenses and are inside the gates. For these advanced threat actors, Big Data represents an opportunity to steal an organization’s most sensitive business data, intellectual property and trade secrets for significant economic gain.

One approach used by these malicious actors to steal valuable data is by way of an Advanced Persistent Threat (APT). APTs are network attacks in which an unauthorized actor gains access to information by slipping in “under the radar” somehow. (Yes, legacy approaches like perimeter security are failing.) These attackers typically reside inside the firewall undetected for long periods of time (an average of 243 days, according to Mandiant’s most recent Threat Landscape Report), slowly gaining access to and stealing sensitive data.

Given that advanced attackers are already using APTs to target the most sensitive data within organizations, it’s only a matter of time before they start targeting Big Data implementations. Since data is the new currency, it simply makes sense for attackers to go after Big Data implementations – that’s where the big value is.

So, what does all this mean for today’s business and security professionals? It means that when implementing Big Data, they need to take a holistic approach and ensure the organization can benefit from the results of Big Data in a manner that doesn’t negatively affect its risk posture.

The best way to mitigate the risk of a Big Data breach is to reduce the attack surface and take a data-centric approach to securing Big Data implementations. These are the key steps:

Lock down sensitive data no matter the location.

The concept is simple: ensure your data is locked down regardless of whether it’s in your own data center or hosted in the cloud. This means you should use advanced file-level encryption for structured and unstructured data with integrated key management. If you’re relying upon a cloud service provider (CSP) and consuming Big Data as a service, it’s critical to ensure that your CSP is taking the necessary precautions to lock down sensitive data. If your cloud provider doesn’t have these capabilities in place, or considers data security your responsibility, ensure your encryption and key management solution is architecturally flexible enough to protect data both on-premise and in the cloud.
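
The idea can be illustrated in a few lines of Java. Below is a minimal, product-agnostic sketch of file-level encryption using the JDK’s built-in AES-GCM support – not Vormetric’s product – and it deliberately leaves out the key-management half of the story (in practice the data key would come from, and be rotated by, a key-management service rather than generated locally):

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.SecureRandom;

// Minimal sketch: authenticated file-level encryption with AES-GCM.
// In a real deployment the data key comes from a key-management service,
// is never stored beside the data, and is rotated on policy.
public class FileEncryptor {
    public static void main(String[] args) throws Exception {
        KeyGenerator keyGen = KeyGenerator.getInstance("AES");
        keyGen.init(256);                         // 256-bit data-encryption key
        SecretKey dataKey = keyGen.generateKey();

        byte[] iv = new byte[12];                 // 96-bit nonce, unique per file
        new SecureRandom().nextBytes(iv);

        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, dataKey, new GCMParameterSpec(128, iv));

        byte[] plaintext = Files.readAllBytes(Paths.get("sensitive.csv"));
        byte[] ciphertext = cipher.doFinal(plaintext);

        Files.write(Paths.get("sensitive.csv.enc"), ciphertext);
        Files.write(Paths.get("sensitive.csv.iv"), iv);  // the IV is not secret
    }
}
```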

Manage access through strong policies.

Access to Big Data should be granted only to those authorized end users and business processes that absolutely need to view it. If the data is particularly sensitive, it is a business imperative to have strong policies in place to tightly govern access. Fine-grained access control is essential, including the ability to block access even by IT system administrators (they may need to do things like back up the data, but they don’t need full access to that data as part of their jobs). Blocking access by IT system administrators becomes even more crucial when the data is located in the cloud and is not under an organization’s direct control.
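
As a rough illustration, here is what a deny-by-default, role-scoped policy check might look like in code. The principals and actions are illustrative assumptions, not any particular product’s policy model:

```java
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of a fine-grained, deny-by-default access policy.
// A system administrator may back up encrypted files (MAINTAIN) without
// ever being allowed to read the cleartext (DECRYPT).
public class AccessPolicy {
    enum Action { DECRYPT, MAINTAIN }

    private static final Map<String, Set<Action>> POLICY = Map.of(
        "analytics-service", EnumSet.of(Action.DECRYPT),
        "backup-admin",      EnumSet.of(Action.MAINTAIN));

    static boolean isAllowed(String principal, Action action) {
        // Unknown principals get no access at all.
        return POLICY.getOrDefault(principal, EnumSet.noneOf(Action.class))
                     .contains(action);
    }

    public static void main(String[] args) {
        System.out.println(isAllowed("backup-admin", Action.MAINTAIN)); // true
        System.out.println(isAllowed("backup-admin", Action.DECRYPT));  // false
    }
}
```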

Ensure ongoing visibility into user access to the data and IT processes.

Security intelligence is a “must have” when defending against APTs and other security threats. The intelligence gained can inform what actions to take to safeguard and protect what matters – an organization’s sensitive data. End-user and IT processes that access Big Data should be logged and reported to the organization on a regular basis. And this level of visibility must be maintained whether your Big Data implementation is within your own infrastructure or in the cloud.
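
As a simple illustration, emitting a structured audit record for every access attempt – allowed or denied – gives the security-intelligence layer something to baseline and alert on. The field names here are illustrative assumptions:

```java
import java.time.Instant;
import java.util.logging.Logger;

// Sketch: one structured audit record per data access, so anomaly
// detection can flag unusual patterns (e.g., an admin account that
// suddenly starts requesting decryption of sensitive files).
public class AuditLog {
    private static final Logger LOG = Logger.getLogger("audit");

    static void record(String principal, String resource,
                       String action, boolean allowed) {
        LOG.info(String.format(
            "ts=%s principal=%s resource=%s action=%s allowed=%b",
            Instant.now(), principal, resource, action, allowed));
    }

    public static void main(String[] args) {
        record("backup-admin", "hdfs://finance/q3.enc", "DECRYPT", false);
    }
}
```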

To effectively manage that risk, the bottom line is that you need to lock down your sensitive data, manage access to it through policy, and ensure ongoing visibility into both user and IT processes that access your sensitive data. Big Data is a tremendous opportunity for organizations like yours to reap big benefits, as long as you proactively manage the business risks.

You can follow C.J. Radford on Twitter @CJRad.

Real-Time Processing Solutions for Big Data Application Stacks – Integration of GigaSpaces XAP, Cassandra DB

Guest post by Yaron Parasol, Director of Product Management, GigaSpaces

GigaSpaces Technologies has developed infrastructure solutions for more than a decade and in recent years has been enabling Big Data solutions as well. The company’s latest platform release – XAP 9.5 – helps organizations that need to process Big Data fast. XAP harnesses the power of in-memory computing to enable enterprise applications to function better, whether in terms of speed, reliability, scalability or other business-critical requirements. The new version of XAP places increased focus on real-time processing of Big Data streams, through improved data grid performance, better manageability and end-user visibility, and integration with other parts of the Big Data stack – in this version, with Cassandra.

XAP-Cassandra Integration

To build a real-time Big Data application, you need to consider several factors.

First – can you process your Big Data in actual real time, in order to get instant, relevant business insights? Batch processing can take too long for transactional data. This doesn’t mean you abandon batch processing – you will still rely on it in many ways.

Second – can you preprocess and transform your data as it flows into the system, so that the relevant data is made digestible and routed to your batch processor, making batch more efficient as well? Finally, you also want to make sure the huge amounts of data you send to long-term storage are available for both batch processing and ad hoc querying, as needed.

XAP and Cassandra DB together can easily enable all of the above. With built-in event processing capabilities, full data consistency, and high-speed in-memory data access and local caching, XAP handles the real-time aspect with ease. Cassandra, meanwhile, is perfect for storing massive volumes of data, querying them ad hoc, and processing them offline.
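
To make that division of labor concrete, here is a sketch of an XAP event processor built with the polling-container annotations from the GigaSpaces documentation. The Trade type is an illustrative assumption, and Cassandra is assumed to be configured separately as the space’s long-term persistence layer:

```java
import org.openspaces.events.EventDriven;
import org.openspaces.events.EventTemplate;
import org.openspaces.events.adapter.SpaceDataEvent;
import org.openspaces.events.polling.Polling;

// Unprocessed trades written to the in-memory grid are matched by the
// template and handled in real time; the long-term copy lives in Cassandra.
@EventDriven
@Polling
public class TradeProcessor {

    @EventTemplate
    public Trade unprocessedTrade() {
        Trade template = new Trade();
        template.setProcessed(false);   // match only unprocessed trades
        return template;
    }

    @SpaceDataEvent
    public Trade process(Trade trade) {
        trade.setProcessed(true);       // the real-time processing step
        return trade;                   // written back to the grid
    }
}

// Minimal POJO for the sketch; XAP maps such objects to its document model.
class Trade {
    private String id;
    private Boolean processed;

    public String getId() { return id; }
    public void setId(String id) { this.id = id; }
    public Boolean getProcessed() { return processed; }
    public void setProcessed(Boolean processed) { this.processed = processed; }
}
```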

Several hurdles had to be overcome to make the integration truly seamless and easy for end users. XAP’s document-oriented model had to be mapped to Cassandra’s columnar data model, so that data can move between the two smoothly. And while XAP offers immediate consistency together with high performance, Cassandra trades off between performance and consistency; with Cassandra as the Big Data store behind XAP’s processing, both consistency and performance are maintained.

Together with the Cassandra integration, XAP offers further enhancements. These include:

Data Grid Enhancements

To further optimize your queries over the data grid, XAP now includes compound indices, which enable you to index multiple attributes. This way the grid scans one index instead of multiple indices to get query result candidates faster.

On the query side, new projections support enables you to query only for the attributes you’re interested in instead of whole objects/documents. Together, these optimizations dramatically reduce latency and increase the throughput of the data grid in common scenarios.
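
A hedged sketch of how the two features look in application code, using the annotation and query classes from the GigaSpaces documentation (exact package names may vary by XAP version):

```java
import com.gigaspaces.annotation.pojo.CompoundSpaceIndex;
import com.gigaspaces.annotation.pojo.CompoundSpaceIndexes;
import com.gigaspaces.annotation.pojo.SpaceClass;
import com.gigaspaces.annotation.pojo.SpaceId;
import com.j_spaces.core.client.SQLQuery;

// The compound index covers both attributes with a single index scan; the
// projection returns only the requested attribute instead of whole objects.
@SpaceClass
@CompoundSpaceIndexes({ @CompoundSpaceIndex(paths = { "lastName", "city" }) })
public class Customer {
    private String id;
    private String lastName;
    private String city;
    private Double lifetimeValue;

    @SpaceId
    public String getId() { return id; }
    public void setId(String id) { this.id = id; }
    public String getLastName() { return lastName; }
    public void setLastName(String lastName) { this.lastName = lastName; }
    public String getCity() { return city; }
    public void setCity(String city) { this.city = city; }
    public Double getLifetimeValue() { return lifetimeValue; }
    public void setLifetimeValue(Double v) { this.lifetimeValue = v; }

    // Query matched via the compound index; only lifetimeValue is fetched.
    public static SQLQuery<Customer> highValueQuery() {
        SQLQuery<Customer> query = new SQLQuery<Customer>(
            Customer.class, "lastName = ? AND city = ?");
        query.setParameters("Smith", "London");
        query.setProjections("lifetimeValue");
        return query;
    }
}
```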

The enhanced change API includes the ability to change multiple objects using a SQL query or POJO template. Replication of change operations over the WAN has also been streamlined, and it now replicates only the change commands instead of whole objects. Finally, a hook in the Space Data Persister interface enables you to optimize your DB SQL statements or ORM configuration for partial updates.
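
A brief sketch of the change API in use, with method names as given in the GigaSpaces documentation; the space proxy and the Account type are illustrative assumptions:

```java
import com.gigaspaces.client.ChangeSet;
import com.j_spaces.core.client.SQLQuery;
import org.openspaces.core.GigaSpace;

// Only the change commands (set, increment) travel over the WAN –
// not the whole matched objects.
public class ChangeApiExample {
    public static void flagDormantAccounts(GigaSpace gigaSpace) {
        SQLQuery<Account> dormant = new SQLQuery<Account>(
            Account.class, "daysSinceLogin >= 90");
        gigaSpace.change(dormant, new ChangeSet()
            .set("status", "DORMANT")        // update one field in place
            .increment("reviewCount", 1));   // atomic numeric increment
    }
}

// Minimal POJO for the sketch.
class Account {
    private String id;
    private Integer daysSinceLogin;
    private String status;
    private Integer reviewCount;

    public String getId() { return id; }
    public void setId(String id) { this.id = id; }
    public Integer getDaysSinceLogin() { return daysSinceLogin; }
    public void setDaysSinceLogin(Integer d) { this.daysSinceLogin = d; }
    public String getStatus() { return status; }
    public void setStatus(String status) { this.status = status; }
    public Integer getReviewCount() { return reviewCount; }
    public void setReviewCount(Integer r) { this.reviewCount = r; }
}
```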

Visibility and Manageability Enhancements

A new web UI gives XAP users deep visibility into important aspects of the data grid, including event containers, client-side caches, and multi-site replication gateways.

Managing a low-latency, high-throughput, distributed application is always a challenge due to the number of moving parts. The new enhanced UI helps users maintain agility when managing their application.

The result is a powerful platform that offers the best of all worlds, while maintaining ease of use and simplicity.

Yaron Parasol is Director of Product Management for GigaSpaces, a provider of end-to-end scaling solutions for distributed, mission-critical application environments, and cloud enabling technologies.

GigaSpaces Releases XAP 9.5: Enhanced for Cassandra Big Data Store, .NET Framework

GigaSpaces Technologies has released XAP 9.5, a new version of its in-memory computing platform that enables a quick launch of high-performance real-time analytics systems for Big Data.

At the core of the latest release of the GigaSpaces platform is XAP 9.5’s enhanced integration with NoSQL datastores, such as Cassandra. Combining the Cassandra datastore with the GigaSpaces in-memory computing platform adds real-time processing and immediate consistency to the application stack, while also guaranteeing dynamic scalability and transactionality – all necessary elements for enterprises that need real-time analytics or processing of streaming Big Data.

In this combined architecture, XAP’s in-memory computing provides the real-time data processing engine, interoperable with any language or application framework, while Cassandra provides long-term storage of the data used in real-time analytics.

A GigaSpaces benchmark of the XAP-Cassandra integration shows that it dramatically improves real-time performance for data retrieval operations: putting the GigaSpaces in-memory data grid in front of the Cassandra Big Data store resulted in read performance up to 2,000 times faster.
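
The general pattern behind numbers like these is a read-through cache: hot reads are answered at memory latency, and Cassandra is consulted only on a miss. A product-agnostic sketch, with a plain concurrent map standing in for the data grid:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Cache-aside/read-through sketch: repeated reads are served from memory;
// the slow Cassandra lookup runs only once per key.
public class ReadThroughCache<K, V> {
    private final Map<K, V> grid = new ConcurrentHashMap<>();
    private final Function<K, V> cassandraLookup;   // the slow path

    public ReadThroughCache(Function<K, V> cassandraLookup) {
        this.cassandraLookup = cassandraLookup;
    }

    public V get(K key) {
        // Load from Cassandra on first access, then serve from memory.
        return grid.computeIfAbsent(key, cassandraLookup);
    }
}
```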

Up until XAP 9.5, this integration was only available to XAP Java users. XAP 9.5 goes further by allowing .NET users to leverage the same built-in Cassandra integration, which provides a seamless bi-directional translation between Cassandra’s columnar data model and the richer document- and object-oriented models available in XAP. It works for both Java and .NET XAP deployments, allowing .NET developers to speed up their Cassandra-based Big Data applications.

“The GigaSpaces XAP Cassandra integration enables companies to enjoy both in-memory data grid capabilities and Big Data processing, easily and for any framework – Java or .NET,” says Uri Cohen, GigaSpaces VP of Product. “This enables companies to be more agile in meeting both current and future data processing challenges.”

Sailthru Gets $19 Million for Smart Data Digital Brand Communications

Sailthru, a technology company specializing in digital brand experiences and communications, has announced a $19 million Series B investment led by Benchmark. Sailthru will use the investment to accelerate the growth of the company by increasing both staff and infrastructure and expanding the use of Smart Data™ by Fortune 500 companies. Sailthru joins Benchmark’s portfolio, which currently includes other industry-leading companies such as Dropbox, Twitter, Instagram, Uber, Quora, Yelp and Zillow.

Sailthru’s Smart Data is leading a major shift in how companies engage with their customers, through the automatic analysis of big data sets to generate informed, personalized communications across all digital channels. Unlike Big Data, which merely exists in a passive state and can often be overwhelming, Smart Data powers decisions. Sailthru’s Smart Data allows businesses to understand, predict, and engage each consumer on an individual level in real time. Sailthru’s clients are improving their ROI and customer time on-site, and are seeing strong increases in customer lifetime value from their adoption of Smart Data.

What’s the BIG Deal? DATA On the Origin of the Term

NY Times BITS (Steve Lohr) today: An interesting “detective story” seeking the coiner of the phrase “Big Data”.

The unruly digital data of the Web is a big ingredient in what is now being called “Big Data.” And as it turns out, the term Big Data seems to be most accurately traced not to references in news or journal archives, but to digital artifacts now posted on technical Web sites, appropriately enough.

Elasticsearch, Trifork Partner to Expand Big Data Search

Elasticsearch today announced a partnership with Trifork, a provider of open source solutions, consulting and training for the enterprise market, headquartered in Denmark. Under the partnership, Trifork will include Elasticsearch software within its data and search product portfolio, and offer its global customer base high-end consulting, implementation, training and support for their Elasticsearch-driven big data search projects. Elasticsearch is one of the most popular open source search products in the world and is used in production by thousands of companies that require real-time data accessibility and transparency across large volumes of distributed data.

“Trifork is an ideal partner for Elasticsearch. Together, we help customers gain invaluable insights hidden in ever-expanding data sets in key markets such as finance, healthcare and telecommunications,” said Steven Schuurman, co-founder and CEO of Elasticsearch. “Their deep experience in developing and supporting business-critical search solutions for their customers is firmly in line with our mission, and will help us to meet the overwhelming demand for Elasticsearch technology.”

A number of Trifork customers already use Elasticsearch, including the University of Amsterdam, Greetz, NPO/VPRO, Suppledia, DiVault and PRIME Research. Visit Trifork’s website to learn more about its open source software offerings, including information about its Elasticsearch services.

“We have worked extensively with other open source search solutions, but Elasticsearch has taken real-time information discovery completely to the next level – the combination of its advanced search and analytics capabilities and its user-friendliness makes it the most powerful open source search solution out there,” said Bram Smeets, CTO of Trifork Amsterdam. “We believe in Elasticsearch – the team and the technology – and feel both clearly differentiate it in a highly competitive market. This makes Elasticsearch the logical choice both for our own internal use and on customer projects.”

Garantia Data Offers First Redis Hosting on Azure

Garantia Data, a provider of in-memory NoSQL cloud services, today announced the availability of its Redis Cloud and Memcached Cloud database hosting services on the Windows Azure cloud platform. Garantia Data’s services will provide thousands of developers who run their applications on Windows Azure with virtually infinite scalability, high availability, high-performance and zero-management in just one click.

Garantia is currently offering its Redis Cloud and Memcached Cloud services free of charge to early adopters in the US-East and US-West Azure regions.

Used by both enterprise developers and cutting-edge start-ups, Redis and Memcached are open source, RAM-based, key-value memory stores that provide significant value in a wide range of important use cases. Garantia Data’s Redis Cloud and Memcached Cloud are reliable and fully-automated services for running Redis and Memcached on the cloud – essentially freeing developers from dealing with nodes, clusters, scaling, data-persistence configuration and failure recovery.

“We are happy to be the first to offer the community a Redis architecture on Windows Azure,” said Ofer Bengal, CEO of Garantia Data. “We have seen great demand among .NET and Windows users for scalable, highly available and fully-automated services for Redis and Memcached. Our Redis Cloud and Memcached Cloud provide exactly the sort of functionality they need.”

“We’re very excited to welcome Garantia Data to the Windows Azure ecosystem,” said Rob Craft, Senior Director Cloud Strategy at Microsoft. “Services such as Redis Cloud and Memcached Cloud give customers the production, workload-ready services they can use today to solve real business problems on Windows Azure.”

Redis Cloud scales seamlessly and infinitely, so a Redis dataset can grow to any size while supporting all Redis commands. Memcached Cloud adds a storage engine and full replication capabilities to standard Memcached. Both provide true high availability, including instant failover with no human intervention. In addition, they run a dataset on multiple CPUs and use advanced techniques to maximize performance for any dataset size.
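
From application code, a managed service like this looks like any ordinary Redis server. A minimal sketch using the Jedis client – the hostname, port and password below are placeholder assumptions, not real Garantia Data values:

```java
import redis.clients.jedis.Jedis;

// Connect to a hosted Redis endpoint and run standard Redis commands;
// credentials would come from the service's management dashboard.
public class RedisCloudExample {
    public static void main(String[] args) {
        try (Jedis jedis = new Jedis("redis-12345.us-east.example.com", 12345)) {
            jedis.auth("your-password");
            jedis.set("page:views:home", "0");
            jedis.incr("page:views:home");
            System.out.println(jedis.get("page:views:home")); // -> "1"
        }
    }
}
```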