Category Archives: Big Data

Actuate, Hortonworks Collaborate to Visualize Big Data

Actuate Corporation, the people behind BIRT and an open source Business Intelligence (BI) vendor, today announced a collaboration between Actuate BIRT and the Hortonworks Data Platform, to enable Big Data visualization. The Hortonworks Data Platform is a completely open source, tightly integrated and tested distribution of Apache Hadoop, backed by extensive customer support and training.

The ActuateOne integrated product suite, built around BIRT, uses native Hive query access to run MapReduce functions that extract data from Hadoop, pulling those data sets into customizable BIRT-based dashboards and scorecards for interactive visualization and analysis.

“We have dedicated significant resources to make Apache Hadoop more robust and easier to integrate, extend, deploy and use,” said John Kreisa, VP of Marketing at Hortonworks. “Our partnership with open source BI leader Actuate enables more users to cost effectively analyze vast amounts of data stored in Hadoop using open source technologies.”

“Actuate’s collaboration with Hortonworks will ease the transition from Big Data hype to Big Data usefulness,” said Nobby Akiha, Senior Vice President of Marketing at Actuate. “We believe the key to success with Big Data lies in building the right infrastructure to manage it. Teaming with Hortonworks will further our goal of helping organizations figure out how best to leverage and integrate Big Data sources to enable better decision making.”

Large organizations are increasingly turning to Apache Hadoop for the storage and management of massive amounts of data and thus need scalable ways to explore, analyze and visualize the insights stored within it. The combination of the Hortonworks Data Platform’s distributed processing of Hadoop data sources of any size, with Actuate’s scalable infrastructure and intuitive data visualization capabilities, enables organizations to more effectively operationalize Big Data for thousands of customers, partners and employees.


Lucid Imagination Combines Search, Analytics and Big Data to Tackle the Problem of Dark Data

Organizations today have little to no idea how much lost opportunity is hidden in the vast amounts of data they’ve collected and stored.  They have entered the age of total data overload driven by the sheer amount of unstructured information, also called “dark” data, which is contained in their stored audio files, text messages, e-mail repositories, log files, transaction applications, and various other content stores.  And this dark data is continuing to grow, far outpacing the ability of the organization to track, manage and make sense of it.

Lucid Imagination, a developer of search, discovery and analytics software based on Apache Lucene and Apache Solr technology, today unveiled LucidWorks Big Data. LucidWorks Big Data is the industry’s first fully integrated development stack that combines the power of multiple open source projects including Hadoop, Mahout, R and Lucene/Solr to provide search, machine learning, recommendation engines and analytics for structured and unstructured content in one complete solution available in the cloud.

With LucidWorks Big Data, Lucid Imagination equips technologists and business users with the ability to initially pilot Big Data projects utilizing technologies such as Apache Lucene/Solr, Mahout and Hadoop, in a cloud sandbox. Once satisfied, the project can remain in the cloud, be moved on premise or executed within a hybrid configuration.  This means they can avoid the staggering overhead costs and long lead times associated with infrastructure and application development lifecycles prior to placing their Big Data solution into production.

The product is now available in beta. To sign up for inclusion in the beta program, visit http://www.lucidimagination.com/products/lucidworks-search-platform/lucidworks-big-data.

How big is the problem of dark data? The total amount of digital data in the world will reach 2.7 zettabytes in 2012, a 48 percent increase from 2011.* Ninety percent of this data will be unstructured or “dark” data. Worldwide, 7.5 quintillion bytes of data, enough to fill over 100,000 Libraries of Congress, are generated every day. Yet that volume of data can help predict the weather, uncover consumer buying patterns or even ease traffic problems – if discovered and analyzed proactively.
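To put those figures in perspective, the short Python sketch below works through the arithmetic they imply. It uses only the numbers quoted in the paragraph above. The function names are illustrative, decimal units are assumed (1 terabyte = 10^12 bytes), and "over 100,000" is treated as exactly 100,000, so the per-library figure is a rough upper bound and not a number stated in the announcement.

```python
# Figures quoted in the paragraph above.
TOTAL_2012_ZB = 2.7              # worldwide digital data projected for 2012, in zettabytes
GROWTH_FROM_2011 = 0.48          # 48 percent increase over 2011
UNSTRUCTURED_SHARE = 0.90        # share described as unstructured ("dark") data
DAILY_BYTES = 7.5e18             # 7.5 quintillion bytes generated per day
LIBRARIES_OF_CONGRESS = 100_000  # assumption: "over 100,000" taken as exactly 100,000


def implied_2011_total_zb(total_2012_zb, growth):
    """Back out the 2011 total from the 2012 total and the growth rate."""
    return total_2012_zb / (1 + growth)


def dark_data_zb(total_zb, share):
    """Portion of the total that is unstructured, in zettabytes."""
    return total_zb * share


def implied_library_size_tb(daily_bytes, libraries):
    """Size of one 'Library of Congress' implied by the comparison, in terabytes.

    Assumes decimal units: 1 TB = 1e12 bytes.
    """
    return daily_bytes / libraries / 1e12


if __name__ == "__main__":
    print(f"Implied 2011 total: {implied_2011_total_zb(TOTAL_2012_ZB, GROWTH_FROM_2011):.2f} ZB")
    print(f"Dark data in 2012:  {dark_data_zb(TOTAL_2012_ZB, UNSTRUCTURED_SHARE):.2f} ZB")
    print(f"Implied library size: {implied_library_size_tb(DAILY_BYTES, LIBRARIES_OF_CONGRESS):.0f} TB")
```

Under these assumptions, the 2011 total works out to roughly 1.8 zettabytes, the dark-data portion for 2012 to about 2.4 zettabytes, and each "Library of Congress" in the comparison to at most about 75 terabytes.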

“We see a strong opportunity for search to play a key role in the future of data management and analytics,” said Matthew Aslett, research manager, data management and analytics, 451 Research. “Lucid’s Big Data offering, and its combination of large-scale data storage in Hadoop with Lucene/Solr-based indexing and machine-learning capabilities, provides a platform for developing new applications to tackle emerging data management challenges.”

Data analytics has traditionally been the domain of business intelligence technologies. Most of these tools, however, have been designed to handle structured data, such as that held in SQL databases, and cannot easily tap into the broad range of data types that can be used in a Big Data application. With the announcement of LucidWorks Big Data, organizations will be able to utilize a single platform for their Big Data search, discovery and analytics needs. LucidWorks Big Data is the only complete platform that:

  • Combines the real time, ad hoc data accessibility of LucidWorks (Lucene/Solr) with compute and storage capabilities of Hadoop
  • Delivers commonly used analytic capabilities along with Mahout’s proven, scalable machine learning algorithms for deeper insight into both content and users
  • Tackles data, both big and small, with ease, seamlessly scaling while minimizing the impact of provisioning Hadoop, LucidWorks and other components
  • Supplies a single, coherent, secure and well documented REST API for both application integration and administration
  • Offers fault tolerance with data safety baked in
  • Provides choice and flexibility, via on premise, cloud hosted or hybrid deployment solutions
  • Is tested, integrated and fully supported by the world’s leading experts in open source search
  • Includes powerful tools for configuration, deployment, content acquisition, security, and search experience that are packaged in a convenient, well-organized application

Lucid Imagination’s Open Search Platform uncovers real-time insights from any enterprise data, whether structured in databases, unstructured in formats such as emails or social channels, or semi-structured from sources such as websites.  The company’s rich portfolio of enterprise-grade solutions is based on the same proven open source Apache Lucene/Solr technology that powers many of the world’s largest e-commerce sites. Lucid Imagination’s on-premise and cloud platforms are quicker to deploy, cost less than competing products and are more easily tailored to specific needs than business intelligence solutions because they leverage innovation from the open source community.

“We’re allowing a broad set of enterprises to test and implement data discovery and analysis projects that have historically been the province of large multinationals with large data centers. Cloud computing and LucidWorks Big Data finally level the field,” said Paul Doscher, CEO of Lucid Imagination. “Large companies, meanwhile, can use our Big Data stack to reduce the time and cost associated with evaluating and ultimately implementing big data search, discovery and analysis. It’s their data – now they can actually benefit from it.”


Teradata Acquires eCircle

Teradata and Aprimo, a Teradata company, today announced the signing of a definitive agreement to acquire Munich-based eCircle, a European cloud-based digital marketing company.

The combination of Teradata’s analytical capabilities, Aprimo’s Integrated Marketing Management, and eCircle’s digital messaging solution will enable marketers worldwide to create integrated customer experiences across online and offline channels that leverage Big Data insights to grow existing customers, attract new customers, and increase revenues. Digital marketers will also have the option to leverage the eCircle solution as a standalone offering. The addition of eCircle more than triples Aprimo’s European team, expertise and reach in all major European countries, creating the largest marketing applications provider in Europe and enabling the delivery of eCircle solutions globally.

Components of the combined Teradata, Aprimo, and eCircle offering will include:

•        An Integrated Marketing Management solution that provides access to all marketing applications from the cloud – including digital, campaign and operational – thus enabling faster time to market, higher ease of use, and less IT complexity;

•        Ability to easily create targeted, personalised digital campaigns that are among the world’s most robust in their compliance with security and privacy regulations;

•        A digital messaging platform for social, mobile, web and email that can scale to support hundreds of billions of messages a year;

•        Multi-channel data management, advanced segmentation and optimisation;

•        Access to digital marketing services such as messaging, content creation, best practices and lead generation delivered by Aprimo digital marketing experts;

•        Big Data analytics from Teradata and Teradata Aster that turns content from social, mobile, web and email channels into actionable insights; and,

•        Unified reporting.

eCircle will also be available as a standalone solution for the digital marketer who wants a simple to deploy but powerful and easy to use digital messaging infrastructure.


Illumina Introduces BaseSpace Apps for Genome Informatics

Illumina, Inc. today introduced BaseSpace Apps, a dedicated applications store for BaseSpace, the Company’s genomics cloud computing platform. Informatics solutions available through BaseSpace Apps will allow customers to connect with a growing community of academic, commercial and open source tool providers who are building applications around Illumina data to dramatically simplify and accelerate genomic data analysis.

BaseSpace Apps will include a publicly available API (application programming interface) that allows developers to create and deploy new applications for the analysis of genetic data generated on Illumina systems. Diagnomics, GenoLogics Life Sciences, Genomatix, Golden Helix, Ingenuity Systems, Knome, Omicia, Spiral Genetics, Omixon, Real Time Genomics, Station X, Integromics Inc., Biomax Informatics AG, and Strand Life Sciences were named as initial application development partners.

“The rapid adoption of BaseSpace coupled with BaseSpace Apps will help us achieve our goal to create an ecosystem where users of Illumina next generation sequencers can easily access a broad range of genome analysis tools from the world’s leading bioinformatics vendors,” said Alex Dickinson, Illumina’s Senior Vice President, Cloud Genomics. “By providing an open API and collaborative environment, we can encourage more rapid proliferation of the tools that will enable scientists to analyze, understand and make use of massive amounts of genetic data.”

Illumina also introduced its iSAAC genome alignment tool today. Historically, alignment has been the most time-consuming and processor-intensive step in genome analysis. Available on BaseSpace as well as standard workstations, iSAAC maps sequencing reads to their proper location up to 10 times faster than existing aligners, significantly expediting and simplifying a critical component in data analysis.

Through BaseSpace Apps, a diverse array of new data analysis applications and programs such as iSAAC will be available as part of a growing toolset within the BaseSpace cloud for MiSeq® and HiSeq® systems. Collectively, the tools will provide a wide range of functionality, from workflow management and downstream data analysis, to data visualization and biological interpretation.

BaseSpace is a scalable cloud-computing environment for all of Illumina’s sequencing systems that can be accessed securely from anywhere in the world. MiSeq system data can already be seamlessly transferred to BaseSpace for storage, analysis and sharing between researchers and their peers around the world, all in a secure and user-friendly environment. HiSeq data storage and analysis capabilities will be commercially released later this year.