Google reveals Bigtable, a NoSQL service based on what it uses internally

Google has punted another big data service, a variant of what it uses internally, into the wild

Google has announced Cloud Bigtable, a fully managed NoSQL database service that the company says combines its own internal database technology with the open source Apache HBase API.

The company whose MapReduce work inspired the open source Hadoop project is now making available the same non-relational database technology that drives a number of its own services, including Google Search, Gmail, and Google Analytics.

Google said the service is powered by the same Bigtable technology it runs internally, and is accessible through the HBase API, which provides real-time read/write access.

“Google Cloud Bigtable excels at large ingestion, analytics, and data-heavy serving workloads. It’s ideal for enterprises and data-driven organizations that need to handle huge volumes of data, including businesses in the financial services, AdTech, energy, biomedical, and telecommunications industries,” explained Cory O’Connor, product manager at Google.

O’Connor said the service, which is now in beta, can deliver more than twice the performance of its direct competitors, a figure that will likely depend on the use case, at less than half their total cost of ownership (TCO).

“As businesses become increasingly data-centric, and with the coming age of the Internet of Things, enterprises and data-driven organizations must become adept at efficiently deriving insights from their data. In this environment, any time spent building and managing infrastructure rather than working on applications is a lost opportunity.”

Bigtable is Google’s latest move to bolster its data services, a central pillar of its strategy to attract new customers to its growing cloud platform. Last month the company announced the beta launch of Google Cloud Dataflow, a Java-based service that lets users build, deploy and run data processing pipelines for tasks such as ETL, analytics, real-time computation, and process orchestration, while abstracting away infrastructure concerns such as cluster management.