
Google adds Crate to SQL services on GCE

Google has been on a big data push

Google has added open source distributed SQL data store Crate to the Google Compute Engine arsenal, the latest in a series of moves aimed at bolstering the company’s data services.

Crate is a distributed open source data store built on a high-availability “shared-nothing” architecture that automatically shards and distributes data across all nodes (and maintains several replicas for fault tolerance).

It uses SQL syntax but packs some NoSQL goodies as well (Elasticsearch, Presto and Lucene are among the components it builds on).

“This means when a new node is added, the cluster automatically rebalances and can self-heal when a node is removed. All data is indexed, optimized, and compressed on ingest and is accessible using familiar SQL syntax through a RESTful API,” explained Tyler Randles, evangelist at Crate.

“Crate was built so developers won’t need to ‘glue’ several technologies together to store documents or BLOBs, or support real-time search. It also helps dev-ops by eliminating the need for manual tuning, sharding, replication, and other operations required to keep a large data store in good health.”
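To make the "SQL through a RESTful API" point concrete, here is a minimal Python sketch of what such a call might look like. It assumes a Crate node reachable at `http://localhost:4200` that accepts a JSON body with a `stmt` field (and optional `args`) on a `/_sql` endpoint, as Crate's documentation describes; the helper names and the `tweets` table are hypothetical and not part of the announcement, so check the details against the version you deploy.

```python
import json
import urllib.request


def build_sql_request(stmt, args=None, base_url="http://localhost:4200"):
    """Build (but do not send) an HTTP POST carrying one SQL statement.

    Assumption: the node exposes a `/_sql` endpoint that takes a JSON
    body of the form {"stmt": "...", "args": [...]}.
    """
    payload = {"stmt": stmt}
    if args is not None:
        # Positional parameters for the `?` placeholders in the statement.
        payload["args"] = list(args)
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/_sql",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_sql(stmt, args=None, base_url="http://localhost:4200"):
    """Send the statement to the node and return the decoded JSON reply."""
    request = build_sql_request(stmt, args, base_url)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Hypothetical table and column names, for illustration only.
    # This line needs a running node at base_url to succeed.
    print(run_sql("SELECT text FROM tweets WHERE id = ?", [1]))
```

Only the standard library is used, so the same request could just as easily be issued from `curl` or any HTTP client; the statement itself is ordinary SQL.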

The move is yet another attempt by Google to bolster its data services. Earlier this week the company revealed Cloud Bigtable, a fully managed NoSQL database service that it said combines its own internal database technology with open source Apache HBase APIs.

Last month the company announced the beta launch of Google Cloud Dataflow, a Java-based service that lets users build, deploy and run data processing pipelines for applications such as ETL, analytics, real-time computation, and process orchestration, while abstracting away infrastructure concerns like cluster management.