Google announced a series of big data service updates to its cloud platform this week in a bid to strengthen its growing portfolio of data services.
The company announced the beta launch of Google Cloud Dataflow, a Java-based service that lets users build, deploy and run data processing pipelines for applications such as ETL, analytics, real-time computation, and process orchestration, while abstracting away infrastructure concerns such as cluster management.
The service is integrated with Google’s monitoring tools, and the company said it is built from the ground up for fault tolerance.
“We’ve been tackling challenging big data problems for more than a decade and are well aware of the difference that simple yet powerful data processing tools make. We have translated our experience from MapReduce, FlumeJava, and MillWheel into a single product, Google Cloud Dataflow,” the company explained in a recent blog post.
“It’s designed to reduce operational overhead and make programming and data analysis your only job, whether you’re a data scientist, data analyst or data-centric software developer. Along with other Google Cloud Platform big data services, Cloud Dataflow embodies the kind of highly productive and fully managed services designed to use big data, the cloud way.”
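To give a rough sense of the pipeline model described above, here is a minimal sketch of an extract-transform-load flow written with plain Java streams. It does not use the Cloud Dataflow SDK, and the class name `EtlSketch`, the method `totalsByKey`, and the sample records are all hypothetical, chosen only to illustrate the shape of such a pipeline.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical illustration only: plain Java streams, not the Cloud Dataflow SDK.
class EtlSketch {

    // Parse "key,value" lines, drop malformed records, and sum the values per key.
    static Map<String, Integer> totalsByKey(List<String> lines) {
        return lines.stream()
                // Extract: split each raw line into fields.
                .map(line -> line.split(","))
                // Transform: keep only records with a key and an integer value.
                .filter(parts -> parts.length == 2 && parts[1].trim().matches("-?\\d+"))
                // Load: aggregate into a sorted map of totals per key.
                .collect(Collectors.groupingBy(
                        parts -> parts[0].trim(),
                        TreeMap::new,
                        Collectors.summingInt(parts -> Integer.parseInt(parts[1].trim()))));
    }

    public static void main(String[] args) {
        // Sample input (assumed data, not from any real service).
        List<String> raw = List.of("clicks,3", "views,10", "clicks,4", "malformed");
        System.out.println(totalsByKey(raw));
    }
}
```

In a managed service of the kind the article describes, the same read, transform, and aggregate steps would be declared by the developer, while scheduling, scaling, and cluster management would be handled by the platform.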
The company also updated BigQuery, Google’s SQL-based cloud analytics service: it added a number of security features, including row-level permissions for data protection; improved performance by raising the ingestion limit to 100,000 rows per second; and announced the service’s availability in Europe.
Google has largely focused its attention on other areas of the stack of late. The company has been pushing hard on Kubernetes, its container scheduling and deployment initiative, as well as on its hybrid cloud initiatives (Mirantis, VMware). It also recently introduced log analysis for Google Cloud and App Engine users.