Google and Microsoft have announced the general release of their respective cloud-based data integration services, Cloud Dataflow and Azure Data Factory.
Google’s Cloud Dataflow is designed to integrate separate databases and data systems, both streaming and batch, in one programming model, while giving apps full access to that data and the ability to customise it. It is essentially a way to reduce operational overhead when doing big data analysis in the cloud.
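To make the "one programming model" idea concrete, here is a minimal sketch in plain Python. It does not use the Cloud Dataflow SDK, and every name in it (`parse_and_filter`, `stream_source`) is hypothetical. It only illustrates the general principle that the same transform can be written once and applied to both a bounded (batch) source and an unbounded (streaming) one.

```python
import itertools
from typing import Iterable, Iterator


def parse_and_filter(records: Iterable[str]) -> Iterator[int]:
    """Hypothetical transform: parse each record as an integer, keep even values."""
    for raw in records:
        value = int(raw)
        if value % 2 == 0:
            yield value


def stream_source() -> Iterator[str]:
    """Hypothetical unbounded source that emits "1", "2", "3", ... forever."""
    for n in itertools.count(1):
        yield str(n)


# Batch: a bounded collection is processed in full.
batch_result = list(parse_and_filter(["1", "2", "3", "4"]))

# Streaming: the same transform runs lazily over an unbounded source;
# here we stop after taking the first three results.
stream_result = list(itertools.islice(parse_and_filter(stream_source()), 3))
```

The transform is unaware of whether its input ever ends. A managed service adds concerns this sketch leaves out, such as distribution, windowing and fault tolerance.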
Microsoft’s Azure Data Factory (ADF) is a slightly different offering. It’s a data integration and automation service that regulates the data pipelines connecting a range of databases and data systems with applications. The pipelines can be scheduled to ingest, prep, transform, analyse and publish that data, with ADF automating and orchestrating the more complex transactions.
ADF is one of the core components of Microsoft’s Cortana analytics offering, where it automates the movement and transformation of data from disparate sources.
The maturation and commoditisation of data integration and automation is a positive sign for an industry that has long leaned heavily on expensive bespoke data integration. As more cloud incumbents bring their own integration offerings to the table, it will be interesting to see how some of the bigger players in data integration and automation, like Informatica or Teradata, respond.