Microsoft selects Ubuntu for first Linux-based Azure offering

Microsoft has announced plans to simplify Big Data and widen its use through Azure.

In a blog post, T K Rengarajan, Microsoft’s corporate VP for Data Platforms, described how the expanded Microsoft Azure Data Lake Store, available in preview later this year, will provide a single repository that captures data of any size, type and speed without forcing changes to applications as data scales. In the store, data can be securely shared for collaboration and is accessible for processing and analytics from HDFS applications and tools.

Another new addition is Azure Data Lake Analytics, a dynamically scaling service built on Apache YARN, which Microsoft says will stop people being sidetracked from their work by the need to understand distributed architecture. This service, available in preview later this year, will include U-SQL, a language that unifies the benefits of SQL with the expressive power of user code. U-SQL’s scalable distributed querying is intended to help users analyse data in the store and across SQL Servers in Azure, Azure SQL Database and Azure SQL Data Warehouse.

Meanwhile, Microsoft has selected Ubuntu for its first Linux-based Azure offering. The Hadoop-based big data service, HDInsight, will run on Canonical’s open source operating system Ubuntu.

Azure HDInsight uses a range of open source analytics engines including Hive, Spark, HBase and Storm. Microsoft says the service is now on general release with a 99.9 per cent uptime service level agreement.

In addition, Azure Data Lake Tools for Visual Studio will provide an integrated development environment that aims to ‘dramatically’ simplify authoring, debugging and optimization for processing and analytics at any scale, according to Rengarajan. “Leading Hadoop applications that span security, governance, data preparation and analytics can be easily deployed from the Azure Marketplace on top of Azure Data Lake,” he said.

Azure Data Lake removes the complexities of ingesting and storing all of your data while making it faster to get up and running with batch, streaming, and interactive analytics, said Rengarajan.