While Amazon Web Services is leaving no stone unturned in its drive to become a provider of end-to-end data services, Microsoft Azure is not far behind. In my previous post, I talked about AWS launching QuickSight for business intelligence. Azure has now come up with its own offering in the form of Azure Data Lake. Azure already collects a lot of streaming data from different devices and lets users focus on the insights through its managed Hadoop service, HDInsight.
Azure Data Lake brings two capabilities. The first is the Data Lake Store, a hyper-scale HDFS repository designed for analytics on big data workloads. It handles high-volume, high-speed data processing at any scale, including near-real-time data from sensors and devices. The Data Lake Store places no limits on the type or size of the data it holds, so you can store and process data in any format. It is HDFS compatible and hence supports Hadoop out of the box. It also supports Azure Active Directory, allowing you to configure data streams from within your enterprise network.
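To make the HDFS compatibility a little more concrete, here is a minimal Python sketch that builds a WebHDFS-style request URL for a Data Lake Store account. It assumes the store exposes a WebHDFS-compatible REST endpoint under `<account>.azuredatalakestore.net`. The account name `mydatalake`, the folder path, and the helper `webhdfs_url` are all made up for illustration. The sketch only builds the URL; a real request would also need an Azure Active Directory token, which is left out here.

```python
from urllib.parse import quote, urlencode


def webhdfs_url(account: str, path: str, op: str, **params: str) -> str:
    """Build a WebHDFS-style URL for a Data Lake Store account (illustrative helper)."""
    # Assumed host pattern: <account>.azuredatalakestore.net
    host = f"{account}.azuredatalakestore.net"
    # Percent-encode the path but keep the "/" separators intact.
    clean_path = quote(path.strip("/"), safe="/")
    # "op" is the standard WebHDFS operation parameter, e.g. LISTSTATUS or OPEN.
    query = urlencode({"op": op, **params})
    return f"https://{host}/webhdfs/v1/{clean_path}?{query}"


if __name__ == "__main__":
    # "mydatalake" and the folder below are hypothetical names.
    print(webhdfs_url("mydatalake", "/sensors/2015/10", "LISTSTATUS"))
```

Because the URL follows the usual WebHDFS shape, tools that already speak to HDFS over REST should need little more than a different host name and an authentication header.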
The second capability is Azure Data Lake Analytics, which allows you to run queries against any storage in Azure (blobs, Data Lake Store, SQL DB, etc.) and make sense of the large volumes of data stored there. It comes with U-SQL, a query language modeled on familiar SQL syntax. You can use it to declaratively create big data jobs that run against the stored datasets. U-SQL jobs can also be authored in the familiar Visual Studio environment. Data Lake Analytics jobs are not limited to U-SQL; you can also write your own code to create them.
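To give a feel for what a declarative U-SQL job looks like, the Python sketch below renders the text of a small script that follows the usual extract, transform, and output shape. The column names (`DeviceId`, `Temperature`), the file paths, and the helper `render_usql` are hypothetical. The script is only generated as text here; it is not submitted to Data Lake Analytics, so treat it as an illustration of the shape of a job rather than a tested one.

```python
def render_usql(input_path: str, output_path: str) -> str:
    """Return the text of a small U-SQL script (extract, aggregate, output)."""
    # Column names and types are assumptions made for this example only.
    return f'''@readings =
    EXTRACT DeviceId string,
            Temperature double
    FROM "{input_path}"
    USING Extractors.Csv();

@averages =
    SELECT DeviceId,
           AVG(Temperature) AS AvgTemperature
    FROM @readings
    GROUP BY DeviceId;

OUTPUT @averages
TO "{output_path}"
USING Outputters.Csv();
'''


if __name__ == "__main__":
    # Both paths are hypothetical locations inside a Data Lake Store.
    print(render_usql("/sensors/readings.csv", "/output/averages.csv"))
```

The script never says how to parallelize the work or where to run it. It only declares the rowsets and the result, which is what makes the job declarative.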
Azure Services, HDFS, Azure Data Lake, U-SQL query language