The Apache Software Foundation has released Spark 2.0, the latest version of the open-source, real-time big data processing engine. According to the Spark website, “The major updates are API usability, SQL 2003 support, performance improvements, structured streaming, R UDF support, as well as operational improvements. In addition, this release includes over 2,500 patches from over 300 contributors.”
Databricks, a company founded by the creators of Spark, announced support for Spark 2.0 and said the new release is 5 to 10 times faster than Spark 1.6 in some situations. “One of the things that’s really exciting for me as a developer of Apache Spark is seeing how quickly users start to use new features and APIs we introduce, and in turn, offer almost instantaneous feedback, so that we can continue to improve them,” said Matei Zaharia, CTO of Databricks.