

BI-on-Hadoop Benchmarks Compare Analytics Engines

Hive, Impala, Presto and Spark SQL each excelled at different things.


AtScale Inc. has published the results of a new benchmark study of BI-on-Hadoop analytics engines. The study tested Hive, Impala, Presto and Spark SQL and found that each of the open-source tools had its own "sweet spot."

"There is no single 'best engine,'" the study concluded. "Presto, Hive, Impala and Spark SQL were all able to effectively complete a range of queries on over 6 billion rows of data. The 'winning' engine for each of our benchmark queries was dependent on the query characteristics (join size, selectivity, group-bys)."

It added, "A successful BI-on-Hadoop architecture will likely require more than one SQL on Hadoop engine. Each engine has its strengths: Presto's and Impala's concurrency scaling support for quick metric queries, Spark SQL's handling of large joins, Hive's and Impala's consistency across multiple query types. Enterprises might consider leveraging different engines for different query patterns."
