Alluxio Powers Baidu Search

Alluxio, an open source memory-centric distributed storage system that was formerly known as Tachyon, has just released its 1.0 version, and it is already getting attention from some of the biggest firms on the Internet. For example, Chinese search giant Baidu says it is using Alluxio to achieve blazing-fast performance.

Baidu Senior Architect Shaoshan Liu explained that the company had been using Spark SQL for queries but wasn’t achieving the desired performance levels. “With Spark SQL alone, it took 100-150 seconds to finish a query; using Alluxio, where data may hit local or remote Alluxio nodes, it took 10-15 seconds. And if all of the data was stored in Alluxio local nodes, it took about five seconds flat, a 30-fold increase in speed,” he explained. “Based on these results, and the system’s reliability, we built a full system around Alluxio and Spark SQL.”

Other companies working with Alluxio include Barclays, Alibaba, Intel and IBM.
