Parquet Now a Top-Level Apache Project

Parquet is part of the Hadoop ecosystem.



The Apache Software Foundation (ASF) has announced that Apache Parquet has graduated from the Apache Incubator to become a top-level project. Parquet is a columnar storage format for the Hadoop ecosystem that is used in many big data projects. "Lots of applications are based on existing row-oriented formats, like Avro and Thrift, that come with objects to represent the data," explained Ryan Blue, a Cloudera software engineer and a member of the Apache Parquet Project Management Committee. "A great feature of Parquet is that it is built to work natively with those existing classes, so you don’t have to change the application to go from a row-oriented to a column-oriented format. Parquet can read directly to Avro records, Spark data frames, Hive’s internal writeables, and others."
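To make the row-oriented versus column-oriented distinction concrete, here is a minimal sketch in plain Python. It does not use Parquet or any Parquet API; the `User` class and the `to_columns` and `to_records` helpers are hypothetical names invented for this illustration. It only shows the general idea that data can be stored one column at a time while the application keeps working with its existing record objects.

```python
from dataclasses import dataclass, fields


@dataclass
class User:
    """Hypothetical row-oriented record, standing in for an Avro or Thrift object."""
    name: str
    age: int


def to_columns(cls, records):
    """Pivot row objects into one list per field (a column-oriented layout)."""
    names = [f.name for f in fields(cls)]
    return {n: [getattr(r, n) for r in records] for n in names}


def to_records(cls, columns):
    """Rebuild the original row objects from the per-field lists."""
    names = [f.name for f in fields(cls)]
    return [cls(*values) for values in zip(*(columns[n] for n in names))]


rows = [User("ada", 36), User("lin", 41)]

# Storage side: values of the same field sit next to each other.
columns = to_columns(User, rows)

# Application side: the code still receives ordinary User objects.
restored = to_records(User, columns)
```

In this toy version, reading a single field such as `columns["age"]` touches only that column's values, which is the usual motivation for columnar layouts. A real format like Parquet adds encoding, compression, and metadata that this sketch leaves out.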

Before a project can receive the "top-level" designation, it must demonstrate to the ASF that it is well managed. Many in the industry see the designation as Apache's vote of confidence in a project.
