Parquet Now a Top-Level Apache Project

The Apache Software Foundation (ASF) has announced that Apache Parquet has graduated from incubator status to become a top-level project. Parquet is a columnar Hadoop storage format used in many big data projects. “Lots of applications are based on existing row-oriented formats, like Avro and Thrift, that come with objects to represent the data,” explained Ryan Blue, a Cloudera software engineer and a member of the Apache Parquet Project Management Committee. “A great feature of Parquet is that it is built to work natively with those existing classes, so you don’t have to change the application to go from a row-oriented to a column-oriented format. Parquet can read directly to Avro records, Spark data frames, Hive’s internal writeables, and others.”
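
To make the row-oriented versus column-oriented distinction concrete, here is a minimal sketch in plain Python. It is not the Parquet format or any Parquet API: the sample records and the helper names `to_columns` and `to_rows` are hypothetical, and the sketch assumes every record carries the same fields. It only illustrates the idea that the same records can be stored column by column and handed back to the application as row objects.

```python
# Illustrative only: plain Python lists and dicts, not the Parquet file format.
# The records and helper names below are made up for this example.
rows = [
    {"id": 1, "name": "ada", "score": 9.5},
    {"id": 2, "name": "bob", "score": 7.0},
    {"id": 3, "name": "cy", "score": 8.25},
]


def to_columns(records):
    """Pivot a list of row dicts into a dict of column lists."""
    columns = {}
    for record in records:
        for field, value in record.items():
            columns.setdefault(field, []).append(value)
    return columns


def to_rows(columns):
    """Rebuild row dicts from column lists (the inverse of to_columns)."""
    fields = list(columns)
    return [
        dict(zip(fields, values))
        for values in zip(*(columns[f] for f in fields))
    ]


columns = to_columns(rows)

# An aggregate over one field touches only that column's values.
average_score = sum(columns["score"]) / len(columns["score"])

# The application can still be handed the row objects it already expects.
restored = to_rows(columns)
```

In this layout, a query that needs only `score` never reads `id` or `name`, which is the general appeal of columnar storage for analytics workloads.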

Before a project can receive the “top-level” designation, it must demonstrate to the ASF that it is well managed, and many in the industry see the designation as Apache’s vote of confidence in a project.
