Google, Cloudera Bring Cloud Dataflow to Spark

Google and Cloudera have partnered on a project that will bring Google’s Cloud Dataflow programming model to Apache’s Spark data processing engine. Dataflow arose out of Google’s own internal big data processing efforts, and it uses Google’s Compute Engine, Cloud Storage and BigQuery cloud computing services. Spark is an Apache project for very fast big data processing.

The two companies have released a “runner” that connects Dataflow to Spark. However, enterprises should note that the tool is still an alpha release and is not ready for production deployment.
