Google Open Sources Dataflow Analytics Code through Apache Incubator

Christopher Tozzi, Contributing Editor

January 21, 2016

Google is open-sourcing more code by contributing Cloud Dataflow to the Apache Software Foundation. The move, a first for Google, opens new cloud-based data analytics options and integration opportunities for big data companies.

Cloud Dataflow is a platform for processing large amounts of data in the cloud. It features an open source, Java-based SDK, which makes it easy to integrate with other cloud-centric analytics and Big Data tools.

The platform’s main value for Big Data operations is providing compatibility with new technologies as they emerge while still integrating into existing workflows. That saves organizations from having to revamp their analytics infrastructure or code each time a new data processing framework appears.

Although the Dataflow SDK has been open source for more than a year, Google took the bigger step this week of proposing to turn the platform into an Apache Incubator project. That move paves the way for Dataflow’s codebase to eventually become a full-fledged Apache Software Foundation project.

Google has partnered with Cloudera, data Artisans, Talend, Cask and PayPal in issuing the proposal. Those partners are already celebrating it. If approved, which it almost certainly will be, the proposal will make it simpler to build Dataflow’s scalability and integration features into commercial Big Data platforms in an open source, vendor-neutral way.

Talend, for instance, had this to say: “Developers leveraging the Dataflow framework won’t be ‘locked-in’ with a specific data processing runtime and will be able to leverage new data processing framework as they emerge without having to rewrite their Dataflow pipelines, making it Future-proof.”

For the channel, Google’s proposal means the cloud and big data are set to grow closer together, and open source big data companies will find it easier to keep the future of data analytics open.

About the Author

Christopher Tozzi

Contributing Editor

Christopher Tozzi started covering the channel for The VAR Guy on a freelance basis in 2008, with an emphasis on open source, Linux, virtualization, SDN, containers, data storage and related topics. He also teaches history at a major university in Washington, D.C. He occasionally combines these interests by writing about the history of software. His book on this topic, “For Fun and Profit: A History of the Free and Open Source Software Revolution,” is forthcoming with MIT Press.
