Data Science & Engineering Platform: Data Lineage and Provenance for Apache Spark

This is the third in a series of data engineering blog posts that we plan to publish. The first post outlined the data science and data engineering capabilities of Hortonworks Data Platform.

Motivation

Apache Spark is becoming the de facto processing framework for all kinds of complex processing, including ETL, line-of-business (LOB) data processing and machine learning. […]

The post Data Science & Engineering Platform: Data Lineage and Provenance for Apache Spark appeared first on Hortonworks.
