sparkpipe-core
Modular, non-linear data pipeline framework for Spark
@unchartedsoftware
Staying productive on the Spark platform means writing scripts that are modular, testable, and reusable.
Sparkpipe facilitates expressing and connecting components of Spark jobs in a standard way, so that they might be assembled in series (or even in a more complex dependency graph of operations), reused and shared. Easily connect traditional ETL operations with machine learning and natural language processing, through to output and data visualization.
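The idea of assembling operations in series, or in a branching dependency graph, can be illustrated with a minimal sketch. This is not the sparkpipe API: `MiniPipe`, `to`, `merge`, `run`, and the sample operations are hypothetical names chosen for illustration, and the sketch works on plain Scala collections rather than Spark RDDs or DataFrames so that it runs without a cluster.

```scala
// Hypothetical illustration only: MiniPipe is NOT the sparkpipe API.
// It models a lazily evaluated chain of operations over a plain value.
final class MiniPipe[A](compute: () => A) {
  // Cache the result so a stage shared by several branches runs only once.
  private lazy val result: A = compute()

  // Append an operation; nothing executes until run is called.
  def to[B](op: A => B): MiniPipe[B] = new MiniPipe[B](() => op(result))

  // Evaluate this stage and everything it depends on.
  def run: A = result
}

object MiniPipe {
  // Start a pipe from an input value (evaluated lazily).
  def apply[A](input: => A): MiniPipe[A] = new MiniPipe[A](() => input)

  // Join two branches into one stage, forming a small dependency graph.
  def merge[A, B, C](left: MiniPipe[A], right: MiniPipe[B])(op: (A, B) => C): MiniPipe[C] =
    new MiniPipe[C](() => op(left.run, right.run))
}

object MiniPipeExample {
  // Reusable, independently testable operations (stand-ins for ETL steps).
  val dropBlank: Seq[String] => Seq[String] = _.filter(_.trim.nonEmpty)
  val lowerCase: Seq[String] => Seq[String] = _.map(_.toLowerCase)
  val wordCount: Seq[String] => Int = _.flatMap(_.split("\\s+")).size
  val longestLine: Seq[String] => Int = _.map(_.length).max

  // Series: input -> dropBlank -> lowerCase
  val cleaned: MiniPipe[Seq[String]] =
    MiniPipe(Seq("Hello Spark", " ", "Reusable Ops")).to(dropBlank).to(lowerCase)

  // Graph: two branches share the cleaned stage, then merge into one result.
  val summary: MiniPipe[String] =
    MiniPipe.merge(cleaned.to(wordCount), cleaned.to(longestLine)) {
      (words, longest) => s"words=$words longest=$longest"
    }

  def main(args: Array[String]): Unit =
    println(summary.run) // prints "words=4 longest=12"
}
```

Because each operation is an ordinary function value, it can be unit tested on its own and reused across pipelines, which is the property the description above emphasizes. Consult the sparkpipe documentation for the actual class and method names.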
How to
Include this package in your Spark Applications using:
spark-shell, pyspark, or spark-submit
> $SPARK_HOME/bin/spark-shell --packages software.uncharted.sparkpipe:sparkpipe-core:0.9.7
In your sbt build file, add:
libraryDependencies += "software.uncharted.sparkpipe" % "sparkpipe-core" % "0.9.7"
In your pom.xml, add:
<dependencies>
  <!-- list of dependencies -->
  <dependency>
    <groupId>software.uncharted.sparkpipe</groupId>
    <artifactId>sparkpipe-core</artifactId>
    <version>0.9.7</version>
  </dependency>
</dependencies>
Version: 0.9.7 ( 2aff5e | zip | jar ) / Date: 2016-02-24 / License: BSD 3-Clause
Version: 0.9.6 ( b330de | zip | jar ) / Date: 2016-02-19 / License: Apache-2.0
Version: 0.9.5 ( 1fb943 | zip | jar ) / Date: 2016-01-24 / License: Apache-2.0
Version: 0.9.4 ( c414d9 | zip | jar ) / Date: 2016-01-11 / License: Apache-2.0
Version: 0.9.3 ( 7008fc | zip ) / Date: 2016-01-08 / License: Apache-2.0