sparkpipe-core
Modular, non-linear data pipeline framework for Spark
@unchartedsoftware
Staying productive on the Spark platform means writing scripts that are modular, testable, and reusable.
Sparkpipe provides a standard way to express and connect the components of a Spark job, so that they can be assembled in series (or in a more complex dependency graph of operations), reused, and shared. It connects traditional ETL operations with machine learning and natural language processing, through to output and data visualization.
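To make the idea of a non-linear pipeline concrete, here is a minimal sketch in plain Scala. It does not use the Sparkpipe API: `PipelineSketch`, `Stage`, `to`, `merge`, and `run` are hypothetical names invented for this illustration, and ordinary collections stand in for Spark data. It only shows how small reusable operations might be chained in series, branched from a shared stage, and merged again.

```scala
// Hypothetical illustration only: none of these names come from Sparkpipe.
object PipelineSketch {
  // A stage wraps a computation that runs lazily and is cached after the
  // first run, so a stage shared by several branches is evaluated once.
  final class Stage[A](compute: () => A) {
    private lazy val cached: A = compute()
    def run(): A = cached
    // Chain a reusable operation after this stage.
    def to[B](op: A => B): Stage[B] = new Stage(() => op(run()))
  }

  object Stage {
    def apply[A](value: => A): Stage[A] = new Stage(() => value)
    // Join two branches into one stage (the non-linear part of the graph).
    def merge[A, B](left: Stage[A], right: Stage[B]): Stage[(A, B)] =
      new Stage(() => (left.run(), right.run()))
  }

  // Reusable, independently testable operations.
  val parse: Seq[String] => Seq[Int] = _.map(_.trim.toInt)
  val keepEven: Seq[Int] => Seq[Int] = _.filter(_ % 2 == 0)
  val total: Seq[Int] => Int = _.sum

  // One source feeds two branches, which are merged into a single report.
  val source = Stage(Seq("1", " 2", "3 ", "4"))
  val parsed = source.to(parse)
  val evenSum = parsed.to(keepEven).to(total)
  val allSum = parsed.to(total)
  val report = Stage.merge(evenSum, allSum).to {
    case (even, all) => s"even=$even of total=$all"
  }
}
```

Calling `PipelineSketch.report.run()` evaluates the whole graph. Because each operation is an ordinary function, it can be unit tested on its own and reused in other pipelines, which is the property the description above emphasizes.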
How to
Include this package in your Spark Applications using:
spark-shell, pyspark, or spark-submit
> $SPARK_HOME/bin/spark-shell --packages software.uncharted.sparkpipe:sparkpipe-core:0.9.7
sbt
In your sbt build file, add:
libraryDependencies += "software.uncharted.sparkpipe" % "sparkpipe-core" % "0.9.7"
Maven
In your pom.xml, add:
<dependencies>
  <!-- list of dependencies -->
  <dependency>
    <groupId>software.uncharted.sparkpipe</groupId>
    <artifactId>sparkpipe-core</artifactId>
    <version>0.9.7</version>
  </dependency>
</dependencies>
Releases
Version: 0.9.7 ( 2aff5e | zip | jar ) / Date: 2016-02-24 / License: BSD 3-Clause
Version: 0.9.6 ( b330de | zip | jar ) / Date: 2016-02-19 / License: Apache-2.0
Version: 0.9.5 ( 1fb943 | zip | jar ) / Date: 2016-01-24 / License: Apache-2.0
Version: 0.9.4 ( c414d9 | zip | jar ) / Date: 2016-01-11 / License: Apache-2.0
Version: 0.9.3 ( 7008fc | zip ) / Date: 2016-01-08 / License: Apache-2.0