yggdrasil
Yggdrasil: Faster Decision Trees Using Column Partitioning in Spark
@fabuzaid21
Yggdrasil is a more efficient way to train decision trees in Apache Spark, particularly for deep trees and for datasets with a large number of features. For tree depths greater than 10, Yggdrasil is an order of magnitude faster than Spark MLlib v1.6.0.
How to
Include this package in your Spark Applications using:
spark-shell, pyspark, or spark-submit
> $SPARK_HOME/bin/spark-shell --packages fabuzaid21:yggdrasil:1.0.1
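The same `--packages` coordinate applies to the other two launchers named above. As a minimal sketch, the shell helper below wraps any of them. `with_yggdrasil` and the `DRY_RUN` switch are hypothetical names introduced here for illustration, not part of Spark or of this package, and `my_app.py` stands in for your own application.

```shell
#!/bin/sh
# Package coordinate as listed on this page.
YGGDRASIL_PKG="fabuzaid21:yggdrasil:1.0.1"

# Hypothetical helper (an assumption, not part of Spark or Yggdrasil):
# runs the given launcher from $SPARK_HOME/bin with the package attached.
# With DRY_RUN=1 it only prints the command it would run.
with_yggdrasil() {
    launcher="$1"
    shift
    # Rebuild the argument list: launcher path, package flag, then user args.
    set -- "$SPARK_HOME/bin/$launcher" --packages "$YGGDRASIL_PKG" "$@"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Example invocations (my_app.py is a placeholder):
#   with_yggdrasil pyspark
#   with_yggdrasil spark-submit my_app.py
```

Setting `DRY_RUN=1` before calling the helper is a cheap way to check the assembled command line before starting a Spark session.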
sbt
If you use the sbt-spark-package plugin, in your sbt build file, add:
spDependencies += "fabuzaid21/yggdrasil:1.0.1"
Otherwise,
resolvers += "Spark Packages Repo" at "https://repos.spark-packages.org/"

libraryDependencies += "fabuzaid21" % "yggdrasil" % "1.0.1"
Maven
In your pom.xml, add:
<dependencies>
  <!-- list of dependencies -->
  <dependency>
    <groupId>fabuzaid21</groupId>
    <artifactId>yggdrasil</artifactId>
    <version>1.0.1</version>
  </dependency>
</dependencies>
<repositories>
  <!-- list of other repositories -->
  <repository>
    <id>SparkPackagesRepo</id>
    <url>https://repos.spark-packages.org/</url>
  </repository>
</repositories>
Releases
Version: 1.0.1 ( 5de595 | zip | jar ) / Date: 2018-05-11 / License: Apache-2.0 / Scala version: 2.10
Version: 1.0 ( f2bf92 | zip | jar ) / Date: 2016-06-07 / License: Apache-2.0 / Scala version: 2.10