RD2R

RD2R Ensemble

@djgarcia

RD2R Ensemble for Big Data. This method applies Random Discretization and Principal Component Analysis to the input data independently, then joins the two results to produce a more informative dataset.
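
The idea described above can be illustrated on a single machine. The following is a minimal sketch in plain Python, not the package's Spark implementation. It assumes that random discretization draws its cut points from the observed values of each feature, and it keeps only the first principal component (found by power iteration) to stay short. The names `random_discretize`, `first_principal_component`, and `rd2r_like` are hypothetical and do not come from the package.

```python
import random


def random_discretize(rows, bins, rng):
    """Map every feature to a bin index using randomly chosen cut points."""
    n_cols = len(rows[0])
    cuts = []
    for j in range(n_cols):
        column = [r[j] for r in rows]
        # Assumption: bins - 1 cut points are sampled from the observed values.
        cuts.append(sorted(rng.sample(column, bins - 1)))
    # The bin index is the number of cut points strictly below the value.
    return [[sum(v > c for c in cuts[j]) for j, v in enumerate(r)] for r in rows]


def first_principal_component(rows, iterations=200):
    """Project centered rows onto the leading eigenvector of the covariance."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    # Power iteration: repeatedly multiply by cov and renormalize.
    vec = [1.0] * d
    for _ in range(iterations):
        nxt = [sum(cov[a][b] * vec[b] for b in range(d)) for a in range(d)]
        norm = sum(x * x for x in nxt) ** 0.5
        vec = [x / norm for x in nxt]
    return [[sum(c[j] * vec[j] for j in range(d))] for c in centered]


def rd2r_like(rows, bins=3, seed=0):
    """Run both transforms on the same input and join them column-wise."""
    rng = random.Random(seed)
    discrete = random_discretize(rows, bins, rng)
    projected = first_principal_component(rows)
    return [d + p for d, p in zip(discrete, projected)]


# Toy input: five rows, two correlated numeric features.
data = [[1.0, 2.0], [2.0, 4.1], [3.0, 6.2], [4.0, 7.9], [5.0, 10.0]]
features = rd2r_like(data, bins=3, seed=0)
for row in features:
    print(row)
```

Each output row holds the two discretized features followed by one PCA coordinate, so the joined data has more columns than the input. Both transforms read the original rows; neither consumes the other's output, which is what "independently" means in the description.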


Tags

  • big data
  • mllib
  • ensemble
  • Preprocessing

How to

Include this package in your Spark Applications using:

spark-shell, pyspark, or spark-submit

> $SPARK_HOME/bin/spark-shell --packages djgarcia:RD2R:1.0

sbt

If you use the sbt-spark-package plugin, in your sbt build file, add:

spDependencies += "djgarcia/RD2R:1.0"

Otherwise, add the Spark Packages resolver and the dependency directly:

resolvers += "Spark Packages Repo" at "https://repos.spark-packages.org/"

libraryDependencies += "djgarcia" % "RD2R" % "1.0"

Maven

In your pom.xml, add:
<dependencies>
  <!-- list of dependencies -->
  <dependency>
    <groupId>djgarcia</groupId>
    <artifactId>RD2R</artifactId>
    <version>1.0</version>
  </dependency>
</dependencies>
<repositories>
  <!-- list of other repositories -->
  <repository>
    <id>SparkPackagesRepo</id>
    <url>https://repos.spark-packages.org/</url>
  </repository>
</repositories>

Releases

Version: 1.0 (commit e4f921) / Date: 2018-01-29 / License: Apache-2.0 / Scala version: 2.11