approx-smote

Approximated SMOTE for Big Data under the Spark Framework.

@mjuez

An approximated SMOTE implementation for Apache Spark that uses saurfang's k-NN implementation, which is based on hybrid spill trees, for efficient k-nearest-neighbor search.
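
For readers unfamiliar with SMOTE, the core idea can be sketched in a few lines of plain Scala: each synthetic sample is placed at a random position on the segment between a minority-class point and one of its k nearest minority-class neighbors. The sketch below is not this package's API. The object name SmoteSketch and all of its methods are hypothetical, it runs on a small in-memory collection rather than a Spark Dataset, and it uses an exact brute-force neighbor search where the package relies on an approximate, spill-tree-based search.

```scala
import scala.util.Random

// Illustrative only: a hypothetical, in-memory SMOTE sketch (not the approx-smote API).
object SmoteSketch {
  type Point = Array[Double]

  // Squared Euclidean distance between two points of equal dimension.
  def sqDist(a: Point, b: Point): Double =
    a.zip(b).map { case (x, y) => (x - y) * (x - y) }.sum

  // Exact k nearest neighbors of minority(i), found by brute force.
  // The point itself is excluded from its own neighbor list.
  def nearest(minority: IndexedSeq[Point], i: Int, k: Int): IndexedSeq[Point] =
    minority.indices
      .filter(_ != i)
      .sortBy(j => sqDist(minority(i), minority(j)))
      .take(k)
      .map(j => minority(j))

  // A synthetic sample on the segment from x to neighbor; gap is expected in [0, 1).
  def interpolate(x: Point, neighbor: Point, gap: Double): Point =
    x.zip(neighbor).map { case (a, b) => a + gap * (b - a) }

  // Generates perPoint synthetic samples for every minority point that has a neighbor.
  def oversample(minority: IndexedSeq[Point], k: Int, perPoint: Int, seed: Long): IndexedSeq[Point] = {
    val rng = new Random(seed)
    minority.indices.flatMap { i =>
      val neighbors = nearest(minority, i, k)
      if (neighbors.isEmpty) IndexedSeq.empty[Point]
      else IndexedSeq.fill(perPoint) {
        val chosen = neighbors(rng.nextInt(neighbors.length))
        interpolate(minority(i), chosen, rng.nextDouble())
      }
    }
  }
}
```

Calling SmoteSketch.oversample(points, 5, 2, 42L) on a collection of minority-class points would return two synthetic points per input point. The brute-force search compares every pair of points, which is the cost an approximate neighbor search is meant to avoid on large datasets.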


Tags

  • ml
  • big data
  • data mining
  • Preprocessing
  • imbalanced
  • smote
  • imbalance

How to

Include this package in your Spark Applications using:

spark-shell, pyspark, or spark-submit

> $SPARK_HOME/bin/spark-shell --packages mjuez:approx-smote:1.0.0

sbt

If you use the sbt-spark-package plugin, add the following to your sbt build file:

spDependencies += "mjuez/approx-smote:1.0.0"

Otherwise, add the resolver and the dependency directly:

resolvers += "Spark Packages Repo" at "http://dl.bintray.com/spark-packages/maven"

libraryDependencies += "mjuez" % "approx-smote" % "1.0.0"

Maven

In your pom.xml, add:
<dependencies>
  <!-- list of dependencies -->
  <dependency>
    <groupId>mjuez</groupId>
    <artifactId>approx-smote</artifactId>
    <version>1.0.0</version>
  </dependency>
</dependencies>
<repositories>
  <!-- list of other repositories -->
  <repository>
    <id>SparkPackagesRepo</id>
    <url>http://dl.bintray.com/spark-packages/maven</url>
  </repository>
</repositories>

Releases

Version: 1.0.0 ( c62f55 | zip | jar ) / Date: 2020-12-01 / License: Apache-2.0 / Scala version: 2.11

Version: 0.1.2 ( b36a47 | zip | jar ) / Date: 2020-11-18 / License: Apache-2.0 / Scala version: 2.11

Version: 0.1.1 ( ec109f | zip | jar ) / Date: 2020-11-18 / License: Apache-2.0 / Scala version: 2.11

Version: 0.1.0 ( 025db5 | zip | jar ) / Date: 2020-11-18 / License: Apache-2.0 / Scala version: 2.11