smote-bd (homepage)
SMOTE-BD: A distributed Synthetic Minority Oversampling Technique (SMOTE) for Big Data.
@majobasgall
SMOTE-BD is a fully scalable preprocessing approach for imbalanced classification in Big Data. It is based on SMOTE, one of the most widespread preprocessing solutions for imbalanced classification, which creates new synthetic instances from the neighbourhood of each minority-class example.
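The neighbourhood-based generation described above can be illustrated with a minimal, single-machine sketch of the classical SMOTE interpolation step. This is not the SMOTE-BD implementation and does not use its API: the object name `SmoteSketch` and all of its functions and parameters (`k`, `perInstance`, `rng`) are hypothetical, chosen only for illustration, and the distributed neighbour search that a Spark version would need is left out.

```scala
import scala.util.Random

// Illustrative only: a local, non-distributed sketch of classical SMOTE.
// None of these names come from the smote-bd package.
object SmoteSketch {
  type Point = Array[Double]

  // Euclidean distance between two feature vectors of equal length.
  def distance(a: Point, b: Point): Double =
    math.sqrt(a.zip(b).map { case (x, y) => (x - y) * (x - y) }.sum)

  // Indices of the k nearest minority neighbours of minority(i), excluding i itself.
  def neighbours(minority: IndexedSeq[Point], i: Int, k: Int): IndexedSeq[Int] =
    minority.indices
      .filter(_ != i)
      .sortBy(j => distance(minority(i), minority(j)))
      .take(k)

  // A synthetic point on the segment from x to a neighbour n: x + gap * (n - x).
  def interpolate(x: Point, n: Point, gap: Double): Point =
    x.zip(n).map { case (xi, ni) => xi + gap * (ni - xi) }

  // Generate `perInstance` synthetic points for every minority example,
  // each time picking one of its k nearest neighbours at random.
  def oversample(minority: IndexedSeq[Point], k: Int, perInstance: Int,
                 rng: Random): IndexedSeq[Point] =
    minority.indices.flatMap { i =>
      val nn = neighbours(minority, i, k)
      if (nn.isEmpty) IndexedSeq.empty[Point]
      else IndexedSeq.fill(perInstance) {
        val neighbour = minority(nn(rng.nextInt(nn.size)))
        interpolate(minority(i), neighbour, rng.nextDouble())
      }
    }
}
```

For example, calling `SmoteSketch.oversample(minority, k = 5, perInstance = 2, rng = new Random(42))` on a collection of minority-class feature vectors would return two synthetic points per original example, each lying on the line segment between that example and one of its nearest minority neighbours.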
How to
Include this package in your Spark Applications using:
spark-shell, pyspark, or spark-submit
> $SPARK_HOME/bin/spark-shell --packages majobasgall:smote-bd:0.1
sbt
If you use the sbt-spark-package plugin, in your sbt build file, add:
spDependencies += "majobasgall/smote-bd:0.1"
Otherwise,
resolvers += "Spark Packages Repo" at "https://repos.spark-packages.org/"

libraryDependencies += "majobasgall" % "smote-bd" % "0.1"
Maven
In your pom.xml, add:
<dependencies>
  <!-- list of dependencies -->
  <dependency>
    <groupId>majobasgall</groupId>
    <artifactId>smote-bd</artifactId>
    <version>0.1</version>
  </dependency>
</dependencies>
<repositories>
  <!-- list of other repositories -->
  <repository>
    <id>SparkPackagesRepo</id>
    <url>https://repos.spark-packages.org/</url>
  </repository>
</repositories>
Releases
Version: 0.1 ( 998700 | zip | jar ) / Date: 2018-11-14 / License: Apache-2.0 / Scala version: 2.11