NoiseFramework
Noise framework for removing noisy instances with three algorithms: HME-BD, HTE-BD and ENN-BD.
@djgarcia
This framework provides two Big Data preprocessing approaches for removing noisy examples: a homogeneous ensemble filter (HME_BD) and a heterogeneous ensemble filter (HTE_BD). A simpler filtering approach based on similarities between instances (ENN_BD) is also implemented.
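To make the similarity-based idea concrete, here is a minimal local sketch of an Edited Nearest Neighbor style filter in plain Scala. It assumes the classic ENN rule (drop an instance when its label disagrees with the majority label of its k nearest neighbors). It is not the package's ENN_BD implementation or API: `EnnSketch`, `Instance`, `ennFilter` and `sample` are hypothetical names used only for illustration, and nothing here is distributed over Spark.

```scala
// Illustrative sketch only: a local, non-distributed ENN-style filter.
// This is NOT the package's ENN_BD code; all names here are hypothetical.
object EnnSketch {
  final case class Instance(features: Vector[Double], label: Int)

  // Squared Euclidean distance; enough for ranking neighbors.
  def sqDist(a: Vector[Double], b: Vector[Double]): Double =
    a.zip(b).map { case (x, y) => (x - y) * (x - y) }.sum

  // Most frequent label among the given (non-empty) neighbors.
  // Ties are broken arbitrarily, so an odd k is advisable for two classes.
  def majorityLabel(neighbors: Vector[Instance]): Int =
    neighbors.groupBy(_.label).maxBy(_._2.size)._1

  // Keep an instance only if its label matches the majority label of its
  // k nearest neighbors (the instance itself is excluded from the search).
  def ennFilter(data: Vector[Instance], k: Int): Vector[Instance] = {
    val indexed = data.zipWithIndex
    indexed.filter { case (inst, i) =>
      val neighbors = indexed
        .collect { case (other, j) if j != i => other }
        .sortBy(other => sqDist(inst.features, other.features))
        .take(k)
      // With no neighbors there is no evidence of noise, so keep the instance.
      neighbors.isEmpty || majorityLabel(neighbors) == inst.label
    }.map(_._1)
  }

  // Two well-separated clusters plus one mislabeled point inside the first.
  val noisy: Instance = Instance(Vector(0.5, 0.5), 1)
  val sample: Vector[Instance] = Vector(
    Instance(Vector(0.0, 0.0), 0), Instance(Vector(0.0, 1.0), 0),
    Instance(Vector(1.0, 0.0), 0), Instance(Vector(1.0, 1.0), 0),
    Instance(Vector(10.0, 10.0), 1), Instance(Vector(10.0, 11.0), 1),
    Instance(Vector(11.0, 10.0), 1), Instance(Vector(11.0, 11.0), 1),
    noisy
  )
}
```

Calling `EnnSketch.ennFilter(EnnSketch.sample, 3)` on this toy data drops only the mislabeled point and keeps both clusters intact. The quadratic neighbor search is fine for a sketch but would not scale to the data sizes the package targets.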
How to
Include this package in your Spark Applications using:
spark-shell, pyspark, or spark-submit
> $SPARK_HOME/bin/spark-shell --packages djgarcia:NoiseFramework:1.2
sbt
If you use the sbt-spark-package plugin, in your sbt build file, add:
spDependencies += "djgarcia/NoiseFramework:1.2"
Otherwise,
resolvers += "Spark Packages Repo" at "https://repos.spark-packages.org/"

libraryDependencies += "djgarcia" % "NoiseFramework" % "1.2"
Maven
In your pom.xml, add:
<dependencies>
<!-- list of dependencies -->
<dependency>
<groupId>djgarcia</groupId>
<artifactId>NoiseFramework</artifactId>
<version>1.2</version>
</dependency>
</dependencies>
<repositories>
<!-- list of other repositories -->
<repository>
<id>SparkPackagesRepo</id>
<url>https://repos.spark-packages.org/</url>
</repository>
</repositories>
Releases
Version: 1.2 ( 806be2 | zip | jar ) / Date: 2018-04-18 / License: Apache-2.0 / Scala version: 2.11
Version: 1.1 ( 02851b | zip | jar ) / Date: 2017-09-28 / License: Apache-2.0 / Scala version: 2.10
Version: 1.0 ( 699ae0 | zip | jar ) / Date: 2017-03-28 / License: Apache-2.0 / Scala version: 2.10