spark-stochastic-outlier-selection

An implementation of Stochastic Outlier Selection (SOS), an unsupervised outlier selection algorithm.

@Fokko

Stochastic Outlier Selection (SOS) computes an outlier probability for each data point, based on the concept of affinity. The Spark implementation is horizontally scalable and can be used, for example, to detect faulty sensors in streams of data.
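
To make the affinity idea concrete, here is a minimal single-machine sketch in plain Scala. It is not this package's API: the object name SosSketch and the method outlierProbabilities are hypothetical, and it assumes one fixed variance for every point, whereas SOS as published tunes a per-point variance to match a chosen perplexity. Affinities are turned into binding probabilities by normalising each row, and a point's outlier probability is the product, over all other points, of the probability that they do not bind to it.

```scala
object SosSketch {
  // Squared Euclidean distance between two points of equal dimension.
  def sqDist(a: Array[Double], b: Array[Double]): Double =
    a.zip(b).map { case (x, y) => (x - y) * (x - y) }.sum

  // Simplified SOS: `variance` is a single fixed value (an assumption of this
  // sketch), not the perplexity-derived per-point variance of the full algorithm.
  def outlierProbabilities(points: Array[Array[Double]], variance: Double): Array[Double] = {
    val n = points.length

    // Affinity of point i towards point j; a point has no affinity with itself.
    val affinity = Array.tabulate(n, n) { (i, j) =>
      if (i == j) 0.0
      else math.exp(-sqDist(points(i), points(j)) / (2.0 * variance))
    }

    // Binding probabilities: each row of the affinity matrix normalised to sum to 1.
    // (If every affinity in a row underflows to 0, the row sum is 0; this sketch
    // does not guard against that case.)
    val binding = affinity.map { row =>
      val total = row.sum
      row.map(_ / total)
    }

    // Outlier probability of i: the product over all j != i of (1 - binding(j)(i)).
    Array.tabulate(n) { i =>
      (0 until n).filter(_ != i).map(j => 1.0 - binding(j)(i)).product
    }
  }
}
```

Calling SosSketch.outlierProbabilities on four points at the corners of a unit square plus one far-away point, with variance 1.0, gives the far-away point a probability close to 1, because none of the clustered points binds to it.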


Tags

  • outlier detection

How to

Include this package in your Spark Applications using:

spark-shell, pyspark, or spark-submit

> $SPARK_HOME/bin/spark-shell --packages frl.driesprong:spark-stochastic-outlier-selection_2.11:0.1.0

sbt

In your sbt build file, add:

libraryDependencies += "frl.driesprong" % "spark-stochastic-outlier-selection_2.11" % "0.1.0"
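
For context, a minimal build.sbt around that line might look like the sketch below. The project name and the exact Scala patch version are assumptions for illustration; only the dependency coordinates come from this page. With scalaVersion set to a 2.11.x release, the %% operator appends _2.11 to the artifact name, so it resolves to the same artifact as the explicit form above.

```scala
// build.sbt (sketch; the name and Scala patch version are assumptions)
name := "sos-example"

scalaVersion := "2.11.12"

// %% appends the Scala binary version (_2.11) to the artifact name, giving
// spark-stochastic-outlier-selection_2.11, the same artifact as above.
libraryDependencies += "frl.driesprong" %% "spark-stochastic-outlier-selection" % "0.1.0"
```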

Maven

In your pom.xml, add:
<dependencies>
  <!-- list of dependencies -->
  <dependency>
    <groupId>frl.driesprong</groupId>
    <artifactId>spark-stochastic-outlier-selection_2.11</artifactId>
    <version>0.1.0</version>
  </dependency>
</dependencies>

Releases

Version: 0.1.0 (118950) / Date: 2015-09-11 / License: Apache-2.0 / Scala version: 2.11

Spark Scala/Java API compatibility: - 50% , - 100% , - 100% , - 100%