bisecting-kmeans

This is a prototype implementation of Bisecting K-Means Clustering on Spark.

Bisecting K-Means combines ideas from K-Means and hierarchical clustering.
I am working on adding this algorithm to Spark, but it will not be merged until Spark 1.6 or later, so I am publishing it here as a prototype implementation in the meantime.
Feedback is welcome. Please post issues to my GitHub repository (https://github.com/yu-iskw/bisecting-kmeans/issues).
See also: https://github.com/apache/spark/pull/5267
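
To make the "K-Means plus hierarchical clustering" idea concrete, here is a minimal single-machine sketch of the general bisecting technique in plain Python. It is not this package's implementation and does not use its API or Spark. The function names (`bisecting_kmeans`, `two_means`), the fixed iteration count, and the rule of always splitting the cluster with the largest sum of squared errors are assumptions chosen for illustration.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))


def sse(points):
    """Sum of squared distances from each point to the cluster mean."""
    center = mean(points)
    return sum(squared_distance(p, center) for p in points)


def two_means(points, rng, iterations=20):
    """Split one cluster in two with plain K-Means (k = 2)."""
    centers = rng.sample(points, 2)
    left, right = list(points), []
    for _ in range(iterations):
        left, right = [], []
        for p in points:
            # Ties go to the first center.
            if squared_distance(p, centers[0]) <= squared_distance(p, centers[1]):
                left.append(p)
            else:
                right.append(p)
        if not left or not right:
            break  # degenerate split, e.g. both sampled centers coincide
        centers = [mean(left), mean(right)]
    return left, right


def bisecting_kmeans(points, k, seed=0):
    """Start with one cluster and bisect until there are k clusters.

    Assumption for this sketch: the cluster with the largest SSE is
    split next. May return fewer than k clusters if a split fails.
    """
    rng = random.Random(seed)
    clusters = [list(points)]
    while len(clusters) < k:
        splittable = [i for i, c in enumerate(clusters) if len(c) >= 2]
        if not splittable:
            break
        target = max(splittable, key=lambda i: sse(clusters[i]))
        left, right = two_means(clusters[target], rng)
        if not left or not right:
            break
        # Replace the parent cluster with its two children.
        clusters[target:target + 1] = [left, right]
    return clusters


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    for cluster in bisecting_kmeans(data, k=2):
        print(sorted(cluster))
```

Each bisection replaces a parent cluster with two children, so the sequence of splits forms a binary tree over the data. That tree is the hierarchical part, and each individual split is an ordinary K-Means run with k = 2.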

How to

Include this package in your Spark Applications using:

spark-shell, pyspark, or spark-submit

> $SPARK_HOME/bin/spark-shell --packages yu-iskw:bisecting-kmeans:0.0.1

sbt

If you use the sbt-spark-package plugin, add the following to your sbt build file:

spDependencies += "yu-iskw/bisecting-kmeans:0.0.1"

Otherwise, add the resolver and the dependency directly:

resolvers += "Spark Packages Repo" at "https://repos.spark-packages.org/"

libraryDependencies += "yu-iskw" % "bisecting-kmeans" % "0.0.1"

Maven

In your pom.xml, add:

<dependencies>
  <!-- list of dependencies -->
  <dependency>
    <groupId>yu-iskw</groupId>
    <artifactId>bisecting-kmeans</artifactId>
    <version>0.0.1</version>
  </dependency>
</dependencies>
<repositories>
  <!-- list of other repositories -->
  <repository>
    <id>SparkPackagesRepo</id>
    <url>https://repos.spark-packages.org/</url>
  </repository>
</repositories>

Releases

Version: 0.1.1 ( 58c75a | zip ) / Date: 2015-08-28 / License: Apache-2.0

Version: 0.1 ( 9f9148 | zip ) / Date: 2015-08-28 / License: Apache-2.0

Version: 0.0.1 ( a853b7 | zip | jar ) / Date: 2015-08-27 / License: Apache-2.0