Affinity Propagation on Spark
@viirya / (0)
Affinity Propagation (AP), a graph clustering algorithm based on the concept of "message passing" between data points. Unlike clustering algorithms such as k-means or k-medoids, AP does not require the number of clusters to be determined or estimated before running it. AP is developed by Frey and Dueck. Please refer to the paper.
Affinity Propagation on Spark implements Affinity Propagation algorithm on cluster computing system Spark. By leveraging computing cluster, you can run this clustering algorithm on large-scale data sets.
 Brendan J. Frey; Delbert Dueck (2007). "Clustering by passing messages between data points". Science. 315 (5814): 972-976.
Include this package in your Spark Applications using:
spark-shell, pyspark, or spark-submit
> $SPARK_HOME/bin/spark-shell --packages viirya:SparkAffinityPropagation:1.0
If you use the sbt-spark-package plugin, in your sbt build file, add:
spDependencies += "viirya/SparkAffinityPropagation:1.0"
resolvers += "Spark Packages Repo" at "http://dl.bintray.com/spark-packages/maven" libraryDependencies += "viirya" % "SparkAffinityPropagation" % "1.0"
MavenIn your pom.xml, add:
<dependencies> <!-- list of dependencies --> <dependency> <groupId>viirya</groupId> <artifactId>SparkAffinityPropagation</artifactId> <version>1.0</version> </dependency> </dependencies> <repositories> <!-- list of other repositories --> <repository> <id>SparkPackagesRepo</id> <url>http://dl.bintray.com/spark-packages/maven</url> </repository> </repositories>