SparkAffinityPropagation

Affinity Propagation on Spark

@viirya

Affinity Propagation (AP) is a graph clustering algorithm based on the concept of "message passing" between data points. Unlike clustering algorithms such as k-means or k-medoids, AP does not require the number of clusters to be determined or estimated before it runs. AP was developed by Frey and Dueck; see their paper [1] for details.

Affinity Propagation on Spark implements the Affinity Propagation algorithm on the Spark cluster computing system. By distributing the work across a computing cluster, it lets you run this clustering algorithm on large-scale data sets.

[1] Brendan J. Frey; Delbert Dueck (2007). "Clustering by passing messages between data points". Science. 315 (5814): 972-976.
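
The message passing described above can be sketched in plain Scala. This is a minimal single-machine illustration of the responsibility and availability updates from [1] on a dense similarity matrix. It is not this package's Spark implementation or API. The object name AffinityPropagationSketch, the method cluster, the damping factor of 0.5, the iteration count, and the preference value of -64 in the demo are all assumptions made for illustration. The sketch expects at least two points.

```scala
object AffinityPropagationSketch {

  /** Damped affinity propagation on a dense similarity matrix.
    * s(i)(k) is the similarity of point i to candidate exemplar k;
    * the diagonal s(k)(k) holds the "preference" of point k.
    * Returns, for each point, the index of the exemplar it selects.
    */
  def cluster(s: Array[Array[Double]],
              iterations: Int = 200,
              damping: Double = 0.5): Array[Int] = {
    val n = s.length
    val r = Array.ofDim[Double](n, n) // responsibilities r(i)(k)
    val a = Array.ofDim[Double](n, n) // availabilities a(i)(k)

    for (_ <- 0 until iterations) {
      // Responsibility: r(i,k) = s(i,k) - max over k' != k of (a(i,k') + s(i,k')).
      for (i <- 0 until n) {
        // Track the largest and second-largest value of a(i,k) + s(i,k).
        var best = Double.NegativeInfinity
        var second = Double.NegativeInfinity
        var bestK = -1
        for (k <- 0 until n) {
          val v = a(i)(k) + s(i)(k)
          if (v > best) { second = best; best = v; bestK = k }
          else if (v > second) second = v
        }
        for (k <- 0 until n) {
          val competitor = if (k == bestK) second else best
          r(i)(k) = damping * r(i)(k) + (1 - damping) * (s(i)(k) - competitor)
        }
      }

      // Availability:
      //   a(k,k) = sum over i' != k of max(0, r(i',k))
      //   a(i,k) = min(0, r(k,k) + sum over i' not in {i,k} of max(0, r(i',k)))
      for (k <- 0 until n) {
        var positiveSum = 0.0
        for (i <- 0 until n if i != k) positiveSum += math.max(0.0, r(i)(k))
        for (i <- 0 until n) {
          val updated =
            if (i == k) positiveSum
            else math.min(0.0, r(k)(k) + positiveSum - math.max(0.0, r(i)(k)))
          a(i)(k) = damping * a(i)(k) + (1 - damping) * updated
        }
      }
    }

    // Each point picks the candidate k that maximizes a(i,k) + r(i,k).
    Array.tabulate(n)(i => (0 until n).maxBy(k => a(i)(k) + r(i)(k)))
  }

  def main(args: Array[String]): Unit = {
    // Two well-separated groups of 1-D points (illustrative data).
    val points = Array(1.0, 2.0, 3.0, 10.0, 11.0, 12.0)
    val preference = -64.0 // assumed value; lower preferences yield fewer clusters
    // Similarity is the negative squared distance, with the preference on the diagonal.
    val s = Array.tabulate(points.length, points.length) { (i, k) =>
      if (i == k) preference else -math.pow(points(i) - points(k), 2)
    }
    println(cluster(s).mkString(", "))
  }
}
```

Note that the number of clusters is never passed in. It emerges from the preference values on the diagonal of the similarity matrix, which is why AP does not need it fixed in advance.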


  • machine learning
  • clustering
  • affinity propagation

How to

Include this package in your Spark Applications using:

spark-shell, pyspark, or spark-submit

> $SPARK_HOME/bin/spark-shell --packages viirya:SparkAffinityPropagation:1.0


If you use the sbt-spark-package plugin, in your sbt build file, add:

spDependencies += "viirya/SparkAffinityPropagation:1.0"


Otherwise, add:

resolvers += "Spark Packages Repo" at ""

libraryDependencies += "viirya" % "SparkAffinityPropagation" % "1.0"


In your pom.xml, add:
  <!-- list of dependencies -->
  <!-- list of other repositories -->


Version: 1.0 ( 290dde | zip | jar ) / Date: 2017-07-29 / License: MIT / Scala version: 2.10