spark-stringmetric

Spark functions to run popular phonetic and string matching algorithms

@MrPowers

Includes similarity metrics such as Dice/Sørensen, Hamming, Jaro, and Jaccard, as well as phonetic algorithms such as Metaphone, NYSIIS, and Soundex.
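
To give a feel for what two of these similarity metrics compute, here is a minimal plain-Python sketch of Hamming distance and Jaccard similarity. It is an illustration only and does not use this package: the function names `hamming` and `jaccard` are hypothetical, and treating each string as a set of characters for Jaccard is an assumption made here, not something this page specifies.

```python
def hamming(a: str, b: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs strings of equal length")
    return sum(x != y for x, y in zip(a, b))


def jaccard(a: str, b: str) -> float:
    """Jaccard similarity over the sets of characters in each string.

    Assumption: character-level sets; other tokenizations are possible.
    """
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 1.0  # convention chosen here for two empty strings
    return len(set_a & set_b) / len(set_a | set_b)


print(hamming("karolin", "kathrin"))  # 3
print(jaccard("night", "nacht"))      # 3 shared characters out of 7 distinct
```

In a Spark job the same idea is applied column-wise, so each row's pair of strings yields one score.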


How to

Include this package in your Spark Applications using:

spark-shell, pyspark, or spark-submit

> $SPARK_HOME/bin/spark-shell --packages MrPowers:spark-stringmetric:0.2.0

sbt

If you use the sbt-spark-package plugin, in your sbt build file, add:

spDependencies += "MrPowers/spark-stringmetric:0.2.0"

Otherwise, add the resolver and the dependency directly:

resolvers += "Spark Packages Repo" at "https://repos.spark-packages.org/"

libraryDependencies += "MrPowers" % "spark-stringmetric" % "0.2.0"

Maven

In your pom.xml, add:
<dependencies>
  <!-- list of dependencies -->
  <dependency>
    <groupId>MrPowers</groupId>
    <artifactId>spark-stringmetric</artifactId>
    <version>0.2.0</version>
  </dependency>
</dependencies>
<repositories>
  <!-- list of other repositories -->
  <repository>
    <id>SparkPackagesRepo</id>
    <url>https://repos.spark-packages.org/</url>
  </repository>
</repositories>

Releases

Version: 0.2.0 ( bf5419 | zip | jar ) / Date: 2019-01-27 / License: Apache-2.0 / Scala version: 2.11

Version: 2.2.0_0.1.0 ( 9dfae1 | zip | jar ) / Date: 2017-09-12 / License: Apache-2.0 / Scala version: 2.11