spark-nlp

Natural Language Processing Library for Apache Spark.

John Snow Labs Spark-NLP is a natural language processing library built on top of Apache Spark ML. It provides simple, performant, and accurate NLP annotations for machine learning pipelines that scale easily in a distributed environment.


Tags

  • pyspark
  • NLP
  • machine-learning
  • nlu
  • natural-language-processing
  • natural-language-understanding
  • spell-checker
  • part-of-speech-tagger
  • named-entity-recognition
  • lemmatizer
  • stemmer
  • sentiment-analysis

How to

Include this package in your Spark Applications using:

spark-shell, pyspark, or spark-submit

> $SPARK_HOME/bin/spark-shell --packages JohnSnowLabs:spark-nlp:1.2.0
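
When a PySpark session is created from a script rather than from the command line, the same coordinate can be supplied through Spark's spark.jars.packages setting. The following is a minimal sketch, assuming pyspark is installed and the coordinate can be resolved from the repositories Spark is configured to use. The helper names spark_nlp_conf and start_session and the application name are illustrative and not part of Spark-NLP.

```python
# Coordinate from the command above: groupId:artifactId:version.
SPARK_NLP_COORDINATE = "JohnSnowLabs:spark-nlp:1.2.0"


def spark_nlp_conf(coordinate=SPARK_NLP_COORDINATE):
    """Return the Spark setting that mirrors the --packages flag."""
    # spark.jars.packages takes a comma-separated list of Maven coordinates.
    return {"spark.jars.packages": coordinate}


def start_session(app_name="spark-nlp-example"):
    """Create a SparkSession with the package attached (requires pyspark)."""
    # Imported here so spark_nlp_conf can be used without pyspark installed.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName(app_name)
    for key, value in spark_nlp_conf().items():
        builder = builder.config(key, value)
    return builder.getOrCreate()
```

Calling start_session() returns a session with the package on its classpath. The setting only takes effect if it is applied before the first session is started in the process.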

sbt

If you use the sbt-spark-package plugin, add the following to your sbt build file:

spDependencies += "JohnSnowLabs/spark-nlp:1.2.0"

Otherwise, add:

resolvers += "Spark Packages Repo" at "http://dl.bintray.com/spark-packages/maven"

libraryDependencies += "JohnSnowLabs" % "spark-nlp" % "1.2.0"

Maven

In your pom.xml, add:
<dependencies>
  <!-- list of dependencies -->
  <dependency>
    <groupId>JohnSnowLabs</groupId>
    <artifactId>spark-nlp</artifactId>
    <version>1.2.0</version>
  </dependency>
</dependencies>
<repositories>
  <!-- list of other repositories -->
  <repository>
    <id>SparkPackagesRepo</id>
    <url>http://dl.bintray.com/spark-packages/maven</url>
  </repository>
</repositories>
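
The spark-shell, sbt, and Maven snippets above all carry the same three values: group, artifact, and version. As a rough illustration of that mapping, the sketch below splits a colon-separated coordinate and renders the plain sbt line and the Maven dependency element. The function names are hypothetical helpers written for this example, not part of any build tool.

```python
def parse_coordinate(coordinate):
    """Split 'groupId:artifactId:version' into its three parts."""
    parts = coordinate.split(":")
    if len(parts) != 3 or not all(parts):
        raise ValueError(
            "expected groupId:artifactId:version, got {!r}".format(coordinate)
        )
    return tuple(parts)


def sbt_dependency(coordinate):
    """Render the plain sbt libraryDependencies line for a coordinate."""
    group, artifact, version = parse_coordinate(coordinate)
    return 'libraryDependencies += "{}" % "{}" % "{}"'.format(
        group, artifact, version
    )


def maven_dependency(coordinate):
    """Render the Maven <dependency> element for a coordinate."""
    group, artifact, version = parse_coordinate(coordinate)
    return (
        "<dependency>\n"
        "  <groupId>{}</groupId>\n"
        "  <artifactId>{}</artifactId>\n"
        "  <version>{}</version>\n"
        "</dependency>"
    ).format(group, artifact, version)


print(sbt_dependency("JohnSnowLabs:spark-nlp:1.2.0"))
print(maven_dependency("JohnSnowLabs:spark-nlp:1.2.0"))
```

Changing the version in one place and regenerating both forms keeps the build files in step when moving between releases.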

Releases

Version: 1.2.0 ( e00c3f | zip | jar ) / Date: 2017-10-17 / License: Apache-2.0 / Scala version: 2.11

Version: 1.1.0 ( 05992d | zip | jar ) / Date: 2017-10-11 / License: Apache-2.0 / Scala version: 2.11