spark-corenlp (homepage)
A Stanford CoreNLP wrapper for Apache Spark
@databricks
Spark-CoreNLP wraps the Stanford CoreNLP annotation pipeline as a Transformer under the Spark ML pipeline API. It reads a string column of documents and applies the CoreNLP annotators to each document. The output column contains the resulting CoreNLP annotations.
How to
Include this package in your Spark Applications using:
spark-shell, pyspark, or spark-submit
> $SPARK_HOME/bin/spark-shell --packages databricks:spark-corenlp:0.4.0-spark2.4-scala2.11
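The heading above also names pyspark and spark-submit. All three launchers accept the same `--packages` flag, so the equivalent invocations might look like the sketch below. It assumes `SPARK_HOME` points at a Spark installation; `my_app.py` is a hypothetical placeholder for your own application file, not something this package provides.

```shell
# Same package coordinate as in the spark-shell command above.
PKG="databricks:spark-corenlp:0.4.0-spark2.4-scala2.11"

# Interactive Python shell with the package on the classpath.
"$SPARK_HOME/bin/pyspark" --packages "$PKG"

# Batch submission; my_app.py is a hypothetical application file.
"$SPARK_HOME/bin/spark-submit" --packages "$PKG" my_app.py
```

Keeping the coordinate in one variable makes it easier to change the version in a single place when switching releases.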
sbt
If you use the sbt-spark-package plugin, add the following to your sbt build file:
spDependencies += "databricks/spark-corenlp:0.4.0-spark2.4-scala2.11"
Otherwise,
resolvers += "Spark Packages Repo" at "https://repos.spark-packages.org/"
libraryDependencies += "databricks" % "spark-corenlp" % "0.4.0-spark2.4-scala2.11"
Maven
In your pom.xml, add:
<dependencies>
<!-- list of dependencies -->
<dependency>
<groupId>databricks</groupId>
<artifactId>spark-corenlp</artifactId>
<version>0.4.0-spark2.4-scala2.11</version>
</dependency>
</dependencies>
<repositories>
<!-- list of other repositories -->
<repository>
<id>SparkPackagesRepo</id>
<url>https://repos.spark-packages.org/</url>
</repository>
</repositories>
Releases
Version: 0.4.0-spark2.4-scala2.11 ( 9ed55b | zip | jar ) / Date: 2018-11-16 / License: GPL-3.0 / Scala version: 2.11
Version: 0.3.1-s_2.11 ( cf8202 | zip | jar ) / Date: 2018-08-16 / License: GPL-3.0 / Scala version: 2.11
Version: 0.3.0-s_2.11 ( bd468c | zip | jar ) / Date: 2018-08-15 / License: GPL-3.0 / Scala version: 2.11
Version: 0.2.0-s_2.10 ( 68e907 | zip | jar ) / Date: 2016-08-29 / License: GPL-3.0 / Scala version: 2.10
Version: 0.2.0-s_2.11 ( 68e907 | zip | jar ) / Date: 2016-08-29 / License: GPL-3.0 / Scala version: 2.11
Version: 0.1 ( c7a789 | zip | jar ) / Date: 2016-06-28 / License: GPL-3.0 / Scala version: 2.10