neo4j-connector-apache-spark_2.11

Officially supported, Apache 2 licensed Neo4j Connector for Apache Spark.

The Neo4j Connector for Apache Spark is intended to make integrating graphs with Spark easy. There are effectively two ways of using the connector:

As a data source: read any set of nodes or relationships as a DataFrame in Spark.
As a sink: write any DataFrame to Neo4j as a collection of nodes or relationships, or, alternatively, use a Cypher statement to process the records of a DataFrame into the graph pattern of your choice. A short Scala sketch of both directions follows this list.
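
For example, here is a minimal Scala sketch of reading and writing with the connector; the connection URL, credentials, and the :Person label are placeholder assumptions for illustration, not part of this package:

import org.apache.spark.sql.{SaveMode, SparkSession}

val spark = SparkSession.builder().getOrCreate()

// As a data source: read every node labeled :Person into a DataFrame
val persons = spark.read
  .format("org.neo4j.spark.DataSource")
  .option("url", "bolt://localhost:7687")
  .option("authentication.basic.username", "neo4j")
  .option("authentication.basic.password", "password")
  .option("labels", "Person")
  .load()

// As a sink: write the DataFrame back as :Person nodes (CREATE semantics with Append).
// Alternatively, replace the "labels" option with a Cypher statement, e.g.
// .option("query", "MERGE (p:Person {name: event.name})"), where each row is bound to `event`.
persons.write
  .format("org.neo4j.spark.DataSource")
  .mode(SaveMode.Append)
  .option("url", "bolt://localhost:7687")
  .option("authentication.basic.username", "neo4j")
  .option("authentication.basic.password", "password")
  .option("labels", ":Person")
  .save()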

Because the connector is based on the new Spark DataSource API, other Spark interpreters for languages such as Python and R work as well.
The API remains the same, and mostly only slight syntax changes are necessary to accommodate the differences between (for example) Python and Scala.


How to

Include this package in your Spark applications using:

spark-shell, pyspark, or spark-submit

> $SPARK_HOME/bin/spark-shell --packages neo4j-contrib:neo4j-connector-apache-spark_2.11:4.0.1
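
The same --packages coordinates work with the other launchers; for example (your-app.jar below is a placeholder for your application jar):

> $SPARK_HOME/bin/pyspark --packages neo4j-contrib:neo4j-connector-apache-spark_2.11:4.0.1

> $SPARK_HOME/bin/spark-submit --packages neo4j-contrib:neo4j-connector-apache-spark_2.11:4.0.1 your-app.jar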

sbt

If you use the sbt-spark-package plugin, add the following to your sbt build file:

spDependencies += "neo4j-contrib/neo4j-connector-apache-spark_2.11:4.0.1"

Otherwise, add the following to your sbt build file:

resolvers += "Spark Packages Repo" at "https://repos.spark-packages.org/"

libraryDependencies += "neo4j-contrib" % "neo4j-connector-apache-spark_2.11" % "4.0.1"

Maven

In your pom.xml, add:
<dependencies>
  <!-- list of dependencies -->
  <dependency>
    <groupId>neo4j-contrib</groupId>
    <artifactId>neo4j-connector-apache-spark_2.11</artifactId>
    <version>4.0.1</version>
  </dependency>
</dependencies>
<repositories>
  <!-- list of other repositories -->
  <repository>
    <id>SparkPackagesRepo</id>
    <url>https://repos.spark-packages.org/</url>
  </repository>
</repositories>

Releases

Version: 4.0.1 ( 9f4261 | zip | jar ) / Date: 2021-04-12 / License: Apache-2.0

Version: 4.0.0 ( 0c36c4 | zip | jar ) / Date: 2020-11-10 / License: Apache-2.0