neo4j-spark-connector (homepage)

Neo4j Connector for Apache Spark, which provides bidirectional read/write access to Neo4j from Spark using the Spark DataSource API


The Neo4j Connector for Apache Spark is intended to make it easy to integrate graph data with Apache Spark. There are effectively two ways of using the connector:

- As a data source: read any set of nodes or relationships as a DataFrame in Spark
- As a sink: write any DataFrame to Neo4j as a collection of nodes or relationships, or, alternatively, use a Cypher statement to turn the records of a DataFrame into the graph pattern of your choice. Sketches of both follow below.
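
For example, reading from Neo4j as a source might look like the following PySpark sketch. This is only an illustration: the connection URL, credentials, label, and Cypher query are placeholders (not values from this page), and `spark` is assumed to be a SparkSession with the connector on its classpath, e.g. a pyspark shell started with the --packages flag shown under "How to" below.

# Read every node with the (placeholder) label :Person as a DataFrame.
people = (
    spark.read.format("org.neo4j.spark.DataSource")
    .option("url", "neo4j://localhost:7687")              # placeholder URL
    .option("authentication.basic.username", "neo4j")     # placeholder credentials
    .option("authentication.basic.password", "password")
    .option("labels", "Person")
    .load()
)

# Or read the result of an arbitrary Cypher query as a DataFrame.
knows = (
    spark.read.format("org.neo4j.spark.DataSource")
    .option("url", "neo4j://localhost:7687")
    .option("query",
            "MATCH (a:Person)-[:KNOWS]->(b:Person) RETURN a.name AS src, b.name AS dst")
    .load()
)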

Because the connector is based on the Spark DataSource API, it also works from other Spark language bindings such as Python and R.

The API remains the same, and usually only slight syntax changes are necessary to accommodate the differences between (for example) Python and Scala.
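
As an illustration of that point, the sink side in PySpark differs from the Scala equivalent only in builder syntax. Here is a minimal sketch under the same assumptions as above; the label, key property, and Cypher statement are illustrative placeholders, and `df` stands for any DataFrame with a `name` column.

# Write the DataFrame back as (:Person) nodes, matching existing nodes on `name`.
(
    df.write.format("org.neo4j.spark.DataSource")
    .mode("Overwrite")
    .option("url", "neo4j://localhost:7687")   # placeholder URL
    .option("labels", ":Person")
    .option("node.keys", "name")               # property used to merge onto existing nodes
    .save()
)

# Or hand each row to a Cypher statement; the connector exposes the current row as `event`.
(
    df.write.format("org.neo4j.spark.DataSource")
    .mode("Append")
    .option("url", "neo4j://localhost:7687")
    .option("query", "MERGE (p:Person {name: event.name})")
    .save()
)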




How to

Include this package in your Spark applications using one of the following:

spark-shell, pyspark, or spark-submit

> $SPARK_HOME/bin/spark-shell --packages neo4j:neo4j-spark-connector:5.3.2-s_2.13

sbt

If you use the sbt-spark-package plugin, add the following to your sbt build file:

spDependencies += "neo4j/neo4j-spark-connector:5.3.2-s_2.13"

Otherwise, add the following to your build.sbt:

resolvers += "Spark Packages Repo" at "https://repos.spark-packages.org/"

libraryDependencies += "neo4j" % "neo4j-spark-connector" % "5.3.2-s_2.13"

Maven

In your pom.xml, add:
<dependencies>
  <!-- list of dependencies -->
  <dependency>
    <groupId>neo4j</groupId>
    <artifactId>neo4j-spark-connector</artifactId>
    <version>5.3.2-s_2.13</version>
  </dependency>
</dependencies>
<repositories>
  <!-- list of other repositories -->
  <repository>
    <id>SparkPackagesRepo</id>
    <url>https://repos.spark-packages.org/</url>
  </repository>
</repositories>

Releases

Version: 5.3.2-s_2.13 ( d9acd7 | zip | jar ) / Date: 2024-09-25 / License: Apache-2.0

Version: 5.3.2-s_2.12 ( d9acd7 | zip | jar ) / Date: 2024-09-25 / License: Apache-2.0