spark-sequoiadb

Spark connector for SequoiaDB

@SequoiaDB

Spark-SequoiaDB is a library that lets users read data from and write data to SequoiaDB collections with Spark SQL.
SequoiaDB is a document-oriented NoSQL database that provides a JSON storage model. Spark is a fast, general-purpose cluster computing system.
The Spark-SequoiaDB library integrates the two, giving users a system that combines the advantages of a schema-less storage model and dynamic indexing with those of a Spark cluster.


Tags

  • sql
  • nosql
  • sequoiadb
  • data source

How to

Include this package in your Spark Applications using:

spark-shell, pyspark, or spark-submit

> $SPARK_HOME/bin/spark-shell --packages SequoiaDB:spark-sequoiadb:1.12-s_2.11

sbt

If you use the sbt-spark-package plugin, in your sbt build file, add:

spDependencies += "SequoiaDB/spark-sequoiadb:1.12-s_2.11"

Otherwise,

resolvers += "Spark Packages Repo" at "https://repos.spark-packages.org/"

libraryDependencies += "SequoiaDB" % "spark-sequoiadb" % "1.12-s_2.11"

Maven

In your pom.xml, add:
<dependencies>
  <!-- list of dependencies -->
  <dependency>
    <groupId>SequoiaDB</groupId>
    <artifactId>spark-sequoiadb</artifactId>
    <version>1.12-s_2.11</version>
  </dependency>
</dependencies>
<repositories>
  <!-- list of other repositories -->
  <repository>
    <id>SparkPackagesRepo</id>
    <url>https://repos.spark-packages.org/</url>
  </repository>
</repositories>
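
The spark-shell, sbt, and Maven forms above all carry the same three values: group, artifact, and version. The following is a minimal Python sketch, not part of the package, that splits a `--packages` coordinate and renders the matching sbt lines. The helper names (`parse_coordinate`, `scala_version`, `sbt_line`, `sp_dependency`) are hypothetical, and reading the Scala version from the `-s_` suffix is an assumption based on how the releases on this page are named.

```python
from typing import NamedTuple


class Coordinate(NamedTuple):
    group: str
    artifact: str
    version: str


def parse_coordinate(text: str) -> Coordinate:
    """Split a 'group:artifact:version' string as passed to --packages."""
    parts = text.split(":")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected group:artifact:version, got {text!r}")
    return Coordinate(*parts)


def scala_version(coord: Coordinate) -> str:
    """Return the text after '-s_' in the version (assumed to be the Scala version)."""
    marker = "-s_"
    if marker not in coord.version:
        raise ValueError(f"no Scala suffix in {coord.version!r}")
    return coord.version.rsplit(marker, 1)[1]


def sbt_line(coord: Coordinate) -> str:
    """Render the plain sbt dependency line shown under 'Otherwise'."""
    return (
        f'libraryDependencies += "{coord.group}" % '
        f'"{coord.artifact}" % "{coord.version}"'
    )


def sp_dependency(coord: Coordinate) -> str:
    """Render the line used with the sbt-spark-package plugin."""
    return f'spDependencies += "{coord.group}/{coord.artifact}:{coord.version}"'


if __name__ == "__main__":
    coord = parse_coordinate("SequoiaDB:spark-sequoiadb:1.12-s_2.11")
    print(sp_dependency(coord))
    print(sbt_line(coord))
    print("Scala version:", scala_version(coord))
```

Passing the other release listed on this page, `SequoiaDB:spark-sequoiadb:1.12-s_2.10`, gives the lines for a Scala 2.10 build; the release chosen should match the Scala version of the Spark installation.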

Releases

Version: 1.12-s_2.11 ( db09c9 | zip | jar ) / Date: 2015-03-30 / License: Apache-2.0 / Scala version: 2.11

Spark Scala/Java API compatibility: - 43% , - 100%

Version: 1.12-s_2.10 ( 604ac5 | zip | jar ) / Date: 2015-03-30 / License: Apache-2.0 / Scala version: 2.10

Spark Scala/Java API compatibility: - 11% , - 100% , - 37% , - 43%