spark-generic-connector (homepage)

Generic Connector for Apache Spark

@alvsanand / (1)

This library simplifies connecting an external system to Apache Spark. Its main idea is a core module that handles all interaction with Apache Spark, on top of which a specific connector can be implemented for any system. It can be used in both batch and streaming scenarios. From the start, it has been designed as a _read-only_ connector library, so write operations will not be implemented.
It currently provides the following connectors:
Google services:
- CloudStorageSgcConnector: is able to fetch files from Google Cloud Storage.
- DataTransferSgcConnector: is able to fetch files from DoubleClick Data Transfer.
FTP servers:
- FTPSgcConnector: is able to fetch files from an FTP server.
- FTPSSgcConnector: is able to fetch files from an FTPS server.
- SFTPSgcConnector: is able to fetch files from an SFTP server.


Tags

  • streaming
  • data source
  • Google Cloud
  • FTP

How to

Include this package in your Spark Applications using:

spark-shell, pyspark, or spark-submit

> $SPARK_HOME/bin/spark-shell --packages alvsanand:spark-generic-connector:0.2.0-spark_2x-s_2.11

sbt

If you use the sbt-spark-package plugin, in your sbt build file, add:

spDependencies += "alvsanand/spark-generic-connector:0.2.0-spark_2x-s_2.11"

Otherwise,

resolvers += "Spark Packages Repo" at "https://repos.spark-packages.org/"

libraryDependencies += "alvsanand" % "spark-generic-connector" % "0.2.0-spark_2x-s_2.11"

Maven

In your pom.xml, add:
<dependencies>
  <!-- list of dependencies -->
  <dependency>
    <groupId>alvsanand</groupId>
    <artifactId>spark-generic-connector</artifactId>
    <version>0.2.0-spark_2x-s_2.11</version>
  </dependency>
</dependencies>
<repositories>
  <!-- list of other repositories -->
  <repository>
    <id>SparkPackagesRepo</id>
    <url>https://repos.spark-packages.org/</url>
  </repository>
</repositories>

Releases

Version: 0.2.0-spark_2x-s_2.11 ( 4bbd5e ) / Date: 2017-01-17 / License: Apache-2.0

Version: 0.2.0-spark_2x-s_2.10 ( 4bbd5e ) / Date: 2017-01-17 / License: Apache-2.0

Version: 0.2.0-spark_1x-s_2.11 ( 4bbd5e ) / Date: 2017-01-17 / License: Apache-2.0

Version: 0.2.0-spark_1x-s_2.10 ( 4bbd5e ) / Date: 2017-01-17 / License: Apache-2.0
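
Choosing a release

The four releases listed above differ only in their version suffix. The sketch below shows one way to pick the matching coordinate for a cluster. It assumes that `spark_1x` / `spark_2x` denote the Spark major version and `s_2.10` / `s_2.11` the Scala binary version; that reading is inferred from the naming and is not stated on this page. The function name `package_coordinate` is hypothetical and is not part of the library.

```python
# Hypothetical helper, not part of spark-generic-connector.
# Assumption: "spark_<N>x" is the Spark major version and "s_<X.Y>" is the
# Scala binary version of the build.

GROUP = "alvsanand"
ARTIFACT = "spark-generic-connector"
RELEASE = "0.2.0"

# (Spark major version, Scala binary version) pairs listed under Releases.
PUBLISHED = {(1, "2.10"), (1, "2.11"), (2, "2.10"), (2, "2.11")}


def package_coordinate(spark_major: int, scala_binary: str) -> str:
    """Return the group:artifact:version string for the --packages option."""
    if (spark_major, scala_binary) not in PUBLISHED:
        raise ValueError(
            f"no listed release for Spark {spark_major}.x / Scala {scala_binary}"
        )
    version = f"{RELEASE}-spark_{spark_major}x-s_{scala_binary}"
    return f"{GROUP}:{ARTIFACT}:{version}"


if __name__ == "__main__":
    # Prints the same coordinate used in the spark-shell command on this page.
    print(package_coordinate(2, "2.11"))
```

The returned string can be passed to `--packages` when launching spark-shell, pyspark, or spark-submit.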