A community index of third-party packages for Apache Spark.
Showing packages 1-50 of 62 for search 'tags:"Data Sources"'
spark-avro
Integration utilities for using Spark with Apache Avro data
@databricks / Latest release: 4.0.0-s_2.11 (2017-10-30) / Apache-2.0 / (13)
spark-redshift
Redshift Data Source for Apache Spark
@databricks / Latest release: 3.0.0-preview1 (2016-11-01) / Apache-2.0 / (3)
spark-csv
Spark SQL CSV data source
@databricks / Latest release: 1.5.0-s_2.11 (2016-09-07) / Apache-2.0 / (10)
deep-spark
Connecting Apache Spark with different data stores
@Stratio / Latest release: 0.7.0-RC1 (2015-01-14) / Apache-2.0 / (20)
pyspark-csv
An external PySpark module that works like R's read.csv or pandas' read_csv, with automatic type inference and null value handling. Parses CSV data into a SchemaRDD. No installation required; simply include pyspark_csv.py via SparkContext.
@seahboonsiew / No release yet / (1)
spark-mongodb
MongoDB data source for Spark SQL
@Stratio / Latest release: 0.12.0 (2016-08-31) / Apache-2.0 / (14)
pyspark-cassandra
PySpark Cassandra brings back the fun in working with Cassandra data in PySpark.
@TargetHolding / Latest release: 0.3.5 (2016-03-30) / Apache-2.0 / (1)
spark-cassandra-connector
Connects Spark to Cassandra
@datastax / Latest release: 2.4.0-s_2.11 (2018-11-29) / Apache-2.0 / (14)
spark-power-bi
Power BI API adapter for Apache Spark
@granturing / Latest release: 1.5.0_0.0.7 (2015-09-13) / Apache-2.0 / (0)
spark-sequoiadb
Spark connector for SequoiaDB
@SequoiaDB / Latest release: 1.12-s_2.11 (2015-03-30) / Apache-2.0 / (2)
couchbase-spark-connector
Deprecated, please see couchbase/couchbase-spark-connector
@couchbaselabs / Latest release: 1.0.0 (2015-10-20) / Apache-2.0 / (1)
spark-sas7bdat
Splittable SAS (.sas7bdat) Input Format for Hadoop and Spark SQL
@saurfang / Latest release: 3.0.0-s_2.12 (2020-09-13) / Apache-2.0 / (1)
spark-solr
Tools for reading data from Solr as a Spark RDD and indexing objects from Spark into Solr using SolrJ.
@LucidWorks / Latest release: 2.0.1 (2016-06-09) / Apache-2.0 / (1)
pyspark-elastic
PySpark support for Elasticsearch
@TargetHolding / Latest release: 0.4.2 (2016-03-22) / Apache-2.0 / (1)
DDF
Distributed DataFrame: Productivity = Power x Simplicity For Scientists & Engineers, on any Data/Compute Engine
@ddf-project / No release yet / (11)
elasticsearch-hadoop
Official integration between Apache Spark and Elasticsearch real-time search and analytics
@elastic / Latest release: 5.3.1 (2017-04-21) / Apache-2.0 / (3)
magellan
Geo Spatial Data Analytics on Spark
@harsha2010 / Latest release: 1.0.5-s_2.11 (2017-08-14) / Apache-2.0 / (1)
spark-streaming-gnip
An Apache Spark utility for pulling Tweets from Gnip's PowerTrack in realtime
@knoldus / No release yet / (1)
spark-salesforce
Spark Salesforce Wave Connector
@springml / Latest release: 1.2.0 (2018-04-25) / Apache-2.0 / (2)
infinispan-spark
Infinispan Spark Connector
@infinispan / Latest release: 0.9 (2018-11-05) / Apache-2.0 / (0)
sparksql-protobuf
Read SparkSQL parquet file as RDD[Protobuf]
@saurfang / Latest release: 0.1.2-s_2.10 (2015-08-18) / Apache-2.0 / (0)
aliyun-spark-sdk
Spark on Aliyun, supporting interactions with Aliyun's base services.
@aliyun / No release yet / (1)
spark-mainframe-connector
Spark mainframe connector
@Syncsort / Latest release: 1.0.0 (2015-09-01) / Apache-2.0 / (0)
pipeline
Docker-based, End-to-End, Real-time, Advanced Analytics Big Data Reference Pipeline using Spark, Spark SQL, Spark Streaming, ML, MLlib, GraphX, Kafka, Cassandra, Redis, Apache Zeppelin, Spark-Notebook, IPython/Jupyter Notebook, Tableau, H2O Flow, Tachyon,
@fluxcapacitor / No release yet / (3)
spark-xml
XML data source for Spark SQL and DataFrames
@HyukjinKwon / Latest release: 0.1.1-s_2.10 (2015-11-19) / Apache-2.0 / (1)
cookie-datasets
Popular ML Datasets for Spark ML (MNIST, IRIS, CIFAR)
@cookieai / Latest release: 0.1.0 (2015-12-22) / Apache-2.0 / (0)
spark-google-spreadsheets
Google Spreadsheets datasource for SparkSQL and DataFrames
@potix2 / Latest release: 0.6.3-s_2.11 (2019-08-21) / Apache-2.0 / (1)
spark-iqmulus
Spark Package to read and write PLY, LAS and XYZ lidar point clouds using Spark SQL.
@IGNF / Latest release: 0.1.0-s_2.10 (2015-12-08) / Apache-2.0 / (0)
spark-ryft-connector
Spark connector for Ryft ONE
@getryft / Latest release: 0.9.0 (2017-04-04) / other license / (1)
spark-sftp
Spark connector for SFTP
@springml / Latest release: 1.1.3 (2018-10-01) / Apache-2.0 / (2)
spark-sparql-connector
Data source for querying SPARQL endpoints
@USU-Research / Latest release: 1.0.0-beta1-s_2.10 (2016-01-27) / Apache-2.0 / (0)
spark-netflow
NetFlow data source for Spark SQL and DataFrames
@sadikovi / Latest release: 2.1.0-s_2.12 (2020-12-24) / Apache-2.0 / (2)
couchbase-spark-connector
The Official Couchbase Spark Connector
@couchbase / Latest release: 2.2.0 (2017-09-20) / Apache-2.0 / (2)
snappydata
SnappyData: OLTP + OLAP Database built on Apache Spark
@SnappyDataInc / Latest release: 1.2.0-s_2.11 (2020-02-07) / Apache-2.0 / (4)
spark-hazelcast-connector
Connects Spark to Hazelcast
@erenavsarogullari / Latest release: 1.0.0-s_2.11 (2016-03-07) / Apache-2.0 / (0)
spark-riak-connector
The official Riak Spark Connector for Apache Spark with Riak TS and Riak KV
@basho / Latest release: 1.6.3 (2017-03-17) / Apache-2.0 / (2)
spark-bigquery
Google BigQuery support for Spark, SQL, and DataFrames
@spotify / Latest release: 0.2.2-s_2.10 (2017-11-29) / Apache-2.0 / (3)
neo4j-spark-connector
Officially supported, Apache 2 licensed Neo4j Connector for Apache Spark.
@neo4j-contrib / Latest release: 5.3.1-s_2.13 (2024-07-08) / Apache-2.0 / (2)
Datasource-Receiver
Spark Receiver for SQL or NoSQL Databases like Cassandra, MongoDB, Elasticsearch or JDBC
@Stratio / Latest release: 0.1.0 (2016-06-30) / Apache-2.0 / (1)
spark-kafka-writer
Write your RDDs and DStreams to Kafka seamlessly
@BenFradet / Latest release: 0.4.0 (2017-07-22) / Apache-2.0 / (0)
spark-github-pr
Spark SQL datasource for GitHub PR API
@lightcopy / Latest release: 1.3.0-s_2.10 (2016-12-25) / Apache-2.0 / (0)
spark-hadoopcryptoledger-ds
A Spark datasource for the HadoopCryptoLedger library
@ZuInnoTe / Latest release: 1.3.2-s_2.12 (2021-12-24) / Apache-2.0 / (1)
spark-hadoopoffice-ds
A Spark datasource for the HadoopOffice library
@ZuInnoTe / Latest release: 1.7.0-s_2.13 (2022-10-29) / Apache-2.0 / (1)
spark-generic-connector
Generic Connector for Apache Spark
@alvsanand / Latest release: 0.2.0-spark_2x-s_2.11 (2017-01-17) / Apache-2.0 / (1)
spark-tensorflow-connector
Spark Tensorflow Connector
@tapanalyticstoolkit / Latest release: 1.0.0-s_2.11 (2017-02-21) / Apache-2.0 / (3)