A community index of third-party packages for Apache Spark.

Showing packages 1 - 50 out of 68 for search "tags:SQL"

Integration utilities for using Spark with Apache Avro data

@databricks / Latest release: 4.0.0-s_2.11 (2017-10-30) / Apache-2.0 / (13)

  • 6|sql
  • 4|input
  • 4|avro


Redshift Data Source for Apache Spark

@databricks / Latest release: 3.0.0-preview1 (2016-11-01) / Apache-2.0 / (3)

  • 2|sql
  • 2|data source
  • 2|redshift


Spark SQL CSV data source

@databricks / Latest release: 1.5.0-s_2.11 (2016-09-07) / Apache-2.0 / (10)

  • 4|csv
  • 3|sql
  • 2|DataSource


Spark SQL DBF Library

@mraad / No release yet / (0)

  • 1|sql


Connecting Apache Spark with different data stores

@Stratio / Latest release: 0.7.0-RC1 (2015-01-14) / Apache-2.0 / (20)

  • 6|database
  • 6|mongo
  • 6|cassandra


An external PySpark module that works like R's read.csv or pandas' read_csv, with automatic type inference and null value handling. Parses CSV data into a SchemaRDD. No installation required; simply include pyspark_csv.py via SparkContext.

@seahboonsiew / No release yet / (1)

  • 2|python
  • 2|csv
  • 1|sql


MongoDB data source for Spark SQL

@Stratio / Latest release: 0.12.0 (2016-08-31) / Apache-2.0 / (14)

  • 5|MongoDB
  • 5|Spark SQL
  • 2|sql


PySpark Cassandra brings back the fun in working with Cassandra data in PySpark.

@TargetHolding / Latest release: 0.3.5 (2016-03-30) / Apache-2.0 / (1)

  • 1|python
  • 1|spark
  • 1|sql


Connects Spark to Cassandra

@datastax / Latest release: 2.4.0-s_2.11 (2018-11-29) / Apache-2.0 / (14)

  • 3|spark
  • 3|cassandra
  • 2|nosql


Power BI API adapter for Apache Spark

@granturing / Latest release: 1.5.0_0.0.7 (2015-09-13) / Apache-2.0 / (0)

  • 2|streaming
  • 1|sql
  • 1|realtime


Spark connector for SequoiaDB

@SequoiaDB / Latest release: 1.12-s_2.11 (2015-03-30) / Apache-2.0 / (2)

  • 2|sequoiadb
  • 2|nosql
  • 2|sql


Spark SQL IBM Cloudant External Datasource

@cloudant / No release yet / (1)

  • 1|data source
  • 1|sql


Deprecated, please see couchbase/couchbase-spark-connector

@couchbaselabs / Latest release: 1.0.0 (2015-10-20) / Apache-2.0 / (1)

  • 1|streaming
  • 1|library
  • 1|sql


Splittable SAS (.sas7bdat) Input Format for Hadoop and Spark SQL

@saurfang / Latest release: 3.0.0-s_2.12 (2020-09-13) / Apache-2.0 / (1)

  • 1|sas
  • 1|tools
  • 1|sql


PySpark support for Elasticsearch

@TargetHolding / Latest release: 0.4.2 (2016-03-22) / Apache-2.0 / (1)

  • 1|python
  • 1|spark
  • 1|database


Manipulate Apache Spark Streaming with SQL

@Intel-bigdata / No release yet / (1)

  • 1|streaming
  • 1|sql


A library for exposing dateTime functions from the Joda library as SQL functions, with a DSL for building dateTime Catalyst expressions.

@SparklineData / Latest release: 0.0.2 (2015-10-29) / Apache-2.0 / (1)

  • 1|spark
  • 1|sql
  • 1|dateTime


A Hivemall wrapper for Spark

@maropu / Latest release: 0.0.6 (2016-04-07) / Apache-2.0 / (0)

  • 1|sql
  • 1|hive
  • 1|machine learning


Official integration between Apache Spark and Elasticsearch real-time search and analytics

@elastic / Latest release: 5.3.1 (2017-04-21) / Apache-2.0 / (3)

  • 1|search
  • 1|elasticsearch
  • 1|sql


Geospatial Data Analytics on Spark

@harsha2010 / Latest release: 1.0.5-s_2.11 (2017-08-14) / Apache-2.0 / (1)

  • 2|geospatial
  • 2|data source
  • 2|sql


Scala library for converting Spark rows to case classes

@ypg-data / Latest release: 0.2.0-s_2.11 (2016-03-01) / Apache-2.0 / (0)

  • 1|sql
  • 1|library
  • 1|scala


Infinispan Spark Connector

@infinispan / Latest release: 0.9 (2018-11-05) / Apache-2.0 / (0)

  • 1|streaming
  • 1|sql
  • 1|scala


Read SparkSQL parquet file as RDD[Protobuf]

@saurfang / Latest release: 0.1.2-s_2.10 (2015-08-18) / Apache-2.0 / (0)

  • 1|data source
  • 1|protobuf
  • 1|sql


Spark mainframe connector

@Syncsort / Latest release: 1.0.0 (2015-09-01) / Apache-2.0 / (0)

  • 1|input
  • 1|data source
  • 1|sql


Docker-based, End-to-End, Real-time, Advanced Analytics Big Data Reference Pipeline using Spark, Spark SQL, Spark Streaming, ML, MLlib, GraphX, Kafka, Cassandra, Redis, Apache Zeppelin, Spark-Notebook, iPython/Jupyter Notebook, Tableau, H2O Flow, Tachyon,

@fluxcapacitor / No release yet / (3)

  • 2|streaming
  • 2|kafka
  • 1|machine learning


Enhanced Python Dataframes for Spark

@dondrake / No release yet / (0)

  • 1|python
  • 1|sql
  • 1|pyspark


Spark Modularized View

@TresAmigosSD / No release yet / (0)

  • 1|core
  • 1|sql


C# API for Apache Spark. (Package moved to http://spark-packages.org/package/Microsoft/Mobius)

@skaarthik / No release yet / (2)

  • 1|streaming
  • 1|examples
  • 1|sql


XML data source for Spark SQL and DataFrames

@HyukjinKwon / Latest release: 0.1.1-s_2.10 (2015-11-19) / Apache-2.0 / (1)

  • 1|sql
  • 1|DataSource
  • 1|SparkSQL


Google Spreadsheets datasource for SparkSQL and DataFrames

@potix2 / Latest release: 0.6.3-s_2.11 (2019-08-21) / Apache-2.0 / (1)

  • 1|sql
  • 1|data source
  • 1|scala


Spark Package to read and write PLY, LAS and XYZ lidar point clouds using Spark SQL.

@IGNF / Latest release: 0.1.0-s_2.10 (2015-12-08) / Apache-2.0 / (0)

  • 1|geospatial
  • 1|data source
  • 1|sql


Data source for querying SPARQL endpoints

@USU-Research / Latest release: 1.0.0-beta1-s_2.10 (2016-01-27) / Apache-2.0 / (0)

  • 1|data source
  • 1|sparql
  • 1|sql


NetFlow data source for Spark SQL and DataFrames

@sadikovi / Latest release: 2.1.0-s_2.12 (2020-12-24) / Apache-2.0 / (2)

  • 1|input
  • 1|library
  • 1|sql


The Official Couchbase Spark Connector

@couchbase / Latest release: 2.2.0 (2017-09-20) / Apache-2.0 / (2)

  • 1|streaming
  • 1|library
  • 1|sql


This tutorial provides a quick introduction to using Spark

@rklick-solutions / No release yet / (2)

  • 2|RDD
  • 2|spark
  • 2|Spark SQL


SnappyData: OLTP + OLAP Database built on Apache Spark

@SnappyDataInc / Latest release: 1.2.0-s_2.11 (2020-02-07) / Apache-2.0 / (4)

  • 2|database
  • 1|data source
  • 1|sql


The official Riak Spark Connector for Apache Spark with Riak TS and Riak KV

@basho / Latest release: 1.6.3 (2017-03-17) / Apache-2.0 / (2)

  • 3|python
  • 3|riak
  • 3|data source


A SparkSQL extension library for Apache Spark that extends and improves its capabilities for data federation.

@Stratio / Latest release: 1.4.0 (2016-07-06) / Apache-2.0 / (6)

  • 3|SparkSQL
  • 3|sql
  • 2|library


C# API for Apache Spark

@Microsoft / Latest release: 1.6.100 (2016-05-02) / MIT / (2)

  • 1|streaming
  • 1|examples
  • 1|sql


Google BigQuery support for Spark, SQL, and DataFrames

@spotify / Latest release: 0.2.2-s_2.10 (2017-11-29) / Apache-2.0 / (3)

  • 1|input
  • 1|data source
  • 1|sql


The official MongoDB Spark Connector

@mongodb / Latest release: 3.0.1 (2021-02-03) / Apache-2.0 / (20)

  • 3|MongoDB
  • 2|Spark SQL
  • 2|nosql


Spark Receiver for SQL or NoSQL Databases like Cassandra, MongoDB, Elasticsearch or JDBC

@Stratio / Latest release: 0.1.0 (2016-06-30) / Apache-2.0 / (1)

  • 1|streaming
  • 1|library
  • 1|sql


A plugin to enable Apache Spark to read HDF5 files

@LLNL / Latest release: 0.0.4 (2016-09-10) / Apache-2.0 / (0)

  • 1|input
  • 1|sql
  • 1|hdf5


Snowflake Data Source for Apache Spark.

@snowflakedb / Latest release: 2.5.1-spark_2.4 (2019-08-01) / Apache-2.0 / (2)

  • 1|sql
  • 1|snowflake
  • 1|da


Spark SQL datasource for GitHub PR API

@lightcopy / Latest release: 1.3.0-s_2.10 (2016-12-25) / Apache-2.0 / (0)

  • 1|input
  • 1|library
  • 1|sql


Spark Structured Streaming data source for MQTT

@apache / Latest release: 2.2.0 (2017-09-09) / Apache-2.0 / (1)

  • 1|streaming
  • 1|sql
  • 1|structured streaming


Example code to help with getting started with Spark 2

@engineerpawan / Latest release: 1 (2016-12-28) / MIT / (1)

  • 1|example
  • 1|tutorial
  • 1|sql


Spark SQL index for Parquet tables

@lightcopy / Latest release: 0.5.0-s_2.12 (2020-08-01) / Apache-2.0 / (1)

  • 1|sql
  • 1|tools
  • 1|parquet


A connector for MemSQL and Spark

@memsql / Latest release: 4.1.1-spark-3.3.0 (2022-07-14) / Apache-2.0 / (6)

  • 3|memsql
  • 2|spark
  • 2|jdbc


Google BigQuery support for Spark, Structured Streaming, SQL, and DataFrames with easy Databricks integration.

@samelamin / Latest release: 0.2.5 (2018-08-08) / Apache-2.0 / (1)

  • 1|streaming
  • 1|sql
  • 1|core