A community index of third-party packages for Apache Spark.

Showing packages 301 - 350 out of 516

Spark DStream connector for Akka

@apache / Latest release: 2.2.0 (2017-09-09) / Apache-2.0 / (0)

  • 1|streaming


Spark DStream connector for MQTT

@apache / Latest release: 2.2.0 (2017-09-09) / Apache-2.0 / (0)

  • 1|python
  • 1|streaming
  • 1|pyspark


Spark Structured Streaming data source for MQTT

@apache / Latest release: 2.2.0 (2017-09-09) / Apache-2.0 / (1)

  • 1|streaming
  • 1|sql
  • 1|structured streaming


Spark DStream connector for Twitter

@apache / Latest release: 2.2.0 (2017-09-09) / Apache-2.0 / (0)

  • 1|streaming


Spark DStream connector for ZeroMQ

@apache / Latest release: 2.2.0 (2017-09-09) / Apache-2.0 / (0)

  • 1|streaming


A self-organizing map for Scala and Spark

@ShokuninSan / No release yet / (0)

  • 1|machine learning


Example code to help you get started with Spark 2

@engineerpawan / Latest release: 1 (2016-12-28) / MIT / (1)

  • 1|example
  • 1|tutorial
  • 1|sql


An example of Spark and GraphX using Twitter as sample data

@knoldus / No release yet / (0)


Spark SQL index for Parquet tables

@lightcopy / Latest release: 0.5.0-s_2.12 (2020-08-01) / Apache-2.0 / (1)

  • 1|sql
  • 1|tools
  • 1|parquet


Twitter Sentiment Analysis - PySpark

@DayneSorvisto / No release yet / (1)

  • 1|twitter
  • 1|machine learning
  • 1|pyspark


Spark-FFM

@VinceShieh / Latest release: 0.1 (2017-01-04) / Apache-2.0 / (1)

  • 1|ml
  • 1|ma
  • 1|mllib


A Spark datasource for the HadoopOffice library

@ZuInnoTe / Latest release: 1.7.0-s_2.13 (2022-10-29) / Apache-2.0 / (1)

  • 1|data source
  • 1|excel
  • 1|office


A Spark program that transfers data from Cassandra to Couchbase

@shiv4nsh / No release yet / (1)


Generic Connector for Apache Spark

@alvsanand / Latest release: 0.2.0-spark_2x-s_2.11 (2017-01-17) / Apache-2.0 / (1)

  • 1|streaming
  • 1|data source
  • 1|Google Cloud


GraphX Overlapping Community Detection

@bhardwajank / Latest release: 1.0 (2017-01-23) / Apache-2.0 / (0)

  • 1|graph


A lightweight, dependency-free package for extending Spark's date and timestamp operations, focused on time periods.

@danielpes / Latest release: 0.2.0-s_2.11 (2018-01-30) / Apache-2.0 / (1)

  • 1|date
  • 1|timestamp
  • 1|time


A Nearest Neighbor Classifier for High-Speed Big Data Streams with Instance Selection

@sramirez / Latest release: 0.8 (2017-01-27) / Apache-2.0 / (0)

  • 1|streaming
  • 1|machine learning
  • 1|instance selection


Spark Risk Explorer

@tibkiss / No release yet / (0)

  • 1|finance
  • 1|risk


A connector for MemSQL and Spark

@memsql / Latest release: 4.1.1-spark-3.3.0 (2022-07-14) / Apache-2.0 / (6)

  • 3|memsql
  • 2|spark
  • 2|jdbc


Supplementary machine learning algorithms for Spark

@Intel-bigdata / Latest release: 0.1 (2017-02-06) / Apache-2.0 / (0)

  • 1|ml
  • 1|ma
  • 1|intel


Flint: A Time Series Library for Apache Spark

@twosigma / No release yet / (0)


TensorFlowOnSpark brings TensorFlow programs onto Apache Spark clusters

@yahoo / No release yet / (0)


Spark TensorFlow Connector

@tapanalyticstoolkit / Latest release: 1.0.0-s_2.11 (2017-02-21) / Apache-2.0 / (3)

  • 2|tensorflow
  • 2|data source
  • 1|library


Google BigQuery support for Spark, Structured Streaming, SQL, and DataFrames with easy Databricks integration.

@samelamin / Latest release: 0.2.5 (2018-08-08) / Apache-2.0 / (1)

  • 1|streaming
  • 1|sql
  • 1|core


Akka-based library to help you manage your Apache Spark jobs

@JoaoVasques / No release yet / (0)


A small application that collects tweets from Twitter, processes them with Spark Streaming, and ingests them into a Cassandra ring

@phalodi / No release yet / (0)

  • 1|streaming
  • 1|example
  • 1|core


Library to schedule Spark jobs at time intervals

@phalodi / No release yet / (0)


testing

@milinkp / Latest release: 1.0 (2017-03-11) / Apache-2.0 / (0)


Spark Marketo Connector

@springml / Latest release: 1.1.0 (2017-03-10) / Apache-2.0 / (1)


Open source Spark transformations and functions

@mrpowers / Latest release: 0.37.1-s_2.12 (2020-03-27) / MIT / (1)


Implementation of the Loopy Belief Propagation algorithm for Apache Spark

@HewlettPackard / No release yet / (0)

  • 1|graph
  • 1|machine learning


An example of integrating Spark Streaming with Google Pub/Sub and Google Datastore

@yu-iskw / No release yet / (0)

  • 1|streaming
  • 1|example


A simple tool for plotting Spark ML's Decision Trees

@julioasotodv / Latest release: 0.2 (2017-03-25) / MIT / (1)

  • 1|machine learning
  • 1|pyspark


Noise Framework for removing noisy instances with three algorithms: HME-BD, HTE-BD and ENN.

@djgarcia / Latest release: 1.2 (2018-04-18) / Apache-2.0 / (2)

  • 1|noise
  • 1|ensemble
  • 1|machine learning


Implementation of Lambda Architecture with Spark, Kafka, Cassandra and Twitter Streaming API

@knoldus / No release yet / (1)


Remove the splittable part

@chhokarpardeep / Latest release: 1.1.7-s_2.11 (2017-04-04) / GPL-3.0 / (0)


JSON schema parser for Apache Spark

@zalando-incubator / No release yet / (0)


Fast Apache Spark testing framework

@MrPowers / Latest release: 0.21.1-s_2.12 (2020-04-07) / MIT / (1)


DIQL: A Data Intensive Query Language for Apache Spark

@fegaras / No release yet / (0)

  • 1|tools


ASAM-ODS Data Source for Apache Spark

@onetechnologies / No release yet / (0)

  • 1|data source


Scala Client for Algorithmia Algorithms and Data API

@algorithmiaio / Latest release: 0.9.2 (2017-05-24) / Apache-2.0 / (1)


Spark SQL row-oriented indexed file format

@sadikovi / Latest release: 0.2.0-s_2.11 (2017-09-08) / MIT / (1)

  • 1|input
  • 1|library
  • 1|sql


Deep Learning Pipelines for Apache Spark

@databricks / Latest release: 1.5.0-spark2.4-s_2.11 (2019-01-25) / Apache-2.0 / (3)

  • 1|deep learning
  • 1|machine learning
  • 1|GPU


Python port of the awesome DataStax Spark Cassandra connector. Compatible with Spark 2.0+

@anguenot / Latest release: 2.4.1 (2022-08-03) / Apache-2.0 / (0)

  • 1|python
  • 1|nosql
  • 1|cassandra


Explore and analyze genomic data.

@hail-is / No release yet / (0)


Clustering indexes in Spark 2.1

@DanielTizon / No release yet / (0)


Deep Learning for MLlib

@JeremyNixon / No release yet / (1)

  • 1|ml
  • 1|mllib
  • 1|machine learning


Microsoft Machine Learning for Apache Spark

@Azure / Latest release: 0.17 (2019-04-23) / MIT / (4)

  • 3|ml
  • 3|Microsoft
  • 3|machine learning


This project provides a client library that allows Azure Cosmos DB to act as an input source or output sink for Spark jobs.

@Azure / No release yet / (0)


A package for dealing with crowdsourced big data.

@enriquegrodrigo / Latest release: 0.2.0 (2018-10-21) / MIT / (0)