A community index of third-party packages for Apache Spark.
Showing packages 251 - 300 out of 516
spark-sample-scripts
(WIP) A set of Spark application examples for beginners that run on spark-shell
@dobachi / Latest release: 0.1.0 (2016-08-04) / Apache-2.0 / (0)
spark-hdf5
A plugin to enable Apache Spark to read HDF5 files
@LLNL / Latest release: 0.0.4 (2016-09-10) / Apache-2.0 / (0)
spark-kafka-writer
Write your RDDs and DStreams to Kafka seamlessly
@BenFradet / Latest release: 0.4.0 (2017-07-22) / Apache-2.0 / (0)
spark-ranking-algorithms
Ranking algorithms for Spark DataFrame
@yu-iskw / Latest release: 0.0.4 (2016-08-26) / Apache-2.0 / (0)
spark-graphx-cassandra
An example of Spark GraphX as an analytics engine and Cassandra as persistence layer.
@knoldus / No release yet / (1)
maelstrom
Maelstrom is an open source Kafka integration with Spark that is designed to be developer-friendly, high-performance (millisecond stream processing), scalable (consumes messages at Spark worker nodes), and extremely reliable.
@jeoffreylim / No release yet / (0)
real-time-stream-processing-engine
This is an example of real-time stream processing using Spark Streaming, Kafka & Elasticsearch.
@knoldus / No release yet / (0)
proxy4sparkui
Configure nginx on the master node as a reverse proxy to the Apache Spark web UI and history server. No more SSH SOCKS proxies or tunnels.
@ekasitk / No release yet / (0)
scalable-deeplearning
Scalable implementation of artificial neural networks for Spark deep learning
@avulanov / Latest release: 1.0.0 (2016-09-09) / Apache-2.0 / (1)
spark-timeseries
A library for time series analysis on Apache Spark
@sryza / Latest release: 0.4.1 (2016-11-15) / Apache-2.0 / (0)
spark-llap
A library to load data into Spark SQL DataFrames from Hive using LLAP
@hortonworks-spark / No release yet / (0)
spark-snowflake
Snowflake Data Source for Apache Spark.
@snowflakedb / Latest release: 2.5.1-spark_2.4 (2019-08-01) / Apache-2.0 / (2)
raspi-spark-streaming-mqtt
Stream data analysis on IoT-generated data via Apache Spark
@shiv4nsh / No release yet / (0)
spark-workday
Spark Connector for Workday
@springml / Latest release: 1.1.0 (2017-03-10) / Apache-2.0 / (1)
spark-LDA-example
An example of Spark ML and StanfordNLP for topic discovery using LDA clustering
@shiv4nsh / No release yet / (0)
spark-bigquery
Spark connector for BigQuery
@appsflyer-dev / Latest release: 0.1.1 (2017-01-29) / Apache-2.0 / (0)
cassandra-spark-akka-http-starter-kit
A REST API for CRUD operations on Cassandra using Apache Spark
@shiv4nsh / No release yet / (0)
dist-keras
Distributed deep learning with Keras and Apache Spark.
@JoeriHermans / No release yet / (0)
GeoSpark
A Cluster Computing System for Processing Large-Scale Spatial Data
@DataSystemsLab / No release yet / (1)
spark-utils
Practical utilities for Spark applications
@CeON / Latest release: 1.0.0 (2016-10-19) / Apache-2.0 / (0)
spark_jupyter
This library customizes some DataFrame outputs.
@jeanbaptistepriez / No release yet / (1)
spark-github-pr
Spark SQL datasource for GitHub PR API
@lightcopy / Latest release: 1.3.0-s_2.10 (2016-12-25) / Apache-2.0 / (0)
spark-kafka-0-8-sql
Spark Structured Streaming Kafka 0.8 Source Implementation
@jerryshao / No release yet / (0)
spark-netsuite
Spark NetSuite Connector
@springml / Latest release: 1.1.0 (2017-03-10) / Apache-2.0 / (2)
spark-word2vec
A parallel implementation of word2vec based on Spark
@chen-lin / No release yet / (1)
spark-openstack
OpenStack Spark cluster deployment
@ispras / Latest release: 0.9.5 (2016-11-10) / Apache-2.0 / (0)
spark-hadoopcryptoledger-ds
A Spark datasource for the HadoopCryptoLedger library
@ZuInnoTe / Latest release: 1.3.2-s_2.12 (2021-12-24) / Apache-2.0 / (1)
sparkhpc
Launching and controlling Spark on HPC clusters made easy
@rokroskar / No release yet / (0)
Spark-AdaOptimizer
Implements Adam for stochastic optimization.
@VinceShieh / Latest release: 0.1 (2016-12-13) / Apache-2.0 / (1)
spark-mergejoin
Robust and scalable join operators using a sort-merge algorithm (high data skew, low cardinality, etc.)
@hindog / Latest release: 2.0.1 (2017-04-04) / Apache-2.0 / (0)
SpectralLDA-TensorSpark
Implements a spectral (third-order tensor decomposition) method for learning LDA topic models on Spark.
@FurongHuang / Latest release: 1.0 (2016-12-04) / Apache-2.0 / (1)
kraps-server
Kraps: safe, robust and reliable data pipelines over Apache Spark.
@krapsh / Latest release: 0.1.9-s_2.11 (2017-01-16) / Apache-2.0 / (0)
Spark_Knapsack
A simple greedy parallel PySpark implementation of the 0-1 Knapsack algorithm.
@drulm / No release yet / (0)