A community index of third-party packages for Apache Spark.
Showing packages 151 - 200 out of 516
cookie-datasets
Popular ML Datasets for Spark ML (MNIST, IRIS, CIFAR)
@cookieai / Latest release: 0.1.0 (2015-12-22) / Apache-2.0 / (0)
spark-google-spreadsheets
Google Spreadsheets datasource for SparkSQL and DataFrames
@potix2 / Latest release: 0.6.3-s_2.11 (2019-08-21) / Apache-2.0 / (1)
spark-kaggle-examples
Kaggle Job repository
@Lewuathe / Latest release: 0.0.1 (2015-11-22) / Apache-2.0 / (0)
spark-google-analytics
A Spark package for retrieving data from Google Analytics
@crealytics / Latest release: 0.8.1 (2015-12-14) / Apache-2.0 / (0)
graphx-diameter
A Spark package to approximate the diameter of large graphs
@Cecca / Latest release: 0.2.0-s_2.11 (2017-03-09) / Apache-2.0 / (1)
spark-netezza
Spark Netezza Connector
@SparkTC / Latest release: 0.1.1-s_2.10 (2016-02-06) / Apache-2.0 / (0)
spark-iqmulus
Spark Package to read and write PLY, LAS and XYZ lidar point clouds using Spark SQL.
@IGNF / Latest release: 0.1.0-s_2.10 (2015-12-08) / Apache-2.0 / (0)
graphx-citymap
CityMap coding test plus 3 solutions, 1 with Spark/GraphX
@fancellu / No release yet / (0)
spark-benford-analysis
Benford analysis package for Spark.
@dvgodoy / Latest release: v0.0.1 (2015-12-13) / Apache-2.0 / (0)
spark-google-adwords
A library for querying Google AdWords data with Apache Spark, for Spark SQL and DataFrames
@crealytics / Latest release: 0.8.2 (2015-12-14) / Apache-2.0 / (0)
Spark.statistics
A collection of fundamental statistics implemented on Apache Spark
@hhbyyh / No release yet / (0)
spark-ryft-connector
Spark connector for Ryft ONE
@getryft / Latest release: 0.9.0 (2017-04-04) / other license / (1)
sparkpipe-core
Modular, non-linear data pipeline framework for Spark
@unchartedsoftware / Latest release: 0.9.7 (2016-02-24) / BSD 3-Clause / (0)
spark-sftp
Spark connector for SFTP
@springml / Latest release: 1.1.3 (2018-10-01) / Apache-2.0 / (2)
opencpu-spark-executor
R OpenCPU Spark Executor (ROSE) Library
@onetapbeyond / Latest release: 1.0 (2016-01-11) / Apache-2.0 / (0)
spark-calibration
Assess binary classifier calibration (i.e., how well classifier outputs match observed class proportions) in Spark
@robert-dodier / No release yet / (0)
ARCANE-Spark
AnticipatoRy Complex Adaptive Network Extrapolation (ARCANE) Library Apache Spark Harness
@drmichaelnorth / Latest release: 1.0.0 (2016-01-20) / BSD 3-Clause / (0)
spark-sklearn
Scikit-learn integration package for Apache Spark
@databricks / Latest release: 0.2.3 (2017-09-29) / BSD 3-Clause / (1)
spark-wilcoxon
Compute Wilcoxon-Mann-Whitney rank sum statistic in Apache Spark
@robert-dodier / No release yet / (0)
spark-sparql-connector
Data source for querying SPARQL endpoints
@USU-Research / Latest release: 1.0.0-beta1-s_2.10 (2016-01-27) / Apache-2.0 / (0)
spark-netflow
NetFlow data source for Spark SQL and DataFrames
@sadikovi / Latest release: 2.1.0-s_2.12 (2020-12-24) / Apache-2.0 / (2)
lambda-spark-executor
Apache Spark AWS Lambda Executor (SAMBA)
@onetapbeyond / Latest release: 1.0 (2016-01-31) / Apache-2.0 / (0)
sparkling-graph
Large-scale, distributed graph processing made easy! Load your graph from multiple formats and compute measures, and more
@sparkling-graph / Latest release: 0.0.7 (2017-05-16) / BSD 2-Clause / (5)
spark-DEMD-discretizer
A Distributed Evolutionary Multivariate Discretizer (DEMD)
@sramirez / Latest release: 1.0 (2016-02-04) / Apache-2.0 / (2)
spark-jms-receiver
JMS Spark receiver
@tbfenet / Latest release: 0.2.1-s_2.11 (2016-11-23) / Apache-2.0 / (0)
click-through-rate-prediction
Kaggle's click-through rate prediction with the Pipeline API
@yu-iskw / Latest release: 1.1 (2016-02-10) / Apache-2.0 / (0)
couchbase-spark-connector
The Official Couchbase Spark Connector
@couchbase / Latest release: 2.2.0 (2017-09-20) / Apache-2.0 / (2)
spark-neighbors
Approximate nearest neighbor search using locality-sensitive hashing
@karlhigley / Latest release: 0.2.2 (2016-07-05) / MIT / (0)
spark-tutorial
This tutorial provides a quick introduction to using Spark
@rklick-solutions / No release yet / (2)
CaffeOnSpark
Scalable deep learning running Caffe inside Spark executors with peer-to-peer communication
@yahoo / No release yet / (1)
snappydata
SnappyData: OLTP + OLAP Database built on Apache Spark
@SnappyDataInc / Latest release: 1.2.0-s_2.11 (2020-02-07) / Apache-2.0 / (4)
graphframes
GraphFrames: DataFrame-based Graphs
@graphframes / Latest release: 0.8.4-spark3.5-s_2.12 (2024-07-03) / Apache-2.0 / (10)
spark-lever
Spark-lever is a proactive, capability-aware load-balancing system for batch stream processing on heterogeneous clusters, built on Spark Streaming.
@trueyao / No release yet / (2)
spark-cloudant
Deprecated; please see bahir/sql-cloudant
@cloudant-labs / Latest release: 2.0.0-s_2.11 (2016-09-23) / Apache-2.0 / (1)
spark-beetweenness
k Betweenness Centrality algorithm for Spark using GraphX
@dmarcous / Latest release: 1.0-s_2.10 (2016-02-29) / Apache-2.0 / (3)
spark-stemming
Spark MLlib wrapper around Snowball stemming
@master / Latest release: 0.2.1 (2018-11-28) / BSD 2-Clause / (0)