A community index of third-party packages for Apache Spark.
infinispan-spark
Infinispan Spark Connector
@infinispan / Latest release: 0.9 (2018-11-05) / Apache-2.0 / (0)
spark-skewjoin
Joins for skewed datasets in Spark
@tresata / Latest release: 0.2.0-s_2.10 (2015-11-13) / Apache-2.0 / (0)
sparksql-protobuf
Read Spark SQL Parquet files as RDD[Protobuf]
@saurfang / Latest release: 0.1.2-s_2.10 (2015-08-18) / Apache-2.0 / (0)
twitter-stream-ml
Machine learning over Twitter's stream, using Apache Spark, a web server, and a Lightning graph server.
@giorgioinf / Latest release: 0.2.0 (2016-06-19) / GPL-3.0 / (0)
spark-corenlp
A Stanford CoreNLP wrapper for Apache Spark
@databricks / Latest release: 0.4.0-spark2.4-scala2.11 (2018-11-16) / GPL-3.0 / (2)
spark-tfocs
TFOCS for Spark, a Spark port of TFOCS: Templates for First-Order Conic Solvers (cvxr.com/tfocs)
@databricks / No release yet / (1)
bisecting-kmeans
This is a prototype implementation of Bisecting K-Means Clustering on Spark.
@yu-iskw / Latest release: 0.1.1 (2015-08-28) / Apache-2.0 / (0)
DistML
DistML provides a supplement to MLlib to support model parallelism on Spark
@intel-machine-learning / No release yet / (1)
dl4j-spark-ml
Deep Learning for Spark ML
@deeplearning4j / Latest release: 0.4-rc3.4 (2015-10-02) / Apache-2.0 / (1)
aliyun-spark-sdk
Spark on Aliyun, supporting interactions with Aliyun's base services.
@aliyun / No release yet / (1)
spark-mainframe-connector
Spark mainframe connector
@Syncsort / Latest release: 1.0.0 (2015-09-01) / Apache-2.0 / (0)
spark-druid-olap
Spark Druid Package
@SparklineData / Latest release: 0.1.0 (2016-06-03) / Apache-2.0 / (3)
sparkling-ferns
Implementation of Random Ferns for Apache Spark
@CeON / Latest release: 0.2.0 (2015-10-08) / Apache-2.0 / (0)
lazy-linalg
Linear algebra operators for Apache Spark MLlib's linalg package
@brkyvz / Latest release: 0.1.0 (2015-09-09) / Apache-2.0 / (1)
spark-ext
Spark Extension: ML transformers, SQL aggregations, etc. that are missing in Apache Spark
@collectivemedia / No release yet / (0)
spark-on-hpc
Script to submit Spark jobs on a traditional HPC cluster
@ekasitk / No release yet / (0)
spark-stochastic-outlier-selection
Implementation of Stochastic Outlier Selection (SOS), an unsupervised outlier selection algorithm.
@rug-ds-lab / No release yet / (1)
spark-stochastic-outlier-selection
Implementation of Stochastic Outlier Selection (SOS), an unsupervised outlier selection algorithm.
@Fokko / Latest release: 0.1.0 (2015-09-11) / Apache-2.0 / (1)
spark-redis
A connector for Spark that allows reading from and writing to a Redis cluster
@RedisLabs / Latest release: 2.3.0 (2018-11-04) / BSD 3-Clause / (3)
spark-xml-utils
Spark-xml-utils provides the ability to filter documents based on an XPath expression, return specific nodes for an XPath/XQuery expression, or transform documents using an XSLT stylesheet.
@elsevierlabs-os / Latest release: 1.10.0 (2021-12-08) / Apache-2.0 / (0)
pipeline
Docker-based, End-to-End, Real-time, Advanced Analytics Big Data Reference Pipeline using Spark, Spark SQL, Spark Streaming, ML, MLlib, GraphX, Kafka, Cassandra, Redis, Apache Zeppelin, Spark-Notebook, iPython/Jupyter Notebook, Tableau, H2O Flow, Tachyon,
@fluxcapacitor / No release yet / (3)
SpatialSpark
Big Spatial Data Processing using Spark
@syoummer / Latest release: 1.0 (2015-10-08) / Apache-2.0 / (1)
spark-python-knn
Function for computing K-NN in Apache Spark
@jakac / Latest release: 0.0.3 (2015-10-06) / Apache-2.0 / (0)
spark-FM-parallelSGD
Implementation of Factorization Machines on Spark using parallel stochastic gradient descent (Python and Scala)
@blebreton / No release yet / (1)
Mean-Shift-LSH
Spark implementation of Nearest Neighbours Mean Shift using LSH
@Kybe67 / No release yet / (1)
spark-cc
Library for computing clustering coefficient
@SherlockYang / Latest release: 0.1 (2015-10-22) / LGPL-3.0 / (1)
sparkxgboost
Gradient boosting trees with an arbitrary user-defined loss function
@rotationsymmetry / Latest release: 0.2.1-s_2.10 (2015-11-01) / Apache-2.0 / (0)
SparkCLR
C# API for Apache Spark. (Package moved to http://spark-packages.org/package/Microsoft/Mobius)
@skaarthik / No release yet / (2)
spark-xml
XML data source for Spark SQL and DataFrames
@HyukjinKwon / Latest release: 0.1.1-s_2.10 (2015-11-19) / Apache-2.0 / (1)
drunken-data-quality
Some utility classes for checking data quality in Spark
@FRosner / Latest release: 5.0.0-s_2.11 (2020-03-21) / Apache-2.0 / (1)