A community index of third-party packages for Apache Spark.
Showing packages 51 - 100 out of 106 for search tags:"Machine Learning"
Spark-CluStream
Adaptation of the CluStream method in Spark
@obackhoff / Latest release: 0.6.5 (2016-03-31) / Apache-2.0 / (1)
spark-parallelized-sgd
Parallelized Stochastic Gradient Descent (SGD) with Apache Spark
@yu-iskw / Latest release: 0.0.2 (2016-03-30) / Apache-2.0 / (0)
yggdrasil
Yggdrasil: Faster Decision Trees Using Column Partitioning in Spark
@fabuzaid21 / Latest release: 1.0.1 (2018-05-11) / Apache-2.0 / (1)
spark-kuromoji-tokenizer
Kuromoji Tokenizer for Spark DataFrame
@yu-iskw / Latest release: 1.2.0 (2016-06-29) / Apache-2.0 / (0)
spark-ranking-algorithms
Ranking algorithms for Spark DataFrame
@yu-iskw / Latest release: 0.0.4 (2016-08-26) / Apache-2.0 / (0)
scalable-deeplearning
Scalable implementation of artificial neural networks for Spark deep learning
@avulanov / Latest release: 1.0.0 (2016-09-09) / Apache-2.0 / (1)
spark-LDA-example
An example of using Spark ML and StanfordNLP for topic discovery with LDA clustering
@shiv4nsh / No release yet / (0)
dist-keras
Distributed deep learning with Keras and Apache Spark.
@JoeriHermans / No release yet / (0)
spark-word2vec
A parallel implementation of word2vec based on Spark
@chen-lin / No release yet / (1)
SpectralLDA-TensorSpark
Implements a spectral (third-order tensor decomposition) method for learning LDA topic models on Spark.
@FurongHuang / Latest release: 1.0 (2016-12-04) / Apache-2.0 / (1)
Twitter-Sentiment-Analyzer
Twitter Sentiment Analysis - PySpark
@DayneSorvisto / No release yet / (1)
spark-IS-streaming
A Nearest Neighbor Classifier for High-Speed Big Data Streams with Instance Selection
@sramirez / Latest release: 0.8 (2017-01-27) / Apache-2.0 / (0)
sandpiper
Implementation of the Loopy Belief Propagation algorithm for Apache Spark
@HewlettPackard / No release yet / (0)
spark-tree-plotting
A simple tool for plotting Spark ML's Decision Trees
@julioasotodv / Latest release: 0.2 (2017-03-25) / MIT / (1)
NoiseFramework
Noise Framework for removing noisy instances with three algorithms: HME-BD, HTE-BD and ENN.
@djgarcia / Latest release: 1.2 (2018-04-18) / Apache-2.0 / (2)
spark-deep-learning
Deep Learning Pipelines for Apache Spark
@databricks / Latest release: 1.5.0-spark2.4-s_2.11 (2019-01-25) / Apache-2.0 / (3)
Optimus
Optimus is the missing library for cleansing (cleaning and much more) and pre-processing data in a distributed fashion with Apache Spark.
@ironmussa / Latest release: 1.1.0 (2017-10-25) / Apache-2.0 / (2)
SparkAffinityPropagation
Affinity Propagation on Spark
@viirya / Latest release: 1.0 (2017-07-29) / MIT / (0)
spark-kmedoids
Spark implementation of k-medoids clustering algorithm
@tdebatty / Latest release: 0.1.2 (2017-09-24) / MIT / (1)
spark-nlp
Natural Language Processing Library for Apache Spark.
@JohnSnowLabs / Latest release: 3.0.1 (2021-04-02) / Apache-2.0 / (5)
PhysOnline
PhysOnline: An Open Source Machine Learning Pipeline for Real-Time Analysis of Streaming Physiological Waveform
@rkamaleswaran / No release yet / (1)
SmartFiltering
Smart Filtering framework for Big Data
@djgarcia / Latest release: 1.0 (2018-04-09) / Apache-2.0 / (2)
Smart_Imputation
Smart Imputation. k Nearest Neighbor Imputation methods
@JMailloH / Latest release: 1.0 (2018-04-11) / Apache-2.0 / (2)
Bagging-RandomMiner
Bagging-RandomMiner ensemble method for anomaly detection
@wuicho-pereyra / Latest release: 1.0 (2018-05-22) / Apache-2.0 / (1)
TransmogrifAI
Automated machine learning for structured data
@salesforce / Latest release: 0.7.0 (2020-06-12) / BSD 3-Clause / (5)
spark-iforest
Isolation Forest on Spark
@titicaca / Latest release: v2.4.0 (2019-01-02) / Apache-2.0 / (1)
spark-gbtlr
Hybrid model of Gradient Boosting Trees and Logistic Regression (GBDT+LR) on Spark
@titicaca / Latest release: v2.4.0 (2019-01-02) / Apache-2.0 / (1)
Equal-Width-Discretizer
Equal Width Discretizer
@djgarcia / Latest release: 1.0 (2018-10-01) / Apache-2.0 / (1)
spark-dirty-cat
Similarity encoding of dirty categorical variables (strings)
@rakutentech / No release yet / (1)
sparkml-extensions
Extensions for Spark ML/MLlib
@chitralverma / Latest release: 0.1 (2018-12-25) / Apache-2.0 / (1)
spark-ensemble
Ensemble Estimators for Apache Spark ML
@pierrenodet / Latest release: 0.4.0 (2019-02-16) / Apache-2.0 / (1)
streamDM
Stream Data Mining Library for Spark Streaming
@huawei-noah / Latest release: 0.0.1 (2019-07-21) / Apache-2.0 / (1)
spark-pspectrum
P-spectrum embedding and sequence relaxation for NLP in Spark
@sirCamp / Latest release: 1.0.0 (2019-08-07) / Apache-2.0 / (0)
ComplexityMetrics
Complexity metrics for big data problems.
@JMailloH / Latest release: 1.0 (2019-10-17) / Apache-2.0 / (1)
spark-aucmu
Multi-class performance metric AUCmu for Apache Spark
@poweihuang / Latest release: 1.0.0 (2019-10-21) / MIT / (1)