A community index of third-party packages for Apache Spark.
Showing packages 1-50 of 105 for search tags:"Machine Learning"
thunder
Large-scale neural data analysis with Spark
@freeman-lab / Latest release: 0.4.1 (2014-11-27) / BSD 3-Clause / (6)
generalized-kmeans-clustering
This project generalizes the Spark MLlib K-Means clusterer to support arbitrary distance functions
@derrickburns / No release yet / (3)
sparkling-water
Sparkling Water provides H2O algorithms inside a Spark cluster
@h2oai / Latest release: 1.4.3 (2015-07-06) / Apache-2.0 / (2)
spark-ml-streaming
Visualize streaming machine learning in Spark
@freeman-lab / No release yet / (1)
MLlib-dropout
Package adding dropout regularization to the Apache Spark MLlib project
@rakeshchalasani / No release yet / (1)
spark-infotheoretic-feature-selection
Feature Selection framework based on Information Theory that includes: mRMR, InfoGain, JMI and other commonly used FS filters.
@sramirez / Latest release: 1.4.4 (2017-09-25) / Apache-2.0 / (8)
spark-MDLP-discretization
Spark implementation of Fayyad's discretizer based on the Minimum Description Length Principle (MDLP)
@sramirez / Latest release: 1.4.1 (2017-09-25) / Apache-2.0 / (7)
spark-ml-class
Coursera Machine Learning class examples in Spark
@zinniasystems / No release yet / (0)
spark-pmml-exporter-validator
Using JPMML Evaluator to validate the PMML models exported from Spark
@selvinsource / No release yet / (1)
streaming-matrix-factorization
Streaming Recommendation Engine using matrix factorization with user and product bias
@brkyvz / Latest release: 0.1.0 (2015-05-26) / Apache-2.0 / (2)
dissolve-struct
Distributed solver library for large-scale structured output prediction
@dalab / No release yet / (0)
DDF
Distributed DataFrame: Productivity = Power x Simplicity For Scientists & Engineers, on any Data/Compute Engine
@ddf-project / No release yet / (11)
sparkboost
A distributed implementation of AdaBoost.MH and MP-Boost using Apache Spark
@tizfa / Latest release: 0.6 (2015-07-01) / Apache-2.0 / (0)
hivemall-spark
A Hivemall wrapper for Spark
@maropu / Latest release: 0.0.6 (2016-04-07) / Apache-2.0 / (0)
patchwork
Highly Scalable Grid-Density Clustering Algorithm for Spark MLlib
@thomastriplet / No release yet / (0)
modelmatrix
Alternative to Spark machine learning pipeline feature extractors, focused on building sparse feature vectors.
@collectivemedia / No release yet / (1)
spark-knn-graphs
Spark algorithms for building and processing k-nn graphs
@tdebatty / Latest release: 0.13 (2016-02-17) / MIT / (1)
twitter-stream-ml
Machine learning over Twitter's stream, using Apache Spark, a web server, and a Lightning graph server.
@giorgioinf / Latest release: 0.2.0 (2016-06-19) / GPL-3.0 / (0)
spark-corenlp
A Stanford CoreNLP wrapper for Apache Spark
@databricks / Latest release: 0.4.0-spark2.4-scala2.11 (2018-11-16) / GPL-3.0 / (2)
spark-tfocs
TFOCS for Spark, a Spark port of TFOCS: Templates for First-Order Conic Solvers (cvxr.com/tfocs)
@databricks / No release yet / (1)
bisecting-kmeans
This is a prototype implementation of Bisecting K-Means Clustering on Spark.
@yu-iskw / Latest release: 0.1.1 (2015-08-28) / Apache-2.0 / (0)
DistML
DistML provides a supplement to MLlib to support model parallelism on Spark
@intel-machine-learning / No release yet / (1)
dl4j-spark-ml
Deep Learning for Spark ML
@deeplearning4j / Latest release: 0.4-rc3.4 (2015-10-02) / Apache-2.0 / (1)
sparkling-ferns
Implementation of Random Ferns for Apache Spark
@CeON / Latest release: 0.2.0 (2015-10-08) / Apache-2.0 / (0)
lazy-linalg
Linear algebra operators for Apache Spark MLlib's linalg package
@brkyvz / Latest release: 0.1.0 (2015-09-09) / Apache-2.0 / (1)
pipeline
Docker-based, End-to-End, Real-time, Advanced Analytics Big Data Reference Pipeline using Spark, Spark SQL, Spark Streaming, ML, MLlib, GraphX, Kafka, Cassandra, Redis, Apache Zeppelin, Spark-Notebook, iPython/Jupyter Notebook, Tableau, H2O Flow, Tachyon,
@fluxcapacitor / No release yet / (3)
spark-FM-parallelSGD
Implementation of Factorization Machines on Spark using parallel stochastic gradient descent (Python and Scala)
@blebreton / No release yet / (1)
Mean-Shift-LSH
Spark implementation of Nearest Neighbours Mean Shift using LSH
@Kybe67 / No release yet / (1)
sparkxgboost
Gradient boosting trees with arbitrary user-defined loss functions
@rotationsymmetry / Latest release: 0.2.1-s_2.10 (2015-11-01) / Apache-2.0 / (0)
cookie-datasets
Popular ML Datasets for Spark ML (MNIST, IRIS, CIFAR)
@cookieai / Latest release: 0.1.0 (2015-12-22) / Apache-2.0 / (0)
Spark.statistics
A collection of fundamental statistics implemented on Apache Spark
@hhbyyh / No release yet / (0)
spark-calibration
Assess binary classifier calibration (i.e., how well classifier outputs match observed class proportions) in Spark
@robert-dodier / No release yet / (0)
spark-sklearn
Scikit-learn integration package for Apache Spark
@databricks / Latest release: 0.2.3 (2017-09-29) / BSD 3-Clause / (1)
spark-DEMD-discretizer
A Distributed Evolutionary Multivariate Discretizer (DEMD)
@sramirez / Latest release: 1.0 (2016-02-04) / Apache-2.0 / (2)
spark-neighbors
Approximate nearest neighbor search using locality-sensitive hashing
@karlhigley / Latest release: 0.2.2 (2016-07-05) / MIT / (0)
CaffeOnSpark
Scalable deep learning running Caffe inside Spark executors with peer-to-peer communication
@yahoo / No release yet / (1)
spark-stemming
Spark MLlib wrapper around Snowball stemming
@master / Latest release: 0.2.1 (2018-11-28) / BSD 2-Clause / (0)