A community index of third-party packages for Apache Spark.
spark-RELIEFFC-fselection
Distributed version of RELIEF-F algorithm for Apache Spark.
@sramirez / Latest release: 0.5.0 (2018-04-09) / Apache-2.0 / (0)
SmartFiltering
Smart Filtering framework for Big Data
@djgarcia / Latest release: 1.0 (2018-04-09) / Apache-2.0 / (2)
SmartReduction
Smart Reduction framework for Big Data
@djgarcia / Latest release: 1.0 (2018-04-09) / Apache-2.0 / (2)
Smart_Imputation
Smart Imputation: k-nearest-neighbor imputation methods
@JMailloH / Latest release: 1.0 (2018-04-11) / Apache-2.0 / (2)
dac
A Distributed Associative Classifier for Apache Spark MLlib
@lucaventurini / No release yet / (1)
spark-dynamodb
Plug-and-play implementation of an Apache Spark custom data source for AWS DynamoDB.
@audienceproject / No release yet / (0)
Bagging-RandomMiner
Bagging-RandomMiner ensemble method for anomaly detection
@wuicho-pereyra / Latest release: 1.0 (2018-05-22) / Apache-2.0 / (1)
spark-gdelt
Binding the GDELT universe in a Spark environment
@aamend / Latest release: 2.0 (2018-06-02) / Apache-2.0 / (1)
spark-hyperloglog
Algebird's HyperLogLog support for Apache Spark.
@jklukas / Latest release: 2.1.1.1 (2018-06-27) / Apache-2.0 / (0)
spark-hyperloglog
Algebird's HyperLogLog support for Apache Spark
@mozilla / Latest release: 2.2.0 (2018-06-29) / Apache-2.0 / (0)
ParallelTool
Tool designed to speed up Spark applications
@marino-serna / Latest release: 1.0.1-00 (2018-07-22) / Apache-2.0 / (1)
spark-bigquery
Google BigQuery data source for Apache Spark
@miraisolutions / Latest release: 0.1.1-s_2.11 (2019-06-07) / MIT / (2)
spark-on-k8s-operator
Kubernetes operator for specifying and running Apache Spark applications idiomatically on Kubernetes.
@GoogleCloudPlatform / No release yet / (0)
sparkMeasure
SparkMeasure is a tool for performance troubleshooting of Apache Spark workloads. It simplifies the collection and analysis of Spark task metrics data.
@LucaCanali / No release yet / (0)
social-network-analysis-community-detection
Implementation of the Batagelj-Zaversnik algorithm
@Jovic92 / No release yet / (0)
TransmogrifAI
Automated machine learning for structured data
@salesforce / Latest release: 0.7.0 (2020-06-12) / BSD 3-Clause / (5)
spark-iforest
Isolation Forest on Spark
@titicaca / Latest release: v2.4.0 (2019-01-02) / Apache-2.0 / (1)
spark-gbtlr
Hybrid model of Gradient Boosting Trees and Logistic Regression (GBDT+LR) on Spark
@titicaca / Latest release: v2.4.0 (2019-01-02) / Apache-2.0 / (1)
Bigdata-Governance
Huemul BigDataGovernance is a library that works on top of Spark, Hive, and HDFS. It enables the implementation of a corporate single-source-of-truth data strategy based on data governance best practices
@HuemulSolutions / No release yet / (1)
huemul-bigdatagovernance
Huemul BigDataGovernance is a library that works on top of Spark, Hive, and HDFS
@HuemulSolutions / No release yet / (0)
Equal-Width-Discretizer
Equal Width Discretizer
@djgarcia / Latest release: 1.0 (2018-10-01) / Apache-2.0 / (1)
smote-bd
SMOTE-BD: A distributed Synthetic Minority Oversampling Technique (SMOTE) for Big Data.
@majobasgall / Latest release: 0.1 (2018-11-14) / Apache-2.0 / (0)
spark-dirty-cat
Similarity encoding of dirty categorical variables (strings)
@rakutentech / No release yet / (1)
sample_spark
Sample project for publishing to Spark
@oanhltko / Latest release: 1.0.3 (2018-12-12) / Apache-2.0 / (0)
ExternalValidity
This package contains the code for calculating external clustering validity indices in Spark.
@josemarialuna / No release yet / (0)
spark-select
Spark select enables retrieving only required data from an object
@minio / Latest release: 2.1-s_2.11 (2019-04-04) / Apache-2.0 / (1)
spark-adaptive_filtering
A Spark SQL extension for applying adaptive selection ordering techniques in filtering
@kikniknik / No release yet / (0)
sparkml-extensions
Extensions for Spark ML/MLlib
@chitralverma / Latest release: 0.1 (2018-12-25) / Apache-2.0 / (1)
spark-ensemble
Ensemble Estimators for Apache Spark ML
@pierrenodet / Latest release: 0.4.0 (2019-02-16) / Apache-2.0 / (1)
spark-utils
Basic framework utilities to quickly start writing production ready Apache Spark applications
@tupol / Latest release: 0.6.1 (2021-10-18) / MIT / (0)
spark-tools
Executable Apache Spark Tools: Format Converter & SQL Processor
@tupol / Latest release: 0.4.1-s_2.11 (2020-09-12) / MIT / (0)
nats-connector-spark-scala
A Scala based Spark Publish/Subscribe NATS Connector
@Logimethods / Latest release: 1.0.0 (2019-06-10) / MIT / (0)
nats-connector-spark
A Spark Publish/Subscribe NATS Connector
@Logimethods / Latest release: 1.0.0 (2019-06-10) / MIT / (0)
spark-client
A Spark client for creating tables from a given JSON schema
@tejeshwr / No release yet / (1)