A community index of third-party packages for Apache Spark.
Showing packages 451 - 500 out of 517
streamDM
Stream Data Mining Library for Spark Streaming
@huawei-noah / Latest release: 0.0.1 (2019-07-21) / Apache-2.0 / (1)
aut
The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.
@archivesunleashed / Latest release: 0.18.0 (2019-08-21) / Apache-2.0 / (0)
test-package
Testing spark-packages publishing, please ignore
@hmgomes / Latest release: 0.0.2 (2019-07-21) / Apache-2.0 / (0)
spark-acid
Hive ACID datasource for Apache Spark
@qubole / Latest release: 0.4.0-s_2.11 (2019-07-26) / Apache-2.0 / (0)
mllib-stacking-bagging
mllib-stacking-bagging is an implementation of the ensemble methods stacking and bagging using the classifiers LogisticRegression, NaiveBayes and DecisionTree provided by the RDD-based API of MLlib.
@Pse00004 / No release yet / (0)
spark-pspectrum
P-spectrum embedding and sequence relaxation for NLP in Spark
@sirCamp / Latest release: 1.0.0 (2019-08-07) / Apache-2.0 / (0)
Spark2CassandraBulkLoad
Spark Library for Bulk Loading into Cassandra
@joswlv / No release yet / (1)
spark-xkmeans
Extension to the standard K-Means implementation of Spark ML library
@tupol / Latest release: 0.0.1 (2019-09-04) / MIT / (0)
spark-cerebro
Model Hopper Parallelism (MOP) for Efficient and Reproducible Model Selection on Apache Spark
@scnakandala / No release yet / (0)
ml-registry
Enabling continuous delivery and improvement of Spark pipeline models through devops methodology and ML governance
@aamend / Latest release: 1.1 (2019-10-17) / Apache-2.0 / (1)
ComplexityMetrics
Complexity metrics for big data problems.
@JMailloH / Latest release: 1.0 (2019-10-17) / Apache-2.0 / (1)
spark-aucmu
Multi-class performance metric (aucmu) for Apache Spark
@poweihuang / Latest release: 1.0.0 (2019-10-21) / MIT / (1)
spark-streaming-jdbc-source
JDBC source for Spark Structured Streaming
@sutugin / No release yet / (1)
spark-pipeline-utils
Utility classes to extend and generalize Spark's ML pipeline framework
@tnixon / No release yet / (0)
pyspark.dynamodb.streaming
Add PySpark support for reading AWS DynamoDB streams
@kberbic / No release yet / (0)
spark-dynamodb-source
DynamoDB source for Spark Structured Streaming
@kolia1985 / Latest release: 0.0.2 (2019-11-24) / Apache-2.0 / (1)
Imbalanced-Classification-Ensemble
SD_DeTE is a novel Smart Data driven Decision Trees Ensemble methodology for addressing the imbalanced classification problem in Big Data domains
@djgarcia / Latest release: 1.0 (2019-12-03) / Apache-2.0 / (1)
twut
An open-source toolkit for analyzing line-oriented JSON Twitter archives with Apache Spark.
@archivesunleashed / No release yet / (0)
streaminglens
Qubole Streaminglens tool for tuning Spark Structured Streaming Pipelines
@qubole / Latest release: 0.5.3 (2020-02-13) / Apache-2.0 / (0)
spark-dicom
A library for reading DICOM files into a Spark SQL DataFrame.
@abzoobabd / No release yet / (1)
artan
Online latent state estimation with Spark
@ozancicek / Latest release: 0.3.0 (2020-05-20) / Apache-2.0 / (1)
almaren-framework
Simplified consistent minimalistic layer over Apache Spark
@music-of-the-ainur / No release yet / (0)
quenya-dsl
Simplifies the task of parsing and flattening complex semi-structured data
@music-of-the-ainur / No release yet / (0)
spark-states
Custom state store providers for Apache Spark
@chermenin / Latest release: 0.2 (2020-04-24) / Apache-2.0 / (0)
spark-state-tools
Structured Streaming State Tools for Apache Spark
@HeartSaVioR / Latest release: 0.3.0 (2020-05-21) / Apache-2.0 / (0)
spark-sql-kafka-offset-committer
Kafka offset committer for Apache Spark Structured Streaming queries
@HeartSaVioR / No release yet / (0)
spark-privacy-preserver
A privacy preserving library for Apache Spark
@ThaminduR / No release yet / (0)
spark-hugefs
Query deeply nested and huge directories from Spark efficiently
@salva / Latest release: 0.10 (2020-06-17) / Apache-2.0 / (1)
kotlin-spark-api
Kotlin language bindings and several extensions for Apache Spark
@JetBrains / No release yet / (1)
osm4scala
Scala and Spark library focused on reading OpenStreetMap Pbf files.
@simplexspatial / Latest release: 1.0.7 (2021-03-27) / MIT / (0)
neo4j-connector-apache-spark_2.11
Officially supported, Apache 2 licensed Neo4j Connector for Apache Spark.
@neo4j-contrib / Latest release: 4.0.1 (2021-04-12) / Apache-2.0 / (0)
neo4j-connector-apache-spark_2.12
Officially supported, Apache 2 licensed Neo4j Connector for Apache Spark.
@neo4j-contrib / Latest release: 4.0.1_for_spark_3 (2021-04-12) / Apache-2.0 / (0)
approx-smote
Approximated SMOTE for Big Data under the Spark Framework.
@mjuez / Latest release: 1.1.2 (2022-04-27) / Apache-2.0 / (1)
GraphXwithGPU
Spark-based graph processing system demo using GPU
@Kamosphere / Latest release: 1.1 (2020-12-22) / Apache-2.0 / (3)
S-AnomalyDSD
AnomalyDSD is a Spark Package composed of four Big Data Anomaly Dynamic and Static Detection Algorithms
@ari-dasci / Latest release: 1.0 (2021-02-17) / Apache-2.0 / (1)
seq-datasource-v2
Sequence Data Source for Apache Spark
@garawalid / Latest release: 0.2.0 (2021-03-20) / Apache-2.0 / (0)
rotation-forest-bd
Rotation Forest implementation for Big Data on Apache Spark
@mjuez / Latest release: 1.0.0 (2021-03-23) / Apache-2.0 / (1)
spark-packages-test
Test for spark-packages
@bozhang2820 / Latest release: 0.0.8 (2023-09-25) / Apache-2.0 / (0)
spark-packages-test
spark packages test
@linhongliu-db / Latest release: 0.0.7 (2021-04-30) / Apache-2.0 / (0)
snappydata
SnappyData: OLTP + OLAP Database built on Apache Spark
@TIBCOSoftware / No release yet / (1)