A community index of third-party packages for Apache Spark.
Showing packages 1 - 36 out of 36 for search "tags:"Tools""
spark-hbase-connector
Connect Spark to HBase for reading and writing data with ease
@nerdammer / Latest release: 1.0.3 (2016-04-20) / Apache-2.0 / (3)
spark-testing-base
Base classes to use when writing tests with Spark
@holdenk / Latest release: 2.2.2_0.11.0 (2018-12-23) / Apache-2.0 / (10)
sbt-spark-package
sbt plugin for Spark packages
@databricks / Latest release: 0.2.4 (2016-07-15) / Apache-2.0 / (3)
spark-package-cmd-tool
A command line tool for Spark packages
@databricks / Latest release: 0.3.0 (2015-03-17) / Apache-2.0 / (1)
spark-archetype-scala
Maven archetype used to bootstrap a Spark Scala project
@mbonaci / Latest release: 0.9 (2015-04-24) / MIT / (0)
spark-sas7bdat
Splittable SAS (.sas7bdat) Input Format for Hadoop and Spark SQL
@saurfang / Latest release: 3.0.0-s_2.12 (2020-09-13) / Apache-2.0 / (1)
DDF
Distributed DataFrame: Productivity = Power x Simplicity For Scientists & Engineers, on any Data/Compute Engine
@ddf-project / No release yet / (11)
spark-deployer
Deploy a Spark cluster in an easy way.
@pishen / Latest release: 0.5.1 (2015-06-25) / Apache-2.0 / (0)
pipeline
Docker-based, End-to-End, Real-time, Advanced Analytics Big Data Reference Pipeline using Spark, Spark SQL, Spark Streaming, ML, MLlib, GraphX, Kafka, Cassandra, Redis, Apache Zeppelin, Spark-Notebook, iPython/Jupyter Notebook, Tableau, H2O Flow, Tachyon,
@fluxcapacitor / No release yet / (3)
ggplot2.SparkR
Rebooting ggplot2 for scalable big data visualization
@SKKU-SKT / No release yet / (3)
spark-compaction
Spark tool to handle file compaction.
@KeithSSmith / Latest release: 1.0.0 (2016-04-22) / Apache-2.0 / (0)
DataScienceTools
Some tools for outlier detection, discretisation, correlation analysis and text correction.
@hupi-analytics / No release yet / (3)
spark-df-profiling
Create HTML profiling reports from Apache Spark DataFrames
@julioasotodv / Latest release: 1.1.2 (2016-07-26) / Apache-2.0 / (1)
proxy4sparkui
Configure nginx on the master node as a reverse proxy to the Apache Spark web UI and history server. No more SSH SOCKS proxies or tunnels.
@ekasitk / No release yet / (0)
spark-openstack
OpenStack Spark cluster deployment
@ispras / Latest release: 0.9.5 (2016-11-10) / Apache-2.0 / (0)
parquet-index
Spark SQL index for Parquet tables
@lightcopy / Latest release: 0.5.0-s_2.12 (2020-08-01) / Apache-2.0 / (1)
Optimus
Optimus is the missing library for cleansing (cleaning and much more) and pre-processing data in a distributed fashion with Apache Spark.
@ironmussa / Latest release: 1.1.0 (2017-10-25) / Apache-2.0 / (2)
ParallelTool
Tool designed to speed up Spark applications
@marino-serna / Latest release: 1.0.1-00 (2018-07-22) / Apache-2.0 / (1)
spark-utils
Basic framework utilities to quickly start writing production ready Apache Spark applications
@tupol / Latest release: 0.6.1 (2021-10-18) / MIT / (0)
spark-tools
Executable Apache Spark Tools: Format Converter & SQL Processor
@tupol / Latest release: 0.4.1-s_2.11 (2020-09-12) / MIT / (0)
aut
The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.
@archivesunleashed / Latest release: 0.18.0 (2019-08-21) / Apache-2.0 / (0)
twut
An open-source toolkit for analyzing line-oriented JSON Twitter archives with Apache Spark.
@archivesunleashed / No release yet / (0)