A community index of third-party packages for Apache Spark.

Showing packages 1-27 of 27 for search tags:"Tools"

Connect Spark to HBase for reading and writing data with ease

@nerdammer / Latest release: 1.0.3 (2016-04-20) / Apache-2.0 / (2)

  • 1|streaming
  • 1|hbase
  • 1|library


Base classes to use when writing tests with Spark

@holdenk / Latest release: 1.5.2_0.3.3 (2016-04-19) / Apache-2.0 / (8)

  • 3|testing
  • 1|streaming
  • 1|tools


sbt plugin for Spark packages

@databricks / Latest release: 0.2.4 (2016-07-15) / Apache-2.0 / (3)

  • 1|tools
  • 1|sbt


A command-line tool for Spark packages

@databricks / Latest release: 0.3.0 (2015-03-17) / Apache-2.0 / (1)

  • 1|tools


Docker container for a Spark standalone cluster.

@epahomov / No release yet / (0)

  • 1|tools
  • 1|deployment


Maven archetype used to bootstrap a Spark Scala project

@mbonaci / Latest release: 0.9 (2015-04-24) / MIT / (0)

  • 1|Maven
  • 1|tools
  • 1|scala


sbt plugin for spark-ec2

@pishen / No release yet / (0)

  • 1|tools
  • 1|sbt
  • 1|deployment


Splittable SAS (.sas7bdat) Input Format for Hadoop and Spark SQL

@saurfang / Latest release: 1.1.5-s_2.11 (2016-11-20) / GPL-3.0 / (1)

  • 1|sas
  • 1|tools
  • 1|sql


Distributed DataFrame: Productivity = Power x Simplicity For Scientists & Engineers, on any Data/Compute Engine

@ddf-project / No release yet / (11)

  • 3|API
  • 2|tools
  • 2|machine learning


Deploy a Spark cluster the easy way.

@pishen / Latest release: 0.5.1 (2015-06-25) / Apache-2.0 / (0)

  • 1|tools
  • 1|sbt
  • 1|deployment


sbt plugin for spark-submit

@saurfang / No release yet / (0)

  • 1|tools
  • 1|sbt
  • 1|deployment


Docker-based, End-to-End, Real-time, Advanced Analytics Big Data Reference Pipeline using Spark, Spark SQL, Spark Streaming, ML, MLlib, GraphX, Kafka, Cassandra, Redis, Apache Zeppelin, Spark-Notebook, iPython/Jupyter Notebook, Tableau, H2O Flow, Tachyon,

@fluxcapacitor / No release yet / (3)

  • 2|streaming
  • 2|kafka
  • 1|machine learning


Solr Dictionary Annotator (Microservice for Spark)

@elsevierlabs-os / No release yet / (0)

  • 1|application
  • 1|tools


Create composable data processing pipelines in Spark, and execute them on a cluster using simple Scala code

@springnz / No release yet / (0)

  • 1|application
  • 1|testing
  • 1|tools


SparkR extension for dplyr

@saurfang / No release yet / (0)

  • 1|sparkr
  • 1|r
  • 1|tools


ScalaCheck for Spark

@juanrh / No release yet / (0)

  • 1|streaming
  • 1|testing
  • 1|tools


A command-line tool for launching Apache Spark clusters.

@nchammas / No release yet / (1)

  • 1|tools
  • 1|ec2
  • 1|deployment


Rebooting ggplot2 for scalable big data visualization

@SKKU-SKT / No release yet / (3)

  • 3|visualization
  • 2|r
  • 1|tools


Spark tool to handle file compaction.

@KeithSSmith / Latest release: 1.0.0 (2016-04-22) / Apache-2.0 / (0)

  • 1|tools


Provides GPU awareness to Spark

@ibmsoe / No release yet / (1)

  • 2|GPU
  • 1|spark
  • 1|tools


Tools for outlier detection, discretisation, correlation analysis, and text correction.

@hupi-analytics / No release yet / (3)

  • 1|spark
  • 1|tools
  • 1|scala


Create HTML profiling reports from Apache Spark DataFrames

@julioasotodv / Latest release: 1.1.2 (2016-07-26) / Apache-2.0 / (1)

  • 1|tools
  • 1|pyspark


Baryon is a library for building Spark Streaming applications that consume data from Kafka.

@groupon / Latest release: 1.0 (2016-07-29) / BSD 3-Clause / (0)

  • 1|streaming
  • 1|tools
  • 1|library


Mezzanine is a library built on Spark Streaming used to consume data from Kafka and store it into Hadoop.

@groupon / Latest release: 1.0 (2016-07-29) / BSD 3-Clause / (0)

  • 1|streaming
  • 1|tools
  • 1|library


Configure nginx on the master node as a reverse proxy to the Apache Spark web UI and history server. No more SSH SOCKS proxies or tunnels.

@ekasitk / No release yet / (0)

  • 1|tool
  • 1|deployment


OpenStack Spark cluster deployment

@ispras / Latest release: 0.9.5 (2016-11-10) / Apache-2.0 / (0)

  • 1|tools
  • 1|deployment


Spark SQL index for Parquet tables

@lightcopy / Latest release: 0.2.1-s_2.11 (2017-02-06) / Apache-2.0 / (1)

  • 1|sql
  • 1|tools
  • 1|parquet