mezzanine (homepage)
Mezzanine is a library built on Spark Streaming that consumes data from Kafka and stores it in Hadoop.
@groupon
This library was built to replace the batch-based model of Kafka consumption, in which jobs are launched periodically to consume and persist large amounts of data at a time. Mezzanine contains logic for transforming, partitioning, and compacting the consumed Kafka data before persisting it to HDFS. It was built to use Baryon for the Kafka consumption, but Mezzanine can still be used as a library with other methods of consuming from Kafka.
How to
Include this package in your Spark Applications using:
spark-shell, pyspark, or spark-submit
> $SPARK_HOME/bin/spark-shell --packages com.groupon.dse:mezzanine:1.0
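The same `--packages` coordinate should also work with the other two launchers named above. The following is a sketch only: the application class `com.example.MyStreamingApp` and the jar `my-streaming-app.jar` are hypothetical placeholders, not part of Mezzanine.

```shell
# Submit a packaged application with Mezzanine on the classpath.
# The class name and jar below are hypothetical; substitute your own.
$SPARK_HOME/bin/spark-submit \
  --packages com.groupon.dse:mezzanine:1.0 \
  --class com.example.MyStreamingApp \
  my-streaming-app.jar

# Start a PySpark shell with the package resolved and added to the JVM classpath.
$SPARK_HOME/bin/pyspark --packages com.groupon.dse:mezzanine:1.0
```

In each case Spark resolves the coordinate and its transitive dependencies at launch, so the library does not need to be bundled into the application jar.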
sbt
In your sbt build file, add:
libraryDependencies += "com.groupon.dse" % "mezzanine" % "1.0"
Maven
In your pom.xml, add:
<dependencies>
  <!-- list of dependencies -->
  <dependency>
    <groupId>com.groupon.dse</groupId>
    <artifactId>mezzanine</artifactId>
    <version>1.0</version>
  </dependency>
</dependencies>
Releases
Version: 1.0 ( 3b989b | zip | jar ) / Date: 2016-07-29 / License: BSD 3-Clause / Scala version: 2.10