Executable Apache Spark Tools: Format Converter & SQL Processor

This project contains basic runnable tools that help with common tasks in a Spark-based project.

The main tools available:

FormatConverter: converts files from any supported format into a different format, with optional partitioning of the output.
SimpleSqlProcessor: applies a given SQL query to the input files, which are mapped to tables.
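
As a rough illustration of the kind of work these two tools automate, here is a minimal sketch written against the plain Spark DataFrame API. It is not the implementation of FormatConverter or SimpleSqlProcessor and does not use their configuration; the object name, sample data, temporary paths and the `people` table are all hypothetical.

```scala
import java.nio.file.Files
import org.apache.spark.sql.SparkSession

// Illustration only: plain Spark API calls, not the spark-tools API.
object SparkToolsSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("spark-tools-sketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical working directory and paths.
    val workDir = Files.createTempDirectory("spark-tools-sketch").toString
    val csvPath = s"$workDir/people-csv"
    val parquetPath = s"$workDir/people-parquet"

    // Create a small CSV input so the sketch is self-contained.
    Seq(("Ann", "NL"), ("Bob", "RO"), ("Cid", "NL"))
      .toDF("name", "country")
      .write.option("header", "true").csv(csvPath)

    // Format conversion: read CSV, write Parquet partitioned by a column.
    val people = spark.read.format("csv").option("header", "true").load(csvPath)
    people.write.partitionBy("country").format("parquet").save(parquetPath)

    // SQL processing: map the converted files to a table and query it.
    spark.read.parquet(parquetPath).createOrReplaceTempView("people")
    spark.sql("SELECT country, COUNT(*) AS cnt FROM people GROUP BY country").show()

    spark.stop()
  }
}
```

The sketch hard-codes formats, paths and the query; the tools in this package presumably take such settings as input instead, so treat it only as a picture of the underlying Spark operations.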

How to

Include this package in your Spark Applications using:

spark-shell, pyspark, or spark-submit

> $SPARK_HOME/bin/spark-shell --packages org.tupol:spark-tools_2.11:0.2.1


In your sbt build file, add:

libraryDependencies += "org.tupol" % "spark-tools_2.11" % "0.2.1"


In your pom.xml, add:
  <!-- list of dependencies -->
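
The Maven snippet is cut off after the comment above. Assuming the same coordinates as the sbt line (group `org.tupol`, artifact `spark-tools_2.11`, version `0.2.1`), and assuming the artifact resolves from a repository already configured in your build, the dependency entry would likely look like this:

```xml
<dependencies>
  <!-- Assumed from the sbt coordinates; verify against the published artifact -->
  <dependency>
    <groupId>org.tupol</groupId>
    <artifactId>spark-tools_2.11</artifactId>
    <version>0.2.1</version>
  </dependency>
</dependencies>
```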


Version: 0.2.1 (commit a5a15c) / Date: 2019-04-10 / License: MIT / Scala version: 2.11