spark-scdengine
Capture SCD (Slowly Changing Dimension) on Spark
@dhmodi
Plug-n-Play module to implement SCD (Type-I & Type-II) on Spark.
Pre-requisites:
Spark 1.6.0
Features:
1. At least 10x less time to implement (compared with an Informatica BDE implementation)
2. Faster performance (compared with Hive and Tez queries)
3. Plug-n-Play application with simple configuration (just provide a few details about the source and target tables)
4. Automatic datatype conversions (limited)
5. Support for Hadoop-native SQL interfaces (Hive, Impala, HAWQ, etc.), irrespective of the underlying file formats
6. Support for importing data from traditional RDBMSs
7. Works with any Hadoop distribution (Cloudera, Hortonworks, MapR, IBM BigInsights, etc.)
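For readers unfamiliar with the two SCD types named above, the difference can be sketched in plain Python on in-memory rows. This is only an illustration of the general technique, not this package's implementation or configuration: the function names, the `valid_from` / `valid_to` / `is_current` columns, and the `9999-12-31` open-end date are all assumptions chosen for the example.

```python
from datetime import date

# Assumed convention: a far-future date marks a row that is still current.
OPEN_END = date(9999, 12, 31)


def apply_scd1(target, source, key, tracked):
    """Type I: overwrite tracked attributes in place; no history is kept."""
    by_key = {row[key]: dict(row) for row in target}
    for src in source:
        k = src[key]
        if k in by_key:
            # Existing key: replace the tracked values with the new ones.
            by_key[k].update({c: src[c] for c in tracked})
        else:
            # New key: insert a fresh row.
            by_key[k] = {key: k, **{c: src[c] for c in tracked}}
    return list(by_key.values())


def apply_scd2(target, source, key, tracked, load_date):
    """Type II: expire the current row on change and append a new version."""
    result = [dict(row) for row in target]  # copy so the input is not mutated
    current = {row[key]: row for row in result if row["is_current"]}
    for src in source:
        cur = current.get(src[key])
        if cur is not None and all(cur[c] == src[c] for c in tracked):
            continue  # nothing changed, keep the current version as-is
        if cur is not None:
            # Close the old version instead of overwriting it.
            cur["valid_to"] = load_date
            cur["is_current"] = False
        new_row = {
            key: src[key],
            **{c: src[c] for c in tracked},
            "valid_from": load_date,
            "valid_to": OPEN_END,
            "is_current": True,
        }
        result.append(new_row)
        current[src[key]] = new_row
    return result


# Hypothetical sample data: one existing customer moves, one is new.
target = [{"id": 1, "city": "Pune", "valid_from": date(2016, 1, 1),
           "valid_to": OPEN_END, "is_current": True}]
source = [{"id": 1, "city": "Mumbai"}, {"id": 2, "city": "Delhi"}]

history = apply_scd2(target, source, "id", ["city"], date(2016, 6, 1))
for row in history:
    print(row)
```

With Type I the target holds only the latest value per key, while Type II keeps every earlier version with its validity range. On Spark the same logic would typically be expressed as joins between the source and target tables rather than Python loops.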
How to
This package has no releases published in the Spark Packages repo and no Maven coordinates supplied. You may have to build it from source, or it may simply be a script. To use this Spark Package, please follow the instructions in the README.
Releases
No releases yet.