DDF (homepage)

Distributed DataFrame: Productivity = Power x Simplicity For Scientists & Engineers, on any Data/Compute Engine

@ddf-project

DDF (Distributed DataFrame) provides a powerful yet simple table-like abstraction on top of any underlying big-data engine.
Not to be confused with Spark DataFrames, which are specific to Spark, DDF sits one level higher and abstracts away the differences between data/compute-engine APIs.
DDF has been implemented on top of Apache Spark and is the engine powering Adatao's Big Apps. Other data/compute engines will follow.
1. Treat parallel, distributed data sets like one big table
2. Idiomatic R-inspired user experience 
3. Focus on analytics, not MapReduce 
4. Seamlessly integrate with external ML libraries 
5. Share and collaborate on DDFs via URIs 
6. Use multiple languages (Java, Scala, R, Python)
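
To make the "one big table on any engine" idea concrete, here is a minimal Python sketch of the layering described above. It is not the DDF API: `Engine`, `LocalEngine`, `Table`, `nrow`, and `mean` are hypothetical names invented for illustration, and the toy in-process engine merely stands in for Spark or another backend.

```python
from abc import ABC, abstractmethod


class Engine(ABC):
    """Hypothetical backend interface (an assumption, not part of DDF)."""

    @abstractmethod
    def num_rows(self, partitions):
        ...

    @abstractmethod
    def column_mean(self, partitions, column):
        ...


class LocalEngine(Engine):
    """Toy in-process engine standing in for a real distributed backend."""

    def num_rows(self, partitions):
        return sum(len(p) for p in partitions)

    def column_mean(self, partitions, column):
        # Sum the column across all partitions, then divide by the row count.
        total = sum(row[column] for p in partitions for row in p)
        return total / self.num_rows(partitions)


class Table:
    """Presents many partitions as one table, whatever the engine is."""

    def __init__(self, engine, partitions):
        self._engine = engine
        self._partitions = partitions

    def nrow(self):
        return self._engine.num_rows(self._partitions)

    def mean(self, column):
        return self._engine.column_mean(self._partitions, column)


# Two partitions that the caller treats as a single table.
partitions = [
    [{"delay": 10.0}, {"delay": 20.0}],
    [{"delay": 30.0}],
]
table = Table(LocalEngine(), partitions)
print(table.nrow())
print(table.mean("delay"))
```

In this sketch the caller only touches `Table`; swapping `LocalEngine` for another `Engine` subclass would leave the calling code unchanged, which is the kind of engine independence the description claims for DDF.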


  • 3|API
  • 2|machine learning
  • 2|tools
  • 1|data source

How to

This package doesn't have any releases published in the Spark Packages repo, or with Maven coordinates supplied. You may have to build this package from source, or it may simply be a script. To use this Spark Package, please follow the instructions in the README.


No releases yet.