cookie-datasets

Popular ML Datasets for Spark ML (MNIST, IRIS, CIFAR)

@cookieai

Provides DataFrame readers for popular datasets used by the ML community. The current version supports MNIST, IRIS, and CIFAR.


  • machine learning
  • data source

How to

Include this package in your Spark Applications using:

spark-shell, pyspark, or spark-submit

> $SPARK_HOME/bin/spark-shell --packages ai.cookie:cookie-datasets_2.10:0.1.0


In your sbt build file, add:

libraryDependencies += "ai.cookie" % "cookie-datasets_2.10" % "0.1.0"


In your pom.xml, add:
  <!-- list of dependencies -->
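
The dependency entry itself does not appear above. As a sketch, assuming the artifact resolves from a repository your build already uses (the sbt line above declares no extra resolver), the same coordinates would map to a Maven dependency roughly like this:

```xml
<dependencies>
  <!-- Assumed mapping of the sbt coordinates above: group, artifact, version -->
  <dependency>
    <groupId>ai.cookie</groupId>
    <artifactId>cookie-datasets_2.10</artifactId>
    <version>0.1.0</version>
  </dependency>
</dependencies>
```

The `_2.10` suffix is part of the artifact ID because Maven, unlike sbt's `%%` operator, does not append the Scala version automatically.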


Version: 0.1.0 (commit 74dda5) / Date: 2015-12-22 / License: Apache-2.0 / Scala version: 2.10

Spark Scala/Java API compatibility: - 14% , - 60% , - 53% , - 72% , - 59% , - 100%