spark-hugefs (homepage)

Query deeply nested and huge directories from Spark efficiently

@salva

Tags

  • file system
  • glob
  • find
  • list
  • traverse
  • indexing

How to

Include this package in your Spark Applications using:

spark-shell, pyspark, or spark-submit

> $SPARK_HOME/bin/spark-shell --packages com.github.salva:spark-hugefs_2.11:0.10
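
The heading above also names pyspark and spark-submit, which accept the same `--packages` flag. A minimal sketch of those two invocations follows, reusing the coordinate from the spark-shell line. The class name `com.example.MyApp` and the file `my-app.jar` are hypothetical placeholders, not part of this package.

```shell
# Maven coordinate copied from the spark-shell example above.
PKG="com.github.salva:spark-hugefs_2.11:0.10"

# Interactive Python shell with the package resolved at startup.
"$SPARK_HOME"/bin/pyspark --packages "$PKG"

# Batch submission of a Scala/Java application.
# com.example.MyApp and my-app.jar are hypothetical placeholders
# for your own main class and application jar.
"$SPARK_HOME"/bin/spark-submit \
  --packages "$PKG" \
  --class com.example.MyApp \
  my-app.jar
```

The artifact is built for Scala 2.11, so the Spark distribution used here presumably needs to be a Scala 2.11 build as well.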

sbt

In your sbt build file, add:

libraryDependencies += "com.github.salva" % "spark-hugefs_2.11" % "0.10"

Maven

In your pom.xml, add:
<dependencies>
  <!-- list of dependencies -->
  <dependency>
    <groupId>com.github.salva</groupId>
    <artifactId>spark-hugefs_2.11</artifactId>
    <version>0.10</version>
  </dependency>
</dependencies>

Releases

Version: 0.10 ( fd481e | zip | jar ) / Date: 2020-06-17 / License: Apache-2.0 / Scala version: 2.11

Version: 0.5 ( d93f58 | zip | jar ) / Date: 2020-06-10 / License: Apache-2.0 / Scala version: 2.11