gor-spark
Relational query engine that unites SparkSQL and GORpipe into a single declarative query framework.
GORpipe is a tool built on a genomic ordered relational (GOR) architecture. It supports analysis of large sets of genomic and phenotypic tabular data using a declarative query language, executed in a parallel engine. It is efficient across a wide range of use cases, including genome-wide batch analysis, range queries, genomic table joins of variants and segments, filtering, and aggregation. The query language combines ideas from SQL and Unix shell pipe syntax, and supports seekable nested queries, materialized views, and a rich set of commands and functions. For more information, see the paper in Bioinformatics (https://dx.doi.org/10.1093%2Fbioinformatics%2Fbtw199).
How to
Include this package in your Spark Applications using:
spark-shell, pyspark, or spark-submit
> $SPARK_HOME/bin/spark-shell --packages org.gorpipe:gor-spark:3.10.2
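For a PySpark script that creates its own session instead of passing --packages on the command line, the same Maven coordinate can be supplied through Spark's spark.jars.packages setting. The following is a minimal sketch, assuming pyspark is installed and no session is already running (the setting only takes effect when the session is first created). The names GOR_SPARK_COORDINATE, maven_coordinate, and build_session are illustrative and not part of gor-spark.

```python
# Sketch: resolve gor-spark via spark.jars.packages when building a session.
# Helper names here are hypothetical; only the coordinate comes from this page.


def maven_coordinate(group_id: str, artifact_id: str, version: str) -> str:
    """Join the three parts into the groupId:artifactId:version form
    that --packages and spark.jars.packages expect."""
    return f"{group_id}:{artifact_id}:{version}"


GOR_SPARK_COORDINATE = maven_coordinate("org.gorpipe", "gor-spark", "3.10.2")


def build_session(app_name: str = "gor-spark-example"):
    """Create (or reuse) a SparkSession that asks Spark to fetch gor-spark."""
    # Imported lazily so this module can be loaded without pyspark present.
    from pyspark.sql import SparkSession

    return (
        SparkSession.builder
        .appName(app_name)
        .config("spark.jars.packages", GOR_SPARK_COORDINATE)
        .getOrCreate()
    )
```

Calling build_session() is then roughly equivalent to launching pyspark with --packages org.gorpipe:gor-spark:3.10.2, provided the driver can reach a repository that hosts the artifact.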
sbt
In your sbt build file, add:
libraryDependencies += "org.gorpipe" % "gor-spark" % "3.10.2"
Maven
In your pom.xml, add:
<dependencies>
  <!-- list of dependencies -->
  <dependency>
    <groupId>org.gorpipe</groupId>
    <artifactId>gor-spark</artifactId>
    <version>3.10.2</version>
  </dependency>
</dependencies>
Releases
Version: 3.10.2 (47ca7e) / Date: 2021-05-09 / License: Apache-2.0