spark-privacy-preserver (homepage)

A privacy-preserving library for Apache Spark

@ThaminduR

This module provides a simple tool for anonymizing a dataset using PySpark. Given a spark.sql.dataframe with relevant metadata, the library generates an anonymized spark.sql.dataframe. It provides the following privacy-preserving techniques for dataset anonymization:

- K-Anonymity (Mondrian and Clustering Based)
- L-Diversity (Mondrian Clustering Based)
- T-Closeness (Mondrian Clustering Based)
- Differential Privacy
- Single User Anonymization

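To make the first two guarantees in the list concrete, here is a minimal sketch in plain Python of what it means for an already-anonymized table to satisfy K-Anonymity and L-Diversity. It is not this library's API: the function names, the column names, and the sample rows are all hypothetical and chosen only for illustration. The sketch only checks the properties; it does not perform Mondrian or clustering-based generalization.

```python
from collections import defaultdict


def group_by_quasi_identifiers(rows, quasi_identifiers):
    """Group rows into equivalence classes that share the same quasi-identifier values."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[col] for col in quasi_identifiers)
        groups[key].append(row)
    return groups


def is_k_anonymous(rows, quasi_identifiers, k):
    """True if every equivalence class contains at least k rows."""
    groups = group_by_quasi_identifiers(rows, quasi_identifiers)
    return all(len(group) >= k for group in groups.values())


def is_l_diverse(rows, quasi_identifiers, sensitive, l):
    """True if every equivalence class has at least l distinct sensitive values."""
    groups = group_by_quasi_identifiers(rows, quasi_identifiers)
    return all(
        len({row[sensitive] for row in group}) >= l
        for group in groups.values()
    )


# Hypothetical, already-generalized sample data (illustration only).
rows = [
    {"age": "20-29", "zip": "130**", "disease": "flu"},
    {"age": "20-29", "zip": "130**", "disease": "cold"},
    {"age": "30-39", "zip": "148**", "disease": "flu"},
    {"age": "30-39", "zip": "148**", "disease": "flu"},
]

k2 = is_k_anonymous(rows, ["age", "zip"], k=2)
l2 = is_l_diverse(rows, ["age", "zip"], "disease", l=2)
```

In this sample each combination of age and zip appears twice, so the table is 2-anonymous, but the second group contains only one distinct disease value, so it is not 2-diverse. That gap is the reason L-Diversity is listed as a separate technique on top of K-Anonymity.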
Use "pip install spark-privacy-preserver" to install the package.
PyPI: https://pypi.org/project/spark-privacy-preserver/

Collaborators:
Sasindu Dilshara - https://github.com/SasinduDilshara
Ruchin Amaratunga - https://github.com/ruchinamaratunga
Ahrooran Ravindran - https://github.com/ahrooran-r


Tags

  • anonymization
  • k-anonymity
  • l-diversity
  • t-closeness
  • differential privacy

How to

This package doesn't have any releases published in the Spark Packages repo, or with maven coordinates supplied. You may have to build this package from source, or it may simply be a script. To use this Spark Package, please follow the instructions in the README.

Releases

No releases yet.