Privacy via Maintaining Small Similitude Data for Big Data Statistical Representation

Philip Derbeko, Shlomi Dolev, Ehud Gudes

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

4 Scopus citations

Abstract

Despite its attractiveness, Big Data oftentimes is hard, slow and expensive to handle due to its size. Moreover, as the amount of collected data grows, individual privacy raises more and more concerns: “what do they know about me?” Different algorithms were suggested to enable privacy-preserving data release with the current de-facto standard differential privacy. However, the processing time of keeping the data private is inhibiting and currently not practical for every day use. Combined with the continuously growing data collection, the solution is not seen on a horizon. In this research, we suggest replacing the Big Data with a much smaller similitude model. The model “resembles” the data with respect to a class of query. The user defines the maximum acceptable error and privacy requirements ahead of the query execution. Those requirements define the minimal size of the similitude model. The suggested method is demonstrated by using a wavelet transform and then by pruning the tree according to both the data reduction and the privacy requirements. We propose methods of combining the noise required for privacy preservation with noise of similitude model, that allow us to decrease the amount of added noise thus, improving the utilization of the method.

Original languageEnglish
Title of host publicationCyber Security Cryptography and Machine Learning - Second International Symposium, CSCML 2018, Proceedings
EditorsItai Dinur, Shlomi Dolev, Sachin Lodha
PublisherSpringer Verlag
Pages105-119
Number of pages15
ISBN (Print)9783319941462
DOIs
StatePublished - 1 Jan 2018
Event2nd International Symposium on Cyber Security Cryptography and Machine Learning, CSCML 2018 - Beer-Sheva, Israel
Duration: 21 Jun 201822 Jun 2018

Publication series

NameLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume10879 LNCS
ISSN (Print)0302-9743
ISSN (Electronic)1611-3349

Conference

Conference2nd International Symposium on Cyber Security Cryptography and Machine Learning, CSCML 2018
Country/TerritoryIsrael
CityBeer-Sheva
Period21/06/1822/06/18

Keywords

  • Big Data
  • Differential privacy
  • Privacy
  • Similitude model
  • Wavelets

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science

Fingerprint

Dive into the research topics of 'Privacy via Maintaining Small Similitude Data for Big Data Statistical Representation'. Together they form a unique fingerprint.

Cite this