Efficient and Privacy preserving Approximation of Distributed Statistical Queries

Philip Derbeko, Shlomi Dolev, Ehud Gudes, Jeffrey D. Ullman

Research output: Contribution to journal › Article › peer-review

Abstract

In recent years, an increasing amount of data has been collected in separate and often non-cooperating databases. The problem of privacy-preserving distributed computation over separate databases, and the related issue of private data release, have been intensively investigated. However, despite considerable progress, computational complexity, and consequently the performance of the computations as data size grows, remains a limiting factor in real-world deployments, especially for privacy-preserving computations. In this paper, we suggest sampling as a method of improving computational performance. Sampling was a topic of extensive research in the past and has recently received renewed interest. We provide a sampling method targeted at separate, non-collaborating, vertically partitioned datasets. The method is exemplified and tested on the approximation of a set intersection, both with and without a privacy-preserving mechanism. A bound on the error as a function of the sample size is analyzed, and a heuristic algorithm is suggested to further improve performance. The algorithms were implemented, and experimental results confirm the validity of the approach.
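
To make the idea of sampling-based approximation concrete, the following is a minimal sketch and not the authors' protocol. It assumes two parties hold sets of record identifiers over a shared key space, that both apply the same salted hash to decide which identifiers fall into the sample, and that the quantity of interest is the size of the intersection. The optional Laplace noise is a standard differential-privacy mechanism swapped in for illustration. The names `in_sample`, `estimate_intersection`, `rate`, `epsilon`, and the salt value are all hypothetical.

```python
import hashlib
import random


def in_sample(item_id: str, rate: float, salt: str = "shared-salt") -> bool:
    """Decide sample membership from a salted hash, so that both parties
    select the same identifiers without exchanging them (assumption)."""
    digest = hashlib.sha256((salt + item_id).encode("utf-8")).digest()
    # Compare the first 64 bits of the hash against the sampling threshold.
    return int.from_bytes(digest[:8], "big") < rate * 2**64


def estimate_intersection(ids_a, ids_b, rate, epsilon=None, rng=None):
    """Estimate |A ∩ B| from coordinated samples of A and B.

    If epsilon is given, Laplace noise with scale 1/epsilon is added to the
    sampled count before scaling (illustrative choice, not from the paper).
    """
    if not 0 < rate <= 1:
        raise ValueError("rate must be in (0, 1]")
    sample_a = {x for x in ids_a if in_sample(x, rate)}
    sample_b = {x for x in ids_b if in_sample(x, rate)}
    count = float(len(sample_a & sample_b))
    if epsilon is not None:
        rng = rng or random.Random()
        # The difference of two exponentials with rate epsilon is
        # Laplace-distributed with scale 1/epsilon.
        count += rng.expovariate(epsilon) - rng.expovariate(epsilon)
    # Each common identifier is sampled with probability `rate`.
    return count / rate


if __name__ == "__main__":
    a = {f"id{i}" for i in range(1000)}
    b = {f"id{i}" for i in range(500, 1500)}
    print(estimate_intersection(a, b, rate=0.5))
    print(estimate_intersection(a, b, rate=0.5, epsilon=1.0))
```

In this sketch both sets live in one process for simplicity; in a setting with non-collaborating parties, the intersection of the two samples would itself have to be computed by a suitable private protocol. A smaller `rate` reduces the work but increases the variance of the estimate.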

Original language: English
Journal: IEEE Transactions on Big Data
State: Accepted/In press - 1 Jan 2021

Keywords

  • Approximate Computations
  • Approximation algorithms
  • Differential Privacy
  • Distributed Computations
  • Distributed databases
  • Estimation
  • Heuristic algorithms
  • Law enforcement
  • Protocols
