Efficient and Privacy Preserving Approximation of Distributed Statistical Queries

Philip Derbeko, Shlomi Dolev, Ehud Gudes, Jeffrey D. Ullman

Research output: Contribution to journal › Article › peer-review

3 Scopus citations

Abstract

In recent years, an increasing amount of data has been collected in different and often non-cooperating databases. The problem of privacy-preserving, distributed calculations over separate databases and the related issue of private data release have been intensively investigated. However, despite considerable progress, computational complexity, and consequently the performance of the computations, remains a limiting factor in real-world deployments as the size of the data grows, especially in the case of privacy-preserving computations. In this paper, we suggest sampling as a method of improving computational performance. Sampling was a topic of extensive research in the past and has recently received renewed interest. We provide a sampling method targeted at separate, non-collaborating, vertically partitioned datasets. The method is exemplified and tested on an approximation of the set intersection, both with and without a privacy-preserving mechanism. An analysis of the bound on the error as a function of the sample size is presented, and a heuristic algorithm is suggested to further improve performance. The algorithms were implemented, and experimental results confirm the validity of the approach.
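
The abstract does not spell out the sampling method, so the following is only a minimal illustrative sketch of the general idea, not the authors' algorithm. It assumes that the quantity of interest is the size of the intersection, that the two datasets share a record identifier, and that both parties agree on a salt so that each can sample independently yet consistently. The names `in_sample` and `estimate_intersection_size`, the salt value, and the use of Laplace noise as the privacy mechanism are all assumptions made for illustration.

```python
import hashlib
import random


def in_sample(item_id, rate, salt="shared-salt"):
    """Keep an ID with probability ~rate, decided by a salted hash.

    Because the decision depends only on the ID and the shared salt,
    both parties keep or drop the same IDs without coordinating.
    """
    digest = hashlib.sha256((salt + ":" + item_id).encode("utf-8")).digest()
    # Interpret the first 8 bytes as an integer in [0, 2**64).
    value = int.from_bytes(digest[:8], "big")
    return value < rate * 2**64


def estimate_intersection_size(ids_a, ids_b, rate, epsilon=None, rng=None):
    """Estimate |A ∩ B| from hash-coordinated samples of A and B.

    If epsilon is given, Laplace noise with scale 1/epsilon is added to
    the sampled count before scaling (an assumed mechanism, chosen here
    only to illustrate the "with privacy" variant).
    """
    sample_a = {x for x in ids_a if in_sample(x, rate)}
    sample_b = {x for x in ids_b if in_sample(x, rate)}
    count = float(len(sample_a & sample_b))
    if epsilon is not None:
        rng = rng or random.Random()
        # The difference of two Exp(epsilon) draws is Laplace(0, 1/epsilon).
        count += rng.expovariate(epsilon) - rng.expovariate(epsilon)
    # Each common ID survives sampling with probability ~rate, so rescale.
    return count / rate


if __name__ == "__main__":
    a = {str(i) for i in range(0, 1000)}
    b = {str(i) for i in range(500, 1500)}  # true intersection size: 500
    print(estimate_intersection_size(a, b, rate=0.2))
    print(estimate_intersection_size(a, b, rate=0.2, epsilon=1.0))
```

For simplicity the sketch computes both samples in one process; in a real deployment each party would sample locally and the sampled sets would be compared through whatever protocol the parties trust. If the hash is treated as uniform, the sampled count is binomial, so the standard deviation of the estimate is roughly sqrt(n(1 - rate)/rate) for a true intersection of size n, which shows the trade-off between sample size and error.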

Original language: English
Pages (from-to): 1399-1413
Number of pages: 15
Journal: IEEE Transactions on Big Data
Volume: 8
Issue number: 5
State: Published - 1 Oct 2022

Keywords

  • Differential privacy
  • approximate computations
  • distributed computations

ASJC Scopus subject areas

  • Information Systems
  • Information Systems and Management
