TY - GEN
T1 - Privacy via Maintaining Small Similitude Data for Big Data Statistical Representation
AU - Derbeko, Philip
AU - Dolev, Shlomi
AU - Gudes, Ehud
N1 - Funding Information:
Acknowledgement. The research was partially supported by the Rita Altura Trust Chair in Computer Sciences; the Lynne and William Frankel Center for Computer Science; the Ministry of Foreign Affairs, Italy; the grant from the Ministry of Science, Technology and Space, Israel, and the National Science Council (NSC) of Taiwan; the Ministry of Science, Technology and Space, Infrastructure Research in the Field of Advanced Computing and Cyber Security; and the Israel National Cyber Bureau.
Publisher Copyright:
© 2018, Springer International Publishing AG, part of Springer Nature.
PY - 2018/1/1
Y1 - 2018/1/1
N2 - Despite its attractiveness, Big Data oftentimes is hard, slow and expensive to handle due to its size. Moreover, as the amount of collected data grows, individual privacy raises more and more concerns: “what do they know about me?” Different algorithms were suggested to enable privacy-preserving data release with the current de-facto standard differential privacy. However, the processing time of keeping the data private is inhibiting and currently not practical for every day use. Combined with the continuously growing data collection, the solution is not seen on a horizon. In this research, we suggest replacing the Big Data with a much smaller similitude model. The model “resembles” the data with respect to a class of query. The user defines the maximum acceptable error and privacy requirements ahead of the query execution. Those requirements define the minimal size of the similitude model. The suggested method is demonstrated by using a wavelet transform and then by pruning the tree according to both the data reduction and the privacy requirements. We propose methods of combining the noise required for privacy preservation with noise of similitude model, that allow us to decrease the amount of added noise thus, improving the utilization of the method.
AB - Despite its attractiveness, Big Data oftentimes is hard, slow and expensive to handle due to its size. Moreover, as the amount of collected data grows, individual privacy raises more and more concerns: “what do they know about me?” Different algorithms were suggested to enable privacy-preserving data release with the current de-facto standard differential privacy. However, the processing time of keeping the data private is inhibiting and currently not practical for every day use. Combined with the continuously growing data collection, the solution is not seen on a horizon. In this research, we suggest replacing the Big Data with a much smaller similitude model. The model “resembles” the data with respect to a class of query. The user defines the maximum acceptable error and privacy requirements ahead of the query execution. Those requirements define the minimal size of the similitude model. The suggested method is demonstrated by using a wavelet transform and then by pruning the tree according to both the data reduction and the privacy requirements. We propose methods of combining the noise required for privacy preservation with noise of similitude model, that allow us to decrease the amount of added noise thus, improving the utilization of the method.
KW - Big Data
KW - Differential privacy
KW - Privacy
KW - Similitude model
KW - Wavelets
UR - http://www.scopus.com/inward/record.url?scp=85049008106&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-94147-9_9
DO - 10.1007/978-3-319-94147-9_9
M3 - Conference contribution
AN - SCOPUS:85049008106
SN - 9783319941462
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 105
EP - 119
BT - Cyber Security Cryptography and Machine Learning - Second International Symposium, CSCML 2018, Proceedings
A2 - Dinur, Itai
A2 - Dolev, Shlomi
A2 - Lodha, Sachin
PB - Springer Verlag
T2 - 2nd International Symposium on Cyber Security Cryptography and Machine Learning, CSCML 2018
Y2 - 21 June 2018 through 22 June 2018
ER -