TY - GEN
T1 - Constrained anonymization of production data
T2 - 7th VLDB Workshop on Secure Data Management, SDM 2010
AU - Yahalom, Ran
AU - Shmueli, Erez
AU - Zrihen, Tomer
PY - 2010/12/13
Y1 - 2010/12/13
N2 - The use of production data that contains sensitive information in application testing requires that the production data be anonymized first. Anonymizing production data is difficult because the data is usually subject to constraints that must also be satisfied in the anonymized data. We propose a novel approach to anonymize constrained production data based on the concept of constraint satisfaction problems. Due to the generality of the constraint satisfaction framework, our approach can support a wide variety of mandatory integrity constraints as well as constraints which ensure the similarity of the anonymized data to the production data. Our approach decomposes the constrained anonymization problem into independent sub-problems which can be represented and solved as constraint satisfaction problems (CSPs). Since production databases may contain many records that are associated by vertical constraints, the resulting CSPs may become very large. Such CSPs are further decomposed into dependent sub-problems that are solved iteratively by applying local modifications to the production data. Simulations on synthetic production databases demonstrate the feasibility of our method.
AB - The use of production data that contains sensitive information in application testing requires that the production data be anonymized first. Anonymizing production data is difficult because the data is usually subject to constraints that must also be satisfied in the anonymized data. We propose a novel approach to anonymize constrained production data based on the concept of constraint satisfaction problems. Due to the generality of the constraint satisfaction framework, our approach can support a wide variety of mandatory integrity constraints as well as constraints which ensure the similarity of the anonymized data to the production data. Our approach decomposes the constrained anonymization problem into independent sub-problems which can be represented and solved as constraint satisfaction problems (CSPs). Since production databases may contain many records that are associated by vertical constraints, the resulting CSPs may become very large. Such CSPs are further decomposed into dependent sub-problems that are solved iteratively by applying local modifications to the production data. Simulations on synthetic production databases demonstrate the feasibility of our method.
UR - http://www.scopus.com/inward/record.url?scp=78649809556&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-15546-8_4
DO - 10.1007/978-3-642-15546-8_4
M3 - Conference contribution
AN - SCOPUS:78649809556
SN - 3642155456
SN - 9783642155451
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 41
EP - 53
BT - Secure Data Management - 7th VLDB Workshop, SDM 2010, Proceedings
Y2 - 17 September 2010
ER -