TY - GEN
T1 - ShapGraph
T2 - 2022 ACM SIGMOD International Conference on the Management of Data, SIGMOD 2022
AU - Davidson, Susan
AU - Deutch, Daniel
AU - Frost, Nave
AU - Kimelfeld, Benny
AU - Koren, Omer
AU - Monet, Mikaël
N1 - Funding Information:
This research has been partially funded by the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (Grant agreement No. 804302), the Israel Science Foundation, the Binational US-Israel Science Foundation (BSF), and the Mortimer and Raymond Sackler Institute of Advanced Studies. The work of Benny Kimelfeld was supported by the Israel Science Foundation (ISF), Grant 768/19, and the German Research Foundation (DFG) Project 412400621 (DIP program).
Publisher Copyright:
© 2022 ACM.
PY - 2022/6/10
Y1 - 2022/6/10
N2 - Explaining query results is an essential tool for enhancing the transparency and quality of data processing, and has been extensively studied in recent years. In particular, Data Provenance (the tracking of transformations that data undergoes in query evaluation) has been shown to be a key component of explanations. A hurdle that remains is that data provenance itself is often too large and complex to be presented in its entirety. To that end, we propose to leverage novel advancements on quantifying and computing the contributions of individual input tuples to query answers, based on the game-theoretic notion of the Shapley value. Our proposed prototype solution, called ShapGraph, combines the global view of explanations through provenance graphs with a local quantification of contributions through Shapley values. The graphical interface allows users to switch between and combine these two views to obtain a deeper understanding of the most influential parts of the database and how they interact to yield query answers.
KW - Shapley value
KW - data provenance
UR - http://www.scopus.com/inward/record.url?scp=85132780236&partnerID=8YFLogxK
U2 - 10.1145/3514221.3520172
DO - 10.1145/3514221.3520172
M3 - Conference contribution
AN - SCOPUS:85132780236
T3 - Proceedings of the ACM SIGMOD International Conference on Management of Data
SP - 2373
EP - 2376
BT - SIGMOD 2022 - Proceedings of the 2022 International Conference on Management of Data
PB - Association for Computing Machinery
Y2 - 12 June 2022 through 17 June 2022
ER -