TY - GEN
T1 - Automated Category Tree Construction in E-Commerce
AU - Avron, Uri
AU - Gershtein, Shay
AU - Guy, Ido
AU - Milo, Tova
AU - Novgorodov, Slava
N1 - Publisher Copyright:
© 2022 ACM.
PY - 2022/6/10
Y1 - 2022/6/10
N2 - Category trees play a central role in many web applications, enabling browsing-style information access. Building trees that reflect users' dynamic interests is, however, a challenging task, carried out by taxonomists. This manual construction leads to outdated trees as it is hard to keep track of market trends. While taxonomists can identify candidate categories, i.e. sets of items with a shared label, most such categories cannot simultaneously exist in the tree, as platforms set a bound on the number of categories an item may belong to. To address this setting, we formalize the problem of constructing a tree where the categories are maximally similar to desirable candidate categories while satisfying combinatorial requirements and provide a model that captures practical considerations. In previous work, we proved inapproximability bounds for this model. Nevertheless, in this work we provide two heuristic algorithms, and demonstrate their effectiveness over datasets from real-life e-commerce platforms, far exceeding the worst-case bounds. We also identify a natural special case, for which we devise a solution with tight approximation guarantees. Moreover, we explain how our approach facilitates continual updates, maintaining consistency with an existing tree. Finally, we propose to include in the input candidate categories derived from result sets to recent search queries to reflect dynamic user interests and trends.
AB - Category trees play a central role in many web applications, enabling browsing-style information access. Building trees that reflect users' dynamic interests is, however, a challenging task, carried out by taxonomists. This manual construction leads to outdated trees as it is hard to keep track of market trends. While taxonomists can identify candidate categories, i.e. sets of items with a shared label, most such categories cannot simultaneously exist in the tree, as platforms set a bound on the number of categories an item may belong to. To address this setting, we formalize the problem of constructing a tree where the categories are maximally similar to desirable candidate categories while satisfying combinatorial requirements and provide a model that captures practical considerations. In previous work, we proved inapproximability bounds for this model. Nevertheless, in this work we provide two heuristic algorithms, and demonstrate their effectiveness over datasets from real-life e-commerce platforms, far exceeding the worst-case bounds. We also identify a natural special case, for which we devise a solution with tight approximation guarantees. Moreover, we explain how our approach facilitates continual updates, maintaining consistency with an existing tree. Finally, we propose to include in the input candidate categories derived from result sets to recent search queries to reflect dynamic user interests and trends.
KW - category tree construction
KW - e-commerce
UR - http://www.scopus.com/inward/record.url?scp=85132775132&partnerID=8YFLogxK
U2 - 10.1145/3514221.3526124
DO - 10.1145/3514221.3526124
M3 - Conference contribution
AN - SCOPUS:85132775132
T3 - Proceedings of the ACM SIGMOD International Conference on Management of Data
SP - 1770
EP - 1783
BT - SIGMOD 2022 - Proceedings of the 2022 International Conference on Management of Data
PB - Association for Computing Machinery
T2 - 2022 ACM SIGMOD International Conference on the Management of Data, SIGMOD 2022
Y2 - 12 June 2022 through 17 June 2022
ER -