TY - JOUR
T1 - Selective cluster presentation on the search results page
AU - Levi, Or
AU - Guy, Ido
AU - Raiber, Fiana
AU - Kurland, Oren
N1 - Funding Information:
*The second part of the manuscript, specifically the selective cluster retrieval approach and its evaluation, was partially reported in Levi et al. (2016). †Part of this research was conducted while the author was at the Technion – Israel Institute of Technology. ‡Part of this research was conducted while the author was working at Yahoo Research. §Part of this research was conducted while the author was at the Technion – Israel Institute of Technology. This paper is based upon work supported in part by the Israel Science Foundation under grant no. 433/12, by the Technion-Microsoft Electronic Commerce Research Center, and by a Yahoo faculty research and engagement award. Authors’ addresses: O. Levi, Biltstraat 121, Utrecht, Netherlands; email: olevi@ebay.com; I. Guy, P.O. Box 653 Beer-Sheva 8410501, Israel; email: idoguy@acm.org; F. Raiber, Matam Tower 3, 7th Floor, Haifa 31905, Israel; email: fiana@oath.com; O. Kurland, Technion City, Haifa 3200003, Israel; email: kurland@ie.technion.ac.il.
Publisher Copyright:
© 2018 ACM.
PY - 2018/1/1
Y1 - 2018/1/1
AB - Web search engines present, for some queries, a cluster of results from the same specialized domain (“vertical”) on the search results page (SERP). We introduce a comprehensive analysis of the presentation of such clusters from seven different verticals based on the logs of a commercial Web search engine. This analysis reveals several unique characteristics (such as size, rank, and clicks) of result clusters from community question-and-answer websites. The study of properties of this result cluster, specifically as part of the SERP, has received little attention in previous work. Our analysis also motivates the pursuit of a long-standing challenge in ad hoc retrieval, namely, selective cluster retrieval. In our setting, the specific challenge is to select for presentation the documents most highly ranked either by a cluster-based approach (those in the top-retrieved cluster) or by a document-based approach. We address this classification task by representing queries with features based on those utilized for ranking the clusters, query-performance predictors, and properties of the document-clustering structure. Empirical evaluation performed with TREC data shows that our approach outperforms a recently proposed state-of-the-art cluster-based document-retrieval method as well as state-of-the-art document-retrieval methods that do not account for inter-document similarities.
KW - Aggregated search
KW - Cluster-based retrieval
UR - http://www.scopus.com/inward/record.url?scp=85042867483&partnerID=8YFLogxK
U2 - 10.1145/3158672
DO - 10.1145/3158672
M3 - Article
AN - SCOPUS:85042867483
SN - 1046-8188
VL - 36
JO - ACM Transactions on Information Systems
JF - ACM Transactions on Information Systems
IS - 3
M1 - 28
ER -