TY - GEN
T1 - Investigating the Value of Subtitles for Improved Movie Recommendations
AU - Eden, Sagi
AU - Livne, Amit
AU - Sar Shalom, Oren
AU - Shapira, Bracha
AU - Jannach, Dietmar
N1 - Publisher Copyright: © 2022 ACM.
PY - 2022/4/7
Y1 - 2022/4/7
N2 - Collaborative filtering (CF) is a highly effective recommendation approach based on preference patterns observed in user-item interaction data. Since pure collaborative methods can have certain limitations, e.g., when the data is sparse, hybrid approaches are a common solution, as they are able to combine collaborative information with side information (SI) about the items. In this work, we explore the value of subtitle information for the problem of movie recommendation. Unlike previously explored types of movie SI, e.g., titles or synopses, subtitles are not only longer, but also contain unique information that may help us to predict more accurately whether a user will enjoy a movie. To assess the usefulness of subtitles, we propose a technical framework named SubtitleCF that combines user and item embeddings derived from interaction data and SI. The subtitles may be embedded in different ways, e.g., with Latent Dirichlet Allocation (LDA) or neural techniques. Computational experiments with a framework instantiation that relies on Bayesian Personalized Ranking (BPR) as an industry-strength method for item ranking and on different text embedding methods demonstrate the value of subtitles in terms of prediction accuracy and coverage. Moreover, a user study (N=247) reveals that the information contained in subtitles can be leveraged to improve the decision-making processes of users.
AB - Collaborative filtering (CF) is a highly effective recommendation approach based on preference patterns observed in user-item interaction data. Since pure collaborative methods can have certain limitations, e.g., when the data is sparse, hybrid approaches are a common solution, as they are able to combine collaborative information with side information (SI) about the items. In this work, we explore the value of subtitle information for the problem of movie recommendation. Unlike previously explored types of movie SI, e.g., titles or synopses, subtitles are not only longer, but also contain unique information that may help us to predict more accurately whether a user will enjoy a movie. To assess the usefulness of subtitles, we propose a technical framework named SubtitleCF that combines user and item embeddings derived from interaction data and SI. The subtitles may be embedded in different ways, e.g., with Latent Dirichlet Allocation (LDA) or neural techniques. Computational experiments with a framework instantiation that relies on Bayesian Personalized Ranking (BPR) as an industry-strength method for item ranking and on different text embedding methods demonstrate the value of subtitles in terms of prediction accuracy and coverage. Moreover, a user study (N=247) reveals that the information contained in subtitles can be leveraged to improve the decision-making processes of users.
KW - Hybrid Systems
KW - Movie Recommendation
KW - Side Information
KW - Subtitles
UR - http://www.scopus.com/inward/record.url?scp=85135175257&partnerID=8YFLogxK
U2 - 10.1145/3503252.3531291
DO - 10.1145/3503252.3531291
M3 - Conference contribution
AN - SCOPUS:85135175257
T3 - UMAP2022 - Proceedings of the 30th ACM Conference on User Modeling, Adaptation and Personalization
SP - 99
EP - 109
BT - UMAP2022 - Proceedings of the 30th ACM Conference on User Modeling, Adaptation and Personalization
PB - Association for Computing Machinery, Inc
T2 - 30th ACM Conference on User Modeling, Adaptation and Personalization, UMAP2022
Y2 - 4 July 2022 through 7 July 2022
ER -