TY - UNPB
T1 - Opinion Spam Detection
T2 - A New Approach Using Machine Learning and Network-Based Algorithms.
AU - Danilchenko, Kiril
AU - Segal, Michael
AU - Vilenchik, Dan
N1 - DBLP License: DBLP's bibliographic metadata records provided through http://dblp.org/ are distributed under a Creative Commons CC0 1.0 Universal Public Domain Dedication. Although the bibliographic metadata records are provided consistent with CC0 1.0 Dedication, the content described by the metadata records is not. Content may be subject to copyright, rights of privacy, rights of publicity and other restrictions.
PY - 2022/5/26
Y1 - 2022/5/26
N2 - E-commerce is the fastest-growing segment of the economy. Online reviews play a crucial role in helping consumers evaluate and compare products and services. As a result, fake reviews (opinion spam) are becoming more prevalent and negatively impacting customers and service providers. There are many reasons why it is hard to identify opinion spammers automatically, including the absence of reliable labeled data. This limitation precludes an off-the-shelf application of a machine learning pipeline. We propose a new method for classifying reviewers as spammers or benign, combining machine learning with a message-passing algorithm that capitalizes on the users' graph structure to compensate for the possible scarcity of labeled data. We devise a new way of sampling the labels for the training step (active learning), replacing the typical uniform sampling. Experiments on three large real-world datasets from this http URL show that our method outperforms state-of-the-art active learning approaches and also machine learning methods that use a much larger set of labeled data for training.
AB - E-commerce is the fastest-growing segment of the economy. Online reviews play a crucial role in helping consumers evaluate and compare products and services. As a result, fake reviews (opinion spam) are becoming more prevalent and negatively impacting customers and service providers. There are many reasons why it is hard to identify opinion spammers automatically, including the absence of reliable labeled data. This limitation precludes an off-the-shelf application of a machine learning pipeline. We propose a new method for classifying reviewers as spammers or benign, combining machine learning with a message-passing algorithm that capitalizes on the users' graph structure to compensate for the possible scarcity of labeled data. We devise a new way of sampling the labels for the training step (active learning), replacing the typical uniform sampling. Experiments on three large real-world datasets from this http URL show that our method outperforms state-of-the-art active learning approaches and also machine learning methods that use a much larger set of labeled data for training.
U2 - 10.48550/arXiv.2205.13422
DO - 10.48550/arXiv.2205.13422
M3 - Preprint
BT - Opinion Spam Detection
ER -