TY - JOUR
T1 - Matching entities across online social networks
AU - Peled, Olga
AU - Fire, Michael
AU - Rokach, Lior
AU - Elovici, Yuval
N1 - Funding Information:
We would like to thank Marisa Timko for editing and proofreading this paper. We especially want to thank Carol Teegarden for her editing expertise and endless helpful advice, which guided this paper to completion. Additionally, we want to thank the Washington Research Foundation Fund for Innovation in Data-Intensive Discovery and the Moore/Sloan Data Science Environments Project at the University of Washington for supporting this study. We also want to thank the anonymous reviewers for their helpful comments.
Publisher Copyright:
© 2016 Elsevier B.V.
PY - 2016/10/19
Y1 - 2016/10/19
AB - Online social networks (OSNs), such as Facebook and Twitter, have become an integral part of our daily lives. There are hundreds of OSNs, and each offers particular services and functionalities. Recent studies show that many OSN users create accounts on multiple OSNs, using the same or different personal information. Collecting all the available data on an individual from several OSNs and fusing it into a single profile can provide valuable information. In this paper, we introduce novel machine-learning-based methods for solving entity resolution (ER), the problem of matching user profiles across multiple OSNs. Using extracted features and supervised learning techniques, we developed classifiers that perform entity matching between two profiles in the following scenarios: (a) matching users across two OSNs; (b) searching for a user by similar name; and (c) de-anonymizing a user's identity. The constructed classifiers were tested using data collected from two popular OSNs, Facebook and Xing. We then evaluated the classifiers' performance using measures such as true and false positive rates, accuracy, and the area under the receiver operating characteristic curve (AUC). The classifiers performed well, achieving an AUC of up to 0.982 and an accuracy of up to 95.9% in identifying user profiles across two OSNs.
KW - Entity matching
KW - Entity resolution
KW - Machine learning
KW - Online social networks
UR - http://www.scopus.com/inward/record.url?scp=84977554616&partnerID=8YFLogxK
U2 - 10.1016/j.neucom.2016.03.089
DO - 10.1016/j.neucom.2016.03.089
M3 - Article
AN - SCOPUS:84977554616
SN - 0925-2312
VL - 210
SP - 91
EP - 106
JO - Neurocomputing
JF - Neurocomputing
ER -