Leveraging Exposure Networks for Detecting Fake News Sources

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

    Abstract

    The scale and dynamic nature of the Web make real-time detection of misinformation an extremely difficult task. Prior research mostly focused on offline (retrospective) detection of stories or claims using linguistic features of the content, flagging by users, and crowdsourced labels. Here, we develop a novel machine-learning methodology for detecting fake news sources using active learning, and examine the contribution of network, audience, and text features to the model accuracy. Importantly, we evaluate performance in both offline and online settings, mimicking the strategic choices fact-checkers have to make in practice as news sources emerge over time. We find that exposure networks provide information on considerably more sources than sharing networks (+49.6%), and that the inclusion of exposure features greatly improves classification PR-AUC in both offline (+33%) and online (+69.2%) settings. Textual features perform best in offline settings, but their performance deteriorates by 12.0-18.7% in online settings. Finally, the results show that a few iterations of active learning are sufficient for our model to attain predictive performance comparable to exhaustive labeling while incurring only 24.7% of the labeling costs. These results stress the importance of exposure networks as a source of valuable information for the investigation of information dissemination in social networks and question the robustness of textual features.
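
    The active-learning idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify the model, the features, or the query strategy, so everything below is an assumption for illustration: a small logistic regression trained by stochastic gradient descent, two hypothetical numeric features per source, synthetic labels, and uncertainty sampling (querying the sources whose predicted probability is closest to 0.5).

```python
import math
import random


def predict_proba(w, b, x):
    """Logistic model: probability that source x is labeled 1."""
    z = sum(wj * xj for wj, xj in zip(w, x)) + b
    z = max(-30.0, min(30.0, z))  # clamp to keep exp() numerically safe
    return 1.0 / (1.0 + math.exp(-z))


def train_logreg(X, y, epochs=200, lr=0.1):
    """Fit weights and bias by per-sample (stochastic) gradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            err = predict_proba(w, b, xi) - yi
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b


def active_learning(X, oracle, seed_idx, rounds, batch_size):
    """Uncertainty sampling: each round, label the least certain sources."""
    labeled = {i: oracle(i) for i in seed_idx}
    for _ in range(rounds):
        idx = sorted(labeled)
        w, b = train_logreg([X[i] for i in idx], [labeled[i] for i in idx])
        pool = [i for i in range(len(X)) if i not in labeled]
        if not pool:
            break
        # Smallest |p - 0.5| means the model is least sure about that source.
        pool.sort(key=lambda i: abs(predict_proba(w, b, X[i]) - 0.5))
        for i in pool[:batch_size]:
            labeled[i] = oracle(i)  # ask the (simulated) fact-checker
    idx = sorted(labeled)
    w, b = train_logreg([X[i] for i in idx], [labeled[i] for i in idx])
    return w, b, labeled


# Synthetic toy data (hypothetical): 200 sources, 2 features each.
random.seed(0)
X = [[random.random(), random.random()] for _ in range(200)]
truth = [1 if x[0] + x[1] > 1.0 else 0 for x in X]

seed_idx = random.sample(range(len(X)), 10)
w, b, labeled = active_learning(
    X, oracle=lambda i: truth[i], seed_idx=seed_idx, rounds=5, batch_size=8
)
cost_fraction = len(labeled) / len(X)  # share of sources that needed a label
print(f"labeled {len(labeled)} of {len(X)} sources ({cost_fraction:.0%})")
```

    In this toy setup the labeling budget is fixed in advance (10 seed labels plus 5 rounds of 8 queries), so the cost fraction is a property of the chosen parameters, not a reproduction of the paper's reported figure.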

    Original language: English
    Title of host publication: KDD 2024 - Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
    Publisher: Association for Computing Machinery
    Pages: 5635-5646
    Number of pages: 12
    ISBN (Electronic): 9798400704901
    State: Published - 24 Aug 2024
    Event: 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD 2024 - Barcelona, Spain
    Duration: 25 Aug 2024 - 29 Aug 2024

    Publication series

    Name: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
    ISSN (Print): 2154-817X

    Conference

    Conference: 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD 2024
    Country/Territory: Spain
    City: Barcelona
    Period: 25/08/24 - 29/08/24

    Keywords

    • fact-checking
    • fake news detection
    • misinformation
    • network science
    • social media

    ASJC Scopus subject areas

    • Software
    • Information Systems
