Simple Statistics Are Sometime Too Simple: A Case Study in Social Media Data

    Research output: Contribution to journal › Article › peer-review

    4 Scopus citations

    Abstract

    In this work we ask to what extent simple statistics are useful for making sense of social media data. By simple statistics we mean counting and bookkeeping-type features such as the number of likes given to a user's post, a user's number of friends, etc. We find that relying solely on simple statistics is not always a good approach. Specifically, we develop a statistical framework, which we term semantic shattering, that detects semantic inconsistencies in the data that may arise from relying solely on simple statistics. We apply our framework to simple-statistics data collected from six online social media platforms and arrive at a surprising, counter-intuitive finding in three of them: Twitter, Instagram and YouTube. We find that, overall, a user's activity is not correlated with the feedback that the user receives on that activity. A hint toward understanding this phenomenon may be found in the fact that the activity-feedback shattering did not occur in LinkedIn, Steam and Flickr. A possible explanation for this separation is the amount of effort required to produce content: the lesser the effort, the lesser the correlation between activity and feedback. The amount of effort may be a proxy for the level of commitment that users feel towards each other in the network, and indeed sociologists claim that commitment explains consistent human behavior, or lack thereof. However, neither the amount of effort nor the level of commitment is by any means a simple statistic.
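
    The abstract does not specify how semantic shattering is computed, so the sketch below does not attempt to reproduce it. It illustrates only the related effect named in the keywords, Simpson's paradox: an activity-feedback correlation that holds inside each group of users can vanish or reverse once the groups are pooled. The two groups, their numbers, and the `pearson` helper are invented for illustration and are not taken from the paper's data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical toy data: "activity" = number of posts, "feedback" = number of likes.
# Group A posts rarely but receives a lot of feedback; group B posts often
# but receives little. Within each group, more activity means more feedback.
group_a = {"activity": [1, 2, 3], "feedback": [10, 11, 12]}
group_b = {"activity": [11, 12, 13], "feedback": [0, 1, 2]}

# Correlation computed separately inside each group.
r_a = pearson(group_a["activity"], group_a["feedback"])
r_b = pearson(group_b["activity"], group_b["feedback"])

# Correlation computed after pooling both groups into one population.
pooled_activity = group_a["activity"] + group_b["activity"]
pooled_feedback = group_a["feedback"] + group_b["feedback"]
r_pooled = pearson(pooled_activity, pooled_feedback)

print(f"within group A: {r_a:.3f}")
print(f"within group B: {r_b:.3f}")
print(f"pooled:         {r_pooled:.3f}")
```

    In this toy case the within-group correlations are perfectly positive while the pooled correlation is strongly negative, which is one way a population-level simple statistic can mislead. Whether this mechanism accounts for the findings reported for Twitter, Instagram and YouTube is not something the abstract states.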

    Original language: English
    Article number: 8642436
    Pages (from-to): 402-408
    Number of pages: 7
    Journal: IEEE Transactions on Knowledge and Data Engineering
    Volume: 32
    Issue number: 2
    State: Published - 1 Feb 2020

    Keywords

    • Online social media
    • PCA
    • Simpson's paradox
    • data analysis

    ASJC Scopus subject areas

    • Information Systems
    • Computer Science Applications
    • Computational Theory and Mathematics
