What's in a hashtag? Content based prediction of the spread of ideas in microblogging communities

Oren Tsur, Ari Rappoport

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

278 Scopus citations

Abstract

Current social media research mainly focuses on temporal trends of the information flow and on the topology of the social graph that facilitates the propagation of information. In this paper we study the effect of the content of the idea on the information propagation. We present an efficient hybrid approach based on linear regression for predicting the spread of an idea in a given time frame. We show that a combination of content features with temporal and topological features minimizes prediction error. Our algorithm is evaluated on Twitter hashtags extracted from a dataset of more than 400 million tweets. We analyze the contribution and the limitations of the various feature types to the spread of information, demonstrating that content aspects can be used as strong predictors and thus should not be disregarded. We also study the dependencies between global features such as graph topology and content features.
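
The hybrid idea in the abstract, a linear regression over content, temporal and topological features, can be illustrated with a minimal sketch. Nothing below comes from the paper itself: the feature columns, the synthetic data, the coefficients and the helper names (`solve`, `fit_linear`, `predict`, `mse`) are all assumptions chosen for illustration, and plain ordinary least squares stands in for whatever regression variant the authors actually used.

```python
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def fit_linear(rows, y):
    """Ordinary least squares with an intercept, via the normal equations."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    XtX = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    Xty = [sum(x[i] * t for x, t in zip(X, y)) for i in range(k)]
    return solve(XtX, Xty)

def predict(w, row):
    return w[0] + sum(wi * xi for wi, xi in zip(w[1:], row))

def mse(w, rows, y):
    return sum((predict(w, r) - t) ** 2 for r, t in zip(rows, y)) / len(y)

# Synthetic stand-in data (hypothetical): two content features, one temporal
# feature and one topological feature per hashtag, plus a noisy "spread" target.
rng = random.Random(0)
hybrid_rows, temporal_rows, targets = [], [], []
for _ in range(200):
    content = [rng.gauss(0, 1), rng.gauss(0, 1)]
    temporal = [rng.gauss(0, 1)]
    topological = [rng.gauss(0, 1)]
    spread = (0.5 * content[0] - 0.3 * content[1]
              + 1.0 * temporal[0] + 0.2 * topological[0] + rng.gauss(0, 0.1))
    hybrid_rows.append(content + temporal + topological)
    temporal_rows.append(temporal)
    targets.append(spread)

w_temporal = fit_linear(temporal_rows, targets)
w_hybrid = fit_linear(hybrid_rows, targets)
mse_temporal = mse(w_temporal, temporal_rows, targets)
mse_hybrid = mse(w_hybrid, hybrid_rows, targets)
print("temporal only:", mse_temporal)
print("hybrid:       ", mse_hybrid)
```

The comparison here is on training data, where adding features to a least-squares fit can never increase the error; a real evaluation would measure error on held-out hashtags.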

Original language: English
Title of host publication: WSDM 2012 - Proceedings of the 5th ACM International Conference on Web Search and Data Mining
Pages: 643-652
Number of pages: 10
DOIs
State: Published - 15 Mar 2012
Externally published: Yes
Event: 5th ACM International Conference on Web Search and Data Mining, WSDM 2012 - Seattle, WA, United States
Duration: 8 Feb 2012 → 12 Feb 2012

Publication series

Name: WSDM 2012 - Proceedings of the 5th ACM International Conference on Web Search and Data Mining

Conference

Conference: 5th ACM International Conference on Web Search and Data Mining, WSDM 2012
Country/Territory: United States
City: Seattle, WA
Period: 8/02/12 → 12/02/12

Keywords

  • Hashtags
  • Information diffusion
  • Microblogging
  • Social media
  • Twitter
