Constance: Modeling annotation contexts to improve stance classification

Kenneth Joseph, Lisa Friedland, Will Hobbs, David Lazer, Oren Tsur

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

19 Scopus citations

Abstract

Manual annotations are a prerequisite for many applications of machine learning. However, weaknesses in the annotation process itself are easy to overlook. In particular, scholars often choose what information to give to annotators without examining these decisions empirically. For subjective tasks such as sentiment analysis, sarcasm, and stance detection, such choices can impact results. Here, for the task of political stance detection on Twitter, we show that providing too little context can result in noisy and uncertain annotations, whereas providing too strong a context may cause it to outweigh other signals. To characterize and reduce these biases, we develop ConStance, a general model for reasoning about annotations across information conditions. Given conflicting labels produced by multiple annotators seeing the same instances with different contexts, ConStance simultaneously estimates gold standard labels and also learns a classifier for new instances. We show that the classifier learned by ConStance outperforms a variety of baselines at predicting political stance, while the model’s interpretable parameters shed light on the effects of each context.

Original languageEnglish
Title of host publicationEMNLP 2017 - Conference on Empirical Methods in Natural Language Processing, Proceedings
PublisherAssociation for Computational Linguistics (ACL)
Pages1115-1124
Number of pages10
ISBN (Electronic)9781945626838
DOIs
StatePublished - 1 Jan 2017
Event2017 Conference on Empirical Methods in Natural Language Processing, EMNLP 2017 - Copenhagen, Denmark
Duration: 9 Sep 201711 Sep 2017

Publication series

NameEMNLP 2017 - Conference on Empirical Methods in Natural Language Processing, Proceedings

Conference

Conference2017 Conference on Empirical Methods in Natural Language Processing, EMNLP 2017
Country/TerritoryDenmark
CityCopenhagen
Period9/09/1711/09/17

ASJC Scopus subject areas

  • Computer Science Applications
  • Information Systems
  • Computational Theory and Mathematics

Fingerprint

Dive into the research topics of 'Constance: Modeling annotation contexts to improve stance classification'. Together they form a unique fingerprint.

Cite this