Exploring Long-Term Temporal Trends in the Use of Multiword Expressions.

Tal Daniel, Mark Last

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Differentiating between outdated expressions and current expressions is not a trivial task for foreign language learners, and could be beneficial for lexicographers, as they examine expressions. Assuming that the usage of expressions over time can be represented by a time-series of their periodic frequencies over a large lexicographic corpus, we test the hypothesis that there exists an old–new relationship between the time-series of some synonymous expressions, a hint that a later expression has replaced an earlier one. Another hypothesis we test is that Multiword Expressions (MWEs) can be characterized by sparsity and frequency thresholds. Using a dataset of 1 million English books, we choose MWEs having the most positive or the most negative usage trends from a ready-made list of known MWEs. We identify synonyms of those expressions in a historical thesaurus and visualize the temporal relationships between the resulting expression pairs. Our empirical results indicate that old–new usage relationships do exist between some synonymous expressions, and that new candidate expressions, not found in dictionaries, can be found by analyzing usage trends.
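
The pipeline summarized in the abstract (periodic frequency series, sparsity and frequency thresholds, ranking by usage trend) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names, the threshold values, the toy counts, and the use of an ordinary least-squares slope as the trend measure, which stands in for whatever trend statistic the authors actually applied.

```python
from typing import Dict, List, Tuple


def relative_frequencies(counts: List[int], totals: List[int]) -> List[float]:
    """Turn per-period raw counts into per-period relative frequencies."""
    return [c / t if t else 0.0 for c, t in zip(counts, totals)]


def sparsity(series: List[float]) -> float:
    """Fraction of periods in which the expression does not occur at all."""
    return sum(1 for value in series if value == 0) / len(series)


def trend_slope(series: List[float]) -> float:
    """Least-squares slope of the series against its period index.

    Assumption: a simple linear slope is used as the trend measure here.
    """
    n = len(series)
    if n < 2:
        return 0.0  # a single period carries no trend information
    mean_x = (n - 1) / 2
    mean_y = sum(series) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(series))
    den = sum((x - mean_x) ** 2 for x in range(n))
    return num / den


def rank_by_trend(
    data: Dict[str, List[int]],
    totals: List[int],
    max_sparsity: float = 0.5,    # hypothetical threshold
    min_mean_freq: float = 1e-7,  # hypothetical threshold
) -> List[Tuple[str, float]]:
    """Drop sparse or rare expressions, then sort by slope, rising first."""
    ranked = []
    for expression, counts in data.items():
        series = relative_frequencies(counts, totals)
        if sparsity(series) > max_sparsity:
            continue
        if sum(series) / len(series) < min_mean_freq:
            continue
        ranked.append((expression, trend_slope(series)))
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


# Toy data: counts per period for three made-up expressions.
totals = [1000, 1000, 1000, 1000]
data = {
    "rising": [1, 2, 3, 4],
    "falling": [4, 3, 2, 1],
    "rare": [0, 0, 0, 1],  # absent in 3 of 4 periods, so filtered out
}
ranking = rank_by_trend(data, totals)
```

In this sketch the head of `ranking` holds the expressions with the most positive trend and the tail those with the most negative trend, which is the kind of selection the abstract describes before synonym lookup in a thesaurus.
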
Original language: English (GB)
Title of host publication: MWE@ACL
State: Published - 2016
