TY - GEN
T1 - Exploring Long-Term Temporal Trends in the Use of Multiword Expressions.
AU - Daniel, Tal
AU - Last, Mark
N1 - DBLP License: DBLP's bibliographic metadata records provided through http://dblp.org/ are distributed under a Creative Commons CC0 1.0 Universal Public Domain Dedication. Although the bibliographic metadata records are provided consistent with CC0 1.0 Dedication, the content described by the metadata records is not. Content may be subject to copyright, rights of privacy, rights of publicity and other restrictions.
PY - 2016
Y1 - 2016
N2 - Differentiating between outdated expressions and current expressions is not a trivial task for foreign language learners, and could be beneficial for lexicographers, as they examine expressions. Assuming that the usage of expressions over time can be represented by a time-series of their periodic frequencies over a large lexicographic corpus, we test the hypothesis that there exists an old–new relationship between the time-series of some synonymous expressions, a hint that a later expression has replaced an earlier one. Another hypothesis we test is that Multiword Expressions (MWEs) can be characterized by sparsity & frequency thresholds. Using a dataset of 1 million English books, we choose MWEs having the most positive or the most negative usage trends from a ready-made list of known MWEs. We identify synonyms of those expressions in a historical thesaurus and visualize the temporal relationships between the resulting expression pairs. Our empirical results indicate that old–new usage relationships do exist between some synonymous expressions, and that new candidate expressions, not found in dictionaries, can be found by analyzing usage trends.
AB - Differentiating between outdated expressions and current expressions is not a trivial task for foreign language learners, and could be beneficial for lexicographers, as they examine expressions. Assuming that the usage of expressions over time can be represented by a time-series of their periodic frequencies over a large lexicographic corpus, we test the hypothesis that there exists an old–new relationship between the time-series of some synonymous expressions, a hint that a later expression has replaced an earlier one. Another hypothesis we test is that Multiword Expressions (MWEs) can be characterized by sparsity & frequency thresholds. Using a dataset of 1 million English books, we choose MWEs having the most positive or the most negative usage trends from a ready-made list of known MWEs. We identify synonyms of those expressions in a historical thesaurus and visualize the temporal relationships between the resulting expression pairs. Our empirical results indicate that old–new usage relationships do exist between some synonymous expressions, and that new candidate expressions, not found in dictionaries, can be found by analyzing usage trends.
U2 - 10.18653/v1/w16-1802
DO - 10.18653/v1/w16-1802
M3 - Conference contribution
BT - MWE@ACL
ER -