Abstract
As part of an effort to develop NLP-based tools for Hebrew AAC users, we
investigate the task of word prediction. Previous work on word prediction
shows that statistical methods are not sufficiently precise for languages with
highly inflected morphology, and that syntactic processing is required.
Following this assumption, we tested Hebrew natural language
processing tools on the word prediction task. We found that training a
language model on a very large corpus (27M) yields high results across
various genres, including personal writing in blogs and open forums on the
Internet. Contrary to our expectations, using morpho-syntactic information
such as part-of-speech tags decreases prediction results.
Original language | English GB |
---|---|
Title of host publication | Algorithms and Computation, 19th International Symposium, ISAAC 2008, Gold Coast, Australia, December 15-17, 2008. Proceedings |
State | Published - 2008 |