Identification of transliterated foreign words in Hebrew script

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

    10 Scopus citations

    Abstract

    We present a loosely-supervised method for context-free identification of transliterated foreign names and borrowed words in Hebrew text. The method is purely statistical and does not require the use of any lexicons or linguistic analysis tools for the source languages (Hebrew, in our case). It also does not require any manually annotated data for training; instead, we learn from noisy data acquired by over-generation. We report precision/recall results of 80%/82% on a corpus of 4044 unique words, of which 368 are foreign words.
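
    To make the reported figures concrete, the following minimal sketch back-computes approximate confusion counts from the abstract's numbers. It assumes that 80/82 are percentages and that both are measured against the 368 foreign words as the positive class; the function name and the derived counts are illustrative estimates, not values reported by the authors.

```python
def estimate_counts(n_positive, precision, recall):
    """Back-compute approximate TP/FP/FN from precision and recall.

    Assumption: precision and recall are fractions in (0, 1] measured
    on a test set containing n_positive true foreign words.
    """
    tp = round(recall * n_positive)   # foreign words correctly flagged
    fn = n_positive - tp              # foreign words that were missed
    fp = round(tp / precision - tp)   # native words wrongly flagged (estimate)
    return tp, fp, fn


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Figures taken from the abstract: 368 foreign words, P = 0.80, R = 0.82.
tp, fp, fn = estimate_counts(368, 0.80, 0.82)
f1 = f1_score(0.80, 0.82)
print(tp, fp, fn, round(f1, 3))
```

    Under these assumptions, roughly 302 of the 368 foreign words are identified, and the F1 score is about 0.81. Since only 368 of the 4044 unique words are foreign (about 9%), a classifier that flagged every word as foreign would reach a precision of only about 9%.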

    Original language: English
    Title of host publication: Computational Linguistics and Intelligent Text Processing - 9th International Conference, CICLing 2008, Proceedings
    Pages: 466-477
    Number of pages: 12
    State: Published - 27 Aug 2008
    Event: 9th International Conference on Computational Linguistics and Intelligent Text Processing, CICLing 2008 - Haifa, Israel
    Duration: 17 Feb 2008 to 23 Feb 2008

    Publication series

    Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
    Volume: 4919 LNCS
    ISSN (Print): 0302-9743
    ISSN (Electronic): 1611-3349

    Conference

    Conference: 9th International Conference on Computational Linguistics and Intelligent Text Processing, CICLing 2008
    Country/Territory: Israel
    City: Haifa
    Period: 17/02/08 to 23/02/08

    ASJC Scopus subject areas

    • Theoretical Computer Science
    • General Computer Science
