SVM Model Tampering and Anchored Learning: A case study in Hebrew NP chunking

Yoav Goldberg, Michael Elhadad

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

8 Scopus citations

Abstract

We study the issue of porting a known NLP method to a language with little existing NLP resources, specifically Hebrew SVM-based chunking. We introduce two SVM-based methods - Model Tampering and Anchored Learning. These allow fine grained analysis of the learned SVM models, which provides guidance to identify errors in the training corpus, distinguish the role and interaction of lexical features and eventually construct a model with ∼10% error reduction. The resulting chunker is shown to be robust in the presence of noise in the training corpus, relies on less lexical features than was previously understood and achieves an F-measure performance of 92.2 on automatically PoS-tagged text. The SVM analysis methods also provide general insight on SVM-based chunking.

Original languageEnglish
Title of host publicationACL 2007 - Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics
PublisherAssociation for Computational Linguistics (ACL)
Pages224-231
Number of pages8
ISBN (Print)9781932432862
StatePublished - 1 Dec 2007
Event45th Annual Meeting of the Association for Computational Linguistics, ACL 2007 - Prague, Czech Republic
Duration: 23 Jun 200730 Jun 2007

Publication series

NameACL 2007 - Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics

Conference

Conference45th Annual Meeting of the Association for Computational Linguistics, ACL 2007
Country/TerritoryCzech Republic
CityPrague
Period23/06/0730/06/07

ASJC Scopus subject areas

  • Language and Linguistics
  • Linguistics and Language

Fingerprint

Dive into the research topics of 'SVM Model Tampering and Anchored Learning: A case study in Hebrew NP chunking'. Together they form a unique fingerprint.

Cite this