Treebank Grammar Techniques for Non-Projective Dependency Parsing
Marco Kuhlmann and Giorgio Satta. Treebank Grammar Techniques for Non-Projective Dependency Parsing. In Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics (EACL), pages 478–486, Athens, Greece, 2009.
An open problem in dependency parsing is the accurate and efficient treatment of non-projective structures. We propose to attack this problem using chart-parsing algorithms developed for mildly context-sensitive grammar formalisms. In this paper, we provide two key tools for this approach. First, we show how to reduce non-projective dependency parsing to parsing with Linear Context-Free Rewriting Systems (LCFRS), by presenting a technique for extracting LCFRS from dependency treebanks. For efficient parsing, the extracted grammars need to be transformed in order to minimize the number of nonterminal symbols per production. Our second contribution is an algorithm that computes this transformation for a large, empirically relevant class of grammars.
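To make "non-projective" concrete, here is a minimal sketch rather than the extraction technique from the paper. It assumes a dependency tree given as a list of head positions (1-based, with 0 marking the root), and counts the contiguous blocks of word positions that each subtree covers. A tree is projective exactly when every subtree covers a single block. A node whose subtree covers several blocks is the kind of discontinuous unit an LCFRS nonterminal can span. The function names and the head-list encoding are illustrative assumptions, not taken from the paper.

```python
from collections import defaultdict


def yields(heads):
    """Map each node to the sorted word positions in its subtree.

    Assumed encoding: heads[i] is the head of word i + 1 (positions are
    1-based) and 0 marks the root. The input is assumed to be a tree.
    """
    children = defaultdict(list)
    for dep, head in enumerate(heads, start=1):
        children[head].append(dep)

    result = {}

    def collect(node):
        # A node's yield is the node itself plus the yields of its children.
        positions = [node]
        for child in children[node]:
            positions.extend(collect(child))
        result[node] = sorted(positions)
        return result[node]

    for root in children[0]:
        collect(root)
    return result


def block_degree(positions):
    """Count maximal runs of consecutive positions in a sorted list."""
    gaps = sum(1 for a, b in zip(positions, positions[1:]) if b != a + 1)
    return 1 + gaps


def is_projective(heads):
    """A tree is projective iff every subtree covers one contiguous block."""
    return all(block_degree(p) == 1 for p in yields(heads).values())


# Hypothetical toy trees, not drawn from any treebank.
projective_tree = [2, 0, 2]         # words 1 and 3 both attach to word 2
non_projective_tree = [3, 0, 2, 2]  # word 1 attaches to word 3, across word 2
```

In the second toy tree, the subtree rooted at word 3 covers positions 1 and 3 but not 2, so it consists of two blocks and the tree is non-projective.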