Parsing coordinations

This paper is concerned with statistical parsing of constituent structures in German. It presents four experiments that aim to improve parsing performance on coordinate structures: 1) reranking the n-best parses of a PCFG parser, 2) enriching the input to a PCFG parser with gold scopes for any conjunct, and 3) reranking the parser output for all possible conjunct scopes that are permissible with regard to clause structure. Experiment 4 reranks a combination of parses from experiments 1 and 3. The experiments show that n-best parsing combined with reranking improves results by a large margin. Providing the parser with different scope possibilities and reranking the resulting parses increases the F-score from 69.76 for the baseline to 74.69. While this F-score is similar to that of the first experiment (n-best parsing and reranking), the first experiment yields higher recall (75.48% vs. 73.69%) and the third yields higher precision (75.43% vs. 73.26%). Combining the two methods gives the best result, an F-score of 76.69.
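The reranking idea in the abstract can be illustrated with a minimal sketch. It is a generic linear reranker and not the model used in the paper; the class `Candidate`, the functions `score`, `rerank`, and `combine`, the feature names, the weights, and the toy bracketings are all hypothetical and chosen only for illustration. Pooling two candidate lists before reranking mirrors, in spirit, the combination described for experiment 4.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Candidate:
    """One candidate parse: a bracketed tree plus hypothetical feature values."""
    tree: str
    features: Dict[str, float] = field(default_factory=dict)


def score(candidate: Candidate, weights: Dict[str, float]) -> float:
    """Weighted sum of the candidate's features; unknown features count as 0."""
    return sum(weights.get(name, 0.0) * value
               for name, value in candidate.features.items())


def rerank(candidates: List[Candidate], weights: Dict[str, float]) -> Candidate:
    """Return the highest-scoring candidate (first one wins on ties)."""
    if not candidates:
        raise ValueError("no candidate parses to rerank")
    return max(candidates, key=lambda c: score(c, weights))


def combine(*candidate_lists: List[Candidate]) -> List[Candidate]:
    """Pool several candidate lists, keeping the first copy of each tree."""
    pooled: Dict[str, Candidate] = {}
    for candidates in candidate_lists:
        for candidate in candidates:
            pooled.setdefault(candidate.tree, candidate)
    return list(pooled.values())


# Toy data (invented): two candidate sources for "alte Männer und Frauen".
weights = {"parser_logprob": 1.0, "conjunct_category_match": 2.0}

n_best = [
    Candidate("(NP (NP alte Männer) und (NP Frauen))",
              {"parser_logprob": -10.0, "conjunct_category_match": 1.0}),
    Candidate("(NP alte (NP Männer und Frauen))",
              {"parser_logprob": -10.5, "conjunct_category_match": 1.0}),
]
scoped = [
    Candidate("(NP alte (NP Männer und Frauen))",
              {"parser_logprob": -10.5, "conjunct_category_match": 1.0}),
    Candidate("(NP alte Männer und Frauen)",
              {"parser_logprob": -9.5, "conjunct_category_match": 0.0}),
]

pooled = combine(n_best, scoped)
best = rerank(pooled, weights)
print(best.tree)
```

In this sketch the reranker can only choose among the candidates it is given, which is why adding parses from a second source may help: the pooled list can contain a better analysis than either list alone.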

Metadata
Author:Sandra Kübler, Erhard Hinrichs, Wolfgang Maier, Eva Klett
URN:urn:nbn:de:hebis:30-1128345
URL:http://cl.indiana.edu/~skuebler/papers/coord.pdf
Document Type:Preprint
Language:English
Date of Publication (online):2009/05/05
Year of first Publication:2009
Publishing Institution:Universitätsbibliothek Johann Christian Senckenberg
Release Date:2009/05/05
Tag:German
Page Number:9
Note:
Published in: EACL 2009: Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics; 30 March – 3 April 2009, Megaron Athens International Conference Centre, Athens, Greece. Stroudsburg, PA: Association for Computational Linguistics (ACL), 2009, pp. 406-414
Source:http://jones.ling.indiana.edu/~skuebler/papers/coord.pdf
HeBIS-PPN:216486602
Institutes:No department specified / External
Dewey Decimal Classification:4 Language / 40 Language / 400 Language
Collections:Linguistics
Licence (German):German copyright law