Computergestützte Lesartendisambiguierung

Storjohann, Petra

The purpose of this paper is to introduce lexicographers to a computational tool for automatic text analysis in corpus linguistics. Cosmas, a statistical search and analysis tool developed at the IDS, offers new ways of obtaining semantic information about words. I shall explain how this instrument can be used to disambiguate the senses of polysemous lexemes and how the results can be verified by means of collocational and contextual analyses. The word cool serves as an example of how far semantic information can be extracted by automatic statistical methods. I shall also address the advantages and disadvantages of computer-based analysis and discuss how empirical lexicographic disambiguation can be validated within a theoretical model. To illustrate the differences between traditional descriptive methods and the new statistical procedures, the senses of cool given in the Duden GWDS (2000) are compared with the senses identified in the analysis with Cosmas.

Metadata
Author:	Petra Storjohann
URN:	urn:nbn:de:bsz:mh39-50083
ISSN:	0340-9341
Parent Title (German):	Deutsche Sprache
Publisher:	Schmidt
Place of publication:	Berlin
Document Type:	Article
Language:	German
Year of first Publication:	2003
Date of Publication (online):	2016/06/21
Publication state:	Published version
Review state:	Peer review
GND Keyword:	Computerunterstützte Lexikographie; Deutsch; Homonym; Lexem; Polysem
Volume:	31
Issue:	1
First Page:	3
Last Page:	28
DDC classes:	400 Language / 410 Linguistics
Open Access?:	yes
BDSL-Classification:	Lexicography, dictionaries
Leibniz-Classification:	Language, linguistics
Linguistics-Classification:	Lexicography
Licence (German):	Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Germany
