Recent developments in DeReKo

  • This paper gives an overview of recent developments in the German Reference Corpus DeReKo in terms of growth, maximising relevant corpus strata, metadata, legal issues, and its current and future research interface. Due to the recent acquisition of new licenses, DeReKo has grown by a factor of four in the first half of 2014, mostly in the area of newspaper text, and presently contains over 24 billion word tokens. Other strata, like fictional texts, web corpora, in particular CMC texts, and spoken but conceptually written texts have also increased significantly. We report on the newly acquired corpora that led to the major increase, on the principles and strategies behind our corpus acquisition activities, and on our solutions for the emerging legal, organisational, and technical challenges.

Metadata
Author:Marc Kupietz, Harald Lüngen
URN:urn:nbn:de:bsz:mh39-31353
URL:http://www.lrec-conf.org/proceedings/lrec2014/index.html
Parent Title (English):Proceedings of the ninth conference on international language resources and evaluation (LREC’14)
Publisher:European Language Resources Association (ELRA)
Place of publication:Reykjavik
Document Type:Conference Proceeding
Language:English
Year of first Publication:2014
Date of Publication (online):2014/10/13
Tag:Deutsches Referenzkorpus (DeReKo); Institut für Deutsche Sprache <Mannheim>
GND Keyword:Deutsch; Korpus <Linguistik>; Textkorpus
First Page:2378
Last Page:2385
Open Access?:yes
Leibniz-Classification:Sprache, Linguistik
Licence (German):Protected by copyright