
RKorAPClient: An R Package for Accessing the German Reference Corpus DeReKo via KorAP

Making corpora accessible and usable for linguistic research is a huge challenge in view of (too) big data, legal issues and a rapidly evolving methodology. This affects not only the design of user-friendly graphical interfaces to corpus analysis tools, but also the availability of programming interfaces that support access to the functionality of these tools from various analysis and development environments. RKorAPClient is a new research tool in the form of an R package that interacts with the Web API of the corpus analysis platform KorAP, which provides access to large annotated corpora, including the German Reference Corpus DeReKo with 45 billion tokens. In addition to optionally authenticated KorAP API access, RKorAPClient provides further processing and visualization features to simplify common corpus analysis tasks. This paper introduces the basic functionality of RKorAPClient and exemplifies various analysis tasks based on DeReKo, which are bundled within the R package and can serve as a basic framework for advanced analysis and visualization approaches.

Metadata
Author:Marc Kupietz, Nils Diewald, Eliza Margaretha
URN:urn:nbn:de:bsz:mh39-98430
URL:http://www.lrec-conf.org/proceedings/lrec2020/index.html#7015
ISBN:979-10-95546-34-4
Parent Title (English):Proceedings of the 12th International Conference on Language Resources and Evaluation (LREC), May 11-16, 2020, Palais du Pharo, Marseille, France
Publisher:European Language Resources Association
Place of publication:Paris
Editor:Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Document Type:Part of a Book
Language:English
Year of first Publication:2020
Date of Publication (online):2020/05/21
Publicationstate:Secondary publication
Reviewstate:Peer-Review
Tag:Corpus Analysis; Corpus Tools; Data Visualization; OAuth; Reference Corpora
GND Keyword:Forschungsdaten; Korpus <Linguistik>; R <Programm>; Visualisierung; Web Services
First Page:7015
Last Page:7021
Note:
Funded by the Open Access Monograph Fund of the Leibniz Association
DDC classes:400 Language / 400 Language, Linguistics
Open Access?:yes
Leibniz-Classification:Language, Linguistics
Linguistics-Classification:Computational Linguistics
Linguistics-Classification:Corpus Linguistics
Program areas:S1: Corpus Linguistics
Licence (English):Creative Commons - Attribution-NonCommercial 4.0 International