Optimizing the prediction process: from statistical concepts to the case study of soccer

Authors: Heuer, Andreas; Rubner, Oliver
Department/Institution: FB 12: Chemistry and Pharmacy
Document type: Article
Media type: Text
Publication date: 2014
Published in MIAMI: 27.11.2014
Last modified: 16.04.2019
Edition: [Electronic ed.]
Source: PLoS ONE 9 (2014) 9, 1-9, e104647
Subject (DDC): 310: Statistics
License: CC BY 4.0
Language: English
Notes: Funded by the Open Access Publication Fund 2014/2015 of the Deutsche Forschungsgemeinschaft (DFG) and the Westfälische Wilhelms-Universität Münster (WWU Münster).
Format: PDF document
ISSN: 1932-6203
URN: urn:nbn:de:hbz:6-31339424119
Other identifiers: DOI: 10.1371/journal.pone.0104647
Permalink: https://nbn-resolving.de/urn:nbn:de:hbz:6-31339424119
Online access: journal.pone.0104647.pdf

We present a systematic approach for prediction purposes based on panel data, involving information about different interacting subjects and different times (here: two). The corresponding bivariate regression problem can be solved analytically for the final statistical estimation error. Furthermore, this expression is simplified for the special case in which the subjects do not change their properties between the last measurement and the prediction period. This statistical framework is applied to the prediction of soccer matches, based on information from the previous and the present season. We determine how well the outcome of soccer matches can be predicted in theory. This optimum limit is compared with the actual quality of the prediction, taking the German premier league as an example. As a key step in the actual prediction process, one has to identify appropriate observables that reflect the strength of the individual teams as closely as possible. A criterion for distinguishing between different observables is presented. Surprisingly, chances for goals turn out to be much better suited than the goals themselves to characterize the strength of a team. Routes towards further improvement of the prediction are indicated. Finally, two specific applications are discussed.