Towards real-time 3D vehicle detection from monocular images using deep learning

Gählert, Nils

Dissertation 2021 CC BY 4.0

Published

One key task of the environment perception pipeline for autonomous driving is object detection in monocular RGB images. This task is usually limited to 2D object detection, which raises the question of whether 3D object detection is also possible using only monocular RGB images. In this dissertation, we investigate this question specifically for 3D vehicle detection in monocular RGB images in the scope of driver assistance systems and autonomous driving. We use modern deep learning techniques and a so-called 2D-3D lifting, without utilizing temporal information. In particular, this includes the estimation of the 3D location, orientation, and size of the object. In addition to reliable, high-quality detection performance, autonomous driving systems require a short runtime. Therefore, we aim for the best possible trade-off between detection performance and runtime. Since the basis of any deep learning approach is high-quality data, we introduce a new dataset, Cityscapes 3D. This dataset is characterized in particular by its annotations with 9 degrees of freedom, as well as novel and improved evaluation metrics. We also provide a publicly available benchmark that allows research groups to assess their methods for 3D object detection and compare them to those of other researchers. We develop improvements for 2D object detection and demonstrate their effectiveness. First, we increase the 2D detection performance by more than 5% using an adapted error function during training. Second, we develop vg-NMS, which particularly supports 2D amodal object detection. With MB-Net, BS3D, and 3D-GCK, we develop three different approaches based on the 2D-3D lifting scheme. All developed approaches stand out for their comparatively good detection performance and their short runtime. In contrast to MB-Net and BS3D, 3D-GCK does not require any post-processing. It estimates all 9 degrees of freedom of a vehicle in 3D space and also requires no prior knowledge about possible vehicle extents.

Classification

Reviewer(s):

Denzler, Joachim ; Geiger, Andreas

Date of acceptance of the doctorate / degree:

12 November 2021

Date of publication:

2021

PPN:

1782247432

URN:

urn:nbn:de:gbv:27-dbt-20211215-133822-000

Language:

English

Resource type:

Text

Extent:

175 pages

Place of publication:

Jena

Keywords:

Computer vision (GND: Maschinelles Sehen); Artificial intelligence (GND: Künstliche Intelligenz); Machine learning (GND: Maschinelles Lernen); Autonomous vehicle (GND: Autonomes Fahrzeug)

DDC subject group of the DNB:

620 Engineering and mechanical engineering

Library shelf mark:

2022 J 13

Institution:

Friedrich-Schiller-Universität Jena, Faculty of Mathematics and Computer Science

Thesis note:

Dissertation, Friedrich-Schiller-Universität Jena, 2021

Cite

Citation form:

urn:nbn:de:gbv:27-dbt-20211215-133822-000

Rights

Use and reproduction:
