- Author
- Shanmugapriyan Manoharan, Technische Universität Chemnitz
- Title
- 3D Object Detection based on Unsupervised Depth Estimation
- Citable URL
- https://nbn-resolving.org/urn:nbn:de:bsz:ch1-qucosa2-775448
- Date of submission
- 12.12.2020
- Date of defense
- 25.01.2021
- Abstract (EN)
- Estimating depth and detecting object instances in 3D space are fundamental to autonomous navigation, localization and mapping, robotic object manipulation, and augmented reality. RGB-D images and LiDAR point clouds are the most illustrative formats of depth information. However, depth sensors have several shortcomings, such as low effective spatial resolution and capturing a scene from only a single perspective. This thesis focuses on reproducing a denser and more comprehensive 3D scene structure for given monocular RGB images using depth estimation and 3D object detection. The first contribution of this thesis is a pipeline for depth estimation based on an unsupervised learning framework. The thesis proposes two architectures to analyze structure-from-motion and 3D geometric constraint methods. The proposed architectures are trained and evaluated using only RGB images and no ground-truth depth data, and achieve better results than state-of-the-art methods. The second contribution is the application of the estimated depth map, which comprises two algorithms: point cloud generation and collision avoidance. The predicted depth map and the RGB image are used to generate point cloud data with the proposed point cloud algorithm. The collision avoidance algorithm predicts the possibility of a collision and issues a collision warning message by decoding the color in the estimated depth map. The algorithm's design is adaptable to different color maps with slight changes and perceives collision information across a sequence of frames. The third contribution is a two-stage pipeline to detect 3D objects from a monocular image. The first stage detects the 2D objects and crops the corresponding image patches, which are provided as input to the second stage. In the second stage, a 3D regression network is trained to estimate the 3D bounding boxes of the target objects. Two architectures are proposed for this 3D regression network. This approach achieves better average precision than the state of the art for truncation of up to 15% or fully visible objects, and lower but comparable results for truncation of more than 30% or partly/fully occluded objects.
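The point cloud generation step described in the abstract, fusing the predicted depth map with the RGB image, can be illustrated by a standard pinhole back-projection. The function below is a minimal sketch of that generic technique, not the thesis's actual algorithm; the function name and the intrinsics `fx`, `fy`, `cx`, `cy` are illustrative assumptions.

```python
import numpy as np

def depth_to_point_cloud(depth, rgb, fx, fy, cx, cy):
    """Back-project a dense depth map into a colored 3D point cloud.

    depth: (H, W) array of metric depth values (0 marks invalid pixels).
    rgb:   (H, W, 3) array of colors aligned with the depth map.
    fx, fy, cx, cy: pinhole camera intrinsics (focal lengths, principal point).
    Returns an (N, 6) array of [X, Y, Z, R, G, B] rows for valid pixels.
    """
    h, w = depth.shape
    # Pixel coordinate grids: u runs along columns, v along rows.
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    # Invert the pinhole projection: x = (u - cx) * z / fx, y = (v - cy) * z / fy.
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    valid = z > 0
    points = np.stack([x[valid], y[valid], z[valid]], axis=1)
    colors = rgb[valid]
    return np.hstack([points, colors])
```

With a predicted depth map from the unsupervised network and the corresponding RGB frame, each valid pixel thus becomes one colored 3D point in the camera frame.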
- Free keywords (EN)
- Depth estimation, Unsupervised learning, Structure from Motion, 3D Geometric Constraints, Convolutional Neural Networks, 3D Object Detection, Monocular Camera
- Classification (DDC)
- 000
- Subject headings (GND)
- Cellular neural network, Localization, Depth of field
- Reviewer
- Prof. Dr. Dr. h.c. Wolfram Hardt
- Supervisor (university)
- M.Sc. Shadi Saleh
- Degree-granting / examining institution
- Technische Universität Chemnitz, Chemnitz
- Version / review status
- Published version / publisher's version
- URN Qucosa
- urn:nbn:de:bsz:ch1-qucosa2-775448
- Qucosa publication date
- 25.01.2022
- Document type
- Master's thesis / state examination thesis
- Language of the document
- English
- License / rights notice