Panareda Busto, Pau: Domain Adaptation for Image Recognition and Viewpoint Estimation. - Bonn, 2020. - Dissertation, Rheinische Friedrich-Wilhelms-Universität Bonn.
Online-Ausgabe in bonndoc: https://nbn-resolving.org/urn:nbn:de:hbz:5-59574
@phdthesis{handle:20.500.11811/8582,
urn = {https://nbn-resolving.org/urn:nbn:de:hbz:5-59574},
author = {{Pau Panareda Busto}},
title = {Domain Adaptation for Image Recognition and Viewpoint Estimation},
school = {Rheinische Friedrich-Wilhelms-Universität Bonn},
year = 2020,
month = sep,

note = {Image-based recognition tasks require large amounts of training data to capture as many visual traits as possible. In many situations, however, collecting image data demands tedious effort or, even worse, the test scenarios remain unknown. On top of that, the labelling process is very time-consuming, expensive and error-prone. Access to fast, cheap and accurately labelled data thus arises as one of the main challenges in classification problems. In this work, we present three major contributions that attenuate these issues in image recognition and viewpoint estimation problems. Overall, the main goal is to reduce the data collection and labelling effort.
To achieve this, we first introduce a novel domain adaptation method that allows datasets from different domains to take part in the training process and contribute to improved classification accuracies. We also revisit the unrealistic setting of domain adaptation evaluation datasets and introduce open set domain adaptation for target domains that additionally contain irrelevant samples belonging to unknown classes.
We then propose an optimisation process for fine viewpoint labelling and use synthetic data to refine viewpoints that are coarsely annotated by humans in real images. To this end, given the differences between the real and the synthetic data, we apply domain adaptation to align both domains and improve the viewpoint refinement. The results show that 3D generated models can successfully be used to refine labels in real images.
Finally, we present an end-to-end multi-task neural network that jointly trains viewpoints and keypoints of rigid objects. We also reinforce the real training data with a novel synthetic dataset that contains annotations for both problems. The experiments show that the proposed approach successfully exploits the implicit correlation between the tasks and outperforms previous techniques that are trained independently.},

url = {https://hdl.handle.net/20.500.11811/8582}
}