Virtual Reconstruction of Hand-Torn Documents using Discriminative Models

Richter, Fabian

In numerous fields of computer vision, such as object detection, human pose estimation and image classification, machine learning has become an indispensable component for solving application-specific tasks. This thesis proposes and explores new ways of utilizing discriminative models for the virtual reconstruction of hand-torn documents. In this work, reassembling pieces into document pages is accomplished in a bottom-up fashion. We show that discriminative models are suitable for solving various key problems and discuss how they can be fused effectively into a graph-based algorithm. In essence, we use our models to infer different spatial configurations between pieces, which are encoded into the graph's link structure. In contrast to widespread heuristic solutions, supervised learning has a solid theoretical foundation and thus enables a rigorous in-depth analysis of all key components of our proposed method.
We further investigate and thoroughly evaluate new methods for the representation of digital pieces. In order to deal properly with arbitrarily shaped pieces, we present a novel technique for the extraction of content-based features along their outer boundary. Our method allows an effortless integration of widely used features and therefore enables a highly discriminative, multimodal representation. We further propose a new color coding scheme based on the Fisher vector, which is extremely robust in the presence of noise and thus is ideally suited for real-world applications. In addition, we introduce two novel, fully annotated datasets. In order to obtain a ground truth, human experts were asked to reassemble all digitized pieces into pages. This not only lays the basis for supervised learning from annotated examples but also provides the means for a rigorous evaluation. Inspired by existing benchmarks in the aforementioned domains, we introduce two novel performance measures that quantitatively assess the quality of reconstruction results. We extensively evaluate our proposed method and demonstrate its general applicability on three different datasets, where we achieve state-of-the-art results.

Author:	Fabian Richter
URN:	urn:nbn:de:bvb:384-opus4-33918
Frontdoor URL:	https://opus.bibliothek.uni-augsburg.de/opus4/3391
Advisor:	Rainer Lienhart
Type:	Doctoral Thesis
Language:	English
Publishing Institution:	Universität Augsburg
Granting Institution:	Universität Augsburg, Fakultät für Angewandte Informatik
Date of final exam:	2015/11/16
Release Date:	2015/12/30
Tag:	document reconstruction; computer vision; machine learning; graph algorithms; structural support vector machines
GND-Keyword:	Bilderkennung; Dokument; Rekonstruktion; Maschinelles Lernen
Institutes:	Fakultät für Angewandte Informatik
	Fakultät für Angewandte Informatik / Institut für Informatik
Dewey Decimal Classification:	0 Informatik, Informationswissenschaft, allgemeine Werke / 00 Informatik, Wissen, Systeme / 004 Datenverarbeitung; Informatik
Licence (German):	Deutsches Urheberrecht mit Print on Demand

Open Access
