Robust Semantic Segmentation with Deep Learning

Kamann, Christoph

PDF, English - main document

Citation of documents: Please do not cite the URL displayed in your browser's address bar; instead, use the DOI, URN, or persistent URL below, as we can guarantee their long-term accessibility.

DOI: 10.11588/heidok.00030118
URN: urn:nbn:de:bsz:16-heidok-301182
URL: http://www.ub.uni-heidelberg.de/archiv/30118

Abstract

Semantic image segmentation is a widely studied field in computer vision with diverse applications, such as medical diagnostics and autonomous transportation. Neural networks define the state of the art among vision algorithms for semantic segmentation. Understanding how robust a network is to a diverse set of image corruptions is essential when developing a segmentation network. In this thesis, we present an exhaustive study of network robustness for semantic segmentation. Our study is divided into two parts. The first part is about understanding the robustness of neural networks. We use a database of almost 400,000 images created from the PASCAL VOC 2012, ADE20K, and Cityscapes datasets to evaluate the performance of neural architectures. We benchmark entire neural network architectures as well as particular architectural properties established for semantic segmentation. We also consider the data-driven side, where we examine such networks' generalization capabilities. Building on the first part, the second part of this thesis focuses on increasing robustness. We build upon an insight from image classification that output robustness can be improved by increasing the network's bias towards object shapes. We present a new training scheme that increases this shape bias. Our basic idea is to alpha-blend a portion of the RGB training images with faked images, in which each class label is given a fixed, randomly chosen color that is not likely to appear in real imagery. This forces the network to rely more strongly on shape cues than on texture cues. We call this data augmentation technique "Painting-by-Numbers"; we provide an extensive experimental evaluation and propose a method to validate such shape-based techniques.
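
The alpha-blending idea described in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation: the function names, the `blend_prob` and `alpha` parameters, the uniform sampling of `alpha`, and the plain nested-list image representation are all assumptions made for illustration. The palette here is simply random; the abstract's requirement that the colors be unlikely to appear in real imagery is not modeled.

```python
import random


def make_class_palette(num_classes, seed=0):
    """Assign each class label one fixed, randomly chosen RGB color.

    Assumption: a seeded random palette; no check is made that the
    colors are rare in real images.
    """
    rng = random.Random(seed)
    return [tuple(rng.randrange(256) for _ in range(3))
            for _ in range(num_classes)]


def paint_by_numbers(image, labels, palette, alpha):
    """Alpha-blend an RGB image with its color-coded label map.

    image:  H x W nested list of (r, g, b) tuples, values in 0..255
    labels: H x W nested list of integer class ids
    alpha:  weight of the painted image (0 keeps the original image,
            1 keeps only the class colors)
    """
    blended = []
    for img_row, label_row in zip(image, labels):
        row = []
        for pixel, cls in zip(img_row, label_row):
            color = palette[cls]
            row.append(tuple(round((1 - alpha) * p + alpha * c)
                             for p, c in zip(pixel, color)))
        blended.append(row)
    return blended


def augment(image, labels, palette, blend_prob=0.5, rng=random):
    """Blend only a portion of the training images (hypothetical policy).

    With probability blend_prob the image is blended using a uniformly
    sampled alpha; otherwise it is returned unchanged.
    """
    if rng.random() < blend_prob:
        alpha = rng.uniform(0.0, 1.0)
        return paint_by_numbers(image, labels, palette, alpha)
    return image
```

In a training pipeline, `augment` would be called per sample before the image is passed to the network, while the label map itself stays unchanged as the training target. An array library would normally replace the nested loops; plain lists are used here only to keep the sketch free of dependencies.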

Document type:	Dissertation
Supervisor:	Rother, Prof. Dr. Carsten
Place of Publication:	Heidelberg
Date of thesis defense:	9 June 2021
Date Deposited:	02 Jul 2021 09:16
Date:	2021
Faculties / Institutes:	The Faculty of Mathematics and Computer Science > Department of Computer Science
DDC-classification:	004 Data processing Computer science