Abstract
Automatic defect detection in industrial optical inspection requires algorithms that can learn from data. A special challenge is data with incomplete labels. Multiple instance learning is one of the methods that machine learning has brought forth to deal with incomplete labels. One trait of this setting is that it groups data points (instances) into bags.
We propose a novel method to predict bag probabilities from given instance probabilities; its advantage is that the results do not depend on bag size. We also propose an extension of the multiple instance model that allows the user to control the number of instances that are classified as positive.
We implement these methods with an algorithm based on the well-known random forest classifier. Results on a standard benchmark dataset show competitive performance. Furthermore, we apply this algorithm to image data that reflects the challenges of industrial optical inspection, and we show that in this setting it improves over the standard random forest.
Document type: | Dissertation |
---|---|
Supervisor: | Hamprecht, Prof. Dr. Fred A. |
Date of thesis defense: | 23 July 2014 |
Date Deposited: | 11 Sep 2014 07:35 |
Date: | 2014 |
Faculties / Institutes: | The Faculty of Physics and Astronomy > Dekanat der Fakultät für Physik und Astronomie; Service facilities > Interdisciplinary Center for Scientific Computing |
DDC-classification: | 004 Data processing Computer science; 310 General statistics; 670 Manufacturing |