eprintid: 17287
rev_number: 14
eprint_status: archive
userid: 1336
dir: disk0/00/01/72/87
datestamp: 2014-09-11 07:35:08
lastmod: 2014-09-18 08:35:46
status_changed: 2014-09-11 07:35:08
type: doctoralThesis
metadata_visibility: show
creators_name: Wieler, Matthias
title: Multiple Instance Learning with Random Forests and Applications in Industrial Optical Inspection
subjects: 004
subjects: 310
subjects: 670
divisions: 130001
divisions: 708000
adv_faculty: af-13
abstract: Automatic defect detection in industrial optical inspection requires algorithms that can learn from data. A special challenge is data with incomplete labels. One of the methods that the field of machine learning has brought forth to deal with incomplete labels is multiple instance learning. One trait of this setting is that it groups datapoints (instances) into bags. We propose a novel method to predict bag probabilities from given instance probabilities that has the advantage that its results do not depend on bag size. Also, we propose an extension of the multiple instance model that allows the user to steer the number of instances that are classified as positive. We implement these methods with an algorithm based on the well-known random forest classifier. Results on a standard benchmark dataset show competitive performance. Furthermore, we apply this algorithm to image data that reflects the challenges of industrial optical inspection, and we show that in this setting it improves over the standard random forest.
date: 2014
id_scheme: DOI
id_number: 10.11588/heidok.00017287
ppn_swb: 1659197406
own_urn: urn:nbn:de:bsz:16-heidok-172875
date_accepted: 2014-07-23
language: eng
bibsort: WIELERMATTMULTIPLEIN2014
full_text_status: public
citation: Wieler, Matthias (2014) Multiple Instance Learning with Random Forests and Applications in Industrial Optical Inspection. [Dissertation]
document_url: https://archiv.ub.uni-heidelberg.de/volltextserver/17287/1/PhDThesis_Wieler.pdf