TY - GEN
UR - https://archiv.ub.uni-heidelberg.de/volltextserver/17287/
ID - heidok17287
TI - Multiple Instance Learning with Random Forests and Applications in Industrial Optical Inspection
Y1 - 2014///
AV - public
A1 - Wieler, Matthias
N2 - Automatic defect detection in industrial optical inspection requires algorithms that can learn from data. A special challenge is data with incomplete labels. One of the methods that the field of machine learning has brought forth to deal with incomplete labels is multiple instance learning. One trait of this setting is that it groups datapoints (instances) into bags. We propose a novel method to predict bag probabilities from given instance probabilities that has the advantage that its results do not depend on bag size. Also, we propose an extension of the multiple instance model that allows the user to steer the number of instances that are classified as positive. We implement these methods with an algorithm based on the well-known random forest classifier. Results on a standard benchmark dataset show competitive performance. Furthermore, we apply this algorithm to image data that reflects the challenges of industrial optical inspection, and we show that in this setting it improves over the standard random forest.
ER -