eprintid: 35416
rev_number: 11
eprint_status: archive
userid: 8445
dir: disk0/00/03/54/16
datestamp: 2024-09-25 06:44:01
lastmod: 2024-09-25 06:59:38
status_changed: 2024-09-25 06:44:01
type: doctoralThesis
metadata_visibility: show
creators_name: Boll, Bastian Benjamin
title: On Structured Prediction of Discrete Data: Geometry and Statistical Learning
divisions: i-110400
adv_faculty: af-11
abstract: Structured prediction is the task of jointly predicting realizations of multiple coupled random variables. This statistical problem is central to many advanced applications of deep learning, including image segmentation and graph node classification. This thesis presents a two-pronged study of predicting structured discrete data, exploring geometric aspects and statistical learning. On the geometric side, we first interpret distributions of independent discrete random variables as points on a product manifold of probability simplices. We find that this manifold is isometrically embedded into the meta-simplex of joint probability distributions. This finding illuminates the relationship between inference dynamics on the product manifold, called assignment flows, and replicator dynamics on the meta-simplex. The former can be seen as the replicator dynamics of multi-population games and the constructed embedding formally reduces them to high-dimensional single-population game dynamics. Based on these geometric insights, we propose two types of generative models for discrete data by facilitating measure transport through randomized assignment flows. The first approximates a given energy-based model, while the second is learned directly from data. Experiments on image segmentation data illustrate the viability of the proposed method. With regard to statistical learning, we explore current methods in PAC-Bayesian risk certification and propose a classification approach with favorable computational properties. Further, we develop a novel PAC-Bayesian risk bound for structured prediction, which can account for generalization even from a single structured datum. The lack of independent data is addressed by distilling the coupling structure of the joint data distribution, given as a Knothe-Rosenblatt rearrangement of a reference measure, allowing for the use of modern concentration of measure results.
date: 2024
id_scheme: DOI
id_number: 10.11588/heidok.00035416
ppn_swb: 1903432634
own_urn: urn:nbn:de:bsz:16-heidok-354167
date_accepted: 2024-09-17
language: eng
bibsort: BOLLBASTIAONSTRUCTUR20240920
full_text_status: public
place_of_pub: Heidelberg
citation: Boll, Bastian Benjamin (2024) On Structured Prediction of Discrete Data: Geometry and Statistical Learning. [Dissertation]
document_url: https://archiv.ub.uni-heidelberg.de/volltextserver/35416/1/boll_dissertation.pdf