The primary endpoint in oncology is usually overall survival, where differences between therapies may only be observable after many years. To avoid withholding a promising therapy, preliminary approval based on a surrogate endpoint is possible. The approval can be confirmed later by assessing overall survival. When planning and analysing trials in this context, the correlation between the surrogate endpoint and overall survival has to be taken into account. For the binary surrogate endpoint response, this relation can be modeled by means of the responder stratified exponential survival (RSES) model that was proposed elsewhere. The RSES model has three parameters: the response probability, the logarithmic hazard of responders, and the logarithmic hazard of non-responders. The aim of this dissertation is to investigate the RSES model and to develop and evaluate methods for parameter estimation, hypothesis testing, and sample size calculation within the RSES model.
Estimators for the parameters are derived by the maximum likelihood method. Approximate confidence intervals for the model parameters are constructed and are found to have very satisfactory coverage probability. A hypothesis test for the difference of model parameters between two treatment groups, called the approximate RSES test, is constructed. When it is compared with the logrank test and the stratified logrank test regarding power, the results vary with the scenario. When the survival benefit in one group is mainly due to a response benefit, the approximate RSES test is considerably more powerful than the other tests. Approximate confidence intervals for the parameter differences are derived and show very satisfactory coverage probability. Where possible, exact formulas for the calculation of coverage probabilities and rejection probabilities are given. An approximate and an exact sample size calculation method for the approximate RSES test are developed. The sample size calculation method is applied to a clinical example, and the power of the approximate RSES test, the logrank test, and the stratified logrank test is compared within this example. The approximate RSES test turns out to be considerably more powerful.
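To make the structure of the RSES model concrete, the following minimal sketch simulates one treatment arm under the model (binomially distributed response, exponential survival within the responder and non-responder strata) and computes standard maximum likelihood estimates with Wald-type confidence intervals on the log-hazard scale. The symbols (pi for the response probability, lam_r and lam_n for the hazards) and the censoring scheme are illustrative assumptions; the exact estimators, tests, and sample size formulas derived in the thesis are not reproduced here.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# --- Simulate one arm under an RSES-type model (assumed notation, illustrative values) ---
# pi: response probability, lam_r / lam_n: hazards of responders / non-responders
n, pi, lam_r, lam_n, censor_at = 200, 0.4, 0.05, 0.15, 36.0
resp = rng.random(n) < pi
haz = np.where(resp, lam_r, lam_n)
t = rng.exponential(1.0 / haz)
event = t <= censor_at                      # administrative censoring
time = np.minimum(t, censor_at)

# --- Maximum likelihood estimates ---
pi_hat = resp.mean()                        # binomial ML estimate of the response probability
def exp_mle(mask):
    d = event[mask].sum()                   # number of events in the stratum
    T = time[mask].sum()                    # total follow-up time in the stratum
    return d / T, d
lam_r_hat, d_r = exp_mle(resp)
lam_n_hat, d_n = exp_mle(~resp)

# --- Approximate (Wald-type) 95% confidence intervals on the log-hazard scale ---
z = stats.norm.ppf(0.975)
for name, lam_hat, d in [("responders", lam_r_hat, d_r), ("non-responders", lam_n_hat, d_n)]:
    lo, hi = np.exp(np.log(lam_hat) + np.array([-z, z]) / np.sqrt(d))
    print(f"log-hazard {name}: estimate {np.log(lam_hat):.3f}, hazard CI ({lo:.3f}, {hi:.3f})")
print(f"response probability: {pi_hat:.3f}")
```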
It is discussed that the assumptions of the RSES model are relatively strict. Also, the results of the approximate RSES test have to be interpreted carefully, since a rejection of the null hypothesis does not necessarily translate into a uniform survival benefit. In practice, more flexible methods may be desired for estimating and testing survival distributions conditional on a binary response variable. Testing could be based on an effect measure that indicates survival benefit, like the Restricted Mean Survival Time (RMST). Combining a non-parametric survival estimation method that considers the response status with a meaningful effect measure like the RMST could be a flexible way to analyse studies in the described context. When planning such a study under concrete assumptions, the RSES model can be applied. Also, it is pointed out that the approach presented in this thesis is applicable to other parametric survival models. Further research is needed to develop distribution estimators and a test of survival difference with more flexible distributional assumptions, as well as to extend the methods to the situation of an early interim decision based on response rates.
It is concluded that this thesis contains a comprehensive investigation of the RSES model. It provides point estimators and confidence interval estimators that are necessary for applying the RSES model in practice. Furthermore, the general approaches used in this dissertation regarding the derivation of estimators, confidence intervals, hypothesis tests, sample size calculations, and exact calculations can be applied to further models describing the relationship between a surrogate endpoint and a survival endpoint.
The aim of this thesis is to develop a rigorous and consistent framework for all statistical aspects of planning and evaluating a single-arm phase II trial with the binary endpoint ‘tumour response to treatment’. This includes guidance on the definition of a situation-specific objective criterion under planning uncertainty, methods to react flexibly to new trial-external evidence that might arise during the course of the trial, and inference after concluding the trial. To this end, a novel numerical approach is presented which makes the global optimisation of such designs feasible in practice and improves existing approaches in terms of both flexibility and speed. The problem of incorporating a priori uncertainty about the true effect size in the planning process is discussed in detail, taking a Bayesian perspective on quantifying uncertainty about the unknown response probability p. Subsequently, the close interplay between point estimation, p values, confidence intervals, and the final test decision is illustrated, and a framework is developed which allows consistent and efficient inference in binary single-arm two-stage designs. Finally, issues are addressed that may arise during the implementation of the proposed methods in practice. In particular, the problem of unplanned design modifications is revisited, and the distinction between pre-specified adaptations within optimal two-stage designs and unplanned adaptations of ongoing designs is discussed in more depth.
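As a hedged illustration of the kind of exact evaluation that any optimisation of binary single-arm two-stage designs has to perform repeatedly, the sketch below computes the rejection probability, the early stopping probability, and the expected sample size of a generic two-stage design with a futility stop after stage one. The design parameters and hypotheses are made up for illustration; the sketch does not implement the optimisation framework developed in the thesis.

```python
from scipy.stats import binom

def two_stage_operating_characteristics(n1, r1, n, r, p):
    """Exact rejection probability of a generic binary two-stage design:
    stop for futility after n1 patients if at most r1 responses are seen;
    otherwise enrol up to n patients and reject H0 if total responses exceed r."""
    n2 = n - n1
    prob_reject = 0.0
    for x1 in range(r1 + 1, n1 + 1):                     # trial continues to stage two
        need = r - x1 + 1                                # further responses needed to reject
        p_stage2 = 1.0 if need <= 0 else binom.sf(need - 1, n2, p)
        prob_reject += binom.pmf(x1, n1, p) * p_stage2
    early_stop = binom.cdf(r1, n1, p)                    # probability of a futility stop
    expected_n = n1 + (1 - early_stop) * n2
    return prob_reject, early_stop, expected_n

# Illustrative design and hypotheses (not taken from the thesis)
p0, p1 = 0.20, 0.40
alpha, pet0, en0 = two_stage_operating_characteristics(n1=15, r1=3, n=40, r=11, p=p0)
power, _, _ = two_stage_operating_characteristics(n1=15, r1=3, n=40, r=11, p=p1)
print(f"type I error {alpha:.3f}, power {power:.3f}, "
      f"P(early stop | p0) {pet0:.3f}, E[N | p0] {en0:.1f}")
```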
This doctoral thesis addresses critical methodological aspects of machine learning experimentation, focusing on enhancing the evaluation and analysis of algorithm performance. The established "train-dev-test paradigm" commonly guides machine learning practitioners, involving nested optimization of model parameters and meta-parameters and benchmarking against test data. However, this paradigm overlooks crucial aspects, such as algorithm variability and the intricate relationship between algorithm performance and meta-parameters. This work introduces a comprehensive framework that employs statistical techniques to bridge these gaps, advancing the methodological standards in empirical machine learning research. The foundational premise of this thesis lies in differentiating between algorithms and classifiers, recognizing that an algorithm may yield multiple classifiers due to inherent stochasticity or design choices. Consequently, algorithm performance becomes inherently probabilistic and cannot be captured by a single metric. The contributions of this work are structured around three core themes:
Algorithm Comparison: A fundamental aim of empirical machine learning research is algorithm comparison. To this end, the thesis proposes utilizing Linear Mixed Effects Models (LMEMs) for analyzing evaluation data. LMEMs offer distinct advantages by accommodating complex data structures beyond the typical independent and identically distributed (iid) assumption. Thus, LMEMs enable a holistic analysis of algorithm instances and facilitate the construction of nuanced conditional models of expected risk, supporting algorithm comparisons based on diverse data properties (a brief illustrative sketch follows after the three themes).
Algorithm Performance Analysis: Contemporary evaluation practices often treat algorithms and classifiers as black boxes, hindering insights into their performance and parameter dependencies. Leveraging LMEMs, specifically implementing Variance Component Analysis, the thesis introduces methods from psychometrics to quantify algorithm performance homogeneity (reliability) and assess the influence of meta-parameters on performance. The flexibility of LMEMs allows a granular analysis of this relationship, and the thesis extends these techniques to analyze data annotation processes linked to algorithm performance.
Inferential Reproducibility: Building upon the preceding chapters, this section showcases a unified approach to analyze machine learning experiments comprehensively. By leveraging the full range of generated model instances, the analysis provides a nuanced understanding of competing algorithms. The outcomes offer implementation guidelines for algorithmic modifications and consolidate incongruent findings across diverse datasets, contributing to a coherent empirical perspective on algorithmic effects.
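To illustrate the LMEM-based analysis advocated in the Algorithm Comparison theme above, the following sketch fits a random-intercept model to synthetic evaluation data (algorithm x dataset x seed) with statsmodels. The data-generating process and the simple model formula are assumptions chosen for demonstration; the concrete model specifications used in the thesis may be richer.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)

# Synthetic evaluation data: two algorithms, several datasets, several seeds each.
datasets, seeds = [f"d{i}" for i in range(8)], range(10)
rows = []
for d in datasets:
    base = rng.normal(0.75, 0.05)                 # dataset-specific difficulty
    for algo, shift in [("A", 0.00), ("B", 0.02)]:
        for s in seeds:                           # seed-to-seed variability of the algorithm
            rows.append({"dataset": d, "algorithm": algo, "seed": s,
                         "accuracy": base + shift + rng.normal(0, 0.02)})
df = pd.DataFrame(rows)

# Random-intercept model: algorithm as fixed effect, dataset as grouping factor.
model = smf.mixedlm("accuracy ~ algorithm", df, groups=df["dataset"])
result = model.fit()
print(result.summary())                           # fixed effect of algorithm B vs A
```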
This work underscores the significance of addressing algorithmic variability, meta-parameter impact, and the probabilistic nature of algorithm performance. This thesis aims to enhance the transparency, reproducibility, and interpretability of machine learning experiments by introducing robust statistical methodologies that facilitate extensive empirical analysis. It extends beyond conventional guidelines, offering a principled approach to advance the understanding and evaluation of algorithms in the evolving landscape of machine learning and data science.
Statistical methods play a crucial role in modern astronomical research. The development and understanding of these methods will be of fundamental importance to future work on large astronomical surveys. In this thesis I showcase three different statistical approaches to survey data. I first apply a semi-supervised dimensionality reduction technique to cluster similar high resolution spectra from the GALAH survey to identify 54 candidate extremely metal-poor stars. The approach shows promising potential for implementation in future large-scale stellar spectroscopic surveys. Next, I employ a method to classify sources in the Gaia survey as stars, galaxies or quasars, making use of additional infrared photometry from CatWISE2020 and discussing the importance of applying adjusted priors to probabilistic classification. Lastly, I utilise a method to estimate the rotational parameters of star clusters in Gaia, with an application to open clusters. This is done by considering the rotation of a cluster as a 3D solid body, and finding the best fitting parameters by sampling constructed likelihood functions. The methods developed in this thesis underscore the significant contributions statistical methodologies make to astronomy, and illustrate how the development and application of statistical methods will be essential for extracting meaningful insights from future large scale astronomical surveys.
In this thesis, estimation in regression and classification problems which include low dimensional structures is considered. The underlying question is the following: How well do statistical learning methods perform for models with low dimensional structures? We approach this question using various algorithms in various settings. For our first main contribution, we prove optimal convergence rates in a classification setting using neural networks. While non-optimal rates existed for this problem, we are the first to prove optimal ones. Secondly, we introduce a new tree based algorithm we named random planted forest. It adapts particularly well to models which consist of low dimensional structures. We examine its performance in simulation studies and include some theoretical backing by proving optimal convergence rates in certain settings for a modification of the algorithm. Additionally, a generalized version of the algorithm is included, which can be used in classification settings. In a further contribution, we prove optimal convergence rates for the local linear smooth backfitting algorithm. While such rates have already been established, we bring a new, simpler perspective to the problem which leads to better understanding and easier interpretation. Additionally, given an estimator in a regression setting, we propose a constraint which leads to a unique decomposition. This decomposition is useful for visualising and interpreting the estimator, in particular if it consists of low dimensional structures.
In this thesis, we use observation-driven models for time series of daily RCs. That is, we assume a matrix-variate probability distribution for the daily RCs, whose parameters are updated based on the RC realizations from previous days.
In particular, Chapter 2 looks at different matrix-variate probability distributions for the RCs and their theoretical and empirical properties. Chapter 3 proposes a flexible observation-driven model to update all distribution-specific time-varying parameters, not just the expected value matrix as has been done in the literature so far. Chapter 4 introduces an observation-driven updating mechanism that is applicable to high-dimensional time series of RCs. Each of these three chapters is a self-contained paper.
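A minimal sketch of the general updating idea, not of the specific models proposed in Chapters 2-4, assuming for illustration that the RCs are daily realized covariance matrices: a scalar observation-driven recursion moves the expected RC matrix towards the most recent realization. The Wishart data-generating process and the fixed updating coefficients are further illustrative assumptions; the models developed in the thesis update all distribution-specific parameters and cover high-dimensional settings.

```python
import numpy as np
from scipy.stats import wishart

k, T, df = 3, 250, 20                                # dimension, days, Wishart degrees of freedom

# Toy sequence of daily RC matrices (assumed here to be realized covariances).
true_cov = np.array([[1.0, 0.3, 0.1],
                     [0.3, 1.0, 0.2],
                     [0.1, 0.2, 1.0]])
rc = wishart.rvs(df=df, scale=true_cov / df, size=T, random_state=42)

# Scalar observation-driven update of the expected RC matrix V_t:
#   V_{t+1} = (1 - a - b) * Vbar + a * RC_t + b * V_t   (a, b fixed here, not estimated)
a, b = 0.3, 0.6
v_bar = rc.mean(axis=0)                              # long-run target
v = np.empty_like(rc)
v[0] = v_bar
for t in range(T - 1):
    v[t + 1] = (1 - a - b) * v_bar + a * rc[t] + b * v[t]

# One-step-ahead forecast of the expected RC matrix for day T + 1.
forecast = (1 - a - b) * v_bar + a * rc[-1] + b * v[-1]
print(np.round(forecast, 3))
```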
The aim of this thesis is to evaluate network meta-analysis methods that include both studies with a general patient population and studies that include only a subpopulation identified by means of a biomarker. The situation considered in this thesis is motivated by a real network meta-analysis problem. In this example, a biomarker is identified that divides a patient population into patients with greater and patients with lesser benefit from specific treatments. Later studies therefore included only subpopulations that benefit from such a targeted treatment, whereas earlier studies included mixed patient populations regardless of biomarker status. In addition, it is assumed that only targeted studies are available for a further, newer treatment. To compare these therapies with each other and to include all available studies, a network meta-analysis should be performed. Since the patient populations of the studies are heterogeneous with respect to this biomarker, conducting a network meta-analysis of all available studies and therapies (targeted and non-targeted) poses a challenge for evidence synthesis, especially when only aggregate study data are available. Currently existing methods either assume that individual patient data from the studies are available or do not include all available studies. Therefore, this thesis discusses methods for conducting such a network meta-analysis and evaluates them with respect to the setting described in the following. The network meta-analysis setting is a "triangle" network that contains (little) direct and (much) indirect evidence for a particular comparison. Two settings (A and B) are examined, which differ with respect to the proportion of biomarker-positive patients in the individual studies. The first setting (A) assumes that, for the non-targeted treatment arm, only studies with a patient population that is mixed with respect to a particular biomarker are available. Setting B extends Setting A by assuming that, in addition, two studies with exclusively biomarker-positive patient populations are available for the non-targeted study arm. Based on these two scenarios, three commonly used network meta-analytic methods are examined for analysing this triangle network: the naive approach, which ignores the heterogeneity in the patient population; the stand-alone analysis, which includes only studies with biomarker-positive patients; and network meta-regression. For Setting A, a kind of "missing data" approach is also introduced as a possible solution. In addition, the "enriching through weighting" approach, a method developed in evidence synthesis for combining randomized and non-randomized data, is modified and adapted to the setting of this triangle network. For Setting B, in addition to the naive, stand-alone, and network meta-regression approaches, further modifications of the enriching-through-weighting approach as well as an "informative prior" approach are examined, in which the results of the mixed patient population are used as prior information.
The performance of these different methods is evaluated in a simulation study by computing the mean bias, the root mean squared error, the precision, the coverage probability, and the power with respect to the estimated treatment effect for the comparison between targeted and non-targeted therapies. In addition, a recent clinical dataset with targeted and non-targeted therapies is analysed and discussed. The results of the simulation study for both Setting A and Setting B show that none of the methods can be regarded as clearly favourable across all examined scenarios. However, the missing-data approach, the stand-alone analysis, and the naive estimation perform comparably to or worse than the other methods across all evaluated performance measures and simulation scenarios and are therefore not recommended. While substantial between-study heterogeneity poses a challenge for all approaches, the performance of network meta-regression, the enriching-through-weighting approach, and the informative-prior approach depends mainly on the simulation scenario and the individual evaluation measure. Since these estimation methods are moreover based on slightly different assumptions, some of which require additional information to be available for estimation, sensitivity analyses are recommended wherever possible. For Setting A, the enriching-through-weighting approach is recommended as the most favourable solution, with the network meta-regression approach additionally recommended as a sensitivity analysis. If biomarker-positive studies are available for the non-targeted study arm (Setting B), the informative-prior or the enriching-through-weighting approach are advantageous. The network meta-regression approach is additionally recommended as a sensitivity analysis. All approaches are based on Bayesian models. In this thesis, an alternative heterogeneity prior, a half-normal prior with scale 0.5, was analysed for Setting B in addition to the uniform heterogeneity prior. Using this more informative prior does not change how the individual approaches perform in terms of bias or root mean squared error, but it leads to a lower coverage probability and higher precision for most approaches and, for some scenarios, to an even higher power. Therefore, a sensitivity analysis with this heterogeneity prior is recommended when a Bayesian network meta-analysis is performed. This thesis provides an overview of methods for conducting a network meta-analysis with differing patient populations and contributes to the field of population-adjusted network meta-analyses. Although no method clearly performed best in all examined scenarios, the enriching-through-weighting approach, the network meta-regression model, and the informative-prior model are recommended for conducting a network meta-analysis that makes use of all available evidence.
More deaths can be attributed to lung cancer than to any other cancer type. Evidence collected over the last 10 years, from randomized trials in the USA and Europe, indicates that screening by means of low-dose computed tomography (LDCT) could reduce the number of lung cancer (LC) deaths by about 20%-24%. While these findings have led to the implementation of screening programs in the USA, South Korea and Poland, discussions on their optimal design and execution are still ongoing in various countries, including Germany. Optimizing screening means finding the right balance between mortality reduction and risks, harms, and monetary costs. LDCT scans are expensive, expose participants to radiation and put them at risk for overdiagnosis, as well as at risk for unnecessary invasive and expensive confirmatory procedures triggered by false positive (FP) results. Minimizing the number of unnecessary screening and confirmatory examinations should be prioritized. While risk-based eligibility has been shown to best target candidates, questions regarding optimal screening frequency, accurate nodule evaluation, stop-screening criteria to reduce overdiagnosis, and the use of complementary non-invasive diagnostic methods remain open. Statistical models and biomarkers have been developed to help answer these questions. However, there is limited evidence of their validity in data from screening contexts and populations other than those in which they were developed. The analyses presented in this thesis are based on data collected as part of the German Lung Cancer Screening Intervention (LUSI) trial in order to validate models that address the questions: 1) can candidates for biennial vs annual screening be identified on the basis of their LC risk? 2) can the number of FP test results be reduced by accurately estimating the malignancy of LDCT-detected nodules? 3) what was the extent of overdiagnosis in the LUSI trial and how does overdiagnosis risk relate to the age and remaining lifetime of participants? Additionally, blood samples from participants of the LUSI trial were measured to evaluate: 4) whether the well-validated diagnostic biomarker test EarlyCDT®-Lung is sensitive enough to detect tumors seen in LDCT images. The LCRAT+CT and Polynomial models predict LC risk based on subject characteristics and LDCT imaging findings. Results of this first external validation confirmed their ability to identify participants with LC detected within 1-2 years after first screening. Discrimination was higher compared to a criterion based on nodule size and, to a lesser degree, compared to a model based on smoking and subject characteristics (LCRAT). This suggested that while LDCT findings can enhance models, most of their performance could be attributed to information on smoking. Skipping 50% of annual LDCT examinations (i.e., for participants with estimated risks below the 5th decile) would have caused <10% delayed diagnoses, indicating that candidates for biennial screening could be identified based on their predicted LC risks without compromising on early detection. Absolute risk estimates were, on average, below the observed LC rates, indicating poor calibration. Models developed using data from the Canadian screening study PanCan showed excellent ability to differentiate between tumors and non-malignant nodules seen on LDCT scans taken at first screening participation and to accurately predict absolute malignancy risk. However, they showed lower performance when applied to data on nodules detected in later rounds.
In contrast, a model developed on data from the UKLS trial and models developed on data from clinical settings did not perform as well in any screening round. Excess incidence of screen-detected lung tumors, an estimator of overdiagnosis, was within the range of values reported by other trials after similar post-screening follow-up (ca. 5-6 years). Estimates of mean pre-clinical sojourn time (MPST) and LDCT detection sensitivity were obtained via mathematical modeling. The highest excess incidence and longest MPST estimates were found among adenocarcinomas. The proportion of tumors with long lead times predicted based on MPST estimates (e.g., 23% with lead times ≥8 years) suggested a substantial overdiagnosis risk for individuals with residual life expectancies shorter than these hypothetical lead times, for example for heavy smokers over the age of 75. The tumor autoantibody panel measured by EarlyCDT®-Lung, a test widely validated as a diagnostic tool in clinical settings and recently tested as a pre-screening tool in a large randomized Scottish trial (ECLS), was found to have insufficient sensitivity for the identification of lung tumors detected via LDCT and of participants with screen-detected pulmonary nodules for whom more invasive diagnostic procedures should be recommended. Overall, the findings presented in this thesis indicate that risk prediction models can help optimize LC screening by assigning participants to appropriate screening intervals, and by increasing the accuracy of nodule evaluation. However, there is a need for further external model validation and re-calibration. Additionally, while excess incidence can provide estimates of overdiagnosis risk at a population level, a better approach would be to obtain model-based personalized estimates of tumor lead time and residual lifetime. Better individualized decisions about whether to start or stop screening could be taken on the basis of the relationship between these estimates and the risk of overdiagnosis. Finally, although there is evidence for the potential of biomarkers to complement LC screening, the so far most promising candidate (EarlyCDT®-Lung) cannot be recommended as a pre-screening tool given its poor sensitivity for the identification of lung tumors detected via LDCT. In conclusion, while steps have been taken in the right direction, more research is required in order to answer all open questions regarding the optimal design of lung cancer screening programs.
This dissertation places the reasons for pregnancy conflict in the context of the discourse on abortion in Germany, combining aspects of a historical-theoretical work with those of an empirical retrospective study.
The theoretical part first addresses the history of abortion from antiquity to the present and presents the numbers of abortions recorded in Germany since the 1970s. Subsequently, the discourse up to the year 2021 surrounding the abortion legislation in force since 1 October 1995, the medico-legal and medico-ethical dilemmas, and their practical consequences are treated in detail: the difficult parliamentary path to a uniform regulation for reunified Germany is traced; the counselling concept established by the reform (under which a pregnant woman may, after counselling, terminate a pregnancy without punishment within the first 12 weeks after conception) is examined more closely; the "medical indication" (under which termination of pregnancy is possible until the onset of labour) and its challenges in dealing with potentially disabled children are discussed; and the situation and role of physicians with regard to abortion are presented.
The empirical part examines the reasons for pregnancy conflict. To this end, a recording method was first established retrospectively on the basis of more than 1,800 conflict cases, allowing a standardized recording and evaluation of these often highly individual cases. In addition to a detailed analysis of the (main) reasons for conflict, the evaluation covers further aspects, such as resources that can move or encourage a pregnant woman to carry the child to term. The results are also compared with existing data on reasons for pregnancy conflict. It emerges that although pregnancy conflict is often multicausal, the dominant reason is relationship problems. Differentiating the reasons for conflict further shows that pressure and influence exerted by third parties on the pregnant woman, in particular the rejection by the child's father of carrying the pregnancy to term, is the decisive reason for the woman's conflict situation.
A concluding discussion section relates essential contents of the theoretical part to the results of the empirical part and draws conclusions. In particular, it is pointed out that in the debate about abortion, in which essentially the child's right to life and the woman's self-determination appear to stand irreconcilably opposed, one important factor is frequently omitted: for a large proportion of women in pregnancy conflict, the conflict situation is primarily caused by direct or indirect pressure from third parties. This little-noticed and numerically underestimated group of pregnant women finds too little help and protection under the current legal situation, and a further liberalization of abortion would further worsen the situation of these women.
The scientific advances in medical research in the last two decades have shifted the focus to a personalized treatment approach. An immanent consequence is the need for clinical trials which cover these new treatment approaches. A group of clinical trial designs which account for this are gathered under the generic term master protocols. The basket trial design has evolved as the most prominent master protocol design and investigates one treatment in several different diseases. The joint investigation is justified by a common characteristic, such as a genetic aberration, which is present in all of the diseases and which is used as an effect pathway by the investigated treatment. Basket trial designs have been an active field of research with respect to the statistical tools and characteristics of such trials. This has led to an unclear situation in the literature and an increasing level of complexity in the statistical tools which are proposed for use throughout a basket trial. However, practical application of, as well as increased interest in, basket trials is hampered if the complexity increases and no accessible entry point to the topic is available. Hence, the aim of this thesis was to introduce a systematic approach to basket trial designs and to investigate the statistical tools in order to facilitate, connect, and improve them, all with the intention to make the complete basket trial setting more accessible, understandable, and applicable from a statistical perspective. The systematic approach towards basket trial designs elaborated here consists of two aspects: first, a categorization of the trial designs and, second, a modular construction kit for basket trials. The categorization of basket trials is based on the purpose of the trial and on the statistical techniques that are applied. The modular construction kit separates a basket trial into four different components and presents available statistical tools for the components in a common notation. It moreover elaborates the methodological connections among the sharing tools and shows that they use different techniques. However, even though their complexity varies strongly, the tools are connected with each other or can be the same, even if they were proposed in different ways in different publications. The modular construction kit additionally serves as a catalogue to look up the available statistical tools when a basket trial is planned. The decision tools in basket trials were investigated with a focus on the difference in the statistical methodologies, namely between the frequentist one-sided binomial test and the Bayesian decision based on the posterior distribution from a beta-binomial model. It was shown that the decision tools can be tuned such that the same decisions are made. The difference between the frequentist p-value and the Bayesian posterior probability under a uniform prior was quantified analytically, and it was shown by how much the two decision measures deviate from each other. With the elaborated difference, the p-value and the posterior probability can be given as functions of each other and therefore can be used interchangeably. The practical feasibility of that relationship for basket trials was shown by converting the decision tools of a frequentist design into Bayesian decisions. Additionally, the connections between the other decision tools from the construction kit were investigated.
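The following small numerical illustration shows, for one basket, the two decision measures discussed above side by side: the frequentist one-sided binomial p-value and the posterior probability of the null hypothesis under a beta-binomial model with a uniform prior. It also uses the well-known identity that the binomial p-value equals a beta probability whose first shape parameter is shifted by one, which is the kind of structural link that makes the two measures convertible; the numbers are made up and the exact conversion elaborated in the thesis is not reproduced here.

```python
from scipy.stats import binom, beta

# One basket: x responses out of n patients, null response rate p0 (illustrative values).
n, x, p0 = 25, 9, 0.2

# Frequentist one-sided binomial test of H0: p <= p0.
p_value = binom.sf(x - 1, n, p0)                   # P(X >= x | p0)

# Bayesian beta-binomial model with uniform Beta(1, 1) prior:
# the posterior is Beta(1 + x, 1 + n - x).
post_prob_h0 = beta.cdf(p0, 1 + x, 1 + n - x)      # P(p <= p0 | data)

# Known identity: the p-value coincides with a beta probability whose first
# shape parameter is shifted by one, so the two measures can be converted.
p_value_as_beta = beta.cdf(p0, x, n - x + 1)

print(f"p-value                   : {p_value:.4f}")
print(f"posterior P(p <= p0)      : {post_prob_h0:.4f}")
print(f"p-value via Beta(x, n-x+1): {p_value_as_beta:.4f}")
```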
The construction kit showed that the hierarchical model with a normally distributed, logit-transformed response rate is the basis for the majority of the sharing tools. In this thesis, a detailed investigation of a hierarchical model directly relying on the beta-distributed, non-transformed response rate was conducted with respect to its feasibility as a basic sharing tool in basket trials. It was shown that the non-transformed model shares information to a slightly stronger degree, that the different underlying distributional assumption for the response rate persists, and that, in general, it is a feasible sharing tool which does have advantages in the interpretation of the hyperparameters. Therefore, its use in basket trial designs should be further investigated in future research. To conclude, this thesis provides a thorough investigation of basket trial designs: it starts with the elaboration of a systematic approach to them and continues with the investigation of the particular components and their statistical tools.
Area-based socioeconomic inequalities in cancer survival have been reported in several countries and for several cancer sites, showing that cancer patients living in affluent regions have better survival than those living in deprived regions. It has been shown that deprivation-associated survival disparities might be more apparent when using smaller area-level deprivation measures. Possible reasons for these survival disparities could originate in differences in clinical prognostic factors or cancer care. The aims of this dissertation were first to give a comprehensive summary of the current literature on socioeconomic differences in lung cancer survival and then mainly to investigate deprivation-associated differences in cancer survival in Germany and whether these differences depend on patient characteristics, clinical prognostic factors or cancer care. Furthermore, a comparison of survival disparities was made between individual and area-based education by using data for patients with colorectal cancer from the Finnish Cancer Registry. First, a systematic review and meta-analysis was conducted including studies reporting a measure of lung cancer survival in relation to education, income, occupation, or composite measures on the individual or area-based level. In total, 23 studies measured socioeconomic status on the individual level and 71 on the area-based level. The meta-analyses revealed a poorer prognosis for lung cancer patients with low individual income. Group comparisons of area-based studies indicated a poorer prognosis for lower socioeconomic groups. A consistent relationship between level of aggregation and effect size could not be confirmed due to heterogeneous reporting of measurements. To investigate the association between municipality-level socioeconomic deprivation and cancer survival in Germany, data for the 25 most common cancer sites from seven population-based cancer registries (covering 32 million inhabitants) were used. Patients were diagnosed in 1998-2014 and socioeconomic deprivation was assessed using the categorized German Index of Multiple Deprivation on the municipality level. Relative survival was estimated using the period approach for 2012-2014, and model-based period analysis was used to calculate the relative excess risk adjusted for age and stage. In total, 2,333,547 cases were included. For most cancer sites, the most deprived quintile had lower 5-year relative survival compared to the least deprived quintile even after adjusting for stage (all cancer sites combined: relative excess risk 1.16, 95% confidence interval 1.14-1.19). To further investigate the underlying reasons for deprivation-associated survival disparities in Germany, data from three clinical cancer registries (Regensburg, Dresden, and Erfurt, covering 4 million inhabitants) were used. Patients diagnosed with lung cancer in 2000-2015 and female patients diagnosed with breast cancer in 2006-2016 were included. For lung cancer, the association of deprivation with overall survival was investigated using Cox regression models. For breast cancer, 5-year relative survival was estimated using the period approach for 2011-2016, and model-based period analysis was used to calculate the relative excess risk. Both models were adjusted for age, stage, and grading, and the breast cancer models additionally for estrogen receptor status. Region-specific analyses and subgroup analyses for patients receiving specific types of treatment were conducted. Overall, 22,905 lung cancer and 31,357 breast cancer cases were included.
For lung cancer, the most deprived group had a lower overall survival compared to the least deprived group in the fully adjusted model. Patients diagnosed with stage I-III showed a lower survival in the most deprived quintile, which persisted when further restricting to surgery but was attenuated for chemo- or radiotherapy subgroups. For breast cancer, the fully adjusted model showed no association between deprivation and 5-year relative survival. By contrast, there was an association between region and breast cancer survival, even after adjustment for socioeconomic deprivation. Regarding the comparison of cancer survival disparities between individual and municipality-level education, data of colorectal cancer patients diagnosed in 2007-2016 in Finland were used. Relative survival and relative excess risk were estimated by sex using the period approach adjusted for age, stage at diagnosis, cancer site, urbanity, hospital district and municipality. In total, 24,462 cases were included. Area-based education revealed smaller effect estimates than individual education in colorectal cancer survival. Associations for individual education persisted even after adjustment for municipality-level education. The results of this dissertation show that a further approach for Germany should be to include individual socioeconomic status as well as area-based indices in analyses of cancer survival disparities. These future studies should include region, prognostic factors, and complete data on cancer treatment, but also other possibly relevant factors such as comorbidities. Furthermore, these analyses should be conducted stratified by cancer site, as the present analyses showed different patterns for different cancer types.
In this doctoral dissertation we will investigate dependence structures in three different cases.
We first provide a framework for empirical process theory of (locally) stationary processes for classes of either smooth or nonsmooth functions. The theory is approached by using the so-called functional dependence measure in order to quantify dependence. This work extends known results for stationary Markov chains and mixing sequences while accounting for additional time dependence. The main contributions consist of functional central limit theorems and nonasymptotic maximal inequalities. These can be employed to show, for example, uniform convergence rates for nonparametric regression with locally stationary noise. We further derive rates for kernel density estimators in the case of stationary and locally stationary observations. A special focus is placed on the functional convergence of the empirical distribution function (EDF). Comparisons with results based on other measures of dependence are carried out, as well.
In a subsequent step, we consider high-dimensional stationary processes where new observations are generated by a noisy transformation of past observations. By means of our previous results we prove oracle inequalities for the empirical risk minimizer if the data is generated by either an absolutely regular mixing sequence (β-mixing) or a Bernoulli shift process under functional dependence. Assuming that the underlying transformation of our data follows an encoder-decoder structure, we construct an encoder-decoder neural network estimator for the prediction of future time steps. We give upper bounds for the expected forecast error under specific structural and sparsity conditions on the network architecture. In a quantitative simulation we discuss the behavior of network estimators under different model assumptions and provide a weather forecast for German cities using data made available by the German Meteorological Service (Deutscher Wetterdienst).
Moving onto a different setting, we study the nonparametric estimation of an unknown survival function with support on the positive real line based on a sample with multiplicative measurement errors. The proposed fully data-driven procedure involves an estimation step of the survival function’s Mellin transform and a regularization of the Mellin transform’s inverse by a spectral cut-off. A data-driven choice of the cut-off parameter balances bias and variance. In order to discuss the bias term, we consider Mellin-Sobolev spaces which characterize the regularity of the unknown survival function by the decay behavior of its Mellin transform. When analyzing the variance term we consider the standard i.i.d. case and incorporate dependent observations in form of Bernoulli shift processes and absolutely regular mixing sequences. In the i.i.d. setting we are able to show minimax-optimality over Mellin-Sobolev spaces for the spectral cut-off estimator.
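A compact numerical sketch of the general idea described above (estimate the Mellin transform from the multiplicatively contaminated observations, then invert along a vertical contour with a spectral cut-off), under strong simplifying assumptions: Gamma-distributed survival times, Uniform(0,1) multiplicative errors with known Mellin transform, and a hand-picked cut-off instead of the fully data-driven choice developed in the thesis.

```python
import numpy as np
from scipy.integrate import trapezoid

rng = np.random.default_rng(3)

# X: unobserved survival times; U: multiplicative error; only Y = X * U is observed.
n = 2000
x = rng.gamma(shape=2.0, scale=1.0, size=n)
u = rng.random(n)                          # Uniform(0, 1) error with E[U^s] = 1 / (s + 1)
y = x * u

# Mellin transform of the survival function S of X:  M_S(s) = E[X^s] / s.
# Since E[Y^s] = E[X^s] * E[U^s], an empirical estimate based on Y alone is
#   M_hat(s) = (s + 1) / s * mean(Y^s).
c, cutoff = 0.5, 10.0                      # contour Re(s) = c; cut-off hand-picked here
t_grid = np.linspace(-cutoff, cutoff, 2001)
s = c + 1j * t_grid
emp = np.array([np.mean(y ** si) for si in s])
m_hat = (s + 1) / s * emp

def s_hat(x0):
    """Truncated inverse Mellin transform (spectral cut-off) evaluated at x0."""
    integrand = m_hat * x0 ** (-s)
    return float(np.real(trapezoid(integrand, t_grid))) / (2 * np.pi)

for x0 in [0.5, 1.0, 2.0, 4.0]:
    truth = np.exp(-x0) * (1 + x0)         # survival function of Gamma(2, 1)
    print(f"x = {x0:3.1f}: estimate {s_hat(x0):.3f}, truth {truth:.3f}")
```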
In recent years, the topic of sex differences has rightfully become a focus of scientific research. Current findings suggest sexual dimorphism in the neurophysiological brain pathways that are crucial for drug-seeking and addictive behavior, but how this affects the underlying neurochemical processes is still widely unexplored. Nevertheless, female subjects have been systematically ignored for decades in the field of microdialysis experiments, and the few existing studies using female animals provide only small numbers. Moreover, there is increasing evidence that single preclinical studies often lack reproducibility. Therefore, a hypothesis-free meta-analysis approach was used that provides adequate statistical power for this subject area. The main question of this thesis was whether data from microdialysis experiments indicate a difference in the dopaminergic overflow in reaction to drugs of abuse in male and female rats. To this end, systematic data mining was performed on the PubMed online library (https://www.ncbi.nlm.nih.gov/pubmed/) focusing on studies measuring extracellular dopamine concentrations in the striatal complex. The focus lay on six drugs of abuse (alcohol, amphetamine, cocaine, nicotine, morphine and tetrahydrocannabinol) and two brain regions (caudate putamen and nucleus accumbens). Data from 45 microdialysis experiments on female rats (number of animals = 842) were extracted and statistically compared with data from 6402 male rats. Overall, 291 studies were included, providing averages of the peak percentage baseline value of dopamine for 103 different dosages. All drugs under investigation notably increased dopaminergic transmission in the striatal complex. For some drugs, a positive dose-response relationship was detected. Regarding the entirety of dose groups, no sex differences in the dopaminergic response to drugs of abuse were found, except for some small subgroups. Neither the rats' age, strain, stage of consciousness nor the route of administration had an impact on the overall peak percentage baseline values, suggesting robustness of these parameters. Attempts were also made to extract the rats' estrous cycle as a variable, but only one study monitored it. Overall, the neglect of female subjects in basic research, which has lasted for decades and is far from overcome, was a phenomenon well reflected in the results of the search query in this thesis. It can therefore be concluded that future research should intensify its efforts to include female subjects and to close the sex gap in preclinical as well as clinical research. This will provide more data that are crucial for obtaining valid results about sex similarities or differences, as this thesis only shed light on a small subdivision of the field.
Prediction of outcome or diagnoses from intake data and assessing the importance of variables as either risk factors or protective factors are fundamental tasks in psychotherapy research, in order to help clinicians and researchers to evaluate and improve treatments. With regard to data-analytic assessment, these tasks can be handled by a range of parametric approaches such as regression models. However, there are cases where parametric approaches are either not applicable or have severe limitations (e.g. Strobl et al., 2009). Also, there is increasing support for the notion that biopsychosocial contributions to psychopathology are complex and cannot be sufficiently explained by a small number of variables restricted to linear relationships (Franklin, 2019; Kendler, 2019). Machine Learning (ML) algorithms offer an additional suite of methods able to deal with such complexity and can be used to extend the toolbox of psychotherapy researchers. The aim of the dissertation is to provide an understanding of machine learning applications for psychotherapy research and to foster the motivation to use and improve these methods in future research.
The aim of this work was to investigate whether and how data from an already completed (historical) two-arm clinical trial can be incorporated into a new clinical trial. It was examined whether this incorporation can provide added value in the sense of an increase in power, or a reduction of the required sample size, for a new clinical trial compared with a conventional trial that does not incorporate historical data. A reduction of the required sample size usually also reduces the time and cost of a new clinical trial. From an operational point of view, this can therefore be regarded as highly desirable. Moreover, a reduction of the sample size and duration of a clinical trial can also be regarded as advantageous from the patients' perspective, as effective treatments can find their way into clinical practice more quickly. In a regulatory context, a necessary condition for the successful incorporation of historical data into a new trial is control of the type I error probability below a prespecified significance level. In general, however, the type I error probability increases with an increasing proportion of incorporated historical data. Therefore, approaches were developed in this work that are based on the so-called power prior method, which allows the proportion of historical data entering the new trial to be controlled. This Bayesian method was transferred into a frequentist framework, since the statistical concepts of type I error and power were originally developed within the inference theory of a frequentist setting. Within this work it was shown that, for a two-sided statistical testing problem, the type I error probability first decreases and then increases with an increasing proportion of historical data from two study arms. This made it possible to incorporate a corresponding proportion of historical data into a new trial while controlling the type I error probability at the prespecified significance level. It was shown that the size of this proportion depends on various parameters. Taking these so-called nuisance parameters into account, three different approaches were developed to determine the proportion of historical data to be included. In the further course of this work, these three approaches were investigated and compared with each other, in particular with respect to the possibility of saving sample size. It could be shown that, in many scenarios, the incorporation of historical data can increase the power to detect the same effect as observed in the historical data. Consequently, the required sample size for a new trial can be reduced in many practically relevant situations. However, some scenarios were also identified in which the incorporation of historical data does not provide added value. The approaches developed in this work are computationally intensive. Practical recommendations were therefore given to reduce this burden. In addition, an algorithm for determining the optimal sample size was developed that considerably reduces the computational burden of the developed procedures.
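A minimal simulation sketch of the mechanism described above, not of the approaches developed in the thesis: historical data from both arms of a previous trial are down-weighted by a fixed power prior weight delta, the resulting Bayesian two-sided test is applied to a new trial, and its frequentist type I error is estimated by Monte Carlo simulation. The normal endpoint with known variance, the flat initial prior, and all numerical values are illustrative assumptions.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(4)

# Historical two-arm trial (treatment and control), known outcome SD (illustrative values).
sigma, n_hist = 1.0, 100
hist_t = rng.normal(0.3, sigma, n_hist)      # historical treatment arm (effect 0.3)
hist_c = rng.normal(0.0, sigma, n_hist)      # historical control arm

def rejects(new_t, new_c, delta, alpha=0.05):
    """Two-sided Bayesian test with a power prior: the historical likelihood is raised
    to the power delta, flat initial prior, known variance (conjugate normal)."""
    n_new = len(new_t)
    w_new, w_hist = n_new / sigma**2, delta * n_hist / sigma**2
    prec = w_new + w_hist                                    # per-arm posterior precision
    mean_t = (w_new * new_t.mean() + w_hist * hist_t.mean()) / prec
    mean_c = (w_new * new_c.mean() + w_hist * hist_c.mean()) / prec
    diff, sd = mean_t - mean_c, np.sqrt(2 / prec)            # posterior of the effect
    z = norm.ppf(1 - alpha / 2)
    return abs(diff) > z * sd                                # 0 outside the credible interval

# Monte Carlo type I error of the new trial (true effect 0) for one fixed delta.
n_new, delta, reps = 80, 0.3, 5000
hits = sum(rejects(rng.normal(0, sigma, n_new), rng.normal(0, sigma, n_new), delta)
           for _ in range(reps))
print(f"estimated type I error at delta = {delta}: {hits / reps:.3f}")
```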
In summary, this work showed that the incorporation of historical data from two study arms into a new trial can provide added value. This added value is reflected in an increase in power in favour of the effect observed in the historical data, or in a reduction of the required sample size. At the same time, the type I error probability is controlled at the prespecified significance level. However, the existence and extent of this added value depend substantially on the underlying historical data. Scenarios were identified that are associated with substantial added value, as well as scenarios with no added value at all.
The overall objective of this thesis was to clarify the extent to which functional neurobiological (functional magnetic resonance imaging (fMRI)) and psychophysiological (startle reflex) measures of cue reactivity are suitable predictors of severe relapse in individuals with alcohol use disorder (AUD). This was achieved using complex modern methods of survival analysis. It could be shown that fMRI cue reactivity in the ventral striatum (VS) is suitable as a prognostic factor for relapse in patients suffering from AUD by calculating different fMRI-based aggregation measures which are not yet implemented in standard whole-brain fMRI software. For the VS, a measure combining the spatial extent of cue-induced brain activation with the intensity of this activation was found to be most appropriate as a biomarker for relapse prediction. Furthermore, the startle response as a psychophysiological measure showed a moderator effect when evaluating differential medication effects of naltrexone and acamprosate. The results suggest that the more appetitive their startle response appears, the more AUD patients benefit from naltrexone compared with acamprosate treatment. In contrast, the findings point to a lower relapse risk for patients with an aversive startle response pattern when treated with acamprosate compared to naltrexone. The findings support the idea of an individualized treatment based on differential pharmacological treatment due to different underlying biological mechanisms, which can be identified by the affective modulation of the startle response. The presented methods offer potential for future analyses of high clinical relevance, also in areas besides addiction (e.g. psychiatry, oncology).
The following chapters portray how, in the context of globalization, some social, political, and economic mechanisms can differentially explain local, national, and international development in the world. While processes of globalization such as aid and trade agreements do convey gains expressed in improved figures of human and economic development, they do not operate in a political and social vacuum. The lessons on the politics of redistribution visited in all chapters ‒ but more heavily expressed in Chapter 3 ‒ speak loudly in favor of the latter. Globalization and the construction of welfare is a multi-layered process, dependent not only on international and national policy-making, but also on local factors that shape the numbers of winners and losers around the world. I hope that this thesis contributes to the growing body of work aimed at sustaining effective national- and local-level policy-making in the fields of aid, trade, and leaders' regional influence on specific and more general forms of development.
This dissertation contains four separate chapters.
CHAPTER 1 This chapter examines the current, lagged, and indirect effects of tropical cyclones on annual sectoral growth worldwide. The main explanatory variable is a new damage measure for local tropical cyclone intensity based on meteorological data weighted for individual sectoral exposure, which is included in a panel analysis for a maximum of 205 countries over the 1970–2015 period. I find a significantly negative influence of tropical cyclones on two sector aggregates including agriculture, as well as trade and tourism. In subsequent years, tropical cyclones negatively affect the majority of all sectors. However, the Input-Output analysis shows that production processes are sticky and indirect economic effects are limited.
CHAPTER 2 People in low-lying coastal areas live under the potentially great threat of damage due to coastal flooding from tropical cyclones. Understanding how coastal population settlements react to such events is of high importance for society in order to consider future potential adaptation strategies and policies. In this study, we generate a new global hydrological data set on storm surge damage for the period 1850–2010. By combining this new data set with spatial data on human populations at a resolution of 10 km, we analyze the influence of storm surge damage on the rural, urban, and total population in low elevation coastal zones. We find that 8% of the global coastal population moved away per decade over the 1950–2010 period as a consequence of storm surges, on average. It is the urban population where we find the largest reductions (-15%). We show that the exposed coastal population has adapted over time and started to reduce its exposure in recent decades. This finding applies to most regions, with the exceptions of North America, Oceania, and Western Asia.
CHAPTER 3 Allocation decisions are vulnerable to political influence, but little is known about when politicians can use their discretion to pursue their strategic goals. We show the nonlinearity of political favoritism in an exogenous framework of U.S. disaster relief. Based on a simple theoretical model, we demonstrate that political biases are most pronounced when the need for a disaster declaration is ambiguous. Exploiting the spatiotemporal randomness of all hurricane strikes in the United States from 1965–2018, we find that presidents favor areas governed by their fellow party members when allocating disaster declarations. Our nonlinear estimations reveal that political influence varies immensely with respect to storm intensity. The alignment bias for medium-strength hurricanes exceeds standard linear estimates eightfold.
CHAPTER 4 We examine the design and implementation of the United Nations Flash Appeal triggered in response to the highly destructive 2015 Nepal earthquake. We consider how local need and various distortions affect the proposed project number, the proposed financial amount, and the subsequent funding decision by aid donors. Specifically, we investigate the extent to which the allocation of this humanitarian assistance follows municipalities’ affectedness and their physical and socioeconomic vulnerabilities. We then analyze potential ethnic, religious, and political distortions. Our results show that aid allocation is associated with geophysical estimates of the earthquake damage. Controlling for disaster impact, however, aid allocation shows little regard for the specific socioeconomic and physical vulnerabilities. It is also worrisome that the allocation of the flash appeal commitments favors municipalities dominated by higher castes and disadvantages those with a greater distance to the Nepali capital Kathmandu.
In this thesis, we study statistical properties of the Fréchet mean and its generalizations in abstract settings. These settings include large classes of scenarios, which may be of great interest in practice when dealing with nonstandard data. Our main focus is on the convergence of sample Fréchet means of independent observations to their population counterpart. The results are illustrated by applying them to some specific spaces.
The expectation of a real-valued, square-integrable random variable is characterized by being the unique constant value that minimizes the expected squared difference to the random variable. One can use this property to generalize the notion of mean. A Fréchet mean of a metric space-valued random variable is any minimizer of the expected squared distance to that random variable. This definition achieves two important things: Firstly, it encompasses many commonly used types of mean -- like the expectation, the median, or the geometric mean -- allowing to state powerful, general, and far-reaching theorems about properties of means. Secondly, it defines a mean for non-Euclidean spaces -- like the sphere, the space of phylogenetic trees, or Wasserstein spaces -- opening up these spaces for profound applications of probability theory and statistics.
We show strong laws of large numbers for Fréchet mean sets under two different notions of convergence of sets, assuming only a first moment condition. After having established consistency of the sample Fréchet mean, we investigate the rate of this convergence. Using projected means, an instance of the Fréchet mean, we demonstrate that Fréchet means may exhibit very different rates depending on the geometry of the metric space and properties of the distribution of the data. Then we prove rates of convergence in a general setting under some conditions. One of these is the quadruple inequality -- a generalization of the Cauchy-Schwarz inequality (one common form is displayed after this paragraph). This and some other conditions are fulfilled in Hadamard spaces -- geodesic metric spaces of nonpositive curvature -- which makes them particularly interesting to study in the context of Fréchet means. We show a quadruple inequality for certain powers of Hadamard metrics -- a purely geometric result with an intriguingly complex proof. Lastly, we examine regression models where responses live in a metric space and the regression function is a conditional Fréchet mean. We compare two approaches to transferring known estimators to this non-Euclidean setting. In doing so, we establish rates of convergence for four different estimation procedures, two of which are new methods. To illustrate these regression estimators, an R package was developed that allows their application and comparison on the sphere.
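For orientation, one common formulation of such a quadruple inequality in Hadamard spaces (following Sturm's work on nonpositively curved spaces; the exact variant and the powers treated in the thesis may differ) is:

```latex
% For all points p, q, y, z of a Hadamard space (M, d):
d(y, q)^{2} - d(y, p)^{2} - d(z, q)^{2} + d(z, p)^{2} \;\le\; 2\, d(p, q)\, d(y, z).
% In a Hilbert space the left-hand side equals 2 \langle q - p,\, z - y \rangle,
% so the inequality reduces to the Cauchy-Schwarz inequality.
```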
Multiple myeloma (MM) is a malignant bone marrow (BM) disease characterized by somatic hypermutation and DNA damage in plasma cells, leading to the overproduction of dysfunctional malignant myeloma cells. Accumulation of myeloma cells has direct and indirect effects on the BM and other organs. Despite the development of new therapeutic options, MM remains incurable and only a small fraction of patients experiences long-term survival (LTS). The past has shown that ultimately all patients still relapse, leading to the hypothesis that a state of active immune surveillance is required to control the residual disease.
To understand the long-term survival phenomenon and its link to immune phenotypes in MM, we collected paired bone marrow samples from 24 patients who survived for about 7 to 17 years after autologous stem cell transplantation (ASCT) and who had a high plasma cell infiltration in the BM (median 49.5%) at the time of diagnosis. Response assessment according to the International Myeloma Working Group (IMWG) revealed that 15 patients were in complete remission (CR), whereas 9 patients were in non-complete remission (non-CR), with tumor cells that had remained stable over recent years.
We performed single-cell RNA sequencing on more than 290,000 bone marrow cells from 11 patients before treatment (BT) and in LTS, as well as from three healthy controls, using 10x Genomics technology. I developed a computational approach using state-of-the-art single-cell methods, statistical inference, and machine learning models to decipher the bone marrow immune cell types and states across all clinical groups. I performed in-depth analyses of the bone marrow immune microenvironment across all captured cell types and provide a global landscape of cellular states across all clinical groups.
In this work, I defined new cellular states, marker genes, and gene signatures associated with the patients’ clinical and survival states. Additionally, I defined a new myeloid population termed Myeloma-associated Neutrophils (MAN) cells and a T cell exhaustion population termed Aberrant Memory Cytotoxic (AMC) CD8+ T cells in newly diagnosed Multiple Myeloma patients.
Moreover, I propose CXCR3 and NR4A2 as new therapeutic targets in AMC CD8+ T cells, which could be further investigated to reverse the T cell exhaustion state in newly diagnosed MM patients. Furthermore, I defined new prognostic markers in the CD8+ T cell compartment which could be predictive of the global disease state.
Finally, I propose that MM long-term survivors go through a complex and evolving immune landscape and acquire cellular states in a stepwise manner. Furthermore, I propose the Continuum Immune Landscape (CIL) Model, which explains the immune landscape of MM patients before and after long-term survival. Additionally, I introduce the Disease-State Trajectories (DST) hypothesis regarding the disease-associated dysregulated cellular states in the MM context, which could be generalized to other tumor entities and diseases.
The gold standard for clinical studies is the blinded randomized trial, but such a design is not always feasible for ethical or practical reasons. Using an external historical control group from an earlier trial or a registry might then be an option. When using historical controls, one often faces non-comparable study populations. Matching procedures may help to build balanced samples for comparison. In this thesis an adaptive matched case-control trial design is established, which allows for a sample size recalculation at a planned interim analysis with the goal of enhancing the matching rate at the final analysis. The recalculation is based on the lower confidence interval limit of the matching rate observed at the interim analysis. The newly developed resampling CI method estimates the 1:1 matching rate using a bootstrap-like procedure (without replacement) with equal-sized groups for matching at interim (see the sketch after this paragraph). A naïve approach would be to use all patients for estimating the matching rate and to use this value directly for recalculating the sample size. The new approach shows good performance in terms of power and type I error rate but needs more newly recruited patients than the naïve approach. Additionally, the time point of the interim analysis is investigated. Simulations point to a time point after 1/2 to 2/3 of the control patients; however, the time point seems to depend more on the actual number of patients used for matching than on the proportion. If the historical control group is large and, for example, only a small phase II trial is feasible, the method described above might not be a good choice. Rather, each intervention patient may find more than one matching partner. Therefore, an iterative procedure for determining the number of matching partners is developed. The idea is an interim analysis that iteratively increases the number of matching partners while calculating the matching rate in parallel. The number is increased as long as the 1:M matching rate is higher than the 1:1 matching rate, allowing for a potential tolerance. The 1:M matching rate at the interim analysis can then be used for recalculating the sample size. This procedure is easy to implement and can be combined with many study designs, such as two-stage designs. Note that the number of matching partners depends heavily on the overlap of the patient populations: a small overlap leads to a low number of matching partners and vice versa. To conclude, by involving the trial-specific matching rate in the sample size recalculation, one is able to enhance power in a matched case-control trial.
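The following minimal Python sketch illustrates the idea of the resampling CI method described above. The greedy propensity-score matching routine, the caliper, and all names are illustrative assumptions, not the implementation used in the thesis.

```python
# Illustrative sketch: resampling-based lower confidence limit of the 1:1
# matching rate at an interim analysis, used as input for sample size
# recalculation. Matching routine and parameters are placeholders.
import numpy as np

def match_rate_1to1(scores_cases, scores_controls, caliper=0.2):
    """Greedy 1:1 nearest-neighbour matching on a score (e.g. propensity score);
    returns the proportion of cases that find a partner within the caliper."""
    controls = list(scores_controls)
    matched = 0
    for s in scores_cases:
        if not controls:
            break
        j = int(np.argmin([abs(s - c) for c in controls]))
        if abs(s - controls[j]) <= caliper:
            matched += 1
            controls.pop(j)          # matching without replacement
    return matched / len(scores_cases)

def resampling_lcl(scores_cases, scores_controls, n_resamples=1000,
                   alpha=0.05, seed=42):
    """Bootstrap-like procedure without replacement: repeatedly draw equal-sized
    subsamples of intervention patients and historical controls, match them 1:1,
    and return the lower (1 - alpha) confidence limit of the matching rate."""
    rng = np.random.default_rng(seed)
    m = min(len(scores_cases), len(scores_controls))
    rates = []
    for _ in range(n_resamples):
        cases = rng.choice(scores_cases, size=m, replace=False)
        ctrls = rng.choice(scores_controls, size=m, replace=False)
        rates.append(match_rate_1to1(cases, ctrls))
    return np.quantile(rates, alpha)   # conservative input for the recalculation

# usage: lcl = resampling_lcl(ps_intervention, ps_historical); the lower limit
# is then plugged into the sample size formula instead of the naive estimate.
```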
Unbalanced patient cohorts arise not only in the generation of evidence; they also pose a problem in evidence synthesis. A common situation in evidence synthesis is an indirect comparison, where the comparison of interest, say treatment A versus C, has not been examined in a direct comparison, but there are trials comparing A with treatment B and other trials comparing C with B. Using those trials to calculate a treatment effect for A versus C is called an indirect comparison. It is likely that the independent trials AB and CB do not have the same underlying population. A special case is considered in which individual patient data are available for one of the trials. Then a matching-like procedure can help to balance the cohorts; this method is called matching-adjusted indirect comparison and has not yet been sufficiently examined. Another widely used method for indirect comparisons is the method of Bucher (recalled in the display after this paragraph). A comparison between these two methods is conducted for clinically relevant scenarios in which assumptions of the methods are violated. Simulations lead to the conjecture that indirect comparisons are considerably underpowered. The method of Bucher and the matching-adjusted indirect comparison show similar performance in scenarios without cross-trial differences. The matching approach leads to higher coverage and power when populations differ, effect modifiers are present, and regression models are not sufficiently adjusted. But matching on confounders that do not modify the effect leads to increased bias. Until now, indirect comparisons have been applied using one study per treatment comparison, because the matching-adjusted indirect comparison is designed for this setting. Nevertheless, it is likely that there are two or even more studies comparing the same treatments. When synthesizing evidence, one should always aim to include all appropriate evidence. Therefore, approaches to include multiple studies in indirect comparisons are introduced and compared. All include a step for combining treatment effects and one for calculating indirect treatment effects; the main difference between the approaches is the order of these two steps. An increasing number of studies can enhance power to desired regions above 80%, but it was not possible to identify one best-performing method over all considered scenarios. In conclusion, when applying matching procedures in evidence synthesis, the underlying situation needs to be checked carefully, and the matching variables need to be chosen carefully, because adjusting for confounders influences the precision of the indirect comparison.
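For reference, the Bucher method combines the two direct comparisons on a suitable effect scale (e.g., log odds ratio or log hazard ratio) as follows:

```latex
\hat{d}_{AC} = \hat{d}_{AB} - \hat{d}_{CB},
\qquad
\widehat{\operatorname{Var}}\big(\hat{d}_{AC}\big)
  = \widehat{\operatorname{Var}}\big(\hat{d}_{AB}\big)
  + \widehat{\operatorname{Var}}\big(\hat{d}_{CB}\big).
```

The matching-adjusted indirect comparison replaces the AB estimate by a reweighted version in which the individual patient data of the AB trial are weighted to match the aggregate covariate distribution reported for the CB trial.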
Extensive efforts in characterizing the biological architecture of schizophrenia have moved psychiatric research closer towards clinical application. As our understanding of psychiatric illness slowly shifts towards a conceptualization as dimensional constructs that cut across traditional diagnostic boundaries, the opportunities for personalized medicine afforded by applying advanced data science methods to increasingly available, large-scale, and multimodal data repositories are becoming more broadly recognized. A particularly intriguing phenomenon is the discrepancy between the high heritability of schizophrenia and the difficulty of identifying predictive genetic signatures, for which polygenic risk scores of common variants, explaining approximately 18% of illness-associated variance, remain the gold standard. A substantial body of research points towards two lines of investigation that may lead to a significant advance, resolve at least in part the ‘missing heritability’ phenomenon, and potentially provide the basis for more predictive, personalized clinical tools.
First, it is paramount to better understand the impact of environmental factors on illness risk and to elucidate the biology underlying their impact on altered brain function in schizophrenia. This thesis aims to close a major gap in our understanding of the multivariate epigenetic landscape associated with schizophrenia, its interaction with polygenic risk, and its association with DLPFC-HC connectivity, a well-established and robust neural intermediate phenotype of schizophrenia. As a basis for this, we developed a novel biologically informed machine learning framework, termed ‘BioMM’, that incorporates systems-level biological domain knowledge, i.e., gene ontology pathways, and applied it to genome-wide DNA methylation data obtained from whole blood samples. An epigenetic poly-methylation score, termed ‘PMS’, was estimated at the individual level using BioMM, trained and validated on a total of 2230 whole-blood samples and 244 post-mortem brain samples. The pathways contributing most to this PMS were strongly associated with synaptic, neural, and immune system-related functions. The identified PMS could be successfully validated in two independent cohorts, demonstrating the robust generalizability of the identified model. Furthermore, the PMS significantly differentiated patients with schizophrenia from healthy controls when predicted in DLPFC post-mortem brain samples, suggesting that the epigenetic landscape of schizophrenia is to a certain extent shared between central and peripheral tissues. Importantly, the peripheral PMS was associated with an intermediate neuroimaging phenotype (i.e., DLPFC-HC functional connectivity) in two independent imaging samples during a working memory paradigm. However, we did not find sufficient evidence for a combined genetic and epigenetic effect on brain function when integrating PRS derived from GWAS data, suggesting that DLPFC-HC coupling was predominantly impacted by environmental risk components rather than by polygenic risk of common variants. The epigenetic signature was further not associated with GWAS-derived risk scores, implying that the observed epigenetic effect likely did not depend on the underlying genetics; this was further substantiated by investigating data from unaffected first-degree relatives of patients with SCZ, BD, MDD, and autism. In summary, the characterization of the PMS through the systems-level integration of multimodal data elucidates the multivariate impact of epigenetic effects on schizophrenia-relevant brain function and its interdependence with genetic illness risk.
Second, the limited predictive value of polygenic risk scores and the difficulty in identifying associations with heritable neural differences found in schizophrenia may be due to the possibility that the manifestation of the functional consequences of genetic risk is modulated by spatio-temporal as well as sex-specific effects. To address this, this thesis identifies sex differences in the spatio-temporal expression trajectories during human development of genes that showed significant prefrontal co-expression with schizophrenia risk genes during the fetal phase and adolescence, consistent with a core developmental hypothesis of schizophrenia. More specifically, it was found that during these two time periods, prefrontal expression was significantly more variable in males compared to females, a finding that could be validated in an independent data source and that was specific for schizophrenia compared to other psychiatric as well as somatic illnesses. Similar to the epigenetic differences described above, the genes underlying the risk-associated gene expression differences were significantly linked to synaptic function. Notably, individual genes with male-specific variability increases were distinct between the fetal phase and adolescence, potentially suggesting different risk-associated mechanisms that converge on the shared synaptic involvement of these genes. These results provide substantial support to the hypothesis that the functional consequences of genetic risk show spatiotemporal specificity. Importantly, the temporal specificity was linked to the fetal phase and adolescence, time periods that are thought to be of predominant importance for the brain-functional consequences of environmental risk exposure. Therefore, the presented results provide the basis for future studies exploring the polygenic risk architecture and its interaction with environmental effects in a multivariate and spatiotemporally stratified manner.
In summary, the work presented in this thesis describes multivariate, multimodal approaches to characterize the (epi-)genetic basis of schizophrenia, explores its association with a well-established neural intermediate phenotype of the illness and investigates the spatio-temporal specificity of schizophrenia-relevant gene expression effects. This work expands our knowledge of the complex biology underlying schizophrenia and provides the basis for the future development of more predictive biological algorithms that may aid in advancing personalized medicine in psychiatry.
The question of who enters into a partnership with whom is connected with many social consequences. Many of the personal characteristics relevant to partner choice also play a decisive role in the demarcation of social groups and in the division into social strata. The extent to which partnerships are formed within one's own (status) group therefore also determines the horizontal and vertical permeability of social structures. The focus of this dissertation is on the opportunity-structural framework, i.e. the "supply side" of partner choice, and on the question of the extent to which individual partner choice decisions are influenced by the socio-structural environment. It aims to contribute to analysing the influence of the opportunity structure on homogamous partner choice using spatially and substantively appropriate indicators. The basis for this is provided by partner market indicators that were developed within the DFG project „Die makrostrukturellen Rahmenbedingungen des Partnermarkts im Längsschnitt“ and that allow a differentiated analysis of the partner market aspects of competition, availability, transparency, and efficiency at the level of districts and independent cities (Landkreise and kreisfreie Städte) in Germany.
The three main chapters of this dissertation are self-contained research articles that can be read independently of each other. They all focus on forecasting with financial and macroeconomic data. The analyses in Chapters 1 and 2 are joint work with Christian Conrad; both focus on forecasting volatility in financial markets. Chapter 1 addresses aggregate stock market volatility and Chapter 2 stock-specific volatility for investment decisions. Chapter 3 is single-authored and, in contrast to the other two chapters, focuses on the evaluation of distribution forecasts.
Despite the recent success of deep learning, the mammalian brain is still unrivaled when it comes to interpreting complex, high-dimensional data streams like visual, auditory, and somatosensory stimuli. However, the underlying computational principles that allow the brain to deal with unreliable, high-dimensional, and often incomplete data while consuming power on the order of a few watts are still mostly unknown. In this work, we investigate how specific functionalities emerge from simple structures observed in the mammalian cortex, and how these might be utilized in non-von Neumann devices like “neuromorphic hardware”. Firstly, we show that an ensemble of deterministic, spiking neural networks can be shaped by a simple, local learning rule to perform sampling-based Bayesian inference. This suggests a coding scheme in which spikes (or “action potentials”) represent samples of a posterior distribution, constrained by sensory input, without the need for any source of stochasticity. Secondly, we introduce a top-down framework in which neuronal and synaptic dynamics are derived using a least-action principle and gradient-based minimization. Combined, the neurosynaptic dynamics approximate real-time error backpropagation and are mappable to mechanistic components of cortical networks, whose dynamics can in turn be described within the proposed framework. The presented models narrow the gap between well-defined, functional algorithms and their biophysical implementation, improving our understanding of the computational principles the brain might employ. Furthermore, such models translate naturally to hardware that mimics the vastly parallel neural structure of the brain, promising a strongly accelerated and energy-efficient implementation of powerful learning and inference algorithms, which we demonstrate for the physical model system “BrainScaleS–1”.
The dissertation investigated the differentiation of subsyndromes in a spectrum from regional to widespread chronic musculoskeletal pain on the basis of mechanism-related somatosensory and clinical phenotypes within the framework of the multidimensional model of chronic pain. The first study analyzed the dimensional structure of the chronicity construct and its necessary and sufficient components. The second study identified discriminable pain-related phenotypes in two exemplary syndromes of chronic musculoskeletal pain by a stepwise cluster-analytic approach and related these to secondary comorbidity and psychosocial factors. In the first study, diagnostic entrance data of 185 patients with chronic regional vs. widespread musculoskeletal pain (unspecific back pain, fibromyalgia syndrome) from regional pain clinics and of 170 active employees in a nationwide prevention program were included in a retrospective cross-sectional analysis to reanalyze the construct of chronicity. The marker sets of three established chronicity indices (IASP Pain Taxonomy Axis IV, Chronic Pain Grade, Mainz Pain Staging System) were reanalyzed by correlations and frequency distributions of successive duration classes. Factor and latent class analyses were applied to assess the dimensional structure of pain and chronicity. Pain intensity distributions showed inhomogeneous courses from short to long durations, differing between groups. Both dimensions, pain intensity and duration, related unsystematically to the CPG and MPSS. Different dimensions and clusters of chronicity markers were discovered that differed between the groups (three dimensions and clusters in patients, two dimensions and clusters in employees). In fact, there was evidence for at least three weakly coupled core domains of chronicity, i.e., the primary clinical pain characteristics, the direct consequences of current interference with activities, and aspects of the patient history (duration and health care utilization). In the second study, the sensory and clinical characteristics of the patient sample were reanalyzed to identify necessary and sufficient markers differentiating subsyndromes with different sensory-clinical phenotypes along the continuum from regionally confined to extensively widespread pain. For this purpose, 107 patients with chronic unspecific back pain and 78 patients with fibromyalgia syndrome were taken as exemplary instantiations with circumscribed diagnoses. Four clusters of differential sensory-clinical phenotypes covering a spectrum from regional to widespread pain were discovered on the basis of four pressure pain sensitivity markers (number of sensitive ACR tender and control points, test pain intensity, and pressure pain threshold) and two clinical pain markers (number of pain regions, present pain intensity). A consecutive discriminant analysis showed that the pressure sensitivity markers alone already sufficed to discriminate between the clusters with a high rate of correct classification. The sensory-clinical phenotypes also differed in other somatic symptoms and impairment, but not in psychopathology or psychosocial co-factors. The project showed that differential diagnostics of chronic musculoskeletal pain requires at least a multifactorial determination of its chronicity with respect to the necessary components of duration, severity, and impairment, and the identification of the individual pain phenotype by comprehensive sensory and clinical assessment.
This is considered a prerequisite for the differential indication of specific modules in multimodal pain therapy and for avoiding unselective polypragmasia.
The etiology of breast cancer (BC) involves both non-genetic and genetic factors. Environmental and lifestyle factors such as age, use of menopausal hormone therapy, smoking, and body mass index have been associated with the risk of developing BC. Genetic susceptibility, determined mainly by family history and ethnic background, has an important role in the risk of developing this disease (Dossus and Benusiglio, 2015). In recent years, novel variants robustly associated with BC risk have been identified in large-scale genetic association studies in women of European and Asian origin. However, few studies directed towards the identification of BC susceptibility variants have been conducted among Latin American and Hispanic populations.
This thesis examined the contributions of genetic ancestry, established risk factors, and newly identified susceptibility variants to BC risk in Colombia. A total of 2,045 participants from the Colombian Breast Cancer Case-Control study were included in this analysis: 1,022 BC patients and 1,023 healthy controls. BC patients were unselected for family history and age at BC diagnosis. European, Native American, and African ancestry proportions were quantified in each woman based on 30 ancestry informative markers in order to assess the relationship between ancestry and BC risk. Seventy-eight previously identified common BC susceptibility variants were genotyped, and associations of these variants with BC risk in the Colombian population were determined. To assess the interactions between the variants and ancestry proportions, logistic regression models were applied. Native American proportions were lower in Colombian BC patients than in unaffected controls (P = 5.2x10^-16). This difference translated into an unadjusted decrease in BC risk of 2.6% per 1% increase in the Native American proportion (95% CI: 2.0-3.2). Associations with BC risk in Colombian women were obtained for thirteen variants, which, in comparison with European women, have partially different risk effects and allele frequencies. The risk effects of rs941764 (CCDC88C) and rs3803662 (TOX3) were controlled for ancestry proportions. One variant was associated with estrogen receptor negative (ER-), seven with estrogen receptor positive (ER+), and three with both ER+ and ER- disease. The variance in BC liability due to susceptibility variants in European and Colombian women was estimated. Of the 13 variants associated with BC risk in Colombia, four explained a larger attributable heritability in Europe than in Colombia, and nine revealed a larger attributable heritability in Colombia than in Europe.
Areas under the Receiver Operating Characteristic curve (AUCs) with their corresponding 95% CIs were estimated for established risk factors, genetic ancestry, and common BC susceptibility variants, based on risk estimates from the literature and the Colombian data. The ability of family history of BC in first-degree female relatives (AUC=0.58) and of the combination of all 13 associated risk variants (AUC=0.57) to discriminate between Colombian cases and controls was similar to the discriminative ability of Native American proportions (AUC=0.61).
The findings demonstrate that individual ancestry proportions predict BC risk in Colombia as accurately as established BC risk factors. Combining Native American proportions, established risk factors, and newly identified genetic susceptibility variants could translate into promising clinical strategies for BC prevention in Latin American and Hispanic women.
Considering a family of statistical, linear, ill-posed inverse problems, we propose to study them from two perspectives, the Bayesian and the frequentist paradigm. Under the Bayesian paradigm, we investigate two different asymptotic analyses for Gaussian sieve priors and their hierarchical counterpart. The first analysis concerns an iteration procedure in which the posterior distribution is used as a prior to compute a new posterior distribution with the same likelihood and data (see the display after this paragraph). We are interested in the limit of the sequence of distributions generated this way, if it exists. The second analysis, more traditionally, investigates the behaviour of the posterior distribution as the amount of data increases. Assuming the existence of a true parameter, one is then interested in showing that the posterior distribution contracts around the truth at an optimal rate. We illustrate all these results by applying them to the inverse Gaussian sequence space model. Finally, we show that the posterior mean of the hierarchical Gaussian sieve prior is both a shrinkage and an aggregation estimator, with interesting optimality properties. Motivated by these findings about the posterior mean of hierarchical Gaussian sieves, we investigate the quadratic risk of aggregation estimators whose shape mimics that of the above-mentioned posterior means. We introduce a strategy, relying on a decomposition of the risk, which allows us to obtain optimal convergence rates in the cases of known and unknown operator, for dependent as well as absolutely regular data. We demonstrate the use of this method on the inverse Gaussian sequence space model as well as on circular density deconvolution and obtain optimality results under mild hypotheses.
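In symbols (our notation), the iteration just described multiplies in the likelihood once per step, so that after k iterations

```latex
\pi_k(\theta \mid x) \;\propto\; L(x \mid \theta)\, \pi_{k-1}(\theta \mid x)
\quad\Longrightarrow\quad
\pi_k(\theta \mid x) \;\propto\; L(x \mid \theta)^{k}\, \pi_0(\theta),
```

where \(\pi_0\) denotes the initial (sieve) prior, \(L\) the likelihood, and \(x\) the fixed data.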
In routine clinical practice, the risk of xerostomia is typically managed by limiting the mean radiation dose to the parotid glands. This approach used to give satisfactory results. In recent years, however, several studies have reported that mean-dose models fail to recognize xerostomia risk. This can be explained by a strong improvement in overall dose conformality in radiotherapy due to recent technological advances, and thereby a substantial reduction of the mean dose to the parotid glands. This thesis investigated novel approaches to building reliable normal tissue complication probability (NTCP) models of xerostomia in this context.
For the purpose of the study, a cohort of 153 head-and-neck cancer patients treated with radiotherapy at Heidelberg University Hospital was collected retrospectively. The predictive performance of the mean dose to the parotid glands was evaluated with the Lyman-Kutcher-Burman (LKB) model (one common parametrization is recalled after this paragraph). In order to examine the individual predictive power of predictors describing parotid shape (radiomics), dose shape (dosiomics), and demographic characteristics, a total of 61 different features was defined and extracted from the DICOM files. These included the patient’s age and sex, parotid shape features, features related to the dose-volume histogram, the mean dose to subvolumes of the parotid glands, spatial dose gradients, and three-dimensional dose moments. In the multivariate analysis, a variety of machine learning algorithms was evaluated: 1) classification methods that discriminated between patients at high and at low risk of complication, 2) feature selection techniques that aimed to select a number of highly informative covariates from a large set of predictors, 3) sampling methods that reduced the class imbalance, and 4) data cleaning methods that reduced noise in the data set. The predictive performance of the models was validated internally, using nested cross-validation, and externally, using an independent patient cohort from the PARSPORT clinical trial.
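For context, a common parametrization of the LKB model (the thesis may use an equivalent variant) expresses the complication probability as a probit function of the dose metric:

```latex
\mathrm{NTCP} \;=\; \Phi\!\left(\frac{D - TD_{50}}{m \cdot TD_{50}}\right),
\qquad
\Phi(t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{t} e^{-u^{2}/2}\, \mathrm{d}u,
```

where \(D\) is the mean (or generalized equivalent uniform) dose to the parotid glands, \(TD_{50}\) the dose at which the complication probability reaches 50%, and \(m\) the slope parameter.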
The LKB model showed fairly good performance in predicting mild-to-severe (G1+) xerostomia. The corresponding dose-response curve revealed that even small doses to the parotid glands increase the risk of xerostomia and should be kept as low as possible. For the patients who did develop moderate-to-severe (G2+) xerostomia, the mean dose was not an informative predictor, even though the efficient sparing of the parotid glands allowed low G2+ xerostomia rates to be achieved. The features describing the shape of a parotid gland and the shape of the dose proved to be highly predictive of xerostomia. In particular, the parotid volume and the spatial dose gradients in the transverse plane explained xerostomia well. The comparison of machine learning algorithms showed that the particular choice of classifier and feature selection method can significantly influence the predictive performance of the NTCP model. In general, support vector machines and extra-trees achieved top performance, especially for the endpoints with a large number of observations. For the endpoints with a smaller number of observations, simple logistic regression often performed on a par with the top-ranking machine learning algorithms. The external validation showed that the analyzed multivariate models did not generalize well to the PARSPORT cohort. The only features that were predictive of xerostomia in both the Heidelberg (HD) and the PARSPORT cohort were the spatial dose gradients in the right-left and the anterior-posterior directions. Substantial differences in the distribution of covariates between the two cohorts were observed, which may be one of the reasons for the weak generalizability of the HD models.
The results presented in this thesis undermine the applicability of NTCP models of xerostomia based only on the mean dose to parotid glands in highly conformal radiotherapy treatments. The spatial dose gradients in the left-right and the anterior-posterior directions proved to be predictive of xerostomia both in the HD and the PARSPORT cohort. This finding is especially important as it is not limited to a single cohort but describes a general pattern present in two independent data sets. The performance of the sophisticated machine learning methods may indicate a need for larger patient cohorts in studies on NTCP models in order to fully benefit from their advantages. Last but not least, the observed covariate-shift between the HD and the PARSPORT cohort motivates, in the author’s opinion, a need for reporting information about the covariate distribution when publishing novel NTCP models.
The content of the following three chapters concerns different fields of application. In Chapter 2 we analyse the passengers' journeys within the framework of complex network analysis. In Chapter 3, we focus on social support networks of people in old age and the association with well-being and mental health. In Chapter 4, we discuss the association between religion and moral behavior and attitudes.
During the observation period from April 2006 to October 2011, data on 13,548 intensive care patients were recorded in the electronic patient record. From these, 256 polytrauma patients, including 85 sepsis cases, were identified through an automated selection step followed by manual validation by physicians; where applicable, the time of sepsis onset was also determined.
For application to intensive care patients, the clinical SIRS criteria were extended by taking into account mechanical ventilation support as well as circulatory support with catecholamines, and they were translated into an algorithm. With this algorithm, the number of fulfilled SIRS criteria could be determined for every minute of a patient's stay. These counts were subsequently summarized in different ways and analysed as SIRS parameters in logistic and conditional logistic regression models with respect to their association with sepsis. In addition, the SIRS criteria were summarized by means of three descriptors as dynamic parameters for relevant time intervals in the course of treatment, taking into account the change in the number of SIRS criteria in a given minute compared with the previous minute. The SIRS descriptors of an interval were defined as (1) the average number of fulfilled SIRS criteria (average λ), (2) the number of changes in the number of SIRS criteria from one minute to the next (C), and (3) the difference between the number of SIRS criteria in the last and the first minute of the interval (trend Δ); a minimal computational sketch follows this paragraph.
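The following minimal Python sketch (illustrative only; input format and names are assumptions, not the thesis code) computes the three descriptors of an interval from a per-minute series of fulfilled SIRS criteria.

```python
# Minimal sketch: the three SIRS descriptors of a treatment interval,
# computed from a per-minute series of fulfilled SIRS criteria (0-4).
import numpy as np

def sirs_descriptors(criteria_per_minute):
    """criteria_per_minute: 1D sequence, number of fulfilled SIRS criteria per minute."""
    x = np.asarray(criteria_per_minute)
    avg_lambda = x.mean()                            # (1) average lambda
    changes_c = int(np.count_nonzero(np.diff(x)))    # (2) minute-to-minute changes
    trend_delta = int(x[-1] - x[0])                  # (3) trend: last minus first minute
    return avg_lambda, changes_c, trend_delta

# example: a 10-minute interval
print(sirs_descriptors([1, 1, 2, 2, 2, 3, 3, 2, 2, 2]))  # -> (2.0, 3, 1)
```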
For the prediction of sepsis, sepsis cases were compared with all other patients of the polytrauma cohort; for this purpose, the SIRS descriptors of the first 24 hours after admission were examined and their suitability for identifying sepsis was compared with the classical SIRS criteria. To recognize sepsis cases at the time of their diagnosis (differential diagnosis), the cases were matched, within a nested case-control study, to 10,995 sepsis-free control intervals of equal treatment duration. In cases and controls, the SIRS descriptors of the 24-hour interval before the sepsis diagnosis, or before the time point used for matching, were compared. For multivariable modelling, 59 further parameters were defined from the electronic data base as potential sepsis risk factors in addition to the SIRS parameters, and univariable analyses were first carried out for these. Multivariable model development was performed using automated selection methods (stepwise and forward selection) applied to all parameters and to preselected parameter groups. Logistic regression was used for the prediction of sepsis with SIRS descriptors; for the differential diagnosis of sepsis with SIRS descriptors, conditional logistic regression was also applied.
The data from the electronic patient record of the surgical intensive care unit of the University Medical Centre Mannheim (UMM) allowed a successful implementation of the clinical SIRS criteria by means of an algorithm. Using the algorithm, an average prevalence of conventional SIRS (≥2 criteria) of 43.3% was determined for the ICU of the UMM. Of 256 polytrauma patients, 85 (33.2%) developed sepsis. Conventional SIRS lasting at least 1 minute had a sensitivity of 91% and a specificity of 19%, whereas an average SIRS criteria count (average λ) of 1.72 had a sensitivity of 51% and a specificity of 77% for the prediction of sepsis. For sepsis diagnosis, compared with conventional SIRS, which had a sensitivity of 99% and a specificity of only 31%, a sensitivity of 82% and a specificity of 71% could be achieved with a combination of average λ and trend Δ. The multivariable model that was statistically and clinically best suited for sepsis prediction contained 11 parameters: in addition to the number of minutes with more than 2 fulfilled SIRS criteria (SIRS time), the SAPS II, platelets, creatinine, haemoglobin, haematocrit, ISS, Ramsay scale, pre-existing respiratory and cardiovascular diseases, and diabetes. The model achieved an AUC of 0.856. The model best suited for differential diagnosis contained 9 parameters: the average SIRS criteria count (SIRS level) 8-4 hours before sepsis diagnosis, temperature, lactate, transfusion of red blood cell concentrates, the product of minute ventilation (AMV) and pCO2, FiO2, catecholamine administration, GCS, and AIS head. It had an AUC of 0.864.
This work demonstrated the value of routine data for clinically relevant medical questions in intensive care medicine on the basis of a comprehensive, complex data base. This was shown in particular by the development of a SIRS algorithm. By translating SIRS into a dynamic parameter, an improvement in specificity for the prediction of sepsis in polytrauma patients was achieved, and for the differential diagnosis a sensitivity and specificity that can compete with established biomarkers. Parameters defined by means of the SIRS algorithm also played an important role in the multivariable modelling for the prediction and differential diagnosis of sepsis in patients after polytrauma.
Many applications nowadays rely on statistical machine-learnt models, such as a rising number of virtual personal assistants. To train statistical models, typically large amounts of labelled data are required, which are expensive and difficult to obtain. In this thesis, we investigate two approaches that alleviate the need for labelled data by leveraging feedback to model outputs instead. Both scenarios are applied to two sequence-to-sequence tasks for Natural Language Processing (NLP): machine translation and semantic parsing for question answering. Additionally, we define a new question-answering task based on the geographical database OpenStreetMap (OSM) and collect a corpus, NLmaps v2, with 28,609 question-parse pairs. With the corpus, we build semantic parsers for subsequent experiments. Furthermore, we are the first to design a natural language interface to OSM, for which we specifically tailor a parser. The first approach to learning from feedback given to model outputs considers a scenario where weak supervision is available by grounding the model in a downstream task for which labelled data has been collected. Feedback obtained from the downstream task is used to improve the model in a response-based on-policy learning setup. We apply this approach to improve a machine translation system, which is grounded in a multilingual semantic parsing task, by employing ramp loss objectives. Next, we improve a neural semantic parser where only gold answers, but not gold parses, are available, by lifting ramp loss objectives to non-linear neural networks. In the second approach to learning from feedback, instead of collecting expensive labelled data, a model is deployed and user-model interactions are recorded in a log. This log is used to improve a model in a counterfactual off-policy learning setup. We first exemplify this approach on a domain adaptation task for machine translation. Here, we show that counterfactual learning can be applied to tasks with large output spaces and that, in contrast to prevalent theory, deterministic logs can successfully be used for sequence-to-sequence NLP tasks. Next, we demonstrate on a semantic parsing task that counterfactual learning can also be applied when the underlying model is a neural network and feedback is collected from human users. Applying both approaches to the same semantic parsing task allows us to draw a direct comparison between them. Response-based on-policy learning outperforms counterfactual off-policy learning, but requires expensive labelled data for the downstream task, whereas interaction logs for counterfactual learning can be easier to obtain in various scenarios.
In this thesis, a fast and likelihood-free approach for parameter inference is introduced. The convolutional neural network, named DeepInference, learns to predict the posterior mean and variance of multi-dimensional posterior distributions from raw simulated data. It is shown how DeepInference can be applied to the drift diffusion model (DDM) and the Lévy flight model, a likelihood-free extension of the DDM. For both models, state-of-the-art results in terms of accuracy of parameter estimation are observed.
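A minimal PyTorch sketch of the general idea follows; the architecture, channel layout, and layer sizes are placeholders and this is not the DeepInference implementation from the thesis.

```python
# Illustrative sketch: a 1D convolutional network mapping raw simulated trial
# data to the posterior mean and log-variance of each model parameter.
import torch
import torch.nn as nn

class PosteriorCNN(nn.Module):
    def __init__(self, n_channels=2, n_params=4):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(n_channels, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(32, 64, kernel_size=5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),            # pool over trials -> fixed-size summary
        )
        self.head_mean = nn.Linear(64, n_params)     # predicted posterior means
        self.head_logvar = nn.Linear(64, n_params)   # predicted posterior log-variances

    def forward(self, x):                        # x: (batch, n_channels, n_trials)
        h = self.features(x).squeeze(-1)
        return self.head_mean(h), self.head_logvar(h)

# Training would minimise a Gaussian negative log-likelihood of the true
# (simulation) parameters under the predicted means and variances.
```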
Paper 1 (Chapter 2): We investigate the question of whether macroeconomic variables contain information about future stock volatility beyond that contained in past volatility. We show that forecasts of GDP/IP growth from the Federal Reserve's Survey of Professional Forecasters predict volatility in a cross-section of 49 industry portfolios. The expectation of higher growth rates is associated with lower stock volatility. Our results are in line with both counter-cyclical volatility in dividend news as well as in expected returns. Inflation forecasts predict higher or lower stock volatility depending on the state of the economy and the stance of monetary policy. Forecasts of higher unemployment rates are good news for stocks during expansions and go along with lower stock volatility. Our results hold in- as well as out-of-sample and pass various robustness checks.
Paper 2 (Chapter 3): We analyze the covariates of average individual inflation uncertainty and the cross-sectional variance of point forecasts (`disagreement') based on data from the European Central Bank's Survey of Professional Forecasters. We empirically confirm the implication from a theoretical variance decomposition that disagreement is an incomplete approximation to overall uncertainty. Both measures are associated with macroeconomic conditions and indicators of monetary policy, but the relations differ qualitatively. In particular, average individual inflation uncertainty is higher during periods of expansionary monetary policy, whereas disagreement rises during contractionary periods. This implies that conclusions based on disagreement as a single indicator of ex-ante uncertainty are incomplete and potentially misleading.
Paper 3 (Chapter 4): We analyze the relationship between forecaster disagreement and macroeconomic uncertainty in the Euro area using data from the European Central Bank's Survey of Professional Forecasters for the period 1999Q1-2018Q2. We find that disagreement is generally a poor proxy for uncertainty. However, the strength of this link varies with the employed dispersion statistic, the choice of either the point forecasts or the histogram means to calculate disagreement, the considered outcome variable and the forecast horizon. In contrast, distributional assumptions do not appear to be very influential. The relationship is weaker during economically turbulent periods when indicators of uncertainty are needed most. Accounting for the entry and exit of forecasters to and from the survey has little impact on the results. We also show that survey-based uncertainty is associated with overall policy uncertainty, whereas forecaster disagreement is more closely related to the fluctuations on financial markets.
Paper 4 (Chapter 5): Although survey-based point predictions have been found to outperform successful forecasting models, corresponding variance forecasts are frequently diagnosed as heavily distorted. Forecasters who report inconspicuously low ex-ante variances often produce squared forecast errors that are much larger on average. In this paper, we document the novel stylized fact that this variance misalignment is related to the rounding behavior of survey participants. Discarding responses which are strongly rounded provides an easily implementable correction that i) can be carried out in real time, i.e., before outcomes are observed, and ii) delivers a significantly improved match between ex-ante and ex-post forecast variances. According to our estimates, uncertainty about inflation, output growth and unemployment in the U.S. and the Euro area is higher after correcting for the rounding effect. The increase in the share of non-rounded responses in recent years also helps to understand the trajectory of survey-based average uncertainty during the years after the financial and sovereign debt crisis. Our findings are in line with assertions from the previous literature regarding the connection between survey respondents' rounding behavior and their uncertainty about future macroeconomic outcomes.
The dissertation deals with foreign direct investment (FDI) and economic growth, taking into account the role of political factors in the two Eastern European countries Ukraine and Poland. The first chapter presents the terminological and theoretical foundations of FDI. The second chapter focuses on the question of how important a region's political orientation is for FDI in Ukraine and Poland. The third chapter examines the connection between pork-barrel politics and regional economic growth in Ukraine and Poland. The aim of the fourth chapter is to examine the relationship between FDI and regional economic growth in Ukraine and Poland. The fourth chapter also addresses the question of how strongly an increase in FDI inflows in spatially neighbouring regions affects a region's growth rate.
Today's level of production and consumption leads to an overuse of existing natural resources and ecosystems and is associated with numerous negative effects on people and the environment. This applies both from a global perspective and, in an even more pronounced form, from a national perspective for Germany (cf. Steffen et al. 2015c; Global Footprint Network 2017). An important cause is external costs, which represent a form of market failure. Internalizing these external costs would reduce the negative environmental effects and increase social welfare. However, implementing such an internalization is often rejected on the grounds of a lack of social acceptability. Examining this argument is the central research interest of this work. To this end, the social distributional effects that an internalization of the external costs of consumption in Germany would have on private households are calculated and analysed. The analyses are carried out both from the gross perspective, i.e. considering only the financial burden (research question 1a), and from the net perspective, i.e. additionally taking into account the use of the internalization revenues (research question 1b).
The instrument for using the revenues is the eco-bonus (Ökobonus), i.e. redistribution via a lump-sum payment. The eco-bonus is credited with a high potential for creating the social acceptance required for implementing internalization measures, because it establishes a direct link between revenues and expenditures and because, as various studies show (cf. Smith 1993; Iten et al. 1999; Ekardt 2010; Iten and Beck 2003; Büchs et al. 2011; Loske 2013; Müller and Spillmann 2015), it effectively counteracts a regressive burden.
No primary data were collected for this work; only secondary sources were used. The most important data sources and the starting point of the analyses are the Scientific Use Files (SUF) of the German Income and Expenditure Survey (Einkommens- und Verbrauchsstichprobe, EVS) for the years 2008 (EVS2008; FDZ 2010) and 2013 (EVS2013; FDZ 2016). In addition to the EVS, further important data sources are the Public Use File (PUF) of the survey „Mobilität in Deutschland 2008“ (MiD2008; BMVBS 2010), used to refine the results for the mobility sector, the Global Emission Model of Integrated Systems (GEMIS v4.94; IINAS 2015), used to determine the emission factors, and the Methodological Convention 2.0 of the German Environment Agency (MK 2.0; Schwermer et al. 2014), used to set the cost rates. The scope of the study is limited to household electricity, heating, and mobility; other consumption areas could not be considered owing to the lack of meaningful data. Within the mobility sector, from the step of calculating the existing net internalization onwards, the study is further restricted to motorized individual transport (MIV) and air travel, because for public passenger transport (ÖPV) the data situation was problematic and the analyses would have been very complex and, given the relatively small share of ÖPV in the negative environmental effects of the mobility sector, of little added value. Using a methodology developed by the author, the expenditures taken from the EVS are converted into consumption quantities, emissions, external costs, internalization gaps and finally, via price elasticities, into financial burdens (research question 1a) and net effects of the eco-bonus (research question 1b). Results are calculated for the short run (lower price elasticities) and the long run (higher, income-specific price elasticities). To assess social acceptability, households are divided into deciles on the basis of their net equivalence income, and the respective values are calculated for these deciles as well as for the average. In addition, further statistical analyses are used, for example linear regression analyses. Since the methodological intermediate steps already yield interesting results, four further subordinate research questions (2-5) are posed and answered in addition to the central research question (1a/1b). These address, among other things, which income-specific differences exist in energy consumption (2b), how the external costs caused changed from 2008 to 2013 (3d), and what effects the internalization would have on greenhouse gas (GHG) emissions (5c).
From the gross perspective (research question 1a), the analyses show regressive burden effects for household electricity, heating, and MIV, with the middle class being particularly heavily burdened in the case of MIV. For air travel, by contrast, the burden is progressive. Aggregating the burdens over all four areas considered (household electricity, heating, MIV, and air travel) yields clearly regressive distributional effects overall: while the first decile would face an internalization burden of 3.83% of net income (long run: 3.35%), the tenth decile would face only 2.54% (long run: 2.00%).
From the net perspective (research question 1b), by contrast, clearly progressive distributional effects emerge when the eco-bonus is used. This holds both for the aggregate view and, to varying degrees, for the individual areas. The calculations show that, in the aggregate view, deciles one to five would benefit from the eco-bonus in net terms in both the short and the long run. In the short run, the net effect falls strictly monotonically from a maximum of 4.26% of net income (long run: 3.38%) in the first decile to -1.09% (long run: 0.79%) in the tenth decile. Regarding the environmental consequences of the internalization, the calculations (ceteris paribus) yield a short-run reduction in GHG emissions of 15% (long run: 27%).
However, the more detailed analyses show that, beyond the averages, there are also eco-bonus losers within the low-income deciles in the net view, i.e. persons who would have to pay more for the internalization than they would receive back through the eco-bonus. In the aggregate view (short run), this share increases strictly monotonically over the deciles up to 62% in the tenth decile, but it also applies to 8% of the first decile (2nd decile: 13%, 3rd decile: 22%). There are clear differences between the areas: whereas in the mobility sector only a relatively small share of the first decile would face negative net effects of the eco-bonus, namely 4% for air travel and 11% for MIV, in the heating and electricity sectors these shares are considerably larger, at 28% (heating) and 30% (household electricity). To further improve social acceptability, low-income households should therefore be supported with additional measures when the internalization is implemented, especially in the areas of heating and electricity consumption.
However, the validity of the results of this work is subject to some limitations. Owing to data limitations, various simplifying assumptions had to be made (for example regarding price determinants, emission factors, and price elasticities). Moreover, several data sources are used which themselves have limitations in terms of accuracy. Statistical calculations carried out in this work, theoretical considerations, scenario calculations, and comparisons with other studies nevertheless lead to the assessment that the identified progressivity of the net effect of the eco-bonus can be regarded as robust. However, there is considerable uncertainty about its exact extent. Further research is needed here, as well as on the concrete design of the internalization. As the central research result of this work, it can nevertheless be stated that an internalization of external costs in the areas of household electricity, heating, MIV, and air travel would be associated with progressive distributional effects and would lead to a substantial reduction of the environmental effects considered.
This dissertation consists of four essays on the decision-making and the effects of international organizations. Its empirical focus is on the International Monetary Fund (IMF). The dissertation shows that political interests influence the IMF's decision-making and that IMF programs have important economic effects on sovereign creditworthiness and income inequality. Chapter 1: Buying Votes and International Organizations: The Dirty-Work Hypothesis; Chapter 2: Room for Discretion: Biased Decision-Making in International Financial Institutions; Chapter 3: Stigma or Cushion? IMF Programs and Sovereign Creditworthiness; Chapter 4: The Economics of the Democratic Deficit: The Effect of IMF Programs on Inequality
Planning and analyzing a multiple biomarker trial is a challenging task involving various factors that have to be considered. It is an area of ongoing research, and only a limited number of multiple biomarker trials have been completed and their results published. Learning from these completed trials is an important part of the planning process, which can help to avoid issues and pitfalls that these trials may have encountered. Some of the issues reported by completed trials, such as low prevalence of the biomarkers and the inability to react to the latest developments regarding biomarkers and treatments, are addressed in this thesis. Sample size calculation and data analysis methods for testing an overall treatment strategy are investigated for situations where biomarker prevalences make it unfeasible to test within the individual biomarker groups. Additionally, the issue of a large number of biomarker-negative patients is addressed, which is a side effect in trials that investigate lower-prevalence biomarkers. Different analysis approaches for a trial that includes biomarker-negative patients are compared, and it is examined whether inclusion of biomarker-negative patients in the analysis can improve bias and standard deviation of the treatment effect estimates. Finally, a flexible study design is considered that allows a new biomarker group with a corresponding experimental treatment to be included in the study after accrual has already begun. Different aspects of study design modification are discussed and different models for the analysis of such a study are compared. Furthermore, the issue of missing biomarker data is addressed. If the initial biomarker screening did not include the new biomarker before it was added to the study, the biomarker status regarding this biomarker has to be determined retrospectively for patients included in the study before the new biomarker was added. This may lead to missing data for some or all of these patients. For cases where data is only partially missing, different methods for missing data imputation for models with interaction terms are investigated and compared. The first of the three issues addressed in this thesis is low prevalence of the biomarkers. For a study that tests an overall biomarker-guided treatment strategy, the sample size calculation method by Palta and Amini appears to be the most appropriate choice when heterogeneous treatment effects are expected. The results from the simulation study suggest that the subsequent data analysis could be performed using the two-step approach suggested by Mehrotra or a shared frailty model. If no other covariates are included in the model, an exact log-rank test could also be used. The asymptotic log-rank test and the stratified Cox PH model suffer a loss of power in the simulation study and therefore should not be used for heterogeneous treatment effects. To test the individual biomarker groups as secondary hypotheses after testing the overall treatment strategy, some strategies for multiple testing are suggested. The second issue addressed is a large expected number of biomarker-negative patients at the screening stage. For a situation where an overall biomarker-guided treatment strategy is not desirable, a combined analysis model using the data from the entire study, including biomarker-negative patients, is investigated. This combined model estimates the treatment effects for the individual biomarkers.
Application of the Firth correction appeared to be a good method for reducing small sample size bias, which is likely to occur for low prevalence biomarkers. The inclusion of biomarker-negative patients in the model can provide a small additional benefit with respect to reduction of bias and standard deviation. The third issue considered is the constant discovery of new biomarkers and corresponding biomarker-guided experimental therapies. It is desirable for a clinical trial to be able to react to these continuous developments by investigating options to add new biomarkers and corresponding therapies to an ongoing study. Different models for data analysis are compared for a situation with a belatedly added biomarker, an overlap of biomarkers within the population, and an effect of the new biomarker on the response to the experimental treatment of an already existing biomarker-group. Adding an interaction term to the combined analysis model can help to avoid biased treatment effect estimates when there is overlap of the biomarkers within the patient population, and when patients with both biomarkers respond differently to the experimental therapy than patients with only one of the biomarkers. If there is missing data regarding the biomarker status of the belatedly added biomarker, data imputation can be utilized. However, the correct model specification is crucial to avoid biased estimates when interaction terms are part of the model for the final analysis. These interaction terms should already be included in the imputation model rather than imputing them passively. The simulation study suggests that for the considered scenario, the 'just-another-variable' approach with polytomous logistic regression is the best option to avoid obtaining biased estimates after data imputation. Due to the heterogeneity of biomarkers and treatments and the rapid developments in this field, the planning phase of a multiple-biomarker trial is a complex process and each trial has to be adjusted to the individual situation. This thesis can give guidance on some of the aspects that need to be considered, but of course there are many more aspects that need to be addressed.
Case-control association studies in human genetics and microbiome research pave the way to personalized medicine by enabling personalized risk assessment, improved prognosis, or early diagnosis. However, confounding due to population structure, or other unobserved factors, can produce spurious findings or mask true associations if not detected and corrected for. As a consequence, underlying structure improperly accounted for could explain the lack of power or some unsuccessful replications observed in case-control association studies. Moreover, data points considered outliers are commonly removed in such studies although they do not always correspond to technical errors. A wealth of methods exists to determine structure in genetic and microbiome association studies. However, there are few systematic comparisons between these methods in the context of such studies, and even fewer attempts to apply robust methods, which produce stable estimates of confounding underlying structure and which are able to incorporate information from outliers without degrading the quality of the estimates. Consequently, the aim of this thesis was to detect and control robustly for underlying confounding structure in genetic and microbiome data, by systematically comparing the most relevant standard and robust forms of principal components analysis (PCA) or multidimensional scaling (MDS) based methods, and by contributing new robust methods. The author's own contributions include the robustification of existing methods, their adaptation to the genetic or microbiome framework, and a dimensionality exploration and reduction method, nSimplices. Analysed datasets include a first synthetic example with a low-variance two-group confounding structure, a second synthetic example with a simple linear underlying structure, genome-wide single nucleotide polymorphism (SNP) data from 860 case and control individuals enrolled in the European Prospective Investigation into Cancer and Nutrition (EPIC prostate), and finally 2,255 microbiome samples from the Human Microbiome Project (HMP). Synthetic or real outliers were added to the second example and to the EPIC and HMP datasets. All meaningful existing and contributed methods were applied to the EPIC and HMP datasets, while a restricted set was applied to the synthetic, illustrative examples. The 10 principal components or top axes resulting from each method were kept for further analysis. The quality of a method was assessed by how well these axes summarized the underlying structure (using Akaike's information criterion, AIC, from the regression of the 10 axes on the known underlying structure in the data), and by how robust the estimates stayed in the presence of outliers (adjusted R² from the regression of each outlier-disturbed axis on the original axis). In synthetic example 1, only independent component analysis (ICA) was able to uncover the low-variance confounding structure, whereas PCA or MDS failed to do so, in agreement with the fact that these methods detect large rather than small variance or distance components. In synthetic example 2, non-metric MDS remained the most representative and robust method when distance outliers were included, while nSimplices combined with classical MDS was the only method to stay representative and robust when contextual outliers were present. In the EPIC dataset, Eigenstrat was the most representative method (AIC of 782.8), whereas sample ancestry was best captured by the new method gMCD (unbiased genetic relatedness estimates used in a Minimum Covariance Determinant procedure).
Methods gMCD, spherical PCA, IBS (MDS on Identity-by-State estimates) and nSimplices were more robust than Eigenstrat, with a small to moderate loss in terms of representativity (AIC between 789.6 and 864.9). Association testing yielded p-values comparable to published values for candidate SNPs. Further, the SNPs rs8071475, rs3799631 and rs2589118 with the lowest p-values were identified, whose known role in other disorders could point to an indirect link with prostate cancer. In the HMP dataset, the new method nSimplices combined with the data-driven normalization method qMDS best mirrored the underlying structure. The most robust method was qMDS (with nSimplices or alone), followed by CSS and MDS. Lastly, the original method nSimplices performed at least comparably in all settings (except for ancestry in EPIC), and in some cases considerably better than other methods, while remaining tractable and fast in high-dimensional datasets. The improved performance of gMCD and qMDS agrees with the fact that these methods use adapted measures (genetic relatedness and a selected model distribution, respectively) and recognized robust approaches (minimum covariance determinant and quantiles). Conversely, wMDS is likely to have failed because variance is not an adequate parameter for microbiome data. More generally, different methods report the underlying structure differently and are advantageous in different settings; for example, PCA or non-metric MDS were best in some settings but failed in others. Finally, the original method nSimplices proved useful or markedly better in a variety of settings, with the exception of highly noisy datasets, and provided that distance outliers are corrected. Current genetic case-control association studies tend to integrate several types of data, for example clinical and SNP data, or several omics datasets. These approaches are promising but could be subject to increased inaccuracies or replication issues through the mere combination of several sources of data. This motivates a reinforced use of robust methods, which are able to mirror genetic information accurately and stably, such as gMCD, nSimplices or spherical PCA. Nevertheless, the results on Eigenstrat show that it remains a reasonable method. Results on the microbiome confirmed that MDS based on proportions is a suboptimal method, and suggested that the exponential distribution should be considered instead of multinomial-based distributions, presumably because the exponential better represents the inherent competitiveness between phylogenies in the microbiome. Moreover, illustrative and real-world examples showed that methods can capture relevant, but different, information, encouraging the application of several complementary methods when starting to explore a dataset. In particular, a low-variance confounder could remain undetected by some methods. Additionally, methods based on least absolute residuals revealed several shortcomings in spite of their utility in a univariate setting, but their expected benefit in a multivariate setting should motivate the development of more tractable implementations. Finally, SPH, IBS and gMCD are recommended methods for a genetic SNP dataset, while Eigenstrat should perform best if no more than 2% outliers are present. To mirror structure in a microbiome dataset, nSimplices (combined with qMDS, or with CSS) can be expected to perform best, whereas MDS on proportions is likely to underperform.
The nSimplices method proved beneficial, and in various situations markedly better than alternatives; it should therefore be considered for analysing datasets including, but not limited to, genetic SNP and microbiome abundance data.
While antipoverty programs become increasingly popular in low-income countries, they often fail to adequately cover the poor. This dissertation considers a voluntary micro-health insurance scheme in Burkina Faso, which has applied community-based targeting to offer poor households a premium discount. It addresses two main reasons for low program coverage in low-income countries: inaccurate targeting of poverty programs and low take-up by the poor. The empirical analysis rests on the combination of four different micro-datasets and follows three different approaches to program evaluation. The main findings from this dissertation are as follows. First, community-based targeting identifies consumption-poor households fairly accurately when compared to four statistical targeting methods. Furthermore, for common transfer amounts it is by far the most cost-effective method. Second, the community-based targeting decision exhibits a moderate but highly statistically significant allocation bias due to ethnic favoritism. Third, the 50 percent premium subsidy in this context is successful in increasing health insurance demand among moderately poor urban households but is ineffective for very poor rural households.
The starting point of this work is greenhouse gas emission reduction targets at the company level. These targets refer to both directly and indirectly caused emissions. Quantifying indirect emissions requires modelling the company's interdependence with the rest of the economy. The intermediate sector-by-sector interdependence is reported in national input-output accounts. Combined with environmental-economic accounts, these form the data basis for environmentally extended input-output models, whose demand-side form projects all emissions onto final use. There is, however, also a supply-side form that allocates all emissions to value added. In corporate greenhouse gas accounting, the demand-side model is primarily used to estimate indirect emissions from purchases. In theory, this causes partial double counting of direct emissions. When upstream and downstream emissions are modelled simultaneously, double counting along the value chain is added. This raises the question of which model modifications are required to remove the double counting. Working through this theoretical question leads to the empirical question of how relevant double counting actually is. The objective is subdivided into identifying the causes, developing a double-counting-free methodology, and evaluating its empirical relevance.
Partial double counting of direct emissions is removed by exogenizing the company. To eliminate double counting along the value chain, the interaction between the company and the rest of the economy is interpreted as an intermediate production loop. It is shown that the company's position is irrelevant for the sum of indirect emissions. Using empirical data, conventionally computed results are compared with results corrected for double counting; expected errors are quantified and related to revenue and economic sector. Providing polynomial functions by economic sector enables any company to estimate the expected double-counting error as a function of its revenue. It is shown, however, that even with an error tolerance of 1%, only the world's 500 largest companies by revenue would have to check whether a double-counting correction is necessary. Exogenizing the company already reduces the error so strongly that double counting along the value chain leads to expected errors of more than 1% only within the 100 largest companies by revenue. Small and medium-sized enterprises, as well as most other companies, can therefore use environmentally extended input-output models without causing substantial double counting. Beyond this contribution to the general debate on double counting in corporate greenhouse gas accounting, the work offers starting points for exogenizing entire sectors or even economies in multiregional models. The methodology thus opens the way to a synthesis of producer-, consumption-, and income-based responsibility at the sectoral and regional level.
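For orientation (this display is not part of the original abstract and does not reproduce the double-counting correction developed in the thesis), the demand-side form of an environmentally extended input-output model referred to above is the Leontief model
\[
x = (I - A)^{-1}\, y, \qquad e = f^{\top} (I - A)^{-1}\, y,
\]
where A is the matrix of intermediate input coefficients, y is final demand, x is gross output, f is the vector of direct emission intensities per unit of output, and e is the total of direct and indirect emissions attributed to y. The supply-side form allocates emissions to value added analogously via the Ghosh inverse; the exogenization of the company described above modifies this basic model in order to avoid double counting.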
This thesis investigates robust strategies of optimal experimental design for discrimination between several nonlinear regression models. It develops novel theory, efficient algorithms, and implementations of such strategies, and provides a framework for assessing and comparing their practical performance. The framework is employed to perform extensive case studies. Their results demonstrate the success of the novel strategies.
The thesis contributes advances over existing theory and techniques in various fields as follows:
The thesis proposes novel “misspecification-robust” data-based approximation formulas for the covariances of maximum-likelihood estimators and of Bayesian posterior distributions of parameters in nonlinear incorrect models. The formulas adequately quantify parameter uncertainty even if the model is both nonlinear and systematically incorrect.
The thesis develops a framework of novel statistical measures and tailored efficient algorithms for the simulation-based assessment of covariance approximations for maximum-likelihood parameter estimators. Fully parallelized variants of the algorithms are implemented in the software package DoeSim.
Using DoeSim, the misspecification-robust covariance formula for maximum-likelihood estimators (MLEs) and its “classic” alternative are compared in an extensive numerical case study. The results demonstrate the superiority of the misspecification-robust formula.
Two novel sequential design criteria for model discrimination are proposed. They take parameter uncertainty into account using the new misspecification-robust posterior covariance formula. It is shown that both design criteria constitute an improvement over a popular approximation of the Box-Hill-Hunter-criterion. In contrast to the latter, they avoid overestimating the expected amount of information provided by an experiment.
The thesis clarifies that the popular Gauss-Newton method is generally not appropriate for finding least-squares parameter estimates in the context of model discrimination. Furthermore, it demonstrates that a large class of optimal experimental design optimization problems for model discrimination is intrinsically non-convex even under strong simplifying assumptions. Such problems are NP-hard and particularly difficult to solve numerically.
A framework is developed for the quantitative assessment and comparison of sequential optimal experimental design strategies for model discrimination. It consists of new statistical measures of their practical performance and problem-adapted algorithms to compute these measures. A state-of-the-art modular and parallelized implementation is provided in the software package DoeSim. The framework permits quantitative analyses of the broad range of behaviour that a design strategy shows under fluctuating data.
The practical performance of four established and three novel sequential design criteria for model discrimination is examined in an extensive simulation study. The study is performed with DoeSim and comprises a large number of model discrimination problems. The behaviour of the design criteria is examined under different magnitudes of measurement error and for different numbers of rival models.
Central results from the study are that a popular approximation of the Box-Hill-Hunter-criterion is surprisingly inefficient, particularly in problems with three or more models, that all parameter-robust design criteria in fact outperform the basic Hunter-Reiner-strategy, and that the newly proposed design criteria are among the most efficient ones. The latter show particularly strong advantages over their alternatives when facing demanding model discrimination problems with many rival models and large measurement errors.
Background: Mathematical models are used to gain an integrative understanding of biochemical processes and networks. Commonly the models are based on deterministic ordinary differential equations. When molecular counts are low, stochastic formalisms like Monte Carlo simulations are more appropriate and well established. However, compared to the wealth of computational methods used to fit and analyze deterministic models, little is available to quantify how closely stochastic models fit experimental data or to analyze different aspects of the modeling results. Results: Here, we developed a method to fit stochastic simulations to experimental high-throughput data, that is, data in the form of distributions. The method compares the probability density functions computed from Monte Carlo simulations with those of the experimental data. Multiple parameter values are iteratively evaluated using optimization routines. The method improves its performance by selecting parameter values after assessing the agreement between the deterministic stability of the system and the modes in the experimental data distribution. As a case study we fitted a model of the IRF7 gene expression circuit to time-course experimental data obtained by flow cytometry. IRF7 shows bimodal dynamics upon IFN stimulation. These dynamics arise from switching between active and basal states of the IRF7 promoter. However, the exact molecular mechanisms responsible for the bimodality of IRF7 are not fully understood. Conclusions: Our results allow us to conclude that the activation of the IRF7 promoter by the combination of IRF7 and ISGF3 is sufficient to explain the observed bimodal dynamics.
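As a purely generic illustration of this fitting idea (the toy two-state model, the Wasserstein distance, and all parameter values below are placeholders chosen for this sketch, not the model or the distance used in the study): parameters are sought that minimize a distance between the simulated and the measured single-cell distributions.

    import numpy as np
    from scipy.optimize import minimize
    from scipy.stats import wasserstein_distance

    rng = np.random.default_rng(0)
    # "measured" flow-cytometry-like data: a bimodal mixture of two subpopulations
    measured = np.concatenate([rng.gamma(2.0, 50.0, 700),     # basal subpopulation
                               rng.gamma(20.0, 40.0, 300)])   # active subpopulation

    def simulate(theta, n_cells=1000, seed=1):
        """Toy two-state model: a fraction p_on of cells expresses at a high level."""
        p_on, scale_off, scale_on = theta
        r = np.random.default_rng(seed)                       # common random numbers
        on = r.random(n_cells) < p_on
        return np.where(on, r.gamma(20.0, scale_on, n_cells),
                            r.gamma(2.0, scale_off, n_cells))

    def loss(theta):
        # distance between simulated and measured distributions
        if not (0 < theta[0] < 1) or theta[1] <= 0 or theta[2] <= 0:
            return np.inf
        return wasserstein_distance(simulate(theta), measured)

    fit = minimize(loss, x0=np.array([0.5, 30.0, 30.0]), method="Nelder-Mead")
    print(fit.x)   # aims to recover roughly (0.3, 50, 40)

In the actual method, the simulator would be the stochastic reaction model and the distance would be evaluated between probability density functions, as described above.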
Let Φ = (φ_{ij})_{1≤i,j≤n} be a random matrix whose components φ_{ij} are independent stochastic processes on some index set T. Let S = ∑_{i=1}^n φ_{iΠ(i)}, where Π is a random permutation of {1, 2, …, n}, independent of Φ. This random element is compared with its symmetrized version S⁰ := ∑_{i=1}^n ξ_i φ_{iΠ(i)} and its decoupled version S̃ := ∑_{i=1}^n φ_{iΠ̃(i)}, where ξ = (ξ_i)_{1≤i≤n} is a Rademacher sequence and Π̃ is uniformly distributed on {1, 2, …, n}^n, such that Φ, Π, Π̃ and ξ are independent. It is shown that for a broad class of convex functions Ψ on R^T the following symmetrization and decoupling inequalities hold: E Ψ(S − E S) ⩽ E Ψ(κ S⁰) and E Ψ(S − E S) ⩽ E Ψ(γ (S̃ − E S̃)), where κ, γ > 0 are universal constants.
Wielandt (1967) proved an eigenvalue inequality for partitioned symmetric matrices, which turned out to be very useful in statistical applications. A simple proof yielding sharp bounds is given.
How do we draw a distribution on the line? We give a survey of some well known and some recent proposals to present such a distribution, based on sample data. We claim: a diagnostic plot is only as good as the hard statistical theory that is supporting it. To make this precise, one has to ask for the underlying functionals, study their stochastic behaviour and ask for the natural metrics associated to a plot. We try to illustrate this point of view for some examples.
Motivated by interval/region prediction in nonlinear time series, we propose a minimum volume predictor (MV-predictor) for a strictly stationary process. The MV-predictor varies with respect to the current position in the state space and has the minimum Lebesgue measure among all regions with the nominal coverage probability. We have established consistency, convergence rates, and asymptotic normality for both coverage probability and Lebesgue measure of the estimated MV-predictor under the assumption that the observations are taken from a strong mixing process. Applications with both real and simulated data sets illustrate the proposed methods.
We identify some of the requirements for document integration of software components in statistical computing, and try to give a general idea how to cope with them in an implementation.
The paper gives an overview of the work of the Teilprojekt B2 in spectral estimation for time series during the period 1988-1992. High resolution spectral estimates are introduced and the role of data tapers is discussed. Parametric models such as ARMA models are fitted and judged by frequency domain methods. Furthermore, a method for the detection of hidden frequencies is discussed. The methods are illustrated by simulations.
In this paper least squares penalized regression estimates with total variation penalties are considered. It is shown that these estimators are least squares splines with locally data-adaptively placed knot points. Algorithms and asymptotic properties are discussed.
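As a sketch of the type of estimator meant here (the precise penalty and the order k of the penalized derivative are not specified in the abstract), a least squares estimate with a total variation penalty has the form
\[
\hat f \;=\; \operatorname*{arg\,min}_{f}\; \sum_{i=1}^{n} \bigl( y_i - f(x_i) \bigr)^2 \;+\; \lambda \, \mathrm{TV}\bigl( f^{(k)} \bigr),
\]
and, in line with the result stated above, its minimizer is a spline whose knot points are placed adaptively by the data.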
The basic idea of the excess mass approach is to measure the amount of probability mass not fitting a given statistical model. It came up first in the context of testing for a treatment effect, was later applied to inference about the modality of a distribution and even density estimation. Recently the framework has been extended to regression problems. In this survey article we describe the idea and summarize the main results.
By using empirical process theory we study a method addressed to testing for multimodality and estimating density contour clusters in higher dimensions. The method is based on the so-called excess mass. Given a probability measure F and a class of sets in the d-dimensional Euclidean space, the excess mass is defined as the maximal difference between the F-measure and λ times the Lebesgue measure of sets in the given class. The excess mass can be estimated by replacing F by the empirical measure. The corresponding maximizing sets can be used for estimating density contour clusters. Comparing excess masses over different classes yields information about the modality of the underlying probability measure. This can be used to construct tests for multimodality. The asymptotic behaviour of the considered estimators and test statistics is studied for different classes of sets, including the classes of balls, ellipsoids and convex sets.
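For reference (this display is not part of the original abstract), the excess mass over a class 𝒞 of sets at level λ > 0 and its empirical counterpart can be written as
\[
E_F(\lambda) \;=\; \sup_{C \in \mathcal{C}} \bigl( F(C) - \lambda\,\mathrm{Leb}(C) \bigr),
\qquad
E_n(\lambda) \;=\; \sup_{C \in \mathcal{C}} \bigl( F_n(C) - \lambda\,\mathrm{Leb}(C) \bigr),
\]
where F_n denotes the empirical measure; the maximizing sets serve as estimators of the density contour clusters mentioned above.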
We prove that Efron's bootstrap applied to the sample of studentized periodogram ordinates works quite well for ratio statistics, e.g. estimates for the autocorrelations. The bootstrap approximation for the distribution of these statistics is accurate to the order o(1/√T) a.s. As a consequence this result carries over to the Whittle estimates. Some simulation studies are reported for a medium-sized stretch of a time series.
Let (P_t : t ∈ R^p) be a simple shift family of distributions on R^p, and let K be a convex cone in R^p. Within the class of nonrandomized tests of K versus R^p \ K whose acceptance region A satisfies A = A + K, tests with minimal bias are constructed. They are compared to likelihood ratio type tests, which are optimal with respect to a different criterion. The minimax tests are mimicked in the context of linear regression and one-sided tests for covariance matrices.
A concept of asymptotically efficient estimation is presented when a misspecified parametric time series model is fitted to a stationary process. Efficiency of several minimum distance estimates is proved and the behavior of the Gaussian maximum likelihood estimate is studied. Furthermore, the behavior of estimates that minimize the h-step prediction error is discussed briefly. The paper answers to some extent the question what happens when a misspecified model is fitted to time series data and one acts as if the model were true.
It is shown that Tyler's (1987) M-functional of scatter, which is a robust surrogate for the covariance matrix of a distribution on R^p, is Fréchet-differentiable with respect to the weak topology. This property is derived in an asymptotic framework, where the dimension p may tend to infinity. If applied to the empirical distribution of n i.i.d. random vectors with elliptically symmetric distribution, the resulting estimator has the same asymptotic behavior as the sample covariance matrix in a normal model, provided that p tends to infinity and p/n tends to zero.
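For reference (this display is not part of the original abstract and is stated here for a distribution centred at the origin; the centring and normalization conventions in the paper may differ), Tyler's M-functional of scatter Σ(P) is defined, up to a scale factor, as the symmetric positive definite solution of the fixed-point equation
\[
\Sigma \;=\; p \, \mathbb{E}_P\!\left[ \frac{X X^{\top}}{X^{\top} \Sigma^{-1} X} \right],
\]
commonly normalized, for instance, by trace(Σ) = p.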
A definition of discrete evolutionary spectra is given that complements the notion of evolutionary spectral density given by Dahlhaus (Dahlhaus, R.: Fitting time series models to nonstationary processes. Preprint, Univ. Heidelberg, 1992). For processes that have a discrete evolutionary spectrum, the asymptotic behaviour of linear functionals of the periodogram is investigated. The results are applied in a mathematical analysis of Licklider's theory of pitch perception. A pitch estimator based on this theory is investigated with respect to the shift of the pitch of the residue described by Schouten et al. (Schouten, J.F., Ritsma, R.J., Lopes Cardozo: Pitch of the residue. J. Acoust. Soc. Am., Vol. 34, No. 8, 1962, 1418-1424).
An unknown signal plus white noise is observed at n discrete time points. Within a large convex class of linear estimators of the signal, we choose the one which minimizes estimated quadratic risk. By construction, the resulting estimator is nonlinear. This estimation is done after orthogonal transformation of the data to a reasonable coordinate system. The procedure adaptively tapers the coefficients of the transformed data. If the class of candidate estimators satisfies a uniform entropy condition, then our estimator is asymptotically minimax in Pinsker's sense over certain ellipsoids in the parameter space and dominates the James-Stein estimator asymptotically. We describe computational algorithms for the modulation estimator and construct confidence sets for the unknown signal. These confidence sets are centered at the estimator, have correct asymptotic coverage probability, and have relatively small risk as set-valued estimators of the signal.
Suppose one observes a process V on the unit interval, where dV(t) = f(t) dt + dW(t) with an unknown function f and standard Brownian motion W. We propose a particular test of one-point hypotheses about f which is based on suitably standardized increments of V. This test is shown to have desirable consistency properties if, for instance, f is restricted to various Hölder smoothness classes of functions. The test is mimicked in the context of nonparametric density estimation, nonparametric regression and interval censored data. Under shape restrictions on the parameter f such as monotonicity or convexity, we obtain confidence sets for f adapting to its unknown smoothness.
In this paper the higher order performance of kernel based adaptive location estimators is considered. The optimal choice of smoothing parameters is discussed and it is shown how much is lost in efficiency by not knowing the underlying translation density.
Additive regression models have turned out to be a useful statistical tool in analyses of high-dimensional data sets. Recently, an estimator of additive components has been introduced by Linton and Nielsen which is based on marginal integration. The explicit definition of this estimator makes possible a fast computation and allows an asymptotic distribution theory. In this paper an asymptotic treatment of this estimate is offered for several models. A modification of this procedure is introduced. We consider weighted marginal integration for local linear fits and we show that this estimate has the following advantages.
(i) With an appropriate choice of the weight function, the additive components can be efficiently estimated: An additive component can be estimated with the same asymptotic bias and variance as if the other components were known.
(ii) Application of local linear fits reduces the design related bias.
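For reference (this display is not part of the original abstract), the marginal integration idea reads as follows in the bivariate case: if m̂(x₁, x₂) is a full-dimensional pre-estimate (for example a local linear fit) of an additive regression function m(x₁, x₂) = c + m₁(x₁) + m₂(x₂), then
\[
\hat m_1(x_1) \;=\; \frac{1}{n} \sum_{j=1}^{n} \hat m\bigl(x_1, X_{2j}\bigr)
\]
estimates m₁(x₁) up to an additive constant. The weighted variant discussed above replaces this empirical average by a weighted one, with the weight function chosen to achieve the efficiency property stated in (i).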
Kernel smoothing in nonparametric autoregressive schemes offers a powerful tool in modelling time series. In this paper it is shown that the bootstrap can be used for estimating the distribution of kernel smoothers. This can be done by mimicking the stochastic nature of the whole process in the bootstrap resampling or by generating a simple regression model. Consistency of these bootstrap procedures will be shown.
Consider a partial linear model, where the expectation of a random variable Y depends on covariates (x, z) through F(θ₀x + m₀(z)), with θ₀ an unknown parameter and m₀ an unknown function. We apply the theory of empirical processes to derive the asymptotic properties of the penalized quasi-likelihood estimator.
Discriminant analysis for two data sets in R^d with probability densities f and g can be based on the estimation of the set G = {x : f(x) ≥ g(x)}. We consider applications where it is appropriate to assume that the region G has a smooth boundary. In particular, this assumption makes sense if discriminant analysis is used as a data analytic tool. We discuss optimal rates for estimation of G.
This dissertation consists of four research articles that deal with different aspects of the modeling of financial volatility and dynamic correlations. They all focus on the U.S. stock market and its link to macroeconomic fundamentals by applying MIDAS techniques. The contributions of the articles are of theoretical, methodological, and empirical nature. Each chapter is self-contained and can be read independently.
Chapters 1 and 2 consider GARCH-MIDAS component models and the relationship between long-term financial volatility, the variance risk premium, and the stance of the macroeconomy. Chapter 3 presents a new GARCH model that links time-varying volatility persistence to explanatory variables. Finally, Chapter 4 applies the multivariate DCC-MIDAS model to returns on the stock and the oil market and analyzes their relation to macroeconomic fundamentals.
Recent decades have seen numerous European countries grow ever closer together within the European Union (EU). Many political decisions are now taken jointly for the EU member states by EU institutions such as the European Parliament or the European Commission. To give policymakers and citizens a suitable basis of information for forming political opinions and making decisions, appropriate statistical reporting systems are of existential importance. Against this background, it appears particularly important to make a well-founded decision on measurement concepts for the central policy area of the sustainable development of societal welfare. Given the hitherto insufficient toolkit for comprehensive measurement of societal welfare, this work takes up the discussion of the reporting systems required for this purpose and presents and computes corresponding approaches for the countries of Europe and the EU.
The investigation of dependence structures plays a major role in contemporary statistics. During the last decades, numerous dependence measures for both univariate and multivariate random variables have been established. In this thesis, we study the distance correlation coefficient, a novel measure of dependence for random vectors of arbitrary dimension, which was introduced by Székely, Rizzo and Bakirov and by Székely and Rizzo. In particular, we define an affinely invariant version of distance correlation and calculate this coefficient for numerous distributions: for the bivariate and the multivariate normal distribution, for the multivariate Laplace distribution, and for certain bivariate gamma and Poisson distributions. Moreover, we present a useful series representation of distance covariance for the class of Lancaster distributions and derive a generalization of an integral, which plays a fundamental role in the theory of distance correlation.
We further investigate a variable clustering problem, which arises in low rank Gaussian graphical models. In the case of fixed sample size, we discover that this problem is mathematically equivalent to the subspace clustering problem of data for independent subspaces. In the asymptotic setting, we derive an estimator, which consistently recovers the cluster structure in the case of noisy data.
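For reference (this display is not part of the original abstract), the distance covariance and distance correlation studied above are defined for random vectors X ∈ R^p and Y ∈ R^q with joint characteristic function f_{X,Y} and marginal characteristic functions f_X, f_Y as
\[
\mathrm{dCov}^2(X,Y) \;=\; \int_{\mathbb{R}^{p+q}}
\frac{\lvert f_{X,Y}(t,s) - f_X(t)\, f_Y(s) \rvert^{2}}
{c_p\, c_q\, \lvert t \rvert^{1+p}\, \lvert s \rvert^{1+q}} \, dt \, ds,
\qquad
\mathrm{dCor}(X,Y) \;=\; \frac{\mathrm{dCov}(X,Y)}{\sqrt{\mathrm{dCov}(X,X)\,\mathrm{dCov}(Y,Y)}},
\]
where c_p and c_q are normalizing constants; the affinely invariant version is obtained by first standardizing X and Y with their respective covariance matrices.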
This thesis looks at prices in two different markets.
The first one is the market for food products in Europe. With the introduction of the common market in 1992, most European markets have been integrated. When 10 more countries joined the EU in 2004, another round of integration took place and the common market was extended to these countries as well. We analyse whether retail prices for food products converged in the period after this "shock" of EU enlargement. While there exists an extensive literature on convergence in general, this chapter is the first able to look at retail price convergence within the European Union at a micro-data level. By decomposing price convergence into convergence within and between sub-groups of countries, we add further insight to the literature on what causes the strong price convergence within the enlarged EU.
The second market this thesis looks at is the one for internet-facilitated sexual services in Germany. Sex work and the advertisement thereof are legal in Germany, which has led to a range of internet platforms concerned with selling sexual services. While many platforms only contain advertisements, one of these platforms gives sex workers the opportunity to sell their services either as an auction or at a fixed price. This has allowed us to create a dataset on sex work with information based on actual concluded contracts, which is a unique feature in this literature. Furthermore, each data point is geo- and time-referenced. This dataset is used to show that 1) offering unprotected sexual services is endogenous, 2) local events influence the supply, demand and price of sexual services, and 3) regional effects influence local prices and habits.
Spatial point processes provide a statistical framework for modeling random arrangements of objects, which is of relevance in a variety of scientific disciplines, including ecology, spatial epidemiology and material science. Describing systematic spatial variations within this framework and developing methods for estimating parameters from empirical data constitute an active area of research. Image analysis, in particular, provides a range of scenarios to which point process models are applicable. Typical examples are images of trees in remote sensing, cells in biology, or composite structures in material science. Due to its real-world orientation and versatility, the class of the recently developed locally scaled point processes appears particularly suitable for the modeling of spatial object patterns. An unknown normalizing constant in the likelihood, however, makes inference complicated and requires elaborate techniques. This work presents an efficient Bayesian inference concept for locally scaled point processes. The suggested optimization procedure is applied to images of cross-sections through the stems of maize plants, where the goal is to accurately describe and classify different genotypes based on the spatial arrangement of their vascular bundles. A further spatial point process framework is specifically provided for the estimation of shape from texture. Texture learning and the estimation of surface orientation are two important tasks in pattern analysis and computer vision. Given the image of a scene in three-dimensional space, a frequent goal is to derive global geometrical knowledge, e.g. information on camera positioning and angle, from the local textural characteristics in the image. The statistical framework proposed comprises locally scaled point process strategies as well as the draft of a Bayesian marked point process model for inferring shape from texture.
Automatic defect detection in industrial optical inspection requires algorithms that can learn from data. A special challenge is data with incomplete labels. One of the methods that the field of machine learning has brought forth to deal with incomplete labels is multiple instance learning. One trait of this setting is that it groups datapoints (instances) into bags.
We propose a novel method to predict bag probabilities from given instance probabilities that has the advantage that its results do not depend on bag size. Also, we propose an extension of the multiple instance model that allows the user to steer the number of instances that are classified as positive.
We implement these methods with an algorithm based on the well-known random forest classifier. Results on a standard benchmark dataset show competitive performance. Furthermore, we apply this algorithm to image data that reflects the challenges of industrial optical inspection, and we show that in this setting it improves over the standard random forest.
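As an illustration of why the aggregation rule matters (the functions below are generic textbook rules and a toy example, not the method proposed above): the common noisy-OR rule makes the bag probability grow with bag size, whereas a size-insensitive rule such as the maximum does not.

    import numpy as np

    def bag_prob_noisy_or(p):
        # P(bag positive) = 1 - prod(1 - p_i): grows with the number of instances
        return 1.0 - np.prod(1.0 - np.asarray(p))

    def bag_prob_max(p):
        # a size-insensitive alternative: the most suspicious instance decides
        return float(np.max(p))

    small_bag = [0.1] * 5
    large_bag = [0.1] * 50
    print(bag_prob_noisy_or(small_bag), bag_prob_noisy_or(large_bag))  # about 0.41 vs 0.99
    print(bag_prob_max(small_bag), bag_prob_max(large_bag))            # 0.1 vs 0.1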
In the recent past the state of the art in meteorology has been to produce weather forecasts from ensemble prediction systems. Forecast ensembles are generated from multiple runs of dynamical numerical weather prediction models, each with different initial and boundary conditions or parameterizations of the model. However, ensemble forecasts are not able to capture the full uncertainty of numerical weather predictions and therefore often display biases and dispersion errors and thus are uncalibrated. To account for this problem, statistical postprocessing methods have been developed successfully. However, many state-of-the-art methods are designed for a single weather quantity at a fixed location and for a fixed forecast horizon. This work introduces extensions of two established univariate postprocessing methods, Bayesian model averaging (BMA) and ensemble model output statistics (EMOS), to recover inter-variable and spatial dependencies from the original ensemble forecasts. For this purpose, a multi-stage procedure is proposed that can be applied for modeling dependence structures between different weather quantities as well as modeling spatial or temporal dependencies. This multi-stage procedure combines the postprocessing of the margins by the application of a univariate method such as BMA or EMOS with a multivariate dependence structure, for example via a correlation matrix or via the multivariate rank structure of the original ensemble. The multivariate postprocessing procedure that models inter-variable dependence employs the UWME 8-member forecast ensemble over the North West region of the US and the standard BMA method, resulting in predictive distributions with good multivariate calibration and sharpness. The spatial postprocessing procedure is applied to temperature forecasts of the ECMWF 50-member ensemble over Germany. The procedure employs a spatially adaptive extension of EMOS, utilizing recently proposed methods for fast and accurate Bayesian estimation in a spatial setting. It yields excellent spatial univariate and multivariate calibration and sharpness. Further, the method is able to capture the spatial structure of observed weather fields. Both extensions improve calibration and sharpness in comparison to the raw ensemble and to the respective standard univariate postprocessing methods.
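A minimal sketch of the univariate building block (assuming the standard Gaussian EMOS formulation for temperature; the variable names and the toy data below are invented for illustration, and the multivariate step that restores dependence, e.g. via the rank structure of the raw ensemble, is not shown): the EMOS coefficients can be fitted by minimizing the mean continuous ranked probability score (CRPS) over a training period.

    import numpy as np
    from scipy.optimize import minimize
    from scipy.stats import norm

    def crps_normal(mu, sigma, y):
        """Closed-form CRPS of a N(mu, sigma^2) forecast against observation y."""
        z = (y - mu) / sigma
        return sigma * (z * (2 * norm.cdf(z) - 1) + 2 * norm.pdf(z) - 1 / np.sqrt(np.pi))

    def fit_emos(ens, obs):
        """ens: (n_days, n_members) ensemble forecasts; obs: (n_days,) observations."""
        xbar, s2 = ens.mean(axis=1), ens.var(axis=1)

        def mean_crps(params):
            a, b, c, d = params
            mu = a + b * xbar                                 # mean affine in ensemble mean
            sigma = np.sqrt(np.maximum(c + d * s2, 1e-6))     # variance affine in ensemble variance
            return crps_normal(mu, sigma, obs).mean()

        return minimize(mean_crps, x0=np.array([0.0, 1.0, 1.0, 1.0]),
                        method="Nelder-Mead").x

    rng = np.random.default_rng(0)
    truth = 15 + 5 * rng.standard_normal(300)
    ens = truth[:, None] + 1.5 + 2.0 * rng.standard_normal((300, 8))  # biased toy ensemble
    print(fit_emos(ens, truth))   # fitted EMOS coefficients a, b, c, d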
Among the rich material on graphical presentation of information in "La Graphique et le Traitement Graphique de l'Information" (1977; English edition "Graphics and Graphic Information Processing", 1981), Jacques Bertin discusses the presentation of data matrices, with a particular view to seriation. "A Tribute to J. Bertin's Graphical Data Analysis", presented at the SoftStat Conference '97, gives an appraisal of this aspect of Bertin's work.
Bertin's approach has been implemented in the Voyager system for data analysis. This is a video of the SoftStat '97 presentation using Voyager. Video recorded in Feb. 2014.
References:
J. Bertin: La Graphique et le Traitement Graphique de l'Information. Flammarion, Paris (1977)
J. Bertin: Graphics and Graphic Information Processing. De Gruyter, Berlin (1981)
A. de Falguerolles, F. Friedrich, G. Sawitzki: A Tribute to J. Bertin's Graphical Data Analysis. In: W. Bandilla, F. Faulbaum (eds.), Advances in Statistical Software 6. Lucius & Lucius, Stuttgart (1997), ISBN 3-8282-0032-X, pp. 11-20.
G. Sawitzki: Extensible Statistical Software: On a Voyage to Oberon. Journal of Computational and Graphical Statistics, Vol. 5, No. 3 (1996)
SoftStat '97: the 9th Conference on the Scientific Use of Statistical Software, March 3-6, 1997, Heidelberg.
Project home page: <http://www.statlab.uni-heidelberg.de/projects/bertin/>
Author's home page: <http://www.statlab.uni-heidelberg.de/users/gs/>
The private provision of public goods (PPPG) is still one of the most fascinating puzzles in economics. Underpredicted by standard economic theory but plainly evident in empirical data, its presence opened doors for new methods to enter the economist's toolkit and helped give birth to the now more vibrant than ever field of behavioral economics. Variants of the question of what determines giving in PPPG make up, for the most part, the research questions of the five articles that constitute this dissertation. Residing on the overlap of public economics, environmental economics, and behavioral/experimental economics, it reports on two field-experimental projects and one lab-experimental project, delivering results with respect to, for example, the price elasticity of giving to public goods, the willingness to pay for a voluntary one-ton emissions reduction, the pronounced effect of education on PPPG, the "pure" effect of group size in public good provision, and the effects of ambient noise and outdoor temperature on PPPG.
The Tailorshop simulation is a computer-based dynamic decision-making task in which participants lead a fictional company for 12 simulated months. The present study investigated whether the performance measure in the Tailorshop simulation is reliable and valid. The participants were 158 employees from different companies. Structural equation models were used to test τ-equivalent measurement models. The results indicate that the trends of the company value between the second and the twelfth month are reliable variables. Furthermore, this measure predicted real-life job performance ratings by supervisors and was associated with the performance in another dynamic decision-making task. Thus, the trend of the company value provides a reliable and valid performance indicator for the Tailorshop simulation.
We present methods for the systematic modelling and clustering of time series. Our data is associated with behavioral studies of alcoholism in animals. We analyze multivariate time series obtained from an automated drinkometer system. Here, rats have free access to water and three alcoholic solutions (this being the baseline treatment level), which is then interrupted by repeated deprivation phases. We develop a methodology to simultaneously classify the animals into, and characterize, dynamic patterns of the observed drinking behavior. This is achieved by extending known results on generalized linear models (GLM) for univariate time series to the multivariate case. We simplify the computational fitting procedure by assuming a shared seasonal pattern throughout individuals and implementing an expectation maximization (EM) algorithm to fit mixtures of the mentioned multivariate GLM. A partition of the data, as well as a characterization of each group, is obtained. The observed patterns of drinking behavior differ in their preference profile for the three alcoholic solutions, and also in the net alcohol intake. We observe an evolution of the drinking behavior over the repeated cycles of alcohol admission and deprivation, with a clear initial preference profile and a development to one of the advanced profiles. Furthermore, to measure the alcohol deprivation effect (ADE) in this 4-bottle setting, a new criterion is developed, which enables us to classify each rat as presenting an ADE or not. This classification shows that the rats develop a tolerance to taste adulteration after a few deprivation phases. The proposed framework can be employed to find differences in behavior between different conditions and/or groups of animals and in the prediction of alcoholism from early phases of alcohol intake. The developed methods can also be used in different fields where the analysis of time series plays an important role (e.g. microarray analysis and neuroscience).
"Campus-Report" heißt die Radiosendung der Universitäten Heidelberg, Mannheim, Karlsruhe und Freiburg. Die Reportagen über aktuelle Themen aus Forschung und Wissenschaft werden montags bis freitags jeweils um ca. 19.10h im Programm von Radio Regenbogen gesendet. (Empfang in Nordbaden: UKW 102,8. In Mittelbaden: 100,4 und in Südbaden: 101,1) Uni-Radio Baden: ein gemeinsames Projekt der Universitäten Freiburg, Heidelberg, Karlsruhe und Mannheim in Zusammenarbeit mit Radio Regenbogen – unterstützt von der Landesanstalt für Kommunikation. Sendung vom 14. März 2012
A classical model in time series analysis is a stationary process superposed by one or several deterministic sinusoidal components. Different methods are applied to estimate the frequency ω of those components, such as least squares estimation and the maximization of the periodogram. In many applications the assumption of a constant frequency is violated and we turn to a time-dependent frequency function ω(s). For example, in the physics literature this is viewed as nonlinearity of the phase of a process. A way to estimate ω(s) is the local application of the above methods. In this dissertation we study the maximum periodogram method on data segments as an estimator of ω(s) and subsequently a least squares technique for estimating the phase. We prove consistency and asymptotic normality in the context of "infill asymptotics", a concept that offers a meaningful asymptotic theory in cases of local estimation. Finally, we investigate an estimator based on a local linear approximation of the frequency function, prove its consistency and asymptotic normality in the "infill asymptotics" sense, and show that it delivers better estimates than the ordinary periodogram. The theoretical results are also supported by some simulations.
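A minimal sketch of the segment-wise maximum periodogram idea (segment length, step size and the chirp test signal are arbitrary choices for this illustration, not the settings of the dissertation):

    import numpy as np

    def local_max_periodogram(x, seg_len=256, step=64):
        """Return segment centers and the frequency (cycles/sample) maximizing
        the periodogram of each segment."""
        centers, freqs = [], []
        for start in range(0, len(x) - seg_len + 1, step):
            seg = x[start:start + seg_len]
            seg = seg - seg.mean()
            per = np.abs(np.fft.rfft(seg)) ** 2       # periodogram (up to scaling)
            f = np.fft.rfftfreq(seg_len)              # frequencies in cycles/sample
            k = 1 + np.argmax(per[1:])                # skip the zero frequency
            centers.append(start + seg_len / 2)
            freqs.append(f[k])
        return np.array(centers), np.array(freqs)

    # Toy example: a chirp whose frequency increases linearly from 0.05 to 0.15
    n = 4000
    t = np.arange(n)
    inst_freq = 0.05 + 0.10 * t / n
    phase = 2 * np.pi * np.cumsum(inst_freq)
    x = np.cos(phase) + 0.5 * np.random.default_rng(1).standard_normal(n)
    centers, freqs = local_max_periodogram(x)
    print(freqs[:5], freqs[-5:])   # local frequency estimates near 0.05 and 0.15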