Network phenomena are of key importance in the majority of scientific disciplines. They motivate the desire to better understand the implications of interactions between connected entities. This thesis focuses on two of the most prominent tasks in the research of such phenomena: the modelling and the inference of connections within networks. In particular, I provide a systematic framework for using the topology and unifying characteristics of networks from fields as diverse as biology, sociology, and economics to predict and validate connections. I build on existing random graph models and node similarity measures, which I then employ in both unsupervised and supervised machine learning approaches. Furthermore, I present novel methods for identifying the statistically significant connections in network settings that involve multiple types of entities and connections, a crucial element of modelling that most available methods fail to address.
To demonstrate the potential of these new tools, I use them to filter networks that were constructed from large-scale noisy data generated by biological experiments as well as records of online social activity. Subsequently, I predict previously unobserved connections within these networks and evaluate the performance of the developed tools based on ground truth data. In further data sets without direct evidence for the connections in the network, a second, bipartite network serves as a proxy for the analysis. Specifically, in an e-commerce setting, I use connections between products and customers to deduce similarities between the products based on customer behaviour. In an analysis of high-throughput screening data, on the other hand, I use relations between proteins and experimental conditions to identify potential functional affinities among the proteins.
The findings presented here show that the computational prediction of connections can both help researchers gain a better understanding of costly large-scale data and guide further experimental design. The thesis demonstrates the potential of a network analytic approach to modelling and inference in multiple applications, such as the uncovering of possible privacy issues in the context of online social networking platforms and the optimization of drug development in cancer treatment.
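
As a rough illustration of the bipartite proxy idea described above, the following minimal sketch derives product similarities from customer-product connections. It assumes the Jaccard index as the node similarity measure, which is a common choice but not necessarily the one used in the thesis, and it does not include any test of statistical significance. The function name `product_similarities` and the toy purchase records are hypothetical.

```python
from itertools import combinations


def product_similarities(purchases):
    """Jaccard similarity between products, given (customer, product) pairs.

    Hypothetical helper: projects the bipartite customer-product network
    onto the products and scores each product pair by customer overlap.
    """
    # Map every product to the set of customers connected to it.
    customers_of = {}
    for customer, product in purchases:
        customers_of.setdefault(product, set()).add(customer)

    # Score each product pair that shares at least one customer.
    sims = {}
    for a, b in combinations(sorted(customers_of), 2):
        shared = customers_of[a] & customers_of[b]
        if shared:
            union = customers_of[a] | customers_of[b]
            sims[(a, b)] = len(shared) / len(union)
    return sims


# Toy data (invented for illustration only).
purchases = [
    ("alice", "book"), ("alice", "lamp"),
    ("bob", "book"), ("bob", "lamp"), ("bob", "pen"),
    ("carol", "pen"),
]
sims = product_similarities(purchases)
for pair, score in sorted(sims.items(), key=lambda item: -item[1]):
    print(pair, round(score, 3))
```

In a sketch like this, every pair with any shared customer receives a score, so a noisy data set would yield many weak connections. Filtering these down to the significant ones is the step the thesis addresses with its own methods.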
|Supervisor:||Heermann, Prof. Dr. Dieter|
|Date of thesis defense:||12 December 2013|
|Date Deposited:||19 Dec 2013 10:32|
|Faculties / Institutes:||The Faculty of Physics and Astronomy > Dekanat der Fakultät für Physik und Astronomie|
|Subjects:||004 Data processing Computer science