Mechanistic dissection of cellular perturbations with interpretable deep learning models

Doncevic, Daria Ivona


Citation of documents: Please do not cite the URL displayed in your browser's address bar; use the DOI, URN, or persistent URL below instead, as we can guarantee their long-term accessibility.

DOI: 10.11588/heidok.00035577
URN: urn:nbn:de:bsz:16-heidok-355777

Abstract

Background: Deep learning (DL) is increasingly the state of the art for analyzing next-generation sequencing data such as RNA-seq and single-cell RNA-seq, owing to its ability to capture complex patterns in the data. In particular, variational autoencoders (VAEs) have been used for a variety of tasks ranging from batch effect removal to data integration. One disadvantage of DL is its limited interpretability, which stems from the non-linear nature of the models. Interpretability, however, is crucial, especially in the biomedical context.

Results: In this thesis, we developed OntoVAE, an interpretable VAE model whose latent space and decoder reflect a biological regulatory network. OntoVAE can be installed from PyPI and is available on GitHub at https://github.com/hdsu-bioquant/onto-vae. We used OntoVAE to compute pathway activities and to predict the outcome of a gene knockout and the response to interferon treatment. We then developed COBRA, a tool that extends OntoVAE with an adversarial approach to disentangle the effects of different covariates. We used COBRA to study interferon response, adrenal medulla development, and schizophrenia.

Conclusion: OntoVAE and COBRA are VAE tools built on an interpretable latent space and decoder. They can compute pathway activities, support predictive modeling, and, in the case of COBRA, extract effects otherwise overshadowed by confounders. Both tools are easy to install and use, making them a valuable resource for the scientific community.
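
Illustration: The idea of a decoder that reflects a biological network can be sketched with a masked linear layer, in which a latent node (for example, a pathway term) may only connect to the genes annotated to it. The snippet below is a minimal sketch under stated assumptions, not the OntoVAE implementation or its API: the function name `masked_linear`, the two-term toy ontology, and the random weights are all hypothetical, and the perturbation is applied to a latent term for simplicity.

```python
import numpy as np

def masked_linear(z, weights, mask, bias):
    """Linear decoder layer whose weights are zeroed wherever mask is 0."""
    return z @ (weights * mask) + bias

# Hypothetical toy ontology: 2 terms (latent nodes) and 4 genes.
# mask[i, j] = 1 if gene j is annotated to term i, else 0.
mask = np.array([[1, 1, 0, 0],
                 [0, 0, 1, 1]], dtype=float)

# Random stand-ins for trained weights (assumption: no training is shown here).
rng = np.random.default_rng(0)
weights = rng.normal(size=mask.shape)
bias = np.zeros(mask.shape[1])

# Activities of the two terms for a single cell.
z = np.array([[1.0, -0.5]])
baseline = masked_linear(z, weights, mask, bias)

# Simplified in-silico perturbation: set the activity of term 0 to zero.
z_perturbed = z.copy()
z_perturbed[0, 0] = 0.0
perturbed = masked_linear(z_perturbed, weights, mask, bias)

# Only genes annotated to term 0 can change, because the mask removes
# every other connection from that term.
changed = ~np.isclose(baseline, perturbed)
print(changed)
```

Because each latent node is tied to a known set of genes, its value can be read as an activity for that term, and the effect of a perturbation stays confined to the connections the network allows.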

Document type:	Dissertation
Supervisor:	Herrmann, Prof. Dr. Carl
Place of Publication:	Heidelberg
Date of thesis defense:	14 October 2024
Date Deposited:	05 Nov 2024 08:56
Date:	2024
Faculties / Institutes:	Faculty of Engineering Sciences > Institute of Pharmacy and Molecular Biotechnology
DDC-classification:	004 Data processing, Computer science; 570 Life sciences
Controlled Keywords:	Bioinformatics, Machine Learning