eprintid: 33464
rev_number: 28
eprint_status: archive
userid: 5498
dir: disk0/00/03/34/64
datestamp: 2023-07-07 09:22:07
lastmod: 2024-07-02 08:00:34
status_changed: 2023-07-07 09:22:07
type: doctoralThesis
metadata_visibility: show
creators_name: Simoes Costa, Ana Luisa
title: Computational approaches to the study of chromatin in disease: examples from cancer and infection
subjects: ddc-004
subjects: ddc-500
subjects: ddc-570
divisions: i-140001
divisions: i-716000
adv_faculty: af-14
cterms_swd: Bioinformatik
cterms_swd: Daten
cterms_swd: Statistik
cterms_swd: Genomik
cterms_swd: Epigenetik
cterms_swd: Infektion
cterms_swd: Krebs
abstract: Background: Machine learning approaches are becoming increasingly common in biological research, as they allow a better understanding of complex cell dynamics. Epigenetics encompasses processes that modulate gene expression without depending on the genomic sequence. Epigenetic alterations have often been linked to disease. In this thesis, we applied several computational approaches to characterise the epigenetic landscape of disease states caused by Human Immunodeficiency Virus infection and by cancer in the brain. Results: In the first part of this thesis, we applied non-negative matrix factorisation to build an epigenetic state map for the C20 microglial cell line and assessed the connection between viral integration and epigenetics in the context of HIV-1 infection. Using random forest models, we observed that the genomic targets of HIV-1 integration are influenced by the initial epigenetic landscape and that infection leads to changes in chromatin accessibility and transcription factor (TF) binding. Furthermore, we found that regions often targeted by viral integration are associated with higher-order chromatin structures, in particular topologically associated domains.
In the second part of this thesis, we characterised the CpG islands (CGIs) of four glioblastoma subtypes and identified a new phenotype of CGI hypermethylation associated with the RTK-II subtype, different from the one observed in the IDH subtype. We compared the CGI hypermethylation phenotypes associated with the IDH and RTK-II subtypes using random forests and used progenitor states to assess the tendency of each CGI to become hypermethylated. We observed that the CGIs most likely to become hypermethylated in cancer are already marked in undifferentiated cell states. Moreover, we observed that RTK-II CGI hypermethylation disturbs the astrogenic/neurogenic fate balance. Conclusions: This thesis provides novel insights into the epigenetics of HIV-1 integration and CGI hypermethylation in glioblastoma. Through a genomic and epigenomic data-driven approach, we emphasise the importance of computational approaches such as non-negative matrix factorisation, random forests, and Bayesian networks in epigenetic research, as these provided a holistic view of the global effects of viral integration and CGI hypermethylation in human cells.
date: 2024
id_scheme: DOI
id_number: 10.11588/heidok.00033464
ppn_swb: 1893067513
own_urn: urn:nbn:de:bsz:16-heidok-334645
date_accepted: 2023-06-23
language: eng
bibsort: SIMOESCOSTCOMPUTATIO20230705
full_text_status: public
place_of_pub: Heidelberg
citation: Simoes Costa, Ana Luisa (2024) Computational approaches to the study of chromatin in disease: examples from cancer and infection. [Dissertation]
document_url: https://archiv.ub.uni-heidelberg.de/volltextserver/33464/1/thesis_17-04_SimoesCosta.pdf