Cell-type deconvolution model for read-level DNA methylomes

Jeong, Yunhee

PDF, English - main document

Abstract

The cell-type composition of bulk samples is key evidence for examining disease progression, phenotypic characterisation and treatment response. Cell-type deconvolution has therefore attracted attention as a computational approach to estimating cell-type composition. DNA methylation (DNAm) has been widely used as an epigenetic mark for cell-type deconvolution because it carries cell type-specific signals at CpG sites in mammalian genomes. In particular, sequencing-based DNAm data provides broader genomic coverage and captures rare cell-type signals better than array-based data. Despite these advantages, cell-type deconvolution methods have so far primarily targeted array-based data. Hence, we introduce a new sequencing-based cell-type deconvolution method for DNAm data and perform a systematic benchmarking of existing cell-type deconvolution methods. To address the limitations of existing methods observed in the benchmarking, we developed MethylBERT, a deep learning method based on Bidirectional Encoder Representations from Transformers (BERT). The proposed method is specifically designed for tumour purity estimation. MethylBERT classifies DNAm patterns into tumour and normal cell types and infers the proportion of the tumour cell type via maximum likelihood estimation. The evaluation demonstrates good performance of the proposed method in both DNAm pattern classification and tumour purity estimation. In addition, we show that MethylBERT can detect rare tumour signals: it yields accurate tumour purity estimates for bulk samples with a very low tumour percentage (<1%), demonstrating its potential for non-invasive early cancer diagnostics via blood tests.
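
As an illustration of the final step described in the abstract, the following is a minimal sketch of how a tumour proportion could be inferred by maximum likelihood once each read has a likelihood under the tumour class and under the normal class. It is not taken from the thesis: the two-component mixture model, the EM update, and the function name estimate_tumour_fraction are assumptions made for this example, and the per-read likelihoods are hand-written toy values rather than classifier output.

```python
def estimate_tumour_fraction(lik_tumour, lik_normal, n_iter=200, tol=1e-9):
    """Maximum likelihood estimate of the tumour proportion pi.

    Hypothetical helper (not the thesis implementation). Assumes each read i
    is drawn from the mixture  pi * lik_tumour[i] + (1 - pi) * lik_normal[i]
    with fixed per-read class likelihoods, and maximises the log-likelihood
    over pi with the EM algorithm.
    """
    if len(lik_tumour) != len(lik_normal) or not lik_tumour:
        raise ValueError("need two non-empty lists of equal length")
    for lt, ln in zip(lik_tumour, lik_normal):
        if lt < 0 or ln < 0 or lt + ln == 0:
            raise ValueError("each read needs non-negative likelihoods, one of them positive")

    pi = 0.5  # uninformative starting point
    for _ in range(n_iter):
        # E-step: posterior probability that each read comes from tumour cells
        resp = [pi * lt / (pi * lt + (1.0 - pi) * ln)
                for lt, ln in zip(lik_tumour, lik_normal)]
        # M-step: the new proportion is the mean responsibility
        new_pi = sum(resp) / len(resp)
        converged = abs(new_pi - pi) < tol
        pi = new_pi
        if converged:
            break
    return pi


if __name__ == "__main__":
    # Toy data: 30 reads that look tumour-like, 70 that look normal-like.
    lik_t = [0.9] * 30 + [0.1] * 70
    lik_n = [0.1] * 30 + [0.9] * 70
    print(round(estimate_tumour_fraction(lik_t, lik_n), 4))
```

For this toy input the estimate is 0.25 rather than the naive 30% of reads that look tumour-like, because the mixture likelihood accounts for the assumed 10% chance that a read's pattern is misleading. The same reasoning suggests why a likelihood-based estimate can be preferable to simply counting classified reads when the tumour fraction is very low.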

Document type: Dissertation
Supervisor: Rohr, PD Dr. Karl
Place of Publication: Heidelberg
Date of thesis defense: 14 March 2024
Date Deposited: 25 Mar 2024 13:28
Date: 2024
Faculties / Institutes: The Faculty of Mathematics and Computer Science > Department of Computer Science
DDC-classification: 004 Data processing Computer science
Controlled Keywords: Transformers, Bioinformatics, Epigenetics, Machine learning