title: Segmentation-free inference of cell types from in situ transcriptomics data
creator: Park, Jeongbin
subject: ddc-004
subject: 004 Data processing Computer science
subject: ddc-500
subject: 500 Natural sciences and mathematics
description: Recent advances in the fields of genome editing, whole-genome sequencing, single-cell RNA sequencing, and in situ spatial transcriptomics have enabled the cost-efficient production of high-throughput big data. However, the lack of dedicated bioinformatics algorithms to analyze such data has been a major hurdle. In this thesis, several novel bioinformatics tools applicable to each field are presented. First, a series of web-based tools for genome editing are presented: Cpf1-Database, Cas-Analyzer, web-based Digenome-seq software, and BE-Designer/Analyzer. These tools have been developed to help researchers use genome editing systems based on Cas9 or Cpf1 through an easily accessible web-based interface. Second, the development of two bioinformatics pipelines is described: a small variant calling pipeline to process tumor genome sequencing data without a matched control, and a pipeline to pre-process single-cell RNA sequencing data. Third, a novel segmentation-free algorithm to call cell types from in situ transcriptomics data, namely Spot-based Spatial cell-type Analysis by Multidimensional mRNA density estimation (SSAM), is presented. Recent advances in in situ spatial transcriptomics techniques, such as multiplexed fluorescence in situ hybridization or in situ/intact tissue sequencing, have enabled the discovery of spatial heterogeneity of cell types at the tissue level. However, cell type calling methods are often limited by cell segmentation algorithms due to various imaging problems. SSAM circumvents these problems by estimating spatial gene expression through density estimation of mRNA in a spatial context and by identifying de novo cell types and their spatial organization without the need to segment cells.
Optionally, SSAM can be guided by external sources of cell-type information, integrating them in a spatial context. In this thesis, SSAM is demonstrated with three different mouse brain tissues imaged by different imaging techniques: the somatosensory cortex (SSp) imaged by osmFISH; the hypothalamic preoptic region (POA) imaged by MERFISH; and the visual cortex (VISp) imaged by multiplexed smFISH. SSAM produces results similar to those of segmentation-based methods and outperforms them when cell segmentation is the limiting factor. In summary, the bioinformatics tools presented in this thesis overcome major obstacles that would normally hinder effective analysis: the web-based tools for genome editing have a wide user base due to their easy-to-use web-based interfaces; the omics data analysis pipelines enable fast analysis of omics data on a compute cluster and facilitate hypothesis generation when control tissue is lacking; and SSAM enables the analysis of in situ spatial transcriptomics data without being limited by cell segmentation. All of the tools and pipelines described in this thesis are open-sourced and freely accessible for non-profit, research-purpose use.
date: 2020
type: Dissertation
type: info:eu-repo/semantics/doctoralThesis
type: NonPeerReviewed
format: application/pdf
identifier: https://archiv.ub.uni-heidelberg.de/volltextserver/28273/7/PhD_thesis_recompiled.pdf
identifier: DOI:10.11588/heidok.00028273
identifier: urn:nbn:de:bsz:16-heidok-282738
identifier: Park, Jeongbin (2020) Segmentation-free inference of cell types from in situ transcriptomics data. [Dissertation]
relation: https://archiv.ub.uni-heidelberg.de/volltextserver/28273/
rights: info:eu-repo/semantics/openAccess
rights: http://archiv.ub.uni-heidelberg.de/volltextserver/help/license_urhg.html
language: eng