Performance comparison of four human whole-genome sequencing technologies

Rieber, Nora

Files:
thesis_rieber_final.pdf (PDF, English, 9MB)
technical_annex.zip (compressed archive, English, 141kB)

Abstract

After almost 30 years of inertia in the field of sequencing, the emergence of a whole range of so-called "next-generation" sequencing technologies has revolutionized the way we approach genomic and genetic research. Sequencing all 3 gigabases of a human genome, once a costly 13-year international effort, can now be done within days at a coverage of 30x or more, at a price that is affordable for a mid-sized lab. Among the different next-generation sequencing machines developed over the last 6 to 8 years, four instruments from three different companies have established themselves on the market for human whole-genome sequencing: Illumina's HiSeq2000, Life Technologies' SOLiD 4 and 5500xl SOLiD, and Complete Genomics' technology.

However, these next-generation sequencing platforms are still relatively new, and a comprehensive comparative assessment of their performance is lacking. To address this gap, the DNA of two tumor-normal pairs from medulloblastoma patients was sequenced individually to 30x coverage on each of the four instruments. The resulting data were analyzed with respect to coverage distribution and biases across the genome, in particular GC bias; regions without coverage and specific genomic regions were also assessed. SNP calls from the different sequencing machines were compared, and the benefits of combining read information from different instruments were evaluated. Additionally, somatic mutations were analyzed.

The most striking result is the poor coverage of GC-rich regions by SOLiD 4 and 5500xl SOLiD, discouraging their use in particular for methylation experiments and exome sequencing. In contrast, Complete Genomics seems the least affected by GC content and shows the most comprehensive coverage of many genomic regions, except for short repeats. HiSeq2000 exhibits the most even genome-wide coverage distribution and the least sample-to-sample variation, while consistently achieving the highest sensitivity in SNP calling. A combination of read data from different technologies is shown to entail limited improvement in most cases, and is advisable only for very specific applications. Finally, the comparison of somatic variation confirms that calling somatic alterations is still a big challenge, which is due in particular to low allele frequency. In summary, this comparative study illustrates the assets and drawbacks of each individual machine and can be used as a guide to find the most suitable platform for a specific experimental goal.

Document type: Dissertation
First reviewer: Eils, Prof. Dr. Roland
Date of examination: 6 November 2013
Record created: 08 Nov. 2013 09:06
Year of publication: 2013
Institutes/Facilities: Fakultät für Biowissenschaften (Faculty of Biosciences) > Dekanat der Fakultät für Biowissenschaften (Dean's Office of the Faculty of Biosciences)
DDC subject group: 004 Computer science
500 Natural sciences and mathematics
570 Life sciences, biology
Controlled keywords: Bioinformatics, cancer research, genomics, genome project
Free keywords: Next-generation sequencing