
Collection and modeling of data provenance with an integrated metadata concept in the context of biomedical workflows in Data Integration Centers

Gierend, Kerstin

PDF, English - main document
Download (15 MB) | License: Collection and modeling of data provenance with an integrated metadata concept in the context of biomedical workflows in Data Integration Centers by Gierend, Kerstin is licensed under the terms of Creative Commons Attribution 4.0

Citation of documents: Please do not cite the URL displayed in your browser's address bar; instead, use the DOI, URN, or persistent URL below, as we can guarantee their long-term accessibility.

Abstract

In the context of the Medical Informatics Initiative funded by the German government, medical data integration centers have implemented complex data flows to load routine health care data into research data repositories for secondary use. Data management practices for (sensitive) medical data elements are of key importance throughout these processes, but little scientific work has so far been undertaken to examine and enforce the data provenance aspects in this specific medical use case. Insufficient knowledge about these medical data and processes can lead to validity risks and weaken the quality of the extracted data. This cumulative dissertation presents a two-stage methodological approach to facilitate extensive provenance information enrichment in the data integration pipelines. A MIRACUM-wide mixed-methods study investigated both the data management maturity status and provenance readiness, and presented recommendations. The subsequent proof-of-concept study took up this outcome to model and implement an algorithm that continuously gathers, stores, and extracts relevant provenance information at the level of individual medical data elements, and it achieved satisfactory pipeline execution times. Overall, the implemented provenance tracking solution indicates a high degree of traceability, accuracy, and reliability of the transformed medical data elements, with which a data integration center can meet any accountability obligations. In addition, this dissertation serves as a catalyst for the derivation of an overarching data management strategy that treats data integrity and provenance characteristics as a key factor for quality and for FAIR, sustained health and research data. This thesis enabled, for the first time, extensive provenance information enrichment in the data integration pipelines of a German medical data integration center.
The dissertation anticipates that its recommendations will enforce the quality of patient data dissemination and guide the implementation of auditable and measurable provenance approaches. This development has potentially broad application, since it contributes initial work toward the envisioned European Health Data Space.

Document type: Dissertation
Supervisor: Ganslandt, Prof. Dr. med. Thomas
Place of Publication: Heidelberg
Date of thesis defense: 18 November 2024
Date Deposited: 04 Dec 2024 14:47
Date: 2024
Faculties / Institutes: Medizinische Fakultät Mannheim > Zentrum für Präventivmedizin und Digitale Gesundheit Baden-Württemberg
DDC-classification: 004 Data processing Computer science
600 Technology (Applied sciences)
610 Medical sciences Medicine
Controlled Keywords: Biomedicine, Computer science, Provenance, Data, Reuse
Uncontrolled Keywords: Data provenance; secondary use; data integration center; research data management