title: Efficient Management of Huge Data Sets on Cluster Computers
creator: Vasquez Lucas, Hipolito
subject: ddc-004
subject: 004 Data processing Computer science
description: In a cluster computer, a parallel file system is in charge of spreading a single parallel file across the computer's different I/O nodes using a given distribution function. In file-I/O-intensive parallel scientific applications with "semi-random temporal parallel file I/O access patterns", this file is accessed at different addresses at the same time by a number of processes that may vary between two consecutive iterations. In this thesis, a set of "semi-random temporal parallel file I/O access patterns" generated by a phylogenetic application is categorized. For these patterns, a partitioning function is proposed that guarantees access to the parallel file at any time during execution. The thesis shows the correlation between the type of I/O access pattern and the type and setting of two round-robin-based distribution functions, so that the application's overall execution time can be reduced.
date: 2012
type: Dissertation
type: info:eu-repo/semantics/doctoralThesis
type: NonPeerReviewed
format: application/pdf
identifier: https://archiv.ub.uni-heidelberg.de/volltextserver/13149/1/HVasquezLucasPhD04072011.pdf
identifier: DOI:10.11588/heidok.00013149
identifier: urn:nbn:de:bsz:16-opus-131491
identifier: Vasquez Lucas, Hipolito (2012) Efficient Management of Huge Data Sets on Cluster Computers. [Dissertation]
relation: https://archiv.ub.uni-heidelberg.de/volltextserver/13149/
rights: info:eu-repo/semantics/openAccess
rights: http://archiv.ub.uni-heidelberg.de/volltextserver/help/license_urhg.html
language: eng