eprintid: 37230
rev_number: 15
eprint_status: archive
userid: 9251
dir: disk0/00/03/72/30
datestamp: 2025-09-04 11:03:57
lastmod: 2025-09-23 10:40:39
status_changed: 2025-09-04 11:03:57
type: masterThesis
metadata_visibility: show
creators_name: Bespalov, Sergej
title: Reducing Global Memory Accesses in DNN Training using Structured Weight Masking
subjects: ddc-004
divisions: fac-720000
divisions: i-160001
adv_faculty: af-19
abstract: Training large deep neural networks (DNNs) is often constrained by memory bandwidth, with frequent global memory accesses representing a significant performance bottleneck. This thesis investigates the potential of dynamic structured weight masking to alleviate this bottleneck during training, focusing on the ResMLP architecture, a feedforward network composed exclusively of multi-layer perceptrons. A novel framework implementing block-wise masking based on L2-norm magnitude and top-k selection was developed and evaluated on the CIFAR-10 dataset. The study systematically varied block sizes and sparsity ratios, analyzing the impact on classification accuracy, theoretical computational cost (FLOPs), and theoretical memory movement. Results indicate that model accuracy remains robust up to approximately 50% sparsity when the mask is also applied during the backward pass; beyond this threshold, classification accuracy degrades. Notably, larger blocks improve computational efficiency under a masked backward pass by producing hardware-friendly memory access patterns, whereas with an unmasked backward pass, smaller blocks preserve accuracy better. A key observation is the discrepancy between the substantial reduction in computationally active weights and the limited decrease in estimated memory movement, suggesting that tangible memory savings can only be achieved with hardware-aware implementations that bypass unnecessary data loads. Theoretical FLOPs decrease linearly with increasing sparsity, confirming the potential for computational efficiency gains. Overall, this work contributes an empirical analysis of dynamic structured weight masking in MLP-based architectures, offering insights into the trade-offs between mask ratio, block granularity, and training stability. The findings underscore the importance of co-designing masking patterns with hardware-aware data movement to achieve improvements in both computational cost and memory access, while also highlighting considerations for maintaining training stability. Furthermore, they provide practical guidelines for the efficient training of DNNs on systems with limited memory or computational resources.
date: 2025
id_scheme: DOI
id_number: 10.11588/heidok.00037230
ppn_swb: 1936556707
own_urn: urn:nbn:de:bsz:16-heidok-372307
date_accepted: 2025-06-10
language: eng
bibsort: BESPALOVSEREDUCINGGL2025
full_text_status: public
place_of_pub: Heidelberg
thesis_type: master
citation: Bespalov, Sergej (2025) Reducing Global Memory Accesses in DNN Training using Structured Weight Masking. [Master's thesis]
document_url: https://archiv.ub.uni-heidelberg.de/volltextserver/37230/1/Bespalov_Sergej.pdf
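
Note: The abstract describes block-wise masking based on L2-norm magnitude and top-k selection. The following is a minimal PyTorch sketch of such a scheme for illustration only, not the thesis implementation; the function name block_topk_mask and the parameters block_size and keep_ratio are assumed here, and the thesis additionally applies the resulting mask during the backward pass.

import torch

def block_topk_mask(weight: torch.Tensor, block_size: int, keep_ratio: float) -> torch.Tensor:
    """Build a binary mask that keeps the weight blocks with the largest L2 norms.

    weight     : 2-D weight matrix; both dimensions must be divisible by block_size
    block_size : edge length of the square blocks
    keep_ratio : fraction of blocks kept active (1 - sparsity)
    """
    rows, cols = weight.shape
    br, bc = rows // block_size, cols // block_size
    # Regroup the matrix into (br, bc, block_size, block_size) blocks.
    blocks = weight.reshape(br, block_size, bc, block_size).permute(0, 2, 1, 3)
    # L2 norm of each block.
    norms = blocks.reshape(br, bc, -1).norm(dim=-1)
    # Keep the top-k blocks by norm, zero out the rest.
    k = max(1, int(round(keep_ratio * br * bc)))
    keep = torch.topk(norms.flatten(), k).indices
    block_mask = torch.zeros(br * bc, device=weight.device)
    block_mask[keep] = 1.0
    # Expand the block-level mask back to the full weight shape.
    return (block_mask.reshape(br, bc)
            .repeat_interleave(block_size, dim=0)
            .repeat_interleave(block_size, dim=1))

# Example: 128x128 weight matrix, 16x16 blocks, 50% sparsity.
w = torch.randn(128, 128)
mask = block_topk_mask(w, block_size=16, keep_ratio=0.5)
masked_w = w * mask  # in a masked backward pass the same mask would also gate the weight gradient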