%0 Journal Article %A Mücke, Manfred %A Gansterer, Wilfried N. %D 2011 %F heidok:11585 %R 10.11588/heidok.00011585 %T Analysis of Inter-Chip Communication Patterns on Multi-Core Distributed Shared-Memory Computers %U https://archiv.ub.uni-heidelberg.de/volltextserver/11585/ %X Multi-core multi-socket distributed shared-memory com- puters (DSM computers, for short) have become an impor- tant node architecture in scientific computing as they provide substantial computational capacity with relatively low space and power requirements. Compared to conventional computer networks, inter-chip networks used in DSM computers feature higher bandwidth, lower latency and tighter integration with the CPU. The inter-chip network is a shared resource among the user application and many other services, which can lead to consid- erable variation of execution times of identical communication tasks. In this work, we explore traffic patterns resulting from MPI collective communication primitives and investigate the ques- tion whether inter-chip link load is a reliable indicator and predictor for the execution time of collective communication primitives on a DSM computer. Our experiments on a Sun Fire X4600 M2 DSM computer with 32 cores (eight quad-core CPUs) indicate that specific single link loads are positively correlated with the execution time of MPI ALLREDUCE. Ob- serving patterns over multiple links allows refinement of the single-link observation.