Scalable Algorithms for Constructing Balanced Spanning Trees on System-ranked Process Groups
European MPI Users' Group Meeting (EuroMPI) 2012
Publication Type: Talk
Repository URL:
Current implementations of process groups (subcommunicators) have non-scalable (O(group size)) memory footprints and even worse time complexities for setting up communication. We propose system- ranked process groups, where member ranks are picked by the runtime system, as a cheaper and faster alternative for a subset of collective operations (barrier, broadcast, reduction, all reduce).

This talk presents two distributed algorithms for balanced, k-ary spanning tree construction over system-ranked process groups obtained by splitting a parent group. Our schemes have much smaller memory footprints and also perform better, even at modest process counts. We demonstrate performance results up to 131,072 cores of BlueGene/P.
Research Areas