
Power and Energy Management of Modern Architectures in Adaptive HPC Runtime Systems
Thesis 2015
Publication Type: PhD Thesis
Repository URL:
Power and energy efficiency are important challenges for the High Performance Computing (HPC) community. Excessive power consumption is a main limitation for further scaling of HPC systems, and researchers believe that current technology trends will not provide Exascale performance within a reasonable power budget in the near future. Hardware innovations such as the proposed Exascale architectures and Near Threshold Computing are expected to improve power efficiency significantly, but more innovations are required in this domain to make Exascale possible. To help shrink the power efficiency gap, we argue that adaptive runtime systems can be exploited. The runtime system (RTS) can save significant power, since it is aware of both the hardware properties and the application behavior. We use application-centric analysis of different architectures to design automatic adaptive RTS techniques that save significant power in different system components with only minor hardware support. In a nutshell, we analyze different modern architectures and common applications and illustrate that some system components, such as caches and network links, consume a disproportionately large amount of power for common HPC applications. We demonstrate how our approach can automatically save a large fraction of the power consumed in caches and networks. In these cases, the hardware support the RTS needs is the ability to turn off ways of set-associative caches and network links. We also present some required RTS techniques, such as identifying the running application's pattern to predict its future behavior and adapt the hardware appropriately. Furthermore, we address two types of prevalent heterogeneity: utilization of accelerator devices and process variation. To study accelerators, we analyze and optimize an example application on a heterogeneous architecture and demonstrate techniques for efficient mapping on different devices (CPU and GPU).
To address process variation challenges, we develop accurate models that let the RTS schedule efficiently in the presence of speed and power consumption variation. Using the models, we develop a novel scheduling framework that uses integer linear programming to enforce different performance and power consumption constraints.
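As a rough illustration of scheduling under a power constraint, the sketch below picks one operating level per chip so that total power stays within a budget while the slowest chip finishes as early as possible. This is not the thesis's actual model: the function name `schedule`, the per-chip `(time, power)` tables, and all numbers are hypothetical. The problem is a small integer linear program (one binary choice per chip and level, a linear power constraint, and a makespan objective), but for brevity it is solved here by exhaustive enumeration rather than with an ILP solver.

```python
from itertools import product


def schedule(levels, power_budget):
    """Pick one operating level per chip under a total power budget.

    levels[c] is a list of (time_seconds, power_watts) options for chip c.
    Because of process variation, each chip has its own table.
    Returns (makespan, choice) where choice[c] is the chosen level index,
    or None if no combination fits within the budget.
    """
    best = None
    # Enumerate every assignment of one level per chip (tiny instances only).
    for choice in product(*(range(len(opts)) for opts in levels)):
        power = sum(levels[c][l][1] for c, l in enumerate(choice))
        if power > power_budget:
            continue  # violates the power constraint
        # The job finishes when the slowest chip finishes.
        makespan = max(levels[c][l][0] for c, l in enumerate(choice))
        if best is None or makespan < best[0]:
            best = (makespan, choice)
    return best


# Hypothetical (time, power) tables for two chips of the same model.
chips = [
    [(10.0, 50.0), (8.0, 70.0), (6.0, 95.0)],   # efficient chip
    [(11.0, 60.0), (9.0, 85.0), (7.0, 115.0)],  # slower, leakier chip
]

if __name__ == "__main__":
    print(schedule(chips, 165.0))
```

With these made-up numbers and a 165 W budget, the search selects the middle level on both chips (155 W in total, makespan 9 s), because running either chip at its fastest level would leave too little power for the other. A real framework would hand the same constraints to an ILP solver to scale beyond a handful of chips.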
Research Areas