Ankit Sethia

About the Event

High performance computing is evolving at a rapid pace, with throughput-oriented
processors such as graphics processing units (GPUs) substituting for
traditional processors as the computational workhorse. Their adoption has seen
a tremendous increase as they provide high peak performance and energy
efficiency while maintaining a friendly programming interface. Furthermore, many
existing desktop, laptop, tablet, and smartphone systems support accelerating
non-graphics, data-parallel workloads on their GPUs. However, the many
systems that use GPUs as accelerators run different kinds of data-parallel
applications, which have sharply contrasting runtime characteristics.
GPUs use thousands of identical threads to efficiently exploit the on-chip
hardware resources. Because the threads are identical, if one thread uses a
resource (compute, bandwidth, data cache) heavily, so do all the others, and
there is significant contention for that resource. This contention eventually
saturates the performance of the GPU at the bottleneck resource while leaving
the other resources underutilized. Traditional policies for managing the massive
hardware resources work adequately on well-designed, traditional
scientific-style applications. However, these static policies, which are oblivious to the
application's resource requirements, are inefficient for the large spectrum of
data-parallel workloads with varying resource requirements. Therefore, several
standard hardware policies, such as using maximum concurrency, a fixed operating
frequency, and round-robin style scheduling, are not efficient for modern GPU
applications.
This thesis defines dynamic hardware resource management mechanisms which
improve the efficiency of the GPU by regulating the hardware resources at
runtime. The first step in successfully achieving this goal is to make the
hardware aware of the application's characteristics at runtime through novel
counters and indicators. After this detection, dynamic hardware modulation
provides opportunities for increased performance, reduced energy consumption,
or both, leading to efficient execution. The key mechanisms for modulating the
hardware at runtime are dynamic frequency regulation, managing the amount of
concurrency, managing the order of execution among different threads, and
increasing cache utilization. The resulting efficiency gains will reduce the
energy consumption of systems that use GPUs while maintaining
or improving their performance.