In massively parallel processors such as Graphics Processing Units (GPUs), spin-transfer torque RAM (STT-RAM), an emerging non-volatile memory, offers substantial power and area savings and higher capacity than the conventionally used SRAM. Dense, low-static-power STT-RAM is unattractive in processors that run only a few threads of execution, because its write latency is several times higher than that of SRAM and can impair system performance. High-throughput GPUs, however, execute hundreds to thousands of threads in parallel and can hide the long write latency of STT-RAM through fine-grained multithreading. This thesis evaluates the feasibility of using STT-RAM for the shared memory in GPUs. Performance, energy, and area are evaluated for various configurations of shared memory capacity, banks, and ports, using a set of benchmarks with different characteristics. The results show that performance degrades by only up to 2% on average when the STT-RAM write latency is four times the SRAM write latency. Performance even improves by 20% on average when the denser STT-RAM is used to increase shared memory capacity, banks, and ports. Energy savings are up to 17% and area savings up to 50%. The evaluation clarifies the trade-offs involved in using STT-RAM in GPUs and identifies a few configurations that favor its use. A theoretical analysis is also presented to explain why high shared memory latency has such a low impact on GPU performance.

