9.15.11 AMD's Lightweight Profiling Instructions

Lightweight Profiling (LWP) lets applications collect and manage
performance data and react to performance events. Collecting the data
requires no context switches. LWP runs in the context of a thread, so
each thread can use its own counters independently of other threads.
LWP can be used in both 64-bit and legacy 32-bit modes.
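
Because LWP is an optional processor feature, a program would normally
check for it before relying on it. The following is a minimal sketch of
such a check, not something this section specifies. It assumes, from
AMD's CPUID documentation, that hardware support is reported in ECX
bit 15 of extended CPUID leaf 0x80000001, and it assumes a GCC or Clang
toolchain that provides <cpuid.h>. The names lwp_bit_set and
lwp_supported are hypothetical helpers introduced for illustration.

```c
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>   /* GCC/Clang header providing __get_cpuid() */
#endif

/* Assumption (AMD CPUID documentation, not this section): ECX bit 15 of
 * extended leaf 0x80000001 reports LWP support. */
#define LWP_CPUID_LEAF 0x80000001u
#define LWP_ECX_BIT    15u

/* Hypothetical helper: returns 1 if the LWP bit is set in an ECX value
 * obtained from the leaf above, 0 otherwise. */
static int lwp_bit_set(uint32_t ecx)
{
    return (int)((ecx >> LWP_ECX_BIT) & 1u);
}

/* Hypothetical helper: returns 1 if the processor reports LWP support,
 * 0 if it does not, if the leaf is unavailable, or if the program is
 * not built for x86. The same code builds in 32-bit and 64-bit modes. */
static int lwp_supported(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;

    /* __get_cpuid() returns 0 when the requested leaf is not supported. */
    if (!__get_cpuid(LWP_CPUID_LEAF, &eax, &ebx, &ecx, &edx))
        return 0;
    return lwp_bit_set(ecx);
#else
    return 0;
#endif
}
```

The check only covers what the processor reports. Whether the operating
system has enabled LWP for user code is a separate question that this
sketch does not address.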