Anomaly detection is an important task in hyperspectral data exploitation. Although many algorithms have been developed for this purpose in recent years, the large dimensionality of hyperspectral image data means that fast anomaly detection remains a challenging task. In this work, we exploit the computational power of commodity graphics processing units (GPUs) and multicore processors to obtain efficient implementations of the well-known anomaly detection algorithm developed by Reed and Xiaoli (RX algorithm) and of a local variant (LRX), which applies the same concept to a local sliding window centered around each image pixel. LRX has been shown to be more accurate than RX in detecting small anomalies, but it is also computationally more expensive. Our interest is focused on improving the computational aspects, not only through efficient parallel implementations but also by analyzing the mathematical issues of the method and adopting computationally inexpensive solvers. Furthermore, we assess the energy consumption of the newly developed parallel implementations, which is very important in practice. Our optimizations (based on both software and hardware techniques) yield a significant reduction in execution time and energy consumption, which is key to increasing the practical interest of the considered algorithms. Indeed, for RX the runtime obtained is lower than the data acquisition time when real hyperspectral images are used. Our experimental results also indicate that the proposed optimizations and parallelization techniques can significantly improve the overall performance of the RX and LRX algorithms while retaining their anomaly detection accuracy.
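For orientation, the global RX detector scores each pixel by its Mahalanobis distance to the scene-wide background mean and covariance. The following is a minimal NumPy sketch of that idea, not the parallel implementation described in this work; the function name `rx_detector` and the use of a linear solve in place of an explicit matrix inversion (one example of a computationally inexpensive solver) are our own illustrative choices.

```python
import numpy as np

def rx_detector(cube):
    """Global RX anomaly scores for a hyperspectral cube of shape
    (rows, cols, bands): the Mahalanobis distance of each pixel
    spectrum to the global background statistics."""
    rows, cols, bands = cube.shape
    X = cube.reshape(-1, bands).astype(np.float64)
    mu = X.mean(axis=0)                  # background mean spectrum
    Xc = X - mu                          # centered pixel spectra
    # Sample covariance matrix of the background (bands x bands)
    K = Xc.T @ Xc / (Xc.shape[0] - 1)
    # Solve K * Y = Xc^T rather than forming K^{-1} explicitly
    Y = np.linalg.solve(K, Xc.T)
    # Per-pixel quadratic form (x - mu)^T K^{-1} (x - mu)
    scores = np.einsum('ij,ji->i', Xc, Y)
    return scores.reshape(rows, cols)
```

LRX follows the same computation, but estimates the mean and covariance from a sliding window around each pixel, which is why it is far more expensive: a covariance matrix must be built and factorized per pixel rather than once per scene.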