21.6. Checking for Hardware Errors

Red Hat Enterprise Linux 7 introduced the new hardware event report mechanism (HERM). This mechanism gathers system-reported memory errors as well as errors reported by the error detection and correction (EDAC) mechanism for dual in-line memory modules (DIMMs) and reports them to user space. The user-space daemon rasdaemon catches all reliability, availability, and serviceability (RAS) error events that come from the kernel tracing mechanism, handles them, and logs them. rasdaemon replaces the functions previously provided by edac-utils.

To install rasdaemon, enter the following command as root:

~]# yum install rasdaemon

Start the service as follows:

~]# systemctl start rasdaemon

To make the service run at system start, enter the following command:

~]# systemctl enable rasdaemon
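
To confirm that both steps took effect, you can query the service state with systemctl is-active and systemctl is-enabled. The following is a minimal sketch rather than part of rasdaemon: the helper name unit_status is hypothetical, and it assumes a POSIX shell on a system managed by systemd.

```shell
# Hypothetical helper (not shipped with rasdaemon): report whether a
# systemd unit is running now and enabled at system start.
unit_status() {
    us_unit=${1:-rasdaemon}

    # Bail out early on systems without systemctl.
    if ! command -v systemctl >/dev/null 2>&1; then
        echo "systemctl not found" >&2
        return 2
    fi

    # is-active and is-enabled print the state as a single word.
    us_active=$(systemctl is-active "$us_unit" 2>/dev/null)
    us_enabled=$(systemctl is-enabled "$us_unit" 2>/dev/null)

    echo "$us_unit: active=${us_active:-unknown} enabled=${us_enabled:-unknown}"

    # Succeed only if the unit is both running and enabled.
    [ "$us_active" = "active" ] && [ "$us_enabled" = "enabled" ]
}
```

After defining the function in a root shell, call it as unit_status rasdaemon. It prints one summary line and returns 0 only when the service is both running and enabled.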

The ras-mc-ctl utility provides a means to work with EDAC drivers. Enter the following command to see a list of command options: