How to Leverage the Decimal Floating-Point Unit on POWER6 for Linux

This page gives software developers the steps needed to take advantage of the decimal floating-point unit (DFU) available on IBM POWER6 processor-based systems running Linux. We first give a brief introduction to decimal floating-point (DFP) technology, and then explain how to use the compilers to leverage DFP.

Today, two packages provide compilers that can exploit the decimal floating-point functionality on POWER6-based Linux systems: the Advance Toolchain and the IBM XL C/C++ compilers.

Newer versions of the Advance Toolchain are now available, in particular version 2.1-1. An advantage of this release is that official support can be made available for it, and POWER7 exploitation features are being introduced.

The information on this page applies to both Red Hat Enterprise Linux (RHEL) and SUSE Linux Enterprise Server (SLES) 10. The examples below were carried out on a system running RHEL 5.2. Java applications can also take advantage of the decimal floating-point enhancements, but that is beyond the scope of this article; see the references at the end of this page for more information on Java exploitation.

Decimal Floating-Point

Decimal (everyday base 10) data is widely used in commercial and financial applications. However, most computer systems provide only binary (base 2) arithmetic, using 0 and 1 to represent numbers. Computers use two binary number systems: integer (fixed-point) and floating-point. Unfortunately, decimal calculations cannot be carried out exactly in binary floating-point. For example, the value 0.1 requires an infinitely recurring binary fraction, while a decimal number system represents it exactly as one tenth. As a result, binary floating-point cannot guarantee the same results as decimal arithmetic.
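As a minimal sketch of this representation problem, the following C functions add a value repeatedly in binary double precision. The function names are illustrative and do not come from any toolchain, and the behavior described assumes IEEE 754 doubles, which is what mainstream Linux platforms use.

```c
#include <stdio.h>

/* Adds `step` to a running total `count` times using binary doubles.
 * `volatile` forces each partial sum to be rounded to double precision,
 * so the result does not depend on wider intermediate registers. */
double repeated_sum(double step, int count)
{
    volatile double total = 0.0;
    for (int i = 0; i < count; i++)
        total += step;
    return total;
}

/* Prints the value actually stored for the literal 0.1 with extra digits,
 * and reports whether ten additions of 0.1 compare equal to 1.0. */
void show_stored_tenth(void)
{
    printf("0.1 is stored as %.20f\n", 0.1);
    printf("ten additions of 0.1 %s 1.0\n",
           repeated_sum(0.1, 10) == 1.0 ? "equal" : "do not equal");
}
```

Calling `show_stored_tenth()` from `main` on such a system reports that ten additions of 0.1 do not equal 1.0. By contrast, `repeated_sum(0.5, 2)` is exact, because 0.5 is a power of two.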

In general, decimal floating-point operations have been emulated with binary fixed-point integers. Decimal numbers are traditionally held in a binary-coded decimal (BCD) format. While BCD provides sufficient accuracy for decimal calculation, it imposes a heavy cost in performance because it is usually implemented in software.
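To illustrate where that cost comes from, here is a small sketch of packed BCD handled in software, with one decimal digit per 4-bit nibble. The helper names `to_packed_bcd` and `bcd_add` are hypothetical, and real decimal libraries are considerably more elaborate. The point is only that every addition walks the digits one at a time and propagates carries by hand.

```c
#include <assert.h>
#include <stdint.h>

/* Converts a binary integer (at most 8 decimal digits) to packed BCD,
 * storing one decimal digit in each 4-bit nibble. */
uint32_t to_packed_bcd(uint32_t value)
{
    assert(value <= 99999999u);   /* only 8 nibbles fit in 32 bits */
    uint32_t bcd = 0;
    for (int shift = 0; shift < 32 && value != 0; shift += 4) {
        bcd |= (value % 10) << shift;
        value /= 10;
    }
    return bcd;
}

/* Adds two 8-digit packed BCD numbers digit by digit.
 * A carry out of the most significant digit is discarded. */
uint32_t bcd_add(uint32_t a, uint32_t b)
{
    uint32_t result = 0;
    unsigned carry = 0;
    for (int shift = 0; shift < 32; shift += 4) {
        unsigned digit = ((a >> shift) & 0xF) + ((b >> shift) & 0xF) + carry;
        carry = digit > 9;        /* decimal carry into the next digit */
        if (carry)
            digit -= 10;
        result |= (uint32_t)digit << shift;
    }
    return result;
}
```

For example, `to_packed_bcd(1234)` yields `0x1234`, and `bcd_add(0x0199, 0x0001)` yields `0x0200`.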

IBM POWER6 processor-based systems provide hardware support for decimal floating-point arithmetic. The POWER6 microprocessor core includes a decimal floating-point unit that accelerates decimal floating-point arithmetic. The IBM POWER instruction set was expanded with 54 new instructions to support the decimal floating-point unit architecture.

Next, we show how developers can exploit decimal floating point math on Linux.

The Advance Toolchain version 1.1-0

The Advance Toolchain is a set of free-software development tools that lets users take early advantage of IBM's latest hardware features: (1) POWER6 enablement and exploitation, (2) system libraries optimized for ppc970, POWER4, POWER5, POWER5+, POWER6, and POWER6x, and (3) decimal floating-point capability.

The Advance Toolchain is a self-contained toolchain that does not rely on the base system toolchain to operate, and is in fact designed to coexist with the toolchain shipped with the operating system. That is, you do not have to uninstall the regular GCC compilers that come with your Linux distribution in order to use the Advance Toolchain.

The recommended installation method is to use YaST or YUM, because these tools verify the authenticity of the packages. Please consult the Advance Toolchain Release Notes for detailed instructions. In our experience, installing the packages directly with rpm also works fine.

The following GCC compiler options for the Advance Toolchain relate to decimal floating point:

-mno-dfp: instructs the compiler to call library functions to handle decimal floating-point computation, regardless of the architecture level. Software emulation may degrade performance.
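A minimal sketch of the kind of code these options affect is shown below. It assumes a GCC build with decimal floating-point support, such as the one the Advance Toolchain is described as providing, and uses the `_Decimal64` type with the `DD` literal suffix from the C decimal floating-point extension. The function names are illustrative. The same source is expected to compile whether the compiler emits hardware DFP instructions or library calls; the difference should be in performance rather than in results.

```c
/* Sums the decimal literal 0.1 `count` times in 64-bit decimal
 * floating point. 0.1DD is represented exactly, unlike the double 0.1. */
_Decimal64 decimal_repeated_sum(int count)
{
    _Decimal64 total = 0.0DD;
    for (int i = 0; i < count; i++)
        total += 0.1DD;
    return total;
}

/* Returns 1 when ten additions of 0.1 give exactly 1.0, otherwise 0. */
int decimal_sum_is_exact(void)
{
    return decimal_repeated_sum(10) == 1.0DD;
}
```

The result is reported as an integer flag because printing decimal values with `printf` generally needs additional library support for decimal conversion specifiers, which this sketch does not assume.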

IBM XL C/C++ Compilers

IBM XL C/C++ Advanced Edition for Linux is a standards-based compiler with advanced optimization features for select Linux distributions running on POWER-based systems. It is not free, but a 60-day trial program is available if you would like to evaluate it.

To try out DFP functionality with the IBM XL compiler, you first need to install the Advance Toolchain and then configure the IBM compiler to use it. The Advance Toolchain provides the runtime support for DFP.

Assuming that you installed the Advance Toolchain and the IBM compiler at their default locations, configure the IBM compiler by executing the following command:

The following compiler options for the IBM XL compilers relate to decimal floating point:

-qdfp : enables decimal floating-point support. Specifically, this option makes the compiler recognize decimal floating-point literal suffixes and the _Decimal32, _Decimal64, and _Decimal128 keywords.

-qfloat=dfpemulate : instructs the compiler to call library functions to handle decimal floating-point computation, regardless of the architecture level. Software emulation may degrade performance.

-qfloat=nodfpemulate : instructs the compiler not to use software emulation for decimal floating-point computation. This is the default when -qarch=pwr6 or -qarch=pwr6e is specified.
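The keywords and literal suffixes that `-qdfp` enables can be sketched as follows. The suffixes `DF`, `DD`, and `DL` are the ones the C decimal floating-point extension defines for `_Decimal32`, `_Decimal64`, and `_Decimal128`. The function names and the 5% rate are made up for illustration, and the block assumes a compiler with DFP support enabled.

```c
#include <stddef.h>

/* One literal of each decimal type, written with its own suffix. */
_Decimal32  rate_as_decimal32(void)  { return 1.05DF; }
_Decimal64  rate_as_decimal64(void)  { return 1.05DD; }
_Decimal128 rate_as_decimal128(void) { return 1.05DL; }

/* Applies a 5% surcharge in 64-bit decimal arithmetic. */
_Decimal64 add_surcharge(_Decimal64 net)
{
    return net * rate_as_decimal64();
}

/* Returns 1 when 0.70 plus 5% is exactly 0.735, the result a person
 * computing by hand in base 10 would expect, otherwise 0. */
int surcharge_is_exact(void)
{
    return add_surcharge(0.70DD) == 0.735DD;
}

/* Storage size in bytes of each of the three decimal types. */
size_t decimal32_size(void)  { return sizeof(_Decimal32); }
size_t decimal64_size(void)  { return sizeof(_Decimal64); }
size_t decimal128_size(void) { return sizeof(_Decimal128); }
```

The three types occupy 4, 8, and 16 bytes respectively, so the choice between them is a trade-off between storage and the number of significant decimal digits available.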

Next, we show how to use OProfile to determine whether your code is really using the decimal floating-point unit. OProfile uses hardware performance counters to profile all running programs with little overhead. In addition to event-based profiling, OProfile can also provide basic time-based profiling. At the time of this writing, the latest version of OProfile (v0.9.3) supports DFU-related events on POWER6. Those events can be found in the following groups: Group 89 pm_dfu and Group 90 pm_dfu2.

Summary

Decimal numbers are widely used in commercial and financial applications. Software support for DFP is generally available today but suffers from poor performance. The decimal floating-point unit provides hardware support for decimal floating-point arithmetic on POWER6 processor-based systems. Two compiler packages are available on Linux to exploit this feature: the IBM XL C/C++ compilers and the Advance Toolchain. This hardware support will in general give you a performance boost; how large that boost is depends on the nature of your application.