Detecting Intel Processors

Knowing the generation
of a system CPU

by Robert R. Collins

The debate about the correct way to detect different generations of
Intel microprocessors has raged for years. In one corner are programmers
who traditionally used a series of PUSHF/POPF instructions to detect the
FLAGs differences between processors. In the other corner, it always seemed
I stood alone, arguing that this technique is flawed. The debate subsided
somewhat in 1989, when Intel published an algorithm that relied upon PUSHF/POPF
for microprocessor identification. But even while the naysayers said, "See,
even Intel does it our way," I stood in my little corner saying "Sure,
but it's wrong."

The truth is, neither algorithm is fail-safe. Intel's PUSHF/POPF method
can misdiagnose which processor family is running and is not guaranteed
to operate outside of real mode. My technique should always run in v86
mode, but sometimes doesn't because of shortcomings in the design of many
v86 memory managers, such as EMM386 from Microsoft.

Intel's Algorithm

All current-generation Intel x86 processors have an instruction called
CPUID that reads CPU identification information. This information can be
used by software to dynamically take advantage of processor-specific programming
techniques. Before CPUID, you needed to write an algorithm to detect differences
between different generations of processors. This algorithm would serve
much of the same purpose as executing the CPUID instruction. Intel didn't
invent the algorithm; the company borrowed one that was in wide distribution
on the Internet, and published it in the i486 Microprocessor Programmer's
Reference Manual (Intel Corp. 1990), claiming "Copyright Intel
Corporation." Oddly, the original algorithm was published in two halves,
at opposite ends of the manual. Section 22.10 contained the algorithm
to detect the differences between 8086 through 80386. Figure 3-23 contained
the algorithm to detect the difference between the 80386 and 80486. The
latest edition of this manual removes the code fragments, referring you
to "AP-485, Intel Processor Identification With the CPUID Instruction,"
Order Number 241618 (ftp://ftp.intel.com/pub/IAL/software
specs/ap48504f.pdf).

AP-485 includes the following comment:

Please understand that the code sequences have been validated by
Intel to detect CPU ID, math coprocessor function, and initialize accordingly.
Any other approach may produce unpredictable results in future processors.

It's ironic that Intel claims that "any other approach may produce
unpredictable results," since its algorithm is prone to failures that
yield unpredictable results (as I'll demonstrate in this article). For
more information on CPUID, see the text box "Pentium Detection,"
by Robert Moote (which accompanied the article "Processor-Detection
Schemes," by Richard C. Leinecker, DDJ, June 1993).

The Intel algorithm relies on a series of PUSHF/POPF instructions to
set and clear various FLAGs bits. Each generation of processor has a slightly
different behavior which may be detected by this approach. This algorithm
makes no attempt to detect the 80186/88 series of processors. In this regard,
the algorithm is incomplete.

The 8086/88 is distinguished from the 80286 by attempting to clear bits
12 - 15 of the FLAGs register. The 8086/88 will always set these bits,
regardless of what values are popped into them (see Listing One). The 286
treats these bits differently. In real mode, these bits
are always cleared by the 286; in protected mode, they are used for IOPL
(I/O Privilege Level) and NT (Nested Task). To continue the detection code,
you need to set bits 12 - 15 in the FLAGs register, and see if they are
cleared by the processor. If they are, then a 286 has been detected (see
Listing Two).
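
The two FLAGs tests described above can be sketched as a small Python model.
This is not Intel's listing: the function names (popf_pushf, classify) and
the CPU labels are hypothetical, and the model covers only the real-mode
behavior of bits 12 - 15 as the text describes it, plus the assumption that
bit 15 always reads as zero on the 386 and later.

```python
# Simplified, hypothetical model of real-mode FLAGS behavior (bits 12-15 only).
HIGH_BITS = 0xF000  # bits 12-15 of the 16-bit FLAGS register


def popf_pushf(cpu, value):
    """Return the FLAGS word PUSHF would read back after POPF loads `value`."""
    value &= 0xFFFF
    if cpu == "8086":
        return value | HIGH_BITS            # bits 12-15 always read as set
    if cpu == "286":
        return value & ~HIGH_BITS & 0xFFFF  # real mode: bits 12-15 always cleared
    return value & 0x7FFF                   # assumed 386+: bit 15 reads as zero


def classify(cpu):
    # Step 1: try to clear bits 12-15; only the 8086/88 keeps them all set.
    if popf_pushf(cpu, 0x0000) & HIGH_BITS == HIGH_BITS:
        return "8086/88"
    # Step 2: try to set bits 12-15; the 286 clears them all in real mode.
    if popf_pushf(cpu, HIGH_BITS) & HIGH_BITS == 0:
        return "80286"
    return "80386 or later"


for cpu in ("8086", "286", "386"):
    print(cpu, "->", classify(cpu))
```

Note that nothing in this sequence distinguishes an 80186/88 from an 8086/88,
which matches the gap mentioned above.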

If you get to this point in the algorithm, you know you have at least a 386.
Therefore, it is safe to use 32-bit instructions, like PUSHFD. This will
be necessary in detecting the difference between a 386 and 486. These processors
are distinguished from each other by attempting to set the AC flag in the EFLAGs
register. This flag was introduced in the 486. The 386 never sets this
bit, and always clears it when it is set by POPFD. Therefore, to detect the
difference between these processor generations, the algorithm attempts to set
this bit to see if it is latched or cleared by the processor (see Listing
Three).
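
As a rough illustration of the AC-flag test, the following Python sketch
models the PUSHFD/POPFD round trip. The helper names (popfd_pushfd,
has_ac_flag) and the CPU labels are hypothetical; the AC flag is bit 18 of
EFLAGs, and the model assumes no interrupt disturbs the sequence.

```python
# Hypothetical model of the 386-versus-486 test using the AC flag.
AC_FLAG = 1 << 18  # Alignment Check flag, EFLAGS bit 18


def popfd_pushfd(cpu, value):
    """Return the EFLAGS value read back after POPFD loads `value`."""
    if cpu == "386":
        return value & ~AC_FLAG  # the 386 always clears the bit
    return value                 # the 486 and later latch it


def has_ac_flag(cpu, eflags=0x0002):
    """Flip AC, read EFLAGS back, and report whether the change stuck."""
    toggled = eflags ^ AC_FLAG
    readback = popfd_pushfd(cpu, toggled)
    return bool((readback ^ eflags) & AC_FLAG)


print("386:", has_ac_flag("386"))  # bit is cleared
print("486:", has_ac_flag("486"))  # bit is latched
```

Real detection code would also restore the original EFLAGs value afterwards,
so that alignment checking is not left enabled by accident.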

At this point in the algorithm, you're almost home. To detect the difference
between the 486 and the Pentium, you attempt to set another new EFLAGs bit
(bit 21) called the "ID flag." This flag has only one purpose:
to indicate the presence of the CPUID instruction. This bit was first
introduced on the Pentium, but later retrofitted into the 486. If the CPUID
instruction exists on either processor, it may be executed to return the
processor-identification information. 486s without the CPUID instruction
will not be able to toggle this bit. Therefore, it is safe to execute a
sequence of instructions on either processor that detects the processor's
ability to toggle this bit (see Listing Four).
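
The same idea extends to the ID flag. The sketch below is a hypothetical
model, not Intel's code: the WRITABLE table simply records which of the two
newer EFLAGs bits each kind of processor lets software toggle, following the
description above, and supports_cpuid reports whether bit 21 can be flipped.

```python
# Hypothetical model: which EFLAGS bits software can toggle on each CPU.
AC_FLAG = 1 << 18  # introduced with the 486
ID_FLAG = 1 << 21  # toggleable only when CPUID is present

WRITABLE = {
    "386": 0,
    "486 without CPUID": AC_FLAG,
    "486 with CPUID": AC_FLAG | ID_FLAG,
    "Pentium": AC_FLAG | ID_FLAG,
}


def can_toggle(cpu, bit, eflags=0x0002):
    """Flip `bit`, write EFLAGS, read it back, and see whether it changed."""
    wanted = eflags ^ bit
    mask = WRITABLE[cpu]
    readback = (eflags & ~mask) | (wanted & mask)  # unwritable bits keep old value
    return bool((readback ^ eflags) & bit)


def supports_cpuid(cpu):
    return can_toggle(cpu, ID_FLAG)


for name in WRITABLE:
    print(name, "-> CPUID available:", supports_cpuid(name))
```

Only when supports_cpuid is true is it safe to go on and execute CPUID
itself; on earlier processors the opcode is invalid.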

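Once the algorithm gets to this point, you can execute the CPUID instruction
to obtain the processor identification. This instruction can be run in
any processor mode, at any privilege level. On the Pentium and 486, the
CPUID instruction has two levels, selected by the value in EAX. Level 0
returns the highest supported level in EAX and the vendor-identification
string in EBX, EDX, and ECX. Level 1 returns the processor signature
(family, model, and stepping) in EAX and the feature flags in EDX.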

The Caveats

In spite of Intel's claim, this algorithm is far from perfect. For one
thing, it fails to detect the 80186/88 series of processors. Even though
this processor wasn't adopted by many PC manufacturers, it was used in
some computers, primarily notebooks. The 80186/88 processor contains
most of the new instructions and CPU-generated exceptions contained in
the 80286. These instructions include PUSHA/POPA, PUSH immed, SHL reg,
immed, and the invalid opcode exception. The only 80286 instructions
and exceptions not implemented in the 80186/88 are those specifically used
for protected mode. Failure to detect this processor could prohibit the
use of some software that can take advantage of these new instructions
and exceptions.

This algorithm is only designed to run in real mode, not in a virtual-8086
DOS box running under Windows. This limitation is even mentioned in the
486 manual. This results from the fact that PUSHF and POPF are privileged
instructions that are sensitive to the I/O Privilege Level while running
in protected mode. (DOS boxes, running under Windows, run in virtual-8086
mode - a special form of protected mode.) If IOPL is not equal to three,
then a general-protection fault occurs while attempting to execute these
instructions. The operating system then intervenes to emulate the instruction
as it sees fit. Therefore, there is no guarantee that the operating system
will mimic the real-mode behavior of the specific processor under test.
In reality, this may not be as big a problem as it sounds. Windows sets
IOPL equal to three for DOS boxes. This renders these instructions transparent
to the operating system, and they execute without generating a fault.

Not all operating systems with a DOS-compatibility box follow the example
set by Windows. OS/2 Warp uses a special form of virtual-8086 mode, called
Virtual Mode Extensions (VME). Running in VME affords the protection advantages
of running at IOPL=2 without incurring the faults generated by PUSHF/POPF
used in this algorithm. (See http://www.x86.org/vme1 for a discussion on
VME.) To accommodate this behavior, Intel modified the algorithms of PUSHF/POPF
to allow them to run in VME without faulting to the host operating system.
When IOPL<3, PUSHF always pushes an IOPL value of three onto the stack.
This doesn't cause any problems for the Intel algorithm, as none of the
detection code depends upon setting or clearing these two bits alone.

Should the CPUID instruction ever return a value with its high bit set (for
example, 80000001h), which is negative when interpreted as a signed number,
the Intel algorithm will fail. In Listing Five, the instruction above the
designated "<--" symbol is a conditional jump based on a signed comparison.
This is a common programming error which can easily be fixed in the Intel
algorithm by using an unsigned comparison.
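
The effect of the signed comparison can be shown with a short Python sketch.
Listing Five is not reproduced here, so the details are assumptions: the
sketch supposes that the value being tested is the maximum CPUID level
returned in EAX and that it is compared against 1. The function names are
hypothetical.

```python
# Hypothetical illustration of a signed versus unsigned 32-bit comparison.
def as_signed32(value):
    """Interpret a 32-bit register value as a two's-complement signed number."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


def level1_supported_signed(max_level):
    # Mimics a signed conditional jump (JL-style): high-bit values look negative.
    return not (as_signed32(max_level) < 1)


def level1_supported_unsigned(max_level):
    # Mimics an unsigned conditional jump (JB-style): the likely intended test.
    return not ((max_level & 0xFFFFFFFF) < 1)


for level in (0x00000001, 0x00000002, 0x80000001):
    print(hex(level),
          "signed:", level1_supported_signed(level),
          "unsigned:", level1_supported_unsigned(level))
```

With 80000001h, the signed test concludes that level 1 is unsupported, while
the unsigned test does not.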

This algorithm relies on undocumented processor behavior to detect the
differences between early generations of Intel processors. The use of such
programming tricks violates Intel's own recommendations. Consider the following
guidelines set forth in various Intel manuals:

Reserved Bits and Software Compatibility

Software should not try to identify features by exploiting programming
tricks, undocumented features, or otherwise deviating from the guidelines
presented in this application note.

When bits are marked as reserved, it is essential for compatibility
with future processors that software treat these bits as having a future,
though unknown, effect. The behavior of reserved bits should be regarded
as not only undefined, but unpredictable. Software should follow these
guidelines in dealing with reserved bits:

Do not use undocumented features of a processor to identify steppings
or features.

Do not depend on the states of any reserved bits when testing the
values of registers which contain such bits. Mask out the reserved bits
before testing.

Do not depend on the states of any reserved bits when storing to
memory or to a register.

Do not depend on the ability to retain information written into
any reserved bits.

When loading a register, always load the reserved bits with the
values indicated in the documentation, if any, or reload them with values
previously read from the same register.

These are strong guidelines set forth in Intel's documentation, and
the irony of Intel's algorithm is that it violates each and every one of
them. Detecting the difference between 8086/88 and 80286, and between
80286 and 80386, completely depends upon setting and clearing reserved
bits in the FLAGs register, and then depends on the state of those bits
when they are stored to a resultant register. Detecting the difference
between 386 and 486, and between 486 and Pentium, depends upon setting
an EFLAGs bit that is undefined on the previous-generation processor, then
depends on that processor to clear the undefined bit. To abide by Intel's
guidelines, the behavior of these undocumented FLAGs bits must be documented
in their respective manuals - but they aren't. None of these differences
are documented in any of the processors' respective data sheets. Processor
behavior often isn't documented until many years after release. The 8086
FLAGs behavior was first described in the 386 programmer's reference manual
in 1988 (nearly ten years after the 8086's introduction). The 80286 FLAGs
behavior wasn't described until the Pentium manuals were introduced in
1993 (ten years after the 80286 introduction, and four years after Intel
introduced this algorithm in the 486 manuals).

Even though Intel's algorithm violates all of its own guidelines, the
company is partially exonerated by the Pentium programmer's reference manual,
where Intel says that it's acceptable to use this algorithm to detect the
differences in these processors. However, the Pentium manual doesn't change
the prohibitions set forth in the 386 or 486 manuals; those prohibitions
still exist. The following excerpt was taken from the Pentium Programmer's
Reference Manual, chapter 5:

The setting of the flags stored by the PUSHF instruction, by interrupts,
and by exceptions is different on the 32-bit processors than that stored
by the 8086, and Intel 286 processors in bits 12 and 13 (IOPL), 14 (NT),
and 15 (reserved). These differences can be used to distinguish what type
of processor is present in a system while an application is running.

My biggest objection to this algorithm is that it's prone to failure
on all processors newer than a 386. When it fails, the algorithm incorrectly
determines that a 386 processor is installed in the system. The failure
is caused when an interrupt occurs precisely where the "<--" appears
in Listing Three. When this occurs, the AC flag is
cleared (in real mode), and the algorithm fails to detect the correct processor
type. The AC flag has always behaved in this manner, but the behavior wasn't
documented until the 1994 edition of the Pentium Programmer's Reference
Manual (chapter 25, description of INT instruction). There are
a few ways to demonstrate this failure (assuming you're running on a 486
or later processor). You can put an HLT instruction or an INT instruction
at the point designated by the "<--", or run the algorithm
in a loop. Eventually, a timer-tick interrupt will occur at this point.
Inserting an HLT instruction will force the processor to wait for an interrupt
before continuing. When the interrupt occurs, the AC flag will be cleared
during its invocation. Listing Six presents source
code to demonstrate this behavior.
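
The failure window can also be illustrated with a simplified Python model.
It is not Listing Six. It assumes a 486 or later in real mode, where the
interrupt pushes only the 16-bit FLAGs word, AC is cleared during invocation
(as described above), and the 16-bit IRET restores only the low word. The
function name detect and its parameter are hypothetical.

```python
# Simplified, hypothetical model of the AC-flag test being hit by an interrupt.
AC_FLAG = 1 << 18


def detect(interrupt_in_window):
    """Model the 386/486 test on a 486+, optionally interrupted mid-sequence."""
    eflags = 0x0002 | AC_FLAG          # POPFD has just latched AC
    if interrupt_in_window:
        saved_low = eflags & 0xFFFF    # real-mode INT pushes 16-bit FLAGS only
        eflags &= ~AC_FLAG             # AC is cleared during invocation
        # 16-bit IRET restores the low word; the upper bits stay as they are.
        eflags = (eflags & ~0xFFFF) | saved_low
    readback = eflags                  # PUSHFD reads EFLAGS back
    return "486 or later" if readback & AC_FLAG else "386"


print("no interrupt:  ", detect(False))
print("with interrupt:", detect(True))
```

In this model the interrupted run reports a 386 even though the processor
latched the AC flag, which is the misdiagnosis described above.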

Conclusion

The Intel algorithm isn't nearly as bad as it sounds. It has a few bugs
that can easily be fixed. Intel's intentions were noble, but its implementation
was flawed. In spite of its drawbacks, the reasons this algorithm is in
such widespread use are simple:

It's conveniently available and published by Intel.

It works - most of the time, even in v86 mode.

The biggest drawbacks are that it's not guaranteed to work outside of
real mode, and it depends upon undocumented processor behavior. It would
be nice if an algorithm existed to get the actual stepping information
of processors that didn't support the CPUID instruction, and didn't rely
on undocumented processor behavior. In my next column, I'll present such
an algorithm, discuss its strengths and weaknesses, and compare
the two algorithms under real operating conditions.