> [I expect he's thinking of the segmentation on x86 chips, which has
> rather bad performance. Segment register loads are very slow, and
> since there aren't very many seg registers, there were a lot of
> loads. -John]

At the time, I didn't worry about performance that much. In 1990 I
had an 80286 machine with 7MB RAM running OS/2 1.0. Everyone else had
DOS machines, maybe with EMS or XMS, but basically a 640K machine.

But every segment register load also loads 8 bytes of segment
descriptor. I believe it does that even if it is the same segment
selector. A small segment descriptor cache would have helped, but the
80286 was already a lot bigger than previous processors.

If you loop through an array in large model:

for(i=0;i<n;i++) x[i]=0;

every array element access normally requires a segment descriptor
load, though the compiler might move that out of the loop. (I suppose
this is the right group for that question.) If you did it indirectly:

for(i=0;i<n;i++) x[i][0]=0;

it requires one load per iteration even if the first is factored out,
and likely two.
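
To put rough numbers on that, here is a small C sketch that only
counts simulated segment register loads for the two loops. It is not
real 80286 code or compiler output: farptr, load_seg, and the loop
functions are names made up for illustration, and it assumes every
load fetches one 8-byte descriptor with no caching.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a 16:16 far pointer (selector:offset). */
typedef struct { uint16_t sel; uint16_t off; } farptr;

enum { DESCRIPTOR_BYTES = 8 };   /* assumed size of one descriptor fetch */

static unsigned long seg_loads;  /* simulated segment register loads */

/* Stand-in for LES/LDS: every call counts as one descriptor fetch,
   even when the selector is unchanged (assumption: no descriptor cache). */
static void load_seg(uint16_t sel) { (void)sel; seg_loads++; }

/* x[i]=0 with the selector reloaded on every access. */
unsigned long direct_naive(farptr x, unsigned n)
{
    unsigned i;
    seg_loads = 0;
    for (i = 0; i < n; i++)
        load_seg(x.sel);          /* the store to sel:(off + 2*i) would follow */
    return seg_loads;
}

/* x[i]=0 with the load moved out of the loop. */
unsigned long direct_hoisted(farptr x, unsigned n)
{
    seg_loads = 0;
    if (n > 0)
        load_seg(x.sel);          /* once, before the loop */
    return seg_loads;             /* the loop body needs no further loads */
}

/* x[i][0]=0: rows[i] is a far pointer read from a far table. */
unsigned long indirect(farptr table, const farptr *rows, unsigned n, int hoist)
{
    unsigned i;
    assert(rows != NULL || n == 0);
    seg_loads = 0;
    if (hoist && n > 0)
        load_seg(table.sel);      /* table selector kept in its own register */
    for (i = 0; i < n; i++) {
        if (!hoist)
            load_seg(table.sel);  /* needed to read rows[i] */
        load_seg(rows[i].sel);    /* needed to store through rows[i] */
    }
    return seg_loads;
}

/* Descriptor traffic implied by a given number of loads. */
unsigned long descriptor_bytes(unsigned long loads)
{
    return loads * DESCRIPTOR_BYTES;
}
```

In this model the direct loop drops from n loads to one when the load
is hoisted, while the indirect loop still does n+1 loads with the
table selector hoisted and 2n without.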

When the 80386 came out with the huge 4GB segments, everyone forgot
about segmentation. Who would ever need more than 4GB?

The Pentium Pro came out with a 36-bit physical address space, but no
good way to use it. Addresses came through a 32-bit paging MMU and
there wasn't much you could do about it. If they had done that right,
including a segment descriptor cache, the move to 64 bits might have
been delayed for years. (Even now with 64 bits everywhere, 64GB RAM
is still very rare.)

Maybe Intel hoped people would switch to Itanium.

-- glen
[None of the compilers I used were smart enough to factor out the
segment loads, although I believe that some had hacks to tell the
compiler that addresses were all relative to the same segment. -John]