Founded in 1993, NVIDIA has delivered over the last six years (since the release of their first 3D card in 1995) a succession of 3D graphics cards, each more powerful than the last. Since the introduction of the TNT graphics chip in 1998, NVIDIA has become the undisputed worldwide giant of 3D chipmaking: they buried all their competitors and even recently acquired 3DFX, the last survivor capable of offering a viable alternative to NVIDIA's supremacy. Still, they didn't earn their king-of-the-hill status easily. For years now, gamers have had eyes only for NVIDIA GPUs, which are known for their amazing power and the quality of their drivers. That matters because, when you buy a graphics card, the drivers are the most important thing after the chip itself if you want to get the most out of your purchase. When NVIDIA introduced the first GeForce, the GeForce 256, it gave it the sweet designation of GPU, for Graphics Processing Unit. They were certainly right to name their graphics chips that way, since the GeForce 3 contains 57 million transistors against only 42 million for the Pentium 4 processor! That figure alone hints at the power you can expect from the GeForce 3 Titanium 500. The GeForce 3 Titanium 500 GPU arrived in October 2001, only a few months after the launch of the GeForce 3, and its release came as something of a surprise since it wasn't really expected. Many analysts consider the GeForce 3 Titanium 500 to be NVIDIA's answer to the launch of the ATI Radeon 8500. The GeForce 3 Titanium 500 is to the GeForce 3 what the GeForce 2 Ultra was to the GeForce 2: NVIDIA's chance to demonstrate that they are still king of the hill and can easily beat ATI and its Radeon line of cards. Below are the features of the GeForce 3 Titanium 500:

256-bit GPU fabricated on a 0.15µ process
57 million transistors
960 billion operations per second
16 AA samples per clock - 3.8 billion AA pixels/sec
4 pixel pipelines
2 simultaneous textures per pixel
4 active textures max per pixel per pass
36 simultaneous pixel shading operations per pass
128 vertex instructions per pass
GPU clocked at 240 MHz
DDR memory clocked at 250 MHz
64 MB of onboard DDR memory using a 128-bit interface
8.0 GB per second memory bandwidth
DVD motion compensation technology

We couldn't resist removing the heatsink covering the GPU of the Hercules 3D Prophet III Titanium 500, and underneath we found a GeForce 3 branded GPU carrying the revision number 'A5'. Although the GPU looks like the first GeForce 3 GPU and is still a 0.15µ part, the manufacturing process has been refined: the GeForce 3 Titanium 500 is fabricated with TSMC's high-performance 0.15µ process, so the GPU gives off less heat. NVIDIA's engineers were thus able to raise the GPU frequency by 40 MHz compared to the GeForce 3. The DDR memory is now clocked at 250 MHz, which gives the card a much better memory bandwidth of 8.0 GB per second, against 7.36 GB per second for the GeForce 3. The card's PCB now has 8 layers and a revamped power supply. The GeForce 3 Titanium 500 doesn't include any new features: it uses all the technologies previously introduced with the GeForce 3, such as the Light Speed Memory Architecture that optimizes memory bandwidth, as well as the brand new graphics engine called nFinite FX that we'll review in detail. The only difference is the native enabling of the Shadow Buffers & 3D Textures features, which were already present in the GeForce 3 but disabled by the drivers. If you have already read the 3D Prophet III review, you can skip the technical part of this review (it is the same as for the Prophet III) and jump directly to the 'Newly GeForce 3 Titanium 500 Enabled Features' section.
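
Those bandwidth figures follow directly from the memory clock and the 128-bit bus width. Here is the back-of-the-envelope arithmetic; note that the 230 MHz memory clock for the original GeForce 3 is inferred from its 7.36 GB/s rating rather than quoted in this article:

```python
# Back-of-the-envelope check of the bandwidth figures quoted above.
BUS_WIDTH_BYTES = 128 // 8  # 128-bit memory interface = 16 bytes per transfer

def ddr_bandwidth_gb_s(memory_clock_mhz):
    """DDR transfers data on both clock edges, so the effective rate is twice the clock."""
    return memory_clock_mhz * 2 * BUS_WIDTH_BYTES / 1000  # MB/s -> GB/s

print(ddr_bandwidth_gb_s(250))  # GeForce 3 Ti 500: 250 MHz DDR -> 8.0 GB/s
print(ddr_bandwidth_gb_s(230))  # GeForce 3 (230 MHz inferred) -> 7.36 GB/s
```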

Light Speed Memory Architecture

As we said before, the GeForce 3 chip's specifications don't differ much from the GeForce 2 Ultra's, and most of you will already have noticed that the 960 Mpixels/s fill rate of the GeForce 3 Ti 500 is actually lower than that of the GeForce 2 Ultra (1 Gpixel/s). While the fill rates announced by 3D chipmakers are typically never reached in practice, that is no longer the case thanks to the new architecture the GeForce 3 carries. With the GeForce 3, most of the changes are under the hood! The new Light Speed Memory Architecture aims to optimize memory bandwidth for a better and more realistic gaming experience. This new architecture includes three new technologies answering to the sweet names of 'CrossBar', 'Z-Occlusion Culling' and 'Lossless Z Compression'.
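
The fill-rate comparison is itself simple arithmetic; the GeForce 2 Ultra numbers below (250 MHz core, 4 pixel pipelines) are recalled from its spec sheet rather than taken from this article:

```python
def fill_rate_mpixels_s(core_clock_mhz, pixel_pipelines):
    """Peak fill rate = core clock x pixel pipelines (one pixel per pipeline per clock)."""
    return core_clock_mhz * pixel_pipelines

print(fill_rate_mpixels_s(240, 4))  # GeForce 3 Ti 500: 960 Mpixels/s
print(fill_rate_mpixels_s(250, 4))  # GeForce 2 Ultra: 1000 Mpixels/s, i.e. 1 Gpixel/s
```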

CrossBar

The GeForce 3 GPU comes with a new memory controller, called CrossBar, whose main task is to largely compensate for the chip's lower fill rate by avoiding wasted bits and significantly reducing latency, which helps it outperform the GeForce 2 Ultra despite that deficit. Traditionally, a GPU uses a 256-bit memory controller that can only transfer data in 256-bit chunks. So if a triangle is only one pixel in size, it requires a 32-byte memory access when only 8 bytes are in fact needed: 75% of the memory bandwidth is wasted by this process! That's why NVIDIA intelligently solved the problem by implementing the new CrossBar controller. Unlike yesterday's GPUs, the CrossBar controller has four independent 64-bit memory sub-controllers that each handle a 64-bit block per clock, for a combined total of 256 bits (they can also be ganged together to handle a full 256-bit transfer). This new memory controller is the key to better memory management, and it answers the needs of today's game developers: 3D scenes keep getting more complex, and the number of triangles per frame has increased considerably in recent games, which means more and more tiny triangles. Compared to a traditional memory controller, the CrossBar cuts average latency by around 25%, so any 3D application can take advantage of this marvel of technology. According to NVIDIA, the CrossBar controller can speed up memory accesses by up to four times: that is obviously the case when the data to be read or written is only 64 bits wide, but such a situation is far from an everyday occurrence.
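
To make the gain concrete, here is a deliberately simplified toy model, not NVIDIA's actual controller logic: a monolithic 256-bit controller spends one full transfer per request however small it is, while four 64-bit sub-controllers can serve four small requests per clock.

```python
# Toy model of the waste described above; request sizes and controller behaviour
# are illustrative only, not NVIDIA's actual implementation.

def monolithic_clocks(request_sizes_bits):
    # A single 256-bit controller burns one full 256-bit transfer per request,
    # however small the request is.
    return len(request_sizes_bits)

def crossbar_clocks(request_sizes_bits):
    # Four independent 64-bit sub-controllers: up to four 64-bit slots per clock.
    slots = sum(max(1, bits // 64) for bits in request_sizes_bits)
    return -(-slots // 4)  # ceiling division across the 4 sub-controllers

# A burst of tiny one-pixel triangles, each needing only 64 bits (color + Z):
small_requests = [64] * 8
print(monolithic_clocks(small_requests))  # 8 clocks, 75% of every transfer wasted
print(crossbar_clocks(small_requests))    # 2 clocks -> the "up to 4x" best case
```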

GeForce 3 Ti 500 CrossBar Controller

Z-Occlusion Culling

I'm pretty sure you're wondering what on earth Z-Occlusion Culling is. The fact is that the name of this new technology isn't clear at all, but behind the complex name lies a very simple idea for boosting 3D performance. Just like the old PowerVR chips from NEC or the recent Kyro 2, the Z-Occlusion Culling technology featured in the GeForce 3's new Light Speed Memory Architecture is in fact a form of HSR (hidden surface removal). Everyone knows that when a 3D scene is rendered by the GPU, all the pixels are normally calculated, even those that (for one reason or another) end up hidden behind a previously rendered pixel by the time the scene is finally displayed. The purpose of Z-Occlusion Culling is to avoid calculating the pixels that would be hidden, so they are never processed by the pixel shader, saving up to 50% of the bandwidth in current games. To get the best results from Z-Occlusion Culling, though, the 3D application should ideally sort the objects in its scene (roughly front to back) before sending them to the 3D chip.
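
For readers who like to see the principle in code, here is a minimal depth-test-before-shading sketch; it illustrates the general idea of rejecting occluded pixels before they reach the shader, not NVIDIA's hardware implementation:

```python
# Minimal depth-test-before-shading sketch; illustrative only, not NVIDIA's hardware.
W, H = 4, 4
z_buffer = [[float("inf")] * W for _ in range(H)]   # "infinitely far" everywhere
color_buffer = [[0] * W for _ in range(H)]

def render_fragment(x, y, z, shade):
    """Reject occluded fragments before the (expensive) shading step."""
    if z >= z_buffer[y][x]:            # something nearer is already there
        return False                   # occluded: no shading, no memory writes
    z_buffer[y][x] = z
    color_buffer[y][x] = shade(x, y)   # only visible fragments reach the "pixel shader"
    return True

# Submitting geometry roughly front to back maximizes the number of early rejections:
print(render_fragment(1, 1, 0.2, lambda x, y: 0xFF0000))  # True  - visible, shaded
print(render_fragment(1, 1, 0.8, lambda x, y: 0x00FF00))  # False - hidden, skipped
```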

Lossless Z Compression

This new compression process concerns the Z value of a pixel (where Z stands for the pixel's depth in the 3D scene). Usually, when a scene is displayed, the Z value (coded on 16, 24 or 32 bits) determines whether a pixel should be visible or not. The more beautiful and realistic games become, the more numerous the depth values get, clogging up the memory. Just like on ATI's Radeon chips, the GeForce 3's Lossless Z Compression reduces the amount of z-buffer bandwidth required by compressing the depth data stream by a factor of 4:1. Although NVIDIA doesn't detail the algorithm used by Lossless Z Compression, it can in theory reduce z-buffer memory accesses by 75%. Obviously the compression is lossless and doesn't alter the way scenes are displayed.
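
Since NVIDIA keeps the algorithm to itself, the following is a purely hypothetical sketch of how a lossless z-tile compressor could approach 4:1: depth varies smoothly across a surface, so a base value plus small per-pixel deltas packs far tighter than raw 32-bit samples.

```python
# Purely hypothetical sketch of lossless z-tile compression; NVIDIA does not
# document its real algorithm. Depth varies smoothly across a surface, so a base
# value plus small deltas usually needs far fewer bits than raw 32-bit samples.

def compress_z_tile(tile):
    """Delta-encode a tile of integer depth samples against its first value."""
    base = tile[0]
    deltas = [z - base for z in tile]
    bits_per_delta = max(d.bit_length() + 1 for d in deltas)  # +1 sign bit
    return base, bits_per_delta, deltas

def decompress_z_tile(base, _bits_per_delta, deltas):
    return [base + d for d in deltas]  # exact reconstruction: the scheme is lossless

tile = [1_000_000 + 3 * i for i in range(16)]          # a smooth 4x4 depth gradient
base, bits, deltas = compress_z_tile(tile)
raw_bits, packed_bits = 32 * len(tile), 32 + bits * len(tile)
print(raw_bits, packed_bits, round(raw_bits / packed_bits, 1))  # 512 144 3.6
print(decompress_z_tile(base, bits, deltas) == tile)            # True
```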