
Earlier this year, Caustic Professional began shipping the first dedicated ray tracing cards in more than a decade. As we discussed in our coverage last year, ray tracing is capable of producing some stunning visual effects. Multiple companies, including Nvidia, offer ray tracing plugins for professional workstations, but the OptiX solution we covered then runs on general-purpose GPUs rather than on dedicated ray tracing hardware. Caustic Professional’s R2100 and R2500 add-in boards, on the other hand, serve as dedicated hardware accelerators.

Meet the RTU

The R2500 uses a pair of RT2 ray trace units (RTUs). Each RTU can calculate up to 50 million incoherent rays per second, and each chip has 8GB of dedicated DDR2 RAM (16GB total). There’s also a smaller card, the single-RTU R2100, with 4GB of on-board memory. The RT2 is a custom ASIC built on a 90nm process, but the card draws comparatively little power. Caustic, which is owned by Imagination Technologies (the company that makes the PowerVR line of mobile GPUs), specs the R2500 at a maximum of 65W and the R2100 at 30W.
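To put the 50 million rays per second figure in perspective, here is a rough sketch of the peak ray budget per card. The per-RTU rate is the one quoted above; the 1920x1080 resolution and the 10 rays per pixel are illustrative assumptions of mine, not Caustic figures, and the calculation ignores shading and every other cost.

```python
# Back-of-the-envelope ray budget for the R2100 and R2500.
# The per-RTU rate comes from the article; resolution and rays per pixel
# are illustrative assumptions, not Caustic figures.

PEAK_RAYS_PER_RTU = 50_000_000  # incoherent rays per second, per RT2 chip


def peak_rays_per_second(rtu_count):
    """Aggregate peak ray rate, assuming the RTUs scale perfectly."""
    return rtu_count * PEAK_RAYS_PER_RTU


def peak_frames_per_second(rtu_count, width, height, rays_per_pixel):
    """Upper bound on frame rate if ray casting were the only cost."""
    rays_per_frame = width * height * rays_per_pixel
    return peak_rays_per_second(rtu_count) / rays_per_frame


if __name__ == "__main__":
    for name, rtus in (("R2100", 1), ("R2500", 2)):
        fps = peak_frames_per_second(rtus, 1920, 1080, rays_per_pixel=10)
        print(f"{name}: {peak_rays_per_second(rtus):,} rays/s, "
              f"at most {fps:.2f} frames/s at 1080p with 10 rays per pixel")
```

Under those assumptions the R2500 tops out at just under five frames per second, which is consistent with a card aimed at faster previews and offline rendering rather than real-time graphics.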

Looking at the R2500, I’m struck by how much it resembles a 3dfx Voodoo 5 5500. Not so much the 6000 — I owned a working V5 6000 once upon a time, and that card was absolutely covered in copper traces. But the long, thin board, the rows and rows of memory — for a card that’s meant to usher in the future, it conjures a lot of memories.

Part of what keeps the R2100 and R2500 relatively svelte is that they offload object shading to the CPU rather than handling it on the card itself. Storing shader maps and texture data in system memory frees up resources on the card for the ray tracing calculations and model storage. Offloading such capabilities to the GPU could theoretically increase performance, but GPUs generally aren’t designed to share data in that fashion.

Here’s the ASIC itself: the RT2-ES1. It measures 11x10mm and is built on a 90nm process. For scale, the PLX bridge chip in the center of the card measures 27x27mm, so the RT2-ES1 is noticeably smaller. You can see the RAM arrays around the chip clearly, but the only visible traces run from the RT2-ES1 ASIC to the PLX 8632 bridge chip.

The PLX 8632 bridge chip is a PCI-Express 2.0 controller with a total of 32 lanes of PCIe connectivity. There’s an x8 connection for each of the two RTUs and an x16 link back to the main system bus.
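The lane split described above can be turned into theoretical bandwidth numbers. This is a minimal sketch that assumes the standard PCIe 2.0 signalling rate of 5 GT/s per lane with 8b/10b encoding; the link names in the dictionary are hypothetical labels, and real-world throughput will be lower once protocol overhead is counted.

```python
# Theoretical PCIe 2.0 bandwidth for the links on the PLX bridge.
# Signalling rate and 8b/10b encoding are standard PCIe 2.0 values;
# measured throughput is lower because of protocol overhead.

TRANSFERS_PER_SECOND = 5_000_000_000  # 5 GT/s per lane, per direction
ENCODED_BITS = 10                     # 8b/10b: 10 bits on the wire...
PAYLOAD_BITS = 8                      # ...carry 8 bits of data

# Hypothetical labels for the three links described in the article.
LINKS = {"RTU 0": 8, "RTU 1": 8, "host": 16}


def lane_bytes_per_second():
    """Usable bytes per second on one lane, in one direction."""
    payload_bits = TRANSFERS_PER_SECOND * PAYLOAD_BITS // ENCODED_BITS
    return payload_bits // 8


def link_gb_per_second(lanes):
    """Theoretical one-way bandwidth of a link, in GB/s (10^9 bytes)."""
    return lanes * lane_bytes_per_second() / 1e9


if __name__ == "__main__":
    print(f"lanes in use: {sum(LINKS.values())}")
    for name, lanes in LINKS.items():
        print(f"{name}: x{lanes}, {link_gb_per_second(lanes):.1f} GB/s each way")
```

On paper, the two x8 RTU links add up to the same 8 GB/s per direction as the x16 host link, so the bridge is not oversubscribed when both RTUs talk to the system at once.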

Comments

looks like it’s still a long way to photo realistic ray traced games, but it is really exciting to actually see it on the horizon!

Abot13

can multiple cards be used to further speed it up? for an equal investment of 4000, 4 of these should be smoking :)

Joel Detrow

Perhaps they could in the future, but these obviously don’t have Crossfire capability ;)

Abot13

shouldnt need crossfire though

Joel Hruska

Right now, no, not officially. Since the CPU is also used for offload, I imagine the attempt to simultaneously juggle CPU-Caustic, Caustic-Caustic and Caustic Card #2 to CPU would be difficult over the PCIe bus.

Update: I actually did manage to test the difference between x16 and x8 rendering in the test scenes I had. I found no difference in rendering times when the card used a PCIe x16 slot vs. an x8 slot — but — that may very well have been scene-dependent.

Caustic says the x16 slot is optimal, and I’ll take their word on it. I think the penalty for going with x8 is small in most cases, but it could reasonably hit you for 5-10% in heavy ray tracing.

gavingreenwalt

Theoretically it should work. But you would only see a substantial improvement on scenes like AO where the shading isn’t the bottleneck. To the system each RTU simply shows up in the device manager as a unique processor. You can enable or disable each. I tried a dual card setup a year or so ago but it didn’t seem to help much. That might be a driver limitation or it might be a scene limitation, I didn’t have much time to experiment. ;)

osowskit

I’d love to try it if you know anyone with a 4 processor machine. :)

Joel Hruska

I’ve got a 16-core system here, but that’s still dual-socket and I don’t have enough PCIe lanes to even try such a thing.

CPU load when rendering is already ~100%. I suspect that scene settings would also need to ramp to show a scaling benefit.

That’s because they are ASICs, not even FPGAs. They are designed specifically for ray tracing; they are hardcoded for that and cannot be programmed to do anything else. ASICs trade versatility and programmability for performance; they are the ultimate specialists. I wonder how the performance will scale with each smaller fabrication node. If it scales near linearly, or even adds a 50% performance boost with each full new node, they have a winner.

Jamie MacDonald

I remember when I first discovered Ray Tracing on my old Dell while playing around with Blender.

Oh boy, a Pentium IV, 1GB RAM, and a Geforce 5200. It took a long time to render, but it was smexy.

Marble Shark

lol one ‘preview’ frame every 2+ minutes is hardly ‘realtime’…

zapper

Ray tracing is the holy grail of video games, and it may blur the line between reality and virtual reality almost totally. I will be happy at 30 FPS ray traced gaming. Let’s hope it comes out in our lifetime.

http://www.korioi.net/ Korios

Ray tracing has already been massively used in FMVs (full motion videos), but artists that render FMVs can do that while they take their time, and can employ tens of minutes or even hours of computing time for each individual frame. We then easily reproduce these prerendered videos just like we reproduce a movie.

Gameplay, on the other hand, is dynamic, and very few to no parts of it can be prerendered, so the rendering has to occur in real time. This is a huge feat compared to prerendered videos. Even if you spend only a minute per frame to create a ray traced prerendered video, real time rendering allows just 1/30 of a second per frame at 30 fps and 1/60 of a second per frame at 60 fps; that is, you have to do it 1,800 and 3,600 times faster, respectively.

The computing power for that feat does not even exist (outside of supercomputing environments), but these new ray tracing ASICs certainly shine a bright light towards that goal. We still have a long, long, long way to go, since these are intended for workstation environments: not for real time ray tracing, but for faster ray tracing prerendering. The likes of Pixar, ILM and Weta will love these babies.

berock212

Now this might not sound like much, but it’s built on 90nm manufacturing process. If it was 24nm or smaller, just imagine the performance.

Use of this site is governed by our Terms of Use and Privacy Policy. Copyright 1996-2015 Ziff Davis, LLC. PCMag Digital Group. All Rights Reserved. ExtremeTech is a registered trademark of Ziff Davis, LLC. Reproduction in whole or in part in any form or medium without express written permission of Ziff Davis, LLC is prohibited.