Seemingly out of nowhere -- well, South Korea actually -- a four-year-old startup recently burst onto the scene with a ray-tracing chip, the RayCore. The company was founded by Dr. Hyung Min Yoon, formerly at Samsung; Hee-Jin Shin from LG; Byoung Ok Lee from MtekVision; and Woo Chan Park from Sejong University.

What they demonstrated simultaneously in talks at Hot Chips and Siggraph was a proof of concept for an IP block, not a product. However it is delivered, it is an interesting and impressive piece of work. In fact, the company has been licensing its ray-tracing IP to OEM partners since 2011 and is currently working with a vendor of mobile apps processors on a next-generation SoC.

RayCore consists of two major components: ray-tracing units (RTUs) and a tree-building unit (TBU) for dynamic scenes. The MIMD execution model of the RTU's unified traversal and intersection (T&I) pipelines was chosen to meet power efficiency and silicon-area needs of mobile devices.
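SiliconArts has not published the internals of its T&I pipelines, but the core arithmetic any such pipeline must perform is the textbook ray-triangle intersection test. As a rough illustration, here is the standard Möller-Trumbore algorithm in plain Python (a software sketch only; the hardware obviously implements this very differently):

```python
# Möller-Trumbore ray-triangle intersection in plain Python: the basic
# test a unified T&I pipeline performs for every ray/triangle pair.
def intersect(orig, dirn, v0, v1, v2, eps=1e-8):
    sub = lambda a, b: [a[i] - b[i] for i in range(3)]
    dot = lambda a, b: sum(a[i] * b[i] for i in range(3))
    cross = lambda a, b: [a[1] * b[2] - a[2] * b[1],
                          a[2] * b[0] - a[0] * b[2],
                          a[0] * b[1] - a[1] * b[0]]
    e1, e2 = sub(v1, v0), sub(v2, v0)
    pvec = cross(dirn, e2)
    det = dot(e1, pvec)
    if abs(det) < eps:                 # ray parallel to triangle plane
        return None
    inv = 1.0 / det
    tvec = sub(orig, v0)
    u = dot(tvec, pvec) * inv          # first barycentric coordinate
    if u < 0.0 or u > 1.0:
        return None
    qvec = cross(tvec, e1)
    v = dot(dirn, qvec) * inv          # second barycentric coordinate
    if v < 0.0 or u + v > 1.0:
        return None
    t = dot(e2, qvec) * inv            # hit distance along the ray
    return t if t > eps else None
```

A MIMD design lets each pipeline run this test (or a tree-traversal step) independently per ray, rather than in SIMD lockstep, which is part of how the design keeps utilization high on divergent secondary rays.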

A look inside the RayCore device.

According to SiliconArts, its 28 nm evaluation ASIC, measuring 18 mm² with six RTUs, can achieve up to 239 Mrays/second while consuming just one watt. The TBU builds K-D trees for test models with up to 64,000 triangles, and the company says it can complete such a build in 20 ms.
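SiliconArts has not disclosed the TBU's exact construction algorithm, but a common software baseline is a recursive median split over triangle centroids along the widest axis. This toy Python sketch (an assumption, not the TBU's method) shows the shape of the work a hardware tree builder repeats every frame:

```python
# Toy K-D tree build over triangle centroids: recursively split the
# triangle set at the median along the widest axis until leaves are small.
def build_kdtree(tris, leaf_size=4):
    if len(tris) <= leaf_size:
        return {"leaf": tris}
    # centroid of each triangle (average of its three vertices)
    cents = [[(t[0][a] + t[1][a] + t[2][a]) / 3.0 for a in range(3)] for t in tris]
    # pick the axis with the largest centroid spread
    axis = max(range(3), key=lambda a: max(c[a] for c in cents) - min(c[a] for c in cents))
    order = sorted(range(len(tris)), key=lambda i: cents[i][axis])
    mid = len(order) // 2
    split = cents[order[mid]][axis]
    return {"axis": axis, "split": split,
            "left":  build_kdtree([tris[i] for i in order[:mid]], leaf_size),
            "right": build_kdtree([tris[i] for i in order[mid:]], leaf_size)}
```

Doing this for 64,000 triangles in 20 ms, per frame, alongside traversal traffic, is what makes a dedicated hardware unit attractive.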

SiliconArts uses a novel latency-hiding technique, called "looping for the next chance," to reduce the performance degradation caused by off-chip memory accesses. The technique runs on the T&I pipelines and works together with the TBU and texture mip-mapping. SiliconArts' benchmarks show the approach delivering real-time Whitted ray tracing with six RTUs and K-D tree construction with one TBU using less than 1.1 GByte/s of memory bandwidth, far less than the 12.8 GByte/s provided by today's mobile LPDDR3 memory.
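SiliconArts has not published the details of the scheme, but the name suggests a familiar idea: when the data a ray needs is not on chip, recirculate that ray to the back of the pipeline and keep working on others while the fetch completes. This small Python model (purely an assumed interpretation, not the actual hardware behavior) shows why such a loop hides memory latency instead of stalling:

```python
from collections import deque

# A software analogy for "looping for the next chance": a ray whose next
# tree node misses the on-chip cache is re-enqueued rather than stalling
# the pipeline; its off-chip fetch completes some cycles later.
def run_pipeline(rays, cache, fetch_latency=3):
    queue = deque(rays)       # (ray_id, node_needed) pairs
    pending = {}              # node -> cycles until its fetch completes
    done, cycles = [], 0
    while queue:
        cycles += 1
        for n in list(pending):           # outstanding fetches advance
            pending[n] -= 1
            if pending[n] == 0:
                cache.add(n)
                del pending[n]
        ray, node = queue.popleft()
        if node in cache:
            done.append(ray)              # data on chip: ray makes progress
        else:
            if node not in pending:       # issue the off-chip fetch once
                pending[node] = fetch_latency
            queue.append((ray, node))     # loop back for the next chance
    return done, cycles
```

In the model, a ray that misses simply gives up its slot, and an unrelated ray fills the pipeline bubble, which is the essence of the claimed bandwidth efficiency.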

Remember, these are test results using an FPGA. A tightly coupled IP block in a 22 nm or smaller SoC would run significantly faster.

The startup provides OpenGL ES 1.1-like API extensions to separate static and dynamic objects. Static objects are retained for subsequent frames while dynamic objects are transferred to the tree builder via vertex arrays to reconstruct dynamic sub-trees during each frame.
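The actual extension names and signatures are not public, but the static/dynamic split described above might look roughly like this hypothetical host-side sketch (every name here is invented for illustration):

```python
# Hypothetical sketch of an API separating static from dynamic geometry,
# in the spirit of the OpenGL ES 1.1-like extensions described above.
class Scene:
    def __init__(self):
        self.static_tree = None      # built once, retained across frames
        self.dynamic_tris = []       # re-submitted every frame

    def add_static(self, vertex_array):
        # static geometry: the acceleration tree is built once and reused
        self.static_tree = ("kdtree", tuple(vertex_array))

    def submit_dynamic(self, vertex_array):
        # dynamic geometry: vertex arrays handed to the tree builder per frame
        self.dynamic_tris.extend(vertex_array)

    def begin_frame(self):
        # dynamic sub-trees are reconstructed each frame from the submitted
        # vertex arrays; the static tree is reused untouched
        dynamic_tree = ("kdtree", tuple(self.dynamic_tris))
        self.dynamic_tris.clear()
        return self.static_tree, dynamic_tree
```

The payoff of the split is that the TBU's 20 ms rebuild budget is spent only on the geometry that actually moved.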

The most beautiful, physically accurate images will be constrained in their quality by the color gamut of the display. It's almost a chicken-and-egg problem: will the computer and mobile device builders (IHVs) add better screens if the content demands it, or will the content developers (ISVs) demand better screens before they will invest in higher-quality images? The breakthrough in that logjam will come from either party (IHVs or ISVs) looking for differentiation and taking a chance. Engines like those from SiliconArts and Caustic will empower them to go forward.

The company's chief strategy officer, Hee-Jin Shin, said the RayCore 1000 is now in verification, with hopes of shipping in September. A second-generation chip will support OpenGL ES 2.x and 3.x, which seems like a requirement for use in a mainstream Android phone these days.

Shin said in a phone interview:

"I believe the initial traction may come from simulation systems like military simulators, where the quality of the graphics is really low today. This chip can add value in visual quality. There will also be some commercial markets like digital signage and industrial use. It's easier to start in the commercial space and then go to the consumer space."