My current research focuses on parallel-prefix scan, its variants, and their efficient implementation on graphics hardware. The goal is to use scan as a primitive for developing fast parallel sort and sparse solvers. I am also developing algorithms to build data structures that store sparse data. The two avenues I have explored are using a fast parallel sort to build Bounding Volume Hierarchies and developing a very fast algorithm that builds a perfect hash table on data-parallel hardware. I am releasing all my work as part of CUDPP, the data-parallel primitives library.

I have also worked with Aaron Lefohn on rendering techniques and parallel algorithms for graphics processors (GPUs), which currently enable interactive film preview and will enable high-quality graphics for games in the near future. During 2005 and 2006, I worked on generating high-quality shadows for dynamic scenes at interactive rates on GPUs. This work also highlighted the importance of basic parallel algorithms in high-quality rendering. We have developed the fastest known implementation of the parallel scan algorithm on the GPU and have identified the need for fast implementations of other key algorithms. I also worked with Aaron on Glift, which provides the multi-dimensional hierarchical data structure needed to generate high-quality shadows.

[ past life ]

In my past life, I worked in the SunONE Application Server group at Sun Microsystems. I was there for four years and through two generations of the product, before the call of graphics proved irresistible. My non-graphics work also included StorageTek's (now part of Sun Microsystems) REELs tape library management software. I also did a short stint in Tokyo, designing a large-scale online system.