I focus on systems and algorithms for large-scale, data-intensive computing. Below is
a list of open-source projects that I work on:

Tachyon: A memory-centric
storage system enabling reliable file sharing at memory speed across cluster frameworks such as
Spark and MapReduce. The project is open source and is deployed at multiple companies. It has more
than 50 contributors from over
20 institutions, including Yahoo, Intel, Red Hat, and Pivotal. [SOCC 13] [Github] [Meetup]

Shark: A high-speed query
engine that runs Hive SQL queries on top of Spark and supports fault recovery and complex analytics
(e.g., machine learning). I contributed to its integration with Tachyon. [Github]