Umit Catalyurek
Joel Saltz
Department of Computer Science and Engineering
Department of Biomedical Informatics
Ohio State University, Columbus OH 43210
Abstract. There is an increasing trend toward distributed and shared repositories for storing
scientific datasets. Developing applications that retrieve and process data from such repositories
involves a number of challenges. First, these data repositories store data in complex, low-level
layouts, which should be abstracted from application developers. Second, as data repositories are
shared resources, part of the computation on the data must be performed on a different set of
machines than the ones hosting the data. Third, because of the volume of data and the amount of