DiskRouter: A Flexible Infrastructure for High Performance Large Scale Data Transfers

George Kola, Miron Livny
2004

The steady growth in the data sets of scientific applications, the trend towards collaborative research, and the emergence of grid computing have created a need to move large quantities of data over wide-area networks. The dynamic nature of the network makes it difficult to tune data transfer protocols to use the full bandwidth. Further, data transfers are limited by the bottleneck link, and different links become the bottleneck at different times, resulting in under-utilization of the other network hops. To address these issues, we have designed a flexible infrastructure that uses hierarchical main memory and disk buffering at intermediate points to speed up transfers. The infrastructure supports application-level multicast to reduce network load and enables easy construction of application-level overlay networks to maximize bandwidth. It can perform dynamic protocol tuning and use higher-level knowledge, and it is being used in real life to successfully transfer several terabytes of astronomy images and educational research videos.
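
The idea of hierarchical main memory and disk buffering at an intermediate point can be illustrated with a minimal sketch. This is not DiskRouter's implementation: the class name `HierarchicalBuffer`, the `mem_capacity_bytes` parameter, and the spill policy are all assumptions made for illustration. The sketch holds incoming chunks in memory while they fit and spills the overflow to a temporary file, so a fast incoming link can keep delivering data while a slower outgoing link drains the buffer in arrival order.

```python
import tempfile
from collections import deque


class HierarchicalBuffer:
    """Hypothetical FIFO buffer: memory first, overflow spilled to disk."""

    def __init__(self, mem_capacity_bytes):
        self.mem_capacity = mem_capacity_bytes
        self.mem_used = 0
        self.mem_chunks = deque()    # oldest chunks, held in main memory
        self.disk_chunks = deque()   # (offset, length) of newer chunks on disk
        self.spill = tempfile.TemporaryFile()  # backing store for overflow
        self.spill_end = 0

    def put(self, chunk):
        # Once anything is on disk, newer chunks must also go to disk,
        # otherwise they would overtake older data and break FIFO order.
        fits = self.mem_used + len(chunk) <= self.mem_capacity
        if not self.disk_chunks and fits:
            self.mem_chunks.append(chunk)
            self.mem_used += len(chunk)
        else:
            self.spill.seek(self.spill_end)
            self.spill.write(chunk)
            self.disk_chunks.append((self.spill_end, len(chunk)))
            self.spill_end += len(chunk)

    def get(self):
        # Memory always holds the oldest data, so drain it first.
        if self.mem_chunks:
            chunk = self.mem_chunks.popleft()
            self.mem_used -= len(chunk)
            return chunk
        if self.disk_chunks:
            offset, length = self.disk_chunks.popleft()
            self.spill.seek(offset)
            chunk = self.spill.read(length)
            if not self.disk_chunks:
                # Disk tier is empty again: reclaim the spill file.
                self.spill.seek(0)
                self.spill.truncate()
                self.spill_end = 0
            return chunk
        return None  # buffer is empty
```

In this sketch a receiver thread would call `put` as data arrives from the upstream hop and a sender thread would call `get` to forward it downstream; locking, back-pressure, and a bound on disk usage are omitted for brevity.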