PDB-Hadoop

Alnasir, Jamie;
Shanahan, Hugh

This is the alpha release of the PDB-Hadoop framework. Developed by Jamie Alnasir and Hugh Shanahan at Royal Holloway, University of London, the framework runs protein structure analysis tools in parallel over the entire Protein Data Bank (PDB), or subsets of it, using the Apache Hadoop platform. It is designed so that structural biologists can use Hadoop without having to write Hadoop code themselves. The framework scales easily and uses a mapper architecture that can run stand-alone or be extended with further MapReduce operations.
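
To illustrate what a stand-alone mapper of this kind might look like, here is a minimal sketch written in the style of a Hadoop Streaming mapper. It is not taken from the PDB-Hadoop code base: the use of Hadoop Streaming and Python, the input format (one structure file path per line) and the names `run_tool`, `map_records` and `main` are all assumptions made for illustration.

```python
import shlex
import subprocess
import sys


def run_tool(command, pdb_path):
    """Run an external analysis tool on one structure file; return its stdout.

    `command` is a hypothetical command line supplied by the user; the file
    path is appended as its last argument.
    """
    result = subprocess.run(
        shlex.split(command) + [pdb_path],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout.strip()


def map_records(lines, analyse):
    """Yield one 'key<TAB>value' record per non-blank input line.

    Assumption: each input line holds the path of a single PDB file.
    `analyse` is any callable that takes that path and returns text.
    """
    for line in lines:
        pdb_path = line.strip()
        if not pdb_path:
            continue  # skip blank input lines
        output = analyse(pdb_path)
        # Collapse newlines and tabs so each record stays on one line.
        flat = " ".join(output.split())
        yield f"{pdb_path}\t{flat}"


def main(argv, stdin=sys.stdin, stdout=sys.stdout):
    """Entry point for use as a streaming mapper: argv[1] is the tool command."""
    command = argv[1]
    for record in map_records(stdin, lambda path: run_tool(command, path)):
        stdout.write(record + "\n")
```

If this were saved as a script that ends by calling `main(sys.argv)`, it could be run as a map-only Hadoop Streaming job by setting the number of reducers to zero (`-D mapreduce.job.reduces=0`). Because every record is a tab-separated key and value, a reducer could be added later to aggregate the per-structure results, which is one way to read "extended with further MapReduce operations".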