ProPairs: A Data Set for Protein-Protein Docking

ProPairs identifies and presents protein docking complexes together with their unbound structures. These pairs can
serve as benchmark sets for developing or testing docking algorithms. ProPairs:

Provides protein-protein complexes with interface-aligned and superposed structures of the unbound binding partners

Provides a nonredundant set of docking complexes by clustering all detected interfaces

Selects the most representative docking complexes with their most representative unbound structures

Assigns the cofactors of each docking complex to cofactors in the unbound structures

Detects protein docking complexes within the Protein Data Bank (PDB) and presents them as pairs of proteins

Assigns suitable unbound structures to at least one of the two binding partners

Uses only protein structures and biological assembly information from the PDB

Identifies the interface of each docking complex

Is fully automatic and open source

A multi-chain protein complex can be partitioned into two binding partners in more than one way.
Here, we use the existence of corresponding unbound structures in the PDB as the criterion for deciding
which partitions of a given complex are legitimate.
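
The partitioning idea can be illustrated with a small Python sketch. This is not the ProPairs implementation: the function names and the `unbound_groups` lookup are hypothetical, and matching an unbound structure is reduced here to an exact match on chain identifiers, whereas a real pipeline would compare sequences and structures. The sketch lists every way to split a set of chains into two non-empty groups and keeps the splits in which at least one group has a known unbound counterpart.

```python
from itertools import combinations


def bipartitions(chains):
    """Yield each split of `chains` into two non-empty, disjoint groups once."""
    chains = sorted(chains)
    if len(chains) < 2:
        return  # a single chain cannot form two binding partners
    first, rest = chains[0], chains[1:]
    # Pin the first chain to the left group so mirrored splits are not repeated.
    # k stops before len(rest), so the right group is never empty.
    for k in range(len(rest)):
        for extra in combinations(rest, k):
            left = frozenset((first,) + extra)
            right = frozenset(chains) - left
            yield left, right


def legitimate_partitions(chains, unbound_groups):
    """Keep splits where at least one group matches a known unbound structure.

    `unbound_groups` is a hypothetical set of frozensets of chain IDs standing
    in for a real search for unbound structures in the PDB.
    """
    return [
        (left, right)
        for left, right in bipartitions(chains)
        if left in unbound_groups or right in unbound_groups
    ]


if __name__ == "__main__":
    # Illustrative complex: chains H and L form one partner, chain A the other.
    known_unbound = {frozenset("HL")}
    for left, right in legitimate_partitions("HLA", known_unbound):
        print(sorted(left), "|", sorted(right))
```

A complex with n chains has 2^(n-1) - 1 such splits, so the three-chain example has three candidates, of which only the split separating A from H and L survives the filter.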