DESCRIPTION

Extract specific data from PDBFile(s) and generate appropriate PDB or sequence file(s). Multiple PDBFile names are separated by spaces. The only valid file extension is .pdb; files with any other extension are ignored during wildcard expansion. All the PDB files in the current directory can be specified either by *.pdb or by the current directory name.

During Chains and Sequences values of -m, --mode option, all ATOM/HETATM records for chains after the first model in PDB files containing data for multiple models are ignored.
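The first-model rule above can be sketched as follows. This is a minimal illustration, not the tool's own code: it assumes that models are delimited by MODEL/ENDMDL records as in the PDB file format, and the function name first_model_records is hypothetical.

```python
def first_model_records(lines):
    """Return the ATOM/HETATM lines that belong to the first model only.

    Assumption: each model is closed by an ENDMDL record, so reading
    stops at the first ENDMDL. Files without ENDMDL are read to the end.
    """
    records = []
    for line in lines:
        record_type = line[:6].strip()
        if record_type == "ENDMDL":
            break  # everything after this belongs to later models
        if record_type in ("ATOM", "HETATM"):
            records.append(line)
    return records
```

A single-model file without MODEL/ENDMDL records passes through unchanged, which matches the description that only multi-model files are affected.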

OPTIONS

Specify which atom records to extract from PDBFile(s) during AtomNums, AtomsRange, and AtomNames value of -m, --mode option: extract records corresponding to atom numbers specified in a comma delimited list of atom numbers/names, or within the range of start and end atom numbers. Possible values: "AtomNum[,AtomNum,..]", StartAtomNum,EndAtomNum, or "AtomName[,AtomName,..]". Default: None. Examples:

10
15,20
N,CA,C,O
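The three ways of interpreting the value can be sketched as below. This is an illustration under stated assumptions rather than the tool's implementation: the atom serial number is read from columns 7-11 and the atom name from columns 13-16 of each ATOM/HETATM line (the fixed-column PDB layout), the range is treated as inclusive, and the function name select_atom_records is hypothetical.

```python
def select_atom_records(lines, mode, spec):
    """Select ATOM/HETATM lines by atom numbers, a number range, or atom names.

    mode is one of the mode values named in the text: AtomNums,
    AtomsRange, or AtomNames. spec is the comma delimited option value.
    """
    values = [value.strip() for value in spec.split(",")]
    if mode == "AtomNums":
        wanted_numbers = {int(value) for value in values}
        matches = lambda serial, name: serial in wanted_numbers
    elif mode == "AtomsRange":
        start, end = int(values[0]), int(values[1])
        # Assumption: both ends of the range are included.
        matches = lambda serial, name: start <= serial <= end
    elif mode == "AtomNames":
        wanted_names = {value.upper() for value in values}
        matches = lambda serial, name: name.upper() in wanted_names
    else:
        raise ValueError("unsupported mode: " + mode)

    selected = []
    for line in lines:
        if line[:6].strip() not in ("ATOM", "HETATM"):
            continue
        serial = int(line[6:11])      # columns 7-11: atom serial number
        name = line[12:16].strip()    # columns 13-16: atom name
        if matches(serial, name):
            selected.append(line)
    return selected
```

Note that the same value, such as 15,20, means two atoms under AtomNums but a span of six atoms under AtomsRange, so the mode has to be known before the value is parsed.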

-c, --chains First | All | ChainID,[ChainID,...]

Specify which chains to extract from PDBFile(s) during Chains | Sequences value of -m, --mode option: first chain, all chains, or a specific list of comma delimited chain IDs. Possible values: First | All | ChainID,[ChainID,...]. Default: First. Examples:

This option allows retrieval of ATOM and HETATM record lines for a specific chain that are spread across TER records in PDBFile(s).
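A rough sketch of chain selection follows. It assumes the chain ID is read from column 22 of each ATOM/HETATM line and that "first chain" means the first chain ID encountered in the file; the function name extract_chains is hypothetical and the tool itself may work differently.

```python
def extract_chains(lines, chains="First"):
    """Group ATOM/HETATM lines by chain ID and return the requested chains.

    chains is First, All, or a comma delimited list of chain IDs.
    Returns a dict that maps each selected chain ID to its record lines.
    """
    records = [line for line in lines if line[:6].strip() in ("ATOM", "HETATM")]

    # Chain IDs in order of first appearance (column 22 of each record).
    chain_ids = []
    for line in records:
        chain_id = line[21:22]
        if chain_id not in chain_ids:
            chain_ids.append(chain_id)

    if chains == "First":
        wanted = chain_ids[:1]
    elif chains == "All":
        wanted = chain_ids
    else:
        wanted = [value.strip() for value in chains.split(",")]

    return {
        chain_id: [line for line in records if line[21:22] == chain_id]
        for chain_id in wanted
    }
```

Because records are matched on the chain ID column alone, HETATM lines that appear after a TER record but carry the same chain ID end up in the same group.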

--CombineChains yes | no

During Chains value of -m, --mode option with Yes value of --CombineChains, extracted data for specified chains is written into a single file instead of an individual file for each chain.

During Sequences value of -m, --mode option with Yes value of --CombineChains, residue sequences for specified chains are extracted and concatenated into a single sequence file instead of an individual file for each chain.

-d, --distance number

--RecordMode option controls type of record lines to extract from PDBFile(s): ATOM, HETATM or both.

--DistanceMode Atom | Hetatm | Residue | XYZ

Specify how to extract ATOM/HETATM records from PDBFile(s) during Distance value of -m, --mode option: extract all the records within a certain distance specified by -d, --distance from an atom or hetero atom record, a residue, or any arbitrary point. Possible values: Atom | Hetatm | Residue | XYZ. Default: XYZ.

During Residue value of --DistanceMode, distance of ATOM/HETATM records is calculated from all the atoms in the residue and the records are selected as long as any atom of the residue lies within the distance specified using -d, --distance option.

--RecordMode option controls type of record lines to extract from PDBFile(s): ATOM, HETATM or both.

--DistanceSelectionMode ByAtom | ByResidue

Specify how to extract ATOM/HETATM records from PDBFile(s) during Distance value of -m, --mode option for all values of --DistanceMode option: extract only those ATOM/HETATM records that meet the specified distance criterion; or extract all records corresponding to a residue as long as one of the ATOM/HETATM records in the residue satisfies the specified distance criterion. Possible values: ByAtom, ByResidue. Default value: ByAtom.
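The interplay between the reference (an atom, a residue, or a point) and the ByAtom/ByResidue selection can be sketched as below. This is a minimal illustration with several assumptions: coordinates are read from columns 31-54 of each record, the comparison is "less than or equal to" the cutoff, a residue is identified by chain ID plus residue number and insertion code, and the name select_within_distance is hypothetical. The reference is passed as a list of points, so a single atom or XYZ point is a one-element list and a residue is the list of its atom coordinates.

```python
import math

def select_within_distance(lines, origins, cutoff, selection_mode="ByAtom"):
    """Select ATOM/HETATM lines within cutoff of any point in origins.

    selection_mode follows the option values in the text: ByAtom keeps
    only the records that meet the distance criterion; ByResidue keeps
    every record of a residue that has at least one such record.
    """
    records = []
    for line in lines:
        if line[:6].strip() not in ("ATOM", "HETATM"):
            continue
        xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        residue_key = (line[21:22], line[22:27])  # chain ID, residue number + insertion code
        records.append((line, xyz, residue_key))

    def is_hit(xyz):
        # Assumption: a record exactly at the cutoff distance is included.
        return any(math.dist(xyz, origin) <= cutoff for origin in origins)

    hits = [record for record in records if is_hit(record[1])]
    if selection_mode == "ByAtom":
        return [line for line, _, _ in hits]

    hit_residues = {residue_key for _, _, residue_key in hits}
    return [line for line, _, residue_key in records if residue_key in hit_residues]
```

With ByResidue, an atom that is itself outside the cutoff is still returned when another atom of the same residue is inside it, which is why that mode yields complete residues.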

During the generation of new PDB files, unnecessary CONECT records are dropped.

For Chains mode, data for appropriate chains specified by -c, --chains option is extracted from PDBFile(s) and placed into new PDB file(s).

For Sequences mode, residue names are extracted, using various sequence related options, for chains specified by -c, --chains option from PDBFile(s) and FASTA sequence file(s) are generated.

For Distance mode, all ATOM/HETATM records within a distance specified by -d, --distance option from a specific atom, residue or a point indicated by --DistanceMode are extracted and placed into new PDB file(s).

For NonWater mode, non water ATOM/HETATM record lines, identified using value of --WaterResidueNames, are extracted and written to new PDB file(s).

For NonHydrogens mode, ATOM/HETATM record lines containing element symbol other than H are extracted and written to new PDB file(s).

For all other options, appropriate ATOM/HETATM records are extracted to generate new PDB file(s).

--RecordMode option controls type of record lines to extract and process from PDBFile(s): ATOM, HETATM or both.
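As an illustration of the NonHydrogens mode described above, the filter can be sketched as follows. It assumes the element symbol is stored in columns 77-78 of each record, as in the PDB file format; records that lack an element column are kept here, which is a choice made for this sketch and not necessarily what the tool does. The function name is hypothetical.

```python
def non_hydrogen_records(lines):
    """Return ATOM/HETATM lines whose element symbol is not H."""
    kept = []
    for line in lines:
        if line[:6].strip() not in ("ATOM", "HETATM"):
            continue
        element = line[76:78].strip().upper()  # columns 77-78: element symbol
        if element != "H":
            kept.append(line)
    return kept
```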

--NonStandardKeep yes | no

Specify whether to convert non-standard three letter residue codes into the code specified using --NonStandardCode option and include them in sequence file(s) generated during Sequences value of -m, --mode option. Possible values: yes | no. Default: yes.

A warning is also printed about the presence of non-standard residues. Any residue other than the standard 20 amino acids and 5 nucleic acids is considered non-standard; additionally, HETATM residues in chains are also tagged as non-standard.

--NonStandardCode character

A single character code to use for non-standard residues. Default: X. Possible values: ?, -, or X.
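The effect of these two options on a generated sequence can be sketched as below. The example is limited to the 20 standard amino acids (nucleic acids are left out for brevity), and it assumes that a "no" value for --NonStandardKeep means non-standard residues are left out of the sequence. The function and table names are hypothetical.

```python
# One letter codes for the 20 standard amino acids.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}

def residues_to_sequence(residue_names, keep_nonstandard=True, nonstandard_code="X"):
    """Convert three letter residue names into a one letter sequence string.

    Non-standard residues are replaced by nonstandard_code, or skipped
    when keep_nonstandard is False (an assumption about the "no" value).
    """
    letters = []
    for name in residue_names:
        letter = THREE_TO_ONE.get(name.strip().upper())
        if letter is None:
            if not keep_nonstandard:
                continue
            letter = nonstandard_code
        letters.append(letter)
    return "".join(letters)
```

For instance, a chain containing selenomethionine (MSE) would show X at that position with the defaults, and would be one residue shorter when non-standard residues are not kept.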

-o, --overwrite

Overwrite existing files.

-r, --root rootname

New PDB and sequence file name is generated using the root: <Root><Mode>.<Ext>. Default new file name: <PDBFileName>Chain<ChainID>.pdb for Chains mode; <PDBFileName>SequenceChain<ChainID>.fasta for Sequences mode; <PDBFileName>DistanceBy<DistanceMode>.pdb for Distance -m, --mode; <PDBFileName><Mode>.pdb for Atoms | CAlphas | NonWater | NonHydrogens -m, --mode values. This option is ignored for multiple input files.

This option is ignored during Sequences values of -m, --mode option.

Specify which residue records to extract from PDBFile(s) during ResidueNums, ResiduesRange, and ResidueNames value of -m, --mode option: extract records corresponding to residue numbers specified in a comma delimited list of residue numbers/names, or within the range of start and end residue numbers. Possible values: "ResidueNum[,ResidueNum,..]", StartResidueNum,EndResidueNum, or "ResidueName[,ResidueName,..]". Default: None. Examples:

20
5,10
TYR,SER,THR

--RecordMode option controls type of record lines to extract from PDBFile(s): ATOM, HETATM or both.

--SequenceLength number

Maximum sequence length per line in sequence file(s). Default: 80.
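A minimal sketch of how a maximum line length shapes a FASTA record is shown below. The function name format_fasta is hypothetical, and the sequence ID is passed in as-is because the exact ID layout produced by the tool is not described here.

```python
def format_fasta(sequence_id, sequence, max_length=80):
    """Format one FASTA record, wrapping the sequence at max_length per line."""
    if max_length < 1:
        raise ValueError("max_length must be a positive integer")
    lines = [">" + sequence_id]
    for start in range(0, len(sequence), max_length):
        lines.append(sequence[start:start + max_length])
    return "\n".join(lines) + "\n"
```

With the default of 80, a 170 residue sequence is written as two full lines followed by a final line of 10 residues.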

--SequenceRecords Atom | SeqRes

Specify which records to use for extracting residue names from PDBFile(s) during Sequences value of -m, --mode option: use ATOM records to compile a list of residues in a chain or parse SEQRES records to get a list of residues. Possible values: Atom | SeqRes. Default: Atom.

--SequenceIDPrefix FileName | HeaderRecord | Automatic

Specify how to generate a prefix for sequence IDs during Sequences value of -m, --mode option: use input file name prefix; retrieve PDB ID from HEADER record; or automatically decide the method for generating the prefix. The chain IDs are also appended to the prefix. Possible values: FileName | HeaderRecord | Automatic. Default: Automatic.

--WaterResidueNames Automatic | "ResidueName,[ResidueName,...]"

Identification of water residues during NonWater value of -m, --mode option. Possible values: Automatic | "ResidueName,[ResidueName,...]". Default: Automatic - corresponds to "HOH,WAT,H20". You can also specify a different comma delimited list of residue names to use for water.
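Filtering by a user supplied list of water residue names can be sketched as follows. It assumes the residue name is read from columns 18-20 of each ATOM/HETATM record and that the comparison ignores case; the function name is hypothetical, and the residue names in the usage line are only an example list.

```python
def non_water_records(lines, water_residue_names):
    """Return ATOM/HETATM lines whose residue name is not in the water list.

    water_residue_names is a comma delimited string, in the same form as
    the option value.
    """
    water = {name.strip().upper() for name in water_residue_names.split(",")}
    kept = []
    for line in lines:
        if line[:6].strip() not in ("ATOM", "HETATM"):
            continue
        residue_name = line[17:20].strip().upper()  # columns 18-20: residue name
        if residue_name not in water:
            kept.append(line)
    return kept

# Example call with an explicit list of names:
# non_water_records(pdb_lines, "HOH,WAT")
```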

To extract data for all ATOM and HETATM records for complete residues with any atom or hetatm within 10 angstroms of an atom specified by atom serial number and name "1,N" in Sample2.pdb file and generate Sample2DistanceByAtom.pdb, type:

AUTHOR

SEE ALSO

COPYRIGHT

Copyright (C) 2019 Manish Sud. All rights reserved.

This file is part of MayaChemTools.

MayaChemTools is free software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation; either version 3 of the License, or (at your option) any later version.