Information about this tutorial can be found in the following reference: P. Cieplak, W.D. Cornell, C. Bayly & P.A. Kollman, Application of the multimolecule and multiconformational RESP methodology to biopolymers: Charge derivation for DNA, RNA, and proteins. J. Comput. Chem. 1995, 16, 1357-1377.

This tutorial presents the execution of the Ante_R.E.D.-1.x and R.E.D. III.x programs to derive RESP charges for (i) whole molecules and (ii) molecule fragments using the dimethylalanine (AIB, 2-aminoisobutyric acid) dipeptide as an example. RESP charge derivations involving $n molecules, $i conformations and $j orientations are described in a sequential approach.

Although eight different charge derivation procedures have been developed in the R.E.D. III.x program, it is important to underline that only RESP charge derivations useful in simulations based on the Cornell et al. force field (and its different adaptations) are demonstrated in this tutorial. Many examples and the choices made are described, and all the data generated from the R.E.D. III.x runs are available for download. However, RESP charges required for simulations using the GLYCAM force fields, and ESP charges used in some CHARMM, OPLS and even AMBER force field simulations, can also be generated following strategies similar to those presented. Moreover, by changing a few parameters in the R.E.D. source code, an essentially unlimited number of different procedures can be developed. For instance, by modifying a single line in the R.E.D. III.x source code, charge values compatible with the Duan et al. force field can also be derived.

This tutorial has been tested on a laptop with a single PIV 2.4 GHz CPU and 1 GB of RAM under Linux Fedora Core 4.0 (kernel 2.6.12.3) using the 22 NOV 2004 (R1) GAMESS version, the Gaussian 1998 RevA.11.4 version and the RESP version from AMBER8 (RESP was recompiled using qtol = 0.1d-5, maxq = 5000, maxlgr = 500 and maxmol = 200).

The list of corrections applied to this tutorial after its first release can be obtained here.

This tutorial has been updated in agreement with the new features incorporated in R.E.D. III.4 and R.E.D. IV version June 2010.

For a new molecule, RESP and ESP charge derivation is performed in three steps: (i) geometry optimization, (ii) Molecular Electrostatic Potential (MEP) computation using the optimized geometry obtained in the first step, and (iii) fitting of the atom-centered charges to the MEP calculated in the second step. The RESP ESP charge Derive program (R.E.D.) sequentially executes these three steps by interfacing the GAMESS (GAMESS-US or Firefly) or Gaussian (Gaussian 94, 98 or 03 version) quantum mechanics (QM) program with the RESP program, and allows the automatic derivation of RESP and ESP charges for the target molecule.

With R.E.D. version I, highly reproducible RESP and ESP charge values (+/- 0.0001 e) can be derived by controlling the molecular orientation of the optimized geometry of a single conformation of a single molecule, whichever QM program or initial structure is chosen. A new charge derivation procedure has also been developed based on a multi-orientation RESP or ESP fit. Using multiple molecular orientations in the fitting process is intended to limit the charge uncertainty induced when a single orientation is used.

With R.E.D. version II, the use of multiple conformations of a single molecule in the fitting procedure has been implemented, allowing the derivation of higher-quality charge values (i.e. more general ones, as needed in molecular dynamics simulations). Multi-conformation and multi-orientation RESP or ESP fits can be performed together or independently, according to the user's choice, for structures containing chemical elements up to Z = 35 (bromine).

With R.E.D. version III, the control of charge constraints for atoms and groups of atoms within a molecule or a set of molecules (intra-molecular charge constraint) or between two molecules (inter-molecular charge constraint and inter-molecular charge equivalencing) has been developed, allowing the derivation of RESP and ESP atom charge values for molecule fragments and sets of molecules. Thus, fitting procedures involving multiple molecules, and for each molecule multiple conformations, and for each conformation multiple orientations, can now be carried out automatically. Finally, eight different charge derivation procedures using different MEP computation algorithms (Connolly surface and CHELPG algorithms) and different fitting procedures (with or without hyperbolic restraints) are now available. Potentially, an unlimited number of approaches can be developed by simply changing a few words in the R.E.D. III.x source code. Such procedures can be used in simulations based on the AMBER, CHARMM, OPLS and GLYCAM force fields. Once the R.E.D. execution has completed correctly, the charge values are available in Tripos mol2 file(s), which can be considered as precursors of AMBER OFF and CHARMM RTF or PSF force field libraries.

-I.1- A "Mini HowTo"

The global procedure for deriving RESP or ESP charge values and building force field library(ies) for new molecules and/or new molecular fragments is summarized in Figure 1:

Below is a "Mini HowTo" that summarizes the important steps one could follow in order to derive RESP or ESP atom charge values for whole molecule(s) and/or molecule fragment(s) starting from $n molecules, $i conformations and $j orientations using the Ante_R.E.D. and R.E.D. III.x programs.

-1- Build the $n initial whole molecules in a graphical interface or some other tools, and save them in the Protein Data Bank (PDB) format. When different conformations of a molecule are used in charge fitting, generate the conformers separately, and save them in the PDB format.

-2- Carefully check the atom names, residue names and residue numbers in each of these PDB files since they will be used in the Tripos mol2 files (in other words, the force field library precursors) generated by R.E.D. III.x, and in the AMBER OFF and CHARMM RTF/PSF force field libraries. Indeed, when developing a new force field library, two different atoms in a PDB file cannot have the same atom name within a single residue, and a residue is characterized by a single residue name and a single residue number. Moreover, two residues differ at least by their residue number. Finally, at this stage, the presence of the atom connectivities in the PDB files and the atom order of a molecule in the PDB format do not really matter (although the atom order of two conformations of a molecule must be the same in each PDB file when multi-conformational calculations are performed).

-3- Execute the Ante_R.E.D. program to convert the different PDB files into the corresponding P2N files by repeating one of the following commands for each PDB file:
    perl Ante_RED.pl File.pdb
or
    perl Ante_RED.pl File.pdb > Ante_RED.log

-4- Check and modify the P2N files generated by Ante_R.E.D. (an example of P2N file is available by clicking here).
- For each molecule in the P2N format, determine its name (using the international nomenclature rules, for instance), and add it to each P2N file as the molecule title (Example: "REMARK TITLE Dimethylalanine-dipeptide"). This molecule name is used in the GAMESS, Gaussian, and RESP input files during the charge derivation procedure, and is required in the corresponding project submission in the RESP ESP charge DDataBase (if one wishes to use this option).
- Check whether the molecule total charge value (the default value is 0.0; Example: "REMARK CHARGE-VALUE 0") and the electronic spin multiplicity (the default value is 1.0; Example: "REMARK MULTIPLICITY-VALUE 1") generated by Ante_R.E.D. are correct. If not, provide the correct values.
- Check that all the atom connectivities are listed and correct. These connectivities are crucial because they are used to create the molecular topology information in the Tripos mol2 files generated by R.E.D. III.x, and also in the AMBER and CHARMM force field libraries.
- Select the re-orientation procedure (i.e. QMRA or RBRA) which will be applied to the optimized Cartesian coordinates for each of the molecules used in the charge derivation procedure. This involves the use of a single molecular orientation or multiple orientations in the RESP or ESP fit, respectively. If the RBRA approach is selected, add the corresponding keyword (Example: "REMARK REORIENT 5 18 19" or "REMARK REORIENT 5 18 19 | 19 18 5"). With R.E.D. III.4 and R.E.D. IV version June 2010, two new transformation procedures have been developed: they differentiate translation from rotation (see below).
- Check the atom names found in each P2N file. PDB and P2N files differ only in that a PDB file has a single column of atom names whereas a P2N file has two. In a P2N file, the first column of atom names (or "yellow column", exemplified in the P2N file below) is used in the automatic RESP input generation, which was originally implemented in the R.E.D. program version I. The rules followed by R.E.D. to generate these RESP inputs are defined below. The second column (or "red column", also exemplified in the P2N file below) contains the original atom names of the molecule (defined in the PDB file before Ante_R.E.D. execution); these are the names used in the Tripos mol2 files generated by R.E.D. III.x (and consequently in the AMBER and CHARMM force field libraries). In the absence of such a second column, the atom names found in the Tripos mol2 files are automatically generated by R.E.D., and consist of the chemical symbol of the element and a number which is incremented.
- Add the keywords corresponding to the use of intra-molecular charge constraint(s) (Example: "REMARK INTRA-MCC 0.0 | 1 2 3 4 5 6 | R" or "REMARK INTRA-MCC 0.2719 | 8 | K"), inter-molecular charge constraint(s) (Example: "REMARK INTER-MCC 0.0 | 1 2 | 1 2 3 4 | 1 2 3 4 5 6 7 8"), and/or inter-molecular charge equivalencing (Example: "REMARK INTER-MEQA 1 2 3 4 | 1 2 3 4 5 6 7 8") if one wishes to use such constraints during a RESP or ESP fit. Such constraints are required in charge derivation of molecule fragments and force field topology databases.
- Check the atom order, especially, whether the hydrogen atoms are located after their heavy atom counterparts (quickly locating methylene and methyl groups in a molecule is convenient in RESP charge derivation, in particular when one wishes to study the corresponding RESP inputs).
- Finally, in the case of multi-conformation charge fitting, merge the P2N files of the different conformations of a molecule (each single P2N file should have the same atom order) into a single P2N file. In this file, the sets of Cartesian coordinates representing the conformations are separated by the 'TER' keyword, and a single set of atom connectivities is kept and located after the first set of Cartesian coordinates.

-6- Decide whether one wishes to run R.E.D. III.x using the $OPT_Calc = "On" variable, or using the $OPT_Calc = "Off" one (this parameter can be changed in the main program of the R.E.D. III.x perl script).
- If $OPT_Calc = "Off" is set in R.E.D. III.x, then the geometry optimization for each of the $n molecules (and for the different conformations of each molecule, if required) should be run separately in standalone mode, i.e. using the Gaussian or GAMESS input(s) that are automatically generated by Ante_R.E.D. If different conformations are used for a molecule in a multi-conformational charge fit, concatenate the geometry optimization output files of these conformations into a single file. Finally, rename (or create a symbolic link for) the Gaussian or GAMESS outputs of the $n molecules into $n Mol_red$n.log files, keeping the order of the $n Mol_red$n.log files identical to that of the $n Mol_red$n.p2n P2N files. Run R.E.D. III.x by using one of the following commands:
    perl RED-vIII.4.pl
or
    perl RED-vIII.4.pl > RED-vIII.log
- If $OPT_Calc = "On" is set in R.E.D. III.x, then simply run R.E.D. III.x using the same command as above ("perl RED-vIII.4.pl > RED-vIII.log"). Although it seems simpler, this approach is not recommended when deriving RESP or ESP charge values using a multiple-molecule, multiple-conformation and/or multiple-orientation RESP fit, because it is unlikely that all the geometry optimization jobs for the different molecules and conformations will converge. If any of them fails to converge, the R.E.D. III.x execution is reported as "FAILED".

-7- Convert the Tripos mol2 files generated by R.E.D. III.x into force field libraries specific to AMBER or CHARMM using adequate scripts or programs. For AMBER, LEaP scripts can be written to convert Tripos mol2 files into OFF libraries (examples of such LEaP scripts are already available in the RESP ESP charge DDataBase). The conversion of Tripos mol2 files into CHARMM force field libraries will be handled in a future version of R.E.D. However, a project available in the RESP ESP charge DDataBase can be updated, and a script dedicated to this task can be added to a project by its author any time she/he wishes.

-8- Submit the data generated by R.E.D. III.x into the RESP ESP charge DDataBase or R.E.DD.B. (if one wishes to use this option).

Well, as we say in French: Y'a qu'a... i. e. we just have to start... ;-)

-I.2- General information about input files required by R.E.D. III.x

-I.2.1- The initial P2N file

To execute the R.E.D. program version III, an initial P2N file (PDB file format with two columns of atom names and the P2N extension) is required for _each molecule_ used in the charge derivation procedure (previously a single initial PDB file was used by R.E.D. II). R.E.D. III.x automatically recognizes P2N files having the 'Mol_red$n.p2n' file name(s) (where $n is the number of molecule(s) used in the fitting procedure; the "$MOL_START" variable available in the R.E.D. II source code is obsolete in R.E.D. III.x). In this file the following information has to be provided: the molecule title, the molecule total charge, the spin multiplicity of the molecule, the Cartesian coordinates of each conformation describing the molecule, the information about the molecule orientation(s) used in the MEP computation, the atom connectivities, the residue name(s) and the atom names (1 or 2 columns of atom names). Information about intra-molecular charge constraint(s) within a molecule, inter-molecular charge constraint(s), and/or inter-molecular charge equivalencing between different molecules is also required in charge derivation of molecule fragments and force field topology databases.
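As a rough illustration of the header keywords mentioned so far, the following Python sketch collects the title, total charge and spin multiplicity from the "REMARK" lines of a P2N file. It is not part of R.E.D. (which is a Perl script): the function name `parse_p2n_header` and the `mol_index` argument are hypothetical, and the reading of the default title 'Molecule_$n' as "Molecule_" followed by the molecule index is an assumption.

```python
def parse_p2n_header(lines, mol_index=1):
    """Collect title, total charge and spin multiplicity from P2N REMARK lines.

    Hypothetical helper, not the R.E.D. implementation.
    """
    # Defaults quoted in the tutorial: charge 0 and multiplicity 1.
    # Using mol_index in the default title is an assumption.
    header = {"title": f"Molecule_{mol_index}", "charge": 0, "multiplicity": 1}
    for line in lines:
        # The keywords have to be at the beginning of a line.
        if not line.startswith("REMARK"):
            continue
        fields = line.split()
        if len(fields) < 3:
            continue
        keyword, value = fields[1], fields[2]
        if keyword == "TITLE":
            header["title"] = value  # the title contains no space character
        elif keyword == "CHARGE-VALUE":
            header["charge"] = int(float(value))
        elif keyword == "MULTIPLICITY-VALUE":
            header["multiplicity"] = int(float(value))
    return header


example = [
    "REMARK TITLE Dimethylalanine-dipeptide",
    "REMARK CHARGE-VALUE 0",
    "REMARK MULTIPLICITY-VALUE 1",
]
print(parse_p2n_header(example))
```

Calling the function on an empty list of lines simply returns the defaults, which mirrors the fallback behaviour described for missing keywords.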

-1- The molecule title is defined using the "REMARK TITLE" keywords (Example: REMARK TITLE Dimethylalanine-dipeptide; do not use any space character within the title itself; the keywords must be at the beginning of a line). If the title is not provided, the default title 'Molecule_$n' is used ($n is the number of molecule(s) used in the fitting procedure; the "$TITLE" variable available in the R.E.D. II source code is obsolete). The molecule title is used in the three steps that R.E.D. sequentially executes (geometry optimization, MEP computation and fitting inputs).

-2- The total charge of the molecule is defined using the "REMARK CHARGE-VALUE" keywords (Example: REMARK CHARGE-VALUE 0; keywords present at the beginning of a line). If the total charge value is not found, the default value of 0.0 is assigned (the "$CHR_VAL" variable available in the R.E.D. II source code is obsolete). The total charge of the molecule is required in the geometry optimization and MEP computation steps.

-3- The spin multiplicity of the molecule is defined using the "REMARK MULTIPLICITY-VALUE" keywords (Example: REMARK MULTIPLICITY-VALUE 1; keywords present at the beginning of a line). If the spin multiplicity value is not provided, the default value of '1' is assigned (the "$MLT_VAL" variable available in the R.E.D. II source code is obsolete). The spin multiplicity of the molecule is required in the geometry optimization and MEP computation steps.

-4- The Cartesian coordinates of each conformation describing the molecule have to be provided and have to be separated with the "TER" keyword. The atom order in each conformation has to be the same (this was implemented in R.E.D. II and was not modified in R.E.D. III.x).

-5- The residue name(s) and residue number(s) have to follow rules 'that make sense'. In particular, a single residue name and residue number must characterize a residue. The residue name(s) available in an initial P2N file is(are) used by R.E.D. for the definition of the residue name(s) in a Tripos mol2 file (this was implemented in R.E.D. II and was not modified in R.E.D. III.x). This residue name also plays an important role in the AMBER OFF and CHARMM RTF/PSF force field library definition. The "UNK" residue name used by many graphic interfaces is a generic name which is not representative of the studied molecule. Consequently, "UNK" should be replaced by a more specific residue name.

-6- The atom connectivities have to be provided (only for the first conformation of a molecule since the atom order is identical in different conformations of a molecule) if the user wants the R.E.D. program to generate corresponding Tripos mol2 file(s) at the end of the R.E.D. execution (this was implemented in R.E.D. II and was not modified in R.E.D. III.x).

-7- The information about the molecular orientation of each optimized geometry used in the MEP computation (i. e. the choice of the re-orientation procedure applied before the MEP computation) has also to be provided.

General information about molecular orientation and charge values
It is known that the number of MEP points depends on the molecular orientation in space. Consequently, the values of derived RESP and ESP atom charges also depend on the molecular orientation of the optimized geometry. The molecular orientation can be partially controlled in the GAMESS program using the molecular principal axes ("COORD = CART" keyword) and in the Gaussian program by placing the center of nuclear charge at the origin ("Symmetry" keyword). Thus, since GAMESS and Gaussian do not apply the same internal algorithm, the molecular orientation of each optimized structure generated by both programs will be different. Moreover, each QM program can generate different molecular orientations for a target minimum using its internal re-orientation algorithm. To address these problems, a new reorientation algorithm has been introduced in R.E.D. version I which can be applied to the optimized molecular system before calculating the MEP. It allows a full control of the molecular orientation of the optimized geometry independently of the QM program or initial structure selected. Consequently, highly reproducible RESP or ESP charges can be derived. This procedure applied to the optimized geometry uses a Rigid-Body Re-orientation Algorithm based on the choice of three atoms. The first two atoms define the (O, X) axis, the third one defines the (O, X, Y) plane; the Z-axis being defined as the X x Y cross-product.

Different procedures for controlling the molecular orientation of the optimized geometry with R.E.D.
Three different procedures are available in R.E.D. to control the molecular orientation of the optimized geometry before computing the MEP. We strongly recommend using the third one in RESP or ESP charge derivation. With R.E.D. III.4 and R.E.D. IV version June 2010, two additional transformation procedures have been developed. They differentiate translation from rotation (see below):

-1- If re-orientation information based on three atoms (see case -2-, below) is not provided in the initial P2N file, then the internal Re-orientation Algorithm available in GAMESS or Gaussian is used to reorient the optimized Cartesian coordinates (we called this the QMRA procedure). In this case, the molecular orientation of the optimized geometry is partially controlled, the charges derived using the two QM programs differ from each other, and are not reproducible. Indeed, it is important to underline that the Gaussian program reorients by default (using its own internal algorithm by placing the center of nuclear charge at the origin) a Cartesian coordinate set each time a molecular energy calculation is performed. This leads to the well known set of coordinates called 'Standard orientation'. Thus, at the end of a geometry optimization the Cartesian coordinates generated by Gaussian are always re-oriented with respect to the initial geometry (if the "Symmetry" keyword is used in the input). On the contrary, the GAMESS program only reorients the molecular Cartesian coordinates once at the beginning of a job (if the "COORD = CART" keyword is used). Thus, at the end of a geometry optimization the Cartesian coordinates generated by GAMESS are not perfectly re-oriented (according to the GAMESS internal algorithm). In order to get a well re-oriented optimized geometry using GAMESS, one has to use the optimized Cartesian coordinates in a new single point energy run. In the R.E.D. I version, this was performed during the MEP computation (if the Rigid-Body Re-orientation Algorithm was not used). This has been modified in the R.E.D. II version, where a new routine has been incorporated (which uses the Jacobi matrix diagonalization method) to re-orient the GAMESS optimized geometry. 
This is a similar algorithm to the one used internally by GAMESS, and presents the advantage that the GAMESS type molecular orientation can be generated without the need of executing the GAMESS program itself. Finally, such a 'standalone' re-orientation algorithm is not needed if the Gaussian program is used since the molecular orientation available at the end of the geometry optimization output is well re-oriented (according to the Gaussian internal algorithm).

-2- If re-orientation information based on three atom numbers (not atom names!) is provided in the initial P2N file, the optimized geometry is automatically re-oriented using the Rigid-Body Re-orientation Algorithm implemented in R.E.D. (we called this the RBRA procedure). Thus, in this case the molecular orientation of the optimized geometry is fully controlled and highly reproducible RESP and ESP charges can be derived independently of the QM software or initial structure choice. The charges are reproducible since the three atoms used in the RBRA procedure are known. In order to apply this approach, one has to provide the following keywords in the P2N file format before the set of Cartesian coordinates of the first conformation:
After the "REMARK REORIENT" keywords (which have to be present at the beginning of a line), one needs to provide the atom numbers involved in the Rigid-Body Re-orientation Algorithm, i.e.:
    REMARK REORIENT atm_nb$A atm_nb$B atm_nb$C
where "$A", "$B" and "$C" stand for the atom numbers.

-3- If $j sets of three atom numbers [$j is a positive integer representing the number of molecular orientation(s); in principle, any number of orientations can be used] are given, the optimized Cartesian coordinates obtained from the QM program output are reoriented $j times using the Rigid-Body Re-orientation Algorithm (we called this the multi-RBRA procedure). The reoriented sets of Cartesian coordinates are then used to compute $j MEPs. A $j orientation fit (or multi-orientation RESP fit) is then applied using the RESP program. This averages out, over several different orientations, the charge value differences observed for any particular orientation. The derived atom charges are also highly reproducible in this case. Here is an example in which $j = 4:
    REMARK REORIENT atm_nb$A atm_nb$B atm_nb$C | atm_nb$C atm_nb$B atm_nb$A | atm_nb$D atm_nb$C atm_nb$A | atm_nb$A atm_nb$C atm_nb$D
- The keywords and the atom numbers must be written on the same line,
- The pipe character "|" is used as separator between two different orientations,
- "$A", "$B", "$C", and "$D" are the atom numbers.
For larger numbers of molecular re-orientations, the same format can be spread over several lines, each line starting with the "REMARK REORIENT" keywords:
    REMARK REORIENT atm_nb$A atm_nb$B atm_nb$C | atm_nb$C atm_nb$B atm_nb$A
    REMARK REORIENT atm_nb$D atm_nb$B atm_nb$A | atm_nb$A atm_nb$B atm_nb$D
    REMARK REORIENT atm_nb$E atm_nb$D atm_nb$A | atm_nb$A atm_nb$D atm_nb$E
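To make the single-line and multi-line "REMARK REORIENT" formats concrete, here is a small Python sketch that turns such lines into a list of atom-number triples, one per orientation. The function name `parse_reorient` is hypothetical and the error handling is an assumption; the actual parsing in R.E.D. is done in Perl and may differ. The first example line reuses the "5 18 19 | 19 18 5" keyword quoted in the Mini HowTo, while the second line is made up for illustration.

```python
def parse_reorient(lines):
    """Return the (A, B, C) atom-number triples found in REMARK REORIENT lines."""
    prefix = "REMARK REORIENT"
    triples = []
    for line in lines:
        # The keywords have to be at the beginning of a line.
        if not line.startswith(prefix):
            continue
        # The pipe character separates two different orientations.
        for chunk in line[len(prefix):].split("|"):
            numbers = [int(token) for token in chunk.split()]
            if len(numbers) != 3:
                raise ValueError(f"expected three atom numbers, got: {chunk!r}")
            triples.append(tuple(numbers))
    return triples


lines = [
    "REMARK REORIENT 5 18 19 | 19 18 5",
    "REMARK REORIENT 1 5 18",  # hypothetical additional orientation
]
print(parse_reorient(lines))  # [(5, 18, 19), (19, 18, 5), (1, 5, 18)]
```

The length of the returned list corresponds to $j, the number of molecular orientations requested for the molecule.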

General remarks
As mentioned above, the R.E.D. III.x program can be used to derive RESP or ESP charges for $n molecules, and for each molecule $i conformations, and for each conformation $j molecular orientations. For each molecule represented by a P2N file and used in the charge derivation procedure, the three molecular re-orientation procedures previously described to control the molecular orientation of an optimized geometry can be used independently to perform charge fitting:
- For a molecule, if the information about the Rigid-Body Re-orientation Algorithm is not provided in its initial P2N file, a $i conformation RESP fit is performed using the molecular orientation calculated by the internal Re-orientation Algorithm of the QM program (QMRA procedure).
- For a molecule, if $j orientations ($j = 1 or a higher positive integer, denoting the molecular orientation based on the Rigid-Body Re-orientation Algorithm) are requested, a $i conformation * $j orientation RESP fit is performed (RBRA or multi-RBRA procedure). In this case, the re-orientation information based on the three atom numbers is only provided for the first conformation (before the first set of Cartesian coordinates) and is used for the $i conformations.
- R.E.D. III.x allows using the geometry optimization output (single conformation optimization output or concatenated file created after $i conformation optimizations) generated by the GAMESS program as input for MEP computation using the Gaussian program (and vice-versa). If the Rigid-Body Re-orientation Algorithm is not selected, the molecular orientation used to compute the MEP is based on the QM output and is not based on the QM software. This means that if a Gaussian geometry optimization output is used to compute MEP using the GAMESS program (for instance), then the molecular orientation selected is based on the Gaussian internal re-orientation algorithm. However, this particularity is obviously avoided if the Rigid-Body Re-orientation Algorithm is used.

Two additional transformation procedures have been implemented in R.E.D. III.4 and R.E.D. IV version June 2010
These new procedures allow differentiating translation from rotation. In the RBRA procedure presented in case -2- and case -3- above, the first selected atom is translated to the origin of the axes, the first two atoms define the (O, X) axis, and the third one is used to define the (O, X, Y) plane. The (O, Z) axis is automatically set as the cross-product of the (O, X) and (O, Y) axes. Thus, the RBRA algorithm uses both translations and rotations to reorient the considered geometry. As charge values are affected by the RBRA procedure, this raises a new question: are translation and/or rotation responsible for these charge discrepancies? To answer this question, two new transformation procedures have been developed: the first one uses only translation(s), while the second one uses only rotation(s).

-4- Rotation
Rotation and multiple-rotations can be carried out using the "REMARK ROTATE" keywords. After the "REMARK ROTATE" keywords (which have to be present at the beginning of a line), one needs to provide the atom numbers involved in the rigid-body rotation algorithm. The following format has to be followed:
    REMARK ROTATE atm_nb$A atm_nb$B atm_nb$C
where "$A", "$B" and "$C" stand for the atom numbers.
By analogy to the multiple re-orientation procedure described above (case -3-), multiple-rotations can be carried out (the pipe character "|" can be used as separator between two different rotations):
Examples of multiple-rotations:
    REMARK ROTATE atm_nb$A atm_nb$B atm_nb$C | atm_nb$C atm_nb$B atm_nb$A
or
    REMARK ROTATE atm_nb$A atm_nb$B atm_nb$C
    REMARK ROTATE atm_nb$C atm_nb$B atm_nb$A

-5- Translation
Translations and multiple-translations can be carried out using the "REMARK TRANSLATE" keywords. After the "REMARK TRANSLATE" keywords (which have to be present at the beginning of a line), one needs to provide three values for the rigid-body translations one wants to perform along the X, Y and Z axes. The following format has to be followed:
    REMARK TRANSLATE $dX $dY $dZ
where "$dX", "$dY" and "$dZ" stand for the translations along the X, Y and Z axes.
Example:
    REMARK TRANSLATE 1 0 -1.5
means that 1 is added to the X Cartesian coordinates and 1.5 is subtracted from the Z Cartesian coordinates for the selected orientation (while the Y Cartesian coordinates remain unaffected).
By analogy to the multiple re-orientation procedure described above (case -3-), multiple-translations can be carried out (the pipe character "|" can be used as separator between two different translations):
Examples of multiple-translations:
    REMARK TRANSLATE $dX1 $dY1 $dZ1 | $dX2 $dY2 $dZ2
or
    REMARK TRANSLATE $dX1 $dY1 $dZ1
    REMARK TRANSLATE $dX2 $dY2 $dZ2
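The effect of a "REMARK TRANSLATE" line on a set of Cartesian coordinates can be sketched as follows. The helper names `parse_translate` and `translate` are hypothetical, and the idea that each "|"-separated triple yields its own translated copy of the geometry is an assumption made by analogy with the multi-orientation case; the R.E.D. source remains the reference.

```python
def parse_translate(line):
    """Return the list of (dX, dY, dZ) shifts given on a REMARK TRANSLATE line."""
    prefix = "REMARK TRANSLATE"
    if not line.startswith(prefix):
        raise ValueError("not a REMARK TRANSLATE line")
    shifts = []
    for chunk in line[len(prefix):].split("|"):
        dx, dy, dz = (float(token) for token in chunk.split())
        shifts.append((dx, dy, dz))
    return shifts


def translate(coords, shift):
    """Add the same shift to every atom (rigid-body translation)."""
    return [(x + shift[0], y + shift[1], z + shift[2]) for x, y, z in coords]


shifts = parse_translate("REMARK TRANSLATE 1 0 -1.5")
moved = translate([(0.0, 0.0, 0.0), (1.0, 2.0, 3.0)], shifts[0])
print(moved)  # [(1.0, 0.0, -1.5), (2.0, 2.0, 1.5)]
```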

Conclusions about re-orientation, rotation and translation
Using the rigid-body re-orientation, rigid-body rotation and rigid-body translation algorithms implemented in R.E.D. III.4 and R.E.D. IV version June 2010, any user can check that:
- Translation does not affect charge values: applying the rigid-body translation algorithm or the QMRA algorithm leads to identical charge values.
- Applying the rigid-body rotation algorithm or the rigid-body re-orientation algorithm leads to identical charge values.
Consequently, we suggest that all users use the rigid-body re-orientation algorithm and its multiple re-orientation approach (i.e. case -3-) defined in the first version of R.E.D.

-8- An initial P2N file is defined by two columns of atom names. The first column of atom names (or "yellow column", exemplified in the P2N file below) is used in automatic RESP input generation, while the second one (or "red column", also exemplified in the P2N file below) preserves the international atom name conventions (e.g. the CA atom name for alpha carbons of amino acids, or the C1' atom name for anomeric carbons of sugars) in the Tripos mol2 files generated by R.E.D. III.x. If the second column of atom names is absent from an initial P2N file, the atom names in the Tripos mol2 files are automatically generated by R.E.D. III.x following another approach: they consist of the chemical symbol of the element and a number which is automatically incremented (as a consequence, the international atom name conventions defined for the molecule atoms are lost). The implementation of automatic RESP input generation in R.E.D. versions I, II and III has to be described here, since it is closely related to the choice of creating the P2N file format.

General information about charge fitting
* In ESP charge derivation [in Weiner et al. force field, (1984 & 1986)], atom charge values are fitted to reproduce the MEP, and charge equivalencing of chemically equivalent atoms is performed a posteriori to the fit.
* In RESP charge derivation [in Cornell et al. force field (parm94.dat) and its successive adaptations (parm96.dat, parm98.dat and parm99.dat), and Duan et al. force field], atom charge values are fitted to reproduce the MEP in a two stage fit. The charge values are affected by the use of hyperbolic restraints, and charge equivalencing of chemically equivalent atoms is carried out during the fit.
- In 'standard' RESP charge derivation (i. e. originally published by the Kollman's group) a weak restraint (qwt = 0.0005) is used in charge derivation for all the heavy atoms during the first stage, and a stronger restraint (qwt = 0.001) is only applied to the methyl and methylene carbons during the second stage.
- In 'non-standard' RESP charge derivation, the use of the "qwt = 0.001" restraint in the second stage can potentially be extended to any type of carbon (in C=O, CH, and C=C, for instance), and to nitrogen, oxygen, silicon, phosphorus and sulfur atoms. However, the use of such 'non-standard' RESP inputs is only recommended for expert users in test studies.
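
To make the role of the qwt values more concrete, here is a minimal Python sketch of a hyperbolic restraint penalty. It assumes the functional form commonly given in the RESP literature, qwt * sum(sqrt(q^2 + b^2) - b), with a tightness parameter b = 0.1; neither the formula nor the value of b is stated in this tutorial, so both should be read as assumptions. The function name and the example charges are hypothetical.

```python
import math

# Restraint weights quoted in the text for the two RESP stages.
STAGE1_QWT = 0.0005
STAGE2_QWT = 0.001


def hyperbolic_penalty(charges, qwt, b=0.1):
    """Return qwt * sum(sqrt(q^2 + b^2) - b) over the restrained charges.

    Assumption: this is the hyperbolic form used in the RESP literature,
    and b = 0.1 is an assumed tightness, not a value from this tutorial.
    """
    return qwt * sum(math.sqrt(q * q + b * b) - b for q in charges)


# Hypothetical charges for three restrained heavy atoms.
charges = [-0.5, 0.1, 0.3]
stage1 = hyperbolic_penalty(charges, STAGE1_QWT)
stage2 = hyperbolic_penalty(charges, STAGE2_QWT)
```

With this form the penalty vanishes for a zero charge and grows with the charge magnitude, so doubling qwt simply doubles the pull of the restrained charges towards zero.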

Rules followed by the R.E.D. program for automatic RESP input generation
R.E.D. automatically generates the inputs used in ESP, 'standard' and 'non-standard' RESP charge derivation for the RESP program based on the atom names found in the first column of atom names in initial P2N files by following three simple rules:
* The heavy atoms whose charge values have to be re-optimized in the second RESP stage (i.e. using the "qwt = 0.001" restraint) must carry the letter "T" after the chemical symbol in their atom name. With R.E.D. I, only 'standard' RESP inputs could be produced. Consequently, only methyl and methylene carbons had to carry this "T" letter. The other atom names only had to use their chemical symbol [i.e. "C" for the other carbons (in C=O, CH and C=C, for instance), "O" for oxygen, "H" for hydrogen, etc.]. With R.E.D. II, this rule was extended to five other elements (N, O, Si, P and S), and can be applied to generate 'non-standard' RESP inputs.
* The same number (positive integer) has to be appended to this/these letter(s) for equivalent atoms (whatever the equivalencing procedure is, i.e. during the fit or a posteriori to the fit). As a consequence, non-equivalent atoms have different atom names.
* Each hydrogen bonded to a heavy atom must have the same number as this atom.
Thus, an atom name belonging to the first column of atom names in a P2N file is defined as follows:

Chemical symbol + "T" (if needed) + positive integer
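
As an illustration of the second and third rules, the following Python sketch groups atoms by name (atoms sharing a name are treated as equivalent) and checks that each hydrogen carries the same number as the heavy atom it is bonded to. This is not the R.E.D. implementation: the function names, the list-of-names input and the bond list given as 0-based index pairs are all assumptions made for the example.

```python
from collections import defaultdict


def split_name(name):
    """Split a name such as "CT1" into its letters and number: ("CT", 1)."""
    stem = name.rstrip("0123456789")
    return stem, int(name[len(stem):])


def equivalence_groups(names):
    """Map each atom name to the 0-based indices of the atoms carrying it."""
    groups = defaultdict(list)
    for index, name in enumerate(names):
        groups[name].append(index)
    return dict(groups)


def hydrogens_breaking_rule3(names, bonds):
    """Return indices of hydrogens whose number differs from their heavy atom's."""
    offenders = []
    for a, b in bonds:
        # Look at the bond in both directions: (hydrogen, heavy atom).
        for h, heavy in ((a, b), (b, a)):
            h_stem, h_num = split_name(names[h])
            heavy_stem, heavy_num = split_name(names[heavy])
            if h_stem == "H" and heavy_stem != "H" and h_num != heavy_num:
                offenders.append(h)
    return offenders


# Hypothetical fragment: a methyl group bonded to a carbonyl carbon.
names = ["CT1", "H1", "H1", "H1", "C2", "O3"]
bonds = [(0, 1), (0, 2), (0, 3), (0, 4), (4, 5)]
```

Here `equivalence_groups(names)` places the three methyl hydrogens in one group under the name H1, and `hydrogens_breaking_rule3(names, bonds)` returns an empty list because every hydrogen shares the number of its carbon.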

Remarks about atom names found in the first column of atom names in a P2N file
* HO1, 1H2 or O4' (often found in PDB files) are not valid names. Following the three rules reported above, valid names would be H1, H2 or O4, respectively. The I1 and Ag1 atom names are also rejected because iodine and silver have atomic numbers above Z = 35.
* As stated above, equivalent atoms must have the same atom name [same letter(s) and number(s)]. The insightII molecular graphics program from Accelrys Inc. automatically renames atoms that share the same name in order to differentiate them. In contrast, the VMD program displays the atom name labels without modifying them, and is thus very convenient for checking the atom names selected in a P2N file.
* When several conformations are available in a P2N file, only the atom names of the first conformation (defined according to the rules above) are used in RESP input generation, and they are applied to the other conformations.
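
The remarks above can be turned into a small name checker. The Python sketch below is only an illustration under stated assumptions: it supposes that element symbols are written in mixed case (as in "Ag1"), that every element from H to Br (Z = 1 to 35) is acceptable, and that the "T" letter is only meaningful for C, N, O, Si, P and S. The function name and the reason strings are hypothetical, and the actual checks performed by R.E.D. may differ.

```python
import re

# Symbols of the elements with atomic number Z = 1 to 35 (H to Br).
ELEMENTS_UP_TO_BR = (
    "H He Li Be B C N O F Ne Na Mg Al Si P S Cl Ar K Ca "
    "Sc Ti V Cr Mn Fe Co Ni Cu Zn Ga Ge As Se Br"
).split()

# Elements that may carry the "T" letter according to the rules above.
T_ELIGIBLE = {"C", "N", "O", "Si", "P", "S"}

# Chemical symbol, optional "T", positive integer without leading zero.
NAME_RE = re.compile(r"([A-Z][a-z]?)(T?)([1-9][0-9]*)")


def check_name(name):
    """Return None if the name looks valid, otherwise a short reason."""
    match = NAME_RE.fullmatch(name)
    if match is None:
        return "not of the form symbol + optional T + positive integer"
    symbol, t_flag, _ = match.groups()
    if symbol not in ELEMENTS_UP_TO_BR:
        return "element not supported (Z > 35 or unknown symbol)"
    if t_flag and symbol not in T_ELIGIBLE:
        return "T letter not defined for this element"
    return None


# Names taken from the remarks above.
for name in ["HO1", "1H2", "O4'", "I1", "Ag1", "H1", "H2", "O4"]:
    print(name, "->", check_name(name) or "ok")
```

Returning a reason rather than a plain boolean makes it easier to see which rule a rejected name breaks when a long P2N file is checked by hand.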

Limitations of the R.E.D. versions I and II and creation of the P2N file format
With the R.E.D. versions I and II, only a single PDB file with the regular column of atom names could be used as input for a R.E.D. execution. The user had to manually modify these atom names for RESP input generation before running R.E.D. according to the rules previously defined. Although quite simple and flexible to use, such a strategy presented two important limitations:
* Such PDB atom name editing is tedious, time-consuming, and error-prone for large and multiple molecules.
* The international atom name conventions defined for those atom names are lost. This is particularly problematic when the atom names with those conventions are required in AMBER and CHARMM force field libraries.
Thus, to solve these problems, PDB files with two different types of atom names (one used in RESP input generation, and the other used to keep those atom name conventions) are now used by R.E.D. III.x as inputs. This type of PDB file has been named P2N, for "PDB file with two atom names" (by analogy with the PQR file format).

Justification of these atom naming rules
A different strategy could have been followed for RESP input generation in R.E.D. Indeed, an algorithm based on the detection of chemical group topology could have been used for fully automatic RESP input generation. Such an approach has the advantage that RESP inputs are generated without any need to modify PDB or P2N atom names, and is therefore clearly simpler, in particular for novice users. However, in our opinion it also has strong limitations. The corresponding program is used as a "black box", meaning that the user generates charge values without understanding the scientific basis behind them. Moreover, it is rigid and can only generate a single set of RESP inputs and atom charge values for a molecule.
The R.E.D. program is a tool designed for automatic charge derivation, but also for the study and improvement of atom charge models. Thus, we decided to develop an algorithm which allows automatic charge derivation, but also lets the user control and modify it. In our opinion, the creation of the P2N file format associated with simple atom naming rules is a good compromise between the automation required in charge derivation and force field library development, and the ability to study and improve atom charge models. Thus, by modifying atom names in P2N files one can generate many different sets of charge values based on ESP, standard and non-standard RESP charge derivation. One can also fully control charge equivalencing of chemically equivalent atoms, and of groups of atoms that one considers equivalent. Indeed, so far no algorithm is available that allows a general, efficient and flexible charge equivalencing approach for groups of atoms. Under these conditions, we believe that only a human being can decide whether or not to make groups of atoms equivalent.