Analysis

The analysis module is used to analyze molecular trajectories generated by the NWChem molecular dynamics module, or partial charges generated by the NWChem electrostatic potential fit module. This module should not be run in parallel mode.

Directives for the analysis module are read from an input deck,

analysis
...
end

The analysis is performed as post-analysis of trajectory files using the task directive

task analysis

or

task analyze

System specification

system <string systemid>_<string calcid>

where the strings systemid and calcid are user-defined names for the chemical system and the type of calculation to be performed, respectively. These names are used to derive the filenames used for the calculation. The topology file used will be systemid.top, while all other files are named systemid_calcid.ext.

Reference coordinates

Most analyses require a set of reference coordinates. These coordinates are read from an NWChem restart file by the directive,

reference <string filename>

where filename is the name of an existing restart file. This input directive is required.

File specification

The trajectory file(s) to be analyzed are specified with

file <string filename> [<integer firstfile> <integer lastfile>]

where filename is an existing trj trajectory file. If firstfile and lastfile are specified, the specified filename needs to have a ? wild card character that will be substituted by the 3-character integer number from firstfile to lastfile, and the analysis will be performed on the series of files. For example,

file tr_md?.trj 3 6

will instruct the analysis to be performed on files tr_md003.trj, tr_md004.trj, tr_md005.trj and tr_md006.trj.
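The wildcard substitution described above can be illustrated with a short Python sketch. It is not part of NWChem; the function name expand_trajectory_files is hypothetical, and the sketch assumes that the pattern contains exactly one ? character.

```python
def expand_trajectory_files(pattern, first, last):
    """Expand a '?' wildcard into zero-padded 3-digit file numbers.

    Hypothetical helper for illustration only; it mirrors the
    substitution described for the file directive.
    """
    if pattern.count("?") != 1:
        raise ValueError("pattern must contain exactly one '?' wildcard")
    # The range is inclusive of both firstfile and lastfile.
    return [pattern.replace("?", f"{n:03d}") for n in range(first, last + 1)]


if __name__ == "__main__":
    for name in expand_trajectory_files("tr_md?.trj", 3, 6):
        print(name)
```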

The subset of frames to be analyzed from the specified files is selected with the frames directive.

For example, to analyze the first 100 frames from the specified trajectory files, use

frames 100

To analyze every 10-th frame between frames 200 and 400 recorded on the specified trajectory files, use

frames 200 400 10
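The two examples above can be mimicked in Python as follows. This is a sketch under stated assumptions, not a description of the NWChem implementation: frames are assumed to be numbered from 1, the last frame is assumed to be included, and counting is assumed to start at the first selected frame. The function name select_frames and its parameter names are hypothetical.

```python
def select_frames(last, first=1, frequency=1):
    """Return the 1-based frame numbers that would be analyzed.

    Assumptions (not confirmed by the documentation): numbering
    starts at 1, 'last' is inclusive, and every 'frequency'-th
    frame is counted starting from 'first'.
    """
    if first < 1 or last < first or frequency < 1:
        raise ValueError("expected 1 <= first <= last and frequency >= 1")
    return list(range(first, last + 1, frequency))


if __name__ == "__main__":
    print(len(select_frames(100)))           # analogue of: frames 100
    print(select_frames(400, 200, 10)[:3])   # analogue of: frames 200 400 10
```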

A time offset can be specified with

time <real timoff>

Solute coordinates of the reference set and each subsequent frame read from a trajectory file are translated to have the center of geometry of the specified solute molecule at the center of the simulation box. After this translation all molecules are folded back into the box according to the periodic boundary conditions. The directive for this operation is

center <integer imol> [<integer jmol default imol>]
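The centering and folding operation can be sketched in Python as shown below. The sketch makes several assumptions that the text does not state: the box is rectangular and centered at the origin, each molecule is folded as a whole according to its own center of geometry, and only a single molecule (0-based index imol) defines the center. All function names are hypothetical.

```python
import math


def center_of_geometry(atoms):
    """Unweighted mean position of a list of (x, y, z) tuples."""
    n = len(atoms)
    return tuple(sum(a[k] for a in atoms) / n for k in range(3))


def center_and_fold(molecules, imol, box):
    """Center molecule 'imol' in the box, then fold all molecules back.

    molecules: list of molecules, each a list of (x, y, z) tuples
    imol:      0-based index of the molecule to center (assumption)
    box:       (lx, ly, lz) of a rectangular box centered at the origin
    """
    cog = center_of_geometry(molecules[imol])
    # Translate everything so the chosen molecule sits at the box center.
    shifted = [[tuple(a[k] - cog[k] for k in range(3)) for a in mol]
               for mol in molecules]
    folded = []
    for mol in shifted:
        c = center_of_geometry(mol)
        # Whole box lengths needed to bring the molecule's center of
        # geometry into the interval [-L/2, L/2) in each dimension.
        shift = tuple(box[k] * math.floor(c[k] / box[k] + 0.5)
                      for k in range(3))
        folded.append([tuple(a[k] - shift[k] for k in range(3)) for a in mol])
    return folded
```

Folding whole molecules rather than individual atoms keeps each molecule intact, which matches the statement that molecules, not atoms, are folded back into the box.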

Coordinates of each frame read from a trajectory file can be rotated using

rotate ( off | x | y | z ) <real angle units degrees>

If center was defined, rotation takes place after the system has been centered. The rotate directives only apply to frames read from the trajectory files, and not to the reference coordinates. Up to 100 rotate directives can be specified, which will be carried out in the order in which they appear in the input deck. rotate off cancels all previously defined rotate directives.
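The ordering rules for rotate directives can be illustrated with the following Python sketch. It assumes right-handed rotations about the coordinate axes through the origin; the sign convention and rotation center used by NWChem are not stated here, so both are assumptions. The function names and the representation of directives as (axis, angle) tuples are hypothetical.

```python
import math

MAX_ROTATIONS = 100  # limit stated in the text above


def rotate_point(point, axis, degrees):
    """Rotate one (x, y, z) point about a coordinate axis (right-handed)."""
    t = math.radians(degrees)
    c, s = math.cos(t), math.sin(t)
    x, y, z = point
    if axis == "x":
        return (x, c * y - s * z, s * y + c * z)
    if axis == "y":
        return (c * x + s * z, y, -s * x + c * z)
    if axis == "z":
        return (c * x - s * y, s * x + c * y, z)
    raise ValueError(f"unknown axis: {axis}")


def apply_rotations(coords, directives):
    """Apply rotate directives in input order; 'off' cancels earlier ones.

    directives: list of (axis, angle) tuples, e.g. ("z", 90.0),
    or ("off", None) to discard all previously defined rotations.
    """
    active = []
    for axis, angle in directives:
        if axis == "off":
            active.clear()
        else:
            active.append((axis, angle))
    if len(active) > MAX_ROTATIONS:
        raise ValueError("at most 100 rotate directives are allowed")
    result = list(coords)
    for axis, angle in active:
        result = [rotate_point(p, axis, angle) for p in result]
    return result
```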

In the select directive, {atomlist} is the set of atom names selected from the specified residues. By default all solute atoms are selected. When keyword super is specified the selection applies to the superimposition option.

The selected atoms are specified by the string atomlist which takes the form

[{isgm [ - jsgm ] [,]} [:] [{aname[,]}]

where isgm and jsgm are the first and last residue numbers, and aname is an atom name. In the atom name a question mark may be used as a wildcard character.

For example, all protein backbone atoms are selected by

select _N,_CA,_C

To select the backbone atoms in residues 20 to 80 and 90 to 100 only, use

select 20-80,90-100:_N,_CA,_C

This selection is reset to apply to all atoms after each file directive.
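To make the atomlist syntax concrete, the following Python sketch parses strings such as those in the examples above. It is an illustration only, not the NWChem parser. It assumes that a string without a colon holds residue ranges if every item is numeric and atom names otherwise, that an empty residue or name part means no restriction, and that a question mark matches exactly one character. All function names are hypothetical.

```python
import re

_RANGE = re.compile(r"^(\d+)(?:-(\d+))?$")


def parse_atomlist(atomlist):
    """Split an atomlist string into residue ranges and atom names."""
    text = atomlist.replace(" ", "")
    if ":" in text:
        res_part, name_part = text.split(":", 1)
    elif all(_RANGE.match(t) for t in text.split(",") if t):
        res_part, name_part = text, ""   # only residue numbers given
    else:
        res_part, name_part = "", text   # only atom names given
    ranges = []
    for token in filter(None, res_part.split(",")):
        m = _RANGE.match(token)
        if not m:
            raise ValueError(f"bad residue range: {token}")
        first = int(m.group(1))
        last = int(m.group(2)) if m.group(2) else first
        ranges.append((first, last))
    names = [t for t in name_part.split(",") if t]
    return ranges, names


def name_matches(pattern, name):
    """Match an atom name; '?' stands for any single character (assumption)."""
    return len(pattern) == len(name) and all(
        p in ("?", c) for p, c in zip(pattern, name))


def is_selected(residue, atom_name, ranges, names):
    """True if the atom falls in a residue range and matches a name."""
    in_residue = not ranges or any(lo <= residue <= hi for lo, hi in ranges)
    in_names = not names or any(name_matches(p, atom_name) for p in names)
    return in_residue and in_names
```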

Solvent molecules within range nm from any selected solute atom are selected by

select solvent <real range>

After solvent selection, the solute atom selection is reset to being all selected.

The current selection can be saved to, or read from a file using the save and read keywords, respectively.

Some analyses are performed on groups of atoms. These groups of atoms are defined by

define <integer igroup> [<real rsel>] [solvent] { <string atomlist> }

The string atomlist in this definition again takes the form

[{isgm [ - jsgm ] [,]} [:] [{aname[,]}]

where isgm and jsgm are the first and last residue numbers, and aname is an atom name. In the atom name a question mark may be used as a wildcard character.

Multiple define directives can be used to define a single set of atoms.

Coordinate analysis

To analyze the root mean square deviation from the specified reference coordinates:

where igroup specifies the group of atoms defined with a define directive. Keyword periodic can be used to specify the periodicity: ipbc=1 for periodicity in z, ipbc=2 for periodicity in x and y, and ipbc=3 for periodicity in x, y and z. Currently the only option is local, which prints all selected solute atoms with a distance between rsel and rval from the atoms defined in igroup. The actual analysis is done by the scan directive.

A formatted report is printed from group analyses using

where igroup and jgroup are groups of atoms defined with a define directive. Keyword periodic specifies that periodic boundary conditions need to be applied in ipbc dimensions. The type of analysis is defined by function, value1 and value2. If filename is specified, the analysis is applied to the reference coordinates and written to the specified file. If no filename is given, the analysis is applied to the specified trajectory and performed as part of the scan directive. Implemented analyses defined by <string function> [<real value1> [<real value2>]] include

distance to calculate the distance between the centers of geometry of the two specified groups of atoms, and

distances to calculate all atomic distances between atoms in the specified groups that lie between value1 and value2.
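The two group analyses listed above can be sketched in Python as follows. This is a minimal illustration that ignores periodic boundary conditions and assumes the value1 and value2 bounds are inclusive; neither detail is confirmed by the text. The function names are hypothetical.

```python
import math


def center_of_geometry(atoms):
    """Unweighted mean position of a list of (x, y, z) tuples."""
    n = len(atoms)
    return tuple(sum(a[k] for a in atoms) / n for k in range(3))


def group_distance(group_i, group_j):
    """Analogue of 'distance': separation of the two centers of geometry."""
    return math.dist(center_of_geometry(group_i), center_of_geometry(group_j))


def atomic_distances(group_i, group_j, value1, value2):
    """Analogue of 'distances': atom pairs whose separation lies
    between value1 and value2 (bounds assumed inclusive).

    Returns (index in group_i, index in group_j, distance) tuples.
    """
    pairs = []
    for i, a in enumerate(group_i):
        for j, b in enumerate(group_j):
            d = math.dist(a, b)
            if value1 <= d <= value2:
                pairs.append((i, j, d))
    return pairs
```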

where idef is the atom group definition number, length is the size of the histogram, zcoordinate is currently the only histogram option, and filename is the name of the file to which the histogram is written.

Order parameters are evaluated using

order <integer isel> <integer jsel> <string atomi> <string atomj>

This is an experimental feature.

To write the average coordinates of a trajectory

average [super] <string filename>

To perform the coordinate analysis:

scan [ super ] <string filename>

which will create, depending on the specified analysis options, files filename.rms and filename.ana. After the scan directive, previously defined coordinate analysis options are all reset. The optional keyword super specifies that frames read from the trajectory file(s) are superimposed to the reference structure before the analysis is performed.
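The root mean square deviation from reference coordinates, introduced at the start of this section, can be sketched in Python as shown below. The sketch computes the deviation directly, without the superimposition that the super keyword requests, and it assumes that the frame and the reference list the same atoms in the same order. The function name rmsd is hypothetical and the code is not taken from NWChem.

```python
import math


def rmsd(frame, reference):
    """Root mean square deviation between two equally ordered coordinate sets.

    frame, reference: lists of (x, y, z) tuples for the same atoms.
    No superimposition is performed (assumption for this sketch).
    """
    if not frame or len(frame) != len(reference):
        raise ValueError("coordinate sets must be non-empty and of equal length")
    total = sum((a[k] - b[k]) ** 2
                for a, b in zip(frame, reference)
                for k in range(3))
    return math.sqrt(total / len(frame))
```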

Essential dynamics analysis

Essential dynamics analysis is performed by

essential

This can be followed by one or more

project <integer vector> <string filename>

to project the trajectory onto the specified vector. This will create files filename with extensions frm or trj, val, vec, _min.pdb and _max.pdb, with the projected trajectory, the projection value, the eigenvector, and the minimum and maximum projection structure.

For example, an essential dynamics analysis with projection onto the first vector, generating files firstvec.{trj, val, vec, _min.pdb, _max.pdb}, is performed by

essential
project 1 firstvec

Trajectory format conversion

To write a single frame in PDB or XYZ format, use

write [<integer number default 1>] [super] [solute] <string filename>

To copy the selected frames from the specified trajectory file(s) onto a new file, use

copy [solute] [rotate <real tangle>] <string filename>

To superimpose the selected atoms for each specified frame to the reference coordinates before copying onto a new file, use

super [solute] [rotate <real tangle>] <string filename>

The rotate directive specifies that the structure will make a full rotation every tangle ps. This directive only has effect when writing povray files.

The format of the new file is determined from the extension, which can be one of

The input coordinates are taken from the xyzq file that can be generated from an rst file by the prepare module. Parameter spacing specifies the number of grid points per nm, and rcut specifies the extent of the charge grid beyond the molecule. Periodic boundaries will be used if periodic is specified. If iper is set to 2, periodic boundary conditions are applied in the x and y dimensions only. If periodic is specified, a negative value of rcut will extend the grid in the periodic dimensions by abs(rcut), otherwise this value will be ignored in the periodic dimensions. The resulting plt formatted file pltfile can be viewed with the gOpenMol program. The resulting electrostatic potential grid is in units of kJ mol⁻¹ e⁻¹. If no files are specified, only the parameters are set. This analysis applies to solute(s) only.