17 June 2013

They work! I've also added a number of converted basis sets to the sourceforge repo under 'examples'. You'll also find example ecp and ECPOrbital files.

Phew...

Here's the README:

The programmes are not 'intelligent' -- they won't check that you are doing something reasonable. Bad input = bad output.
__Installation__:
Download eccepag and nwbas2ecce
They are both python (2.7) programmes, so you will need to install python to run them.
On linux, this is normally very easy. E.g. on debian, run
'sudo apt-get install python2.7'
and you are done.
If you want, you can put the files in /usr/local/bin and do
'sudo chmod +x /usr/local/bin/eccepag'
'sudo chmod +x /usr/local/bin/nwbas2ecce'
and you will be able to call the scripts from any directory.
__Usage__
nwbas2ecce can turn a full basis set, or an ECP basis set, into an ECCE-compatible set of basis set files.
Typically, an nwchem basis set consists of a single file, e.g. 3-21g. It can also be divided into several files, e.g. def2-svp and def2-ecp, where the effective core potentials (ecps) are in def2-ecp. Other basis set files, like lanl2dz_ecp, contain both the orbital and the core potential parts.
Typically, an ECCE basis set suite consists of:
basis.BAS
basis.BAS.meta
basis.POT (for ECP)
basis.POT.meta (for ECP)
Sometimes polarization and diffuse functions are separated from the main .BAS file. E.g. 3-21++G* consists of
3-21G.BAS
3-21GS.BAS
POPLDIFF.BAS
plus the corresponding meta files. The meta files are just markup-language type files containing e.g. references. Note that you don't HAVE to break up the basis set components like this.
Since the basis set data can be broken up into smaller files, the overall basis set is defined as an entry in a category file.
For example, 3-21G is defined in the category file 'pople', and points to 3-21G.BAS. 3-21G* is also defined in pople, but points to both 3-21G.BAS and 3-21GS.BAS.
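To make the category-file idea concrete, here is a minimal python sketch that looks up which data files a named basis set points to. It assumes the 'name=' / 'files=' / 'atoms=' layout that appears in the ECPOrbital example further down in this README. The function names and the cut-down 'pople' excerpt (including its atom lists) are made up for illustration, and none of this is taken from ECCE itself.

```python
def parse_category(text):
    """Parse 'key= value' lines; each 'name=' line starts a new entry."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or "=" not in line:
            continue  # skip blank or unrecognised lines
        key, value = line.split("=", 1)
        key = key.strip()
        if key == "name":
            entries.append({"name": value.strip()})
        elif entries:
            # files= and atoms= hold whitespace-separated lists;
            # a repeated key within one entry overwrites the earlier value
            entries[-1][key] = value.split()
    return entries


def files_for(entries, name):
    """Return the list of data files that a named basis set points to."""
    for entry in entries:
        if entry["name"] == name:
            return entry.get("files", [])
    raise KeyError(name)


# Hypothetical excerpt of a category file; the atom lists are placeholders
POPLE = """\
name= 3-21G
files= 3-21G.BAS
atoms= H C N O
name= 3-21G*
files= 3-21G.BAS 3-21GS.BAS
atoms= H C N O
"""

entries = parse_category(POPLE)
print(files_for(entries, "3-21G"))
print(files_for(entries, "3-21G*"))
```

The point is only that one .BAS file can be shared by several named basis sets, which is why the name-to-files mapping lives in the category file rather than in the data files.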
ECP works in a similar way, by combining a .BAS and a .POT file. Note that the .POT files look different from the .BAS files.
nwbas2ecce generates .BAS and .POT files based on whether there are basis/end or ecp/end sections in the nwchem basis set file. If there are both, both POT and BAS files are generated.
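The decision described above can be sketched roughly as follows. This is not the actual nwbas2ecce code, just an illustration of the idea: it assumes that a section opens with a line whose first word is 'basis' or 'ecp' and is closed by 'end'. The function names are hypothetical and the numbers in the sample are placeholders, not real exponents or coefficients.

```python
def find_sections(text):
    """Return the set of section keywords ('basis', 'ecp') that open a block."""
    found = set()
    for line in text.splitlines():
        tokens = line.split()
        if tokens and tokens[0].lower() in ("basis", "ecp"):
            found.add(tokens[0].lower())
    return found


def planned_outputs(text, stem):
    """List the ECCE data files that would be generated for this input."""
    sections = find_sections(text)
    outputs = []
    if "basis" in sections:
        outputs.append(stem + ".BAS")
    if "ecp" in sections:
        outputs.append(stem + ".POT")
    return outputs


# Hand-made fragment containing both kinds of section (placeholder numbers)
SAMPLE = """\
basis "Rb_example" SPHERICAL
Rb    S
      1.0000000              0.5000000
end
ecp "Rb_example"
Rb nelec 28
Rb ul
2      1.0000000              0.0000000
end
"""

print(planned_outputs(SAMPLE, "EXAMPLE"))
```

A file with only a basis/end section would give just the .BAS name, and a file with only an ecp/end section just the .POT name.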
All these files are contained in server/data/Ecce/system/GaussianBasisSetLibrary
Finally, you need to generate .pag and .dir files that go into the server/data/Ecce/system/GaussianBasisSetLibrary/.DAV directory. The .dir file is always empty, while the .pag file is unfortunately a binary file. eccepag can, however, generate it with the right input.
See e.g. http://verahill.blogspot.com.au/2013/06/455-adding-nwchem-basis-sets-to-ecce.html for more detailed information
__Example__
We'll use def2-svp as an example. The nwchem basis set file def2-svp contains the basis set, while def2-ecp contains the core potentials. Use def2-svp to generate DEF2_SVP.BAS, DEF2_SVP.BAS.meta. Use def2-ecp to generate DEF2_ECP.POT, DEF2_ECP.POT.meta.
As part of the generation, .descriptor files are also generated. These contain information that should go into the category file(s).
Then generate the .pag files for both the POT and the BAS files, and touch the .dir files into existence.
Do it like this:
nwbas2ecce -i def2-svp -o DEF2_SVP.BAS -n 'def2-svp'
nwbas2ecce -i def2-ecp -p DEF2_ECP.POT -n 'def2-ecp'
eccepag -n def2-svp -t ECPOrbital -c ORBITAL -y Segmented -s Y -o DEF2_SVP.BAS.pag
eccepag -n def2-ecp -t ecp -c AUXILIARY -o DEF2_ECP.POT.pag
NOTE: I don't actually know whether def2-svp is segmented and spherical. I don't think it matters for the .pag file generation. Also note that most inputs are case sensitive. Look at a similar .pag file for hints.
You now have the following files:
DEF2_ECP.POT
DEF2_ECP.POT.descriptor
DEF2_ECP.POT.meta
DEF2_ECP.POT.pag
DEF2_SVP.BAS
DEF2_SVP.BAS.descriptor
DEF2_SVP.BAS.meta
DEF2_SVP.BAS.pag
Copy the files. Note that you need to select the correct target directory, and that will vary with where you installed ECCE. I'll assume it's in /opt/ecce
cp DEF2* /opt/ecce/server/data/Ecce/system/GaussianBasisSetLibrary
cd /opt/ecce/server/data/Ecce/system/GaussianBasisSetLibrary
mv *.pag .DAV/
touch .DAV/DEF2_SVP.BAS.dir .DAV/DEF2_ECP.POT.dir
cat DEF2_SVP.BAS.descriptor >> ECPOrbital
cat DEF2_ECP.POT.descriptor >> ECPOrbital
cat DEF2_ECP.POT.descriptor >> ecp
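Since the scripts don't check anything for you, a small sanity check of the file layout can be useful at this point. The python sketch below is an assumption-laden helper (the function name is made up, and it is not part of eccepag or nwbas2ecce): for each data file it expects the file itself and its .meta in the library directory, and the .pag and .dir files in .DAV, as described above.

```python
import os


def missing_files(library_dir, data_files):
    """Return the expected paths that do not exist for the given data files."""
    dav = os.path.join(library_dir, ".DAV")
    missing = []
    for name in data_files:
        expected = [
            os.path.join(library_dir, name),            # e.g. DEF2_SVP.BAS
            os.path.join(library_dir, name + ".meta"),  # references etc.
            os.path.join(dav, name + ".pag"),           # binary, from eccepag
            os.path.join(dav, name + ".dir"),           # empty, from touch
        ]
        missing.extend(p for p in expected if not os.path.isfile(p))
    return missing


if __name__ == "__main__":
    # Assumes ECCE lives in /opt/ecce, as in the example above
    lib = "/opt/ecce/server/data/Ecce/system/GaussianBasisSetLibrary"
    for path in missing_files(lib, ["DEF2_SVP.BAS", "DEF2_ECP.POT"]):
        print("missing: " + path)
```

If nothing is printed, all eight files are where this README says they should be. It says nothing about whether their contents are right.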
Edit ECPOrbital so that it reads:
name= def2-svp
files= DEF2_SVP.BAS DEF2_ECP.POT
atoms= H He Li Be B C N O F Ne Na Mg Al Si P S Cl Ar K Ca Sc Ti V Cr Mn Fe Co Ni Cu Zn Ga Ge As Se Br Kr Rb Sr Y Zr Nb Mo Tc Ru Rh Pd Ag Cd In Sn Sb Te I Xe Cs Ba La Hf Ta W Re Os Ir Pt Au Hg Tl Pb Bi Po At Rn
atoms= Rb Sr Y Zr Nb Mo Tc Ru Rh Pd Ag Cd In Sn Sb Te I Xe Cs Ba La Hf Ta W Re Os Ir Pt Au Hg Tl Pb Bi Po At Rn

19 June 2012

It's no secret that I'm a computational 'noob'. As such, I'm learning both by reading and by doing.

The doing part consists of checking 1) what the time penalty for different methods is and 2) what the accuracy/differences between different methods are.

Again, these are short calculations for simple molecules. Longer calculations with more exciting features (unpaired electrons, closely spaced MOs, highly negative charges etc.) may well behave completely differently.

*** using freq calc of neutral species with cosmo, vs freq calc of cation in gas phase and energy w/ cosmo

4. Spectra
We'll use octave for this. First, using cat and gawk, I put the x/y coordinates in a file.
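For reference, the same kind of column extraction can be sketched in python instead of cat and gawk. This is not the command I used: it assumes the input is whitespace-separated text with the x and y values in known columns, and the function names and default column numbers are made up for illustration.

```python
def extract_xy(lines, x_col=0, y_col=1):
    """Pick two numeric columns out of whitespace-separated text lines."""
    pairs = []
    for line in lines:
        fields = line.split()
        try:
            pairs.append((float(fields[x_col]), float(fields[y_col])))
        except (IndexError, ValueError):
            continue  # header, blank, or non-numeric line
    return pairs


def write_xy(pairs, path):
    """Write the pairs as a plain two-column text file."""
    with open(path, "w") as handle:
        for x, y in pairs:
            handle.write("%g %g\n" % (x, y))


# Made-up input lines, only to show which lines are kept
demo = ["freq inten", "100.0 1.5", "", "200 abc", "300 2"]
print(extract_xy(demo))
```

A plain two-column text file like the one write_xy produces can then be read into octave with load.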

5. Conclusions
It may seem weird that as a test case I picked a species I don't have any reference potential for. However, the goal here was to understand how the basis set affects the results, without being distracted by such things as Real Life.

The observed spectra can be divided into two groups: 3-21G/6-31G vs 6-31++G**/cc-pVDZ/aug-cc-pVDZ. Polarization (and diffuse functions) seem to play a large role.

In terms of thermochemistry, not surprisingly aug-cc-pVDZ and 6-31++G** give very similar results, since they both include polarization and diffuse functions. The computational cost is, however, significantly higher for aug-cc-pVDZ than for 6-31++G**, at least in nwchem.

There is also little difference between doing freq calculations in gas phase vs using cosmo when it comes to the calculated redox potential for the more extensive basis sets.

3-21G gives highly variable results: the highest potential in the gas phase, but the second lowest potential with cosmo. cc-pVDZ consistently gives the lowest potential.

UHF/ROHF/HF are fast, but wildly inaccurate. LANL2DZ/6-31+G* looks ok, results-wise, but the thermodynamic corrections are actually much smaller in conjunction with COSMO than the other methods, which is suspicious.

If given the time I may post a more detailed analysis of polarisation vs diffuse functions later.