KS equations

We can transform the generalised eigenvalue problem given by the KS equations above to get rid of the nasty sum on the right-hand side.
In this case $\epsilon_{ij}$ is diagonal, and we get an eigenvalue problem:
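
A sketch of that step, in notation that is assumed here rather than taken from these notes (the operator symbol $\hat{H}_{KS}$ and the unitary matrix $U$ are illustrative). Because the matrix of Lagrange multipliers $\epsilon_{ij}$ is Hermitian, a unitary rotation among the occupied orbitals, $\psi_k' = \sum_i U_{ki}\psi_i$, can be chosen so that $U \epsilon U^{\dagger}$ is diagonal, and the density is unchanged by the rotation:
$$
\hat{H}_{KS}\,\psi_i = \sum_j \epsilon_{ij}\,\psi_j
\quad\longrightarrow\quad
\hat{H}_{KS}\,\psi_k' = \epsilon_k'\,\psi_k'
$$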

The notation $E[\{\psi_i\};\{\mathbf{R}_I\}]$ indicates that the energy depends on the (occupied) KS orbitals (which in turn give a density) and parametrically on the coordinates $\mathbf{R}_I$ of all nuclei $I$.

Gaussian basis sets (many matrix elements can be evaluated analytically)

We go a bit further than implied above: to be more accurate, we *contract* several Gaussians to form approximate atomic orbitals
$$
\phi_{\alpha} (\mathbf{r}) = \sum_m d_{m\alpha}g_m(\mathbf{r})
$$
where a primitive cartesian Gaussian centred at the origin is given by
$$
g_m(\mathbf{r}) = x^{m_x}y^{m_y}z^{m_z} e^{-\alpha_m r^2}
$$
and $m_x + m_y + m_z = l$, the angular momentum quantum number of the functions.
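
The two formulas above can be sketched numerically. This is only an illustration: the function names are hypothetical, the exponents and coefficients are made-up numbers rather than values from any CP2K basis set file, and normalisation is ignored.

```python
import numpy as np

def primitive_gaussian(r, powers, exponent):
    """Unnormalised primitive Cartesian Gaussian centred at the origin:
    x^mx * y^my * z^mz * exp(-exponent * r^2)."""
    x, y, z = r
    mx, my, mz = powers
    r2 = x * x + y * y + z * z
    return x**mx * y**my * z**mz * np.exp(-exponent * r2)

def contracted_gaussian(r, powers, exponents, coefficients):
    """Contraction phi(r) = sum_m d_m * g_m(r) over primitives that share
    the same Cartesian powers but have different exponents."""
    return sum(
        d * primitive_gaussian(r, powers, a)
        for a, d in zip(exponents, coefficients)
    )

# Made-up exponents (Bohr^-2) and contraction coefficients, for illustration only.
exponents = [2.0, 0.5]
coefficients = [0.4, 0.6]

# An s-type function (l = 0) and a p_x-type function (l = 1).
s_at_origin = contracted_gaussian((0.0, 0.0, 0.0), (0, 0, 0), exponents, coefficients)
px_at_origin = contracted_gaussian((0.0, 0.0, 0.0), (1, 0, 0), exponents, coefficients)
px_on_x_axis = contracted_gaussian((1.0, 0.0, 0.0), (1, 0, 0), exponents, coefficients)
```

At the origin the s-type contraction reduces to the sum of its coefficients, while the p-type function vanishes there because of the factor $x$.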

Operator matrices are sparse. Neglecting overlap elements smaller than a threshold set indirectly by `EPS_DEFAULT` gives the Gaussian functions finite support. This means that the overlap, Kohn-Sham and density matrices ($\mathbf{S}$, $\mathbf{H}$ and $\mathbf{P}$) are sparse.
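
A toy illustration of how screening small overlaps produces sparsity, assuming a chain of identical normalised s-type Gaussians. The spacing, exponent and threshold below are arbitrary choices for the sketch, and the threshold is not meant to reproduce how CP2K derives its screening from `EPS_DEFAULT`.

```python
import numpy as np

def s_overlap(a, b, distance):
    """Analytic overlap of two normalised s-type Gaussians with exponents
    a and b (Bohr^-2) whose centres are `distance` Bohr apart."""
    prefactor = (2.0 * np.sqrt(a * b) / (a + b)) ** 1.5
    return prefactor * np.exp(-a * b / (a + b) * distance**2)

n_atoms = 50        # number of centres in the chain (illustrative)
spacing = 3.0       # distance between neighbouring centres, in Bohr (illustrative)
exponent = 0.5      # Gaussian exponent in Bohr^-2 (illustrative)
threshold = 1e-10   # screening threshold (illustrative, not EPS_DEFAULT itself)

positions = spacing * np.arange(n_atoms)
distances = np.abs(positions[:, None] - positions[None, :])

# Dense overlap matrix, then drop every element below the threshold.
S = s_overlap(exponent, exponent, distances)
S_screened = np.where(np.abs(S) >= threshold, S, 0.0)

n_nonzero = np.count_nonzero(S_screened)
fill_fraction = n_nonzero / S.size
print(f"non-zero elements: {n_nonzero} of {S.size} ({fill_fraction:.1%})")
```

Because the overlap decays as a Gaussian in the separation, only a narrow band around the diagonal survives the screening, and the fill fraction falls as the chain gets longer.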

Basis set libraries

There are two main types of basis sets supplied with CP2K:

GTH_BASIS_SETS: atomically optimized sets. These were the first shipped with the code and vary systematically in quality from DZ to QZ for lighter elements. They can be very good for molecular systems, but very poor for condensed matter systems.

BASIS_MOLOPT: molecularly optimized basis sets. These cover most elements of the periodic table, but only at the fairly good DZVP-MOLOPT-SR-GTH level. They should be a good starting point for most condensed matter calculations.

The basis set files provide the contraction coefficients ($d_{m \alpha}$) and exponents ($\alpha_m$) of the Gaussian functions.

Generally the first column is the exponents ($\alpha_m$ above) and the later columns give the $d_{m \alpha}$, each column being a set $\alpha$. Details can be found in the header of the BASIS_MOLOPT file.

Here there are Gaussians with five different exponents (in Bohr$^{-2}$). From these, five sets of functions are built: two $s$ functions (2nd and 3rd columns), two sets of $p$ functions (4th and 5th columns) and one set of $d$ functions (last column).

Note that the contraction coefficients are not varied during the calculation. For the nitrogen basis above we have $2 + 2 \times 3 + 1 \times 5 = 13$ variables to optimize for each nitrogen atom in the system.
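
The count of 13 can be reproduced with a few lines. The sketch assumes spherical functions, so that each set with angular momentum $l$ contributes $2l + 1$ functions, which matches the factors 1, 3 and 5 used above. The function and variable names are hypothetical.

```python
def n_spherical_functions(sets_per_l):
    """Number of basis functions when each contracted set with angular
    momentum l contributes 2l + 1 spherical components.

    sets_per_l maps l -> number of contracted sets with that l.
    """
    return sum(n_sets * (2 * l + 1) for l, n_sets in sets_per_l.items())

# Two s sets, two p sets and one d set, as described in the text.
nitrogen_sets = {0: 2, 1: 2, 2: 1}
n_functions = n_spherical_functions(nitrogen_sets)
print(n_functions)
```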

(CP2K tends to use general contractions for efficiency, allowing maximum use of recursion relations to generate matrix elements for high angular momentum functions).

GTH pseudopotentials

Accurate and transferable with few parameters.
Include scalar relativistic effects.
Available for all elements up to radon ($Z \le 86$).

where $n_{tot}$ includes the nuclear charge as well as the electronic charge.
(The nuclear charge density is (of course) represented as a Gaussian distribution with parameter $R_I^c$, chosen to cancel a similar term from the local part of the pseudopotential.)

If you have PRINT_LEVEL MEDIUM you can see some details of this process at each SCF cycle, in particular how much electronic and core charge has been mapped onto the grids.
For instance, for CH2O with a CUTOFF of 400 Ry and GTH pseudopotentials, the output includes `Trace(PS): 12.0000000000`.
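
The value of 12 is what one expects, since the trace of $\mathbf{P}\mathbf{S}$ gives the number of electrons and only valence electrons are treated explicitly. A minimal check, assuming the usual GTH valence charges of 4 for C, 1 for H and 6 for O (these numbers are an assumption of the sketch, not stated in the text above); the names are hypothetical.

```python
# Assumed number of valence electrons kept by the GTH pseudopotential per element.
valence_charge = {"C": 4, "H": 1, "O": 6}

def valence_electrons(composition):
    """Total valence electron count for a neutral molecule.

    composition maps element symbol -> number of atoms of that element.
    """
    return sum(valence_charge[element] * count for element, count in composition.items())

formaldehyde = {"C": 1, "H": 2, "O": 1}  # CH2O
n_electrons = valence_electrons(formaldehyde)
print(n_electrons)
```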

Energy ripples

GTH pseudopotentials give a small density at the core (figure: density and $v_{XC}$ along a line through a water molecule). The resulting spikes can cause ripples in the energy as atoms move relative to the grid. These can be very problematic when trying to calculate vibrational frequencies.

There are smoothing routines (`&XC_GRID / XC_DERIV`); see the exercise converging_cutoff.

Avoid ripples with a higher cutoff, or the GAPW methodology.

Whatever you do, don't change settings between simulations you want to compare.

Multigrids

When we want to put (collocate) a Gaussian type function onto the real-space grid, we can gain efficiency by using multiple grids with different cutoffs and spacings.
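
A rough sketch of how cutoff and grid spacing are related across a set of grids. It rests on several assumptions that are not stated in these notes: that the cutoffs of successive grids follow a geometric progression, that four grids and a progression factor of 3 are used (believed to be the CP2K defaults for `NGRIDS` and `PROGRESSION_FACTOR`, but check your version), and that the spacing follows the plane-wave relation $h \approx \pi / \sqrt{E_{cut}}$ with $E_{cut}$ in Ry and $h$ in Bohr. The real grids are also adjusted to the cell and to FFT-friendly sizes, so the numbers are only indicative.

```python
import math

def multigrid_cutoffs(cutoff_ry, n_grids=4, progression_factor=3.0):
    """Cutoffs (Ry) of successive grids, finest first, assuming a geometric
    progression: cutoff / progression_factor**level."""
    return [cutoff_ry / progression_factor**level for level in range(n_grids)]

def approx_spacing_bohr(cutoff_ry):
    """Approximate grid spacing (Bohr) for a plane-wave cutoff given in Ry,
    using E_cut = (pi / h)^2 in Rydberg atomic units."""
    return math.pi / math.sqrt(cutoff_ry)

cutoffs = multigrid_cutoffs(400.0)
spacings = [approx_spacing_bohr(c) for c in cutoffs]

for level, (cutoff, spacing) in enumerate(zip(cutoffs, spacings), start=1):
    print(f"grid {level}: cutoff {cutoff:8.2f} Ry, spacing ~{spacing:.3f} Bohr")
```

Sharp Gaussians (large exponents) need the fine grid, while diffuse ones can be represented on the coarser grids with far fewer points, which is where the saving comes from.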