The Benchmark Energy & Geometry Database (BEGDB) collects results of
highly accurate QM calculations of molecular structures, energies and
properties. These data can serve as benchmarks for testing and
parameterization of other computational methods. More information on the
features of the database can be found here.

S66 data set update | 2014-02-17

The MP2C interaction energies were corrected; for details, see the recently published errata to the S66 paper.

A24 data set added | 2013-07-02

New data set of 24 small complexes for which very accurate calculations are possible, intended as a benchmark for benchmark-quality methods.

X40 data set added | 2012-10-23

Benchmark CCSD(T)/CBS interaction energies and results of the tested methods have been added to the database.

Featured Datasets

A set of 24 small complexes for which we report accurate extrapolation to the complete basis set limit at the CCSD(T) level and further corrections for core correlation, relativistic effects and quadruple excitations at the CCSDT(Q) level.

A set of 24 small noncovalent complexes featuring a variety of interaction motifs. The geometries were optimized at the CCSD(T)/CBS level. CCSD(T) interaction energies were extrapolated from a series of aug-cc-pV(T,Q,5) basis sets. The contributions of core correlation and scalar relativistic effects (fourth-order DKH Hamiltonian) were determined at the CCSD(T) level. Finally, an estimate of higher-order contributions to the correlation energy was calculated using CCSDT(Q) in the 6-31G**(0.25,0.15) basis set.

These are seven large complexes (number of atoms: 48-112) stabilized mostly by dispersion interactions. This data set was created to provide a new group of molecules that can be used to evaluate the accuracy of different quantum chemical methods in

Data set of large noncovalent complexes featuring mainly dispersion interactions (both pi-pi stacking and interaction of aliphatic hydrocarbons). The reference binding energies were determined by extrapolating the MP2 binding energies to the complete basis set limit and adding a ΔQCISD(T)/6-31G*(0.25) correction term in the case of the CBH, C3A, C3GC, GGG and PHE complexes, whereas in the case of the C2C2PD and GCGC complexes the Δ correction term was determined at the QCISD(T)/"aug"-cc-pVDZ (cf. Mol. Phys. 2010, 108, 249-257) and CCSD(T)/6-31G**(0.25,0.15) levels of theory, respectively.

38 water clusters containing 2 to 10 waters. Data set contributed by the group of G. C. Shields. (The data set was updated on Oct 14, 2014, fixing a mismatch between energies and geometries in the original entry.)

This is a set of global and local minima of water clusters containing 2-10 waters. The geometry of each isomer was optimized using RI-MP2/aug-cc-pVDZ. The RI-MP2/CBS binding energy was calculated by extrapolating the RI-MP2/aug-cc-pVDZ//aug-cc-pVDZ, RI-MP2/aug-cc-pVTZ//aug-cc-pVDZ, and RI-MP2/aug-cc-pVQZ//aug-cc-pVDZ energies to their complete basis set limit using a 4-5 inverse polynomial scheme that has been used extensively for water clusters. A ΔCCSD(T) correction using the aug-cc-pVDZ basis set was added to estimate the CCSD(T)/CBS binding energy. Zero-point vibrational energy and finite temperature corrections within the ideal-gas-rigid-rotor-harmonic-oscillator (IGRRHO) model using scaled and unscaled harmonic vibrational frequencies can be found in the published article.
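The 4-5 inverse polynomial extrapolation mentioned above can be illustrated with a short sketch. It assumes the commonly quoted functional form E(l) = E_CBS + b/(l+1)^4 + c/(l+1)^5, with l = 2, 3, 4 for the aug-cc-pVDZ, aug-cc-pVTZ and aug-cc-pVQZ basis sets; this form, the function name and the argument layout are assumptions made for illustration and are not taken from the database or its tools.

```python
import numpy as np

def extrapolate_45(energies, l_values=(2, 3, 4)):
    """Estimate the CBS-limit energy from three energies (hypothetical helper).

    Assumed model: E(l) = E_cbs + b / (l + 1)**4 + c / (l + 1)**5,
    where l is the highest angular momentum of the basis (D=2, T=3, Q=4).
    Three energies and three unknowns give an exactly determined linear system.
    """
    l = np.asarray(l_values, dtype=float)
    # One row per basis set; columns multiply E_cbs, b and c respectively.
    design = np.column_stack([np.ones_like(l), (l + 1) ** -4, (l + 1) ** -5])
    e_cbs, b, c = np.linalg.solve(design, np.asarray(energies, dtype=float))
    return e_cbs
```

The same function can be applied to total energies of the cluster and of the monomers, or directly to binding energies, as long as all three inputs refer to the same quantity.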

A set of 40 noncovalent complexes of organic halides, halohydrides and halogen molecules where the halogens participate in a variety of interaction types. The set, named X40, covers electrostatic interactions, London dispersion, hydrogen bonds, halogen bonding, halogen-π interactions and stacking of halogenated aromatic molecules. Interaction energies at equilibrium geometries were calculated using a composite CCSD(T)/CBS scheme where the CCSD(T) contribution is calculated using triple-zeta basis sets with diffuse functions on all atoms but hydrogen.

Geometries were constructed by scaling the closest intermolecular distance in the complexes by factors of 0.8, 0.85, 0.9, 0.95, 1.0, 1.05, 1.1, 1.25, 1.5 and 2.0, starting from the MP2/cc-pVTZ geometry. The benchmark CCSD(T)/CBS interaction energies are based on MP2/CBS calculations in the aug-cc-pVTZ and aug-cc-pVQZ basis sets and a CCSD(T) correction calculated in the aug-cc-pVDZ basis set.

The set contains 23 hydrogen bonds featuring all possible combinations of the most common donor and acceptor groups; 23 dispersion-dominated complexes covering pi-pi, aliphatic-aliphatic and pi-aliphatic interactions; and 20 complexes with mixed electrostatic/dispersion interaction. The benchmark CCSD(T)/CBS interaction energies are based on MP2/CBS calculations in the aug-cc-pVTZ and aug-cc-pVQZ basis sets and a CCSD(T) correction calculated in the aug-cc-pVDZ basis set.
Full text of the paper is available: S66.pdf. More data are available for download. Update: More accurate CCSD(T)/CBS calculations from JCTC 7, 3466 (2011) are available, with the CCSD(T) contribution calculated in heavy-aug-cc-pVTZ and extrapolated from heavy-aug-cc-pV(D,T)Z. To distinguish the different CCSD(T)/CBS setups in the table, the basis set is listed in parentheses.
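The composite CCSD(T)/CBS scheme described above (MP2/CBS from aug-cc-pVTZ and aug-cc-pVQZ plus a CCSD(T) correction in aug-cc-pVDZ) can be sketched as follows. This is a minimal illustration under stated assumptions: the MP2 correlation energy is extrapolated with the two-point X^-3 formula, the Hartree-Fock part is taken from the larger basis, and all function and parameter names are hypothetical. The exact extrapolation used for the database entries is documented in the S66 paper, not here.

```python
def mp2_corr_cbs(e_corr_small, e_corr_large, x_small=3, x_large=4):
    """Two-point X**-3 extrapolation of the MP2 correlation energy.

    Assumes E_corr(X) = E_corr_cbs + A / X**3, with X = 3 for aug-cc-pVTZ
    and X = 4 for aug-cc-pVQZ (an assumption, see the lead-in).
    """
    num = x_large**3 * e_corr_large - x_small**3 * e_corr_small
    return num / (x_large**3 - x_small**3)

def composite_ccsdt_cbs(e_hf_large, e_corr_mp2_small, e_corr_mp2_large,
                        e_mp2_dz, e_ccsdt_dz):
    """Composite estimate: MP2/CBS plus the CCSD(T) correction in a small basis."""
    # MP2/CBS = HF in the larger basis + extrapolated MP2 correlation energy.
    e_mp2_cbs = e_hf_large + mp2_corr_cbs(e_corr_mp2_small, e_corr_mp2_large)
    # Delta CCSD(T) = CCSD(T) - MP2, both in the same small basis (aug-cc-pVDZ).
    delta_ccsdt = e_ccsdt_dz - e_mp2_dz
    return e_mp2_cbs + delta_ccsdt
```

All inputs must be the same kind of quantity (for example, counterpoise-corrected interaction energy components in kcal/mol) for the sum to be meaningful.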

Geometries were constructed by scaling the closest intermolecular distance in the complexes by factors of 0.9, 0.95, 1.0, 1.05, 1.1, 1.25, 1.5 and 2.0, starting from the MP2/cc-pVTZ geometry. The benchmark CCSD(T)/CBS interaction energies are based on MP2/CBS calculations in the aug-cc-pVTZ and aug-cc-pVQZ basis sets and a CCSD(T) correction calculated in the aug-cc-pVDZ basis set. Full text of the paper is available: S66.pdf. More data are available for download.
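The construction of the scaled geometries can be sketched as follows. This is a simplified illustration, assuming that the second monomer is translated rigidly along the vector connecting the closest pair of atoms of the two monomers; the function name and the coordinate layout (one row of x, y, z per atom) are hypothetical, and the procedure used to generate the database geometries may differ in detail.

```python
import numpy as np

def scale_closest_distance(mono_a, mono_b, factor):
    """Return coordinates of monomer B after scaling the closest A-B contact.

    mono_a, mono_b: arrays of shape (n_atoms, 3) with Cartesian coordinates.
    factor: multiple of the original closest intermolecular distance (e.g. 0.9).
    """
    a = np.asarray(mono_a, dtype=float)
    b = np.asarray(mono_b, dtype=float)
    # Vectors from every atom of A to every atom of B, shape (n_a, n_b, 3).
    diff = b[None, :, :] - a[:, None, :]
    dist = np.linalg.norm(diff, axis=2)
    # Indices of the closest intermolecular atom pair.
    i, j = np.unravel_index(np.argmin(dist), dist.shape)
    contact = diff[i, j]
    # Rigid shift of B so that the chosen contact becomes factor * original.
    return b + (factor - 1.0) * contact

# Example: generate the eight scaled structures of a dissociation curve.
factors = (0.9, 0.95, 1.0, 1.05, 1.1, 1.25, 1.5, 2.0)
```

Monomer A stays fixed and the internal geometry of both monomers is unchanged, so only the intermolecular separation varies along the curve.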

Hydrogen bonds featuring ionic groups common in biomolecules (carboxylate, ammonium, guanidinium and imidazolium) interacting with neutral donors/acceptors. The set is constructed analogously to the S66x8 data set and calculated at the same level.

The set was developed for the parameterization of a hydrogen-bonding correction for semiempirical QM methods. Geometries were constructed by scaling the closest intermolecular distance in the complexes by factors of 0.9, 0.95, 1.0, 1.05, 1.1, 1.25, 1.5 and 2.0, starting from the MP2/cc-pVTZ geometry. The benchmark CCSD(T)/CBS interaction energies are based on MP2/CBS calculations in the aug-cc-pVTZ and aug-cc-pVQZ basis sets and a CCSD(T) correction calculated in the aug-cc-pVDZ basis set.

Geometries of noncovalent complexes from the S22 dataset were displaced along the intermolecular axis, forming one shortened and three elongated structures (0.9, 1.2, 1.5 and 2.0 times the original intermolecular distance). The dataset also includes the original geometry (labeled 1.0). CCSD(T)/CBS interaction energies consistent with the original S22 work have been calculated.

The S22 set consists of small to relatively large (30 atoms) complexes of common molecules containing only C, N, O and H, and single, double and triple bonds. Most typical noncovalent interactions, such as hydrogen bonds (XHY), dispersion interactions (stacked parallel, T-shaped), and mixed electrostatic-dispersion interactions are represented. A total of 22 complexes are divided into three subgroups: (i) hydrogen bonded complexes; (ii) complexes with predominant dispersion stabilization; (iii) mixed complexes in which electrostatic and dispersion contributions are similar in magnitude. Counterpoise-corrected gradient optimization was used to obtain the geometries. The smallest complexes were optimized by the CCSD(T) method (numerical gradients) using cc-pVTZ and cc-pVQZ basis sets without counterpoise correction. We believe that our S22 set will manage to represent non-covalent interactions in biological molecules in a balanced way and that it will help to design and test fast computational tools for biologically oriented applications.

A detailed quantum chemical study on five peptides (WG, WGG, FGG, GGF and GFA) containing the residues phenylalanyl (F), glycyl (G), tryptophyl (W) and alanyl (A), where F and W are of aromatic character, is presented. When investigating isolated small peptides, the dispersion interaction is the dominant attractive force in the peptide backbone–aromatic side chain intramolecular interaction. Consequently, an accurate theoretical study of these systems requires a methodology that properly covers the London dispersion forces. For this reason we have assessed the performance of the MP2, SCS-MP2, MP3, TPSS-D, PBE-D, M06-2X, BH&H, TPSS, B3LYP and tight-binding DFT-D methods and the ff99 empirical force field against CCSD(T)/complete basis set (CBS) limit benchmark data. All the DFT techniques with a '-D' suffix have been augmented by an empirical dispersion energy, while the M06-2X functional was parameterized to cover the London dispersion energy. For the systems studied here we have concluded that the use of the ff99 force field is not recommended, mainly due to problems concerning the assignment of reliable atomic charges. Tight-binding DFT-D is efficient as a screening tool providing reliable geometries. Among the DFT functionals, M06-2X and TPSS-D show the best performance, which is explained by the fact that both procedures cover the dispersion energy. The B3LYP and TPSS functionals, which do not cover this energy, fail systematically. Both electronic energies and geometries obtained by means of the wave-function theory methods compare satisfactorily with the CCSD(T)/CBS benchmark data. Conformer energies are set relative to the average for the given peptide in each method.
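The convention of reporting conformer energies relative to the per-peptide average can be written out as a small sketch. The function name and the conformer labels are hypothetical; the input is assumed to hold the energies of all conformers of one peptide computed with one method.

```python
def relative_to_mean(conformer_energies):
    """Shift conformer energies so that their average is zero.

    conformer_energies: dict mapping a conformer label to its energy,
    all for the same peptide and the same method.
    """
    mean = sum(conformer_energies.values()) / len(conformer_energies)
    return {label: energy - mean for label, energy in conformer_energies.items()}
```

Applying the shift separately for every method removes the arbitrary zero of energy, so the relative values from different methods can be compared conformer by conformer.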