This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Preprints) and either DOI or URL of the article must be cited.

Abstract

The PM6 semiempirical method and the dispersion and hydrogen bond-corrected PM6-D3H+ method are used together with the SMD and COSMO continuum solvation models to predict pKa values of pyridines, alcohols, phenols, benzoic acids, carboxylic acids, and amines using isodesmic reactions, and the results are compared to published ab initio results. The pKa values of the pyridines, alcohols, phenols, and benzoic acids considered in this study can generally be predicted with PM6 and ab initio methods to within the same overall accuracy, with average mean absolute differences of 0.6-0.7 pH units. For carboxylic acids the accuracy (0.7-1.0 pH units) is also comparable to ab initio results if a single outlier is removed. For primary, secondary, and tertiary amines the accuracy is, respectively, similar (0.5-0.6), slightly worse (0.5-1.0), and worse (1.0-2.5), provided that di- and triethylamine are used as reference molecules for secondary and tertiary amines. When applied to a drug-like molecule for which an empirical pKa predictor exhibits a large (4.9 pH unit) error, we find that the errors for PM6-based predictions are roughly the same in magnitude but opposite in sign. As a result, most of the PM6-based methods predict the correct protonation state at physiological pH, while the empirical predictor does not. The computational cost is around 2-5 minutes per conformer per processor core, making PM6-based pKa prediction computationally efficient enough to be used for high-throughput screening on the order of 100 processor cores.
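To illustrate the isodesmic approach mentioned above, the following is a minimal sketch that assumes the commonly used relation pKa = pKa(ref) + ΔG°/(RT ln 10), where ΔG° is the free energy change of the proton exchange HA + Ref⁻ ⇌ A⁻ + HRef. The function names, argument names, and units (kcal/mol) are hypothetical and chosen for illustration; they are not taken from the authors' scripts. The second helper uses the Henderson-Hasselbalch relation to estimate the protonated fraction at a given pH.

```python
import math

# Gas constant in kcal/(mol K); units are an assumption of this sketch.
R_KCAL = 1.987204e-3


def isodesmic_pka(g_acid, g_base, g_ref_acid, g_ref_base, pka_ref,
                  temperature=298.15):
    """Estimate a pKa from the proton exchange HA + Ref- <=> A- + HRef.

    All free energies are solution-phase values in kcal/mol (hypothetical
    inputs, e.g. from a semiempirical method with a continuum solvent model).
    pka_ref is the experimental pKa of the reference acid HRef.
    """
    # Free energy change of the isodesmic reaction: products minus reactants.
    delta_g = (g_base + g_ref_acid) - (g_acid + g_ref_base)
    return pka_ref + delta_g / (R_KCAL * temperature * math.log(10))


def fraction_protonated(pka, ph=7.4):
    """Henderson-Hasselbalch fraction of the protonated form at a given pH."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))
```

Under these assumptions, a reaction free energy of about 1.36 kcal/mol at 298.15 K shifts the predicted pKa by one unit relative to the reference, which is why errors of a few kcal/mol in ΔG° can change the predicted protonation state at physiological pH.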

Author Comment

This is a preprint submission to PeerJ Preprints.

Additional Information

Competing Interests

The authors declare that they have no competing interests.

Author Contributions

Jimmy C. Kromann conceived and designed the experiments, performed the experiments, analyzed the data, wrote the paper, prepared figures and/or tables, reviewed drafts of the paper.

Frej Larsen performed the experiments, analyzed the data, reviewed drafts of the paper.

Jan H. Jensen conceived and designed the experiments, analyzed the data, wrote the paper, prepared figures and/or tables, reviewed drafts of the paper.

Data Deposition

The following information was supplied regarding data availability:

The following are made available at https://dx.doi.org/10.6084/m9.figshare.c.3259513.v1: a list of pKa values used for Table \ref{tab:sastre}, all input and output files, a config file for the Balloon program used in the conformational search, and various submit and analysis scripts.

Funding

JCK acknowledges support from the University of Copenhagen. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
