Poster presentation at 5th German Conference on Cheminformatics: 23. CIC-Workshop Goslar, Germany. 8-10 November 2009. We demonstrate the theoretical and practical application of modern kernel-based machine learning methods to ligand-based virtual screening by successful prospective screening for novel agonists of the peroxisome proliferator-activated receptor gamma (PPARgamma) [1]. PPARgamma is a nuclear receptor involved in lipid and glucose metabolism, and related to type-2 diabetes and dyslipidemia. Applied methods included a graph kernel designed for molecular similarity analysis [2], kernel principal component analysis [3], multiple kernel learning [4], and Gaussian process regression [5]. In the machine learning approach to ligand-based virtual screening, one uses the similarity principle [6] to identify potentially active compounds based on their similarity to known reference ligands. Kernel-based machine learning [7] uses the "kernel trick", a systematic approach to deriving non-linear versions of linear algorithms such as separating hyperplanes and regression. Kernel learning requires similarity measures that are positive semidefinite (kernels). The iterative similarity optimal assignment graph kernel (ISOAK) [2] is defined directly on the annotated structure graph and was designed specifically for the comparison of small molecules. In our virtual screening study, its use improved results, e.g., in principal component analysis-based visualization and Gaussian process regression. Following a thorough retrospective validation using a data set of 176 published PPARgamma agonists [8], we screened a vendor library for novel agonists. Subsequent testing of 15 compounds in a cell-based transactivation assay [9] yielded four active compounds.
The most interesting hit, a natural product derivative with a cyclobutane scaffold, is a selective full PPARgamma agonist (EC50 = 10 ± 0.2 microM, inactive on PPARalpha and PPARbeta/delta at 10 microM). We demonstrate how the interplay of several modern kernel-based machine learning approaches can successfully improve ligand-based virtual screening results.
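
The kernel-based workflow described in this abstract can be illustrated with a minimal Python sketch, under stated assumptions. It builds a Gram matrix from a kernel, checks positive definiteness via a Cholesky factorization, and computes the posterior mean of a zero-mean Gaussian process. The Gaussian (RBF) kernel on toy descriptor vectors is only a stand-in: ISOAK, which operates on annotated structure graphs, is not implemented here. All function names, the noise level, and the data values are hypothetical and chosen purely for illustration.

```python
import math


def rbf_kernel(x, y, length_scale=1.0):
    """Gaussian (RBF) kernel; an assumed stand-in for a molecular kernel such as ISOAK."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * length_scale ** 2))


def gram_matrix(xs, ys, kernel):
    """Matrix of pairwise kernel values k(x, y) for x in xs and y in ys."""
    return [[kernel(x, y) for y in ys] for x in xs]


def cholesky(a):
    """Return lower-triangular L with L L^T = a; raise ValueError if a is not positive definite."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                low[i][j] = math.sqrt(s)
            else:
                low[i][j] = s / low[j][j]
    return low


def solve_cholesky(low, b):
    """Solve (L L^T) x = b by forward and backward substitution."""
    n = len(b)
    z = [0.0] * n
    for i in range(n):
        z[i] = (b[i] - sum(low[i][k] * z[k] for k in range(i))) / low[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (z[i] - sum(low[k][i] * x[k] for k in range(i + 1, n))) / low[i][i]
    return x


def gp_fit(train_x, train_y, kernel, noise=1e-2):
    """Return weights alpha = (K + noise * I)^-1 y for a zero-mean Gaussian process."""
    k = gram_matrix(train_x, train_x, kernel)
    for i in range(len(k)):
        k[i][i] += noise  # observation noise on the diagonal
    return solve_cholesky(cholesky(k), train_y)


def gp_predict_mean(test_x, train_x, alpha, kernel):
    """Posterior mean at the test points: K(test, train) @ alpha."""
    k_star = gram_matrix(test_x, train_x, kernel)
    return [sum(k * a for k, a in zip(row, alpha)) for row in k_star]


# Toy data: made-up two-dimensional descriptors and made-up activity values.
train_x = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
train_y = [5.0, 6.0, 6.5, 7.5]
alpha = gp_fit(train_x, train_y, rbf_kernel)
predictions = gp_predict_mean(train_x + [[0.5, 0.5]], train_x, alpha, rbf_kernel)
```

The Cholesky step doubles as a practical validity check: a symmetric similarity matrix that is not positive definite fails the factorization, which is why kernel methods require positive semidefinite similarity measures. Swapping in a different kernel only requires replacing the `kernel` argument.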

Shape complementarity is a necessary condition for molecular recognition [1]. In our 3D ligand-based virtual screening approach, called SQUIRREL, we combine shape-based rigid body alignment [2] with fuzzy pharmacophore scoring [3]. Retrospective validation studies on the family of peroxisome proliferator-activated receptors (PPARs) demonstrate the superiority of methods that combine shape and pharmacophore information. We demonstrate the real-life applicability of SQUIRREL in a prospective virtual screening study, in which a potent PPARalpha agonist with an EC50 of 44 nM and 100-fold selectivity against PPARgamma was identified. SQUIRREL molecular superposition is based on a graph-matching routine [4] and allows partial matching. We exploited this feature to search for bioisosteric replacement suggestions in a database of molecular fragments derived from a collection of drug-like compounds [5]. The bioisosteric groups suggested by our tool SQUIRRELnovo can be used by a human expert for ligand-based de novo design. Using the fibrate derivative GW590735 [6] as query, we designed a novel lead structure by substitution of the acidic head group and hydrophobic tail. Synthesis and subsequent testing in a cell-based reporter gene assay [7,8] revealed that the designed structure activates PPARalpha with an EC50 of 510 nM.

YS-121 [2-(4-chloro-6-(2,3-dimethylphenylamino)pyrimidin-2-ylthio)octanoic acid] is the result of target-oriented structural derivatization of pirinixic acid. It is a potent dual PPARα/γ agonist as well as a potent dual 5-LO/mPGES-1 inhibitor. Additionally, recent studies showed anti-inflammatory efficacy in vivo. Because it acts on multiple targets, YS-121 is a promising drug candidate for the treatment of inflammatory diseases. Ongoing preclinical studies will therefore require large amounts of YS-121. To meet this demand, we have optimized the synthesis of YS-121. Surprisingly, we isolated and characterized byproducts resulting from nucleophilic aromatic substitution reactions of different tertiary alkylamines at a heteroaromatic halide. These amines were intended to serve only as auxiliary bases because of their low nucleophilicity. This unexpected reactivity has not been described in earlier publications on this type of reaction and may therefore be useful for further reaction improvement in general. Furthermore, we propose a mechanism for the formation of these byproducts.