Lazy structure-activity relationships (lazar) for the prediction of rodent carcinogenicity and Salmonella mutagenicity

Abstract

lazar is a new tool for the prediction of toxic properties of chemical structures. It derives predictions for query structures from a database of experimentally determined toxicity data. lazar generates predictions by searching the database for compounds that are similar with respect to a given toxic activity and calculating the prediction from their activities. Apart from the prediction, lazar provides the rationales (structural features and similar compounds) for the prediction and a reliable confidence index that indicates whether a query structure falls within the applicability domain of the training database.
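The general idea of a similarity-based (lazy) prediction can be illustrated with a minimal sketch. It is not lazar's actual algorithm: the plain Tanimoto similarity on feature sets, the `min_sim` threshold, and the confidence formula are all simplifying assumptions chosen for illustration, and the names `tanimoto` and `predict` are hypothetical.

```python
from typing import FrozenSet, List, Optional, Tuple

Features = FrozenSet[str]


def tanimoto(a: Features, b: Features) -> float:
    """Plain Tanimoto similarity between two feature sets (an assumption;
    lazar itself uses an activity-specific similarity)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def predict(
    query: Features,
    training: List[Tuple[Features, bool]],
    min_sim: float = 0.3,  # hypothetical neighbor cutoff
) -> Tuple[Optional[bool], float]:
    """Similarity-weighted vote over training compounds similar to the query.

    Returns (predicted activity, confidence). The confidence here is the
    absolute weighted vote divided by the total similarity, an illustrative
    stand-in for a confidence index.
    """
    neighbors = [(tanimoto(query, feats), active) for feats, active in training]
    neighbors = [(s, a) for s, a in neighbors if s >= min_sim]
    if not neighbors:
        # No similar compounds: the query lies outside the training data.
        return None, 0.0
    total = sum(s for s, _ in neighbors)
    vote = sum(s if active else -s for s, active in neighbors)
    return vote > 0, abs(vote) / total
```

In this sketch the neighbors that pass the cutoff double as the rationale for a prediction, and a low confidence value signals that the query has few or contradictory neighbors in the training set.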

Leave-one-out (LOO) cross-validation experiments were carried out for 10 carcinogenicity endpoints ({female|male} {hamster|mouse|rat} carcinogenicity and aggregate endpoints {hamster|mouse|rat} carcinogenicity and rodent carcinogenicity) and Salmonella mutagenicity from the Carcinogenic Potency Database (CPDB). An external validation of Salmonella mutagenicity predictions was performed with a dataset of 3895 structures. Leave-one-out and external validation experiments indicate that Salmonella mutagenicity can be predicted with 85% accuracy for compounds within the applicability domain of the CPDB. The LOO accuracy of lazar predictions of rodent carcinogenicity is 86%; the accuracies for other carcinogenicity endpoints vary between 78% and 95% for structures within the applicability domain.
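The evaluation scheme described above, LOO accuracy restricted to compounds within the applicability domain, can be sketched as follows. This is an illustration under stated assumptions, not the code used for the reported experiments: the function name `loo_accuracy`, the `predict_fn` interface (returning a label and a confidence), and the `min_conf` cutoff that stands in for the applicability domain are all hypothetical.

```python
from typing import Any, Callable, List, Optional, Tuple

Example = Tuple[Any, bool]
Predictor = Callable[[Any, List[Example]], Tuple[Optional[bool], float]]


def loo_accuracy(
    data: List[Example],
    predict_fn: Predictor,
    min_conf: float = 0.0,  # hypothetical applicability-domain cutoff
) -> Tuple[Optional[float], int]:
    """Leave-one-out accuracy over predictions that pass the confidence cutoff.

    Each compound is predicted from all remaining compounds. Predictions with
    no label or with confidence below min_conf are treated as outside the
    applicability domain and excluded from the accuracy.
    Returns (accuracy or None if nothing was covered, number covered).
    """
    correct = 0
    covered = 0
    for i, (x, y) in enumerate(data):
        training = data[:i] + data[i + 1:]  # hold out compound i
        label, conf = predict_fn(x, training)
        if label is None or conf < min_conf:
            continue
        covered += 1
        if label == y:
            correct += 1
    accuracy = correct / covered if covered else None
    return accuracy, covered
```

Reporting the number of covered compounds alongside the accuracy matters in such a setup, because a stricter cutoff typically raises accuracy while shrinking the fraction of compounds that receive a prediction.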