Fig. 3

Overview of Pareto optimal solutions for regression and classification models. a Performance of the continuous LD50 models is described as root-mean-square error (RMSE) versus percentage of compounds in the AD (%AD). b Performance of the classification models (nT, vT, EPA, GHS) is described as balanced accuracy (BA) versus percentage of compounds in the AD (%AD). All parameters refer to the ES. Models in the bottom-left part of the plots are characterized by the best compromise between performance and coverage, with dotted lines representing the Pareto front for a given endpoint. White indicators are single models (R = rRF; B = BRF; H = HPT-RF; K = istkNN; S = SARpy; Q = aiQSAR), while black indicators are integrated models, flagged with the corresponding PF (for regression) or CS (for classification) threshold.
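
As an illustration of how a Pareto front such as the dotted lines could be extracted, the following is a minimal sketch and not the authors' procedure. It assumes each model is reduced to a pair of costs that are both minimized, consistent with the bottom-left corner being the best region of the plot. The function name `pareto_front` and the numeric values are hypothetical.

```python
from typing import List, Tuple


def pareto_front(points: List[Tuple[float, float]]) -> List[Tuple[float, float]]:
    """Return the non-dominated points, assuming both coordinates are minimized."""
    front = []
    best_y = float("inf")
    # After sorting by x (then y), a point is non-dominated only if its y
    # is strictly lower than every y seen so far.
    for x, y in sorted(set(points)):
        if y < best_y:
            front.append((x, y))
            best_y = y
    return front


# Hypothetical (cost_x, cost_y) pairs for five models; not taken from the figure.
models = [(0.50, 10.0), (0.55, 5.0), (0.60, 20.0), (0.70, 2.0), (0.52, 12.0)]
front = pareto_front(models)
print(front)
```

A model is kept only if no other model is at least as good on both axes and strictly better on one; the remaining points, joined in order, form the front.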