Fig. 9
From: SIMPD: an algorithm for generating simulated time splits for validating machine learning approaches

Summary of the 99 data sets extracted from ChEMBL32. (Top): Number of compounds in each data set. (Bottom): Distribution of median AUC values for random forest models built using MFP2 and random splitting for these data sets.