Fig. 8

Cumulative positive sample acquisition in the ClinTox dataset using BERT and ECFP features with UCB and Greedy acquisition functions. Mean and standard error are computed across 10 random seeds. Starting from a balanced initial set (50 positive, 50 negative), BERT-based models acquired 70% of the positive samples in \(\sim 100\) iterations, compared with \(\sim 180\) iterations for ECFP-based models, a nearly twofold (about \(1.8\times\)) difference in sample acquisition efficiency.