Fig. 8 | Journal of Cheminformatics

Fig. 8

From: Molecular property prediction using pretrained-BERT and Bayesian active learning: a data-efficient approach to drug design

Cumulative positive sample acquisition in the ClinTox dataset using BERT and ECFP features with UCB and Greedy acquisition functions. Mean and standard error are computed across 10 random seeds. Starting from a balanced initial set (50 positive, 50 negative), BERT-based models acquired 70% of positive samples in \(\sim 100\) iterations, compared to \(\sim 180\) iterations for ECFP-based models, a nearly twofold difference in sample acquisition efficiency.
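
As an illustration of the quantities named in the caption, the following is a minimal sketch of how a cumulative positive-acquisition curve, its mean and standard error across seeds, and the iteration count needed to reach a target fraction could be computed. It is not the authors' code. The function names, the input layout (one row per seed, one 0/1 label per iteration), and the assumption that exactly one sample is acquired per iteration are all hypothetical; whether the 50 initial positives count toward the 70% figure is not stated in the caption, so `total_positives` is left to the caller.

```python
import numpy as np

def cumulative_positive_fraction(acquired_labels, total_positives):
    """Fraction of positives found after each iteration.

    acquired_labels: array-like of shape (n_seeds, n_iterations) holding the
    0/1 label of the sample acquired at each iteration (assumed layout).
    total_positives: number of positives used as the denominator (assumption).
    """
    acquired = np.asarray(acquired_labels, dtype=float)
    return np.cumsum(acquired, axis=1) / total_positives

def mean_and_standard_error(curves):
    """Mean and standard error of the mean across seeds (axis 0)."""
    curves = np.asarray(curves, dtype=float)
    mean = curves.mean(axis=0)
    # Sample standard deviation (ddof=1) divided by sqrt(number of seeds).
    sem = curves.std(axis=0, ddof=1) / np.sqrt(curves.shape[0])
    return mean, sem

def iterations_to_reach(mean_curve, target=0.7):
    """First 1-based iteration at which the mean curve reaches the target,
    or None if the target is never reached."""
    hits = np.nonzero(np.asarray(mean_curve) >= target)[0]
    return int(hits[0]) + 1 if hits.size else None

# Toy data: 2 seeds, 4 iterations, 4 positives in total (made-up values).
toy_labels = [[1, 0, 1, 1],
              [0, 1, 1, 1]]
toy_curves = cumulative_positive_fraction(toy_labels, total_positives=4)
toy_mean, toy_sem = mean_and_standard_error(toy_curves)
toy_hit = iterations_to_reach(toy_mean, target=0.7)
```

Under these assumptions, comparing `iterations_to_reach` for two feature sets gives the kind of ratio the caption reports (roughly 180 versus 100 iterations).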
