Table 10 Averaged performance of fine-tuned BERT using K-fold cross-validation on newly proposed datasets

From: Positional embeddings and zero-shot learning using BERT for molecular-property prediction

| PE | Data | Sequence | Aver. loss | Aver. accuracy | Aver. precision | Aver. recall | Aver. F1-score |
|---|---|---|---|---|---|---|---|
| Absolute | Malaria | SMILES | 0.5364 ± 0.01 | 0.7332 ± 0.01 | 0.7227 ± 0.04 | 0.6203 ± 0.01 | 0.6672 ± 0.02 |
| Absolute | Malaria | DeepSMILES | 0.5662 ± 0.01 | 0.7130 ± 0.02 | 0.7189 ± 0.02 | 0.5510 ± 0.02 | 0.6234 ± 0.02 |
| Absolute | COVID | SMILES | 0.5640 ± 0.02 | 0.7973 ± 0.04 | 0.8472 ± 0.06 | 0.7530 ± 0.06 | 0.7959 ± 0.05 |
| Absolute | COVID | DeepSMILES | 0.5316 ± 0.01 | 0.7716 ± 0.04 | 0.8010 ± 0.04 | 0.7560 ± 0.05 | 0.7770 ± 0.04 |
| Absolute | Cocrystals | SMILES | 0.5795 ± 0.01 | 0.6953 ± 0.02 | 0.6888 ± 0.03 | 0.6286 ± 0.04 | 0.6560 ± 0.02 |
| Absolute | Cocrystals | DeepSMILES | 0.5723 ± 0.02 | 0.6807 ± 0.03 | 0.6692 ± 0.04 | 0.6188 ± 0.05 | 0.6418 ± 0.03 |
| RKQ | Malaria | SMILES | 0.5025 ± 0.01 | 0.7616 ± 0.01 | 0.7542 ± 0.02 | 0.6647 ± 0.02 | 0.7062 ± 0.01 |
| RKQ | Malaria | DeepSMILES | 0.5546 ± 0.01 | 0.7263 ± 0.01 | 0.7669 ± 0.04 | 0.5288 ± 0.04 | 0.6244 ± 0.02 |
| RKQ | COVID | SMILES | 0.4777 ± 0.02 | 0.8432 ± 0.02 | 0.9079 ± 0.03 | 0.7818 ± 0.03 | 0.8398 ± 0.03 |
| RKQ | COVID | DeepSMILES | 0.4408 ± 0.04 | 0.8068 ± 0.03 | 0.8752 ± 0.01 | 0.7385 ± 0.06 | 0.8002 ± 0.04 |
| RKQ | Cocrystals | SMILES | 0.5247 ± 0.03 | 0.7197 ± 0.02 | 0.7127 ± 0.02 | 0.6633 ± 0.03 | 0.6867 ± 0.02 |
| RKQ | Cocrystals | DeepSMILES | 0.5154 ± 0.03 | 0.7416 ± 0.02 | 0.7323 ± 0.03 | 0.7041 ± 0.07 | 0.7151 ± 0.03 |

  1. Bold values denote the best-achieved performance for clarity and emphasis
  2. Aver.: averaged; RKQ: relative_key_query; DeepSMILES: zero-shot learning analysis of BERT
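
Each cell in the table reports a metric averaged over the K folds, followed by a ± spread. As a minimal sketch of how such a cell could be produced, the Python snippet below aggregates per-fold scores into a "mean ± spread" string. It assumes the ± term is the population standard deviation across folds, which the table itself does not state; the function names `summarize_folds` and `format_cell` and the per-fold numbers are hypothetical placeholders, not values or code from the paper.

```python
from statistics import mean, pstdev


def summarize_folds(fold_scores):
    """Return (mean, spread) of one metric across the K folds.

    Assumption: the spread is the population standard deviation;
    the source table does not say which estimator was used.
    """
    return mean(fold_scores), pstdev(fold_scores)


def format_cell(fold_scores, digits=4):
    """Format per-fold scores like the table cells: mean to `digits`
    decimals, spread to two decimals."""
    m, s = summarize_folds(fold_scores)
    return f"{m:.{digits}f} ± {s:.2f}"


# Hypothetical per-fold accuracies for K = 5 (illustrative only,
# NOT taken from the paper).
example_accuracy = [0.70, 0.72, 0.74, 0.76, 0.78]
print(format_cell(example_accuracy))
```

If the spread was instead computed as a sample standard deviation, `statistics.stdev` would replace `pstdev`; for a small K the two can differ noticeably in the second decimal.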