Positional embeddings and zero-shot learning using BERT for molecular-property prediction

Table 3 Comparison of positional embeddings (PEs) and position encoding methods during pretraining of the BERT model on SMILES data

PEs and position encoding	Training time (h)	Optimal learning rate	Accuracy
Absolute	167	8.14e−5	0.9568
Relative_key	180	8.68e−6	0.9746
Relative_key_query	120	4.59e−6	0.9763
Sinusoidal [52]	105	1.67e−7	0.9755
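
For readers unfamiliar with the last row of Table 3, the following is a minimal sketch of a fixed sinusoidal position encoding. It assumes the standard sine/cosine formulation, PE(pos, 2k) = sin(pos / 10000^(2k/d)) and PE(pos, 2k+1) = cos(pos / 10000^(2k/d)). The source does not show its own implementation, so the function name, arguments, and sizes below are hypothetical and chosen only for illustration.

```python
import math


def sinusoidal_position_encoding(num_positions, dim):
    """Return a num_positions x dim table of fixed sinusoidal encodings.

    Hypothetical helper for illustration; not taken from the source.
    """
    if dim % 2 != 0:
        raise ValueError("dim must be even so sine/cosine pairs line up")
    table = []
    for pos in range(num_positions):
        row = []
        # i runs over the even indices 0, 2, 4, ..., so i / dim equals 2k / d.
        for i in range(0, dim, 2):
            angle = pos / (10000 ** (i / dim))
            row.append(math.sin(angle))  # even index: sine
            row.append(math.cos(angle))  # odd index: cosine
        table.append(row)
    return table


# Small sizes assumed for illustration only.
encoding = sinusoidal_position_encoding(num_positions=4, dim=8)
```

Unlike the learned absolute and relative variants in the table, this encoding has no trainable parameters: each position maps to a fixed vector that is added to the token embedding.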

ISSN: 1758-2946