
Table 2 Optimal parameters for pretraining and fine-tuning of the BERT model on SMILES and DeepSMILES data

From: Positional embeddings and zero-shot learning using BERT for molecular-property prediction

| Parameters | Pretraining | Fine-tuning | Position encoding/PEs |
|---|---|---|---|
| Learning rate | 1e−4 | 5e−6 | |
| Batch size | 16 | 16 | |
| Warm-up ratio | 0.016 | 0.1 | |
| Weight decay | 0.01 | 0.01 | |
| Number of epochs | 5 | 10 | |
| Optimizer | AdamW | AdamW | |
| Warm-up scheduler | Linear | Linear | |
| Number of parameters | 85,054,464 | 86,496,002 | Absolute |
| | 85,840,128 | 87,281,666 | Relative_key |
| | 85,840,128 | 87,281,666 | Relative_key_query |
| | 85,054,464 | 86,496,002 | Sinusoidal [52] |
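For readers reproducing these settings, the Relative_key and Relative_key_query labels match the `position_embedding_type` options of the Hugging Face Transformers BERT implementation, so the sketch below assumes that library; the checkpoint path, output directory, and two-class head are placeholders, not taken from the paper, and the sinusoidal variant [52] has no built-in counterpart.

```python
# Minimal sketch of applying Table 2's fine-tuning column, assuming the
# Hugging Face Transformers BERT implementation. Paths and the number of
# labels are placeholder assumptions, not from the paper.
from transformers import BertConfig, BertForSequenceClassification, TrainingArguments

# Three of the four PE variants map directly onto position_embedding_type
# ("absolute", "relative_key", "relative_key_query"); the sinusoidal
# variant [52] would require a custom embedding module. The type chosen
# here must match the one used during pretraining.
config = BertConfig(position_embedding_type="relative_key")

model = BertForSequenceClassification.from_pretrained(
    "path/to/pretrained-bert",  # placeholder checkpoint path
    config=config,
)
# Sanity check against the parameter counts reported in Table 2.
print(sum(p.numel() for p in model.parameters()))

# Fine-tuning settings from Table 2: AdamW with a linear warm-up schedule.
# (The pretraining column differs only in learning_rate=1e-4,
# warmup_ratio=0.016, and num_train_epochs=5.)
args = TrainingArguments(
    output_dir="bert-smiles-finetune",  # placeholder
    learning_rate=5e-6,
    per_device_train_batch_size=16,
    warmup_ratio=0.1,
    weight_decay=0.01,
    num_train_epochs=10,
    lr_scheduler_type="linear",
    optim="adamw_torch",  # AdamW
)
```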