
Table 5 Performance of our GNN and other prediction tools on the two different test subsets (original data and new data, which were added)

From: Prediction of the water solubility by a graph convolutional-based neural network on a highly curated dataset

| Test set (n = 980) | Consensus GNN | EPI Suite | OCHEM | ACD GALAS |
|---|---|---|---|---|
| **Subset original dataset (n = 756)** | | | | |
| Predictions possible for | 756 | 712 | 756 | 756 |
| r² | 0.911 | 0.757 | 0.915 | 0.868 |
| q² | 0.908 | 0.620 | 0.913 | 0.863 |
| rmse | 0.630 | 1.303 | 0.612 | 0.770 |
| bias | 0.102 | 0.199 | 0.023 | −0.092 |
| mne | −2.58 | −7.15 | −3.25 | −5.74 |
| mpe | 3.23 | 5.80 | 3.31 | 4.53 |
| 95% neg | −1.19 | −2.65 | −1.38 | −1.69 |
| 95% pos | 1.45 | 3.14 | 1.31 | 1.50 |
| **Subset new dataset (n = 224)** | | | | |
| Predictions possible for | 224 | 222 | 224 | 224 |
| r² | 0.862 | 0.589 | 0.768 | 0.701 |
| q² | 0.845 | 0.254 | 0.749 | 0.685 |
| rmse | 0.744 | 1.627 | 0.947 | 1.060 |
| bias | 0.191 | 0.456 | 0.040 | 0.037 |
| mne | −3.32 | −5.12 | −4.99 | −5.58 |
| mpe | 2.18 | 5.38 | 3.74 | 3.54 |
| 95% neg | −1.37 | −3.65 | −2.37 | −2.01 |
| 95% pos | 1.48 | 3.69 | 1.87 | 2.19 |

Performance of the GNN compared with three other available prediction tools (EPI Suite, ACD GALAS and OCHEM) on the test set of 980 chemicals. Statistics are given separately for the test subsets of original data and new data.
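
The table does not define its statistics, so the following is only a minimal sketch of how such figures could be computed from paired observed and predicted values. Everything in it is an assumption rather than the authors' procedure: errors are taken as predicted minus observed, r² as the squared Pearson correlation, q² as 1 − SS_res/SS_tot, mne and mpe as the largest negative and positive errors, and "95% neg"/"95% pos" as the 2.5th and 97.5th percentiles of the error distribution. The function name `error_stats` and the helper `percentile` are hypothetical.

```python
import math


def percentile(sorted_vals, q):
    """Linearly interpolated percentile of an already sorted list, q in [0, 1]."""
    pos = q * (len(sorted_vals) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def error_stats(observed, predicted):
    """Summary statistics for a set of predictions.

    Assumed definitions (not confirmed by the source):
    error = predicted - observed; values must not all be identical.
    """
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]
    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n

    ss_res = sum(e * e for e in errors)                     # residual sum of squares
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)     # total sum of squares
    ss_pred = sum((p - mean_pred) ** 2 for p in predicted)
    cov = sum((o - mean_obs) * (p - mean_pred)
              for o, p in zip(observed, predicted))

    ordered = sorted(errors)
    return {
        "r2": cov * cov / (ss_tot * ss_pred),   # squared Pearson correlation
        "q2": 1 - ss_res / ss_tot,              # predictive coefficient
        "rmse": math.sqrt(ss_res / n),
        "bias": sum(errors) / n,                # mean signed error
        "mne": ordered[0],                      # most negative error
        "mpe": ordered[-1],                     # most positive error
        "neg95": percentile(ordered, 0.025),
        "pos95": percentile(ordered, 0.975),
    }


if __name__ == "__main__":
    # Toy values for illustration only; not data from the table.
    stats = error_stats([1.0, 2.0, 3.0, 4.0], [1.5, 2.0, 2.5, 4.5])
    for name, value in stats.items():
        print(f"{name}: {value:.3f}")
```

Under these assumed definitions, q² can never exceed r², and a large gap between the two, as in the EPI Suite column for the new-data subset (0.589 versus 0.254), would point to systematic deviation from the identity line rather than scatter alone.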