Table 2 Proportion of zero values in the FPs as a function of vector size (in bits), and the resulting performance of RF in molecule classification based on each FP versus the molecular descriptors

From: cidalsDB: an AI-empowered platform for anti-pathogen therapeutics research

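| Fingerprints          | Vector size | % of Zeros | Precision | Recall | F1-score |
|-----------------------|-------------|------------|-----------|--------|----------|
| RDK                   | 256         | 5          | 0.565     | 0.121  | 0.200    |
|                       | 512         | 18         | 0.622     | 0.205  | 0.309    |
|                       | 1024        | 39         | 0.650     | 0.238  | 0.349    |
|                       | 2048        | 61         | 0.643     | 0.280  | 0.390    |
| Atom Pair             | 256         | 43         | 0.705     | 0.0586 | 0.101    |
|                       | 512         | 63         | 0.700     | 0.080  | 0.143    |
|                       | 1024        | 79         | 0.810     | 0.114  | 0.200    |
|                       | 2048        | 89         | 0.770     | 0.128  | 0.210    |
| Topological Torsion   | 256         | 84         | 0.692     | 0.128  | 0.216    |
|                       | 512         | 92         | 0.696     | 0.154  | 0.252    |
|                       | 1024        | 96         | 0.723     | 0.158  | 0.259    |
|                       | 2048        | 98         | 0.698     | 0.168  | 0.271    |
| Morgan                | 256         | 72         | 0.811     | 0.048  | 0.091    |
|                       | 512         | 85         | 0.807     | 0.083  | 0.152    |
|                       | 1024        | 92         | 0.758     | 0.109  | 0.191    |
|                       | 2048        | 96         | 0.766     | 0.147  | 0.257    |
| Molecular descriptors | 208         | –          | 0.609     | 0.170  | 0.274    |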
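
As a reading aid for the "% of Zeros" and "F1-score" columns, the following minimal Python sketch shows how such quantities are conventionally computed. It is an illustration only, not the authors' pipeline: the helper names `percent_zeros` and `f1_score` and the toy 8-bit vector are hypothetical, and the F1-score is taken to be the usual harmonic mean of precision and recall. Values in the table may have been averaged over several runs or folds, so recomputing F1 from the rounded precision and recall need not reproduce the reported figure exactly.

```python
def percent_zeros(bits):
    """Percentage of zero entries in a fingerprint given as a sequence of 0/1 values."""
    if not bits:
        raise ValueError("fingerprint must not be empty")
    zeros = sum(1 for b in bits if b == 0)
    return 100.0 * zeros / len(bits)


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy 8-bit fingerprint, made up for illustration (not taken from the paper).
toy_fp = [0, 1, 0, 0, 1, 0, 0, 0]
print(percent_zeros(toy_fp))  # 75.0

# Precision and recall of the RDK 256-bit row of the table.
print(round(f1_score(0.565, 0.121), 3))  # 0.199
```

The second result is close to, but not identical with, the 0.200 reported for that row, which is consistent with rounding or averaging in the original evaluation.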