Fig. 2

Distribution of the Tanimoto similarity of every molecule in the test set to its closest neighbour in the training set, for either the DOCKSTRING split (dtrain and dtest) or an example random split. Tanimoto similarity values were computed on Morgan fingerprints of length 1024 and radius 3, calculated with RDKit.
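
The quantity plotted can be illustrated with a minimal sketch. This is not the authors' code. It assumes each fingerprint is represented as the set of its on-bit indices, and the function names `tanimoto` and `max_similarity_to_train` are hypothetical. With RDKit, such sets could plausibly be obtained by calling `GetOnBits()` on the bit vector returned by `AllChem.GetMorganFingerprintAsBitVect(mol, 3, nBits=1024)`. The toy fingerprints below are made up for illustration only.

```python
from typing import Iterable, List, Set


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    union = len(a | b)
    if union == 0:
        # Assumed convention: two empty fingerprints have similarity 0.
        return 0.0
    return len(a & b) / union


def max_similarity_to_train(
    test_fps: Iterable[Set[int]], train_fps: Iterable[Set[int]]
) -> List[float]:
    """For each test fingerprint, return its similarity to the closest training fingerprint."""
    train = list(train_fps)  # must be non-empty, otherwise max() raises ValueError
    return [max(tanimoto(fp, t) for t in train) for fp in test_fps]


# Toy on-bit sets (illustrative only, not real molecules).
train_fps = [{1, 2, 3, 4}, {10, 11, 12}]
test_fps = [{1, 2, 3, 5}, {10, 11, 12}, {20, 21}]

nearest = max_similarity_to_train(test_fps, train_fps)
print(nearest)
```

A histogram of `nearest` over the whole test set gives a distribution of the kind shown in the figure. A split whose values concentrate near 1 contains test molecules that closely resemble training molecules.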