Comparative evaluation of methods for the prediction of protein–ligand binding sites

Utgés, Javier S.; Barton, Geoffrey J.

doi:10.1186/s13321-024-00923-z

Journal of Cheminformatics

Table 1 Summary of ligand binding site prediction methods analysed in this study


Method	Approach	Features	# Features	P centroid	P residues	P score	P ranking	R score	R threshold	Cluster	Algorithm	Threshold (Å)
VN-EGNN	EGNN + VN	ESM-2 embeddings	1280	✓	✕	✓	✓	✕	–	–	–	–
IF-SitePred	LightGBM	ESM-IF1 embeddings	512	✓	✕	✓	✓	✕	0.5 (ALL 40)	Cloud points	DBSCAN	1.7
GrASP	GAT-GNN	Atom, residue, bond…	17	✓	✕	✓	✓	✓	0.3	Atoms	Average	15
PUResNet	DRN + 3D-CNN	Atom + one-hot encoding	18	✕	✓	✕	✕	✕	0.34	Atoms	DBSCAN	5.5
DeepPocket	fpocket + 3D-CNN	Atom	14	✓	✓	✓	✓	✕	–	–	–	–
P2Rank_CONS	Random Forest	Atom and residue	36	✓	✓	✓	✓	✓	0.35	SAS points	Single	3
P2Rank	Random Forest	Atom and residue	35	✓	✓	✓	✓	✓	0.35	SAS points	Single	3
fpocket_PRANK	fpocket + Random Forest	Atom and residue	34	✓	✓	✓	✓	✓	–	–	–	–
fpocket	α-spheres	–	–	✕	✓	✓	✓	✕	–	α-spheres	Multiple	1.7
PocketFinder⁺	LJ potential	–	–	✕	✕	✕	✕	✓	–	–	–	–
Ligsite⁺	Cubic grid	–	–	✕	✕	✕	✕	✓	–	–	–	–
Surfnet⁺	Gap regions	–	–	✕	✕	✕	✕	✓	–	–	–	–

All methods were run with their default settings. Check marks (✓) indicate that a method provides a given output; crosses (✕) indicate that it does not. Dashes (–) indicate that a field is not applicable to a method, e.g., features for methods that are not machine learning-based. Approach: the techniques applied by the method. Features/# Features: the features used and their number, if the method is machine learning-based. P centroid/P residues/P score/P ranking/R score: whether the method reports the pocket centroid, pocket residues, pocket score, pocket ranking and residue ligandability score, respectively. The remaining columns describe each method's clustering strategy: the residue ligandability threshold used, if any (R threshold), the instances clustered to define distinct pockets (Cluster), the clustering algorithm (Algorithm) and the threshold it employs (Threshold, in Å). For example, P2Rank applies a random forest classifier to solvent accessible surface (SAS) points, each represented by 35 atom and residue features. Points with a score > 0.35 are then clustered into binding sites using single linkage and a threshold of 3 Å. DeepPocket and fpocket_PRANK take fpocket predictions as a starting point and then apply different techniques to re-score or re-define the pockets. Abbreviations: EGNN + VN: equivariant graph neural network + virtual nodes; LightGBM: light gradient boosting machine; GAT: graph attention network; GNN: graph neural network; DRN: deep residual network; 3D-CNN: three-dimensional convolutional neural network; LJ potential: Lennard–Jones potential.
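
The P2Rank example in the note above (keep points scoring above 0.35, then join them by single linkage at 3 Å) can be illustrated with a minimal sketch. This is not P2Rank's own code: the function name `single_linkage_clusters`, the toy coordinates and scores, and the choice of an inclusive distance cutoff (`<=`) are all assumptions made for illustration. Only the two default values, 0.35 and 3 Å, come from the table.

```python
import math


def single_linkage_clusters(points, scores, score_threshold=0.35,
                            distance_threshold=3.0):
    """Cluster high-scoring points by single linkage (hypothetical helper).

    Points with score > score_threshold are kept. Two kept points fall in
    the same cluster if a chain of kept points connects them in which each
    consecutive pair is at most distance_threshold apart (assumed inclusive).
    Returns clusters as lists of indices into `points`.
    """
    kept = [i for i, s in enumerate(scores) if s > score_threshold]
    parent = {i: i for i in kept}  # union-find forest over kept points

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Merge every pair of kept points that lie within the distance cutoff.
    for pos, a in enumerate(kept):
        for b in kept[pos + 1:]:
            if math.dist(points[a], points[b]) <= distance_threshold:
                root_a, root_b = find(a), find(b)
                if root_a != root_b:
                    parent[root_b] = root_a

    clusters = {}
    for i in kept:
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values(), key=lambda c: c[0])


# Toy data (invented): six points on a line, coordinates in Å.
points = [(0, 0, 0), (2, 0, 0), (4, 0, 0), (10, 0, 0), (12, 0, 0), (20, 0, 0)]
scores = [0.9, 0.5, 0.4, 0.8, 0.6, 0.1]
print(single_linkage_clusters(points, scores))  # [[0, 1, 2], [3, 4]]
```

In the toy data, points 0 and 2 are 4 Å apart yet share a cluster because point 1 links them, which is the chaining behaviour characteristic of single linkage. Point 5 is discarded by the score threshold before any clustering takes place.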


ISSN: 1758-2946
