
Table 1 Summary of ligand binding site prediction methods analysed in this study

From: Comparative evaluation of methods for the prediction of protein–ligand binding sites

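| Method | Approach | Features | # Features | P centroid | P residues | P score | P ranking | R score | R threshold | Cluster | Algorithm | Threshold (Å) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| VN-EGNN | EGNN + VN | ESM-2 embeddings | 1280 | | | | | | | | | |
| IF-SitePred | LightGBM | ESM-IF1 embeddings | 512 | | | | | | 0.5 (ALL 40) | Cloud points | DBSCAN | 1.7 |
| GrASP | GAT-GNN | Atom, residue, bond… | 17 | | | | | | 0.3 | Atoms | Average | 15 |
| PUResNet | DRN + 3D-CNN | Atom + one-hot encoding | 18 | | | | | | 0.34 | Atoms | DBSCAN | 5.5 |
| DeepPocket | fpocket + 3D-CNN | Atom | 14 | | | | | | | | | |
| P2RankCONS | Random Forest | Atom and residue | 36 | | | | | | 0.35 | SAS points | Single | 3 |
| P2Rank | Random Forest | Atom and residue | 35 | | | | | | 0.35 | SAS points | Single | 3 |
| fpocketPRANK | fpocket + Random Forest | Atom and residue | 34 | | | | | | | | | |
| fpocket | α-spheres | | | | | | | | | α-spheres | Multiple | 1.7 |
| PocketFinder+ | LJ potential | | | | | | | | | | | |
| Ligsite+ | Cubic grid | | | | | | | | | | | |
| Surfnet+ | Gap regions | | | | | | | | | | | |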

All methods were run with their default settings. Check marks (✓) indicate that a method provides a given output and crosses (✗) that it does not. Dashes (–) indicate that a field is not applicable to a method, e.g., features for methods that are not machine learning-based. Approach: the techniques applied by the method. Features/# Features: the features used and their number, if the method is machine learning-based. P centroid/P residues/P score/P ranking/R score: whether the method reports the pocket centroid, pocket residues, pocket score, pocket ranking and residue ligandability score. The remaining columns describe the clustering strategy: whether the method uses a residue ligandability threshold (R threshold), the instances it clusters to define distinct pockets (Cluster), the clustering algorithm used (Algorithm) and the distance threshold employed (Threshold). For example, P2Rank uses a random forest classifier on SAS points represented by 35 atom and residue features. Points with a score > 0.35 are then clustered into binding sites using single linkage and a threshold of 3 Å. DeepPocket and fpocketPRANK take fpocket predictions as a starting point and then apply different techniques to re-score or re-define the pockets. EGNN + VN: equivariant graph neural network + virtual nodes; LightGBM: light gradient boosting machine; GAT: graph attention network; GNN: graph neural network; DRN: deep residual network; 3D-CNN: three-dimensional convolutional neural network; LJ potential: Lennard–Jones potential.
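
The thresholding and clustering step described for P2Rank can be illustrated with a minimal Python sketch. This is not P2Rank's implementation: the function name `cluster_points`, its parameters, the toy coordinates and the inclusive distance comparison are all assumptions made for illustration. It only shows the general idea of keeping points that score above 0.35 and merging them by single linkage with a 3 Å cutoff, which is equivalent to taking the connected components of a graph linking points within 3 Å of each other.

```python
import math


def cluster_points(points, scores, score_cutoff=0.35, dist_cutoff=3.0):
    """Group high-scoring points into pockets by single-linkage clustering.

    points: list of (x, y, z) coordinates in Å (hypothetical input format).
    scores: one ligandability score per point.
    Two kept points share a cluster if a chain of kept points, each within
    dist_cutoff of the next, connects them.
    Returns a sorted list of clusters, each a list of point indices.
    """
    # Keep only points scoring strictly above the cutoff.
    kept = [i for i, s in enumerate(scores) if s > score_cutoff]
    parent = {i: i for i in kept}

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Merge every pair of kept points closer than the distance cutoff.
    # Assumption: the comparison is inclusive (<=).
    for pos, a in enumerate(kept):
        for b in kept[pos + 1:]:
            if math.dist(points[a], points[b]) <= dist_cutoff:
                parent[find(a)] = find(b)

    clusters = {}
    for i in kept:
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Toy data: six points on a line, invented for this example.
points = [(0, 0, 0), (2, 0, 0), (4, 0, 0), (20, 0, 0), (21, 0, 0), (10, 0, 0)]
scores = [0.9, 0.6, 0.4, 0.8, 0.5, 0.1]
pockets = cluster_points(points, scores)
print(pockets)  # [[0, 1, 2], [3, 4]]
```

Points 0 and 2 are 4 Å apart, yet they fall into the same cluster because point 1 bridges them. This chaining behaviour is what distinguishes single linkage from the average linkage listed for GrASP. The pairwise loop is quadratic in the number of kept points, which is acceptable for a sketch but would likely be replaced by a spatial index on real structures.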