Fig. 1 | Journal of Cheminformatics

Fig. 1

From: Protein-small molecule binding site prediction based on a pre-trained protein language model with contrastive learning

Structure of the CLAPE-SMB model, which comprises three primary components: a sequence embedding module based on the large pre-trained protein language model ESM-2, whose weights are kept fixed during training; a backbone neural network, either an MLP or a 1D CNN, followed by a Softmax function that serves as the classification head; and a loss function module that combines triplet center loss with class-balanced focal loss. The output is the probability of small-molecule binding at each residue of an input protein sequence; residues with a predicted probability above 50% are classified as likely binding sites.
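The classification head and the 50% decision rule described in the caption can be illustrated with a minimal, standard-library sketch. It is not the authors' implementation: the function names, the two-logit layout per residue, and the hyperparameters `beta` and `gamma` are assumptions made for illustration. The loss shown is the commonly used class-balanced focal loss form (effective-number weighting times a focal term), and the triplet center loss component is omitted.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_binding_sites(residue_logits, threshold=0.5):
    """Assumed layout: one [non_binding_logit, binding_logit] pair per residue.

    Returns the per-residue binding probabilities and a boolean flag that is
    True when the probability is strictly above the threshold.
    """
    probs = [softmax(pair)[1] for pair in residue_logits]
    labels = [p > threshold for p in probs]
    return probs, labels

def class_balanced_focal_loss(p_binding, label, n_per_class, beta=0.999, gamma=2.0):
    """Per-residue class-balanced focal loss (hypothetical hyperparameters).

    p_binding:   predicted probability that the residue binds
    label:       1 for a binding residue, 0 otherwise
    n_per_class: mapping from label to the number of training residues
    """
    p_t = p_binding if label == 1 else 1.0 - p_binding
    # Effective-number weighting: rarer classes receive a larger weight.
    weight = (1.0 - beta) / (1.0 - beta ** n_per_class[label])
    # Focal term down-weights residues that are already classified confidently.
    return -weight * (1.0 - p_t) ** gamma * math.log(p_t)
```

In a full model the logits would come from the MLP or 1D CNN backbone applied to the frozen ESM-2 residue embeddings; here they are simply passed in as numbers.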
