Fig. 2 | Journal of Cheminformatics

Fig. 2

From: APBIO: bioactive profiling of air pollutants through inferred bioactivity signatures and prediction of novel target interactions

Diagnostic plots for evaluating the quality of signature 4 of the APDB F1 space and comparing it with the CC datasets. Panel a shows the distribution of feature values, from minimum (blue) to maximum (red), per key (100 compounds are randomly selected by default). Panels b and c show the distribution of the values in the dataset per feature and per key, respectively. Panel d is the two-dimensional t-distributed stochastic neighbor embedding (2D t-SNE) projection of the molecule signatures. The density plots in panels e and f show the distributions of pairwise Euclidean and cosine distances, respectively. Panels g and h show the receiver operating characteristic (ROC) curves and the corresponding area under the curve (AUC) values, which reflect whether neighboring molecules for that signature tend to share a mechanism of action (MoA) or an Anatomical Therapeutic Chemical (ATC) code, respectively (signature 0 as reference). Panel i shows the log10 of the number of keys and features for signature 1 of each CC space and for signature 4 of the created space (white dot). The lollipop plots in panels j and k show the proportion of signature 4 keys covered by the signature 1 keys of each CC dataset and, conversely, the proportion of the signature 1 keys of each CC dataset covered by signature 4 keys. Panel l shows the area under the ROC curve (ROC-AUC) values, which indicate whether neighboring molecules for that signature share similar characteristics in each CC space (signature 0 as reference).
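The quantities behind panels e, f, j and k can be illustrated with a minimal sketch. It is not the code used to produce the figure: the function names, the toy keys (`K1` to `K5`) and the two-dimensional vectors are all hypothetical, and real signatures would have many more keys and features. Cosine distance is taken here as one minus cosine similarity, and all vectors are assumed to be nonzero.

```python
import math
from itertools import combinations


def coverage(keys_a, keys_b):
    """Fraction of the keys in keys_a that also occur in keys_b."""
    a, b = set(keys_a), set(keys_b)
    return len(a & b) / len(a) if a else 0.0


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def cosine_distance(u, v):
    """One minus cosine similarity; assumes both vectors are nonzero."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return 1.0 - dot / (norm_u * norm_v)


def pairwise(signatures, metric):
    """Distances over all unordered key pairs, in sorted key order."""
    keys = sorted(signatures)
    return [metric(signatures[a], signatures[b]) for a, b in combinations(keys, 2)]


# Hypothetical toy data: keys of the new space and of one reference dataset.
new_space_keys = {"K1", "K2", "K3", "K4"}
reference_keys = {"K3", "K4", "K5"}

# Panel j analogue: share of new-space keys found in the reference dataset.
new_covered_by_reference = coverage(new_space_keys, reference_keys)
# Panel k analogue: share of reference keys found in the new space.
reference_covered_by_new = coverage(reference_keys, new_space_keys)

# Hypothetical toy signatures (one vector per key) for panels e and f.
toy_signatures = {"K1": [1.0, 0.0], "K2": [0.0, 1.0], "K3": [1.0, 1.0]}
euclidean_distances = pairwise(toy_signatures, euclidean)
cosine_distances = pairwise(toy_signatures, cosine_distance)
```

The two coverage values differ because each is normalized by the size of a different key set, which is why the caption describes the two directions in separate panels. The distance lists are the raw values whose distributions a density plot would summarize.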
