Fig. 1
From: MolPROP: Molecular Property prediction with multimodal language and graph fusion

Graphic of the MolPROP architecture, illustrated with the example molecule Molnupiravir. The molecule (top left) is represented as a heavy-atom graph (e.g., C, N, O), with nodes drawn as circles and edges as lines connecting the circles. The molecule is also represented as a SMILES string (bottom). The ChemBERTa-2 tokenized language representation is shown above the SMILES string, where each token is marked by a color change (e.g., [C@@H] is one token). The attention mask, displayed above the token representation, assigns (1) or does not assign (0) attention to each token within the ChemBERTa-2 transformer during fine-tuning of the MolPROP models. The color scheme is carbon=black, nitrogen=blue, oxygen=red, and gray=tokens not assigned attention (0) during fine-tuning and graph fusion. The small black arrows and boxes depict the token representations being concatenated onto their respective graph node features during language and graph fusion.
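
The fusion step described in the caption can be illustrated with a minimal sketch in plain Python. It is not the MolPROP implementation: the function name `fuse_tokens_with_nodes`, the toy SMILES fragment, the two-dimensional token embeddings, and the single node feature (atomic number) are all hypothetical. The sketch also assumes that the i-th token with attention mask 1 corresponds to the i-th heavy-atom node, i.e., that graph nodes follow the atom order of the SMILES string, and it leaves out any special tokens the tokenizer may add.

```python
def fuse_tokens_with_nodes(token_embeddings, attention_mask, node_features):
    """Concatenate each attended token embedding onto its graph node's features.

    Assumption (not confirmed by the caption): attended tokens (mask == 1)
    appear in the same order as the heavy-atom nodes of the graph.
    """
    # Keep only the embeddings of tokens that were assigned attention.
    attended = [emb for emb, keep in zip(token_embeddings, attention_mask) if keep == 1]
    if len(attended) != len(node_features):
        raise ValueError("expected exactly one attended token per heavy-atom node")
    # Node features first, token representation appended after them.
    return [list(node) + list(tok) for node, tok in zip(node_features, attended)]


# Toy fragment "C(=O)N": three heavy atoms, three non-atom tokens.
tokens = ["C", "(", "=", "O", ")", "N"]
attention_mask = [1, 0, 0, 1, 0, 1]  # 0 marks the tokens drawn in gray
token_embeddings = [
    [0.1, 0.2],  # C
    [0.0, 0.0],  # (  masked out
    [0.0, 0.0],  # =  masked out
    [0.3, 0.4],  # O
    [0.0, 0.0],  # )  masked out
    [0.5, 0.6],  # N
]
node_features = [[6.0], [8.0], [7.0]]  # hypothetical feature: atomic number

fused = fuse_tokens_with_nodes(token_embeddings, attention_mask, node_features)
for token, row in zip(["C", "O", "N"], fused):
    print(token, row)
```

Each fused row holds the original node features followed by the token representation, so the node feature dimension grows by the embedding size while the graph connectivity is left unchanged.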