Fig. 1 | Journal of Cheminformatics

Fig. 1

From: kMoL: an open-source machine and federated learning library for drug discovery

The data pre-processing workflow prepares raw data for analysis and execution. Users can mix and match five types of components. Streamers manage how the other, lower-level components are interconnected. Loaders provide various ways to load data from different file formats into memory. Once the data are in memory, featurizers process the input features and transformers process the output features. Multiple featurizers and transformers can be chained together, and the final samples are cached to improve performance across multiple epochs and experiments. Finally, splitters divide the processed samples into distinct groups.
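
The five components described in the caption can be illustrated with a minimal Python sketch. This is not kMoL's actual API: every name here (`Sample`, `load_csv`, `add_length`, `count_upper_c`, `log10_transform`, `random_split`, `Streamer`) and the toy CSV data are hypothetical, chosen only to show how a loader, chained featurizers and transformers, a cache, and a splitter might be wired together by a streamer.

```python
import csv
import io
import math
import random
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

# All names below are hypothetical illustrations, not kMoL's real classes.


@dataclass
class Sample:
    inputs: Dict[str, object]   # input features (processed by featurizers)
    outputs: List[float]        # output features (processed by transformers)


def load_csv(text: str, input_column: str, target_columns: List[str]) -> List[Sample]:
    """Loader: read CSV text into in-memory samples."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        Sample({input_column: row[input_column]},
               [float(row[c]) for c in target_columns])
        for row in reader
    ]


def add_length(inputs: Dict[str, object]) -> Dict[str, object]:
    """Featurizer: add the length of the SMILES string as an input feature."""
    return {**inputs, "length": len(str(inputs["smiles"]))}


def count_upper_c(inputs: Dict[str, object]) -> Dict[str, object]:
    """Featurizer (toy): count occurrences of the character 'C'."""
    return {**inputs, "n_upper_c": str(inputs["smiles"]).count("C")}


def log10_transform(outputs: List[float]) -> List[float]:
    """Transformer: log-scale the output features (values must be positive)."""
    return [math.log10(v) for v in outputs]


def random_split(samples: List[Sample], train_fraction: float = 0.8,
                 seed: int = 0) -> Dict[str, List[Sample]]:
    """Splitter: shuffle deterministically and divide into two groups."""
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)
    cut = int(len(indices) * train_fraction)
    return {
        "train": [samples[i] for i in indices[:cut]],
        "test": [samples[i] for i in indices[cut:]],
    }


class Streamer:
    """Streamer: connects the lower-level components and caches the result."""

    def __init__(self, loader: Callable[[], List[Sample]],
                 featurizers: List[Callable[[Dict[str, object]], Dict[str, object]]],
                 transformers: List[Callable[[List[float]], List[float]]],
                 splitter: Callable[[List[Sample]], Dict[str, List[Sample]]]):
        self.loader = loader
        self.featurizers = featurizers
        self.transformers = transformers
        self.splitter = splitter
        self._cache: Optional[List[Sample]] = None

    def samples(self) -> List[Sample]:
        # Process once, then reuse the cached samples on later calls.
        if self._cache is None:
            processed = []
            for sample in self.loader():
                inputs, outputs = sample.inputs, sample.outputs
                for featurize in self.featurizers:    # chained in order
                    inputs = featurize(inputs)
                for transform in self.transformers:   # chained in order
                    outputs = transform(outputs)
                processed.append(Sample(inputs, outputs))
            self._cache = processed
        return self._cache

    def splits(self) -> Dict[str, List[Sample]]:
        return self.splitter(self.samples())


# Toy data, invented for this example only.
RAW = "smiles,activity\nCCO,10.0\nc1ccccc1,100.0\nCC(=O)O,1000.0\nCCN,1.0\nCCCC,10.0\n"

streamer = Streamer(
    loader=lambda: load_csv(RAW, "smiles", ["activity"]),
    featurizers=[add_length, count_upper_c],
    transformers=[log10_transform],
    splitter=random_split,
)
splits = streamer.splits()
print({name: len(group) for name, group in splits.items()})
```

In this sketch the cache lives in memory on the streamer, so repeated calls to `samples()` across epochs skip the featurization and transformation steps. A real library may well cache differently, for example on disk so that separate experiments can share the processed samples.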
