Understanding Chemical Trends in Generated Databases of Functional Molecules
Title
Understanding Chemical Trends in Generated Databases of Functional Molecules
Subject
Chemistry
Creator
Francesco Bartucca
Contributor
Reinhard Maurer, Zsuzsanna Koczor-Benda
Abstract
This project investigates the output of G-SchNet, a generative machine learning model widely used in computational chemistry. It highlights discrepancies between the distributions of generated and training data, in particular a tendency to generate molecules that contain more heavy atoms than those in the training datasets. The training databases include a thiols set composed of commercially available molecules and the more widely used OE62 and QM9 sets.
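As a rough illustration of the kind of comparison the abstract describes, the sketch below counts heavy (non-hydrogen) atoms per molecule and compares the size distributions of a training set and a generated set. It is not the project's own analysis code: the function names, the representation of a molecule as a list of element symbols, and the toy molecules are all assumptions made for this example.

```python
from collections import Counter
from statistics import mean


def heavy_atom_count(symbols):
    """Count the non-hydrogen atoms in a molecule given as element symbols."""
    return sum(1 for s in symbols if s != "H")


def size_distribution(molecules):
    """Map each heavy-atom count to the fraction of molecules with that count."""
    counts = Counter(heavy_atom_count(m) for m in molecules)
    total = sum(counts.values())
    return {n: c / total for n, c in sorted(counts.items())}


# Toy data invented for illustration; real sets would be read from the databases.
training = [
    ["C", "S"] + ["H"] * 4,        # methanethiol, 2 heavy atoms
    ["C", "C", "S"] + ["H"] * 6,   # ethanethiol, 3 heavy atoms
]
generated = [
    ["C", "C", "S"] + ["H"] * 6,        # ethanethiol, 3 heavy atoms
    ["C", "C", "C", "S"] + ["H"] * 8,   # propanethiol, 4 heavy atoms
]

train_mean = mean(heavy_atom_count(m) for m in training)
gen_mean = mean(heavy_atom_count(m) for m in generated)

# A positive shift means the generated molecules are larger on average.
shift = gen_mean - train_mean

print("training distribution: ", size_distribution(training))
print("generated distribution:", size_distribution(generated))
print("mean heavy-atom shift: ", shift)
```

In practice the two histograms would be compared over thousands of molecules, and a positive shift would correspond to the tendency towards larger generated molecules noted in the abstract.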
Citation
Francesco Bartucca, “Understanding Chemical Trends in Generated Databases of Functional Molecules,” URSS SHOWCASE, accessed November 4, 2025, https://urss.warwick.ac.uk/items/show/574.