Towards efficient and interpretable machine learning models for materials discovery
journal contribution
posted on 2021-07-19, 00:32 authored by Paulius Mikulskis, Morgan Alexander, David Winkler, La Trobe University Library

Machine learning (ML) and artificial intelligence (AI) methods for
modeling useful materials properties are now important technologies for
rational design and optimization of bespoke functional materials.
Although these methods make good predictions of the properties of new
materials, current modeling methods use efficient but rather arcane
(difficult-to-interpret) mathematical features (descriptors) to
characterize materials. Data-driven ML models are considerably more
useful if more chemically interpretable descriptors are used to train
them, as long as these models also accurately recapitulate the
properties of materials in training and test sets used to generate and
validate the models. This work describes how a particular type of
molecular fragment descriptor, the signature descriptor, achieves these
joint aims of accuracy and interpretability. Seven different types of
materials properties are modeled, and the performance of models
generated from signature descriptors is compared with that of models
generated from the widely used Dragon descriptors. The key descriptors in the model
represent functionalities that make chemical sense. Mapping these
fragments back onto exemplar materials provides a useful guide to
chemists wishing to modify promising lead materials to improve their
properties. This is one of the first applications of signature
descriptors to the modeling of complex materials properties.
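
To make the idea of a fragment-based, interpretable descriptor concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a heavily simplified form of an atom signature: each atom is described by its element symbol followed by the sorted symbols of its neighbors out to a fixed height, with bond orders and hydrogens ignored. The function names (`atom_signature`, `signature_counts`, `atoms_matching`) and the toy molecule encoding are hypothetical and chosen only for illustration.

```python
from collections import Counter


def atom_signature(atoms, bonds, root, height=1):
    """Simplified signature string for the neighborhood of atom `root`.

    atoms: list of element symbols, e.g. ["C", "C", "O"]
    bonds: list of (i, j) index pairs; bond order is ignored (assumption)
    """
    neighbors = {i: [] for i in range(len(atoms))}
    for a, b in bonds:
        neighbors[a].append(b)
        neighbors[b].append(a)

    def expand(atom, parent, depth):
        label = atoms[atom]
        if depth == 0:
            return label
        # Sort child strings so the result does not depend on atom ordering.
        children = sorted(
            expand(n, atom, depth - 1) for n in neighbors[atom] if n != parent
        )
        return label + ("(" + "".join(children) + ")" if children else "")

    return expand(root, None, height)


def signature_counts(atoms, bonds, height=1):
    """Descriptor vector: how many atoms carry each signature string."""
    return Counter(
        atom_signature(atoms, bonds, i, height) for i in range(len(atoms))
    )


def atoms_matching(atoms, bonds, signature, height=1):
    """Map a signature back onto the molecule: indices of matching atoms."""
    return [
        i for i in range(len(atoms))
        if atom_signature(atoms, bonds, i, height) == signature
    ]


# Toy example: heavy-atom skeleton of ethanol (C-C-O).
ethanol_atoms = ["C", "C", "O"]
ethanol_bonds = [(0, 1), (1, 2)]
counts = signature_counts(ethanol_atoms, ethanol_bonds)
hydroxyl_sites = atoms_matching(ethanol_atoms, ethanol_bonds, "O(C)")
```

In this sketch, `counts` plays the role of a descriptor vector that could be fed to a regression model, and `atoms_matching` illustrates how a fragment the model flags as important could be traced back to specific atoms in an exemplar structure.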