
Machine Learning and Statistical Analysis for Materials Science: Stability and Transferability of Fingerprint Descriptors and Chemical Insights

Pankajakshan, Praveen and Sanyal, Suchismita and de Noord, Onno E and Bhattacharya, Indranil and Bhattacharyya, Arnab and Waghmare, Umesh (2017) Machine Learning and Statistical Analysis for Materials Science: Stability and Transferability of Fingerprint Descriptors and Chemical Insights. In: Chemistry of Materials, 29 (10). pp. 4190-4201. ISSN 0897-4756

Published Version: che_mat_29-10_4190-4201_2017.pdf (PDF, 2MB; restricted to registered users)
Published Supplemental Material: Supplementary_che_mat_29-10_4190-4201_2017.pdf (821kB)
Official URL: https://doi.org/10.1021/acs.chemmater.6b04229


In the paradigm of virtual high-throughput screening for materials, we have developed a semiautomated workflow or "recipe" that helps a materials scientist start from a raw data set of materials with their properties and descriptors, build predictive models, and draw insights into the governing mechanism. We demonstrate our recipe, which employs machine learning tools and statistical analysis, through a case study that identifies descriptors relevant to catalysts for CO2 electroreduction, starting from a published database of 298 catalyst alloys. At the heart of our methodology lies the Bootstrapped Projected Gradient Descent (BoPGD) algorithm, which has significant advantages over commonly used machine learning (ML) and statistical analysis (SA) tools such as the regression coefficient shrinkage-based method (LASSO) or artificial neural networks: (a) it selects descriptors with greater stability and transferability, with the goal of understanding the chemical mechanism rather than merely fitting data, and (b) while being effective for smaller data sets such as the one in the test case, it employs clustering of descriptors so that its computational speed scales far more efficiently to large descriptor sets. In addition to identifying the descriptors that parametrize the d-band model of catalysts for CO2 reduction, we predict the work function to be an essential and relevant descriptor. Based on this result, we propose a modification of the d-band model that includes the chemical effect of the work function, and show that the resulting predictive model gives the binding energy of CO to the catalyst fairly accurately. Since our scheme is general and particularly efficient in reducing a large set of descriptors to a minimal one, we expect it to be a versatile tool for obtaining chemical insights into complex phenomena and for developing predictive models for the design of materials.
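This record does not give the details of BoPGD, so the following is only a minimal sketch of the general idea named in the abstract: combining bootstrap resampling with projected gradient descent to judge how stably a descriptor is selected. It is not the authors' implementation. It assumes a least-squares loss, uses hard thresholding onto the k largest coefficients as the projection step, and omits the descriptor clustering mentioned above. All function names, parameters, and the synthetic data are hypothetical.

```python
import numpy as np

def hard_threshold(w, k):
    """Project w onto the set of vectors with at most k nonzero entries."""
    out = np.zeros_like(w)
    if k > 0:
        keep = np.argsort(np.abs(w))[-k:]  # indices of the k largest magnitudes
        out[keep] = w[keep]
    return out

def projected_gradient_descent(X, y, k, n_iter=200):
    """Sparse least squares: gradient step on 0.5*||Xw - y||^2, then projection."""
    # Step size 1/L, where L = (largest singular value of X)^2 bounds the curvature.
    step = 1.0 / np.linalg.norm(X, 2) ** 2
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y)
        w = hard_threshold(w - step * grad, k)
    return w

def bootstrap_selection_frequency(X, y, k, n_boot=30, seed=0):
    """Fraction of bootstrap resamples in which each descriptor is selected."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    counts = np.zeros(p)
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample rows with replacement
        Xb, yb = X[idx], y[idx]
        Xb = (Xb - Xb.mean(axis=0)) / (Xb.std(axis=0) + 1e-12)  # standardize
        yb = yb - yb.mean()
        w = projected_gradient_descent(Xb, yb, k)
        counts += (w != 0)
    return counts / n_boot

# Synthetic stand-in data (NOT the catalyst database): 298 samples, 12
# descriptors, of which only descriptors 0 and 3 influence the target.
rng = np.random.default_rng(42)
X = rng.normal(size=(298, 12))
y = 2.0 * X[:, 0] - 1.5 * X[:, 3] + 0.1 * rng.normal(size=298)

freq = bootstrap_selection_frequency(X, y, k=2)
print(np.round(freq, 2))
```

In this kind of scheme, a descriptor whose selection frequency stays close to 1 across resamples is a stable choice, whereas one that appears only occasionally is more likely an artifact of a particular sample.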

Item Type: Journal Article
Publication: Chemistry of Materials
Publisher: American Chemical Society
Additional Information: The Copyright of the article belongs to the American Chemical Society.
Keywords: Artificial heart; Artificial intelligence; Binding energy; Carbon dioxide; Catalysts; Chemical analysis; Chemical modification; Chemical stability; Computational efficiency; Electrolytic reduction; Learning systems; Neural networks; Regression analysis; Work function; Chemical mechanism; Computational speed; High throughput screening; Material scientists; Predictive modeling; Predictive models; Projected gradient; Regression coefficient; Statistical methods
Department/Centre: Division of Electrical Sciences > Computer Science & Automation
Date Deposited: 06 Jun 2022 05:01
Last Modified: 06 Jun 2022 05:01
URI: https://eprints.iisc.ac.in/id/eprint/72927
