

Page 1: Application of Machine Learning to Materials Discovery and … · 2018. 6. 11. · • Ruoqian Liu, Abhishek Kumar, Zhengzhang Chen, Ankit Agrawal, Veera Sundararaghavan, Alok Choudhary,

Application of Machine Learning to Materials Discovery and Development

Ankit Agrawal and Alok Choudhary Department of Electrical Engineering and Computer Science

Northwestern University {ankitag,choudhar}@eecs.northwestern.edu

Contributors: Surya Kalidindi (GaTech), Basavarsu (TRDDC), Chris Wolverton (NU), Ahmet Cecen (GaTech), Parijat Deshpande (TRDDC), Bryce Meredig (NU)

MURI 3-Year Review

June 22-23, 2015

Page 2

Integrated Computational Materials Engineering (ICME)

© Olson, G. B. (1997). Computational design of hierarchically structured materials. Science, 277(5330), 1237-1242.

[Figure: chain linking Processing, Structure, Properties and Performance, with arrows labeled "Goal/means" and "Cause and effect"]

Page 5

Project Collaboration

© Olson, G. B. (1997). Computational design of hierarchically structured materials. Science, 277(5330), 1237-1242.

[Figure: chain linking Processing, Structure, Properties and Performance, with arrows labeled "Goal/means" and "Cause and effect"; the four projects below are placed along this chain]

Project I. Multi-objective Structure-Property Optimization

Project II. Multiscale Prediction of Localization Relationships

Project III. Exploring Composition-Processing-Property Relationships

Project IV. Composition-based Discovery of Stable Compounds

Page 6

Predicting fatigue strength of steel from composition and processing parameters

PROPERTIES (FATIGUE STRENGTH)

• CORRELATES TO COMPOSITION

• CORRELATES TO MANUFACTURING PROCESSES

Objective: Apply data-driven approaches to the NIMS public-domain materials database to explore composition-processing-property relationships and to construct predictive models for the fatigue strength of steels.

Collaborative project between Agrawal (NU), Choudhary (NU), Kalidindi (GaTech), Basavarsu (TRDDC)

Page 7

NIMS Database Attributes

Reference: http://mits.nims.go.jp/index_en.html

Fatigue Data Sheet Information:

Chemical composition - %C, %Si, %Mn, %P, %S, %Ni, %Cr, %Cu, %Mo (all in wt. %)

Upstream processing details - Ingot size, Reduction ratio, Non-metallic inclusions

Heat treatment conditions - Temperature, Time and other process conditions for Normalizing, Carburizing-Quenching and Tempering processes

Mechanical properties - YS, UTS, %EL (Elongation), %RA (Reduction in Area), Vickers Hardness, Charpy impact value (J/cm²), Rotating bending fatigue strength at 10⁷ cycles

Total - 437 data records: Carbon and low alloy steels - 371 observations, Carburizing steels - 48 observations, and Spring steels - 18 observations

Page 8

Steel Fatigue Strength Prediction Framework

Page 9

Data Mining Modeling

• Classification/Regression
  • Learning a predictive model from supervised (labeled) training data, which can then be used to classify unseen data or predict values for it
  • E.g. decision trees, neural networks, support vector machines, etc.

• Model evaluation
  • Test-train split
    • Split the labeled data into training and testing sets
  • Cross-validation
    • Test every instance in the dataset using a model that has not seen that instance
    • Types
      • k-fold cross-validation
      • Leave-one-out cross-validation (LOOCV), i.e. k-fold with k = n

[Figure: labeled data divided into a training split and a testing split]
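The evaluation schemes on this slide can be sketched in a few lines of Python. This is a minimal illustration and not the code used in the study: the helper names `k_fold_indices` and `cross_validated_predictions`, the toy mean-value model `fit_mean`, and the sample numbers are all assumptions made for the example.

```python
from typing import Callable, List, Sequence, Tuple


def k_fold_indices(n: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Return (train, test) index lists for k contiguous folds over n records."""
    if not 2 <= k <= n:
        raise ValueError("k must satisfy 2 <= k <= n")
    folds = []
    start = 0
    for fold in range(k):
        # Spread the remainder so fold sizes differ by at most one.
        size = n // k + (1 if fold < n % k else 0)
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        folds.append((train, test))
        start += size
    return folds


def cross_validated_predictions(
    xs: Sequence[float],
    ys: Sequence[float],
    k: int,
    fit: Callable[[Sequence[float], Sequence[float]], Callable[[float], float]],
) -> List[float]:
    """Predict every record with a model that never saw it during fitting."""
    predictions = [0.0] * len(xs)
    for train, test in k_fold_indices(len(xs), k):
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        for i in test:
            predictions[i] = model(xs[i])
    return predictions


def fit_mean(xs: Sequence[float], ys: Sequence[float]) -> Callable[[float], float]:
    """Toy stand-in model (assumption): always predicts the training mean."""
    mean = sum(ys) / len(ys)
    return lambda x: mean


if __name__ == "__main__":
    xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
    ys = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
    print(cross_validated_predictions(xs, ys, k=3, fit=fit_mean))
    # Leave-one-out cross-validation is the special case k = n.
    print(cross_validated_predictions(xs, ys, k=len(xs), fit=fit_mean))
```

The folds here are contiguous to keep the sketch short; in practice the records would normally be shuffled before splitting so that each fold is representative of the whole dataset.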

Page 10

Cluster Visualization

Page 11

Information Gain Based Feature Ranking

Page 12

Evaluation Metrics

Compare vectors of actual and predicted values:

• Coefficient of correlation (R)

• Coefficient of determination (R²)

• Mean Absolute Error (MAE)

• Root Mean Squared Error (RMSE)

• Standard Deviation of Error (SDE)

• Mean Absolute Error Fraction (MAE_f)

• Root Mean Squared Error Fraction (RMSE_f)

• Standard Deviation of Error Fraction (SDE_f)
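The metrics named on this slide can be computed from the two vectors as sketched below. The slide gives only the names, so the exact definitions are assumptions: R is taken as the Pearson correlation, R² as its square (another common convention is 1 - SS_res/SS_tot), SDE as the spread of the absolute errors around the MAE, and the "fraction" variants as the same quantities with each error divided by the actual value. The function name `evaluation_metrics` is hypothetical.

```python
import math
from typing import Dict, List, Sequence


def evaluation_metrics(actual: Sequence[float], predicted: Sequence[float]) -> Dict[str, float]:
    """Compare vectors of actual and predicted values (definitions are assumptions)."""
    n = len(actual)
    if n == 0 or n != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")

    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    # Pearson correlation; undefined (division by zero) if either vector is constant.
    r = cov / math.sqrt(var_a * var_p)

    abs_err = [abs(a - p) for a, p in zip(actual, predicted)]
    # Fractional errors divide by the actual value, so actual values must be nonzero.
    frac_err = [e / abs(a) for e, a in zip(abs_err, actual)]

    def summarize(errors: List[float], suffix: str) -> Dict[str, float]:
        mae = sum(errors) / n
        rmse = math.sqrt(sum(e * e for e in errors) / n)
        # SDE taken here as the spread of the absolute errors around their mean.
        sde = math.sqrt(sum((e - mae) ** 2 for e in errors) / n)
        return {"MAE" + suffix: mae, "RMSE" + suffix: rmse, "SDE" + suffix: sde}

    metrics = {"R": r, "R2": r * r}
    metrics.update(summarize(abs_err, ""))
    metrics.update(summarize(frac_err, "_f"))
    return metrics
```

Reporting both absolute and fractional errors is useful when the target spans a wide range, since a fixed absolute error matters more for a low-strength steel than for a high-strength one.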

Page 13

Results Comparison

Page 15

Results Comparison

A. Agrawal, P. D. Deshpande, A. Cecen, G. P. Basavarsu, A. N. Choudhary, and S. R. Kalidindi, “Exploration of data science techniques to predict fatigue strength of steel from composition and processing parameters,” Integrating Materials and Manufacturing Innovation, 3 (8): 1–19, 2014.

Page 16

Discovery of stable compounds

Collaborative project between Agrawal (NU), Choudhary (NU), Wolverton (NU)

Page 17

Discovery Framework

Database Construction

• Thousands of DFT formation energies

• Empirical elemental data

Predictive Modeling

• Model 1: established heuristic

• Model 2: data mining

Model Evaluation

• Test models on unseen formation energies

Prediction

• Run combinatorial list of compositions through models

Ranking

• Combine heuristic and data mining predictions

Validation

• Experiments

• Crystal structure prediction

[Figure, panels (a) and (b): millions of candidate ternary compositions are passed through the models to give formation energy predictions and ranked high-potential candidates for compound discovery]
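The prediction and ranking stages of this workflow can be illustrated with a small sketch. It is not the authors' implementation: the element labels, the per-element numbers in `TOY_WEIGHTS`, the two stand-in models, and the choice to combine the models by averaging their predictions are all assumptions made so that the example runs end to end.

```python
from functools import reduce
from itertools import combinations, product
from math import gcd
from typing import Callable, Dict, List, Tuple

Composition = Tuple[Tuple[str, int], ...]  # e.g. (("A", 1), ("B", 1), ("X", 2))


def candidate_ternaries(elements: List[str], max_count: int) -> List[Composition]:
    """Enumerate three-element compositions with integer counts 1..max_count."""
    candidates = []
    for trio in combinations(sorted(elements), 3):
        for counts in product(range(1, max_count + 1), repeat=3):
            # Skip multiples of a simpler formula, e.g. (2, 2, 2) repeats (1, 1, 1).
            if reduce(gcd, counts) > 1:
                continue
            candidates.append(tuple(zip(trio, counts)))
    return candidates


def rank_candidates(
    candidates: List[Composition],
    heuristic: Callable[[Composition], float],
    data_mining: Callable[[Composition], float],
) -> List[Tuple[Composition, float]]:
    """Score each candidate with both models and sort most negative first.

    Averaging the two predictions is an assumption made for this sketch.
    """
    scored = [(c, 0.5 * (heuristic(c) + data_mining(c))) for c in candidates]
    return sorted(scored, key=lambda item: item[1])


# Placeholder per-element contributions (eV/atom); made-up numbers, not real data.
TOY_WEIGHTS: Dict[str, float] = {"A": -0.2, "B": -0.5, "C": 0.1, "X": -1.0}


def toy_heuristic(comp: Composition) -> float:
    """Hypothetical stand-in: composition-weighted average of element weights."""
    total = sum(n for _, n in comp)
    return sum(TOY_WEIGHTS[el] * n for el, n in comp) / total


def toy_data_mining(comp: Composition) -> float:
    """Hypothetical stand-in: the most negative element weight in the candidate."""
    return min(TOY_WEIGHTS[el] for el, _ in comp)


if __name__ == "__main__":
    ranked = rank_candidates(
        candidate_ternaries(["A", "B", "C", "X"], max_count=2),
        toy_heuristic,
        toy_data_mining,
    )
    for comp, score in ranked[:3]:
        print("".join(f"{el}{n}" for el, n in comp), round(score, 4))
```

In a real screening run the two stand-in functions would be replaced by the fitted heuristic and data mining models, and the top of the ranked list would be handed to the validation stage.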

Page 18

Model Validation: Numerical

[Figure: parity plots of model formation energy (eV/atom) against DFT formation energy (eV/atom), both axes running from -5.0 to 0.0]

• DM: binaries: R² = 0.87, MAE = 0.27 eV/atom

• DM: binaries + ternaries: R² = 0.93, MAE = 0.16 eV/atom

• Heuristic: R² = 0.95, MAE = 0.12 eV/atom

Page 19

Model Validation: Ranking

Combined model outperforms either alone in regime of interest

[Figure: ROC curves of true positive rate (sensitivity) against false positive rate (1 - specificity), both axes running from 0 to 1. Curves: random guessing, heuristic, DM: bin. + 4k tern., combined. Marked points: perfect classifier, classify all stable, classify all unstable. An annotation indicates the directions in which the classifier becomes more conservative and less conservative.]
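The ROC curve on this slide is traced by sweeping a decision threshold over the model output. The sketch below shows the standard construction under stated assumptions: a lower (more negative) score is taken to mean "more likely stable", and the function names `roc_points` and `area_under_curve` are hypothetical.

```python
from typing import List, Sequence, Tuple


def roc_points(scores: Sequence[float], is_stable: Sequence[bool]) -> List[Tuple[float, float]]:
    """Return (false positive rate, true positive rate) pairs for every threshold.

    A candidate is classified stable when its score is <= the threshold, so a
    lower score means "more likely stable" (an assumption of this sketch).
    """
    positives = sum(1 for s in is_stable if s)
    negatives = len(is_stable) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one stable and one unstable example")

    # Start with the most conservative classifier: everything labeled unstable.
    points = [(0.0, 0.0)]
    tp = fp = 0
    ordered = sorted(zip(scores, is_stable), key=lambda pair: pair[0])
    for i, (score, stable) in enumerate(ordered):
        if stable:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last candidate sharing this score.
        if i + 1 == len(ordered) or ordered[i + 1][0] != score:
            points.append((fp / negatives, tp / positives))
    return points  # Ends at (1.0, 1.0): everything labeled stable.


def area_under_curve(points: Sequence[Tuple[float, float]]) -> float:
    """Trapezoidal area under the ROC curve; the random-guessing diagonal gives 0.5."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2.0
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )
```

Raising the threshold moves along the curve from "classify all unstable" at (0, 0) toward "classify all stable" at (1, 1), which corresponds to the classifier becoming less conservative.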

Page 20

What happens when we rank “all possible ternaries” by their likelihood of stability?

Page 21

Predictions for Discovery

Average of all A-B-X ternaries

Fingerprint of entire unexplored ternary composition space!

Interesting insights:

Highest ranked ternary: SiYb3F5

Si acts as an anion

Validated with structure and DFT calculations

pnictides, chalcogenides, halides

Pt-X-Y

Pm12S19Se – a missing binary Pm2S3?

Page 22

Validation

Examples of discovered stable ternary compositions whose stability was explicitly confirmed with crystal structure prediction. Our method is successful at identifying new stable compounds across a wide variety of chemistries.

B. Meredig*, A. Agrawal*, S. Kirklin, J. E. Saal, J. W. Doak, A. Thompson, K. Zhang, A. Choudhary, and C. Wolverton, “Combinatorial screening for new materials in unconstrained composition space with machine learning”, Phys. Rev. B, 89, 094104, March 2014.

Page 23

Summary

Steel Fatigue Strength Prediction

o NIMS database consisting of composition and processing parameters linked with performance (fatigue strength).

o Neural networks, decision trees, and multivariate polynomial regression are able to achieve high R² values of >0.98.

Stable Compound Discovery

o A database of DFT calculations is used to learn composition-property relationships, thus mimicking DFT for estimating stability.

o The resulting predictive models are used to scan the entire ternary composition space to discover likely stable compositions.

o Many predictions were explicitly confirmed with crystal structure prediction and DFT.

Page 24

Future Outlook

[Figure: chain linking Processing, Structure, Properties and Performance, with arrows labeled "Goal/means" and "Cause and effect"; the four projects below are placed along this chain]

Project I. Multi-objective Structure-Property Optimization

Project II. Multiscale Prediction of Localization Relationships

Project III. Exploring Composition-Processing-Property Relationships

Project IV. Composition-based Discovery of Stable Compounds

Page 25

Publications

• A. Agrawal, P. D. Deshpande, A. Cecen, G. P. Basavarsu, A. N. Choudhary, and S. R. Kalidindi, “Exploration of data science techniques to predict fatigue strength of steel from composition and processing parameters,” Integrating Materials and Manufacturing Innovation, vol. 3, no. 8, pp. 1–19, 2014.

• B. Meredig, A. Agrawal, S. Kirklin, J. E. Saal, J. W. Doak, A. Thompson, K. Zhang, A. Choudhary, and C. Wolverton, “Combinatorial screening for new materials in unconstrained composition space with machine learning,” Physical Review B, vol. 89, no. 094104, pp. 1–7, 2014. BM and AA are co-first authors.

• Ruoqian Liu, Abhishek Kumar, Zhengzhang Chen, Ankit Agrawal, Veera Sundararaghavan, Alok Choudhary, “A predictive machine learning approach for microstructure optimization and materials design,” Scientific Reports, Nature Publishing Group, 2015, in press.

• R. Liu, Z. Chen, T. Fast, S. Kalidindi, A. Agrawal, and A. Choudhary, “Predictive Modeling in Characterizing Localization Relationships.” 2014. 2014 TMS Annual Meeting & Exhibition, Symposium of Data Analytics for Materials Science and Manufacturing, Feb. 16-20, San Diego, CA.

• R. Liu, A. Kumar, Z. Chen, A. Agrawal, V. Sundararaghavan, and A. Choudhary, “A Data Mining Approach in Structure-Property Optimization.” 2014. 2014 TMS Annual Meeting & Exhibition, Symposium of Data Analytics for Materials Science and Manufacturing, Feb. 16-20, San Diego, CA.

• P. D. Deshpande, B. P. Gautham, A. Cecen, S. Kalidindi, A. Agrawal, and A. Choudhary, “Application of Statistical and Machine Learning Techniques for Correlating Properties to Composition and Manufacturing Processes of Steels,” in 2nd World Congress on Integrated Computational Materials Engineering, July 7-11, 2013, Salt Lake City, Utah, 2013, pp. 155–160.

• R. Liu, Y. Yabansu, S. Kalidindi, A. Agrawal, and A. Choudhary, “Predictive Modeling in Characterizing Localization Relationships.” 2015, in preparation.

Page 26

Thank You!

Ankit Agrawal Research Associate Professor

Dept. of Electrical Engineering and Computer Science

Northwestern University ankitag@eecs.northwestern.edu

www.eecs.northwestern.edu/~ankitag/