
Formulae for Attributes

As described in the manuscript, the first step of our method for creating a predictive model of material properties is to compute attributes based on the composition of each material. These attributes are designed to let a machine learning algorithm construct general rules that capture chemical trends and reflect chemical intuition. The attributes fall into four categories:

Stoichiometric Attributes (6 in total)

These attributes capture the fractions of the elements present and are not affected by what those elements are. All are based on $L^p$ norms, i.e. $\|x\|_p = \left(\sum_i |x_i|^p\right)^{1/p}$, of a vector $x$ whose entries are the atomic fractions of each element in the material. In this work, we use the p=0 norm (which is equivalent to the number of components) and the p=2, 3, 5, 7, and 10 norms. Such a broad range was selected to create attributes that respond to changes in fractions with varied strengths. As an example, the p=7 norm for Fe2O3 is:

\[ \|x\|_7 = \left[ \left(\tfrac{2}{5}\right)^7 + \left(\tfrac{3}{5}\right)^7 \right]^{1/7} \approx 0.605 \]

Elemental-Property-Based Attributes (115 in total)

Most of the attributes created using our method are based on statistics of the elemental properties listed in Table S1. For each property, the minimum, maximum, and range of the values of the property over the elements present in the material are computed, along with the fraction-weighted mean, the average deviation, and the mode (i.e. the property of the most prevalent element). The mean and average deviation are calculated using the following formulae:

\[ \bar{f} = \sum_i x_i f_i \tag{S1} \]

\[ \hat{f} = \sum_i x_i \left| f_i - \bar{f} \right| \tag{S2} \]

where $f_i$ is the property of element $i$, $x_i$ is its atomic fraction, $\bar{f}$ is the mean, and $\hat{f}$ is the average deviation. As an example, the mean of and average deviation in the atomic number of Fe2O3 are:

\[ \bar{f} = \tfrac{2}{5}(26) + \tfrac{3}{5}(8) = 15.2 \]

\[ \hat{f} = \tfrac{2}{5}\left|26 - 15.2\right| + \tfrac{3}{5}\left|8 - 15.2\right| = 8.64 \]

Valence Orbital Occupation Attributes (4 in total)

These attributes are the fraction-weighted average of the number of valence electrons in each orbital (s, p, d, and f) divided by the fraction-weighted average of the total number of valence electrons. They are exactly equivalent to the attributes employed by Meredig, Agrawal et al. (Ref. 1). As an example, the fraction of p electrons for Fe2O3 is computed by:

\[ F_p = \frac{\tfrac{2}{5}(0) + \tfrac{3}{5}(4)}{\tfrac{2}{5}(8) + \tfrac{3}{5}(6)} = \frac{6}{17} \approx 0.353 \]
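The stoichiometric, elemental-property, and valence-occupation formulae above can be sketched in a few lines of Python. This is a minimal illustration only, not the Magpie implementation: the function names are hypothetical, and the material is assumed to be given as parallel lists of atomic fractions and elemental property values.

```python
def lp_norm(fractions, p):
    """L^p norm of the atomic-fraction vector; p = 0 counts the components."""
    if p == 0:
        return sum(1 for x in fractions if x > 0)
    return sum(abs(x) ** p for x in fractions) ** (1.0 / p)


def weighted_mean(fractions, values):
    """Fraction-weighted mean of an elemental property (Equation S1)."""
    return sum(x * f for x, f in zip(fractions, values))


def average_deviation(fractions, values):
    """Fraction-weighted mean absolute deviation from the mean (Equation S2)."""
    mean = weighted_mean(fractions, values)
    return sum(x * abs(f - mean) for x, f in zip(fractions, values))


def orbital_fraction(fractions, orbital_electrons, total_valence):
    """Fraction of valence electrons that sit in one orbital type."""
    return (weighted_mean(fractions, orbital_electrons)
            / weighted_mean(fractions, total_valence))


# Fe2O3: atomic fractions 2/5 (Fe) and 3/5 (O)
fractions = [2 / 5, 3 / 5]
atomic_numbers = [26, 8]   # Fe, O
p_valence = [0, 4]         # number of p valence electrons of Fe, O
total_valence = [8, 6]     # total number of valence electrons of Fe, O

print("p=7 norm:", round(lp_norm(fractions, 7), 3))
print("mean Z:", round(weighted_mean(fractions, atomic_numbers), 2))
print("avg. dev. Z:", round(average_deviation(fractions, atomic_numbers), 2))
print("p fraction:", round(orbital_fraction(fractions, p_valence, total_valence), 3))
```

Running the script prints the four Fe2O3 example quantities discussed above, which makes it easy to check the worked examples by hand.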

Ionic Compound Attributes (3 in total)

These attributes are designed to determine whether a material is ionically bonded. The first is a Boolean denoting whether it is possible to form a neutral, ionic compound, assuming each element takes exactly one of its common charge states. The other two are based on the ionic character of a binary compound, which is computed from the electronegativity difference between its two constituent elements using the relation

\[ I(\chi_A, \chi_B) = 1 - \exp\left(-0.25\,(\chi_A - \chi_B)^2\right) \tag{S3} \]

where $I$ is the fraction of ionic character, $\chi_A$ is the electronegativity of element A, and $\chi_B$ is the electronegativity of element B (Ref. 2). The first of these attributes is the maximum ionic character between any two elements in the material. The second is the mean ionic character, which is computed using

\[ \bar{I} = \sum_{i,j} x_i x_j \, I(\chi_i, \chi_j) \tag{S4} \]

where $x_i$ is the atomic fraction of element $i$.

Comparison of Attribute Sets

To assess the benefit of our expanded attribute set, we compared the predictive accuracy of models created using only the fractions of each element as attributes, using the attribute set proposed by Meredig, Agrawal et al. (Ref. 1), and using the attribute set proposed in this work. To isolate the effect of changing the attribute set, we employed the same machine learning algorithm in each case: an ensemble of reduced-error-pruning decision trees created using the rotation forest technique (Ref. 3). We used 10-fold cross-validation to estimate the predictive mean absolute error of models trained on the DFT-computed formation energy, band gap energy, and volume of 228676 compounds taken from the OQMD (Ref. 4). As shown in Figure S1, models created using our attributes are, on average, 81% more accurate than models created using only the atomic fractions of elements and 25% more accurate than those using the attribute set of Ref. 1, which demonstrates the benefit of expanding the attribute set.
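Returning to the ionic compound attributes, the three quantities can be sketched as follows. This is a hedged illustration rather than the Magpie code: the function names are hypothetical, the sum in Equation S4 is assumed to run over all ordered pairs of elements, and the charge states and Pauling electronegativities in the example are supplied by hand.

```python
import math
from itertools import product


def ionic_character(chi_a, chi_b):
    """Fraction of ionic character of a bond between A and B (Equation S3)."""
    return 1.0 - math.exp(-0.25 * (chi_a - chi_b) ** 2)


def max_ionic_character(electronegativities):
    """Largest ionic character between any two elements in the material."""
    return max(ionic_character(a, b)
               for a in electronegativities for b in electronegativities)


def mean_ionic_character(fractions, electronegativities):
    """Mean ionic character, assuming a sum over all ordered pairs (Equation S4)."""
    pairs = list(zip(fractions, electronegativities))
    return sum(x_i * x_j * ionic_character(c_i, c_j)
               for x_i, c_i in pairs for x_j, c_j in pairs)


def can_form_ionic(amounts, charge_states):
    """True if one charge state per element gives a charge-neutral compound.

    amounts: integer number of atoms of each element in the formula unit.
    charge_states: list of candidate charges for each element.
    """
    return any(sum(n * q for n, q in zip(amounts, combo)) == 0
               for combo in product(*charge_states))


# Fe2O3 with Pauling electronegativities Fe = 1.83, O = 3.44
# and illustrative charge states Fe: +2/+3, O: -2 (assumed inputs).
print(can_form_ionic([2, 3], [[2, 3], [-2]]))
print(round(max_ionic_character([1.83, 3.44]), 3))
print(round(mean_ionic_character([0.4, 0.6], [1.83, 3.44]), 3))
```

Enumerating every combination of charge states is exponential in the number of elements, but it is adequate for the handful of elements in a typical compound.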
Software Used to Perform Machine Learning

All machine learning models described in the associated manuscript, with the exception of those produced using symbolic regression, were created, trained, and evaluated using the Materials Agnostic Platform for Informatics and Exploration (Magpie). This software library integrates all aspects of constructing predictive models for material properties into one program and enables users to perform all of the requisite tasks through a simple, interactive text interface. As long as users can install the Java Runtime Environment (which is available for many systems), they will be able to run this code and, given the provided datasets and scripts, replicate the results of our study. In the ZIP archive provided with this document, we have included the Java archive for Magpie (Magpie.jar) and all of its required software libraries, which include the Weka machine learning library (Ref. 5). Documentation for both the Java and text interfaces of this program is available online (Ref. 6). The source code for this program is freely available under a permissive license, and technical details of the code will be described in a future publication.

Hierarchical Model Used to Predict Band Gap Energies

A representation of the hierarchical model used to predict band gap energies in the Accurate Models for Arbitrary Properties section of the manuscript is shown in Figure S2. Each submodel employed by this composite model was an ensemble of reduced-error-pruning trees created using the random subspace approach. This composite model was chosen because, out of around 50 candidate models, it most accurately located compounds with a band gap between 0.9 and 1.7 eV. The input file and code used to create and test this model are available in the ZIP archive provided with this document.
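The routing logic of such a hierarchical model can be sketched as below. This is only an assumed structure for illustration: the class, the function names, the range label, and the constant-valued stand-in models are all hypothetical, and the fallback submodel is a design choice of the sketch rather than something taken from the original model. The one rule shown follows the example given in the caption of Figure S2 (halogen-containing compounds predicted to lie between 0 and 1.5 eV go to a dedicated submodel).

```python
HALOGENS = {"F", "Cl", "Br", "I", "At"}


def contains_halogen(composition):
    """Return True if any element in the composition is a halogen."""
    return any(element in HALOGENS for element in composition)


class HierarchicalModel:
    """Route each entry to a submodel chosen from the predicted band gap range."""

    def __init__(self, range_model, submodels, fallback):
        self.range_model = range_model  # predicts a band gap range label
        self.submodels = submodels      # {(range label, has halogen): submodel}
        self.fallback = fallback        # used when no specific rule matches

    def select(self, composition):
        key = (self.range_model(composition), contains_halogen(composition))
        return self.submodels.get(key, self.fallback)

    def predict(self, composition):
        return self.select(composition)(composition)


# Hypothetical stand-ins for trained models; the returned numbers are
# arbitrary placeholders, not physical predictions.
def range_model(composition):
    return "0-1.5 eV"


def halide_low_gap_model(composition):
    return 1.2


def generic_model(composition):
    return 0.7


model = HierarchicalModel(
    range_model,
    {("0-1.5 eV", True): halide_low_gap_model},
    generic_model,
)
print(model.predict({"Ag": 0.5, "I": 0.5}))  # routed to the halide submodel
print(model.predict({"Si": 1.0}))            # routed to the fallback submodel
```

Keeping the routing table as a dictionary keyed on (range, composition class) makes it straightforward to add further submodels without changing the prediction code.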

References

1. Meredig, B., Agrawal, A., et al. Combinatorial screening for new materials in unconstrained composition space with machine learning. Phys. Rev. B 89, 094104 (2014).
2. Callister, W. D. Materials Science and Engineering: An Introduction. (Wiley, 2007).
3. Rodríguez, J. J., Kuncheva, L. I. & Alonso, C. J. Rotation forest: A new classifier ensemble method. IEEE Trans. Pattern Anal. Mach. Intell. 28, 1619–1630 (2006).
4. Saal, J. E., Kirklin, S., Aykol, M., Meredig, B. & Wolverton, C. Materials Design and Discovery with High-Throughput Density Functional Theory: The Open Quantum Materials Database (OQMD). JOM 65, 1501–1509 (2013).
5. Hall, M. et al. The WEKA data mining software. ACM SIGKDD Explor. Newsl. 11, 10 (2009).
6. http://bitbucket.org/wolverton/magpie
7. Wierzbicki, A. P. A mathematical basis for satisficing decision making. Math. Model. 3, 391–405 (1982).
8. http://reference.wolfram.com/language/note/elementdatasourceinformation.html
9. Villars, P., Cenzual, K., Daams, J., Chen, Y. & Iwata, S. Data-driven atomic environment prediction for binaries using the Mendeleev number. J. Alloys Compd. 367, 167–175 (2004).

Figures

Figure S1. Relative magnitude of mean absolute errors in cross-validation of machine learning models that used three different attribute sets: only the composition (i.e., atomic fractions of each element), the attribute set of Meredig, Agrawal et al. (Ref. 1), and the set proposed in this work. Each model used an ensemble of decision trees and was trained on the DFT formation energy, volume, and band gap energy of 228676 compounds.

Figure S2. Hierarchical model used to predict band gap energies of crystalline compounds. Each rectangle with rounded corners represents a machine learning model. The model on the far left (Model #1) is trained to predict the range in which the band gap is most likely to lie. The models on the right are trained to predict the actual value of the band gap energy. Depending on the result of Model #1 and the composition of an entry, a different machine learning model is used. For example, Model #3 is used for all halogen-containing compounds predicted by Model #1 to have a band gap energy between 0 and 1.5 eV.

Tables

Table S1. Elemental properties used to compute elemental-property-based attributes. Elemental property data are taken from the dataset available with the Wolfram programming language (Ref. 8) unless otherwise specified.

Atomic Number; Mendeleev Number (Ref. 9); Atomic Weight; Melting Temperature
Column; Row; Covalent Radius; Electronegativity*
# s Valence; # p Valence; # d Valence; # f Valence; Total # Valence
# Unfilled s†; # Unfilled p†; # Unfilled d†; # Unfilled f†; Total # Unfilled†
Specific Volume of 0 K Ground State‡; Band Gap Energy of 0 K Ground State‡; Magnetic Moment (per atom) of 0 K Ground State‡; Space Group Number of 0 K Ground State‡

* Electronegativities for Eu, Yb, Tb, and Pm are taken to be the average of those of the elements with one greater and one less atomic number (e.g. the average of Sm and Gd is used for Eu).
† Computed as the number of electrons in a partially-occupied orbital subtracted from the total number of electrons allowed in that orbital. Unoccupied orbitals always count as 0. Example: an element with an electronic configuration of [Ar]3d³4s² has 0 unfilled s orbitals, 7 unfilled d orbitals, and 0 unfilled p and f orbitals by the measure defined here.
‡ Data taken from OQMD.org.
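The two derived quantities described in the notes to Table S1 can be sketched as follows. This is a minimal illustration with hypothetical function names; the valence occupation is assumed to be given as a mapping from orbital type to electron count, and the Pauling electronegativities of Sm and Gd are supplied by hand.

```python
ORBITAL_CAPACITY = {"s": 2, "p": 6, "d": 10, "f": 14}


def unfilled_counts(valence_occupation):
    """Unfilled count per orbital type.

    Partially occupied orbitals contribute (capacity - electrons);
    unoccupied and completely filled orbitals contribute 0.
    """
    counts = {}
    for orbital, capacity in ORBITAL_CAPACITY.items():
        electrons = valence_occupation.get(orbital, 0)
        counts[orbital] = capacity - electrons if electrons > 0 else 0
    return counts


def fill_electronegativity(known, atomic_number):
    """Average the values of the elements with one less and one greater Z."""
    return (known[atomic_number - 1] + known[atomic_number + 1]) / 2.0


# [Ar]3d3 4s2 (vanadium): 3 d electrons and 2 s electrons
vanadium = unfilled_counts({"d": 3, "s": 2})
print(vanadium)                # {'s': 0, 'p': 0, 'd': 7, 'f': 0}
print(sum(vanadium.values()))  # total number unfilled: 7

# Eu (Z = 63) from Sm (Z = 62) and Gd (Z = 64); Pauling values assumed here
pauling = {62: 1.17, 64: 1.20}
print(round(fill_electronegativity(pauling, 63), 3))
```

Treating unoccupied orbitals as 0 rather than as fully empty keeps the attribute from being dominated by high-capacity f orbitals in elements that have no f valence electrons at all.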