DATA MINING WITH DIFFERENT TYPES OF X-RAY DATA
C. K. Lowe-Ma, A. E. Chen, D. Scholl
Physical & Environmental Sciences, Research and Advanced Engineering, Ford Motor Company, Dearborn, Michigan, USA

C. J. Gilmore, R. J. Thatcher
Chemistry Department, University of Glasgow, Glasgow, Scotland, UK

W. Sverdlik
Department of Computer Science, Eastern Michigan University, Ypsilanti, Michigan, USA

Abstract

High-Throughput Materials Discovery uses automation and parallelism to synthesize and evaluate large numbers of specimens while reducing time and costs associated with finding and optimizing novel materials. As optimal performance may not be uniformly distributed throughout parameter space, efficient tools for analyzing data and evaluating large areas of compositional or parameter space are needed. Data mining tools enable moving from the statistics of limited experimental designs to more descriptive and predictive relationships. Clustering a set of 47 samples for which both X-ray powder diffraction data and X-ray fluorescence-based elemental composition data were available showed that elemental composition correlated strongly with phase composition in this particular set of samples. Also, the clustering of the X-ray data was found to be exactly coincident with a different sample characteristic "type". Decision tree classification of a larger data set of 86 samples showed that "type" could be defined with very few errors from relatively few splits of the XRF-based compositions. Although composition exhibited strong clustering, measures of performance in these same samples exhibited only very weak clustering. However, performance of the materials could be predicted from linear regression using different slices of the data. Neural nets were attempted for improved predictability of performance beyond linear regression.
As expected from the linear regression results, single-output linear-based multi-layer perceptrons yielded acceptable predictive capability, but were found to yield notably degraded predictive results if "type" was excluded from the models. The strong dependence of performance on "type" for these samples was an unexpected outcome of the data analysis.

This document was presented at the Denver X-ray Conference (DXC) on Applications of X-ray Analysis, sponsored by the International Centre for Diffraction Data (ICDD). This document is provided by ICDD in cooperation with the authors and presenters of the DXC for the express purpose of educating the scientific community. All copyrights for the document are retained by ICDD. Usage is restricted for the purposes of education and scientific research.

Introduction

High-Throughput Materials Discovery makes use of automated instrumentation and parallelism to synthesize and test large numbers of specimens (Figure 1). The foundation of this approach is that more can be learned from experiments on a widely diverse set of specimens than from complex, detailed measurements on simple systems or from measurements of a limited number of samples. Automated instrumentation and large numbers of specimens imply that large amounts of data will be generated, creating a strong need for efficient methods of evaluating data from large areas of compositional or parameter space. Although standard experimental design (DOE) statistical tools can provide a basis for selecting parameters and interpreting results, DOE tools are inherently constrained to the parameter space examined. We would like to take knowledge gleaned from a wide diversity of specimens, describe our knowledge about these specimens, and develop predictions about regions in parameter space where further studies would be warranted. Describing data and developing predictions fall in the realm of data mining. Instead of the inward deductive data focus of DOE and statistical analysis tools, data mining emphasizes learning from examples and extrapolating to more general descriptive or predictive models through the use of a variety of artificial intelligence, pattern recognition, and machine learning algorithms.

Effective data mining is all about how to formulate questions that are meaningful or sensible and how to prepare data to correctly answer those questions. Unfortunately, no general recipes exist for designing good questions or for preparing data, especially scientific data, although some useful general references are available [1,2]. Types of standard data mining algorithms that might be used to answer questions are listed in Table 1. In this paper, clustering, regression, decision tree classification, and neural nets were used to examine relationships in a dataset that contained both quantitative X-ray fluorescence compositions and X-ray powder diffraction data.

Figure 1. Ford Motor Company implementation of High-Throughput Materials Discovery. Diagram labels: Design Experiment (DoE Tools); Data Reduction and Data Mining; Database; Robotic Synthesis; Parallel Screening.

Results and Discussion

As previously mentioned, one of the biggest challenges in data mining is data preparation. Although many vendors offer very capable software for handling X-ray powder diffraction data, we developed a fully automated empirical algorithm for background subtraction using Python. The algorithm (Equation 1) uses a 6-parameter fit with complex non-linear weighting but requires only a single input parameter from the user specifying an estimate of where background is relative to the last few points at the high-angle end of the scan.
The algorithm fits both the low-angle scatter arising from powder surface roughness and the flat background expected at higher angles from off-axis-cut zero-background quartz substrates (Figure 2). Minimization is achieved using a Nelder-Mead simplex.

$$ y = a_1 + \frac{a_2}{x} + \frac{a_3}{x^2} + \frac{a_4}{x^3} + a_5 e^{a_6 x^2} \qquad \text{(Eqn. 1)} $$

The advantage of using this algorithm for background subtraction is that all diffraction scans are treated the same and a very large number of data files can be handled very efficiently by listing the filenames in a batch run file. Following background subtraction, the X-ray powder
diffraction data can then be further processed. For the analyses described below, the X-ray powder diffraction data were subsequently processed using PolySNAP [3,4].

Table 1. Types of Data Mining Algorithms

Classification Models: version space hypotheses; decision trees and lists; instance-based classifiers; perceptron neural nets; genetic algorithms; Bayesian inference.
Regression (numerical data): linear and multiple regression; regression and model trees; adaptive neural nets, multilayer nets; genetic algorithms.
Descriptive Data Visualization: statistical exploratory data analysis; hierarchical clustering; K-means clustering; Expectation Maximization clustering.
Other: market basket analyses, a priori algorithms; textual analyses; image analysis and segmentation.

Figure 2. Two X-ray powder diffraction scans showing the effectiveness of the new algorithm in fitting a background. The red line is the fitted background, y in Equation 1.

The X-ray powder diffraction data were obtained with either a PAD-V or an X2 Scintag powder diffractometer equipped with a copper-target X-ray tube. Data were collected with continuous scans and electronic integration over θ. The X-ray fluorescence data were obtained with a Philips PW2400 with a chromium tube using UniQuant5, with sensitivities optimized using additional in-house calibration standards and with background channels customized to better handle the chemistries of these samples. The resulting output of oxide weight percentages was converted to moles of each element. The data were prepared such that relationships between phase composition, elemental composition, and performance could be examined.
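The background-fitting step described earlier (a six-parameter function minimized with a Nelder-Mead simplex) can be sketched in Python roughly as follows. This is a minimal sketch and not the authors' code. The functional form used for the background (a constant, three inverse powers of x, and an exponential term), the simple asymmetric weighting that down-weights points lying above the trial background, and the names `background_model`, `weighted_cost`, `fit_background`, `tail_offset`, and `n_tail` are all assumptions made for illustration. Only the six-parameter fit, the single user-supplied estimate tied to the last few high-angle points, and the Nelder-Mead minimization come from the text.

```python
import numpy as np
from scipy.optimize import minimize


def background_model(params, x):
    """Assumed six-parameter background: constant + inverse powers + exponential."""
    a1, a2, a3, a4, a5, a6 = params
    # Clip the exponent so a wandering simplex vertex cannot overflow exp().
    exponent = np.clip(a6 * x**2, -50.0, 50.0)
    return a1 + a2 / x + a3 / x**2 + a4 / x**3 + a5 * np.exp(exponent)


def weighted_cost(params, x, y, peak_weight=0.05):
    """Weighted sum of squared residuals.

    Assumption: diffraction peaks sit above the background, so points above
    the trial curve are down-weighted. This stands in for the paper's
    unspecified non-linear weighting.
    """
    resid = y - background_model(params, x)
    weights = np.where(resid > 0.0, peak_weight, 1.0)
    return float(np.sum(weights * resid**2))


def fit_background(x, y, tail_offset=0.0, n_tail=10):
    """Fit one scan; x is 2-theta (> 0) and y is intensity.

    tail_offset plays the role of the single user input: where the background
    is expected to lie relative to the mean of the last n_tail points.
    """
    start = np.zeros(6)
    start[0] = np.mean(y[-n_tail:]) + tail_offset
    result = minimize(weighted_cost, start, args=(x, y), method="Nelder-Mead",
                      options={"maxiter": 5000, "maxfev": 5000})
    return result.x, background_model(result.x, x)


# Synthetic scan (hypothetical): decaying low-angle scatter plus one peak.
two_theta = np.linspace(5.0, 90.0, 500)
true_bg = 40.0 + 900.0 / two_theta
scan = true_bg + 300.0 * np.exp(-0.5 * ((two_theta - 30.0) / 0.3) ** 2)

params, fitted_bg = fit_background(two_theta, scan)
subtracted = scan - fitted_bg  # background-subtracted intensities
```

Batch processing as described in the text would amount to looping `fit_background` over the scans named in a run file; the file handling is omitted here because the file format is not given.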
Merging data from different characterization techniques yielded two data sets: (1) a set of 47 samples with X-ray powder diffraction (XRD) data, elemental compositions from X-ray fluorescence (XRF), surface area, and four measures of performance; (2) a related data set containing 86 samples with XRF data, surface area, a parameter for history (sample aging), and four measures of performance but without XRD data.
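The conversion of oxide weight percentages to moles of each element mentioned above is simple arithmetic and can be illustrated with a short calculation. The sketch below is only an example: the oxides listed and the sample values are hypothetical, since the paper does not name the elements analyzed, and the function name `oxide_wt_pct_to_moles` is introduced here for illustration. Moles are expressed per 100 g of sample, i.e. weight percent divided by the oxide molar mass, multiplied by the number of cations per formula unit.

```python
# Oxide -> (element, oxide molar mass in g/mol, cations per formula unit).
# The choice of oxides is hypothetical; it is not taken from the paper.
OXIDES = {
    "Al2O3": ("Al", 101.961, 2),
    "SiO2": ("Si", 60.084, 1),
    "CeO2": ("Ce", 172.115, 1),
    "ZrO2": ("Zr", 123.223, 1),
}


def oxide_wt_pct_to_moles(wt_pct):
    """Convert {oxide: weight percent} to {element: moles per 100 g sample}."""
    moles = {}
    for oxide, pct in wt_pct.items():
        element, molar_mass, n_cations = OXIDES[oxide]
        moles[element] = moles.get(element, 0.0) + n_cations * pct / molar_mass
    return moles


# Hypothetical XRF result for one sample, in oxide weight percent.
sample = {"Al2O3": 50.0, "SiO2": 30.042, "CeO2": 10.0, "ZrO2": 9.958}
element_moles = oxide_wt_pct_to_moles(sample)
print(element_moles)
```

Working in moles rather than oxide weight percent puts all elements on a common stoichiometric scale before clustering or regression.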
Data sets (1) and (2) were initially examined for natural groupings in the data with clustering. STATISTICA [5] was used for hierarchical clustering of the XRF, surface area, and performance data. Similarity clustering of the XRF, surface area, and performance data in various combinations with the XRD data was accomplished with the three-way multidimensional scaling of PolySNAP [6,7]. For more predictive models, regression and decision tree classification were accomplished with the open-source software WEKA [6]. Neural nets were developed using STATISTICA Neural Nets.

Figure 3. (a) The clustering of the XRPD data in data set (1) by multi-dimensional scaling in PolySNAP. (b) The clustering of the XRF data in data set (1) also by multi-dimensional scaling. Although difficult to see in these images, the cluster membership is exactly the same for both types of data.

Figure 4. From PolySNAP using data set (1), similarity clustering of a subset of the elemental data from XRF (a) without surface area and (b) with surface area included.
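The idea of similarity clustering of full patterns followed by multidimensional scaling can be sketched in a few lines. This is not PolySNAP's algorithm. It is a simplified illustration that assumes a Pearson correlation between patterns as the similarity measure, converts it to a distance d = (1 - r)/2, and applies classical (Torgerson) metric scaling with NumPy. The synthetic two-phase patterns and all function names are hypothetical.

```python
import numpy as np


def correlation_distance(patterns):
    """Pairwise distance d = (1 - r) / 2 from Pearson correlation of rows."""
    r = np.corrcoef(patterns)
    d = 0.5 * (1.0 - r)
    np.fill_diagonal(d, 0.0)
    return d


def classical_mds(dist, n_dims=2):
    """Classical metric MDS: eigen-decompose the double-centered squared distances."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist**2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:n_dims]  # largest eigenvalues first
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale


def peak(x, center, width=0.4):
    """Gaussian peak used to build synthetic diffraction patterns."""
    return np.exp(-0.5 * ((x - center) / width) ** 2)


rng = np.random.default_rng(0)
angles = np.linspace(10.0, 80.0, 700)
phase_a = peak(angles, 25.0) + 0.6 * peak(angles, 42.0)
phase_b = peak(angles, 31.0) + 0.8 * peak(angles, 55.0)

# Five noisy patterns of each hypothetical phase; rows are samples.
patterns = np.array(
    [phase_a + 0.02 * rng.standard_normal(angles.size) for _ in range(5)]
    + [phase_b + 0.02 * rng.standard_normal(angles.size) for _ in range(5)]
)

dist = correlation_distance(patterns)
coords = classical_mds(dist)  # one 2-D point per sample, ready to plot
```

Applying the same two functions to a matrix of elemental compositions instead of diffraction patterns gives a second set of coordinates whose groupings can be compared with the first, which is the kind of comparison shown in Figure 3.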
As illustrated in Figure 3, cluster membership is found to be the same for both types of X-ray data, XRD and XRF. Therefore, the phase composition has a strong relationship to elemental composition: different variations in specimen composition are related to the presence of different phases. Examination of the cluster membership shows that the members accurately reflect a descriptor, "sample type", that was derived from other information unrelated to any chemical or characterization measurements; e.g., sample type reflects the source from which the chemicals originated.

Figure 5. Similarity clustering from PolySNAP that results from adding XRD data to surface area and a subset of XRF elemental data (data set 1).

Our knowledge of the samples tells us that not all of the elemental composition should be related to sample type. Manually selecting a subset of the XRF data enables probing relationships beyond the influence of sample type. However, the subset of XRF data exhibits relatively weak clustering (Figure 4a). Including surface area with the XRF data changes the cluster membership (Figure 4b) but does not strengthen the relationships. Hierarchical clustering using complete linkage with Euclidean distances for the same subset of XRF data but from the larger data set (2) of 86 samples still yields poor clustering with very small linkage distances. However, inclusion of surface area in the hierarchical clustering of the larger data set does yield more numerically significant linkage distances and more distinct clusters. Not surprisingly, because the XRD data contain information so strongly related to sample type (Figure 3a), the addition of XRD clustering to the XRF subset-surface area clustering imposes a more definite structure on the overall clustering (Figure 5). Nevertheless, surface area and the XRF subset of data influence the cluster membership compared with Figure 3a.
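Hierarchical clustering with complete linkage and Euclidean distances, as used above, can be sketched with SciPy. The paper used STATISTICA, so this is a substitute tool for illustration only. The synthetic compositions, the two hypothetical sample types, and the function name `complete_linkage_clusters` are assumptions. The height of the last merge relative to earlier merges gives a rough sense of how distinct the clusters are, which is the "linkage distance" criterion referred to in the text.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage


def complete_linkage_clusters(features, n_clusters):
    """Standardize columns, build a complete-linkage tree, cut into n_clusters."""
    z = (features - features.mean(axis=0)) / features.std(axis=0)
    tree = linkage(z, method="complete", metric="euclidean")
    labels = fcluster(tree, t=n_clusters, criterion="maxclust")
    return tree, labels


rng = np.random.default_rng(1)

# Hypothetical data: moles of three elements for two sample types, 10 each.
type_a = rng.normal(loc=[1.0, 0.5, 0.2], scale=0.01, size=(10, 3))
type_b = rng.normal(loc=[1.2, 0.3, 0.4], scale=0.01, size=(10, 3))
composition = np.vstack([type_a, type_b])

tree, labels = complete_linkage_clusters(composition, n_clusters=2)

# Column 2 of the linkage matrix holds merge distances; the last rows are
# the final merges. A large jump at the end indicates well-separated clusters.
last_merge, previous_merge = tree[-1, 2], tree[-2, 2]
print(labels, last_merge, previous_merge)
```

When the final merge distance is not much larger than the preceding ones, as the text reports for the XRF subset alone, the cut into clusters is weakly supported.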
Examination of the clustering relationships for the four measures of performance indicates that the performance data alone show no strong tendency to cluster. The larger data set (2) of 86 samples but without XRD data was used to test for efficacy in predicting performance. For the prediction model building, selecting amongst the possible
twenty-one primary variables, thirteen derived variables, and four response variables was accomplished either by the independent feature selection heuristic of STATISTICA Data Miner or by using, in each technique, the embedded algorithms that selectively add or subtract parameters. Rather surprisingly, linear regression models for all four measures of performance could be found with correlation coefficients ranging from 0.84 to [...]. Different combinations of XRF elements, surface area, and history parameter yield statistically comparable models, although all models included sample type. Decision tree classification shows that sample type can be defined with very few errors from relatively few splits of the XRF-based compositions, which is consistent with the clustering observed using PolySNAP (Figure 3b).

To examine the influence of sample type on the regression models, sample type and elements defining sample type were excluded, but the history parameter and various combinations of surface area with remaining XRF elements were included. Nevertheless, the correlation coefficients for the linear regression models dropped significantly, to a range of 0.77 to [...]. This leads us to conclude that the measures of performance that were tested do depend to some extent on aging history of the specimens, surface area, and other aspects of composition besides sample type, but that for these particular materials, sample type is a significant factor related to the performance of the materials.

Predictive models developed using neural nets show the same trend; predictions are notably degraded without the inclusion of sample type. The predictive capabilities of neural net models are further degraded if multiple predictions are attempted. This may suggest that the parameters remaining after removing sample type may be only weakly related to performance and may be insufficiently independent to successfully predict material performance.
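The comparison of regression models with and without sample type can be illustrated with ordinary least squares. The sketch below uses NumPy rather than WEKA, and the data are synthetic: the variable names, effect sizes, and noise level are invented for illustration and do not reproduce the paper's data or its correlation coefficients. The correlation is computed on the fitting data, which is an assumption; the text does not say how the reported coefficients were evaluated.

```python
import numpy as np


def fit_and_correlate(features, response):
    """Least-squares fit with intercept; returns corr(predicted, observed)."""
    design = np.column_stack([np.ones(len(response)), features])
    coef, *_ = np.linalg.lstsq(design, response, rcond=None)
    predicted = design @ coef
    return float(np.corrcoef(predicted, response)[0, 1])


rng = np.random.default_rng(2)
n = 86  # same size as data set (2); the values themselves are synthetic

sample_type = rng.integers(0, 2, size=n)   # hypothetical 0/1 type indicator
surface_area = rng.normal(100.0, 15.0, n)  # hypothetical surface areas
history = rng.uniform(0.0, 1.0, n)         # hypothetical aging parameter

# Synthetic performance dominated by sample type, with weaker contributions
# from surface area and history, plus noise.
performance = (5.0 * sample_type + 0.02 * surface_area
               - 1.0 * history + rng.normal(0.0, 0.5, n))

r_with = fit_and_correlate(
    np.column_stack([sample_type, surface_area, history]), performance)
r_without = fit_and_correlate(
    np.column_stack([surface_area, history]), performance)

print(f"with sample type:    r = {r_with:.2f}")
print(f"without sample type: r = {r_without:.2f}")
```

Because the second model is nested in the first, its in-sample correlation can never exceed that of the full model; how far it drops indicates how much of the predictive power rests on sample type.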
Conclusions

XRD phase composition and XRF elemental composition were found to yield the same clustering and, hence, both types of X-ray data have a strong relationship to each other in the specimens examined. Cluster membership of the X-ray data was found to be indicative of an unrelated descriptor, "sample type". Models developed for these data sets needed the inclusion of sample type to be effective in predicting performance. Although the dependence on sample type is, perhaps, not surprising in retrospect, models independent of sample type would be more useful. Hence, the next step for extending our data mining is to find other descriptors that improve prediction of performance without requiring the inclusion of sample type in the model. Dimensionality reduction of spectral-type X-ray data may yield other descriptors useful for modeling performance. Improved predictive models would guide us to other regions in parameter space in which to search for new or optimized materials.
References

1. Data Mining, Ian Witten and Eibe Frank (2000); Machine Learning, Tom Mitchell (1997); Data Mining: Concepts and Techniques, J. Han and M. Kamber (2001); Data Mining: Concepts, Models, Methods, and Algorithms, Mehmed Kantardzic (2003).
2. Data Preparation for Data Mining, Dorian Pyle (1999).
3. PolySNAP, Bruker AXS; also G. Barr, W. Dong, C.J. Gilmore (2004). PolySNAP: a computer program for analysing high-throughput powder diffraction data, J. Appl. Cryst. 37, [...].
4. C.J. Gilmore, G. Barr, J. Paisley (2004). High-throughput powder diffraction. I. A new approach to qualitative and quantitative powder diffraction pattern analysis using full pattern profiles, J. Appl. Cryst. 37, 231; G. Barr, W. Dong, C.J. Gilmore (2004). High-throughput powder diffraction. II. Applications of clustering methods and multivariate data analysis, J. Appl. Cryst. 37, [...].
5. StatSoft, Inc. (2005). STATISTICA 7.1 or STATISTICA Data Miner, version [...].
6. Ian H. Witten and Eibe Frank (2005). "Data Mining: Practical machine learning tools and techniques", 2nd Edition, Morgan Kaufmann, San Francisco; see also Weka 3: Data Mining Software in Java, [...].
7. Using WEKA's Greedy algorithm for linear regression models with the outlier (sample P31) removed.
More informationThree-Dimensional Electron Microscopy of Macromolecular Assemblies
Three-Dimensional Electron Microscopy of Macromolecular Assemblies Joachim Frank Wadsworth Center for Laboratories and Research State of New York Department of Health The Governor Nelson A. Rockefeller
More informationEFFECT OF THE HOLE-BOTTOM FILLET RADIUS ON THE RESIDUAL STRESS ANALYSIS BY THE HOLE DRILLING METHOD
63 EFFECT OF THE HOLE-BOTTOM FILLET RADIUS ON THE RESIDUAL STRESS ANALYSIS BY THE HOLE DRILLING METHOD M. Scafidi a, E. Valentini b, B. Zuccarello a scafidi@dima.unipa.it, emilio.valentini@sintechnology.com,
More informationMaking Our Cities Safer: A Study In Neighbhorhood Crime Patterns
Making Our Cities Safer: A Study In Neighbhorhood Crime Patterns Aly Kane alykane@stanford.edu Ariel Sagalovsky asagalov@stanford.edu Abstract Equipped with an understanding of the factors that influence
More informationArtificial Intelligence (AI) Common AI Methods. Training. Signals to Perceptrons. Artificial Neural Networks (ANN) Artificial Intelligence