DATA MINING WITH DIFFERENT TYPES OF X-RAY DATA


C. K. Lowe-Ma, A. E. Chen, D. Scholl
Physical & Environmental Sciences, Research and Advanced Engineering, Ford Motor Company, Dearborn, Michigan, USA

C. J. Gilmore, R. J. Thatcher
Chemistry Department, University of Glasgow, Glasgow, Scotland, UK

W. Sverdlik
Department of Computer Science, Eastern Michigan University, Ypsilanti, Michigan, USA

Abstract

High-Throughput Materials Discovery uses automation and parallelism to synthesize and evaluate large numbers of specimens while reducing time and costs associated with finding and optimizing novel materials. As optimal performance may not be uniformly distributed throughout parameter space, efficient tools for analyzing data and evaluating large areas of compositional or parameter space are needed. Data mining tools enable moving from the statistics of limited experimental designs to more descriptive and predictive relationships. Clustering a set of 47 samples for which both X-ray powder diffraction data and X-ray fluorescence-based elemental composition data were available showed that elemental composition correlated strongly with phase composition in this particular set of samples. Also, the clustering of the X-ray data was found to be exactly coincident with a different sample characteristic, "type". Decision tree classification of a larger data set of 86 samples showed that "type" could be defined with very few errors from relatively few splits of the XRF-based compositions. Although composition exhibited strong clustering, measures of performance in these same samples exhibited only very weak clustering. However, performance of the materials could be predicted from linear regression using different slices of the data. Neural nets were attempted for improved predictability of performance beyond linear regression.
As expected from the linear regression results, single-output linear-based multi-layer perceptrons yielded acceptable predictive capability, but were found to yield notably degraded predictive results if "type" was excluded from the models. The strong dependence of performance on "type" for these samples was an unexpected outcome of the data analysis.

Introduction

High-Throughput Materials Discovery makes use of automated instrumentation and parallelism to synthesize and test large numbers of specimens (Figure 1). The foundation of this approach is that more can be learned from experiments on a widely diverse set of specimens than from complex, detailed measurements on simple systems or from measurements of a limited number of samples. Automated instrumentation and large numbers of specimens generate large amounts of data, creating a strong need for efficient methods of evaluating data from large areas of compositional or parameter space. Although standard experimental design (DOE) statistical tools can provide a basis for selecting parameters and interpreting results, DOE tools are inherently constrained to the parameter space examined. We would like to take knowledge gleaned from a wide diversity of specimens,

This document was presented at the Denver X-ray Conference (DXC) on Applications of X-ray Analysis. Sponsored by the International Centre for Diffraction Data (ICDD). This document is provided by ICDD in cooperation with the authors and presenters of the DXC for the express purpose of educating the scientific community. All copyrights for the document are retained by ICDD. Usage is restricted for the purposes of education and scientific research.

describe our knowledge about these specimens, and develop predictions about regions in parameter space where further studies would be warranted. Describing data and developing predictions fall in the realm of data mining. Instead of the inward deductive data focus of DOE and statistical analysis tools, data mining emphasizes learning from examples and extrapolating to more general descriptive or predictive models through the use of a variety of artificial intelligence, pattern recognition, and machine learning algorithms. Effective data mining is all about how to formulate questions that are meaningful or sensible and how to prepare data to correctly answer those questions. Unfortunately, no general recipes exist for designing good questions or for preparing data, especially scientific data, although some useful general references are available [1,2]. Types of standard data mining algorithms that might be used to answer questions are listed in Table 1. In this paper, clustering, regression, decision tree classification, and neural nets were used to examine relationships in a dataset that contained both quantitative X-ray fluorescence compositions and X-ray powder diffraction data.

[Figure 1. Ford Motor Company implementation of High-Throughput Materials Discovery. Panel labels: Design Experiment (DoE Tools); Data Reduction and Data Mining; Database; Robotic Synthesis; Parallel Screening.]

Results and Discussion

As previously mentioned, one of the biggest challenges in data mining is data preparation. Although many vendors offer very capable software for handling X-ray powder diffraction data, we developed a fully automated empirical algorithm for background subtraction using Python. The algorithm (Equation 1) uses a 6-parameter fit with complex non-linear weighting but requires only a single input parameter from the user specifying an estimate of where background is relative to the last few points at the high-angle end of the scan.
The algorithm fits both the low-angle scatter arising from powder surface roughness and the flat background expected at higher angles from off-axis-cut zero-background quartz substrates (Figure 2). Minimization is achieved using a Nelder-Mead simplex.

$$ y = a_1 + \frac{a_2}{x} + \frac{a_3}{x^{2}} + \frac{a_4}{x^{3}} + a_5\, e^{\,g(x,\,a_6)} \qquad \text{(Eqn. 1)} $$

[Equation 1 is reproduced as far as the transcription allows: the inverse-power terms are inferred from the surviving fragments, and g(x, a_6) stands for an exponent involving x and a_6 whose exact form is not legible in the source.]

The advantage of using this algorithm for background subtraction is that all diffraction scans are treated the same and a very large number of data files can be handled very efficiently by listing the filenames in a batch run file. Following background subtraction, the X-ray powder

diffraction data can then be further processed. For the analyses described below, the X-ray powder diffraction data were subsequently processed using PolySNAP [3,4].

Table 1. Types of Data Mining Algorithms

Regression (numerical data): linear and multiple regression; regression and model trees; adaptive neural nets, multilayer nets; genetic algorithms
Classification Models: version space hypotheses; decision trees and lists; instance-based classifiers; perceptron neural nets; Bayesian inference
Descriptive Data Visualization: statistical exploratory data analysis; hierarchical clustering; K-means clustering; Expectation Maximization clustering
Other: market basket analyses, a priori algorithms; textual analyses; image analysis and segmentation; genetic algorithms

[Figure 2. Two X-ray powder diffraction scans showing the effectiveness of the new algorithm in fitting a background. The red line is the fitted background, y in Equation 1.]

The X-ray powder diffraction data were obtained with either a PAD-V or an X2 Scintag powder diffractometer equipped with a copper-target X-ray tube. Data were collected with continuous scans and electronic integration over θ. The X-ray fluorescence data were obtained with a Philips PW2400 with a chromium tube using UniQuant5, with sensitivities optimized using additional in-house calibration standards and with background channels customized to better handle the chemistries of these samples. The resulting output of oxide weight percentages was converted to moles of each element. The data were prepared such that relationships between phase composition, elemental composition, and performance could be examined.
Merging data from different characterization techniques yielded two data sets: (1) a set of 47 samples with X-ray powder diffraction (XRD) data, elemental compositions from X-ray fluorescence (XRF), surface area, and four measures of performance; (2) a related data set containing 86 samples with XRF data, surface area, a parameter for history (sample aging), and four measures of performance but without XRD data.
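The conversion from oxide weight percentages to moles of each element can be sketched as follows. This is a minimal illustration, not the authors' code: the oxide list, the rounded atomic masses, the 100 g basis, and the function name `oxide_wt_to_element_moles` are all assumptions made for the example, and oxygen itself is not tallied.

```python
# Rounded standard atomic masses in g/mol (illustrative subset, assumed oxides).
ATOMIC_MASS = {"Al": 26.982, "Si": 28.086, "Ce": 140.116, "Zr": 91.224, "O": 15.999}

# Oxide formula -> (cation, cations per formula unit, oxygens per formula unit).
OXIDES = {
    "Al2O3": ("Al", 2, 3),
    "SiO2": ("Si", 1, 2),
    "CeO2": ("Ce", 1, 2),
    "ZrO2": ("Zr", 1, 2),
}


def oxide_wt_to_element_moles(wt_percent, basis_g=100.0):
    """Convert {oxide: weight %} to {cation element: moles} for basis_g grams of sample."""
    moles = {}
    for oxide, pct in wt_percent.items():
        element, n_cation, n_oxygen = OXIDES[oxide]
        molar_mass = n_cation * ATOMIC_MASS[element] + n_oxygen * ATOMIC_MASS["O"]
        mol_oxide = basis_g * pct / 100.0 / molar_mass
        # Each formula unit of oxide contributes n_cation atoms of the element.
        moles[element] = moles.get(element, 0.0) + n_cation * mol_oxide
    return moles


# Hypothetical composition, not taken from the paper.
example = oxide_wt_to_element_moles({"Al2O3": 50.0, "SiO2": 50.0})
print(example)
```

Working in moles rather than weight percent puts heavy and light elements on a comparable footing before clustering or regression.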

Data sets (1) and (2) were initially examined for natural groupings in the data with clustering. STATISTICA [5] was used for hierarchical clustering of the XRF, surface area, and performance data. Similarity clustering of the XRF, surface area, and performance data in various combinations with the XRD data was accomplished with the three-way multidimensional scaling of PolySNAP [6,7]. For more predictive models, regression and decision tree classification were accomplished with the open-source software WEKA [6]. Neural nets were developed using STATISTICA Neural Nets.

[Figure 3. (a) The clustering of the XRD data in data set (1) by multi-dimensional scaling in PolySNAP. (b) The clustering of the XRF data in data set (1), also by multi-dimensional scaling. Although difficult to see in these images, the cluster membership is exactly the same for both types of data.]

[Figure 4. From PolySNAP using data set (1), similarity clustering of a subset of the elemental data from XRF (a) without surface area and (b) with surface area included.]
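Checking that two clusterings have "exactly the same" membership, as stated for Figure 3, amounts to comparing the groups of samples while ignoring how each cluster happens to be labelled. A minimal sketch of that comparison is given here; the function name and the example labels are hypothetical and are not PolySNAP output.

```python
def same_membership(labels_a, labels_b):
    """Return True if two labelings partition the samples into identical groups.

    Cluster names are ignored: only which samples are grouped together matters.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same samples")

    def groups(labels):
        members = {}
        for index, label in enumerate(labels):
            members.setdefault(label, set()).add(index)
        return {frozenset(group) for group in members.values()}

    return groups(labels_a) == groups(labels_b)


# Hypothetical cluster assignments for five samples from two data types.
xrd_clusters = ["A", "A", "B", "B", "C"]
xrf_clusters = [2, 2, 0, 0, 1]
print(same_membership(xrd_clusters, xrf_clusters))
```

The same check can be applied between a clustering and an external descriptor such as sample type, treating the descriptor values as labels.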

As illustrated in Figure 3, cluster membership is found to be the same for both types of X-ray data, XRD and XRF. Therefore, the phase composition has a strong relationship to elemental composition: variations in specimen composition are related to the presence of different phases. Examination of the cluster membership shows that the members accurately reflect a descriptor, "sample type", that was derived from other information unrelated to any chemical or characterization measurements; e.g., sample type reflects the source from which the chemicals originated.

[Figure 5. Similarity clustering from PolySNAP that results from adding XRD data to surface area and a subset of XRF elemental data (data set 1).]

Our knowledge of the samples tells us that not all of the elemental composition should be related to sample type. Manually selecting a subset of the XRF data enables probing relationships beyond the influence of sample type. However, the subset of XRF data exhibits relatively weak clustering (Figure 4a). Including surface area with the XRF data changes the clustering membership (Figure 4b) but does not strengthen the relationships. Hierarchical clustering using complete linkage with Euclidean distances for the same subset of XRF data, but from the larger data set (2) of 86 samples, still yields poor clustering with very small linkage distances. However, inclusion of surface area in the hierarchical clustering of the larger data set does yield more numerically significant linkage distances and more distinct clusters. Not surprisingly, because the XRD data contain information so strongly related to sample type (Figure 3a), the addition of XRD clustering to the XRF subset-surface area clustering imposes a more definite structure on the overall clustering (Figure 5). Nevertheless, surface area and the XRF subset of data influence the cluster membership compared with Figure 3a.
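For readers unfamiliar with complete-linkage hierarchical clustering on Euclidean distances, the following is a small pure-Python sketch of the idea. It is not the STATISTICA implementation used in the paper; the function names and the four two-dimensional points are assumptions chosen only to show how linkage distances indicate cluster strength.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two equal-length coordinate tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def complete_linkage(points):
    """Agglomerative clustering; returns the merge history.

    Each entry is (members_of_cluster_1, members_of_cluster_2, linkage_distance),
    where the linkage distance is the LARGEST pairwise distance between the two
    clusters being merged (the complete-linkage criterion).
    """
    clusters = [[i] for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = max(euclidean(points[a], points[b])
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((sorted(clusters[i]), sorted(clusters[j]), d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges


# Hypothetical samples: two tight pairs that are far apart from each other.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 1.0)]
for left, right, distance in complete_linkage(points):
    print(left, right, round(distance, 3))
```

A final merge at a distance much larger than the earlier merges is the signature of distinct clusters; uniformly small linkage distances correspond to the weak clustering described for the XRF subset.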
Examination of the clustering relationships for the four measures of performance indicates that the performance data alone show no strong tendency to cluster. The larger data set (2) of 86 samples but without XRD data was used to test for efficacy in predicting performance. For the prediction model building, selecting amongst the possible

twenty-one primary variables, thirteen derived variables, and four response variables was accomplished either by the independent feature selection heuristic of STATISTICA Data Miner or by using, in each technique, the embedded algorithms that selectively add or subtract parameters. Rather surprisingly, linear regression models for all four measures of performance could be found with correlation coefficients ranging from 0.84 to [value missing in source]. Different combinations of XRF elements, surface area, and history parameter yield statistically comparable models, although all models included sample type. Decision tree classification shows that sample type can be defined with very few errors from relatively few splits of the XRF-based compositions, which is consistent with the clustering observed using PolySNAP (Figure 3b). To examine the influence of sample type on the regression models, sample type and elements defining sample type were excluded, but the history parameter and various combinations of surface area with remaining XRF elements were included. With these exclusions, the correlation coefficients for the linear regression models dropped significantly, to 0.77 to [value missing in source]. This leads us to conclude that the measures of performance that were tested do depend to some extent on aging history of the specimens, surface area, and other aspects of composition besides sample type, but that for these particular materials, sample type is a significant factor related to the performance of the materials. Predictive models developed using neural nets show the same trend; predictions are notably degraded without the inclusion of sample type. The predictive capabilities of neural net models are further degraded if multiple predictions are attempted. This may suggest that the parameters remaining after removing sample type may be only weakly related to performance and may be insufficiently independent to successfully predict material performance.
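The comparison described above, fitting a linear regression with and without sample type and watching the correlation coefficient fall, can be illustrated with a small ordinary-least-squares sketch. This is not the WEKA or STATISTICA workflow used in the paper: the six data points are invented, sample type is encoded as a hypothetical 0/1 flag, and the helper names are assumptions.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_and_correlate(rows, y):
    """Least-squares fit with intercept; return Pearson r of fitted vs. observed y."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    xtx = [[sum(x[i] * x[j] for x in design) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * yi for x, yi in zip(design, y)) for i in range(k)]
    beta = solve(xtx, xty)  # normal equations: (X^T X) beta = X^T y
    pred = [sum(b * v for b, v in zip(beta, x)) for x in design]
    mean_y, mean_p = sum(y) / len(y), sum(pred) / len(pred)
    cov = sum((p - mean_p) * (yi - mean_y) for p, yi in zip(pred, y))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_y = sum((yi - mean_y) ** 2 for yi in y)
    return cov / math.sqrt(var_p * var_y)


# Invented samples: (surface area, sample-type flag, performance).
samples = [(10, 0, 1.0), (20, 0, 1.3), (30, 0, 1.4),
           (15, 1, 3.1), (25, 1, 3.2), (35, 1, 3.6)]
performance = [s[2] for s in samples]
r_with_type = fit_and_correlate([(s[0], s[1]) for s in samples], performance)
r_without_type = fit_and_correlate([(s[0],) for s in samples], performance)
print(round(r_with_type, 3), round(r_without_type, 3))
```

In this toy data the type flag carries most of the signal, so removing it lowers the correlation sharply; the paper reports a smaller but still significant drop for its real data.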
Conclusions

XRD phase composition and XRF elemental composition were found to yield the same clustering and, hence, both types of X-ray data have a strong relationship to each other in the specimens examined. Cluster membership of the X-ray data was found to be indicative of an unrelated descriptor, "sample type". Models developed for these data sets needed the inclusion of sample type to be effective in predicting performance. Although the dependence on sample type is, perhaps, not surprising in retrospect, models independent of sample type would be more useful. Hence, the next step for extending our data mining is to find other descriptors that improve prediction of performance without requiring the inclusion of sample type in the model. Dimensionality reduction of spectral-type X-ray data may yield other descriptors useful for modeling performance. Improved predictive models would guide us to other regions in parameter space in which to search for new or optimized materials.

References

[1] Data Mining, Ian Witten and Eibe Frank (2000); Machine Learning, Tom Mitchell (1997); Data Mining: Concepts and Techniques, J. Han and M. Kamber (2001); Data Mining: Concepts, Models, Methods, and Algorithms, Mehmed Kantardzic (2003).
[2] Data Preparation for Data Mining, Dorian Pyle (1999).
[3] PolySNAP, Bruker AXS; also G. Barr, W. Dong, C.J. Gilmore (2004). PolySNAP: a computer program for analysing high-throughput powder diffraction data, J. Appl. Cryst. 37.
[4] C.J. Gilmore, G. Barr, J. Paisley (2004). High-throughput powder diffraction. I. A new approach to qualitative and quantitative powder diffraction pattern analysis using full pattern profiles, J. Appl. Cryst. 37, 231; G. Barr, W. Dong, C.J. Gilmore (2004). High-throughput powder diffraction. II. Applications of clustering methods and multivariate data analysis, J. Appl. Cryst. 37.
[5] StatSoft, Inc. (2005). STATISTICA 7.1 or STATISTICA Data Miner.
[6] Ian H. Witten and Eibe Frank (2005). "Data Mining: Practical machine learning tools and techniques", 2nd Edition, Morgan Kaufmann, San Francisco; see also Weka 3: Data Mining Software in Java.
[7] Using WEKA's Greedy algorithm for linear regression models with the outlier (sample P31) removed.


More information

Copyright(c)JCPDS-International Centre for Diffraction Data 2000,Advances in X-ray Analysis,Vol ISSN

Copyright(c)JCPDS-International Centre for Diffraction Data 2000,Advances in X-ray Analysis,Vol ISSN Copyright(c)JCPDS-International Centre for Diffraction Data 2000,Advances in X-ray Analysis,Vol.43 129 MATHEMATICAL OF DIFFRACTION PROPERTIES POLE FIGURES ABSTRACT Helmut Schaeben Mathematics and Computer

More information

Classification Using Decision Trees

Classification Using Decision Trees Classification Using Decision Trees 1. Introduction Data mining term is mainly used for the specific set of six activities namely Classification, Estimation, Prediction, Affinity grouping or Association

More information

Induction of Decision Trees

Induction of Decision Trees Induction of Decision Trees Peter Waiganjo Wagacha This notes are for ICS320 Foundations of Learning and Adaptive Systems Institute of Computer Science University of Nairobi PO Box 30197, 00200 Nairobi.

More information

Contents 1 Open-Source Tools, Techniques, and Data in Chemoinformatics

Contents 1 Open-Source Tools, Techniques, and Data in Chemoinformatics Contents 1 Open-Source Tools, Techniques, and Data in Chemoinformatics... 1 1.1 Chemoinformatics... 2 1.1.1 Open-Source Tools... 2 1.1.2 Introduction to Programming Languages... 3 1.2 Chemical Structure

More information

Reading, UK 1 2 Abstract

Reading, UK 1 2 Abstract , pp.45-54 http://dx.doi.org/10.14257/ijseia.2013.7.5.05 A Case Study on the Application of Computational Intelligence to Identifying Relationships between Land use Characteristics and Damages caused by

More information

Principles of Pattern Recognition. C. A. Murthy Machine Intelligence Unit Indian Statistical Institute Kolkata

Principles of Pattern Recognition. C. A. Murthy Machine Intelligence Unit Indian Statistical Institute Kolkata Principles of Pattern Recognition C. A. Murthy Machine Intelligence Unit Indian Statistical Institute Kolkata e-mail: murthy@isical.ac.in Pattern Recognition Measurement Space > Feature Space >Decision

More information

ML techniques. symbolic techniques different types of representation value attribute representation representation of the first order

ML techniques. symbolic techniques different types of representation value attribute representation representation of the first order MACHINE LEARNING Definition 1: Learning is constructing or modifying representations of what is being experienced [Michalski 1986], p. 10 Definition 2: Learning denotes changes in the system That are adaptive

More information

Inducing Polynomial Equations for Regression

Inducing Polynomial Equations for Regression Inducing Polynomial Equations for Regression Ljupčo Todorovski, Peter Ljubič, and Sašo Džeroski Department of Knowledge Technologies, Jožef Stefan Institute Jamova 39, SI-1000 Ljubljana, Slovenia Ljupco.Todorovski@ijs.si

More information

APPLICATION OF MICRO X-RAY FLUORESCENCE SPECTROMETRY FOR LOCALIZED AREA ANALYSIS OF BIOLOGICAL AND ENVIRONMENTAL MATERIALS

APPLICATION OF MICRO X-RAY FLUORESCENCE SPECTROMETRY FOR LOCALIZED AREA ANALYSIS OF BIOLOGICAL AND ENVIRONMENTAL MATERIALS Copyright(c)JCPDS-International Centre for Diffraction Data 2000,Advances in X-ray Analysis,Vol.43 540 APPLICATION OF MICRO X-RAY FLUORESCENCE SPECTROMETRY FOR LOCALIZED AREA ANALYSIS OF BIOLOGICAL AND

More information

Condensed Graph of Reaction: considering a chemical reaction as one single pseudo molecule

Condensed Graph of Reaction: considering a chemical reaction as one single pseudo molecule Condensed Graph of Reaction: considering a chemical reaction as one single pseudo molecule Frank Hoonakker 1,3, Nicolas Lachiche 2, Alexandre Varnek 3, and Alain Wagner 3,4 1 Chemoinformatics laboratory,

More information

Horst Ebel, Robert Svagera, Christian Hager, Maria F.Ebel, Christian Eisenmenger-Sittner, Johann Wernisch, and Michael Mantler

Horst Ebel, Robert Svagera, Christian Hager, Maria F.Ebel, Christian Eisenmenger-Sittner, Johann Wernisch, and Michael Mantler DETECTION OF SUBMONOLAYERS BY MEASUREMENT OF THE TOTAL ELECTRON YIELD (TEY) OF X-RAY EXCITED ELECTRON EMISSION Horst Ebel, Robert Svagera, Christian Hager, Maria F.Ebel, Christian Eisenmenger-Sittner,

More information

Linear Models for Classification

Linear Models for Classification Linear Models for Classification Henrik I Christensen Robotics & Intelligent Machines @ GT Georgia Institute of Technology, Atlanta, GA 30332-0280 hic@cc.gatech.edu Henrik I Christensen (RIM@GT) Linear

More information

ION-EXCHANGE FILMS FOR ELEMENT CONCENTRATION IN X-RAY FLUORESCENCE ANALYSIS WITH TOTAL REFLECTION OF THE PRIMARY BEAM.

ION-EXCHANGE FILMS FOR ELEMENT CONCENTRATION IN X-RAY FLUORESCENCE ANALYSIS WITH TOTAL REFLECTION OF THE PRIMARY BEAM. 822 ION-EXCHANGE FILMS FOR ELEMENT CONCENTRATION IN X-RAY FLUORESCENCE ANALYSIS WITH TOTAL REFLECTION OF THE PRIMARY BEAM. Abstract A.P.Morovov, L.D.Danilin, V.V.Zhmailo, Yu.V.Ignatiev, A.E.Lakhtikov,

More information

Three-Dimensional Electron Microscopy of Macromolecular Assemblies

Three-Dimensional Electron Microscopy of Macromolecular Assemblies Three-Dimensional Electron Microscopy of Macromolecular Assemblies Joachim Frank Wadsworth Center for Laboratories and Research State of New York Department of Health The Governor Nelson A. Rockefeller

More information

EFFECT OF THE HOLE-BOTTOM FILLET RADIUS ON THE RESIDUAL STRESS ANALYSIS BY THE HOLE DRILLING METHOD

EFFECT OF THE HOLE-BOTTOM FILLET RADIUS ON THE RESIDUAL STRESS ANALYSIS BY THE HOLE DRILLING METHOD 63 EFFECT OF THE HOLE-BOTTOM FILLET RADIUS ON THE RESIDUAL STRESS ANALYSIS BY THE HOLE DRILLING METHOD M. Scafidi a, E. Valentini b, B. Zuccarello a scafidi@dima.unipa.it, emilio.valentini@sintechnology.com,

More information

Making Our Cities Safer: A Study In Neighbhorhood Crime Patterns

Making Our Cities Safer: A Study In Neighbhorhood Crime Patterns Making Our Cities Safer: A Study In Neighbhorhood Crime Patterns Aly Kane alykane@stanford.edu Ariel Sagalovsky asagalov@stanford.edu Abstract Equipped with an understanding of the factors that influence

More information

Artificial Intelligence (AI) Common AI Methods. Training. Signals to Perceptrons. Artificial Neural Networks (ANN) Artificial Intelligence

Artificial Intelligence (AI) Common AI Methods. Training. Signals to Perceptrons. Artificial Neural Networks (ANN) Artificial Intelligence Artificial Intelligence (AI) Artificial Intelligence AI is an attempt to reproduce intelligent reasoning using machines * * H. M. Cartwright, Applications of Artificial Intelligence in Chemistry, 1993,

More information

ELECTRIC FIELD INFLUENCE ON EMISSION OF CHARACTERISTIC X-RAY FROM Al 2 O 3 TARGETS BOMBARDED BY SLOW Xe + IONS

ELECTRIC FIELD INFLUENCE ON EMISSION OF CHARACTERISTIC X-RAY FROM Al 2 O 3 TARGETS BOMBARDED BY SLOW Xe + IONS 390 ELECTRIC FIELD INFLUENCE ON EMISSION OF CHARACTERISTIC X-RAY FROM Al 2 O 3 TARGETS BOMBARDED BY SLOW Xe + IONS J. C. Rao 1, 2 *, M. Song 2, K. Mitsuishi 2, M. Takeguchi 2, K. Furuya 2 1 Department

More information

More on Unsupervised Learning

More on Unsupervised Learning More on Unsupervised Learning Two types of problems are to find association rules for occurrences in common in observations (market basket analysis), and finding the groups of values of observational data

More information

for XPS surface analysis

for XPS surface analysis Thermo Scientific Avantage XPS Software Powerful instrument operation and data processing for XPS surface analysis Avantage Software Atomic Concentration (%) 100 The premier software for surface analysis

More information

MultiscaleMaterialsDesignUsingInformatics. S. R. Kalidindi, A. Agrawal, A. Choudhary, V. Sundararaghavan AFOSR-FA

MultiscaleMaterialsDesignUsingInformatics. S. R. Kalidindi, A. Agrawal, A. Choudhary, V. Sundararaghavan AFOSR-FA MultiscaleMaterialsDesignUsingInformatics S. R. Kalidindi, A. Agrawal, A. Choudhary, V. Sundararaghavan AFOSR-FA9550-12-1-0458 1 Hierarchical Material Structure Kalidindi and DeGraef, ARMS, 2015 Main Challenges

More information

MIDTERM: CS 6375 INSTRUCTOR: VIBHAV GOGATE October,

MIDTERM: CS 6375 INSTRUCTOR: VIBHAV GOGATE October, MIDTERM: CS 6375 INSTRUCTOR: VIBHAV GOGATE October, 23 2013 The exam is closed book. You are allowed a one-page cheat sheet. Answer the questions in the spaces provided on the question sheets. If you run

More information

Machine Learning

Machine Learning Machine Learning 10-601 Tom M. Mitchell Machine Learning Department Carnegie Mellon University August 30, 2017 Today: Decision trees Overfitting The Big Picture Coming soon Probabilistic learning MLE,

More information

Online Estimation of Discrete Densities using Classifier Chains

Online Estimation of Discrete Densities using Classifier Chains Online Estimation of Discrete Densities using Classifier Chains Michael Geilke 1 and Eibe Frank 2 and Stefan Kramer 1 1 Johannes Gutenberg-Universtität Mainz, Germany {geilke,kramer}@informatik.uni-mainz.de

More information

CS 6375 Machine Learning

CS 6375 Machine Learning CS 6375 Machine Learning Nicholas Ruozzi University of Texas at Dallas Slides adapted from David Sontag and Vibhav Gogate Course Info. Instructor: Nicholas Ruozzi Office: ECSS 3.409 Office hours: Tues.

More information

REALIZATION OF AN ASYMMETRIC MULTILAYER X-RAY MIRROR

REALIZATION OF AN ASYMMETRIC MULTILAYER X-RAY MIRROR Copyright(c)JCPDS-International Centre for Diffraction Data 2000,Advances in X-ray Analysis,Vol.43 218 REALIZATION OF AN ASYMMETRIC MULTILAYER X-RAY MIRROR S. M. Owens Laboratory for High Energy Astrophysics,

More information

LASER-COMPTON SCATTERING AS A POTENTIAL BRIGHT X-RAY SOURCE

LASER-COMPTON SCATTERING AS A POTENTIAL BRIGHT X-RAY SOURCE Copyright(C)JCPDS-International Centre for Diffraction Data 2003, Advances in X-ray Analysis, Vol.46 74 ISSN 1097-0002 LASER-COMPTON SCATTERING AS A POTENTIAL BRIGHT X-RAY SOURCE K. Chouffani 1, D. Wells

More information

The application of neural networks to the paper-making industry

The application of neural networks to the paper-making industry The application of neural networks to the paper-making industry P. J. Edwards y, A.F. Murray y, G. Papadopoulos y, A.R. Wallace y and J. Barnard x y Dept. of Electronics and Electrical Eng., Edinburgh

More information

Gene Expression Data Classification with Revised Kernel Partial Least Squares Algorithm

Gene Expression Data Classification with Revised Kernel Partial Least Squares Algorithm Gene Expression Data Classification with Revised Kernel Partial Least Squares Algorithm Zhenqiu Liu, Dechang Chen 2 Department of Computer Science Wayne State University, Market Street, Frederick, MD 273,

More information

CHAPTER 2: DATA MINING - A MODERN TOOL FOR ANALYSIS. Due to elements of uncertainty many problems in this world appear to be

CHAPTER 2: DATA MINING - A MODERN TOOL FOR ANALYSIS. Due to elements of uncertainty many problems in this world appear to be 11 CHAPTER 2: DATA MINING - A MODERN TOOL FOR ANALYSIS Due to elements of uncertainty many problems in this world appear to be complex. The uncertainty may be either in parameters defining the problem

More information

Discriminant analysis and supervised classification

Discriminant analysis and supervised classification Discriminant analysis and supervised classification Angela Montanari 1 Linear discriminant analysis Linear discriminant analysis (LDA) also known as Fisher s linear discriminant analysis or as Canonical

More information

Combinatorial Heterogeneous Catalysis

Combinatorial Heterogeneous Catalysis Combinatorial Heterogeneous Catalysis 650 μm by 650 μm, spaced 100 μm apart Identification of a new blue photoluminescent (PL) composite material, Gd 3 Ga 5 O 12 /SiO 2 Science 13 March 1998: Vol. 279

More information

Switch Mechanism Diagnosis using a Pattern Recognition Approach

Switch Mechanism Diagnosis using a Pattern Recognition Approach The 4th IET International Conference on Railway Condition Monitoring RCM 2008 Switch Mechanism Diagnosis using a Pattern Recognition Approach F. Chamroukhi, A. Samé, P. Aknin The French National Institute

More information

CS145: INTRODUCTION TO DATA MINING

CS145: INTRODUCTION TO DATA MINING CS145: INTRODUCTION TO DATA MINING 5: Vector Data: Support Vector Machine Instructor: Yizhou Sun yzsun@cs.ucla.edu October 18, 2017 Homework 1 Announcements Due end of the day of this Thursday (11:59pm)

More information

X-RAY MICRODIFFRACTION STUDY OF THE HALF-V SHAPED SWITCHING LIQUID CRYSTAL

X-RAY MICRODIFFRACTION STUDY OF THE HALF-V SHAPED SWITCHING LIQUID CRYSTAL Copyright JCPDS - International Centre for Diffraction Data 2004, Advances in X-ray Analysis, Volume 47. 321 X-RAY MICRODIFFRACTION STUDY OF THE HALF-V SHAPED SWITCHING LIQUID CRYSTAL Kazuhiro Takada 1,

More information

Applied Statistics. Multivariate Analysis - part II. Troels C. Petersen (NBI) Statistics is merely a quantization of common sense 1

Applied Statistics. Multivariate Analysis - part II. Troels C. Petersen (NBI) Statistics is merely a quantization of common sense 1 Applied Statistics Multivariate Analysis - part II Troels C. Petersen (NBI) Statistics is merely a quantization of common sense 1 Fisher Discriminant You want to separate two types/classes (A and B) of

More information

An Alternate Measure for Comparing Time Series Subsequence Clusters

An Alternate Measure for Comparing Time Series Subsequence Clusters An Alternate Measure for Comparing Time Series Subsequence Clusters Ricardo Mardales mardales@ engr.uconn.edu Dina Goldin dqg@engr.uconn.edu BECAT/CSE Technical Report University of Connecticut 1. Introduction

More information

Decision T ree Tree Algorithm Week 4 1

Decision T ree Tree Algorithm Week 4 1 Decision Tree Algorithm Week 4 1 Team Homework Assignment #5 Read pp. 105 117 of the text book. Do Examples 3.1, 3.2, 3.3 and Exercise 3.4 (a). Prepare for the results of the homework assignment. Due date

More information

DANIEL WILSON AND BEN CONKLIN. Integrating AI with Foundation Intelligence for Actionable Intelligence

DANIEL WILSON AND BEN CONKLIN. Integrating AI with Foundation Intelligence for Actionable Intelligence DANIEL WILSON AND BEN CONKLIN Integrating AI with Foundation Intelligence for Actionable Intelligence INTEGRATING AI WITH FOUNDATION INTELLIGENCE FOR ACTIONABLE INTELLIGENCE in an arms race for artificial

More information

EasySDM: A Spatial Data Mining Platform

EasySDM: A Spatial Data Mining Platform EasySDM: A Spatial Data Mining Platform (User Manual) Authors: Amine Abdaoui and Mohamed Ala Al Chikha, Students at the National Computing Engineering School. Algiers. June 2013. 1. Overview EasySDM is

More information

Prediction of Citations for Academic Papers

Prediction of Citations for Academic Papers 000 001 002 003 004 005 006 007 008 009 010 011 012 013 014 015 016 017 018 019 020 021 022 023 024 025 026 027 028 029 030 031 032 033 034 035 036 037 038 039 040 041 042 043 044 045 046 047 048 049 050

More information

CSE 417T: Introduction to Machine Learning. Final Review. Henry Chai 12/4/18

CSE 417T: Introduction to Machine Learning. Final Review. Henry Chai 12/4/18 CSE 417T: Introduction to Machine Learning Final Review Henry Chai 12/4/18 Overfitting Overfitting is fitting the training data more than is warranted Fitting noise rather than signal 2 Estimating! "#$

More information

Feature Selection with Fuzzy Decision Reducts

Feature Selection with Fuzzy Decision Reducts Feature Selection with Fuzzy Decision Reducts Chris Cornelis 1, Germán Hurtado Martín 1,2, Richard Jensen 3, and Dominik Ślȩzak4 1 Dept. of Mathematics and Computer Science, Ghent University, Gent, Belgium

More information

Efficiently merging symbolic rules into integrated rules

Efficiently merging symbolic rules into integrated rules Efficiently merging symbolic rules into integrated rules Jim Prentzas a, Ioannis Hatzilygeroudis b a Democritus University of Thrace, School of Education Sciences Department of Education Sciences in Pre-School

More information

hsnim: Hyper Scalable Network Inference Machine for Scale-Free Protein-Protein Interaction Networks Inference

hsnim: Hyper Scalable Network Inference Machine for Scale-Free Protein-Protein Interaction Networks Inference CS 229 Project Report (TR# MSB2010) Submitted 12/10/2010 hsnim: Hyper Scalable Network Inference Machine for Scale-Free Protein-Protein Interaction Networks Inference Muhammad Shoaib Sehgal Computer Science

More information

Linear Discrimination Functions

Linear Discrimination Functions Laurea Magistrale in Informatica Nicola Fanizzi Dipartimento di Informatica Università degli Studi di Bari November 4, 2009 Outline Linear models Gradient descent Perceptron Minimum square error approach

More information

Last updated: Oct 22, 2012 LINEAR CLASSIFIERS. J. Elder CSE 4404/5327 Introduction to Machine Learning and Pattern Recognition

Last updated: Oct 22, 2012 LINEAR CLASSIFIERS. J. Elder CSE 4404/5327 Introduction to Machine Learning and Pattern Recognition Last updated: Oct 22, 2012 LINEAR CLASSIFIERS Problems 2 Please do Problem 8.3 in the textbook. We will discuss this in class. Classification: Problem Statement 3 In regression, we are modeling the relationship

More information

ECLT 5810 Classification Neural Networks. Reference: Data Mining: Concepts and Techniques By J. Hand, M. Kamber, and J. Pei, Morgan Kaufmann

ECLT 5810 Classification Neural Networks. Reference: Data Mining: Concepts and Techniques By J. Hand, M. Kamber, and J. Pei, Morgan Kaufmann ECLT 5810 Classification Neural Networks Reference: Data Mining: Concepts and Techniques By J. Hand, M. Kamber, and J. Pei, Morgan Kaufmann Neural Networks A neural network is a set of connected input/output

More information

COMP9444: Neural Networks. Vapnik Chervonenkis Dimension, PAC Learning and Structural Risk Minimization

COMP9444: Neural Networks. Vapnik Chervonenkis Dimension, PAC Learning and Structural Risk Minimization : Neural Networks Vapnik Chervonenkis Dimension, PAC Learning and Structural Risk Minimization 11s2 VC-dimension and PAC-learning 1 How good a classifier does a learner produce? Training error is the precentage

More information

Bayesian Classification. Bayesian Classification: Why?

Bayesian Classification. Bayesian Classification: Why? Bayesian Classification http://css.engineering.uiowa.edu/~comp/ Bayesian Classification: Why? Probabilistic learning: Computation of explicit probabilities for hypothesis, among the most practical approaches

More information

A comparison of three class separability measures

A comparison of three class separability measures A comparison of three class separability measures L.S Mthembu & J.Greene Department of Electrical Engineering, University of Cape Town Rondebosch, 7001, South Africa. Abstract lsmthlin007@mail.uct.ac.za

More information