Classification methods
- Noreen Strickland
- 6 years ago
Transcription
Multivariate analysis (II): Cluster analysis and Cronbach's alpha
Classification methods
12th JRC Annual Training on Composite Indicators & Multicriteria Decision Analysis (COIN 2014)
dorota.bialowolska@jrc.ec.europa.eu
European Commission, Joint Research Centre, Econometrics and Applied Statistics Unit, Composite Indicators Research Group (JRC-COIN)

Cluster analysis: setting short- and long-term targets
Several Canadian regions may have similar Composite Learning Index (CLI) scores but very different patterns across the seventeen indicators or pillars of learning. To help local authorities identify peer regions that are similarly situated with respect to the individual indicators, we applied cluster analysis: first Ward's method, then k-means clustering.
Cluster analysis: a solution when aggregation cannot be performed
Indicators of objective health: (1) life expectancy at birth (LE), (2) infant mortality rate (IM), (3) potential years of life lost before age 70 (PYLL70), (4) probability of not reaching age 65 (P65).
Indicators of subjective health: proportions of people (1) declaring good general health (GH), (2) reporting no long-standing illnesses (LSI), (3) reporting no limitations in activities because of health issues (LA).
To depict health conditions in the EU regions, two methods were used: (1) hierarchical clustering with Ward's method and squared Euclidean distance, and (2) k-means clustering.
- Objective and subjective health measures do not always coincide.
- The EU is clearly split into the EU-15 and the Central and Eastern European countries, with health conditions considerably better in the western regions of the EU.
- This division is observed with respect to objective health conditions only. Including self-perceived health status in the analysis changes the picture considerably.

Classification methods
Hierarchical (agglomerative) clustering: at the beginning of the process, each element is in a cluster of its own. The clusters are then sequentially combined into larger clusters, until all elements end up in the same cluster.
K-means clustering: aims to partition n observations into k clusters such that each observation belongs to the cluster with the nearest mean.
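The two-stage approach described above (Ward's method first, then k-means) can be sketched as follows. This is only an illustration under stated assumptions: the data are made up, the indicators are z-score normalized before clustering, SciPy is used as the toolkit, and the k-means step is seeded with the centroids of the Ward clusters, which the slides do not specify. Note that SciPy's Ward linkage works on Euclidean distances and reports merge heights on that scale, not on the squared scale.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.cluster.vq import kmeans2

# Hypothetical data: 6 regions x 2 indicators (made-up numbers, two obvious groups).
X = np.array([
    [1.0, 2.0], [1.2, 1.9], [0.9, 2.1],
    [5.0, 6.0], [5.1, 6.2], [4.9, 5.8],
])

# Normalize each indicator (z-scores) so that no indicator dominates the distance.
Z = (X - X.mean(axis=0)) / X.std(axis=0)

# Stage 1: hierarchical clustering with Ward's minimum variance method.
tree = linkage(Z, method="ward")
k = 2  # assumed number of clusters, e.g. read off the dendrogram
ward_labels = fcluster(tree, t=k, criterion="maxclust")  # labels in 1..k

# Stage 2: k-means, started from the centroids of the Ward clusters.
init = np.vstack([Z[ward_labels == c].mean(axis=0) for c in range(1, k + 1)])
centroids, kmeans_labels = kmeans2(Z, init, minit="matrix")
```

Seeding k-means with the hierarchical solution is one common way to make the k-means result reproducible; a random start is equally legitimate but may give different partitions across runs.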
Hierarchical clustering: the procedure
1. Calculate the proximity matrix (choice of proximity/distance measure).
2. Find the smallest value in the proximity matrix (apart from the values on the diagonal) and join the countries associated with this value (form a cluster).
3. Calculate the proximity matrix for the reduced set of countries. For countries that have not been combined, the values are the same as in the proximity matrix from step 1. But how should the distance between the cluster and the other countries be established?
4. Choose a clustering method.
Steps 2 and 3 are repeated until a single cluster containing all countries is formed.

Proximity measures
- Euclidean
- Squared Euclidean
- Minkowski
- Manhattan
- Mahalanobis
- Chebyshev
The choice of proximity measure may influence the clustering results.

Clustering methods
1. Nearest neighbor = single linkage
2. Furthest neighbor = complete linkage
3. Average linkage
4. Median clustering
5. Ward's minimum variance method
6. etc.
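The procedure above can be sketched as a naive loop. This is a minimal illustration, not an efficient implementation: the function names `proximity` and `agglomerate`, the four made-up points, and the choice of single linkage for step 4 are all assumptions made for the example.

```python
import numpy as np

def proximity(points, measure="sqeuclidean"):
    """Step 1: proximity matrix under a chosen distance measure."""
    diff = np.abs(points[:, None, :] - points[None, :, :])
    if measure == "euclidean":
        return np.sqrt((diff ** 2).sum(axis=2))
    if measure == "sqeuclidean":
        return (diff ** 2).sum(axis=2)
    if measure == "manhattan":
        return diff.sum(axis=2)
    if measure == "chebyshev":
        return diff.max(axis=2)
    raise ValueError(f"unknown measure: {measure}")

def agglomerate(points, names, measure="sqeuclidean"):
    """Merge clusters until one remains; returns the merge history."""
    dist = proximity(points, measure)
    clusters = [[n] for n in names]             # cluster members by name
    members = [[i] for i in range(len(names))]  # cluster members by row index
    history = []
    while len(clusters) > 1:
        best = None
        for a in range(len(members)):
            for b in range(a + 1, len(members)):
                # Steps 3-4: distance between clusters under the chosen method
                # (single linkage here: the smallest pairwise distance).
                d = min(dist[i, j] for i in members[a] for j in members[b])
                # Step 2: keep the closest pair of clusters.
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        history.append((sorted(clusters[a] + clusters[b]), float(d)))
        clusters[a] = clusters[a] + clusters[b]
        members[a] = members[a] + members[b]
        del clusters[b], members[b]  # b > a, so index a stays valid
    return history

# Hypothetical example with four "countries".
pts = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 7.0]])
history = agglomerate(pts, ["A", "B", "C", "D"])
# history == [(['A', 'B'], 1.0), (['C', 'D'], 4.0), (['A', 'B', 'C', 'D'], 41.0)]
```

Passing a different `measure` changes the proximity matrix from step 1 and can therefore change the order of the merges, which is the sense in which the choice of proximity measure may influence the results.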
Nearest neighbor = single linkage
- The distance between a new cluster and a country outside it is the smallest of the distances between the countries in the cluster and the country outside it.
- The distance between two clusters is the distance between the two countries, one in each cluster, that are closest to each other.

Furthest neighbor = complete linkage
- The distance between a new cluster and a country outside it is the largest of the distances between the countries in the cluster and the country outside it.
- The distance between two clusters is the distance between the two countries, one in each cluster, that are furthest from each other.

Average linkage
- The distance between a new cluster and a country outside it is the mean of the distances between the countries in the cluster and the country outside it.
- The distance between two clusters is the mean of all distances between pairs of countries, one in each cluster.

Example settings: proximity measure: squared Euclidean distance; clustering method: furthest neighbor.
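The three linkage rules differ only in how they summarize the pairwise distances between two clusters. A minimal sketch, assuming a precomputed proximity matrix; the function name `cluster_distance` and the numbers are hypothetical:

```python
import numpy as np

def cluster_distance(dist, cluster_a, cluster_b, method="single"):
    """Distance between two clusters, given a full proximity matrix.

    cluster_a and cluster_b are lists of row indices into dist.
    """
    # All pairwise distances with one member taken from each cluster.
    pairs = dist[np.ix_(cluster_a, cluster_b)]
    if method == "single":    # nearest neighbor
        return float(pairs.min())
    if method == "complete":  # furthest neighbor
        return float(pairs.max())
    if method == "average":   # mean of all pairwise distances
        return float(pairs.mean())
    raise ValueError(f"unknown method: {method}")

# Hypothetical squared Euclidean distances between four countries.
dist = np.array([
    [0.0,  1.0, 50.0, 74.0],
    [1.0,  0.0, 41.0, 61.0],
    [50.0, 41.0, 0.0,  4.0],
    [74.0, 61.0, 4.0,  0.0],
])

single = cluster_distance(dist, [0, 1], [2, 3], "single")      # 41.0
complete = cluster_distance(dist, [0, 1], [2, 3], "complete")  # 74.0
average = cluster_distance(dist, [0, 1], [2, 3], "average")    # 56.5
```

A country outside a cluster is simply a cluster with one member, so the same function covers both cases described above.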
The less the clustering result depends on the clustering method chosen, the better the quality of the final solution.
It is good practice to normalize the indicators before clustering.

Cronbach's alpha
Cronbach's alpha is regarded as a measure of both internal consistency and reliability; it is a lower bound to population reliability.
Notation: X denotes an indicator; k denotes the number of indicators.
- Cronbach's alpha might be applied to confirm or reject uni-dimensionality.
- The indicators are supposed to have the same orientation with regard to the composite.
- Cronbach's alpha tends to increase as the number of indicators increases (for a given average correlation between indicators).
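The formula itself did not survive in the transcription. The standard definition, which is assumed here to be the one the slide showed, is alpha = k / (k - 1) * (1 - sum of Var(X_i) / Var(sum of X_i)), with X_i the i-th indicator and k the number of indicators. A minimal sketch with made-up data; the function name `cronbach_alpha` is hypothetical:

```python
import numpy as np

def cronbach_alpha(data):
    """Cronbach's alpha for a (units x indicators) array.

    Assumes all indicators already share the same orientation.
    """
    data = np.asarray(data, dtype=float)
    k = data.shape[1]                            # number of indicators
    item_vars = data.var(axis=0, ddof=1)         # Var(X_i) for each indicator
    total_var = data.sum(axis=1).var(ddof=1)     # variance of the summed score
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

# Three identical indicators: perfectly consistent, so alpha is 1 (up to rounding).
consistent = np.array([[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4]])
alpha_consistent = cronbach_alpha(consistent)

# Reversing the orientation of one indicator lowers alpha sharply,
# which is why a common orientation is required.
reversed_one = consistent.copy()
reversed_one[:, 2] = 5 - reversed_one[:, 2]
alpha_reversed = cronbach_alpha(reversed_one)
```

The second call illustrates the orientation requirement: with one indicator reversed, the item variances are unchanged but the variance of the summed score shrinks, so alpha drops well below its value for the consistent set.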
Example
Despite being the most frequently reported statistic supporting the quality of test scores and composites, Cronbach's alpha is criticized.