Multivariate Analysis
Prof. Dr. J. Franke, All of Statistics

3.1 Multivariate Analysis

High-dimensional data: X_1, ..., X_N i.i.d. random vectors in R^p, arranged as a data matrix X (objects as rows, values of the p features as columns):

    object 1:  X_11  X_12  ...  X_1p
    object 2:  X_21  X_22  ...  X_2p
       ...
    object N:  X_N1  X_N2  ...  X_Np

so X_j^T = (X_j1, ..., X_jp) is the j-th row of X.

Many statistical problems and procedures (estimators, tests, ...) are similar to dimension 1, but some are specific to multivariate data, e.g. dimension reduction, finding a few relevant features, ...
Ideally: X_1, ..., X_N i.i.d. multivariate normal N_p(µ, Σ) with mean vector µ and covariance matrix Σ:

    µ_k = E X_jk,   Σ_kl = cov(X_jk, X_jl),   k, l = 1, ..., p.

Parameter estimates:

    µ̂ = X̄_N = (1/N) Σ_{j=1}^N X_j
    Σ̂ = S = (1/N) Σ_{j=1}^N (X_j − X̄_N)(X_j − X̄_N)^T,

i.e. S_kl = (1/N) Σ_{j=1}^N (X_jk − µ̂_k)(X_jl − µ̂_l).

S is symmetric, and all its eigenvalues are ≥ 0. Let d_1 ≥ d_2 ≥ ... ≥ d_p ≥ 0 be the eigenvalues of S:

    S = O D O^T,   D = diag(d_1, ..., d_p)
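The estimators and the spectral decomposition above can be sketched in a few lines of numpy. This is a minimal illustration, not from the lecture; note it uses the slide's 1/N normalisation for S (numpy's `cov` defaults to 1/(N−1)), and the variable names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))          # N = 200 observations, p = 3 features

mu_hat = X.mean(axis=0)                # sample mean vector mu-hat = X-bar_N
Xc = X - mu_hat                        # centred data
S = Xc.T @ Xc / len(X)                 # sample covariance with 1/N normalisation

# S = O D O^T: eigh returns eigenvalues ascending, so reverse for d_1 >= ... >= d_p
d, O = np.linalg.eigh(S)
d, O = d[::-1], O[:, ::-1]

assert np.allclose(S, O @ np.diag(d) @ O.T)   # spectral decomposition holds
assert np.all(d >= -1e-12)                    # eigenvalues nonnegative (up to rounding)
```

`eigh` is the right call here because S is symmetric; it guarantees real eigenvalues and an orthogonal O.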
Here O is an orthogonal matrix (a basis transformation); the columns of O form an orthonormal basis of eigenvectors of S.

Analogously, let δ_1 ≥ δ_2 ≥ ... ≥ δ_p ≥ 0 be the eigenvalues of Σ:

    Δ = diag(δ_1, ..., δ_p) = Ω^T Σ Ω,   i.e.   Σ = Ω Δ Ω^T.

Principal Component Analysis (PCA)

Let X_0 be a representative of X_1, ..., X_N. Principal component transformation:

    X_0 → W = Ω^T (X_0 − µ)

k-th principal component of X_0:

    W_k = ⟨X_0 − µ, e_k⟩

where e_1, ..., e_p are the normed eigenvectors of Σ, i.e. the columns of Ω.
Then

    X_0 − µ = Σ_{k=1}^p W_k e_k

is the representation of X_0 − µ in the eigenbasis of Σ.

Properties:

    E W_k = 0,   var W_k = δ_k,   cov(W_k, W_l) = 0 for k ≠ l,
    var W_1 ≥ var W_2 ≥ ... ≥ var W_p.

If X_0 is N_p(µ, Σ)-distributed, then W is N_p(0, Δ)-distributed, and, hence, W_1, ..., W_p are independent! Moreover,

    var W_1 = max{ var U;  U = Σ_{k=1}^p α_k X_0k,  α_1, ..., α_p ∈ R,  Σ_{k=1}^p α_k² = 1 }.

Idea of principal component analysis (dimension reduction): find q ≪ p linear combinations of X_01, ..., X_0p which explain a large percentage of the variability in the features.
Solution: W_1, ..., W_q. Open questions: µ, Σ = ?, q = ?

Empirical principal components

Sample X_1, ..., X_N i.i.d., sample covariance matrix S = O D O^T, sample mean µ̂ = X̄_N. Principal component transformation:

    X_j → V_j = O^T (X_j − X̄_N)

Use the features V_j1, ..., V_jq instead of X_j1, ..., X_jp, j = 1, ..., N.

Selection of q: let d_1 ≥ d_2 ≥ ... ≥ d_p be the eigenvalues of S. The proportion of the variability of X_0 explained by W_1, ..., W_q is

    (δ_1 + ... + δ_q) / (δ_1 + ... + δ_p),   estimated by   (d_1 + ... + d_q) / (d_1 + ... + d_p).
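The empirical transformation V_j = O^T(X_j − X̄_N) is a single matrix product once the data are centred. A minimal sketch (illustrative data and names, not from the lecture); it also checks that the sample covariance of the scores is diagonal with the eigenvalues d_k on the diagonal, as the theory predicts:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 4))          # N = 100, p = 4

Xc = X - X.mean(axis=0)                # centre: X_j - X-bar_N
S = Xc.T @ Xc / len(X)                 # sample covariance (1/N normalisation)
d, O = np.linalg.eigh(S)
d, O = d[::-1], O[:, ::-1]             # d_1 >= ... >= d_p

V = Xc @ O                             # rows are V_j^T = (X_j - X-bar_N)^T O
q = 2
V_reduced = V[:, :q]                   # keep only the first q empirical PCs

# sample covariance of the scores equals diag(d_1, ..., d_p)
assert np.allclose(V.T @ V / len(X), np.diag(d), atol=1e-10)
```

The reduced matrix `V_reduced` is what replaces the original features in subsequent analyses.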
Scree graph: plot d_k (or d_k / Σ_{l=1}^p d_l) against k; choose the q where the graph becomes flat.

Rules of thumb:

a) Choose q such that 90% of the total variability is explained:

    q = min{ r ≤ p;  d_1 + ... + d_r ≥ 0.9 Σ_{k=1}^p d_k = 0.9 tr S }

b) (Kaiser) Consider only principal components with above-average variance:

    q = max{ r ≤ p;  d_r ≥ (1/p) Σ_{k=1}^p d_k }
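Both rules of thumb are one-liners given the sorted eigenvalues. A small sketch with toy eigenvalues (the numbers are made up for illustration):

```python
import numpy as np

# toy eigenvalues d_1 >= ... >= d_p of a sample covariance matrix
d = np.array([3.2, 1.9, 0.9, 0.5, 0.3, 0.2])

# rule a): smallest r with d_1 + ... + d_r >= 0.9 * tr(S)
cum = np.cumsum(d)
q_a = int(np.argmax(cum >= 0.9 * d.sum())) + 1   # argmax finds the first True

# rule b) (Kaiser): count components with above-average variance
q_b = int(np.sum(d >= d.mean()))

assert q_a == 4   # 3.2 + 1.9 + 0.9 + 0.5 = 6.5 >= 0.9 * 7.0 = 6.3
assert q_b == 2   # only 3.2 and 1.9 exceed the mean 7.0 / 6
```

As in the pit-prop example below, the two rules can disagree; the scree graph is then a useful tie-breaker.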
Example: N = 180 pit props cut from Corsican pine (Jeffers, 1967). Goal (regression): model Y_j = maximum compressive strength as a function of p = 13 predictor variables:

1: top diameter
2: length
3: moisture content (% of dry weight)
4: specific gravity at time of test
5: oven-dry specific gravity of timber
6: no. of annual rings at the top
7: no. of annual rings at the base
8: maximum bow
9: distance from the top to the point of maximum bow
10: no. of knot whorls
11: length of clear prop from the top
12: average no. of knots per whorl
13: average diameter of the knots

Principal component analysis for X_j1, ..., X_jp, j = 1, ..., N.
[Figure: scree graph (k, d_k) for the Corsican pitprop data]
Eigenvector w.r.t. d_1:

    e_1 = (0.40, 0.41, 0.12, 0.17, 0.06, 0.28, 0.40, 0.29, 0.36, 0.38, 0.01, 0.12, 0.11)^T

Interpretation of the empirical principal components:

    V_j1 ≈ average of X_jk, k = 1, 2, 6-10:   total size of the pit prop
    V_j2 ≈ average of X_jk, k = 3, 4:         degree of seasoning
    V_j3 ≈ average of X_jk, k = 4-7:          speed of growth
    V_j4 ≈ X_j11:   length of clear prop from the top
    V_j5 ≈ X_j12:   average no. of knots per whorl
    V_j6 ≈ average of X_jk, k = 5, 13

Rules of thumb: a) q = 6, b) q = 4. Scree graph: q = 3 or q = 6.
Discriminant Analysis

Classification problem: an object comes from one of the classes C_1, ..., C_m. Observed: feature vector X_0. Assumption: X_0 has density f_k(x) if the object is from C_k.

Bayes classifier: decide for class C_k if

    X_0 ∈ { x;  f_k(x) = max_{i=1,...,m} f_i(x) }     (1)

Gaussian case: if the object is from C_k, X_0 is N_p(µ_k, Σ)-distributed. Then, (1) is equivalent to

    ‖X_0 − µ_k‖²_Σ = min_{i=1,...,m} ‖X_0 − µ_i‖²_Σ

where ‖X_0 − µ_k‖²_Σ = (X_0 − µ_k)^T Σ^{-1} (X_0 − µ_k), and ‖·‖_Σ is the Mahalanobis distance w.r.t. Σ.
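With known µ_1, ..., µ_m and Σ, the Gaussian Bayes classifier is just "pick the class with the smallest Mahalanobis distance". A minimal sketch with two made-up classes (the means and covariance are illustrative, not from the lecture):

```python
import numpy as np

Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])                        # shared covariance
mus = [np.array([0.0, 0.0]), np.array([3.0, 3.0])]    # class means mu_1, mu_2
Sigma_inv = np.linalg.inv(Sigma)

def mahalanobis_sq(x, mu):
    # ||x - mu||^2_Sigma = (x - mu)^T Sigma^{-1} (x - mu)
    diff = x - mu
    return float(diff @ Sigma_inv @ diff)

def classify(x):
    # decide for the class with minimal Mahalanobis distance
    return int(np.argmin([mahalanobis_sq(x, mu) for mu in mus]))

assert classify(np.array([0.2, -0.1])) == 0   # near mu_1
assert classify(np.array([2.8, 3.1])) == 1    # near mu_2
```

Because Σ is shared across classes, the decision boundary is linear; with class-specific Σ_k it would become quadratic.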
Special case m = 2: decide for C_1 if

    α^T (X_0 − µ̄) > 0   with   µ̄ = (µ_1 + µ_2)/2,   α = Σ^{-1} (µ_1 − µ_2).

In practice, µ_1, ..., µ_m, Σ are unknown. Training set with known classification:

    X_j^(k), j = 1, ..., n_k, i.i.d. N_p(µ_k, Σ),   k = 1, ..., m.

Sample means and sample covariance matrices for each subsample:

    µ̂_k = (1/n_k) Σ_{j=1}^{n_k} X_j^(k),   S^(k),   k = 1, ..., m.

Combine S^(1), ..., S^(m) into an estimate of Σ:

    S = (1/N) Σ_{k=1}^m n_k S^(k),   N = n_1 + ... + n_m.
Empirical classification rule: decide for C_k if

    ‖X_0 − µ̂_k‖²_S = min_{i=1,...,m} ‖X_0 − µ̂_i‖²_S

Warning: if µ_1 = ... = µ_m, classification is meaningless, but the estimates µ̂_1, ..., µ̂_m will still not be equal. Safeguard: test H_0: µ_1 = ... = µ_m (multivariate ANOVA).

Fisher's discriminant rule

No Gaussian assumption; consider only linear discriminant functions, i.e. decide for C_k if

    |a^T (X_0 − µ̂_k)| < |a^T (X_0 − µ̂_i)|   for all i ≠ k.
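The empirical rule plugs the subsample means and the pooled S into the Mahalanobis classifier. A minimal sketch on simulated training data (two classes with a shared unit covariance; all names and numbers are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
# training samples from m = 2 classes with a common covariance
X1 = rng.normal(loc=[0, 0], size=(60, 2))
X2 = rng.normal(loc=[4, 4], size=(40, 2))
groups = [X1, X2]
N = sum(len(g) for g in groups)

mu_hats = [g.mean(axis=0) for g in groups]
# pooled covariance S = (1/N) * sum_k n_k S^(k); note that
# n_k S^(k) = sum_j (X_j^(k) - mu_hat_k)(X_j^(k) - mu_hat_k)^T
S = sum((g - m).T @ (g - m) for g, m in zip(groups, mu_hats)) / N
S_inv = np.linalg.inv(S)

def classify(x):
    # decide for the class with minimal estimated Mahalanobis distance
    dists = [float((x - m) @ S_inv @ (x - m)) for m in mu_hats]
    return int(np.argmin(dists))

assert classify(np.array([0.1, -0.2])) == 0
assert classify(np.array([3.9, 4.2])) == 1
```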
Choose a such that the discriminant function shows maximal differences between the groups (of a training set):

    a is an eigenvector w.r.t. the largest eigenvalue of S^{-1} B,

where B is the between-groups matrix

    B = (1/N) Σ_{k=1}^m n_k (µ̂_k − X̄_N)(µ̂_k − X̄_N)^T,
    X̄_N = (1/N) Σ_{k=1}^m n_k µ̂_k = (1/N) Σ_{k=1}^m Σ_{j=1}^{n_k} X_j^(k).

For m = 2, Fisher's rule coincides with the Bayes classification for Gaussian data. For m > 2, it usually does not.
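Fisher's direction a can be computed directly as the top eigenvector of S^{-1}B. The sketch below (simulated data, illustrative names) also verifies the m = 2 coincidence stated above: the eigenvector is proportional to S^{-1}(µ̂_1 − µ̂_2), the Gaussian Bayes direction:

```python
import numpy as np

rng = np.random.default_rng(3)
groups = [rng.normal(loc=[0, 0], size=(50, 2)),
          rng.normal(loc=[3, 0], size=(50, 2))]
N = sum(len(g) for g in groups)

X_bar = np.vstack(groups).mean(axis=0)
mu_hats = [g.mean(axis=0) for g in groups]

# within-groups (pooled) covariance S and between-groups matrix B
S = sum((g - m).T @ (g - m) for g, m in zip(groups, mu_hats)) / N
B = sum(len(g) * np.outer(m - X_bar, m - X_bar) for g, m in zip(groups, mu_hats)) / N

# a = eigenvector of S^{-1} B for the largest eigenvalue
eigvals, eigvecs = np.linalg.eig(np.linalg.inv(S) @ B)
a = np.real(eigvecs[:, np.argmax(np.real(eigvals))])

# for m = 2 this direction is proportional to S^{-1}(mu_hat_1 - mu_hat_2)
ref = np.linalg.inv(S) @ (mu_hats[0] - mu_hats[1])
cos = abs(a @ ref) / (np.linalg.norm(a) * np.linalg.norm(ref))
assert cos > 0.999
```

Note that S^{-1}B is not symmetric, so `eig` (not `eigh`) is the appropriate call; for m = 2 it has rank 1, leaving exactly one nonzero eigenvalue.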
Cluster analysis

Discriminant analysis: the classes C_1, ..., C_m are given; find a classification rule and estimate it based on a training set (supervised learning).

Cluster analysis: find appropriate classes! No information about which object belongs to which group (unsupervised learning).

Observed: feature vectors X_1, ..., X_N ∈ R^p, independent. Assume, e.g., that X_j is N_p(µ_k, Σ_k)-distributed if X_j belongs to class C_k, for some unknown µ_k, Σ_k, k = 1, ..., m, and unknown m, C_1, ..., C_m.
Maximum likelihood over µ_k, Σ_k, C_1, ..., C_m and m is in principle possible, with a penalty for large m to avoid overfitting, but it is usually computationally not feasible.

Hierarchical clustering algorithms

Needed: a distance between objects i, j. Common choice: the Pearson distance of the feature vectors,

    d²_ij = Σ_{k=1}^p (X_ik − X_jk)² / s²_k,
    s²_k = (1/(N−1)) Σ_{j=1}^N (X_jk − µ̂_k)²,   µ̂_k = (1/N) Σ_{j=1}^N X_jk.

Agglomerative: start with the maximal number of clusters, i.e. each object is its own cluster: C_j^(0) = {j}, j = 1, ..., N.
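The Pearson distance is a Euclidean distance after dividing each feature by its sample standard deviation, which makes it insensitive to the features' measurement units. A minimal sketch (toy data with deliberately different scales; names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)
# features on very different scales, as in real survey data
X = rng.normal(size=(6, 3)) * np.array([1.0, 10.0, 0.1])

s2 = X.var(axis=0, ddof=1)    # s_k^2 with the 1/(N-1) normalisation from the slide

def pearson_dist_sq(i, j):
    # d_ij^2 = sum_k (X_ik - X_jk)^2 / s_k^2
    return float((((X[i] - X[j]) ** 2) / s2).sum())

D2 = np.array([[pearson_dist_sq(i, j) for j in range(len(X))] for i in range(len(X))])
assert np.allclose(D2, D2.T) and np.allclose(np.diag(D2), 0.0)
```

The matrix `D2` is the input the agglomerative algorithms below operate on.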
Nearest neighbour single linkage (nnsl)

Sort the distances d_ij, i < j:   d_{r_1 s_1} ≤ d_{r_2 s_2} ≤ ...

1) N − 1 clusters: C_{r_1}^(0) ∪ C_{s_1}^(0), and C_j^(0) for j ≠ r_1, s_1.

2) If r_1, s_1, r_2, s_2 are all distinct: N − 2 clusters {r_1, s_1}, {r_2, s_2}, and {j} for j ≠ r_1, s_1, r_2, s_2.
   If r_1 = r_2, s_1 ≠ s_2: N − 2 clusters {r_1, s_1, s_2}, and {j} for j ≠ r_1, s_1, s_2.
   If r_1 ≠ r_2, s_1 = s_2: analogously.
   ...

l) Join the cluster containing r_l with the cluster containing s_l.

Stop if d_{r_l s_l} > threshold d_0.
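The nnsl steps above amount to walking through the sorted pair list and repeatedly relabelling: merge the two clusters containing the current pair until the next distance exceeds d_0. A minimal sketch on a toy distance matrix (the matrix, the threshold, and all names are illustrative):

```python
import numpy as np

# toy symmetric distances between 5 objects: two tight pairs and one outlier
D = np.array([[0.0, 1.0, 6.0, 7.0, 9.0],
              [1.0, 0.0, 5.0, 8.0, 9.0],
              [6.0, 5.0, 0.0, 2.0, 9.0],
              [7.0, 8.0, 2.0, 0.0, 9.0],
              [9.0, 9.0, 9.0, 9.0, 0.0]])
d0 = 4.0                                         # stopping threshold

# sort all pairs (i < j) by distance: d_{r1 s1} <= d_{r2 s2} <= ...
pairs = sorted((D[i, j], i, j) for j in range(len(D)) for i in range(j))

labels = list(range(len(D)))                     # cluster label of each object
for d, i, j in pairs:
    if d > d0:
        break                                    # stop: remaining pairs exceed d_0
    li, lj = labels[i], labels[j]
    if li != lj:                                 # join the two clusters
        labels = [li if l == lj else l for l in labels]

clusters = {}
for obj, l in enumerate(labels):
    clusters.setdefault(l, []).append(obj)
assert sorted(clusters.values()) == [[0, 1], [2, 3], [4]]
```

Processing the sorted pair list is exactly single linkage: a pair merges the whole clusters containing its two endpoints, which is what produces the method's characteristic "chaining".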
Average linkage

D_rs = distance between cluster r and cluster s.

0) D_ij^(0) = d_ij for C_j^(0) = {j}.

1) As in nnsl; the distance of the new cluster C_{r_1}^(0) ∪ C_{s_1}^(0) to a cluster C_j^(0), j ≠ r_1, s_1, is

    D_{1j}^(1) = (d_{r_1 j} + d_{s_1 j}) / 2.

2) In step i, join the two clusters with minimal distance D_{rs}^(i−1). Define the distance of the new cluster to the other clusters by

    D^(i) = (D_{rk}^(i−1) + D_{sk}^(i−1)) / 2,   k ≠ r, s.

3) Stop if all cluster distances are > d_0.
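The average-linkage update can be implemented by shrinking the distance matrix one row and column per merge. A minimal sketch of steps 0)–3) on a toy matrix (the matrix, the threshold d_0, and all names are illustrative):

```python
import numpy as np

# toy symmetric distance matrix for 4 objects: two tight pairs far apart
D = np.array([[0.0, 1.0, 8.0, 9.0],
              [1.0, 0.0, 9.0, 8.0],
              [8.0, 9.0, 0.0, 1.5],
              [9.0, 8.0, 1.5, 0.0]])
d0 = 4.0                                   # stopping threshold

clusters = [[i] for i in range(len(D))]    # step 0): every object its own cluster
Dc = D.copy()                              # current matrix of cluster distances

while len(clusters) > 1:
    M = Dc + np.diag(np.full(len(Dc), np.inf))     # mask the diagonal
    r, s = np.unravel_index(np.argmin(M), M.shape)
    r, s = min(r, s), max(r, s)
    if M[r, s] > d0:                       # step 3): all distances exceed d_0
        break
    # step 2) update: distance of merged cluster to cluster k is (D_rk + D_sk)/2
    merged_row = np.delete((Dc[r] + Dc[s]) / 2.0, s)
    merged = clusters[r] + clusters.pop(s)
    clusters[r] = merged
    Dc = np.delete(np.delete(Dc, s, axis=0), s, axis=1)
    Dc[r, :] = merged_row
    Dc[:, r] = merged_row
    Dc[r, r] = 0.0                         # repair the diagonal entry

assert sorted(sorted(c) for c in clusters) == [[0, 1], [2, 3]]
```

Note this is the slide's unweighted average of the two old rows; it differs slightly from size-weighted averaging (UPGMA), which weights by cluster sizes.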
Data: protein consumption in N = 25 European countries for p = 9 food groups:

1. Red meat
2. White meat
3. Eggs
4. Milk
5. Fish
6. Cereals
7. Starchy foods
8. Pulses, nuts, and oil-seeds
9. Fruits and vegetables
Complete linkage cluster analysis yields the groups:

Eastern Europe (blue): East Germany, Czechoslovakia, Poland, USSR, Hungary;
Scandinavia (green): Sweden, Denmark, Norway, Finland;
Western Europe (red): UK, France, West Germany, Belgium, Ireland, Netherlands, Austria, Switzerland;
Iberian (purple): Spain, Portugal;
Mediterranean (orange): Italy, Greece;
the Balkans (yellow): Yugoslavia, Romania, Bulgaria, Albania.

PCA: dimension reduction to q = 4 principal components:
1: total meat consumption
2-4: consumption of red meat, white meat, and fish, respectively.

Source: Weber, A. (1973) Agrarpolitik im Spannungsfeld der internationalen Ernährungspolitik, Institut für Agrarpolitik und Marktlehre, Kiel. Data for download:
More informationEEL 851: Biometrics. An Overview of Statistical Pattern Recognition EEL 851 1
EEL 851: Biometrics An Overview of Statistical Pattern Recognition EEL 851 1 Outline Introduction Pattern Feature Noise Example Problem Analysis Segmentation Feature Extraction Classification Design Cycle
More informationMultivariate Regression
Multivariate Regression The so-called supervised learning problem is the following: we want to approximate the random variable Y with an appropriate function of the random variables X 1,..., X p with the
More informationExport Destinations and Input Prices. Appendix A
Export Destinations and Input Prices Paulo Bastos Joana Silva Eric Verhoogen Jan. 2016 Appendix A For Online Publication Figure A1. Real Exchange Rate, Selected Richer Export Destinations UK USA Sweden
More informationFeature Engineering, Model Evaluations
Feature Engineering, Model Evaluations Giri Iyengar Cornell University gi43@cornell.edu Feb 5, 2018 Giri Iyengar (Cornell Tech) Feature Engineering Feb 5, 2018 1 / 35 Overview 1 ETL 2 Feature Engineering
More informationBathing water results 2011 Slovakia
Bathing water results Slovakia 1. Reporting and assessment This report gives a general overview of water in Slovakia for the season. Slovakia has reported under the Directive 2006/7/EC since 2008. When
More informationBayesian Decision and Bayesian Learning
Bayesian Decision and Bayesian Learning Ying Wu Electrical Engineering and Computer Science Northwestern University Evanston, IL 60208 http://www.eecs.northwestern.edu/~yingwu 1 / 30 Bayes Rule p(x ω i
More informationI L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN
Introduction Edps/Psych/Stat/ 584 Applied Multivariate Statistics Carolyn J Anderson Department of Educational Psychology I L L I N O I S UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN c Board of Trustees,
More informationMachine Learning. CUNY Graduate Center, Spring Lectures 11-12: Unsupervised Learning 1. Professor Liang Huang.
Machine Learning CUNY Graduate Center, Spring 2013 Lectures 11-12: Unsupervised Learning 1 (Clustering: k-means, EM, mixture models) Professor Liang Huang huang@cs.qc.cuny.edu http://acl.cs.qc.edu/~lhuang/teaching/machine-learning
More informationWeekly price report on Pig carcass (Class S, E and R) and Piglet prices in the EU. Carcass Class S % + 0.3% % 98.
Weekly price report on Pig carcass (Class S, E and R) and Piglet prices in the EU Disclaimer Please note that EU prices for pig meat, are averages of the national prices communicated by Member States weighted
More informationIntroduction to Graphical Models
Introduction to Graphical Models The 15 th Winter School of Statistical Physics POSCO International Center & POSTECH, Pohang 2018. 1. 9 (Tue.) Yung-Kyun Noh GENERALIZATION FOR PREDICTION 2 Probabilistic
More informationPrincipal Component Analysis
I.T. Jolliffe Principal Component Analysis Second Edition With 28 Illustrations Springer Contents Preface to the Second Edition Preface to the First Edition Acknowledgments List of Figures List of Tables
More informationUniversität Potsdam Institut für Informatik Lehrstuhl Maschinelles Lernen PCA. Tobias Scheffer
Universität Potsdam Institut für Informatik Lehrstuhl Maschinelles Lernen PCA Tobias Scheffer Overview Principal Component Analysis (PCA) Kernel-PCA Fisher Linear Discriminant Analysis t-sne 2 PCA: Motivation
More informationDimension reduction, PCA & eigenanalysis Based in part on slides from textbook, slides of Susan Holmes. October 3, Statistics 202: Data Mining
Dimension reduction, PCA & eigenanalysis Based in part on slides from textbook, slides of Susan Holmes October 3, 2012 1 / 1 Combinations of features Given a data matrix X n p with p fairly large, it can
More information