Multivariate data analysis HW#4

1. Solution:

(a) The estimated coefficients for X are reported above, giving ĝ = ( , , ). The estimated coefficients for Y are reported above, giving ĥ = ( , , ).

(b) The sample canonical correlation matrix is reported below. We can see that the correlation coefficients between (ξ̂_i, ξ̂_j), (ω̂_i, ω̂_j) and (ξ̂_i, ω̂_j) for i ≠ j are all approximately zero.
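The correlation structure described above can be sketched with R's built-in cancor. Since the homework's data set is not reproduced here, the X and Y below are simulated stand-ins:

```r
set.seed(1)
n <- 100
X <- matrix(rnorm(n * 3), n, 3)
Y <- X %*% matrix(rnorm(9), 3, 3) + matrix(rnorm(n * 3), n, 3)

cc <- cancor(X, Y)
# canonical variates for X and Y: center each block, then apply the coefficients
xi    <- scale(X, center = cc$xcenter, scale = FALSE) %*% cc$xcoef
omega <- scale(Y, center = cc$ycenter, scale = FALSE) %*% cc$ycoef
round(cor(cbind(xi, omega)), 3)
```

In the resulting 6x6 matrix, the entries cor(ξ̂_i, ξ̂_j), cor(ω̂_i, ω̂_j) and cor(ξ̂_i, ω̂_j) for i ≠ j are numerically zero, while cor(ξ̂_i, ω̂_i) recovers the i-th sample canonical correlation cc$cor[i].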
2. Solution:

(a) Visualize the first observation by plotting x against y as follows. We can see that this gives us a digit 3.

(b) The scree plot and cumulative scree plot are:
And the scatterplot matrix of principal components is:

(c) The data look as if they come from a multivariate normal (MVN) distribution. After examining the empirical density of each principal component, we can see that the shape of every principal component is very close to normal. Since the principal components are linear combinations of the original variables, and a random vector is jointly MVN exactly when every linear combination of its components is normally distributed, this gives a sense that the data are from an MVN distribution (checking only the principal components is, of course, a necessary rather than sufficient condition).

(d) To see how many principal components we should keep, I draw the cumulative scree plot with a horizontal line at 90% below:
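The per-component normality check in (c) can be sketched as follows; the matrix x is a simulated stand-in for the pendigit data:

```r
set.seed(1)
x <- matrix(rnorm(100 * 4), 100, 4)   # stand-in data
z <- princomp(x)$scores
# Shapiro-Wilk test on the scores of each principal component;
# large p-values are consistent with normal-looking components
pvals <- apply(z, 2, function(col) shapiro.test(col)$p.value)
round(pvals, 3)
```

In practice one would also look at the normal Q-Q plot of each column of scores (qqnorm, qqline) rather than rely on a single test statistic.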
After examining the cumulative scree plot above and applying the 90% cutoff criterion, I will keep the first six or seven principal components: six components almost achieve 0.90, but fall a little short of it. (e) Now I walk along each of the first four principal components. For principal component j, I plot the five reconstructions mean + k·sqrt(λ_j)·u_j for k = −2, −1, 0, 1, 2. All twenty plots are copied below, with each row giving the five plots for one of the four principal components:
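The 90% cutoff amounts to picking the smallest k whose cumulative proportion of variance reaches 0.90. A minimal sketch, using simulated 16-dimensional data in place of pendigit3:

```r
set.seed(1)
# columns with decreasing variances, so the scree plot actually decays
x <- matrix(rnorm(200 * 16), 200, 16) %*% diag(sqrt(16:1))
lambda <- princomp(x)$sdev^2
cumprop <- cumsum(lambda) / sum(lambda)
k <- which(cumprop >= 0.90)[1]   # smallest number of components reaching 90%
k
```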
(f) After combining the two data sets, I run a PCA on the combined data set and copy the scatterplot matrix of principal components below.
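The construction in (f) — stack the two groups, run one PCA, and mark the groups in the pairs plot — can be sketched with simulated stand-ins for pendigit3 and pendigit8:

```r
set.seed(1)
g3 <- matrix(rnorm(50 * 16), 50, 16)            # stand-in for pendigit3
g8 <- matrix(rnorm(50 * 16, mean = 2), 50, 16)  # stand-in for pendigit8
total <- rbind(g3, g8)
z <- princomp(total)$scores
# group 1 = blue circles, group 2 = red crosses, as in the write-up
pairs(z[, 1:3], pch = rep(c(1, 3), each = 50),
      col = rep(c("blue", "red"), each = 50))
```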
The scatterplot matrix above is too small, so in order to see whether the observations for digit 3 and digit 8 separate nicely, I draw another scatterplot matrix containing only the first several principal components. First we need to decide how many principal components to keep, so I draw the following scree plot and cumulative scree plot. By the 90% cutoff criterion, I will keep the first five principal components, and I draw the scatterplot matrix of the first five principal components as follows:
After examining the graphs above, we can clearly see a relatively nice separation between digit 3 (blue dots) and digit 8 (red crosses) in the scatterplots of principal component 1 against principal components 2, 3, 4 and 5. So we can separate digit 3 and digit 8.

R code:

#part a#
x <- as.numeric(pendigit3[1, seq(1, 15, by = 2)])  # x-coordinates of the first observation
y <- as.numeric(pendigit3[1, seq(2, 16, by = 2)])  # y-coordinates
plot(x, y)
lines(x, y)

#part b and d#
x <- pendigit3[, 1:16]
spr <- princomp(x)
U <- spr$loadings
L <- spr$sdev^2
Z <- spr$scores
pairs(Z, pch = rep(1, 1055), col = rep("blue", 1055))
par(mfrow = c(1, 2))
plot(L, type = "b", xlab = "component", ylab = "lambda", main = "scree plot")
plot(cumsum(L) / sum(L) * 100, ylim = c(0, 100), type = "b",
     xlab = "component", ylab = "cumulative proportion (%)", main = "cum. scree plot")
abline(h = 90)

#part e#
x <- pendigit3[, 1:16]
spr <- princomp(x)
U <- spr$loadings
L <- spr$sdev^2
Z <- spr$scores
par(mfrow = c(1, 5))
v1 <- colMeans(x)   # sample mean vector
# five reconstructions along PC1: mean + k*sqrt(lambda_1)*u_1, k = -2, ..., 2
for (k in -2:2) {
  a <- v1 + k * sqrt(L[1]) * U[, 1]
  plot(a[seq(1, 15, by = 2)], a[seq(2, 16, by = 2)], xlab = "x", ylab = "y")
  lines(a[seq(1, 15, by = 2)], a[seq(2, 16, by = 2)])
}
# repeat with L[2] and U[, 2], ..., L[4] and U[, 4] for the other three components

#part f#
total <- rbind(pendigit3, pendigit8)
Y <- total[, 1:16]
spr2 <- princomp(Y)
U2 <- spr2$loadings
L2 <- spr2$sdev^2
Z2 <- spr2$scores
pairs(Z2, pch = c(rep(1, 1055), rep(3, 1055)),
      col = c(rep("blue", 1055), rep("red", 1055)))
par(mfrow = c(1, 2))
plot(L2, type = "b", xlab = "component", ylab = "lambda", main = "scree plot")
plot(cumsum(L2) / sum(L2) * 100, ylim = c(0, 100), type = "b",
     xlab = "component", ylab = "cumulative proportion (%)", main = "cum. scree plot")
abline(h = 90)
pairs(Z2[, 1:5], pch = c(rep(1, 1055), rep(3, 1055)),
      col = c(rep("blue", 1055), rep("red", 1055)))

3. Solution: Use 10-fold cross validation to report the misclassification rates of the four classifiers as follows:
(1) LDA: the 10-fold cross validation misclassification rate is .
(2) Nearest centroid rule: the 10-fold cross validation misclassification rate is .
(3) QDA: the 10-fold cross validation misclassification rate is .
(4) 5-nearest neighbors: the 10-fold cross validation misclassification rate is 0.
We can use the following bar plot to report the results above:
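A runnable sketch of the comparison in problem 3, using the built-in iris data as a stand-in since the homework's data set is not included here; lda/qda come from MASS and knn from class, both of which ship with R:

```r
library(MASS)   # lda, qda
library(class)  # knn
set.seed(1)
X <- as.matrix(iris[, 1:4]); y <- iris$Species
n <- length(y)
folds <- sample(rep(1:10, length.out = n))   # random fold labels

# nearest centroid rule: assign each test point to the closest class mean
centroid_predict <- function(Xtr, ytr, Xte) {
  mus <- sapply(levels(ytr), function(g) colMeans(Xtr[ytr == g, , drop = FALSE]))
  levels(ytr)[apply(Xte, 1, function(p) which.min(colSums((mus - p)^2)))]
}

err <- c(lda = 0, centroid = 0, qda = 0, knn5 = 0)
for (v in 1:10) {
  tr <- folds != v; te <- !tr
  err["lda"]      <- err["lda"]      + sum(predict(lda(X[tr, ], grouping = y[tr]), X[te, ])$class != y[te])
  err["centroid"] <- err["centroid"] + sum(centroid_predict(X[tr, ], y[tr], X[te, ]) != y[te])
  err["qda"]      <- err["qda"]      + sum(predict(qda(X[tr, ], grouping = y[tr]), X[te, ])$class != y[te])
  err["knn5"]     <- err["knn5"]     + sum(knn(X[tr, ], X[te, ], y[tr], k = 5) != y[te])
}
round(err / n, 3)   # 10-fold CV misclassification rates
barplot(err / n, ylab = "CV misclassification rate")
```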
4. Solution:
5. Solution: The code below is pseudo-code in R for the 10-fold cross validation calculations in the two questions above.

cross_validation <- function(X, y, V = 10, classify_fun, predict_fun) {
  n <- length(y)
  fold_size <- ceiling(n / V)
  index <- sample(n, n)          # random permutation of 1..n
  CVerror <- 0
  for (v in 1:V) {
    temp <- ((v - 1) * fold_size + 1):min(v * fold_size, n)
    test_index <- index[temp]
    train_index <- index[-temp]
    traindata <- X[train_index, ]
    trainy <- y[train_index]
    testdata <- X[test_index, ]
    testy <- y[test_index]
    model <- classify_fun(traindata, trainy)
    prediction <- predict_fun(model, testdata)
    CVerror <- CVerror + sum(prediction != testy)
  }
  CVerror / n
}
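One subtlety in the fold construction: each fold's index range must start at (v - 1) * fold_size + 1, since a range starting at (v - 1) * fold_size makes consecutive folds overlap (and gives fold 1 the invalid index 0). A quick self-contained check that the corrected ranges partition 1..n, using an n not divisible by V:

```r
n <- 103; V <- 10
fold_size <- ceiling(n / V)
covered <- integer(0)
for (v in 1:V) {
  temp <- ((v - 1) * fold_size + 1):min(v * fold_size, n)
  covered <- c(covered, temp)
}
stopifnot(identical(sort(covered), 1:n))  # each index lands in exactly one fold
```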