Last time: PCA. Statistical Data Mining and Machine Learning Hilary Term Singular Value Decomposition (SVD) Eigendecomposition and PCA

Size: px
Start display at page:

Download "Last time: PCA. Statistical Data Mining and Machine Learning Hilary Term Singular Value Decomposition (SVD) Eigendecomposition and PCA"

Transcription

1 Last time: PCA Statistical Data Mining and Machine Learning Hilary Term 2016 Dino Sejdinovic Department of Statistics Oxford Slides and other materials available at: PCA Find an orthogonal basis {v 1, v 2,..., v p } for the data space such that: The first principal component (PC) v 1 is the direction of greatest variance of data. The j-th PC v j is the direction orthogonal to v 1, v 2,..., v j 1 of greatest variance, for j = 2,..., p. Eigendecomposition of the sample covariance matrix S = 1 n 1 n i=1 x ix i. S = VΛV. Λ is a diagonal matrix with eigenvalues (variances along each principal component) λ 1 λ 2 λ p 0 V is a p p orthogonal matrix whose columns are the p eigenvectors of S, i.e. the principal components v 1,..., v p Dimensionality reduction by projecting x i R p onto first k principal components: z i = [ v 1 x i,..., v k x i ] R k. Eigendecomposition and PCA Singular Value Decomposition (SVD) S = 1 n 1 n i=1 x i x i = 1 n 1 X X. S is a real and symmetric matrix, so there exist p eigenvectors v 1,..., v p that are pairwise orthogonal and p associated eigenvalues λ 1,..., λ p which satisfy the eigenvalue equation Sv i = λ i v i. In particular, V is an orthogonal matrix: VV = V V = I p. S is a positive-semidefinite matrix, so the eigenvalues are non-negative: λ i 0, i. Why is S symmetric? Why is S positive-semidefinite? Reminder: A symmetric p p matrix R is said to be positive-semidefinite if a R p, a Ra 0. SVD Any real-valued n p matrix X can be written as X = UDV where U is an n n orthogonal matrix: UU = U U = I n D is a n p matrix with decreasing non-negative elements on the diagonal (the singular values) and zero off-diagonal elements. V is a p p orthogonal matrix: VV = V V = I p SVD always exists, even for non-square matrices. Fast and numerically stable algorithms for SVD are available in most packages.the relevant R command is svd.

2 SVD and PCA Let X = UDV be the SVD of the n p data matrix X. Note that (n 1)S = X X = (UDV ) (UDV ) = VD U UDV = VD DV, using orthogonality (U U = I n ) of U. The eigenvalues of S are thus the diagonal entries of Λ = 1 n 1 D D. We also have XX = (UDV )(UDV ) = UDV VD U = UDD U, using orthogonality (V V = I p ) of V. Gram matrix B = XX, B ij = x i x j is called the Gram matrix of dataset X. B and (n 1)S = X X have the same nonzero eigenvalues, equal to the non-zero squared singular values of X. > biplot(crabs.pca,scale=1) CWCL BD FL RW PCA plots show the data items (rows of X) in the space spanned by PCs. allow us to visualize the original variables X (1),..., X (p) (corresponding to columns of X) in the same plot. Iris Data Recall that X = [X (1),..., X (p) ] and X = UDV is the SVD of the data matrix. The full PC projection of x i is the i-th row of UD: z i = V x i = D U i, equivalently: XV = UD. The j-th unit vector e j R p points in the direction of the original variable X (j). Its PC projection η j is: η j = V e j = V j (the j-th row of V) The projection of e j indicates the weighting each PC gives to the original variable X (j). Dot products between these projections give entries of the data matrix: 50 samples from each of the 3 species of iris: setosa, versicolor, and virginica Each measuring the length and widths of both sepal and petals Collected by E. Anderson (135) and analysed by R.A. Fisher (136) x ij = min{n,p} k=1 U ik D kk V jk = D U i, V j = z i, η j. focus on the first two PCs and the quality depends on the proportion of variance explained by the first two PCs.

3 Iris Data Iris data biplot > data(iris) > iris[sample(150,20),] Sepal.Length Sepal.Width Petal.Length Petal.Width Species versicolor setosa setosa versicolor virginica setosa versicolor versicolor setosa versicolor virginica versicolor setosa setosa versicolor versicolor versicolor virginica setosa setosa > iris.pca<-princomp(iris[,-5],cor=true) > loadings(iris.pca) Comp.3 Comp.4 Sepal.Length Sepal.Width Petal.Length Petal.Width > biplot(iris.pca,scale=0) Sepal.Width Petal.Length Petal.Width Sepal.Length Iris Data biplot - scaled There are other projections we can consider for biplots (assuming p < n to simplify notation): p x ij = U ik D kk V jk = D 1:p,1:pUi,1:p, Vj = D 1 α 1:p,1:p U i,1:p, D α 1:p,1:pVj. k=1 where 0 α 1, i.e., we change representation to z i = D 1 α 1:p,1:p U i,1:p, η j = D α 1:p,1:pV j case α = 1: Sample covariance of the projected points is: Ĉov ( Z ) = 1 n 1 U 1:n,1:pU 1:n,1:p = 1 n 1 I p. Projected points are uncorrelated and dimensions are equi-variance. Sample covariance between X (i) and X (j) is: Ê(X (i) X (j) ) = 1 ( VD DV ) = 1 n 1 i,j n 1 D 1:p,1:pVi, D 1:p,1:p Vj The angle between the projected variables corresponds to their correlation. >?biplot... scale: The variables are scaled by lambda ^ scale and the observations are scaled by lambda ^ (1-scale) where lambda are the singular values as computed by princomp. (default=1)... > biplot(iris.pca,scale=1) Sepal.Width Petal.Len Petal.Wid Sepal.Length

4 Crabs Data biplots US Arrests Data > biplot(crabs.pca,scale=0) > biplot(crabs.pca,scale=1) CW CL BD FL RW CWCL BD FL RW This data set contains statistics, in arrests per 100,000 residents for assault, murder, and rape in each of the 50 US states in 173. Also given is the percent of the population living in urban areas. pairs(usarrests) usarrests.pca <- princomp(usarrests,cor=t) plot(usarrests.pca) pairs(predict(usarrests.pca)) biplot(usarrests.pca) US Arrests Data Pairs Plot US Arrests Data Biplot > pairs(usarrests) > biplot(usarrests.pca) Murder Assault UrbanPop Rape Mississippi North Carolina South Carolina West VirginiaVermont Georgia Alabama Alaska Arkansas Kentucky Murder Louisiana Tennessee South Dakota North Dakota Montana Maryland Assault Wyoming Maine Virginia Idaho New Mexico Florida New Hampshire Michigan Iowa Indiana Nebraska Missouri Kansas Rape Delaware Oklahoma Texas Oregon Pennsylvania Minnesota Wisconsin Illinois Nevada Arizona Ohio New York Colorado Washington Connecticut New Jersey Massachusetts Utah Rhode Island California Hawaii UrbanPop

5 Suppose there are n points X in R p, but we are only given the n n matrix D of inter-point distances. Can we reconstruct X? Rigid transformations (translations, rotations and reflections) do not change inter-point distances so cannot recover X exactly. However X can be recovered up to these transformations! Let d ij = x i x j 2 be the distance between points x i and x j. d 2 ij = x i x j 2 2 = (x i x j ) (x i x j ) = x i x i + x j x j 2x i x j Let B = XX be the n n matrix of dot-products, b ij = x i x j. The above shows that D can be computed from B. Some algebraic exercise shows that B can be recovered from D if we assume n i=1 x i = 0. US City Flight Distances If we knew X, then SVD gives X = UDV. As X has rank at most r = min(n, p), we have at most r non-zero singular values in D and we can assume U R n r, D R r r and V R r p. The eigendecomposition of B is then: B = XX = UD 2 U = UΛU. This eigendecomposition can be obtained from B without knowledge of X! Let x i = U i Λ 1 2 R r. If r < p, pad x i with 0s so that it has length p. Then, x i x j = U i ΛU j = b ij = x i x j and we have found a set of vectors with dot-products given by B, as desired. The vectors x i differs from x i only via the orthogonal matrix V (recall that xi = U i DV = x i V ) so are equivalent up to rotation and reflections. We present a table of flying mileages between 10 American cities, distances calculated from our 2-dimensional world. Using D as the starting point, metric MDS finds a configuration with the same distance matrix. ATLA CHIG DENV HOUS LA MIAM NY SF SEAT DC

6 US City Flight Distances US City Flight Distances library(mass) us <- read.csv(" ## use classical MDS to find lower dimensional views of the data ## recover X in 2 dimensions us.classical <- cmdscale(d=us,k=2) plot(us.classical) text(us.classical,labels=names(us)) MIAMI NY DC ATLANTA CHICAGO HOUSTON DENVER LA SF SEATTLE Lower-dimensional Reconstructions Crabs Data In classical MDS derivation, we used all eigenvalues in the eigendecomposition of B to reconstruct x i = U i Λ 1 2. library(mass) crabs$spsex=paste(crabs$sp,crabs$sex,sep="") varnames<-c( FL, RW, CL, CW, BD ) Crabs <- crabs[,varnames] Crabs.class <- factor(crabs$spsex) crabsmds <- cmdscale(d= dist(crabs),k=2) plot(crabsmds, pch=20, cex=2,col=unclass(crabs.class)) We can use only the largest k < min(n, p) eigenvalues and eigenvectors in the reconstruction, giving the best k-dimensional view of the data. This is analogous to PCA, where only the largest eigenvalues of X X are used, and the smallest ones effectively suppressed. Indeed, PCA and classical MDS are duals and yield effectively the same result. MDS MDS 1

7 Crabs Data Varieties of MDS Compare with previous PCA analysis. Classical MDS solution corresponds to the first 2 PCs Comp.3 Comp.4 Comp Generally, MDS is a class of dimensionality reduction techniques which represents data points x 1,..., x n R p in a lower-dimensional space z 1,..., z n R k which tries to preserve inter-point (dis)similarities. It requires only the matrix D of pairwise dissimilarities d ij = d(x i, x j ). For example, we can use Euclidean distance d ij = x i x j 2, but other dissimilarities are possible. MDS finds representations z 1,..., z n R k such that z i z j 2 d(x i, x j ) = d ij, and differences in dissimilarities are measured by the appropriate loss (d ij, z i z j 2 ). Goal: Find Z which minimizes the stress function S(Z) = i j (d ij, z i z j 2 ) Varieties of MDS Choices of (dis)similarities and (stress) functions lead to different algorithms. Classical/Torgerson: preserves inner products instead - strain function (cmdscale) S(Z) = (b ij z i z, z j z ) 2 i j Metric Shephard-Kruskal: preserves distances w.r.t. squared stress S(Z) = i j (d ij z i z j 2 ) 2 Sammon: preserves shorter distances more (sammon) S(Z) = i j (d ij z i z j 2 ) 2 Non-Metric Shephard-Kruskal: ignores actual distance values, only preserves ranks (isomds) i j S(Z) = min (g(d ij) z i z j 2 ) 2 g increasing i j z i z j 2 2 d ij

Outline. Administrivia and Introduction Course Structure Syllabus Introduction to Data Mining

Outline. Administrivia and Introduction Course Structure Syllabus Introduction to Data Mining Outline Administrivia and Introduction Course Structure Syllabus Introduction to Data Mining Dimensionality Reduction Introduction Principal Components Analysis Singular Value Decomposition Multidimensional

More information

Lecture 5: Ecological distance metrics; Principal Coordinates Analysis. Univariate testing vs. community analysis

Lecture 5: Ecological distance metrics; Principal Coordinates Analysis. Univariate testing vs. community analysis Lecture 5: Ecological distance metrics; Principal Coordinates Analysis Univariate testing vs. community analysis Univariate testing deals with hypotheses concerning individual taxa Is this taxon differentially

More information

Lecture 5: Ecological distance metrics; Principal Coordinates Analysis. Univariate testing vs. community analysis

Lecture 5: Ecological distance metrics; Principal Coordinates Analysis. Univariate testing vs. community analysis Lecture 5: Ecological distance metrics; Principal Coordinates Analysis Univariate testing vs. community analysis Univariate testing deals with hypotheses concerning individual taxa Is this taxon differentially

More information

Multivariate Statistics

Multivariate Statistics Multivariate Statistics Chapter 3: Principal Component Analysis Pedro Galeano Departamento de Estadística Universidad Carlos III de Madrid pedro.galeano@uc3m.es Course 2017/2018 Master in Mathematical

More information

Multivariate Statistics

Multivariate Statistics Multivariate Statistics Chapter 4: Factor analysis Pedro Galeano Departamento de Estadística Universidad Carlos III de Madrid pedro.galeano@uc3m.es Course 2017/2018 Master in Mathematical Engineering Pedro

More information

New Educators Campaign Weekly Report

New Educators Campaign Weekly Report Campaign Weekly Report Conversations and 9/24/2017 Leader Forms Emails Collected Text Opt-ins Digital Journey 14,661 5,289 4,458 7,124 317 13,699 1,871 2,124 Pro 13,924 5,175 4,345 6,726 294 13,086 1,767

More information

Multivariate Statistics

Multivariate Statistics Multivariate Statistics Chapter 6: Cluster Analysis Pedro Galeano Departamento de Estadística Universidad Carlos III de Madrid pedro.galeano@uc3m.es Course 2017/2018 Master in Mathematical Engineering

More information

Jakarta International School 6 th Grade Formative Assessment Graphing and Statistics -Black

Jakarta International School 6 th Grade Formative Assessment Graphing and Statistics -Black Jakarta International School 6 th Grade Formative Assessment Graphing and Statistics -Black Name: Date: Score : 42 Data collection, presentation and application Frequency tables. (Answer question 1 on

More information

Standard Indicator That s the Latitude! Students will use latitude and longitude to locate places in Indiana and other parts of the world.

Standard Indicator That s the Latitude! Students will use latitude and longitude to locate places in Indiana and other parts of the world. Standard Indicator 4.3.1 That s the Latitude! Purpose Students will use latitude and longitude to locate places in Indiana and other parts of the world. Materials For the teacher: graph paper, globe showing

More information

Challenge 1: Learning About the Physical Geography of Canada and the United States

Challenge 1: Learning About the Physical Geography of Canada and the United States 60ºN S T U D E N T H A N D O U T Challenge 1: Learning About the Physical Geography of Canada and the United States 170ºE 10ºW 180º 20ºW 60ºN 30ºW 1 40ºW 160ºW 50ºW 150ºW 60ºW 140ºW N W S E 0 500 1,000

More information

, District of Columbia

, District of Columbia State Capitals These are the State Seals of each state. Fill in the blank with the name of each states capital city. (Hint: You may find it helpful to do the word search first to refresh your memory.),

More information

Outline. Administrivia and Introduction Course Structure Syllabus Introduction to Data Mining

Outline. Administrivia and Introduction Course Structure Syllabus Introduction to Data Mining Outline Administrivia and Introduction Course Structure Syllabus Introduction to Data Mining Dimensionality Reduction Introduction Principal Components Analysis Singular Value Decomposition Multidimensional

More information

Multivariate Analysis

Multivariate Analysis Multivariate Analysis Chapter 5: Cluster analysis Pedro Galeano Departamento de Estadística Universidad Carlos III de Madrid pedro.galeano@uc3m.es Course 2015/2016 Master in Business Administration and

More information

Correction to Spatial and temporal distributions of U.S. winds and wind power at 80 m derived from measurements

Correction to Spatial and temporal distributions of U.S. winds and wind power at 80 m derived from measurements JOURNAL OF GEOPHYSICAL RESEARCH, VOL. 109,, doi:10.1029/2004jd005099, 2004 Correction to Spatial and temporal distributions of U.S. winds and wind power at 80 m derived from measurements Cristina L. Archer

More information

Cooperative Program Allocation Budget Receipts Southern Baptist Convention Executive Committee May 2018

Cooperative Program Allocation Budget Receipts Southern Baptist Convention Executive Committee May 2018 Cooperative Program Allocation Budget Receipts May 2018 Cooperative Program Allocation Budget Current Current $ Change % Change Month Month from from Contribution Sources 2017-2018 2016-2017 Prior Year

More information

Cooperative Program Allocation Budget Receipts Southern Baptist Convention Executive Committee October 2017

Cooperative Program Allocation Budget Receipts Southern Baptist Convention Executive Committee October 2017 Cooperative Program Allocation Budget Receipts October 2017 Cooperative Program Allocation Budget Current Current $ Change % Change Month Month from from Contribution Sources 2017-2018 2016-2017 Prior

More information

Cooperative Program Allocation Budget Receipts Southern Baptist Convention Executive Committee October 2018

Cooperative Program Allocation Budget Receipts Southern Baptist Convention Executive Committee October 2018 Cooperative Program Allocation Budget Receipts October 2018 Cooperative Program Allocation Budget Current Current $ Change % Change Month Month from from Contribution Sources 2018-2019 2017-2018 Prior

More information

Preview: Making a Mental Map of the Region

Preview: Making a Mental Map of the Region Preview: Making a Mental Map of the Region Draw an outline map of Canada and the United States on the next page or on a separate sheet of paper. Add a compass rose to your map, showing where north, south,

More information

Abortion Facilities Target College Students

Abortion Facilities Target College Students Target College Students By Kristan Hawkins Executive Director, Students for Life America Ashleigh Weaver Researcher Abstract In the Fall 2011, Life Dynamics released a study entitled, Racial Targeting

More information

Additional VEX Worlds 2019 Spot Allocations

Additional VEX Worlds 2019 Spot Allocations Overview VEX Worlds 2019 Spot s Qualifying spots for the VEX Robotics World Championship are calculated twice per year. On the following table, the number in the column is based on the number of teams

More information

QF (Build 1010) Widget Publishing, Inc Page: 1 Batch: 98 Test Mode VAC Publisher's Statement 03/15/16, 10:20:02 Circulation by Issue

QF (Build 1010) Widget Publishing, Inc Page: 1 Batch: 98 Test Mode VAC Publisher's Statement 03/15/16, 10:20:02 Circulation by Issue QF 1.100 (Build 1010) Widget Publishing, Inc Page: 1 Circulation by Issue Qualified Non-Paid Circulation Qualified Paid Circulation Individual Assoc. Total Assoc. Total Total Requester Group Qualified

More information

Online Appendix: Can Easing Concealed Carry Deter Crime?

Online Appendix: Can Easing Concealed Carry Deter Crime? Online Appendix: Can Easing Concealed Carry Deter Crime? David Fortunato University of California, Merced dfortunato@ucmerced.edu Regulations included in institutional context measure As noted in the main

More information

PRINCIPAL COMPONENTS ANALYSIS

PRINCIPAL COMPONENTS ANALYSIS PRINCIPAL COMPONENTS ANALYSIS Iris Data Let s find Principal Components using the iris dataset. This is a well known dataset, often used to demonstrate the effect of clustering algorithms. It contains

More information

A. Geography Students know the location of places, geographic features, and patterns of the environment.

A. Geography Students know the location of places, geographic features, and patterns of the environment. Learning Targets Elementary Social Studies Grade 5 2014-2015 A. Geography Students know the location of places, geographic features, and patterns of the environment. A.5.1. A.5.2. A.5.3. A.5.4. Label North

More information

Summary of Natural Hazard Statistics for 2008 in the United States

Summary of Natural Hazard Statistics for 2008 in the United States Summary of Natural Hazard Statistics for 2008 in the United States This National Weather Service (NWS) report summarizes fatalities, injuries and damages caused by severe weather in 2008. The NWS Office

More information

Printable Activity book

Printable Activity book Printable Activity book 16 Pages of Activities Printable Activity Book Print it Take it Keep them busy Print them out Laminate them or Put them in page protectors Put them in a binder Bring along a dry

More information

Intercity Bus Stop Analysis

Intercity Bus Stop Analysis by Karalyn Clouser, Research Associate and David Kack, Director of the Small Urban and Rural Livability Center Western Transportation Institute College of Engineering Montana State University Report prepared

More information

Hourly Precipitation Data Documentation (text and csv version) February 2016

Hourly Precipitation Data Documentation (text and csv version) February 2016 I. Description Hourly Precipitation Data Documentation (text and csv version) February 2016 Hourly Precipitation Data (labeled Precipitation Hourly in Climate Data Online system) is a database that gives

More information

Chapter. Organizing and Summarizing Data. Copyright 2013, 2010 and 2007 Pearson Education, Inc.

Chapter. Organizing and Summarizing Data. Copyright 2013, 2010 and 2007 Pearson Education, Inc. Chapter 2 Organizing and Summarizing Data Section 2.1 Organizing Qualitative Data Objectives 1. Organize Qualitative Data in Tables 2. Construct Bar Graphs 3. Construct Pie Charts When data is collected

More information

Meteorology 110. Lab 1. Geography and Map Skills

Meteorology 110. Lab 1. Geography and Map Skills Meteorology 110 Name Lab 1 Geography and Map Skills 1. Geography Weather involves maps. There s no getting around it. You must know where places are so when they are mentioned in the course it won t be

More information

Club Convergence and Clustering of U.S. State-Level CO 2 Emissions

Club Convergence and Clustering of U.S. State-Level CO 2 Emissions Methodological Club Convergence and Clustering of U.S. State-Level CO 2 Emissions J. Wesley Burnett Division of Resource Management West Virginia University Wednesday, August 31, 2013 Outline Motivation

More information

What Lies Beneath: A Sub- National Look at Okun s Law for the United States.

What Lies Beneath: A Sub- National Look at Okun s Law for the United States. What Lies Beneath: A Sub- National Look at Okun s Law for the United States. Nathalie Gonzalez Prieto International Monetary Fund Global Labor Markets Workshop Paris, September 1-2, 2016 What the paper

More information

Osteopathic Medical Colleges

Osteopathic Medical Colleges Osteopathic Medical Colleges Matriculants by U.S. States and Territories Entering Class 0 Prepared by the Research Department American Association of Colleges of Osteopathic Medicine Copyright 0, AAM All

More information

Office of Special Education Projects State Contacts List - Part B and Part C

Office of Special Education Projects State Contacts List - Part B and Part C Office of Special Education Projects State Contacts List - Part B and Part C Source: http://www.ed.gov/policy/speced/guid/idea/monitor/state-contactlist.html Alabama Customer Specialist: Jill Harris 202-245-7372

More information

SUPPLEMENTAL NUTRITION ASSISTANCE PROGRAM QUALITY CONTROL ANNUAL REPORT FISCAL YEAR 2008

SUPPLEMENTAL NUTRITION ASSISTANCE PROGRAM QUALITY CONTROL ANNUAL REPORT FISCAL YEAR 2008 SUPPLEMENTAL NUTRITION ASSISTANCE PROGRAM QUALITY CONTROL ANNUAL REPORT FISCAL YEAR 2008 U.S. DEPARTMENT OF AGRICULTURE FOOD AND NUTRITION SERVICE PROGRAM ACCOUNTABILITY AND ADMINISTRATION DIVISION QUALITY

More information

JAN/FEB MAR/APR MAY/JUN

JAN/FEB MAR/APR MAY/JUN QF 1.100 (Build 1010) Widget Publishing, Inc Page: 1 Circulation Breakdown by Issue Qualified Non-Paid Qualified Paid Previous This Previous This Total Total issue Removals Additions issue issue Removals

More information

Parametric Test. Multiple Linear Regression Spatial Application I: State Homicide Rates Equations taken from Zar, 1984.

Parametric Test. Multiple Linear Regression Spatial Application I: State Homicide Rates Equations taken from Zar, 1984. Multiple Linear Regression Spatial Application I: State Homicide Rates Equations taken from Zar, 984. y ˆ = a + b x + b 2 x 2K + b n x n where n is the number of variables Example: In an earlier bivariate

More information

2005 Mortgage Broker Regulation Matrix

2005 Mortgage Broker Regulation Matrix 2005 Mortgage Broker Regulation Matrix Notes on individual states follow the table REG EXEMPTIONS LIC-EDU LIC-EXP LIC-EXAM LIC-CONT-EDU NET WORTH BOND MAN-LIC MAN-EDU MAN-EXP MAN-EXAM Alabama 1 0 2 0 0

More information

North American Geography. Lesson 2: My Country tis of Thee

North American Geography. Lesson 2: My Country tis of Thee North American Geography Lesson 2: My Country tis of Thee Unit Overview: As students work through the activities in this unit they will be introduced to the United States in general, different regions

More information

Multivariate Classification Methods: The Prevalence of Sexually Transmitted Diseases

Multivariate Classification Methods: The Prevalence of Sexually Transmitted Diseases Multivariate Classification Methods: The Prevalence of Sexually Transmitted Diseases Summer Undergraduate Mathematical Sciences Research Institute (SUMSRI) Lindsay Kellam, Queens College kellaml@queens.edu

More information

LABORATORY REPORT. If you have any questions concerning this report, please do not hesitate to call us at (800) or (574)

LABORATORY REPORT. If you have any questions concerning this report, please do not hesitate to call us at (800) or (574) LABORATORY REPORT If you have any questions concerning this report, please do not hesitate to call us at (800) 332-4345 or (574) 233-4777. This report may not be reproduced, except in full, without written

More information

Principal Component Analysis (PCA) Principal Component Analysis (PCA)

Principal Component Analysis (PCA) Principal Component Analysis (PCA) Recall: Eigenvectors of the Covariance Matrix Covariance matrices are symmetric. Eigenvectors are orthogonal Eigenvectors are ordered by the magnitude of eigenvalues: λ 1 λ 2 λ p {v 1, v 2,..., v n } Recall:

More information

LABORATORY REPORT. If you have any questions concerning this report, please do not hesitate to call us at (800) or (574)

LABORATORY REPORT. If you have any questions concerning this report, please do not hesitate to call us at (800) or (574) LABORATORY REPORT If you have any questions concerning this report, please do not hesitate to call us at (800) 332-4345 or (574) 233-4777. This report may not be reproduced, except in full, without written

More information

RELATIONSHIPS BETWEEN THE AMERICAN BROWN BEAR POPULATION AND THE BIGFOOT PHENOMENON

RELATIONSHIPS BETWEEN THE AMERICAN BROWN BEAR POPULATION AND THE BIGFOOT PHENOMENON RELATIONSHIPS BETWEEN THE AMERICAN BROWN BEAR POPULATION AND THE BIGFOOT PHENOMENON ETHAN A. BLIGHT Blight Investigations, Gainesville, FL ABSTRACT Misidentification of the American brown bear (Ursus arctos,

More information

Statistical Methods for Data Mining

Statistical Methods for Data Mining Statistical Methods for Data Mining Kuangnan Fang Xiamen University Email: xmufkn@xmu.edu.cn Linear Model Selection and Regularization Recall the linear model Y = 0 + 1 X 1 + + p X p +. In the lectures

More information

MINERALS THROUGH GEOGRAPHY

MINERALS THROUGH GEOGRAPHY MINERALS THROUGH GEOGRAPHY INTRODUCTION Minerals are related to rock type, not political definition of place. So, the minerals are to be found in a variety of locations that doesn t depend on population

More information

Alpine Funds 2016 Tax Guide

Alpine Funds 2016 Tax Guide Alpine s 2016 Guide Alpine Dynamic Dividend ADVDX 01/28/2016 01/29/2016 01/29/2016 0.020000000 0.017621842 0.000000000 0.00000000 0.017621842 0.013359130 0.000000000 0.000000000 0.002378158 0.000000000

More information

Crop Progress. Corn Mature Selected States [These 18 States planted 92% of the 2017 corn acreage]

Crop Progress. Corn Mature Selected States [These 18 States planted 92% of the 2017 corn acreage] Crop Progress ISSN: 00 Released October, 0, by the National Agricultural Statistics Service (NASS), Agricultural Statistics Board, United s Department of Agriculture (USDA). Corn Mature Selected s [These

More information

Alpine Funds 2017 Tax Guide

Alpine Funds 2017 Tax Guide Alpine s 2017 Guide Alpine Dynamic Dividend ADVDX 1/30/17 1/31/17 1/31/17 0.020000000 0.019248130 0.000000000 0.00000000 0.019248130 0.013842273 0.000000000 0.000000000 0.000751870 0.000000000 0.00 0.00

More information

Pima Community College Students who Enrolled at Top 200 Ranked Universities

Pima Community College Students who Enrolled at Top 200 Ranked Universities Pima Community College Students who Enrolled at Top 200 Ranked Universities Institutional Research, Planning and Effectiveness Project #20170814-MH-60-CIR August 2017 Students who Attended Pima Community

More information

OUT-OF-STATE 965 SUBTOTAL OUT-OF-STATE U.S. TERRITORIES FOREIGN COUNTRIES UNKNOWN GRAND TOTAL

OUT-OF-STATE 965 SUBTOTAL OUT-OF-STATE U.S. TERRITORIES FOREIGN COUNTRIES UNKNOWN GRAND TOTAL Report ID: USSR8072-V3 Page No. 1 Jurisdiction: ON-CAMPUS IL Southern Illinois University - Carb 1 0 0 0 Black Hawk College Quad-Cities 0 0 1 0 John A Logan College 1 0 0 0 Rend Lake College 1 0 0 0 Aurora

More information

Rank University AMJ AMR ASQ JAP OBHDP OS PPSYCH SMJ SUM 1 University of Pennsylvania (T) Michigan State University

Rank University AMJ AMR ASQ JAP OBHDP OS PPSYCH SMJ SUM 1 University of Pennsylvania (T) Michigan State University Rank University AMJ AMR ASQ JAP OBHDP OS PPSYCH SMJ SUM 1 University of Pennsylvania 4 1 2 0 2 4 0 9 22 2(T) Michigan State University 2 0 0 9 1 0 0 4 16 University of Michigan 3 0 2 5 2 0 0 4 16 4 Harvard

More information

High School World History Cycle 2 Week 2 Lifework

High School World History Cycle 2 Week 2 Lifework Name: Advisory: Period: High School World History Cycle 2 Week 2 Lifework This packet is due Monday, November 7 Complete and turn in on Friday for 10 points of EXTRA CREDIT! Lifework Assignment Complete

More information

extreme weather, climate & preparedness in the american mind

extreme weather, climate & preparedness in the american mind extreme weather, climate & preparedness in the american mind Extreme Weather, Climate & Preparedness In the American Mind Interview dates: March 12, 2012 March 30, 2012. Interviews: 1,008 Adults (18+)

More information

Grand Total Baccalaureate Post-Baccalaureate Masters Doctorate Professional Post-Professional

Grand Total Baccalaureate Post-Baccalaureate Masters Doctorate Professional Post-Professional s by Location of Permanent Home Address and Degree Level Louisiana Acadia 19 13 0 3 0 3 0 0 0 Allen 5 5 0 0 0 0 0 0 0 Ascension 307 269 2 28 1 6 0 1 0 Assumption 14 12 0 1 0 1 0 0 0 Avoyelles 6 4 0 1 0

More information

Grand Total Baccalaureate Post-Baccalaureate Masters Doctorate Professional Post-Professional

Grand Total Baccalaureate Post-Baccalaureate Masters Doctorate Professional Post-Professional s by Location of Permanent Home Address and Degree Level Louisiana Acadia 26 19 0 6 1 0 0 0 0 Allen 7 7 0 0 0 0 0 0 0 Ascension 275 241 3 23 1 6 0 1 0 Assumption 13 12 0 1 0 0 0 0 0 Avoyelles 15 11 0 3

More information

Unsupervised Learning Basics

Unsupervised Learning Basics SC4/SM8 Advanced Topics in Statistical Machine Learning Unsupervised Learning Basics Dino Sejdinovic Department of Statistics Oxford Slides and other materials available at: http://www.stats.ox.ac.uk/~sejdinov/atsml/

More information

Insurance Department Resources Report Volume 1

Insurance Department Resources Report Volume 1 2014 Insurance Department Resources Report Volume 1 201 Insurance Department Resources Report Volume One 201 The NAIC is the authoritative source for insurance industry information. Our expert solutions

More information

MINERALS THROUGH GEOGRAPHY. General Standard. Grade level K , resources, and environmen t

MINERALS THROUGH GEOGRAPHY. General Standard. Grade level K , resources, and environmen t Minerals through Geography 1 STANDARDS MINERALS THROUGH GEOGRAPHY See summary of National Science Education s. Original: http://books.nap.edu/readingroom/books/nses/ Concept General Specific General Specific

More information

JAN/FEB MAR/APR MAY/JUN

JAN/FEB MAR/APR MAY/JUN QF 1.100 (Build 1010) Widget Publishing, Inc Page: 1 Circulation Breakdown by Issue Analyzed Nonpaid and Verified Paid Previous This Previous This Total Total issue Removals Additions issue issue Removals

More information

BlackRock Core Bond Trust (BHK) BlackRock Enhanced International Dividend Trust (BGY) 2 BlackRock Defined Opportunity Credit Trust (BHL) 3

BlackRock Core Bond Trust (BHK) BlackRock Enhanced International Dividend Trust (BGY) 2 BlackRock Defined Opportunity Credit Trust (BHL) 3 MUNICIPAL FUNDS Arizona (MZA) California Municipal Income Trust (BFZ) California Municipal 08 Term Trust (BJZ) California Quality (MCA) California Quality (MUC) California (MYC) Florida Municipal 00 Term

More information

October 2016 v1 12/10/2015 Page 1 of 10

October 2016 v1 12/10/2015 Page 1 of 10 State Section S s Effective October 1, 2016 Overview The tables list the Section S items that will be active on records with a target date on or after October 1, 2016. The active on each item subset code

More information

Introduction to Machine Learning

Introduction to Machine Learning 10-701 Introduction to Machine Learning PCA Slides based on 18-661 Fall 2018 PCA Raw data can be Complex, High-dimensional To understand a phenomenon we measure various related quantities If we knew what

More information

14 Singular Value Decomposition

14 Singular Value Decomposition 14 Singular Value Decomposition For any high-dimensional data analysis, one s first thought should often be: can I use an SVD? The singular value decomposition is an invaluable analysis tool for dealing

More information

Stem-and-Leaf Displays

Stem-and-Leaf Displays 3.2 Displaying Numerical Data: Stem-and-Leaf Displays 107 casts in your area? (San Luis Obispo Tribune, June 15, 2005). The responses are summarized in the table below. Extremely 4% Very 27% Somewhat 53%

More information

All-Time Conference Standings

All-Time Conference Standings All-Time Conference Standings Pac 12 Conference Conf. Matches Sets Overall Matches Team W L Pct W L Pct. Score Opp Last 10 Streak Home Away Neutral W L Pct. Arizona 6 5.545 22 19.537 886 889 6-4 W5 4-2

More information

FLOOD/FLASH FLOOD. Lightning. Tornado

FLOOD/FLASH FLOOD. Lightning. Tornado 2004 Annual Summaries National Oceanic and Atmospheric Administration National Environmental Satellite Data Information Service National Climatic Data Center FLOOD/FLASH FLOOD Lightning Tornado Hurricane

More information

Office of Budget & Planning 311 Thomas Boyd Hall Baton Rouge, LA Telephone 225/ Fax 225/

Office of Budget & Planning 311 Thomas Boyd Hall Baton Rouge, LA Telephone 225/ Fax 225/ Louisiana Acadia 20 17 3 0 0 0 Allen 2 2 0 0 0 0 Ascension 226 185 37 2 1 1 Assumption 16 15 1 0 0 0 Avoyelles 20 19 1 0 0 0 Beauregard 16 11 4 0 0 1 Bienville 2 2 0 0 0 0 Bossier 22 18 4 0 0 0 Caddo 91

More information

United States Geography Unit 1

United States Geography Unit 1 United States Geography Unit 1 I WANT YOU TO STUDY YOUR GEORGAPHY Name: Period: Due Date: Geography Key Terms Absolute Location: Relative Location: Demographic Map: Population Density: Sun-Belt: Archipelago:

More information

An Analysis of Regional Income Variation in the United States:

An Analysis of Regional Income Variation in the United States: Modern Economy, 2017, 8, 232-248 http://www.scirp.org/journal/me ISSN Online: 2152-7261 ISSN Print: 2152-7245 An Analysis of Regional Income Variation in the United States: 1969-2013 Orley M. Amos Jr.

More information

Physical Features of Canada and the United States

Physical Features of Canada and the United States I VIUAL Physical Features of Canada and the United tates 170 ARCTIC OCA Aleutian s 1 Bering ea ALAKA Yukon R. Mt. McKinley (20,320 ft. 6,194 m) Gulf of Alaska BROOK RAG RAG Queen Charlotte s R Vancouver

More information

Physical Features of Canada and the United States

Physical Features of Canada and the United States Physical Features of Canada and the United tates 170 ARCTIC OCA Aleutian s 1 1 Bering ea ALAKA Yukon R. Mt. McKinley (20,320 ft. 6,194 m) Gulf of Alaska BROOK RAG RAG Queen Charlotte s R Vancouver O C

More information

GIS use in Public Health 1

GIS use in Public Health 1 Geographic Information Systems (GIS) use in Public Health Douglas Morales, MPH Epidemiologist/GIS Coordinator Office of Health Assessment and Epidemiology Epidemiology Unit Objectives Define GIS and justify

More information

MO PUBL 4YR 2090 Missouri State University SUBTOTAL-MO

MO PUBL 4YR 2090 Missouri State University SUBTOTAL-MO Report ID: USSR8072-V3 Page No. 1 Jurisdiction: ON-CAMPUS IL American Intercontinental Universit 0 0 1 0 Northern Illinois University 0 0 4 0 Southern Illinois Univ - Edwardsvil 2 0 2 0 Southern Illinois

More information

A Summary of State DOT GIS Activities

A Summary of State DOT GIS Activities A Summary of State DOT GIS Activities Prepared for the 2006 AASHTO GIS-T Symposium Columbus, OH Introduction This is the 11 th year that the GIS-T Symposium has conducted a survey of GIS activities at

More information

KS PUBL 4YR Kansas State University Pittsburg State University SUBTOTAL-KS

KS PUBL 4YR Kansas State University Pittsburg State University SUBTOTAL-KS Report ID: USSR8072-V3 Page No. 1 Jurisdiction: ON-CAMPUS IL PUBL TCH DeVry University Addison 1 0 0 0 Eastern Illinois University 1 0 0 0 Illinois State University 0 0 2 0 Northern Illinois University

More information

State Section S Items Effective October 1, 2017

State Section S Items Effective October 1, 2017 State Section S Items Effective October 1, 2017 Overview The tables list the Section S items that will be active on records with a target date on or after October 1, 2017. The active item on each item

More information

Principal Component Analysis

Principal Component Analysis Principal Component Analysis Yingyu Liang yliang@cs.wisc.edu Computer Sciences Department University of Wisconsin, Madison [based on slides from Nina Balcan] slide 1 Goals for the lecture you should understand

More information

Singular Value Decomposition and Principal Component Analysis (PCA) I

Singular Value Decomposition and Principal Component Analysis (PCA) I Singular Value Decomposition and Principal Component Analysis (PCA) I Prof Ned Wingreen MOL 40/50 Microarray review Data per array: 0000 genes, I (green) i,i (red) i 000 000+ data points! The expression

More information

National Organization of Life and Health Insurance Guaranty Associations

National Organization of Life and Health Insurance Guaranty Associations National Organization of and Health Insurance Guaranty Associations November 21, 2005 Dear Chief Executive Officer: Consistent with prior years, NOLHGA is providing the enclosed data regarding insolvency

More information

Division I Sears Directors' Cup Final Standings As of 6/20/2001

Division I Sears Directors' Cup Final Standings As of 6/20/2001 1 Stanford (Cal.) 1359 2 90 9 75 20 62.5 0 0 0 0 0 0 3 75 1 100 5 60 8 76 4 80 0 0 2 70 2 California-Los Angeles 1138 0 0 5 78.5 17 67 0 0 0 0 0 0 2 90 9 50 5 60 2 0 33 48.5 2 70 1 100 3 Georgia 890.5

More information

PCA, Kernel PCA, ICA

PCA, Kernel PCA, ICA PCA, Kernel PCA, ICA Learning Representations. Dimensionality Reduction. Maria-Florina Balcan 04/08/2015 Big & High-Dimensional Data High-Dimensions = Lot of Features Document classification Features per

More information

Introducing North America

Introducing North America Introducing North America I. Quick Stats Includes U.S. & Canada U.S consists of 50 States Federal Government Democracy 4 th in world w/ land area 3 rd in population Economic leader of free world II. Major

More information

1. Introduction to Multivariate Analysis

1. Introduction to Multivariate Analysis 1. Introduction to Multivariate Analysis Isabel M. Rodrigues 1 / 44 1.1 Overview of multivariate methods and main objectives. WHY MULTIVARIATE ANALYSIS? Multivariate statistical analysis is concerned with

More information

Summary of Terminal Master s Degree Programs in Philosophy

Summary of Terminal Master s Degree Programs in Philosophy Summary of Terminal Master s Degree Programs in Philosophy Faculty and Student Demographics All data collected by the ican Philosophical Association. The data in this publication have been provided by

More information

Linear Algebra (Review) Volker Tresp 2018

Linear Algebra (Review) Volker Tresp 2018 Linear Algebra (Review) Volker Tresp 2018 1 Vectors k, M, N are scalars A one-dimensional array c is a column vector. Thus in two dimensions, ( ) c1 c = c 2 c i is the i-th component of c c T = (c 1, c

More information

112th U.S. Senate ACEC Scorecard

112th U.S. Senate ACEC Scorecard 112th U.S. Senate ACEC Scorecard HR 658 FAA Funding Alaska Alabama Arkansas Arizona California Colorado Connecticut S1 Lisa Murkowski R 100% Y Y Y Y Y Y Y Y Y S2 Mark Begich D 89% Y Y Y Y N Y Y Y Y S1

More information

Non-iterative, regression-based estimation of haplotype associations

Non-iterative, regression-based estimation of haplotype associations Non-iterative, regression-based estimation of haplotype associations Benjamin French, PhD Department of Biostatistics and Epidemiology University of Pennsylvania bcfrench@upenn.edu National Cancer Center

More information

Chapter 5 Linear least squares regression

Chapter 5 Linear least squares regression Chapter 5 Linear least squares regression Consider first simple linear regression. For now, the model of interest is only the model of the conditional expectations; the distribution of the residuals is

More information

Foundations of Computer Vision

Foundations of Computer Vision Foundations of Computer Vision Wesley. E. Snyder North Carolina State University Hairong Qi University of Tennessee, Knoxville Last Edited February 8, 2017 1 3.2. A BRIEF REVIEW OF LINEAR ALGEBRA Apply

More information

Evidence for increasingly extreme and variable drought conditions in the contiguous United States between 1895 and 2012

Evidence for increasingly extreme and variable drought conditions in the contiguous United States between 1895 and 2012 Evidence for increasingly extreme and variable drought conditions in the contiguous United States between 1895 and 2012 Sierra Rayne a,, Kaya Forest b a Chemologica Research, 318 Rose Street, PO Box 74,

More information

Linear Algebra (Review) Volker Tresp 2017

Linear Algebra (Review) Volker Tresp 2017 Linear Algebra (Review) Volker Tresp 2017 1 Vectors k is a scalar (a number) c is a column vector. Thus in two dimensions, c = ( c1 c 2 ) (Advanced: More precisely, a vector is defined in a vector space.

More information

7 Principal Component Analysis

7 Principal Component Analysis 7 Principal Component Analysis This topic will build a series of techniques to deal with high-dimensional data. Unlike regression problems, our goal is not to predict a value (the y-coordinate), it is

More information

Robust scale estimation with extensions

Robust scale estimation with extensions Robust scale estimation with extensions Garth Tarr, Samuel Müller and Neville Weber School of Mathematics and Statistics THE UNIVERSITY OF SYDNEY Outline The robust scale estimator P n Robust covariance

More information

Preprocessing & dimensionality reduction

Preprocessing & dimensionality reduction Introduction to Data Mining Preprocessing & dimensionality reduction CPSC/AMTH 445a/545a Guy Wolf guy.wolf@yale.edu Yale University Fall 2016 CPSC 445 (Guy Wolf) Dimensionality reduction Yale - Fall 2016

More information

A GUIDE TO THE CARTOGRAPHIC PRODUCTS OF

A GUIDE TO THE CARTOGRAPHIC PRODUCTS OF A GUIDE TO THE CARTOGRAPHIC PRODUCTS OF THE FEDERAL DEPOSITORY LIBRARY PROGRAM (FDLP) This guide was designed for use as a collection development tool by map selectors of depository libraries that participate

More information

Principal Component Analysis. Applied Multivariate Statistics Spring 2012

Principal Component Analysis. Applied Multivariate Statistics Spring 2012 Principal Component Analysis Applied Multivariate Statistics Spring 2012 Overview Intuition Four definitions Practical examples Mathematical example Case study 2 PCA: Goals Goal 1: Dimension reduction

More information

Office of Budget & Planning 311 Thomas Boyd Hall Baton Rouge, LA Telephone 225/ Fax 225/

Office of Budget & Planning 311 Thomas Boyd Hall Baton Rouge, LA Telephone 225/ Fax 225/ Louisiana Acadia 25 19 4 2 0 0 Allen 8 7 1 0 0 0 Ascension 173 143 26 1 0 3 Assumption 14 12 2 0 0 0 Avoyelles 51 41 9 0 0 1 Beauregard 18 14 3 0 0 1 Bienville 5 0 4 0 1 0 Bossier 28 27 0 1 0 0 Caddo 95

More information

If you have any questions concerning this report, please feel free to contact me. REPORT OF LABORATORY ANALYSIS

If you have any questions concerning this report, please feel free to contact me. REPORT OF LABORATORY ANALYSIS #=CL# LI USE: FR - JILL LAVOIE LI OBJECT ID: August 22, 2014 Jill Lavoie Merrimack Village District - UCMR3 2 Greens Pond Road Merrimack, NH 03054 RE: Dear Jill Lavoie: Enclosed are the analytical results

More information

15 Singular Value Decomposition

15 Singular Value Decomposition 15 Singular Value Decomposition For any high-dimensional data analysis, one s first thought should often be: can I use an SVD? The singular value decomposition is an invaluable analysis tool for dealing

More information