Sensory Analysis, R tools to analyze Preference liking in function of descriptive measures.
1 Dhafer Malouche, essai.academia.edu/dhafermalouche. Center of Political Studies, Institute of Social Research, University of Michigan; Ecole Supérieure de la Statistique et de l'Analyse de l'Information, University of Carthage. 12/09/2016
2 What is Sensory Analysis? Sensory analysis is a scientific discipline used to analyze reactions to stimuli perceived through the senses: sight, smell, touch, taste, and hearing. It is considered one of the most important tools for the food industry because it can be applied in several ways.
3 Aim of Sensory Analysis. Compare several (food) products. Explain consumer preference in terms of the characteristics of the (food) products. Identify the optimal (food) product: the one liked by a majority of consumers.
4 Data used for Sensory Analysis. Analytic data: $X_1$, physico-chemical data (measurements of physico-chemical parameters of the products under study); $X_2$, sensory data (ratings of the products by a panel of experts on a list of sensory attributes); genetic data. Hedonic data: $Y$, ratings provided by a sample of consumers. $X_1, X_2, \ldots$ are analytic data; $Y$ is perceptual data. The goal is to model $Y \approx f(X_1, X_2, \ldots)$.
5 Example: a study of Tunisian olive oil. Tunisia has 78 million olive trees (more than 300 million in Spain and 750 million worldwide), covering 6,950 square miles (10% of the Tunisian area), plus oil mills. It is the 4th producer and 2nd exporter...
7 Analyzing Olive Oils. Three data sets. Sensory data: a panel of experts. Physico-chemical data: several laboratories (Italy, Tunisia). Hedonic data: a survey in a grocery shop with 252 consumers who agreed to taste the olive oil samples.
8 R tools for Sensory Analysis.
FactoMineR, SensoMineR, Factoshiny. Authors: François Husson, Julie Josse, Sébastien Lê, Jeremy Mazet. Description: PCA, MFA, clustering, panel performance, external and internal mapping...
sensR. Authors: Per B. Brockhoff, Rune Haubo Bojesen Christensen. Description: statistical tests of sensory discrimination and similarity data; power and sample size computations for discrimination and similarity tests.
SensMap. Authors: Ibtihel Rebhi, Dhafer Malouche. Description: everything about sensory analysis: external and internal mapping, Shiny UI.
11 Sensory data. 9 experts rate 21 olive oil products (7 cultivars × 3 extraction systems). 7 cultivars: Coratina (Italy), Arbequina (Spain), Chemchali (South West), Chemlali (South East), Chetoui (North), Leguim (Center), Zalmati (South). 3 extraction systems: 3P, 2P, SP. 9 attributes: Fruity, Bitter, Pungent, Fusty, Musty, Winey, Muddy, Metallic, Rancid, each rated from 0 to 10.
12 Problem: How can we measure the quality of the collected data?
13 Assessing the performance of a panel. The statistical assessment of performance is based on three qualities. Discrimination: the panelists' ability to differentiate the products. Repeatability: they do so consistently from one session to another. Agreement: they do so consensually. To assess this performance we use an ANOVA in which the dependent variables are the sensory attributes and the factors are products, panelists and sessions.
14 Assessing the performance of a panel: ANOVA model. Let $Y_{iks}$ be the rating of the sensory attribute of the $i$-th product by the $k$-th panelist at the $s$-th session. ANOVA:
$$Y_{iks} = \mu + \alpha_i + \beta_k + \gamma_s + [\alpha\beta]_{ik} + [\alpha\gamma]_{is} + [\beta\gamma]_{ks} + \epsilon_{iks}$$
where $\forall (i,k,s)$, $\epsilon_{iks} \sim N(0, \sigma^2)$ and $\forall (i,k,s) \neq (i',k',s')$, $\mathrm{cov}(\epsilon_{iks}, \epsilon_{i'k's'}) = 0$.
15 Assessing the performance of a panel: ANOVA model. $\alpha_i$ is the Product effect: if significant, the panel is able to discriminate between products. $[\alpha\beta]_{ik}$ is the Product-Panelist interaction: it indicates whether there is a consensus among the panelists (agreement); a significant interaction means no consensus. $[\alpha\gamma]_{is}$ is the Product-Session interaction: it indicates whether the product is perceived similarly from one session to another (repeatability); a significant interaction means the ratings are not repeatable.
16 In practice with R.
> m1 <- aov(fruity ~ product + panelist + session +
+             product:panelist + product:session +
+             panelist:session, data = panel_session_data)
> summary(m1)
Pr(>F) column of the output: product < 2e-16 (***); panelist (**); product:panelist 2.66e-08 (***); session, product:session (0.47) and panelist:session carry no significance stars.
17 In practice with R: interpretation. Product effect: p-value < 2e-16, significant, so the panel discriminates between products [expected]. Product-Panelist interaction: p-value = 2.66e-08, significant, so there is no consensus among the panelists [not expected]. Product-Session interaction: p-value = 0.47, not significant, so the ratings are repeatable [expected].
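The reading of the three p-values can be sketched on simulated data. This is only an illustration under stated assumptions: the data frame `panel_sim`, the 0.05 threshold `alpha` and the `verdict` vector are hypothetical names introduced here, and the ratings are generated with a product effect only, so they do not reproduce the olive oil results.

```r
set.seed(1)
# Simulated panel (hypothetical): 4 products x 5 panelists x 2 sessions
panel_sim <- expand.grid(
  product  = factor(paste0("P", 1:4)),
  panelist = factor(paste0("J", 1:5)),
  session  = factor(1:2)
)
# Ratings with a strong product effect and no interaction by construction
panel_sim$fruity <- 2 * as.integer(panel_sim$product) +
  rnorm(nrow(panel_sim), sd = 0.5)

# Same model structure as on the slide: main effects + two-way interactions
fit <- aov(fruity ~ product + panelist + session +
             product:panelist + product:session + panelist:session,
           data = panel_sim)
tab <- summary(fit)[[1]]
# Row names of the ANOVA table are space-padded, hence trimws()
p_values <- setNames(tab[["Pr(>F)"]], trimws(rownames(tab)))

alpha <- 0.05  # hypothetical significance threshold
verdict <- c(
  discrimination = unname(p_values["product"] < alpha),
  agreement      = unname(p_values["product:panelist"] >= alpha),
  repeatability  = unname(p_values["product:session"] >= alpha)
)
print(round(p_values, 4))
print(verdict)
```

Discrimination is read from a significant product effect, while agreement and repeatability are read from non-significant interactions, which is why the comparisons point in opposite directions.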
18 In practice with R, Interaction Product-Session. [Figure: Product-Session interaction plot; x-axis: Session, y-axis: Mean per session, with the mean over all sessions marked for each product (cultivar and extraction system).]
19 In practice with R, package SensoMineR.
> library(SensoMineR)
> formul <- "~product+jury+session+product:jury+product:session+jury:session"
> res.panel <- panelperf(panel_session_data, firstvar = 4,
+                        lastvar = 9, formul = formul)
> round(res.panel$p.value, 5)
The output is a table of p-values with one row per attribute (Fruity, Bitter, Pungent, Fusty, Moddy, Rancid) and one column per model term (product, jury, session and their two-way interactions).
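The attribute-by-term table of p-values can also be built by hand, which makes explicit what such a table contains. The sketch below uses base R only and is not SensoMineR's implementation; `panel_sim`, `attributes_to_test` and `p_table` are hypothetical names, and the two attributes are simulated.

```r
set.seed(2)
# Simulated panel (hypothetical): 4 products x 5 jury members x 2 sessions
panel_sim <- expand.grid(
  product = factor(paste0("P", 1:4)),
  jury    = factor(paste0("J", 1:5)),
  session = factor(1:2)
)
n <- nrow(panel_sim)
# Two hypothetical attributes: one driven by the product, one pure noise
panel_sim$Fruity <- 2 * as.integer(panel_sim$product) + rnorm(n, sd = 0.5)
panel_sim$Bitter <- rnorm(n, mean = 3, sd = 1)

rhs <- "~ product + jury + session + product:jury + product:session + jury:session"
attributes_to_test <- c("Fruity", "Bitter")

# One ANOVA per attribute; keep the p-value of every model term
p_table <- t(sapply(attributes_to_test, function(att) {
  fit <- aov(as.formula(paste(att, rhs)), data = panel_sim)
  tab <- summary(fit)[[1]]
  terms_kept <- trimws(rownames(tab)) != "Residuals"
  setNames(tab[["Pr(>F)"]][terms_kept], trimws(rownames(tab))[terms_kept])
}))
print(round(p_table, 5))
```

Each row is read like the single-attribute ANOVA: a small p-value in the product column signals discrimination for that attribute, and large p-values in the interaction columns signal agreement and repeatability.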
20 Physico-chemical data.
21 Physico-chemical data. 7 cultivars × 2 extraction systems (SP and 3P). 3 repetitions of each measurement. 68 physico-chemical parameters: Acidity, K232, ...
22 Physico-chemical data, assessing repeatability. Let $Y_{ir}$ be the measurement of a given physico-chemical parameter of the $i$-th product at the $r$-th repetition. ANOVA:
$$Y_{ir} = \mu + \alpha_i + \beta_r + \epsilon_{ir}$$
where $\forall (i,r)$, $\epsilon_{ir} \sim N(0, \sigma^2)$ and $\forall (i,r) \neq (i',r')$, $\mathrm{cov}(\epsilon_{ir}, \epsilon_{i'r'}) = 0$.
23 In practice with R.
> m1 <- aov(acidity ~ product + repitation, data = phch)
> summary(m1)
Pr(>F) column of the output: product 1.44e-14 (***); repitation 0.605.
With p-value = 1.44e-14, Product has a significant effect: discrimination. With p-value = 0.605, Repetition has no significant effect: repeatability.
24 Repeatability, multivariate approach. Perform a Principal Component Analysis (PCA) on the PHCH-data. PCA is a dimension reduction method: it summarizes most of the information contained in the data with a small number of artificial variables (principal components), where information means the total variance of the data.
25 Repeatability, multivariate approach.
1. Let $X$ be the physico-chemical data: a $P \times N$ numeric matrix, where $P$ is the number of products and $N$ is the number of variables, with columns $X_1, \ldots, X_N$.
2. Center and scale $X$ and consider the correlation matrix $V = X^\top X / P$.
3. Search for a $P \times 1$ matrix (or vector) $F = \sum_{j=1}^{N} \alpha_j X_j$, with $\sum_{j=1}^{N} \alpha_j^2 = 1$, such that $\mathrm{Var}(F)$ is maximal. Solution: $F = F_1 = X u_1$, where $u_1$ is the eigenvector of $V$ associated with its largest eigenvalue $\lambda_1$. $F_1$ is called the first principal component of $X$.
26 Repeatability, multivariate approach.
4. We then search for a second $F = \sum_{j=1}^{N} \alpha_j X_j$ such that $\mathrm{Var}(F)$ is maximal and $\mathrm{Cor}(F, F_1) = 0$. Solution: $F = F_2 = X u_2$, where $u_2$ is the eigenvector of $V$ associated with its second largest eigenvalue $\lambda_2$.
5. And so on for $F_1, F_2, \ldots, F_k, \ldots$
6. Draw a scatter plot of $[F_k, F_{k'}]$ with repetition as a supplementary variable.
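Steps 2 to 4 can be checked numerically in base R. The sketch below uses a small simulated matrix (not the olive oil data) and hypothetical variable names; it verifies that the variance of each component equals the corresponding eigenvalue and that the first two components are uncorrelated.

```r
set.seed(3)
# Hypothetical data: P = 20 products, N = 4 variables, two of them correlated
P <- 20
z <- rnorm(P)
X <- cbind(a = z + rnorm(P, sd = 0.3),
           b = z + rnorm(P, sd = 0.3),
           c = rnorm(P),
           d = rnorm(P))

# Step 2: center and scale with the 1/P variance convention, then V = X'X / P
Xc <- sweep(X, 2, colMeans(X))
Xs <- sweep(Xc, 2, sqrt(colMeans(Xc^2)), "/")
V  <- crossprod(Xs) / P

# Steps 3-4: eigenvectors u_k of V, components F_k = X u_k
eig    <- eigen(V, symmetric = TRUE)
scores <- Xs %*% eig$vectors

# Var(F_k) equals lambda_k, and F_1 and F_2 are uncorrelated
var_F <- colMeans(scores^2)
print(round(cbind(lambda = eig$values, var_F = var_F), 4))
print(round(cor(scores[, 1], scores[, 2]), 10))

# Cross-check: prcomp on scaled data has the same eigenvalues
pc <- prcomp(X, center = TRUE, scale. = TRUE)
print(round(pc$sdev^2, 4))
```

Because the variables are standardized, the eigenvalues sum to $N$, so $\lambda_k / N$ is the share of total variance carried by the $k$-th component, which is what a scree plot displays.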
27 In practice with R, Scree-plot.
> library(FactoMineR)
> library(factoextra)
> res.pca <- PCA(X, scale.unit = TRUE,
+                ncp = 5, quali.sup = 66:67, graph = FALSE)
> fviz_screeplot(res.pca, ncp = 10) + theme_classic()
28 In practice with R, Scree-plot. [Figure: scree plot; x-axis: Dimensions, y-axis: Percentage of explained variances.]
30 In practice with R, Product-map.
> library(ggplot2)
> library(scales)
> library(grid)
> library(plyr)
> library(gridExtra)
> # rows of d that lie on the convex hull of its (pc1, pc2) points
> find_hull <- function(d) d[chull(d$pc1, d$pc2), ]
> dt <- cbind.data.frame(res.pca$ind$coord[, 1:2],
+                        X$product, X$repetition)
> colnames(dt) <- c("pc1", "pc2", "product", "repetition")
> dt2 <- cbind.data.frame(res.pca$quali.sup$coord[1:14, 1:2],
+                         rownames(res.pca$quali.sup$coord[1:14, ]))
> colnames(dt2) <- c("pc1", "pc2", "product")
> dt2$product <- gsub("_", "", dt2$product)
> hulls <- ddply(dt, "product", find_hull)
31 In practice with R, Product-map.
> p <- ggplot(data = dt, aes(x = pc1, y = pc2, col = product, fill = product)) +
+   geom_hline(yintercept = 0, alpha = .4) +
+   geom_vline(xintercept = 0, alpha = .4) +
+   geom_polygon(data = hulls, alpha = .2) +
+   geom_point() +
+   geom_text(data = dt2, aes(x = pc1, y = pc2, label = product), col = "black") +
+   xlab(paste("Axis 1 (", round(res.pca$eig[1, 2], 1), "%)", sep = "")) +
+   ylab(paste("Axis 2 (", round(res.pca$eig[2, 2], 1), "%)", sep = ""))
> p <- p + theme_classic() + theme(legend.position = "none")
> p
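The hull step is what makes repeatability visible on the product map: each polygon encloses the repetitions of one product, so small polygons mean repetitions that fall close together. A minimal base R sketch of that step, on hypothetical coordinates and without plyr, could look like this.

```r
set.seed(4)
# Hypothetical PCA coordinates: 3 products x 5 repetitions
dt <- data.frame(
  pc1 = rnorm(15),
  pc2 = rnorm(15),
  product = rep(c("A", "B", "C"), each = 5)
)

# chull() returns the row indices of the points on the convex hull
find_hull <- function(d) d[chull(d$pc1, d$pc2), ]

# Apply it product by product and stack the hull vertices
hulls <- do.call(rbind, lapply(split(dt, dt$product), find_hull))
print(table(hulls$product))
```

The resulting `hulls` data frame has the same columns as `dt`, so it can be passed to a polygon layer that shares the aesthetics of the scatter plot.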
32 In practice with R, Product-map. [Figure: product map with one convex hull per product (cultivar and extraction system); Axis 1 (28.1%), Axis 2 (12.5%).]
33 In practice with R, V-test.
> res.pca$quali.sup$v.test[15:17, ]
The output gives the v-test of the three repetition categories on Dim.1 to Dim.5. A v-test with absolute value below about 2 indicates that the category's coordinate on that dimension is not significantly different from 0, which is what repeatability requires.
34 In practice with R, Product-map. [Figure: product map; Axis 1 (28.1%), Axis 2 (12.5%).]
35 In practice with R, Product-map. [Figure: circle of correlations of the physico-chemical variables (e.g. H-Tyr, DFLA, polyphenols, C 18:1, C 18:2, Hexanal); Axis 1 (28.1%), Axis 2 (12.5%).]
36 External Mapping
37 External Mapping. $Y \approx f(X_1, X_2, \ldots)$. It aims to estimate the probability that a product is liked, and to derive a multidimensional representation of products based on their sensory profile or on other external data such as instrumental measures of color, texture or flavor.
38 External Mapping, Danzart (1998).
1. Combine all measures $X_1, X_2, \ldots$ into one analytic data matrix $X$ with dimensions $P \times N$ (products × attributes).
2. Perform a dimension reduction method on $X$, for example a Principal Component Analysis, and let $F_1$ and $F_2$ be the first two components.
3. Let $Y$ be the hedonic matrix with dimensions $P \times C$ (products × consumers) and with columns $Y_1, \ldots, Y_C$.
4. For all $c = 1, \ldots, C$, fit the regression model $Y_c = f_c(F_1, F_2) + \epsilon_c$ and let $\hat{f}_c$ be the estimate of $f_c$.
39 External Mapping, Danzart (1998).
5. Consider a grid in the plane $(F_1, F_2)$: $G = \{ (f^1_{l_1}, f^2_{l_2}),\ (l_1, l_2) \in \{1, \ldots, L\}^2 \}$.
6. For each $c$ and $(l_1, l_2)$, predict $\hat{y}(c, l_1, l_2) = \hat{f}_c(f^1_{l_1}, f^2_{l_2})$.
7. Compute the percentage of consumers whose prediction is at least the average of the scores they gave:
$$\phi(l_1, l_2) = \frac{1}{C} \sum_{c=1}^{C} 1_{\{\hat{y}(c, l_1, l_2) \geq \bar{y}(c)\}}, \quad \text{where } \bar{y}(c) = \frac{1}{P} \sum_{p=1}^{P} y_p(c).$$
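The steps above can be sketched end to end in base R. This is an illustration under stated assumptions, not the SensMap implementation: the product coordinates and the hedonic scores are simulated, the regression is the quadratic model in $F_1$ and $F_2$, and all object names (`maps`, `grid`, `pred`, `phi`) are hypothetical.

```r
set.seed(5)
P <- 12; C <- 30; L <- 25  # products, consumers, grid points per axis

# Hypothetical first two principal components of the analytic data (steps 1-2)
maps <- data.frame(F1 = rnorm(P), F2 = rnorm(P))

# Hypothetical hedonic matrix (P x C): each consumer has an ideal point (step 3)
ideal <- matrix(rnorm(2 * C), ncol = 2)
Y <- sapply(seq_len(C), function(j) {
  7 - (maps$F1 - ideal[j, 1])^2 - (maps$F2 - ideal[j, 2])^2 +
    rnorm(P, sd = 0.5)
})

# Step 5: regular L x L grid covering the product map
grid <- expand.grid(
  F1 = seq(min(maps$F1), max(maps$F1), length.out = L),
  F2 = seq(min(maps$F2), max(maps$F2), length.out = L)
)

# Steps 4 and 6: one quadratic regression per consumer, predicted on the grid
pred <- sapply(seq_len(C), function(j) {
  d   <- cbind(maps, y = Y[, j])
  fit <- lm(y ~ F1 + F2 + I(F1^2) + I(F2^2) + I(F1 * F2), data = d)
  predict(fit, newdata = grid)
})

# Step 7: share of consumers whose prediction reaches their own mean score
liked <- sweep(pred, 2, colMeans(Y), ">=")
phi   <- 100 * rowMeans(liked)
print(summary(phi))
```

Each grid point ends up with a percentage between 0 and 100, so reshaping `phi` into an L by L matrix gives the surface drawn as the external map.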
40 In practice with R.
1. Prepare the data: average each data set by product.
> x.phch <- aggregate(phch[, 1:65],
+                     by = list(phch$product), mean)
> y.hedo <- aggregate(hedonic[, 1:255],
+                     by = list(hedonic$product), mean)
> z.panel <- aggregate(panel[, 3:8],
+                      by = list(panel$products), mean)
2. Merge the PHCH-data and the Senso-data into one data matrix.
> i <- match(x.phch$Group.1, z.panel$Group.1)
> X <- cbind.data.frame(x.phch[, -1], z.panel[i, -1])
3. Hedonic matrix.
> Y <- y.hedo[, -c(1:4)]
41 In practice with R (cont.)
4. Perform a PCA on the merged data with the SensMap package.
> library(SensMap)
> map <- map.with.pca(x = X)
5. Extract the first two components of the PCA and create the grid.
> maps <- cbind.data.frame(map$f1, map$f2)
> colnames(maps) <- c("f1", "f2")
> nbpoints <- 100
> discretspace <- discrete.function(nbpoints = nbpoints,
+                                   map = maps)
42 In practice with R (cont.)
6. Fit a quadratic regression (the default) for each consumer and compute the predictions at all points of the grid:
$$Y_c = a_c + b^{1}_c F_1 + b^{2}_c F_2 + b^{11}_c F_1^2 + b^{22}_c F_2^2 + b^{12}_c F_1 F_2 + \epsilon_c$$
> x.lm <- predict.scores.lm(y = Y, discretspace = discretspace,
+                           map = maps)
7. Compute the percentage of liking.
> p.lm <- 100 * rowMeans(x.lm$preference)
43 In practice with R (cont.)
8. Draw the external map.
> library(fields)  # provides image.plot()
> # graph.surfconso: image object (x, y, z) of the liking percentages p.lm;
> # its construction is not shown on the slides
> image.plot(graph.surfconso, col = terrain.colors(60))
> contour(x = graph.surfconso$x, y = graph.surfconso$y,
+         z = graph.surfconso$z, add = TRUE,
+         levels = seq(from = 0, to = 100, by = 5))
> text(x = maps$f1, y = maps$f2, labels = y.hedo$Group.1, pos = 3)
> points(x = maps$f1, y = maps$f2, pch = 20)
44 In practice with R (cont.) [Figure: external map; contour levels of the percentage of liking over the plane of the first two components, with the 14 products (cultivar and extraction system) positioned on it.]
46 Denoising the External map. Now fit a local polynomial regression (loess) of the liking percentages on the (x, y) map coordinates.
> dlow <- cbind.data.frame(discretspace, p.lm)
> colnames(dlow) <- c("x", "y", "z")
> m.loess <- loess(z ~ x + y, span = .5, data = dlow, degree = 2)
> p.loess <- m.loess$fitted
> graph.surfconso.loess <- as.image(Z = p.loess, x = discretspace,
+                                   nrow = nbpoints,
+                                   ncol = nbpoints)
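The effect of this smoothing step can be illustrated on a simulated surface. The sketch below is an assumption-laden example, not the olive oil map: the "true" surface is a hypothetical quadratic function of the coordinates, the noise level is arbitrary, and only base R (`loess` from the stats package) is used.

```r
set.seed(6)
nbpoints <- 20
grid <- expand.grid(x = seq(-2, 2, length.out = nbpoints),
                    y = seq(-2, 2, length.out = nbpoints))

# Hypothetical noise-free liking surface and its noisy version
truth <- 60 + 8 * grid$x - 4 * grid$y^2
dlow  <- cbind(grid, z = truth + rnorm(nrow(grid), sd = 5))

# Local quadratic regression of the percentage on the map coordinates
m_loess <- loess(z ~ x + y, data = dlow, span = 0.5, degree = 2)
p_loess <- fitted(m_loess)

# Reshape to an nbpoints x nbpoints matrix (x varies fastest in expand.grid)
z_smooth <- matrix(p_loess, nrow = nbpoints, ncol = nbpoints)

# Mean squared distance to the noise-free surface, before and after smoothing
print(c(raw    = mean((dlow$z - truth)^2),
        smooth = mean((p_loess - truth)^2)))
```

The `span` argument controls how much of the grid contributes to each local fit: larger values give smoother contours at the cost of flattening genuine local peaks of liking.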
47 Denoising the External map (cont.) [Figure: smoothed external map; contour levels of the loess-fitted percentage of liking, with the 14 products positioned on it.]
49 Some references.
Lê, S., and Worch, T. (2014). Analyzing Sensory Data with R. CRC Press.
McEwan, J. A. (1996). Preference mapping for product optimization. In Multivariate Analysis of Data in Sensory Science (Naes, T., and Risvik, E., eds.). New York: Elsevier.
Danzart, M. (1998). Quadratic model in preference mapping. 4th Sensometric meeting, Copenhagen, Denmark, August.
TimeSens, timesens.com, by INRA Dijon, France (2016).
50 Thank you
More informationData Visualization for Policy Advocacy
Data Visualization for Policy Advocacy David Epstein Research Associate Baltimore Neighborhood Indicators Alliance Jacob France Institute University of Baltimore Presentation for the National Low Income
More information7. The set of all points for which the x and y coordinates are negative is quadrant III.
SECTION - 67 CHAPTER Section -. To each point P in the plane there corresponds a single ordered pair of numbers (a, b) called the coordinates of the point. To each ordered pair of numbers (a, b) there
More informationSTAT 212: BUSINESS STATISTICS II Third Exam Tuesday Dec 12, 6:00 PM
STAT212_E3 KING FAHD UNIVERSITY OF PETROLEUM & MINERALS DEPARTMENT OF MATHEMATICS & STATISTICS Term 171 Page 1 of 9 STAT 212: BUSINESS STATISTICS II Third Exam Tuesday Dec 12, 2017 @ 6:00 PM Name: ID #:
More informationApplied Multivariate Statistical Analysis Richard Johnson Dean Wichern Sixth Edition
Applied Multivariate Statistical Analysis Richard Johnson Dean Wichern Sixth Edition Pearson Education Limited Edinburgh Gate Harlow Essex CM20 2JE England and Associated Companies throughout the world
More informationLecture 2: Diversity, Distances, adonis. Lecture 2: Diversity, Distances, adonis. Alpha- Diversity. Alpha diversity definition(s)
Lecture 2: Diversity, Distances, adonis Lecture 2: Diversity, Distances, adonis Diversity - alpha, beta (, gamma) Beta- Diversity in practice: Ecological Distances Unsupervised Learning: Clustering, etc
More informationInference for Regression
Inference for Regression Section 9.4 Cathy Poliak, Ph.D. cathy@math.uh.edu Office in Fleming 11c Department of Mathematics University of Houston Lecture 13b - 3339 Cathy Poliak, Ph.D. cathy@math.uh.edu
More informationStat 412/512 TWO WAY ANOVA. Charlotte Wickham. stat512.cwick.co.nz. Feb
Stat 42/52 TWO WAY ANOVA Feb 6 25 Charlotte Wickham stat52.cwick.co.nz Roadmap DONE: Understand what a multiple regression model is. Know how to do inference on single and multiple parameters. Some extra
More informationDiscriminant Analysis
Chapter 16 Discriminant Analysis A researcher collected data on two external features for two (known) sub-species of an insect. She can use discriminant analysis to find linear combinations of the features
More informationBooklet of Code and Output for STAC32 Final Exam
Booklet of Code and Output for STAC32 Final Exam December 7, 2017 Figure captions are below the Figures they refer to. LowCalorie LowFat LowCarbo Control 8 2 3 2 9 4 5 2 6 3 4-1 7 5 2 0 3 1 3 3 Figure
More informationReview. Midterm Exam. Midterm Review. May 6th, 2015 AMS-UCSC. Spring Session 1 (Midterm Review) AMS-5 May 6th, / 24
Midterm Exam Midterm Review AMS-UCSC May 6th, 2015 Spring 2015. Session 1 (Midterm Review) AMS-5 May 6th, 2015 1 / 24 Topics Topics We will talk about... 1 Review Spring 2015. Session 1 (Midterm Review)
More informationPumpkin Example: Flaws in Diagnostics: Correcting Models
Math 3080. Treibergs Pumpkin Example: Flaws in Diagnostics: Correcting Models Name: Example March, 204 From Levine Ramsey & Smidt, Applied Statistics for Engineers and Scientists, Prentice Hall, Upper
More informationArea1 Scaled Score (NAPLEX) .535 ** **.000 N. Sig. (2-tailed)
Institutional Assessment Report Texas Southern University College of Pharmacy and Health Sciences "An Analysis of 2013 NAPLEX, P4-Comp. Exams and P3 courses The following analysis illustrates relationships
More informationSTK4900/ Lecture 3. Program
STK4900/9900 - Lecture 3 Program 1. Multiple regression: Data structure and basic questions 2. The multiple linear regression model 3. Categorical predictors 4. Planned experiments and observational studies
More informationR Output for Linear Models using functions lm(), gls() & glm()
LM 04 lm(), gls() &glm() 1 R Output for Linear Models using functions lm(), gls() & glm() Different kinds of output related to linear models can be obtained in R using function lm() {stats} in the base
More informationANOVA, ANCOVA and MANOVA as sem
ANOVA, ANCOVA and MANOVA as sem Robin Beaumont 2017 Hoyle Chapter 24 Handbook of Structural Equation Modeling (2015 paperback), Examples converted to R and Onyx SEM diagrams. This workbook duplicates some
More informationMissing values imputation for mixed data based on principal component methods
Missing values imputation for mixed data based on principal component methods Vincent Audigier, François Husson & Julie Josse Agrocampus Rennes Compstat' 2012, Limassol (Cyprus), 28-08-2012 1 / 21 A real
More informationANOVA. Testing more than 2 conditions
ANOVA Testing more than 2 conditions ANOVA Today s goal: Teach you about ANOVA, the test used to measure the difference between more than two conditions Outline: - Why anova? - Contrasts and post-hoc tests
More information1 Introduction to Minitab
1 Introduction to Minitab Minitab is a statistical analysis software package. The software is freely available to all students and is downloadable through the Technology Tab at my.calpoly.edu. When you
More informationDesign of Engineering Experiments Chapter 5 Introduction to Factorials
Design of Engineering Experiments Chapter 5 Introduction to Factorials Text reference, Chapter 5 page 170 General principles of factorial experiments The two-factor factorial with fixed effects The ANOVA
More informationSecond Midterm Exam Name: Solutions March 19, 2014
Math 3080 1. Treibergs σιι Second Midterm Exam Name: Solutions March 19, 2014 (1. The article Withdrawl Strength of Threaded Nails, in Journal of Structural Engineering, 2001, describes an experiment to
More informationSTAT 572 Assignment 5 - Answers Due: March 2, 2007
1. The file glue.txt contains a data set with the results of an experiment on the dry sheer strength (in pounds per square inch) of birch plywood, bonded with 5 different resin glues A, B, C, D, and E.
More informationMultivariate Statistics Summary and Comparison of Techniques. Multivariate Techniques
Multivariate Statistics Summary and Comparison of Techniques P The key to multivariate statistics is understanding conceptually the relationship among techniques with regards to: < The kinds of problems
More informationPsychology 405: Psychometric Theory
Psychology 405: Psychometric Theory Homework Problem Set #2 Department of Psychology Northwestern University Evanston, Illinois USA April, 2017 1 / 15 Outline The problem, part 1) The Problem, Part 2)
More informationCorrespondence analysis and related methods
Michael Greenacre Universitat Pompeu Fabra www.globalsong.net www.youtube.com/statisticalsongs../carmenetwork../arcticfrontiers Correspondence analysis and related methods Middle East Technical University
More informationMultiple Regression Introduction to Statistics Using R (Psychology 9041B)
Multiple Regression Introduction to Statistics Using R (Psychology 9041B) Paul Gribble Winter, 2016 1 Correlation, Regression & Multiple Regression 1.1 Bivariate correlation The Pearson product-moment
More informationShort Answer Questions: Answer on your separate blank paper. Points are given in parentheses.
ISQS 6348 Final exam solutions. Name: Open book and notes, but no electronic devices. Answer short answer questions on separate blank paper. Answer multiple choice on this exam sheet. Put your name on
More informationConsensus NMF: A unified approach to two-sided testing of micro array data
Consensus NMF: A unified approach to two-sided testing of micro array data Paul Fogel, Consultant, Paris, France S. Stanley Young, National Institute of Statistical Sciences JMP Users Group October 11-14
More informationLecture 10 Multiple Linear Regression
Lecture 10 Multiple Linear Regression STAT 512 Spring 2011 Background Reading KNNL: 6.1-6.5 10-1 Topic Overview Multiple Linear Regression Model 10-2 Data for Multiple Regression Y i is the response variable
More informationBIOSTATS 640 Spring 2018 Unit 2. Regression and Correlation (Part 1 of 2) R Users
BIOSTATS 640 Spring 08 Unit. Regression and Correlation (Part of ) R Users Unit Regression and Correlation of - Practice Problems Solutions R Users. In this exercise, you will gain some practice doing
More informationApplication of the LLL Algorithm in Sphere Decoding
Application of the LLL Algorithm in Sphere Decoding Sanzheng Qiao Department of Computing and Software McMaster University August 20, 2008 Outline 1 Introduction Application Integer Least Squares 2 Sphere
More information