Sensory Analysis, R tools to analyze Preference liking in function of descriptive measures.

1 Sensory Analysis, R tools to analyze Preference liking in function of descriptive measures.

Dhafer Malouche (essai.academia.edu/dhafermalouche)
Center of Political Studies, Institute of Social Research, University of Michigan; Ecole Supérieure de la Statistique et de l'Analyse de l'Information, University of Carthage.
12/09/2016

2 What is Sensory Analysis?

Sensory analysis is a scientific discipline used to analyze reactions to stimuli perceived through the senses: sight, smell, touch, taste, and sound. It is considered one of the most important tools in the food industry because it can be applied in several ways.

3 Aim of Sensory Analysis

- Compare several (food) products.
- Explain consumer preference in terms of the characteristics of the (food) products.
- Identify the optimal (food) product: the one liked by a majority of consumers.

4 Data used for Sensory Analysis

Analytic data:
- $X_1$, physico-chemical data: measurements of physico-chemical parameters of the products under study.
- $X_2$, sensory data: ratings of the products by a panel of experts on a list of sensory attributes.
- Genetic data.

Hedonic data:
- $Y$: ratings provided by a sample of consumers.

$X_1, X_2, \ldots$ are analytic data; $Y$ is perceptual data. The goal is to model $Y \approx f(X_1, X_2, \ldots)$.

5 Example: A study on Tunisian olive oil

- 78 million olive trees (more than 300 million in Spain and 750 million worldwide).
- 6,950 square miles (10% of the Tunisian area).
- Oil mills.
- 4th producer and 2nd exporter...

7 Analyzing Olive Oils

Three data sets:
- Sensory data: panel of experts.
- Physico-chemical data: several laboratories (Italy, Tunisia).
- Hedonic data: a survey in a grocery shop with 252 consumers who agreed to taste the olive oil samples.

8 R tools for Sensory Analysis

FactoMineR, SensoMineR, Factoshiny
- Authors: Francois Husson, Julie Josse, Sebastien Le, Jeremy Mazet.
- Description: PCA, MFA, clustering, panel performance, external and internal mapping...

sensR
- Authors: Per B. Brockhoff, Rune Haubo Bojesen Christensen.
- Description: statistical tests for sensory discrimination and similarity data; power and sample size computations for discrimination and similarity tests.

SensMap
- Authors: Ibtihel Rebhi, Dhafer Malouche.
- Description: external and internal mapping for sensory analysis, with a Shiny UI.

9 Sensory data

11 Sensory data

9 experts rate 21 olive oil products:
- 7 cultivars: Coratina (Italy), Arbequina (Spanish cultivar), Chemchali (South West), Chemlali (South East), Chetoui (North), Leguim (Center), Zalmati (South).
- 3 extraction systems: 3P, 2P, SP.
- 9 attributes: Fruity, Bitter, Pungent, Fusty, Musty, Winey, Muddy, Metallic, Rancid, each rated from 0 to 10.

12 Problem: How can we measure the quality of the collected data?

13 Assessing performance of a Panel

The statistical assessment of panel performance is based on three qualities:
- discrimination: the panelists' ability to differentiate the products;
- repeatability: they rate consistently from one session to another;
- agreement: they rate consensually.

To assess this performance we use an ANOVA in which the dependent variables are the sensory attributes and the factors are:
- products
- panelists
- sessions

14 Assessing performance of a Panel: ANOVA Model

Let $Y_{iks}$ be the rating of the sensory attribute for the $i$-th product by the $k$-th panelist at the $s$-th session.

ANOVA:

$$Y_{iks} = \mu + \alpha_i + \beta_k + \gamma_s + [\alpha\beta]_{ik} + [\alpha\gamma]_{is} + [\beta\gamma]_{ks} + \epsilon_{iks}$$

where
- $\forall (i, k, s)$, $\epsilon_{iks} \sim N(0, \sigma^2)$;
- $\forall (i, k, s) \neq (i', k', s')$, $\mathrm{cov}(\epsilon_{iks}, \epsilon_{i'k's'}) = 0$.

15 Assessing performance of a Panel: ANOVA Model

- $\alpha_i$ is the Product effect: if significant, the panel discriminates between the products.
- $[\alpha\beta]_{ik}$ is the Product-Panelist interaction. It indicates whether there is a consensus among the panelists (agreement): a significant interaction means the panelists do not rank the products in the same way.
- $[\alpha\gamma]_{is}$ is the Product-Session interaction. It indicates whether the products are perceived similarly from one session to another (repeatability): a non-significant interaction is the desired outcome.

16 In practice with R

> m1 <- aov(fruity ~ product + panelist + session +
+             product:panelist + product:session +
+             panelist:session, data = panel_session_data)
> summary(m1)
                   Pr(>F)
product           < 2e-16 ***
panelist                  **
session
product:panelist 2.66e-08 ***
product:session      0.47
panelist:session
Residuals
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Only the Pr(>F) entries shown above are recoverable; the Df, Sum Sq, Mean Sq and F value columns are not.)

17 In practice with R

- With p-value < 2e-16, the Product effect is significant: the panel discriminates between products. [expected]
- With p-value = 2.66e-08, the Product-Panelist interaction is significant: there is no consensus. [not expected]
- With p-value = 0.47, the Product-Session interaction is not significant: the panel is repeatable. [expected]
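
The reading of the ANOVA table can be tried out on simulated data. The sketch below is only an illustration: the data set `d`, its dimensions (5 products, 6 panelists, 2 sessions) and the effect sizes are invented, and only the model formula mirrors the one used on the slides.

```r
# Simulated panel: 5 products x 6 panelists x 2 sessions (all values hypothetical)
set.seed(1)
d <- expand.grid(product  = factor(paste0("P", 1:5)),
                 panelist = factor(paste0("J", 1:6)),
                 session  = factor(1:2))

# Strong product effect, mild panelist effect, no session effect, no interactions
prod_eff <- c(0, 1, 2, 3, 4)[as.integer(d$product)]
pan_eff  <- rnorm(6, sd = 0.3)[as.integer(d$panelist)]
d$fruity <- 3 + prod_eff + pan_eff + rnorm(nrow(d), sd = 0.5)

# Same three-factor model with all two-way interactions
m <- aov(fruity ~ product + panelist + session +
           product:panelist + product:session + panelist:session,
         data = d)
tab <- summary(m)[[1]]

# Row names of the summary table are padded with spaces, so trim them
pvals <- setNames(tab[["Pr(>F)"]], trimws(rownames(tab)))
print(round(pvals, 4))
```

In this simulation the `product` row should come out highly significant (discrimination), while the interaction rows only reflect noise because no interaction was simulated.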

18 In practice with R, Interaction Product-Session

[Figure: Product-Session interaction plot. Axis labels: "Mean per session" and "Mean on the whole session"; legend: Session. Points are labelled by cultivar and extraction system, e.g. chemchali(3p), coratina(sp), zalmati(2p).]

19 In practice with R, package SensoMineR

> formul <- "~product+jury+session+product:jury+product:session+jury:session"
> res.panel <- panelperf(panel_session_data, firstvar = 4,
+                        lastvar = 9,
+                        formul = formul)
> round(res.panel$p.value, 5)

The output is a table of p-values with one row per attribute (Fruity, Bitter, Pungent, Fusty, Muddy, Rancid) and one column per effect (prod, jury, session, p:j, p:s, j:s).
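
The idea of a p-value table with one row per attribute can be sketched in base R by fitting the same ANOVA to each attribute in turn. This is not SensoMineR's implementation, only an illustration of the principle; the data frame `d`, the two attributes and the helper `pval_one` are hypothetical.

```r
set.seed(2)
d <- expand.grid(product = factor(paste0("P", 1:4)),
                 jury    = factor(paste0("J", 1:5)),
                 session = factor(1:2))

# Two hypothetical attributes: one discriminates products, one is pure noise
d$Fruity <- 2 * as.integer(d$product) + rnorm(nrow(d), sd = 0.5)
d$Bitter <- rnorm(nrow(d), mean = 5)

rhs <- "~ product + jury + session + product:jury + product:session + jury:session"

# Fit the ANOVA for one attribute and return the p-value of each effect
pval_one <- function(attr) {
  fit <- aov(as.formula(paste(attr, rhs)), data = d)
  tab <- summary(fit)[[1]]
  p <- tab[["Pr(>F)"]]
  names(p) <- trimws(rownames(tab))
  p[names(p) != "Residuals"]
}

# One row per attribute, one column per effect
p.value <- t(sapply(c("Fruity", "Bitter"), pval_one))
print(round(p.value, 5))
```

Reading the table row by row gives the same diagnosis as on the previous slides: a small p-value in the `product` column signals discrimination, while small values in the interaction columns signal a lack of agreement or repeatability.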

20 Physico-chemical data.

21 Physico-chemical data.

- 7 cultivars and 2 extraction systems (SP and 3P).
- 3 repetitions for each measure.
- 68 physico-chemical parameters: Acidity, K232,...

22 Physico-chemical data, Assessing repeatability

Let $Y_{ir}$ be the measure of a given physico-chemical parameter for the $i$-th product at the $r$-th repetition.

ANOVA:

$$Y_{ir} = \mu + \alpha_i + \beta_r + \epsilon_{ir}$$

where
- $\forall (i, r)$, $\epsilon_{ir} \sim N(0, \sigma^2)$;
- $\forall (i, r) \neq (i', r')$, $\mathrm{cov}(\epsilon_{ir}, \epsilon_{i'r'}) = 0$.

23 In practice with R

> m1 <- aov(acidity ~ product + repitation, data = phch)
> summary(m1)
              Pr(>F)
product     1.44e-14 ***
repitation     0.605
Residuals
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Only the Pr(>F) column is recoverable.)

- With p-value = 1.44e-14, Product has a significant effect: discrimination.
- With p-value = 0.605, Repetition has no significant effect: repeatability.

24 Repeatability, Multivariate approach.

- Perform a Principal Component Analysis (PCA) on the PHCH-data.
- PCA is a dimension reduction method: it summarizes most of the information contained in the data with a small number of artificial variables (principal components).
- Information = total variance of the data.

25 Repeatability, Multivariate approach.

1. Let $X$ be the physico-chemical data, a $P \times N$ numeric matrix where $P$ is the number of products and $N$ is the number of variables; $X_1, \ldots, X_N$ are the columns of $X$.
2. Center and scale $X$ and consider the correlation matrix $V = X^\top X / P$.
3. Search for a $P \times 1$ vector $F = \sum_{j=1}^{N} \alpha_j X_j$, with $\sum_{j=1}^{N} \alpha_j^2 = 1$, such that $\mathrm{Var}(F)$ is maximal.

Solution: $F = F_1 = X u_1$, where $u_1$ is the unit eigenvector of $V$ associated with its largest eigenvalue $\lambda_1$. $F_1$ is called the first principal component of $X$.

26 Repeatability, Multivariate approach.

4. Search for a second vector $F = \sum_{j=1}^{N} \alpha_j X_j$ such that $\mathrm{Var}(F)$ is maximal and $\mathrm{Cor}(F, F_1) = 0$.
   Solution: $F = F_2 = X u_2$, where $u_2$ is the eigenvector of $V$ associated with its second largest eigenvalue $\lambda_2$.
5. Continue in the same way to obtain $F_1, F_2, \ldots, F_k, \ldots$
6. Draw a scatter plot of $[F_k, F_{k'}]$ with repetition as a supplementary variable.
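
The construction in steps 1 to 5 can be reproduced with a plain eigen decomposition. The sketch below uses a small random matrix (hypothetical data, 30 rows and 5 columns) and follows the $1/P$ variance convention of the slides; the names `Z`, `Fk` and `pct` are introduced here for illustration only.

```r
set.seed(3)
P <- 30; N <- 5
X <- matrix(rnorm(P * N), P, N)
X[, 2] <- X[, 1] + 0.3 * X[, 2]          # induce some correlation

# Step 2: center and scale (1/P convention), then V = Z'Z / P
Xc  <- sweep(X, 2, colMeans(X))
sdP <- sqrt(colMeans(Xc^2))
Z   <- sweep(Xc, 2, sdP, "/")
V   <- crossprod(Z) / P                  # correlation matrix of X

# Steps 3 to 5: eigenvectors of V give the weights, Z u_k the components
e   <- eigen(V, symmetric = TRUE)
Fk  <- Z %*% e$vectors                   # columns are F_1, ..., F_N
pct <- 100 * e$values / sum(e$values)    # percentage of variance per component

print(round(e$values, 3))
print(round(pct, 1))
```

The variance of each column of `Fk` equals the corresponding eigenvalue, and the columns are mutually uncorrelated, which is exactly the property required in step 4. The vector `pct` is what a scree plot displays.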

27 In practice with R, Scree-plot

> library(FactoMineR)
> library(factoextra)
> res.pca <- PCA(X, scale.unit = TRUE,
+                ncp = 5, quali.sup = 66:67, graph = FALSE)
> fviz_screeplot(res.pca, ncp = 10) + theme_classic()

28 In practice with R, Scree-plot

[Figure: scree plot. x-axis: Dimensions; y-axis: Percentage of explained variances.]

30 In practice with R, Product-map

> library(ggplot2)
> library(scales)
> library(grid)
> library(plyr)
> library(gridExtra)
> # Convex hull of the points of one product in the (PC1, PC2) plane
> find_hull <- function(df) df[chull(df$PC1, df$PC2), ]
> dt <- cbind.data.frame(res.pca$ind$coord[, 1:2],
+                        X$product, X$repetition)
> colnames(dt) <- c("PC1", "PC2", "Product", "Repetition")
> dt2 <- cbind.data.frame(res.pca$quali.sup$coord[1:14, 1:2],
+                         rownames(res.pca$quali.sup$coord[1:14, ]))
> colnames(dt2) <- c("PC1", "PC2", "Product")
> dt2$Product <- gsub("_", "", dt2$Product)
> hulls <- ddply(dt, "Product", find_hull)

31 In practice with R, Product-map

> p <- ggplot(data = dt, aes(x = PC1, y = PC2, col = Product, fill = Product)) +
+   geom_hline(yintercept = 0, alpha = .4) +
+   geom_vline(xintercept = 0, alpha = .4) +
+   geom_polygon(data = hulls, alpha = .2) +
+   geom_point() +
+   geom_text(data = dt2, aes(x = PC1, y = PC2, label = Product), col = "black") +
+   xlab(paste("Axis 1 (", round(res.pca$eig[1, 2], 1), "%)", sep = "")) +
+   ylab(paste("Axis 2 (", round(res.pca$eig[2, 2], 1), "%)", sep = ""))
> p <- p + theme_classic() + theme(legend.position = "none")
> p

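32 In practice with R, Product-map

[Figure: product map on Axis 1 (28.1%) and Axis 2 (12.5%). Each product is drawn as the convex hull of its repetitions and labelled by cultivar and extraction system: chetou(sp), chetou(3p), corati(3p), corati(sp), chemch(3p), chemch(sp), leguim(3p), leguim(sp), arbeq(3p), arbeq(sp), chemle(3p), chemle(sp), zalmat(3p), zalmat(sp).]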

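33 In practice with R, V-test

> res.pca$quali.sup$v.test[15:17, ]

The output has one row per repetition level and one column per dimension (Dim.1 to Dim.5). A v-test with an absolute value below about 2 indicates that the barycenter of the category does not differ significantly from the origin on that dimension, which supports repeatability.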
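
To make the v-test concrete, here is a minimal sketch of the usual definition: the barycenter of a category on an axis, divided by its standard deviation under random sampling of $n_q$ individuals among $n$ without replacement. The helper `v_test` and the simulated coordinates are hypothetical, and the code is not taken from FactoMineR, whose exact computation may differ in detail.

```r
set.seed(4)
n <- 42                                        # e.g. 14 products x 3 repetitions
repetition <- factor(rep(1:3, times = 14))
coord  <- scale(rnorm(n), scale = FALSE)[, 1]  # centered coordinates on one axis
lambda <- mean(coord^2)                        # variance of the axis (1/n convention)

# v-test of each category: barycenter / sd of a mean of nq draws among n
v_test <- function(coord, group, lambda) {
  n  <- length(coord)
  nq <- tapply(coord, group, length)
  g  <- tapply(coord, group, mean)             # barycenter of each category
  g / sqrt((lambda / nq) * (n - nq) / (n - 1))
}

v <- v_test(coord, repetition, lambda)
print(round(v, 2))
print(abs(v) < 1.96)   # TRUE means not significantly different from the origin
```

Since the repetition labels here are assigned independently of the coordinates, values beyond the 1.96 threshold should be rare, which is the pattern expected when measurements are repeatable.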

35 In practice with R, Product-map: Circle of Correlations

[Figure: circle of correlations of the physico-chemical variables on Axis 1 (28.1%) and Axis 2 (12.5%). Labelled variables include H-Tyr, DFLA, Ac-Pin, LA, Cis-3-hexenol, 3-hexenyl acetate, polyphenols, β-carotène, SOO, C 18:1, LOO, LLL, LLO, C 18:2, 1-Pentanol, LnLO, LnLP, LnOO, LOP, Trans-2-hexenol, PLP, Hexanal, POO, C 16:1, PLL, POP.]

36 External Mapping

37 External Mapping

$Y \approx f(X_1, X_2, \ldots)$

External mapping aims to:
- estimate the probability that a product is liked;
- derive a multidimensional representation of the products based on their sensory profile or on other external data such as instrumental measures of color, texture or flavor.

38 External Mapping, Danzart 1998

1. Combine all measures $X_1, X_2, \ldots$ into one analytic data matrix $X$ with dimensions $P \times N$ (products $\times$ attributes).
2. Apply a dimension reduction method to $X$, for example a Principal Component Analysis, and let $F_1$ and $F_2$ be the first two components.
3. Let $Y$ be the hedonic matrix with dimensions $P \times C$ (products $\times$ consumers) and with columns $Y_1, \ldots, Y_C$.
4. For all $c = 1, \ldots, C$, fit a regression model
   $$Y_c = f_c(F_1, F_2) + \epsilon_c$$
   and let $\hat{f}_c$ be an estimate of $f_c$.

39 External Mapping, Danzart 1998

5. Consider a grid in the plane $(F_1, F_2)$:
   $$G = \left\{ (f^1_{l_1}, f^2_{l_2}),\ (l_1, l_2) \in \{1, \ldots, L\}^2 \right\}$$
6. For each $c$ and $(l_1, l_2)$, predict
   $$\hat{y}(c, l_1, l_2) = \hat{f}_c(f^1_{l_1}, f^2_{l_2})$$
7. Compute the percentage of consumers whose prediction is at least the average of the scores they gave:
   $$\phi(l_1, l_2) = \frac{1}{C} \sum_{c=1}^{C} \mathbf{1}_{\{\hat{y}(c, l_1, l_2) \geq \bar{y}(c)\}}, \quad \text{where } \bar{y}(c) = \frac{1}{P} \sum_{p=1}^{P} y_p(c).$$
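
Steps 4 to 7 can be sketched in base R on simulated data. Everything below is hypothetical: the product coordinates `design` stand in for $(F_1, F_2)$, the hedonic scores are generated so that consumers prefer products near the point (1, 0), and a quadratic regression is assumed for $f_c$.

```r
set.seed(5)
C <- 40; L <- 25

# Hypothetical product coordinates on the first two components (12 products)
design <- expand.grid(F1 = seq(-2, 2, length.out = 4), F2 = c(-1.5, 0, 1.5))
P <- nrow(design)

# Hypothetical hedonic scores: liking decreases with distance from (1, 0)
Y <- sapply(seq_len(C), function(j)
  9 - 0.5 * ((design$F1 - 1)^2 + design$F2^2) + rnorm(P, sd = 0.5))

# Step 5: L x L grid covering the product map
grid <- expand.grid(F1 = seq(min(design$F1), max(design$F1), length.out = L),
                    F2 = seq(min(design$F2), max(design$F2), length.out = L))

# Steps 4 and 6: one quadratic regression per consumer, predicted on the grid
pred <- sapply(seq_len(C), function(j) {
  d   <- cbind(design, y = Y[, j])
  fit <- lm(y ~ F1 + F2 + I(F1^2) + I(F2^2) + F1:F2, data = d)
  predict(fit, newdata = grid)
})

# Step 7: percentage of consumers predicted at or above their own mean score
ybar <- colMeans(Y)
phi  <- 100 * rowMeans(sweep(pred, 2, ybar, ">="))

# Percentage of liking at the grid point closest to the simulated optimum
i0 <- which.min((grid$F1 - 1)^2 + grid$F2^2)
print(phi[i0])
```

The vector `phi` has one value per grid point; reshaped into an `L` by `L` matrix it is the surface drawn on the external map.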

40 In practice with R

1. Prepare the data.
> x.phch <- aggregate(phch[, 1:65],
+                     by = list(phch$product), mean)
> y.hedo <- aggregate(hedonic[, 1:255],
+                     by = list(hedonic$product), mean)
> z.panel <- aggregate(panel[, 3:8],
+                      by = list(panel$products), mean)
2. Merge the PHCH-data and the Senso-data into one data matrix.
> i <- match(x.phch$Group.1, z.panel$Group.1)
> X <- cbind.data.frame(x.phch[, -1], z.panel[i, -1])
3. Hedonic matrix.
> Y <- y.hedo[, -c(1:4)]
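
The role of `match()` in step 2 is easiest to see on a toy example. The two small data frames below are invented; the point is that `aggregate()` names its grouping column `Group.1` and may return the products in a different order for each table, so the rows must be aligned before binding.

```r
# Toy physico-chemical data: 3 products, 3 repetitions each
phch <- data.frame(product = rep(c("A", "B", "C"), each = 3),
                   acidity = c(0.2, 0.3, 0.4, 0.5, 0.5, 0.5, 0.8, 0.9, 1.0))

# Toy panel data whose factor levels are in a different order (C, A, B)
panel <- data.frame(products = factor(rep(c("C", "A", "B"), each = 2),
                                      levels = c("C", "A", "B")),
                    Fruity   = c(6, 4, 2, 4, 7, 9))

# Average over repetitions / panelists; the grouping column is named Group.1
x.phch  <- aggregate(phch["acidity"], by = list(phch$product), mean)
z.panel <- aggregate(panel["Fruity"], by = list(panel$products), mean)

# Align the rows of z.panel on the product order of x.phch
i <- match(x.phch$Group.1, z.panel$Group.1)
X <- cbind.data.frame(x.phch[, -1, drop = FALSE], z.panel[i, -1, drop = FALSE])
rownames(X) <- x.phch$Group.1
print(X)
```

Without the `match()` step the Fruity means would be bound to the wrong products, because `z.panel` is ordered C, A, B while `x.phch` is ordered A, B, C.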

41 In practice with R (cont.)

4. Perform a PCA on the data matrix X with the SensMap package.
> library(SensMap)
> map <- map.with.pca(x = X)
5. Extract the first two components of the PCA and create the grid.
> maps <- cbind.data.frame(map$f1, map$f2)
> colnames(maps) <- c("f1", "f2")
> nbpoints <- 100
> discretspace <- discrete.function(nbpoints = nbpoints,
+                                   map = maps)

42 In practice with R (cont.)

6. Fit a quadratic regression (the default) for each consumer and compute its predictions at every point of the grid:

   $$Y_c = a^c + b_1^c F_1 + b_2^c F_2 + b_{11}^c F_1^2 + b_{22}^c F_2^2 + b_{12}^c F_1 F_2 + \epsilon_c$$

> x.lm <- predict.scores.lm(y = Y, discretspace = discretspace,
+                           map = maps)
7. Compute the percentage of liking.
> p.lm <- 100 * rowMeans(x.lm$preference)

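43 In practice with R (cont.)

8. Draw the external map.
> library(fields)
> # Reshape the percentages into an image object (assumed step, by analogy with the denoising slide)
> graph.surfconso <- as.image(Z = p.lm, x = discretspace,
+                             nrow = nbpoints, ncol = nbpoints)
> image.plot(graph.surfconso, col = terrain.colors(60))
> contour(x = graph.surfconso$x, y = graph.surfconso$y,
+         z = graph.surfconso$z, add = TRUE,
+         levels = seq(from = 0, to = 100, by = 5))
> text(x = maps$f1, y = maps$f2, labels = y.hedo$Group.1, pos = 3)
> points(x = maps$f1, y = maps$f2, pch = 20)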

44 In practice with R (cont.)

[Figure: external map. Contour lines give the percentage of liking over the first two PCA components; the 14 products (chemle, zalmat, arbeq, chetou, chemch, leguim and corati, each with 3p and sp) are plotted as labelled points.]

46 Denoising the External map

Now fit a local polynomial regression (loess) of the percentages on the (x, y) coordinates of the grid.

> dlow <- cbind.data.frame(discretspace, p.lm)
> colnames(dlow) <- c("x", "y", "z")
> m.loess <- loess(z ~ x + y, span = .5, data = dlow, degree = 2)
> p.loess <- m.loess$fitted
> graph.surfconso.loess <- as.image(Z = p.loess, x = discretspace,
+                                   nrow = nbpoints,
+                                   ncol = nbpoints)
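
The effect of the loess step can be checked on a synthetic surface where the truth is known. The sketch below is an illustration under invented assumptions: the grid, the quadratic "true" percentage surface and the noise level are all hypothetical, and only the `loess()` call mirrors the one on the slide.

```r
set.seed(6)
nbpoints <- 30
g <- expand.grid(x = seq(-2, 2, length.out = nbpoints),
                 y = seq(-2, 2, length.out = nbpoints))

# Hypothetical smooth percentage surface plus noise
truth <- 70 + 8 * g$x - 4 * g$x^2 - 6 * g$y^2
g$z   <- truth + rnorm(nrow(g), sd = 8)

# Local quadratic regression of z on the two grid coordinates
m.loess  <- loess(z ~ x + y, data = g, span = 0.5, degree = 2)
z.smooth <- matrix(m.loess$fitted, nrow = nbpoints, ncol = nbpoints)

# Root mean squared error against the known surface, before and after smoothing
rmse_noisy  <- sqrt(mean((g$z - truth)^2))
rmse_smooth <- sqrt(mean((m.loess$fitted - truth)^2))
print(c(noisy = rmse_noisy, smooth = rmse_smooth))
```

The matrix `z.smooth` has x along its rows and y along its columns, so it can be passed directly to `contour(unique(g$x), unique(g$y), z.smooth)` to draw the denoised map.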

47 Denoising the External map (cont.)

[Figure: denoised external map. Smoothed contour lines of the percentage of liking, with the 14 products (chetoui, chemle, zalmat, arbeq, chemch, leguim and corati, each with 3p and sp) plotted as labelled points.]

49 Some references

- Lê, S., and Worch, T. (2014). Analyzing Sensory Data with R. CRC Press.
- McEwan, J. A. (1996). Preference mapping for product optimization. In Naes, T., and Risvik, E. (eds.), Multivariate Analysis of Data in Sensory Science. New York: Elsevier.
- Danzart, M. (1998). Quadratic model in preference mapping. 4th Sensometric Meeting, Copenhagen, Denmark, August 1998.
- timesens.com, by INRA Dijon, France (2016).

50 Thank you
