LECTURE 4 PRINCIPAL COMPONENTS ANALYSIS / EXPLORATORY FACTOR ANALYSIS


NOTES FROM PRE-LECTURE RECORDING ON PCA

PCA and EFA have similar goals but are substantially different in important ways. The goal is to collapse 20 items into a smaller number of more meaningful variables that explain the data; simply adding up the variables may not be the optimal way to combine them. Items with large component values are important in explaining the data, so we can interpret the component and give it a name. Component 2 explains another dimension of the data. When you rotate the graphical representation, the straight line that explains most of the data is the first principal component. We want to find the weights for the variables that make up the first principal component: this component has maximum variance, and that variance is the first eigenvalue. The second principal component maximises the explanation of the remaining variance and has zero correlation with the first component, so it gives an independent explanation of variance. PCA weights the different variables so as to account for as much variation in the data as possible. Each principal component you extract is an additive, weighted sum of the individual variables, so you can explain most of the variation without needing 20 distinct variables. Once you have extracted components, you can rotate again to simplify the structure and aid interpretation of the components: the rotation shrinks the smaller loadings so they can be ignored, which makes the components easier to interpret.

NOTES FROM PRE-LECTURE RECORDING ON EFA

The goal is similar to PCA, but the technique is different. EFA involves an assumption of an underlying factor that causes the correlations among the variables. PCA is a mathematical matrix technique, whereas EFA assumes an underlying explanatory model in the data, such as intelligence or self-esteem. We are searching for this underlying factor, which explains lots of different observations of intelligence (lots of different variables). In EFA, the observed variables are derived from the common factors. Common factors are similar to principal components, but common factors explain the observed variables, because the observations come from the overarching factor. Diagram: you have four intelligence tests (X1, X2, X3, X4) that measure the overarching factor of intelligence, F. The diagram describes the common factor model. The overarching factor F explains the observed scores on each X test, to the extent of the lambda term for each X. You would expect the tests to be correlated, but once the common factor is taken into account they no longer are, because they are all explained by the common factor: there are then no relations among the tests, and each test has only its own specific factor.

The underlying rationale involves partial correlations among variables. We can analyse whether the correlation between two items is fully explained by an overarching variable. The goal of factor analysis is to find a latent variable which accounts for the observed correlations and renders the correlation between items, once the latent variable is taken into account, zero. The aim is to find a latent or unobserved variable which, when correlated with our observed variables, leads to partial correlations between the observed variables that are as close to zero as we can get. The communality is the variance due to common factors: the variance in each variable that is due to the overarching common factor. What is left over after the communality is the residual variance that comes from the individual variable itself.

How do you find these latent factors? There are various types of exploratory factor analysis; maximum likelihood and principal axis factoring are recommended.
- Maximum likelihood is good because it has a test of fit. If the test is significant, the model does not fit the data. We do not want to reject the null hypothesis here, so we want p values above .05. The problem with this test is that it is very sensitive to sample size, so with larger samples it picks up trivial mis-specifications too easily.
- There are various ways to take this into account, such as the RMSEA: if it is less than 0.05, the model is a good fit.

Once we have identified the factors, we can estimate factor scores for each case. One option is to form the sum of scores for the items that load on a factor. This assumes equal weights for each item (a tau-equivalent test); if this is not true, alpha is a serious underestimate. Another option assumes varying factor loadings; such measures are known as congeneric tests, and the approach is what was described for principal components analysis: finding the weights for each variable. With unequal loadings, the options are the regression method, the Bartlett method, or the Anderson-Rubin method. Here, you create new variables (factors) from your multiple scores, save them, and run further analyses on the factor rather than on the separate tests, because the new factor combines the weights of the individual tests based on an assumed underlying latent variable.

NOTES FROM LECTURE

ASSOCIATIONS AMONG VARIABLES

Say you measure people on a 20-item scale: this gives you 20 variables. As this is too many to use practically in research, we can use the associations among the variables to condense them into a smaller number. There are various techniques for associating variables:
o Cluster analysis places variables in groups organised hierarchically, where each variable is in one and only one cluster.
o Multidimensional scaling shows relationships among variables by spatial configuration.
o Principal component analysis combines variables into weighted sums (components) to simplify the analysis.
o Factor analysis shows relationships among variables through their relationships to hypothetical underlying factors or latent variables.
Principal component and factor analysis both use correlations as their underlying association among variables, whereas cluster analysis and MDS use proximities. These are similar methods, but they are distinct.

Why bother?
o If you have 30 variables, it makes sense to simplify them by combining them into a small number of components or factors.
o PCA does this by finding optimal linear transformations. It is a mathematical technique that uses the observed data to create principal components.
o EFA assumes that there are latent factors that are not directly observed and that explain the associations among the variables. It aims to find latent, unobserved factors that account for variation in the variables.

PRINCIPAL COMPONENTS ANALYSIS

The basic idea of principal components is graphical. The goal is to find new axes to fit your data; these will represent the new components. If you have n variables, you are working in n-dimensional space. Perhaps there are new axes that will model the data more simply, and perhaps the data can be described just as well without using the full n components. To make these new axes useful, you choose the new components one by one, each chosen to explain the most variance possible.

Example: three variables graphed in three-dimensional space. Notice that there appears to be one straight line that cuts through the data. If you rotate the axes, you can visualise this straight line, which appears to explain the data very well. The red line is the first principal component, as it accounts for the most variation in the data. You can find this line and then use it as one variable, thereby replacing three variables with one while still accounting for a large proportion of the variation in the data. There may be more than one line that explains the data (more than one principal component), but you hope to keep the number of principal components low, so that you have few variables to work with and still explain a large proportion of the variation in the data.

ALGEBRAIC INTERPRETATION

The constraint on the a coefficients (their squares sum to 1) is not important in itself; it is just a technical constraint that makes it possible to find the weights for the variables that make up the principal components. What matters is that the first principal component you extract explains the maximum variance: this variance is the first eigenvalue. If necessary, you can extract a second principal component, which explains the maximum proportion of the remaining variance (the 2nd eigenvalue) and has ZERO correlation with the first principal component (it is at right angles to that line). You can continue until you have p principal components, if you have p variables. This means you are simply rotating your axes in order to explain the variation in the data with new axes. But if the majority of the variance is explained by 2-3 principal components, you would stop after these.

Why is this simpler?
o When they are first extracted, the principal components are not correlated with each other, even if the original variables are (and we need the original variables to be correlated for the technique to be useful).
o Using this technique, we can also see which component is the most important: the first component explains the largest proportion of the variance, the second component explains the second largest, and so on.

Eigenvalues are derived from the correlation matrix of the variables, via an eigendecomposition of that matrix. The correlation matrix is used, rather than the covariance matrix, so the variables are standardised before analysis. This is a mathematical technique, not a statistical technique: there are no error terms here, so it is not statistics. It is a standard manipulation of the correlation matrix of the variables to derive the principal components. Principal components analysis is especially useful when you have a large number of variables and wish to explain the same amount of variance in the observed data using fewer variables.

EXAMPLE

A 20-item teamwork scale for a training group. Participants answered the items on a Likert scale.
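What the extraction step computes can be sketched outside SPSS. The following is a minimal numpy illustration of principal components as an eigendecomposition of the item correlation matrix; the data are randomly generated stand-ins for the 20 teamwork items, not the lecture's dataset.

```python
# Minimal numpy sketch of the extraction step: principal components as an
# eigendecomposition of the correlation matrix of standardised items. The
# data below are random stand-ins, not the lecture's teamwork data.
import numpy as np

def pca_from_correlation(X):
    """X: (n_cases, n_items) array of raw item scores."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)   # standardise the items
    R = np.corrcoef(Z, rowvar=False)                   # correlation matrix
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1]                  # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    loadings = eigvecs * np.sqrt(eigvals)              # component loadings
    explained = eigvals / eigvals.sum()                # proportion of variance
    scores = Z @ eigvecs                               # component scores per case
    return eigvals, loadings, explained, scores

rng = np.random.default_rng(1)
X = rng.normal(size=(150, 20))                         # 150 cases, 20 items
eigvals, loadings, explained, scores = pca_from_correlation(X)
print(np.round(eigvals[:4], 2), np.round(explained.cumsum()[:4], 2))
```

The eigenvalues and the cumulative proportion of variance printed here correspond to the Initial Eigenvalues and cumulative % columns described in the SPSS output below.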

On SPSS: Analyze > Dimension Reduction > Factor. Enter the variables you wish to derive principal components from. Under Extraction, the default method is principal components. If you run the analysis with the default settings, the output lists all 20 components. The Initial Eigenvalues column shows the variance explained by each component, and the cumulative % column shows the cumulative percentage of variance explained if all 20 components are extracted. Do not rely on SPSS's default decision (here it selects 4 components); it is usually wrong. We want to decide at what point we have captured adequate signal and the rest is noise.

HOW MANY COMPONENTS SHOULD YOU EXTRACT?

1. Kaiser's rule. This is the default option used by SPSS, but it is usually wrong. The number of factors equals the number of eigenvalues greater than 1.0; this is known as the Kaiser-Guttman rule. Intuitively, the rule means that any retained factor should account for at least as much variation as any of the original variables: each observed variable contributes one unit of variance to the total variance, so if an eigenvalue is greater than 1, that principal component explains at least as much variance as one observed variable. In practice it tends to choose about one third of the number of variables as the number of components.

2. The scree plot. This is good for graphically conceptualising the idea of signal and noise (scree is the rubble at the bottom of a cliff). The scree plot graphs the component number (x) against the eigenvalues (y). At some point the plot turns; this is the point at which the eigenvalues for your components diminish, and it is where you should stop extracting components. Obtain the scree plot under the Extraction options in SPSS. From these different tests you will obtain several different criteria for selecting the number of components, and you need to make an assessment about how many to extract.

3. The parallel test. This test enables you to move beyond a superficial examination of the scree plot. It generates random data of the same dimensions as your dataset; these data are definitely noise. You can then compare the eigenvalues for your dataset with those for the random data, and where yours are larger, this represents signal in direct comparison to noise. Where the scree plot of your data (red line) is above the 95th percentile of the random eigenvalues (blue line), you can be confident this is signal. This is beneficial because it removes the subjectivity of simply looking at the scree plot and gives you a specific decision point. It is not built into SPSS, but you can run it with a syntax script (provided on LMS); make sure the correct variables are entered in the syntax. A rough sketch of the same procedure outside SPSS is given below.
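The LMS syntax script is not reproduced in these notes, but the logic of the parallel test is easy to sketch. The following Python/numpy code is a rough illustration of Horn's parallel analysis under the usual assumptions (normally distributed random data, 95th-percentile cut-off); it is not the script used in the lecture.

```python
# Rough sketch of Horn's parallel analysis -- the logic behind the LMS
# syntax script, not the script itself. Random data of the same size as
# the real dataset are pure noise, so components are kept only while the
# observed eigenvalues exceed the 95th percentile of the random ones.
import numpy as np

def parallel_analysis(X, n_sims=1000, percentile=95, seed=0):
    """X: (n_cases, n_items) array of raw item scores."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    obs = np.sort(np.linalg.eigvalsh(np.corrcoef(X, rowvar=False)))[::-1]
    rand = np.empty((n_sims, p))
    for i in range(n_sims):
        noise = rng.normal(size=(n, p))                # definitely noise
        rand[i] = np.sort(np.linalg.eigvalsh(np.corrcoef(noise, rowvar=False)))[::-1]
    pctile = np.percentile(rand, percentile, axis=0)   # the "Pctile" column
    keep = obs > pctile                                # signal vs noise
    n_components = int(keep.argmin()) if not keep.all() else p
    return obs, pctile, n_components
```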

From the parallel-test output, you extract components where the eigenvalue for your data (the Raw Data column) is greater than the 95th percentile of the random-data eigenvalues (the Pctile column). Here, component 2 is the last component whose eigenvalue exceeds the random noise, so you select 2 components.

4. The MAP test. Velicer (1976) devised a test based on partial correlations, known as the Minimum Average Partial correlation test. After each component is extracted, it and those extracted before it are partialled out of the correlation matrix of the original variables, and the average of the resulting partial correlations is calculated. As more components are partialled out, the partial correlations approach zero; but at some point components that reflect only noise start being partialled out, and the average partial correlation begins to rise. You therefore choose the number of components corresponding to the minimum average partial correlation. A syntax script is used for the MAP test as well; a sketch of the calculation follows below.
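Again as a sketch of the logic rather than the LMS script itself, Velicer's MAP calculation can be written out as follows; note that the standard test averages the squared partial correlations.

```python
# Rough sketch of Velicer's (1976) MAP test. The component loadings come
# from the eigendecomposition of R; after partialling out the first m
# components, the average SQUARED off-diagonal partial correlation is
# tracked, and the number of components at its minimum is retained.
import numpy as np

def map_test(R):
    """R: (p, p) correlation matrix of the observed items."""
    p = R.shape[0]
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1]
    loadings = eigvecs[:, order] * np.sqrt(eigvals[order])
    avg_sq = []
    for m in range(p - 1):                  # partial out the first m components
        A = loadings[:, :m]
        C = R - A @ A.T                     # residual covariance after partialling
        d = np.sqrt(np.diag(C))
        partial = C / np.outer(d, d)        # rescale to partial correlations
        off_diag = partial[~np.eye(p, dtype=bool)]
        avg_sq.append(np.mean(off_diag ** 2))
    return int(np.argmin(avg_sq)), avg_sq   # retained components, MAP curve
```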

In the MAP output, you can see from the middle column that the average partial correlation decreases and then increases once noise starts being partialled out. Choose the number of components at the minimum of this value. Here the MAP test and the parallel test agree on 2 principal components, so you can have confidence that 2 is a good number. If there were a discrepancy, you would have to make a decision: run both and see whether they lead to differing interpretations, or refer to Schmitt (2011), who argues that the parallel test is more accurate, and go with that, or see which solution is the most interpretable. Syntax for the parallel and MAP tests is on LMS.

COMPONENT / FACTOR LOADINGS

Now that we know the number of components to extract is 2, you can specify this under Extraction in SPSS.
o This gives you a component matrix, with the loadings of each variable on each component.
o You can see the weights by component: within a given component, an item with a greater weight is more important than one with a lesser weight.
o Signs of the weights: everything is standardised, so if you score high on a positively weighted item, your score on component 1 is higher; if you score high on a negatively weighted item, your score on component 1 decreases.

This is still fairly complicated, with loadings for all 20 items on each component. You want to work out what component 1 actually means, so we need another step to make this simpler: rotate again.

COMPONENT ROTATION: SIMPLIFYING INTERPRETATION

Once you have extracted a number of components that is smaller than your total number of variables, you can rotate again in the remaining dimensions and obtain an even simpler interpretation. We rotate again so that the components are easier to interpret: ideally each component ends up with large loadings on some variables and small loadings on the others. In SPSS you do this under the Rotation option. Varimax is an orthogonal rotation method: it keeps the axes at right angles and does not allow correlation between the components (some rotation methods allow correlation, others do not). A sketch of the varimax calculation is given below. The rotation produces a rotated component matrix in the output, which increases the larger loadings and decreases the smaller loadings to aid interpretation. To aid this further, you can ask SPSS to suppress small coefficients.
o These are still included in the analysis; they are just not reported in the output. Suppressing coefficients below 0.3 is generally acceptable.
o At this point, you can interpret each component with reference to the original variables.

COMPONENT INTERPRETATION

You look at the loadings greater than about 0.6, with reference to the original items, and work out what is common to all those items; this commonality represents the component variable. The component is thus a particular way of adding up the different variables, with derived weights that aid interpretation of the data.

TYPES OF ROTATION

Orthogonal rotation: components / factors DO NOT correlate. The components are at right angles to one another. With an orthogonal rotation, loadings are equivalent to correlations between the observed variables and the components.
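For readers who want to see what the rotation is doing, here is a minimal sketch of the standard varimax algorithm in numpy; it is not SPSS's implementation, and the stopping rule and starting values are the usual textbook choices.

```python
# Minimal sketch of the standard varimax algorithm (not SPSS's own code):
# orthogonally rotate the retained loadings so that each item loads
# strongly on one component and weakly on the others.
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-6):
    p, k = loadings.shape
    rotation = np.eye(k)
    total = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # singular value decomposition of the varimax criterion's gradient
        u, s, vt = np.linalg.svd(
            loadings.T @ (rotated ** 3
                          - rotated @ np.diag((rotated ** 2).sum(axis=0)) / p))
        rotation = u @ vt
        new_total = s.sum()
        if new_total < total * (1 + tol):   # stop when the criterion plateaus
            break
        total = new_total
    return loadings @ rotation              # the rotated component matrix
```

Applied to the 20 x 2 unrotated loading matrix from the extraction step, this returns a rotated component matrix in which loadings below about 0.3 can then be suppressed for interpretation, as described above.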

Oblique rotation: components / factors DO correlate. The component axes converge on one another in the graph. With oblique rotations you also get a factor correlation matrix in the SPSS output, which tells you the extent to which your factors correlate, along with a pattern matrix and a structure matrix. Focus on the PATTERN MATRIX: the structure matrix is the product of the pattern matrix and the factor correlation matrix, but you use the pattern matrix for interpretation. Schmitt (2011) strongly recommends oblique rotation methods, as in practice components usually correlate. In practice, try both rotation methods; if the correlation between the components turns out to be quite small, revert to an orthogonal rotation.

EXPLORATORY FACTOR ANALYSIS

EFA is much more important in psychological research than principal components analysis, which is an older technique. This is because much of what we deal with in psychological research is latent constructs that we assume explain our observations.
o Eg intelligence: there are many different measures of intelligence, and we assume this latent variable explains observations on intelligence tests.
Factor analysis is a technique for searching for such underlying factors, so it is closely related to measurement. Factor analysis assumes these latent constructs exist; you need a theoretical argument to justify what construct might exist. It is particularly useful when you want to estimate underlying factors or constructs that cause the associations among your variables but cannot be measured directly.

THE COMMON FACTOR MODEL

We have the common factor, to which each individual measure relates, and a specific factor for each measure, which explains that measure's variation away from the common factor (the error term).

We have observed variables and assume there are k common factors that explain the observations on these variables. Each observed variable is the product of the common factor and the degree to which the variable relates to that factor (its loading), with an error term added.

Assumptions of this EXPLORATORY factor analysis (these differ for confirmatory factor analysis):
o Common factors are standardised (variance = 1).
o Common factors are uncorrelated.
o Specific factors are uncorrelated.
o Common factors are uncorrelated with the specific factors.

The common factor model is interpreted using structural equation modelling conventions. We have an unseen common factor (eg neuroticism) at the top. The four Xs are the different observations or measures (eg different neuroticism tests, or different items on a neuroticism test). Each observation is related to the common factor to the value of its lambda coefficient; different lambda coefficients indicate differing degrees of prediction by the common factor. The u coefficients represent the variation in the observations that is not explained by the common factor (the error terms). Greater error terms mean smaller lambda coefficients.

UNDERLYING RATIONALE: PARTIAL CORRELATIONS

This technique uses the correlations between items, given the influence of the explanatory factor. Suppose a correlation between two items on an extraversion scale:
o Item 1: "Don't mind being the centre of attention."
o Item 2: "Feel comfortable around people."
o Correlation of 0.82 between item 1 and extraversion.
o Correlation of 0.75 between item 2 and extraversion.
The partial correlation is r12.E = (r12 - r1E * r2E) / sqrt((1 - r1E^2)(1 - r2E^2)), where r12.E is the partial correlation between items 1 and 2 controlling for Extraversion, r12 is the correlation between items 1 and 2, r1E is the correlation between item 1 and Extraversion, and r2E is the correlation between item 2 and Extraversion. In the lecture example this calculation gives zero: the correlation between items 1 and 2, controlling for extraversion, is zero, so their correlation is FULLY EXPLAINED by extraversion. (With r1E = 0.82 and r2E = 0.75, the partial correlation is exactly zero when r12 = 0.82 x 0.75 = 0.615.)

The goal of factor analysis, therefore, is to find a latent variable which will account for the observed correlations: find the point at which the partial correlations are zero. We try to find a model (which PCA does not do; PCA is simple mathematics) that best captures the covariance matrix and renders the partial correlations zero. The aim is to find a latent or unobserved variable which, when correlated with our observed variables, leads to partial correlations between the observed variables that are as close to zero as possible.
o The computer algorithm estimates the lambda coefficients so that the partial correlations are as near to zero as possible.
If each variable (the Xs in the diagram) is to vary, it can only vary through the extent to which it relates to the latent factor (lambda) or to its specific factor (the u term). This is conceptualised below.
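The diagram itself is not reproduced in these notes, but the variance partition it illustrates can be written out. For a single common factor (standard common factor algebra, stated here rather than copied from the slide):

$$\operatorname{Var}(X_i) = \underbrace{\lambda_i^2}_{\text{communality } h_i^2} + \underbrace{\psi_i}_{\text{uniqueness (specific factor } u_i)}$$

and with $k$ common factors the communality is $h_i^2 = \sum_{j=1}^{k} \lambda_{ij}^2$, the "sum of the squared lambda terms" referred to below.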

We want high communality: the communality is the variation due to the common factor (the sum of the squared lambda terms).

FUNDAMENTAL EQUATION OF FACTOR ANALYSIS

Σ = ΛΛ′ + Ψ. The left-hand term, Σ, is the covariance matrix of all the observed variables. We are trying to find an optimal model that separates it into the factor-loading variance (ΛΛ′) and the specific-factor variance (Ψ). In other words, we are modelling the covariance matrix.

PERFORMING THE FACTOR ANALYSIS IN SPSS

Instead of principal components under the Extraction tab, choose one of the other options, usually maximum likelihood or generalised least squares, which are exploratory factor analysis methods. There are issues here with whether or not the covariance matrix is readily factorisable. PCA always works because it is simple mathematics; as EFA is a statistical model, the conditions of the data need to be right for it to work correctly.

TECHNICAL ISSUES WITH RUNNING EFA

Sample size: research has attempted to provide rules of thumb for sample size to ensure useful results.
o A key paper (Guadagnoli and Velicer, 1987) showed that absolute sample size and the communalities (the size of the factor loadings) are the most important things.
o Where the component loadings are high (above 0.60) and there are four or more variables per component, the sample size could be as low as 50. Generally, though, you would want around 150 participants for this to work.

Costello & Osborne (2005) state that ideally an EFA should have:
o High communalities for each item (above 0.8 would be excellent, but lower values are more common). If an item has a communality lower than about 0.4, it could be removed, as it does not fit with the other items; but don't remove items too liberally in the assignment just because they have low communalities.
o Few cross-loadings: not many items that load more than around 0.32 on more than one factor. Eg the Big 5 personality traits are intended not to correlate with one another. In reality you will have items that cross-load; you can tolerate some cross-loadings, but you would not have them in an ideal dataset.
o More than three strongly loading items per factor.
If these conditions do not apply, a bigger sample size could help, but in practice you cannot increase your sample size after you have collected the data.
o Schmitt (2011) recommends the Muthen-Muthen simulation method for deciding how much power your sample size gives you, but this is implemented in MPlus, not SPSS.

COMMUNALITY

Communalities are important, but they are only known after the factor loadings have been found. There are, however, various diagnostics you can use to judge whether your covariance matrix will be factorisable before finding the factor loadings:
o Low correlations lead to low factor loadings.
o Bartlett's test of whether the correlations = 0. You want to reject this.
o The anti-image correlation matrix.
o Kaiser's measure of sampling adequacy.
You can run these pre-diagnostics in SPSS under the Descriptives tab: tick the anti-image option and the KMO and Bartlett's test option.

GUTTMAN-KAISER IMAGE APPROACH

Image analysis partitions the variance of an observed variable into common and unique parts:
o Correlations due to the common parts are the image correlations.
o Correlations due to the unique parts are the anti-image correlations. You want these to be NEAR ZERO.
o In the anti-image correlation table in the SPSS output, you want big numbers on the diagonal. The diagonal is the MSA value for each variable (you want this close to 1), and the off-diagonal entries are the anti-image correlations (you want these close to zero). This tells you that the correlations / covariances in the data should be factorisable.
o It is good to report this, eg: the anti-image correlations were close to zero, none were greater than the diagonal, and the smallest diagonal (MSA) value was 0.8.
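If you want to check these pre-diagnostics outside SPSS, the standard formulas can be coded directly. The sketch below is my own illustration using the usual textbook definitions (not the SPSS source): Bartlett's test of sphericity and the overall KMO statistic computed from a correlation matrix R and sample size n.

```python
# Rough numpy/scipy sketch of the two pre-diagnostics, using the usual
# textbook formulas: Bartlett's test of sphericity and the overall KMO
# measure based on anti-image (partial) correlations.
import numpy as np
from scipy.stats import chi2

def bartlett_sphericity(R, n):
    """R: correlation matrix; n: number of cases. Want a significant result."""
    p = R.shape[0]
    stat = -(n - 1 - (2 * p + 5) / 6) * np.log(np.linalg.det(R))
    df = p * (p - 1) / 2
    return stat, chi2.sf(stat, df)

def kmo(R):
    """Overall Kaiser measure of sampling adequacy. Want a value close to 1."""
    inv = np.linalg.inv(R)
    d = np.sqrt(np.diag(inv))
    partial = -inv / np.outer(d, d)              # anti-image correlations
    np.fill_diagonal(partial, 0.0)
    r_sq = (R - np.eye(R.shape[0])) ** 2         # squared off-diagonal correlations
    return r_sq.sum() / (r_sq.sum() + (partial ** 2).sum())
```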

KAISER'S MEASURE OF SAMPLING ADEQUACY

You don't need to know the formula; the output will report a measure of your sampling adequacy. The higher the value, the better. The sampling adequacy statistic here is very high, which is a good result. Bartlett's test: you want this to be significant, and it usually is. It is good to report it, but it is better to understand and report the KMO with reference to the anti-image correlation matrix. These diagnostic techniques give you a sense of whether you can factorise your variables.

METHODS FOR FINDING FACTORS

Schmitt (2011) recommends the use of two SPSS methods for EFA:
o Maximum likelihood (ML), and
o Principal axis factoring (PAF).
Try different methods; if you obtain consistent results, this is good. The maximum likelihood method has a test of fit. Statistical interpretation of the fit statistics requires the data to be multivariate normal, but factor loadings can always be calculated whether the data are normal or not.
o The chi-square value tests the reconstructed correlations (from the factors) against the correlations in the data.
o We want the data to be matched by the model, so we DO NOT want the goodness-of-fit test to be significant. We WANT probabilities GREATER than our significance level. BUT the chi-square value is very sensitive to sample size, so with larger samples it picks up trivial mis-specifications too easily. If the test says the model does not fit, you should look at the RMSEA.
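The RMSEA has to be calculated by hand. A commonly used form of the formula (the standard definition, not copied from the slides) is:

$$\text{RMSEA} = \sqrt{\frac{\max(\chi^2 - df,\; 0)}{df\,(N - 1)}}$$

where the chi-square and its degrees of freedom come from the goodness-of-fit table and N is the sample size.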

SPSS will not produce the RMSEA, but you can compute it yourself: all the values you need come from the goodness-of-fit table except the sample size.
o Sometimes the degrees of freedom can be greater than the chi-square; in that case the RMSEA is treated as 0.
o The RMSEA is an overall error term: low values are good.

Heywood cases in the factor loadings are a technical problem: you have probably extracted too many factors. It means the computer has inserted a correction because a loading was estimated at a value implying there is no unique variance.

SUGGESTED STEPS IN EFA

Check your data by examining the MSA values. Find the most likely number of factors, using one of the better rules such as the parallel test; where there is some doubt, find solutions for your best estimate and for one more and one fewer factors. Use a common factor method when you want to interpret the factors; if you just want a weighted sum of variables, use principal components. Try oblique rotations first; if there are no substantial correlations between the factors, consider using a simpler orthogonal rotation. Unique to EFA are the questions of whether the model will work (the anti-image analyses) and whether the model fits (the RMSEA). You can also save the factors as new variables and then run further analyses using them, eg do males and females differ on the new factor (whatever it is argued to represent)? This would be an independent-samples t-test.

FACTOR SCORES

Having identified the factors, we can estimate factor scores for each case.
o One option: form the sum of scores for the items that load on the factor. This assumes equal weights for each item (a tau-equivalent test); if this is not true, alpha is a serious underestimate.
o Another option allows varying factor loadings; such measures are known as congeneric tests. SPSS provides three ways of doing this (under the Save option): regression is the default, the Bartlett method is recommended, and Anderson-Rubin is misleading for oblique solutions as it assumes uncorrelated scores.
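As an illustration of what the Regression option is doing, here is a sketch of the regression (Thurstone) method for factor scores in numpy, assuming orthogonal (uncorrelated) factors; for oblique solutions the structure matrix would replace the loading matrix. This is the textbook formula, not SPSS's code.

```python
# Sketch of the regression (Thurstone) method for factor scores, assuming
# orthogonal factors. Scoring weights are R^{-1} * Lambda; each case's
# scores are its standardised item scores times those weights.
import numpy as np

def regression_factor_scores(Z, R, loadings):
    """Z: (n_cases, n_items) standardised item scores; R: item correlation
    matrix; loadings: (n_items, n_factors) factor loading matrix."""
    weights = np.linalg.solve(R, loadings)   # solves R * W = Lambda
    return Z @ weights                       # one score per case per factor
```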

In practice, the method you use may not matter much.

COMPARING PCA AND EFA

Principal components analysis:
o You do not claim that there is some underlying construct that you are measuring; there are simply themes in the data that you extract.
o It works with observed variables.
o Components are weighted composites of the observed variables, so they are also observed variables.
o If a variable is added to or removed from the analysis, the components may change.
o If another component is added or removed, the other component loadings do not change.

Exploratory factor analysis:
o You can claim that you are investigating an underlying factor that causes the correlations among your variables.
o Factors are latent variables superordinate to the observed variables: they cause the observed variables to correlate.
o If an observed variable is added to or removed from the analysis, the other loadings should not change.
o If another factor is added or removed, the factor loadings of the others will change.

Why the difference? It is an issue of the diagonal elements of the correlation matrix.
o In component analysis, the value 1.0 is used on the diagonal, and the aim is to explain all the variance of each variable.
o In factor analysis, the diagonal element is the communality, and the aim is to explain only the common variance of each variable.

Comparing EFA and PCA for the data in the example, the interpretation is similar. One reason to use PCA over EFA is that factor analysis may not work for technical reasons, whereas PCA will always work, because you are not modelling anything, simply collapsing observed variables. If you cannot get a factor analysis to run, report this and say that you will run a PCA instead. If you are only looking for themes in the data, then PCA is acceptable; but where you measure correlations with the idea that they reflect some underlying or latent variable, you should run an EFA.

INTERPRETING PCA COMPONENTS VS EFA FACTORS

Yes, some people say that components should not be interpreted, but I don't see why not, and I also question the point of extracting components and using them as variables if you decide a priori that the components carry no interpretation at all. The subtle difference is that EFA assumes there is at least one latent factor, a latent construct which the observed variables "measure". PCA does not assume this, so the interpretation of a component is something like (as I said in the lecture) "a theme in the data"; there is no claim that there is a latent construct such as intelligence or self-esteem. Consider the cigarette example with which I started the PCA lecture. There are three physical properties of cigarettes, but they are correlated so strongly that the data fall pretty much on a straight line. There is no reason why I cannot "interpret" the component (the straight line) as indicating a dimension on which the cigarettes have more or less of the physical properties in question; it would be silly not to, because that is indeed what it is. But there is no latent construct here. Of course, it may well be possible to relate the physical-properties dimension to health outcomes for heavy smokers, so this is a sensible and useful piece of analysis.


Applied Multivariate Statistical Modeling Prof. J. Maiti Department of Industrial Engineering and Management Indian Institute of Technology, Kharagpur Applied Multivariate Statistical Modeling Prof. J. Maiti Department of Industrial Engineering and Management Indian Institute of Technology, Kharagpur Lecture - 29 Multivariate Linear Regression- Model

More information

Course Introduction and Overview Descriptive Statistics Conceptualizations of Variance Review of the General Linear Model

Course Introduction and Overview Descriptive Statistics Conceptualizations of Variance Review of the General Linear Model Course Introduction and Overview Descriptive Statistics Conceptualizations of Variance Review of the General Linear Model PSYC 943 (930): Fundamentals of Multivariate Modeling Lecture 1: August 22, 2012

More information

My data doesn t look like that..

My data doesn t look like that.. Testing assumptions My data doesn t look like that.. We have made a big deal about testing model assumptions each week. Bill Pine Testing assumptions Testing assumptions We have made a big deal about testing

More information

Hierarchical Factor Analysis.

Hierarchical Factor Analysis. Hierarchical Factor Analysis. As published in Benchmarks RSS Matters, April 2013 http://web3.unt.edu/benchmarks/issues/2013/04/rss-matters Jon Starkweather, PhD 1 Jon Starkweather, PhD jonathan.starkweather@unt.edu

More information

Contingency Tables. Contingency tables are used when we want to looking at two (or more) factors. Each factor might have two more or levels.

Contingency Tables. Contingency tables are used when we want to looking at two (or more) factors. Each factor might have two more or levels. Contingency Tables Definition & Examples. Contingency tables are used when we want to looking at two (or more) factors. Each factor might have two more or levels. (Using more than two factors gets complicated,

More information

WELCOME! Lecture 13 Thommy Perlinger

WELCOME! Lecture 13 Thommy Perlinger Quantitative Methods II WELCOME! Lecture 13 Thommy Perlinger Parametrical tests (tests for the mean) Nature and number of variables One-way vs. two-way ANOVA One-way ANOVA Y X 1 1 One dependent variable

More information

Classification and Regression Trees

Classification and Regression Trees Classification and Regression Trees Ryan P Adams So far, we have primarily examined linear classifiers and regressors, and considered several different ways to train them When we ve found the linearity

More information

Chapter 19: Logistic regression

Chapter 19: Logistic regression Chapter 19: Logistic regression Self-test answers SELF-TEST Rerun this analysis using a stepwise method (Forward: LR) entry method of analysis. The main analysis To open the main Logistic Regression dialog

More information

Phenotypic factor analysis

Phenotypic factor analysis 1 Phenotypic factor analysis Conor V. Dolan & Michel Nivard VU, Amsterdam Boulder Workshop - March 2018 2 Phenotypic factor analysis A statistical technique to investigate the dimensionality of correlated

More information

Module 03 Lecture 14 Inferential Statistics ANOVA and TOI

Module 03 Lecture 14 Inferential Statistics ANOVA and TOI Introduction of Data Analytics Prof. Nandan Sudarsanam and Prof. B Ravindran Department of Management Studies and Department of Computer Science and Engineering Indian Institute of Technology, Madras Module

More information

Walkthrough for Illustrations. Illustration 1

Walkthrough for Illustrations. Illustration 1 Tay, L., Meade, A. W., & Cao, M. (in press). An overview and practical guide to IRT measurement equivalence analysis. Organizational Research Methods. doi: 10.1177/1094428114553062 Walkthrough for Illustrations

More information

Big Data Analysis with Apache Spark UC#BERKELEY

Big Data Analysis with Apache Spark UC#BERKELEY Big Data Analysis with Apache Spark UC#BERKELEY This Lecture: Relation between Variables An association A trend» Positive association or Negative association A pattern» Could be any discernible shape»

More information

Factor Analysis Using SPSS

Factor Analysis Using SPSS Factor Analysis Using SPSS For an overview of the theory of factor analysis please read Field (2000) Chapter 11 or refer to your lecture. Factor analysis is frequently used to develop questionnaires: after

More information

Principal Component Analysis. Applied Multivariate Statistics Spring 2012

Principal Component Analysis. Applied Multivariate Statistics Spring 2012 Principal Component Analysis Applied Multivariate Statistics Spring 2012 Overview Intuition Four definitions Practical examples Mathematical example Case study 2 PCA: Goals Goal 1: Dimension reduction

More information

An Investigation of the Accuracy of Parallel Analysis for Determining the Number of Factors in a Factor Analysis

An Investigation of the Accuracy of Parallel Analysis for Determining the Number of Factors in a Factor Analysis Western Kentucky University TopSCHOLAR Honors College Capstone Experience/Thesis Projects Honors College at WKU 6-28-2017 An Investigation of the Accuracy of Parallel Analysis for Determining the Number

More information

Hypothesis testing I. - In particular, we are talking about statistical hypotheses. [get everyone s finger length!] n =

Hypothesis testing I. - In particular, we are talking about statistical hypotheses. [get everyone s finger length!] n = Hypothesis testing I I. What is hypothesis testing? [Note we re temporarily bouncing around in the book a lot! Things will settle down again in a week or so] - Exactly what it says. We develop a hypothesis,

More information

Chapter 7: Correlation

Chapter 7: Correlation Chapter 7: Correlation Oliver Twisted Please, Sir, can I have some more confidence intervals? To use this syntax open the data file CIr.sav. The data editor looks like this: The values in the table are

More information

Psych 230. Psychological Measurement and Statistics

Psych 230. Psychological Measurement and Statistics Psych 230 Psychological Measurement and Statistics Pedro Wolf December 9, 2009 This Time. Non-Parametric statistics Chi-Square test One-way Two-way Statistical Testing 1. Decide which test to use 2. State

More information

Hypothesis Testing for Var-Cov Components

Hypothesis Testing for Var-Cov Components Hypothesis Testing for Var-Cov Components When the specification of coefficients as fixed, random or non-randomly varying is considered, a null hypothesis of the form is considered, where Additional output

More information

Manual Of The Program FACTOR. v Windows XP/Vista/W7/W8. Dr. Urbano Lorezo-Seva & Dr. Pere Joan Ferrando

Manual Of The Program FACTOR. v Windows XP/Vista/W7/W8. Dr. Urbano Lorezo-Seva & Dr. Pere Joan Ferrando Manual Of The Program FACTOR v.9.20 Windows XP/Vista/W7/W8 Dr. Urbano Lorezo-Seva & Dr. Pere Joan Ferrando urbano.lorenzo@urv.cat perejoan.ferrando@urv.cat Departament de Psicologia Universitat Rovira

More information

Interactions and Centering in Regression: MRC09 Salaries for graduate faculty in psychology

Interactions and Centering in Regression: MRC09 Salaries for graduate faculty in psychology Psychology 308c Dale Berger Interactions and Centering in Regression: MRC09 Salaries for graduate faculty in psychology This example illustrates modeling an interaction with centering and transformations.

More information