Practical: Phenotypic Factor Analysis

Big 5 dimensions Neuroticism & Extraversion in 361 female UvA students
- Exploratory Factor Analysis (EFA) using R (factanal) with Varimax and Promax rotation
- Confirmatory Factor Analysis (CFA) using OpenMx
Dolan & Abdellaoui, Boulder Workshop 2016

Neuroticism: identifies individuals who are prone to psychological distress
n1 - Anxiety: level of free floating anxiety
n2 - Angry Hostility: tendency to experience anger and related states such as frustration and bitterness
n3 - Depression: tendency to experience feelings of guilt, sadness, despondency and loneliness
n4 - Self-Consciousness: shyness or social anxiety
n5 - Impulsiveness: tendency to act on cravings and urges rather than delaying gratification
n6 - Vulnerability: general susceptibility to stress

Extraversion: quantity and intensity of energy directed outwards into the social world
e1 - Warmth: interest in and friendliness towards others
e2 - Gregariousness: preference for the company of others
e3 - Assertiveness: social ascendancy and forcefulness of expression
e4 - Activity: pace of living
e5 - Excitement Seeking: need for environmental stimulation
e6 - Positive Emotions: tendency to experience positive emotions

EFA
# Part 1: read the data
# clear the memory
rm(list=ls(all=TRUE))
# load OpenMx
library(OpenMx)
# set working directory
setwd("YOUR_WORKING_DIRECTORY")
# read the data
datb5=read.table('rdataf') # read the female data
# assign variable names
varlabs=c('sex',
 'n1', 'n2', 'n3', 'n4', 'n5', 'n6',
 'e1', 'e2', 'e3', 'e4', 'e5', 'e6',
 'o1', 'o2', 'o3', 'o4', 'o5', 'o6',
 'a1', 'a2', 'a3', 'a4', 'a5', 'a6',
 'c1', 'c2', 'c3', 'c4', 'c5', 'c6')
colnames(datb5)=varlabs
# select the variables of interest
isel=c(2:13) # selection of variables n1-n6, e1-e6
datb2=datb5[,isel] # the data frame that we'll use below.

EFA
# Part 2: summary statistics
Ss1=cov(datb2[,1:12]) # calculate the covariance matrix in females
print(round(Ss1,1))
Rs1=cov2cor(Ss1) # convert to correlation matrix
print(round(Rs1,2))
Ms1=apply(datb2[,1:12],2,mean) # females means
print(round(Ms1,2))
# End of part 2

> print(round(Ms1,2))
   n1    n2    n3    n4    n5    n6    e1    e2    e3    e4    e5    e6
23.22 20.38 23.35 23.99 27.62 20.00 31.23 28.81 23.36 25.43 27.78 31.13

> print(round(Rs1,2))
      n1    n2    n3    n4    n5    n6    e1    e2    e3    e4    e5    e6
n1  1.00  0.44  0.77  0.59  0.18  0.72 -0.25 -0.20 -0.29 -0.24 -0.14 -0.40
n2  0.44  1.00  0.45  0.30  0.27  0.44 -0.36 -0.22  0.07 -0.03  0.02 -0.33
n3  0.77  0.45  1.00  0.62  0.20  0.69 -0.28 -0.24 -0.36 -0.29 -0.11 -0.47
n4  0.59  0.30  0.62  1.00  0.15  0.59 -0.34 -0.24 -0.42 -0.19 -0.11 -0.39
n5  0.18  0.27  0.20  0.15  1.00  0.22  0.06  0.15  0.06  0.13  0.37  0.19
n6  0.72  0.44  0.69  0.59  0.22  1.00 -0.28 -0.14 -0.35 -0.27 -0.08 -0.43
e1 -0.25 -0.36 -0.28 -0.34  0.06 -0.28  1.00  0.46  0.13  0.28  0.15  0.59
e2 -0.20 -0.22 -0.24 -0.24  0.15 -0.14  0.46  1.00  0.14  0.19  0.37  0.38
e3 -0.29  0.07 -0.36 -0.42  0.06 -0.35  0.13  0.14  1.00  0.38  0.13  0.26
e4 -0.24 -0.03 -0.29 -0.19  0.13 -0.27  0.28  0.19  0.38  1.00  0.27  0.45
e5 -0.14  0.02 -0.11 -0.11  0.37 -0.08  0.15  0.37  0.13  0.27  1.00  0.30
e6 -0.40 -0.33 -0.47 -0.39  0.19 -0.43  0.59  0.38  0.26  0.45  0.30  1.00

EFA: scree plot
[Figure: scree plot, eigenvalues (y-axis) against 1:12 (x-axis)]
How many factors? Ambiguous...
- Eigenvalues > 1 suggests 3 factors
- Elbow criterion suggests 2 factors
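The eigenvalues behind a scree plot like this one can be approximated from the correlation matrix printed in Part 2. The sketch below is an illustration rather than the authors' code: it retypes the upper triangle of that printed matrix (rounded to two decimals, so the eigenvalues will differ slightly from those of the raw data), and the names `vals`, `R`, `ev` and `n_kaiser` are made up for this example.

```r
# Upper triangle of the printed correlation matrix, read row by row
# (n1 with n2..e6, then n2 with n3..e6, and so on). Rounded values.
vars <- c(paste0("n", 1:6), paste0("e", 1:6))
vals <- c(
  0.44, 0.77, 0.59, 0.18, 0.72, -0.25, -0.20, -0.29, -0.24, -0.14, -0.40,
  0.45, 0.30, 0.27, 0.44, -0.36, -0.22, 0.07, -0.03, 0.02, -0.33,
  0.62, 0.20, 0.69, -0.28, -0.24, -0.36, -0.29, -0.11, -0.47,
  0.15, 0.59, -0.34, -0.24, -0.42, -0.19, -0.11, -0.39,
  0.22, 0.06, 0.15, 0.06, 0.13, 0.37, 0.19,
  -0.28, -0.14, -0.35, -0.27, -0.08, -0.43,
  0.46, 0.13, 0.28, 0.15, 0.59,
  0.14, 0.19, 0.37, 0.38,
  0.38, 0.13, 0.26,
  0.27, 0.45,
  0.30)

R <- matrix(0, 12, 12, dimnames = list(vars, vars))
R[lower.tri(R)] <- vals   # column-wise fill of the lower triangle
R <- R + t(R)             # mirror into the upper triangle
diag(R) <- 1              # unit diagonal of a correlation matrix

ev <- eigen(R, symmetric = TRUE, only.values = TRUE)$values
print(round(ev, 2))
n_kaiser <- sum(ev > 1)   # count for the "eigenvalues > 1" criterion
print(n_kaiser)
# Uncomment to draw the scree plot:
# plot(1:12, ev, type = "b", xlab = "1:12", ylab = "eigenvalues")
```

Because the eigenvalues of a correlation matrix sum to the number of variables, a value above 1 means a component accounts for more variance than a single standardized item would.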

EFA
$\Sigma_y = \Lambda \Psi \Lambda^t + \Theta$
[Path diagram: common factors N and E, each with variance 1, and the indicators n1-n6 and e1-e6]

EFA
$\Sigma_y = \Lambda \Psi \Lambda^t + \Theta$

Matrix (12x2) of factor loadings $\Lambda$ (note: loadings < .1 not shown):
   Factor1 Factor2
n1   0.851
n2   0.486
n3   0.838
n4   0.647  -0.140
n5   0.464   0.501
n6   0.801
e1           0.614
e2           0.537
e3  -0.294   0.199
e4           0.466
e5   0.102   0.497
e6  -0.198   0.731

Factor correlation matrix (2x2) $\Psi$:
        Factor1 Factor2
Factor1   1.000  -0.368
Factor2  -0.368   1.000

Diagonal covariance matrix (12x12) of residuals ($\Theta$):
   n1    n2    n3    n4    n5    n6    e1    e2    e3    e4    e5    e6
0.259 0.735 0.237 0.496 0.705 0.330 0.574 0.707 0.831 0.738 0.780 0.321

[Path diagram: common factors N and E, each with variance 1, and the indicators n1-n6 and e1-e6]
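The slides do not show the `factanal` call that produced these estimates (a later slide prints an object named `fa1s1pr`). As a hedged illustration of how such a fit might be obtained, the sketch below runs `factanal` with Varimax and Promax rotation on simulated data. The data, seed, loadings and variable names are invented for the example and have nothing to do with the UvA sample.

```r
set.seed(2016)                      # arbitrary seed, for reproducibility only
n  <- 500
f1 <- rnorm(n)                                  # first common factor
f2 <- -0.4 * f1 + sqrt(1 - 0.4^2) * rnorm(n)    # second factor, correlated -0.4 with f1
lam <- c(0.8, 0.7, 0.6)                         # hypothetical standardized loadings

# Three indicators per factor, each with unit variance
x <- cbind(sapply(lam, function(l) l * f1 + sqrt(1 - l^2) * rnorm(n)),
           sapply(lam, function(l) l * f2 + sqrt(1 - l^2) * rnorm(n)))
colnames(x) <- c("a1", "a2", "a3", "b1", "b2", "b3")

fa_varimax <- factanal(x, factors = 2, rotation = "varimax")  # orthogonal rotation
fa_promax  <- factanal(x, factors = 2, rotation = "promax")   # oblique rotation

print(round(fa_promax$loadings[, 1:2], 3))   # rotated loadings
print(round(fa_promax$uniquenesses, 3))      # residual variances (standardized)
```

Rotation changes the loadings but not the fit: both calls return the same uniquenesses and the same chi-square test, because `factanal` rotates after the maximum likelihood solution has been found.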

[Figure panels: unrotated, rotated]

EFA
Goodness of fit of EFA 2 factor model.
Test of the hypothesis that 2 factors are sufficient.
The chi square statistic is 289.76 on 43 degrees of freedom.
The p-value is 2.52e-38
By this statistical criterion the model is judged to be acceptable if the p-value is greater than the chosen alpha (e.g. alpha=.05).
By the statistical criterion, we'd reject the model!
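The 43 degrees of freedom and the p-value can be checked by hand. The sketch below uses the standard degrees-of-freedom formula for an exploratory factor model with p indicators and m factors, and looks up the reported chi-square in the upper tail. The variable names are chosen for this example only.

```r
p <- 12   # number of indicators
m <- 2    # number of common factors

# Degrees of freedom of the m-factor EFA model
df <- ((p - m)^2 - (p + m)) / 2
print(df)

chisq <- 289.76                                  # statistic reported on the slide
pval  <- pchisq(chisq, df, lower.tail = FALSE)   # upper-tail probability
print(pval)
print(pval > 0.05)   # FALSE here, so the 2-factor model is rejected at alpha = .05
```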

CFA
# Part 4A: saturated model
ny=12 # number of indicators
ne=2 # expected number of common factors
varnames=colnames(datb2) # var names
### fit the saturated model ###
# define the means and covariance matrix in OpenMx to obtain the saturated model loglikelihood
Rs1=mxMatrix(type='Stand',nrow=ny,ncol=ny,free=TRUE,value=.05,
 lbound=-.9,ubound=.9,name='cor') # 12x12 correlation matrix
Sds1=mxMatrix(type='Diag',nrow=ny,ncol=ny,free=TRUE,value=5,name='sds') # 12x12 diagonal matrix (st devs)
Mean1=mxMatrix(type='Full',nrow=1,ncol=ny,free=TRUE,value=25,name='mean1') # 1x12 vector means
MkS1=mxAlgebra(expression=sds%*%cor%*%sds,name='Ssat1') # expected covariance matrix
satmodels1=mxModel('part1',Rs1, Sds1, Mean1,MkS1) # assemble the model
# data + estimation function
satdats1=mxModel("part2",
 mxData( observed=datb2, type="raw"), # the data
 mxExpectationNormal( covariance="part1.Ssat1", means="part1.mean1", dimnames=varnames), # data & expected cov/means
 mxFitFunctionML() ) # the fit function
# fit the saturated model...
Models1 <- mxModel("models1", satmodels1, satdats1,
 mxAlgebra(part2.objective, name="minus2loglikelihood"),
 mxFitFunctionAlgebra("minus2loglikelihood"))
Models1_out <- mxRun(Models1)

CFA
> summary(Models1_out)
observed statistics: 4332
estimated parameters: 90
degrees of freedom: 4242
-2 log likelihood: 23578.09
number of observations: 361
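The counts in this summary follow directly from the model definition: the saturated model estimates every correlation, every standard deviation and every mean. A short check, with variable names invented for the example:

```r
ny    <- 12    # number of indicators
n_obs <- 361   # number of observations (students)

n_cor  <- ny * (ny - 1) / 2      # free correlations in the 'Stand' matrix
n_par  <- n_cor + ny + ny        # correlations + standard deviations + means
n_stat <- n_obs * ny             # raw-data observed statistics (one per score)
df_sat <- n_stat - n_par

print(c(correlations = n_cor, parameters = n_par,
        statistics = n_stat, df = df_sat))
```

The observed statistics equal 361 x 12 exactly, which is what one would expect with raw data and no missing scores.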

CFA
$\Sigma_y = \Lambda \Psi \Lambda^t + \Theta$
[Path diagram: common factors N and E, each with variance 1, and the indicators n1-n6 and e1-e6]

CFA
Define factor loading matrix $\Lambda$
$\Sigma_y = \Lambda \Psi \Lambda^t + \Theta$

Ly=mxMatrix(type='Full',nrow=ny,ncol=ne,
 free=matrix(c(
 T,F,
 T,F,
 T,F,
 T,F,
 T,F,
 T,F,
 F,T,
 F,T,
 F,T,
 F,T,
 F,T,
 F,T),ny,ne,byrow=T),
 values=c(4,4,4,4,4,4,0,0,0,0,0,0,
 0,0,0,0,0,0,4,4,4,4,4,4), # read column-wise
 labels=matrix(c(
 'f1_1','f1_2',
 'f2_1','f2_2',
 'f3_1','f3_2',
 'f4_1','f4_2',
 'f5_1','f5_2',
 'f6_1','f6_2',
 'f7_1','f7_2',
 'f8_1','f8_2',
 'f9_1','f9_2',
 'f10_1','f10_2',
 'f11_1','f11_2',
 'f12_1','f12_2'),ny,ne,byrow=T),name='Ly')

[Path diagram: N loads on n1-n6, E loads on e1-e6; factor variances fixed to 1]

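CFA
Define covariance matrix of residuals $\Theta$
$\Sigma_y = \Lambda \Psi \Lambda^t + \Theta$

# Te holds the residual standard deviations; the model uses Te%*%t(Te) for the residual covariance matrix
Te=mxMatrix(type='Diag',nrow=ny,ncol=ny,
 labels=c('rn1','rn2','rn3','rn4','rn5','rn6',
 're1','re2','re3','re4','re5','re6'),
 free=TRUE,value=10,name='Te')

[Path diagram: common factors N and E, each with variance 1, and the indicators n1-n6 and e1-e6]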

CFA
Define covariance matrix of factors $\Psi$
$\Sigma_y = \Lambda \Psi \Lambda^t + \Theta$

## latent correlation matrix
Ps=mxMatrix(type='Symm',nrow=ne,ncol=ne,
 free=c(FALSE,TRUE,FALSE),
 labels=c('v1_0','r12_0', 'v2_0'),
 values=c(1,.0,1),name='Ps')

NOTE: scaling of the common factors by fixing the variances to equal 1. $\Psi$ is a correlation matrix!

[Path diagram: common factors N and E, each with variance 1, and the indicators n1-n6 and e1-e6]

CFA
# means
Tau=mxMatrix(type='Full',nrow=1,ncol=ny,free=TRUE,value=25,
 labels=c('mn1','mn2','mn3','mn4','mn5','mn6',
 'me1','me2','me3','me4','me5','me6'),
 name='Means')

CFA
$\Sigma_y = \Lambda \Psi \Lambda^t + \Theta$

MKS=mxAlgebra(expression=Ly%*%(Ps)%*%t(Ly)+Te%*%t(Te),name='Sigma')
MKM=mxAlgebra(expression=Means,name='means')
... assemble the model and run
CFM1_out = mxRun(CFAmodel1)

[Path diagram: common factors N and E, each with variance 1, and the indicators n1-n6 and e1-e6]
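The algebra for the expected covariance matrix can be tried out in base R without OpenMx. The sketch below uses a toy model with four indicators and two factors; all numbers are hypothetical and chosen only so the arithmetic is easy to follow. `Ly`, `Ps` and `Te` mirror the matrix names used in the slides.

```r
# Toy simple-structure loadings: items 1-2 on factor 1, items 3-4 on factor 2
Ly <- matrix(c(4, 0,
               3, 0,
               0, 2,
               0, 5), nrow = 4, ncol = 2, byrow = TRUE)

# Factor correlation matrix (variances fixed to 1, correlation -0.5)
Ps <- matrix(c(1, -0.5,
              -0.5, 1), nrow = 2, ncol = 2)

# Diagonal matrix of residual standard deviations
Te <- diag(c(2, 3, 1, 2))

# Expected covariance matrix; Te %*% t(Te) squares the diagonal,
# which keeps the residual variances non-negative
Sigma <- Ly %*% Ps %*% t(Ly) + Te %*% t(Te)
print(Sigma)
```

For example, the variance of item 1 is 4^2 + 2^2 = 20, and the covariance between item 1 and item 3 is 4 x (-0.5) x 2 = -4: items on different factors covary only through the factor correlation.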

CFA
> mxCompare(Models1_out,CFM1_out)
     base comparison ep minus2LL   df      AIC   diffLL diffdf            p
1 models1       <NA> 90 23578.09 4242 15094.09       NA     NA           NA
2 models1       CFM1 37 23975.63 4295 15385.63 397.5483     53 2.785297e-54

This model does not fit (but we already know this from the EFA results).
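The likelihood ratio test in this table can be reproduced from the printed values alone. The sketch below is a hand calculation, not output from OpenMx; the variable names are invented, and the AIC lines only confirm that the printed AIC values are consistent with minus2LL minus twice the degrees of freedom.

```r
# Values copied from the comparison table
m2ll_sat <- 23578.09; df_sat <- 4242; ep_sat <- 90   # saturated model
m2ll_cfa <- 23975.63; df_cfa <- 4295; ep_cfa <- 37   # two-factor CFA

# CFA parameters: 12 loadings + 1 factor correlation + 12 residuals + 12 means
ep_count <- 12 + 1 + 12 + 12

diffLL <- m2ll_cfa - m2ll_sat          # chi-square statistic of the LR test
diffdf <- df_cfa - df_sat              # equals ep_sat - ep_cfa
pval   <- pchisq(diffLL, diffdf, lower.tail = FALSE)
print(c(diffLL = diffLL, diffdf = diffdf, p = pval))

# The printed AIC values match minus2LL - 2 * df
aic_sat <- m2ll_sat - 2 * df_sat
aic_cfa <- m2ll_cfa - 2 * df_cfa
print(c(aic_sat = aic_sat, aic_cfa = aic_cfa))
```

The hand-computed difference (397.54) differs from the table (397.5483) only in the decimals lost to rounding of the printed minus2LL values.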

CFA
> print(round(fa1s1pr$loadings[,1:2],4))
   Factor1 Factor2
n1  0.8515 -0.0252
n2  0.4858 -0.0682
n3  0.8377 -0.0870
n4  0.6465 -0.1404
n5  0.4638  0.5005
n6  0.8014 -0.0444
e1 -0.0907  0.6142
e2 -0.0126  0.5367
e3 -0.2943  0.1988
e4 -0.0996  0.4664
e5  0.1024  0.4973
e6 -0.1978  0.7308

> round(est_Ly,3)
       [,1]  [,2]
 [1,] 4.923 0.000
 [2,] 2.383 0.000
 [3,] 5.197 0.000
 [4,] 2.993 0.000
 [5,] 0.889 0.000
 [6,] 3.785 0.000
 [7,] 0.000 2.565
 [8,] 0.000 2.225
 [9,] 0.000 1.822
[10,] 0.000 1.873
[11,] 0.000 1.576
[12,] 0.000 3.691

To do: free the cross loadings Ly[5,2] and Ly[9,1]

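CFA
To do: free the cross loadings Ly[5,2] (n5 on E) and Ly[9,1] (e3 on N)
[Path diagram: common factors N and E, each with variance 1 and correlation r, and the indicators n1-n6 and e1-e6]
Test whether the cross loadings differ from zero using the likelihood ratio test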

CFA
  base comparison ep minus2LL   df      AIC   diffLL diffdf            p
1 CFM1       <NA> 39 23891.96 4293 15305.96       NA     NA           NA
2 CFM1       CFM1 37 23975.63 4295 15385.63 83.67033      2 6.779838e-19

Given alpha=.05, we reject the hypothesis Ly[5,2] = Ly[9,1] = 0.
Therefore either one or both are not equal to zero.

> print(round(est_Ly,2))
       [,1] [,2]
 [1,]  4.88 0.00
 [2,]  2.38 0.00
 [3,]  5.20 0.00
 [4,]  3.02 0.00
 [5,]  2.30 2.32
 [6,]  3.80 0.00
 [7,]  0.00 2.50
 [8,]  0.00 2.23
 [9,] -1.47 0.89
[10,]  0.00 1.85
[11,]  0.00 1.74
[12,]  0.00 3.73

CFA
Inspect the correlation matrix $\Psi$:
 1.000 -0.569
-0.569  1.000
[Path diagram: common factors N and E, each with variance 1 and correlation -.569, and the indicators n1-n6 and e1-e6]

CFA
Reliability of the indicators

est_Ly=mxEval(CFAmodel1.Ly,CFM2_out)
est_Ps=mxEval(CFAmodel1.Ps,CFM2_out)
est_Te=mxEval(CFAmodel1.Te,CFM2_out)
est_Te=est_Te^2 # residual standard deviations to residual variances
Sfit1=est_Ly%*%est_Ps%*%t(est_Ly) # variance due to the common factors
Sfit=Sfit1+est_Te # total expected covariance matrix
rel=diag(Sfit1)/diag(Sfit)
print(round(rel,3))

> round(diag(Sfit1)/diag(Sfit),2)
 [1] 0.74 0.26 0.77 0.50 0.28 0.66 0.42 0.25 0.17 0.26 0.16 0.74

Variance of n1 due to N divided by the total variance of n1: 23.77 / 32.24 = .74.
The common factor N explains 74% of the variance in item 1 (n1).

[Path diagram: common factors N and E, each with variance 1 and correlation r, and the indicators n1-n6 and e1-e6]
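The numerator of the reliability, the variance due to the common factors, can be worked out by hand from the printed estimates. The sketch below does this for n1 (a single loading) and n5 (which loads on both factors), using the rounded loadings and the factor correlation of -.569 reported in the slides. The residual variances are not printed, so only the common part is computed, and results differ slightly from the slide because of rounding. The small matrices are illustrative, not output from the fitted model.

```r
# Rounded estimates copied from the slides: rows n1 and n5 of est_Ly
est_Ly <- matrix(c(4.88, 0.00,
                   2.30, 2.32), nrow = 2, ncol = 2, byrow = TRUE,
                 dimnames = list(c("n1", "n5"), c("N", "E")))
est_Ps <- matrix(c(1, -0.569,
                  -0.569, 1), nrow = 2, ncol = 2)

# Variance due to the common factors: diagonal of Ly Ps Ly'
common <- setNames(diag(est_Ly %*% est_Ps %*% t(est_Ly)), rownames(est_Ly))
print(round(common, 3))

# For n5 the cross loading adds a covariance term:
# l1^2 + l2^2 + 2 * l1 * l2 * r
common_n5 <- 2.30^2 + 2.32^2 + 2 * 2.30 * 2.32 * (-0.569)

# Reliability of n1 from the variances quoted on the slide
rel_n1 <- 23.77 / 32.24
print(round(rel_n1, 2))
```

Because N and E are negatively correlated, the two positive loadings of n5 partly cancel: its common variance is smaller than the sum of the two squared loadings.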