The General Linear Model Monday, Lecture 2 Jeanette Mumford University of Wisconsin - Madison
How we're approaching the GLM
- Regression for behavioral data, without using matrices, to understand least squares
- Then using matrices: with more than 1 regressor, you need this
What you'll get out of this
- What is least squares?
- What is a residual?
- How do you multiply a matrix and a vector?
- What are degrees of freedom?
- How do you obtain the estimates for the GLM using matrix math, including the variance?
Do you remember the equation for a line?
y = b + mx

[Figure: scatter plot of Reaction Time (s) vs. Age with a fitted line]

The line describes the population mean reaction time at each age:

RT_i = β_0 + β_1 · Age_i

The fit isn't perfect, so we must account for error:

RT_i = β_0 + β_1 · Age_i + ε_i
The Model
For the i-th observational unit:

y_i = β_0 + β_1 x_i + ε_i

- y_i: the dependent (random) variable
- x_i: the independent variable (not random)
- β_0, β_1: model parameters
- ε_i: random error, how the observation deviates from the population mean
Simple summary
mean(y_i) = β_0 + β_1 x_i
var(y_i) = σ²
Fitting the Model
[Figure: Reaction Time (s) vs. Age scatter plot with candidate lines]
Q: Which line fits the data best?
Minimize the distance between the data and the line (the error term). Absolute distance? Squared distance?
Least Squares
Minimize the squared differences:

minimize Σ_i (y_i − (β_0 + β_1 x_i))²

- Works out nicely distribution-wise
- Easy minimization problem
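The least squares criterion can be sketched in NumPy. This is a minimal example with made-up reaction-time data (not from the lecture), using the closed-form solution for one regressor plus an intercept:

```python
# Least squares for simple regression: find the b0, b1 that minimize
# sum_i (y_i - (b0 + b1 * x_i))^2. Data below are hypothetical.
import numpy as np

age = np.array([20.0, 30.0, 40.0, 50.0, 60.0])   # x: independent variable
rt = np.array([0.40, 0.45, 0.55, 0.58, 0.66])    # y: reaction time (made up)

# Closed-form least squares solution for slope and intercept
b1 = np.sum((age - age.mean()) * (rt - rt.mean())) / np.sum((age - age.mean()) ** 2)
b0 = rt.mean() - b1 * age.mean()

residuals = rt - (b0 + b1 * age)  # data minus fitted line
sse = np.sum(residuals ** 2)      # the quantity least squares minimizes
```

Any other (b0, b1) pair produces a larger `sse` than the least squares pair.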
Bias and Variance
[Figure: four panels illustrating each combination]
high bias / low variance · low bias / high variance · high bias / high variance · low bias / low variance
Property of least squares
Gauss-Markov assumptions:
- the error has mean 0
- the errors aren't correlated
- the variance is the same for all observations

Under these assumptions, the least squares estimates are unbiased and have the lowest variance among all linear unbiased estimators.
What about the variance?
We also need an estimate for σ². Start with the sum of squared errors, then divide by the appropriate degrees of freedom:

σ̂² = Σ_i (y_i − ŷ_i)² / (N − p)

Degrees of freedom = (# of independent pieces of information) − (# of parameters in the model)
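As a quick sketch, the variance estimate divides the sum of squared errors by N − p. The residual values below are made up; p = 2 stands for an intercept-plus-slope model:

```python
# Estimate the error variance from residuals, dividing by the
# degrees of freedom N - p (hypothetical residuals).
import numpy as np

residuals = np.array([0.002, -0.013, 0.022, -0.013, 0.002])  # made-up residuals
N = len(residuals)   # number of independent pieces of information
p = 2                # parameters in the model (intercept and slope)

sse = np.sum(residuals ** 2)
sigma2_hat = sse / (N - p)   # divide by degrees of freedom, not N
```

Dividing by N − p rather than N is what makes the estimate unbiased: p degrees of freedom were spent estimating the betas.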
Take away up to this point
We typically use least squares estimation to estimate the betas in regression.
Gauss-Markov: minimum variance among all linear unbiased estimators.
You don't need to do regression this way
Anybody ever hear of using absolute error instead of squared error? Do you know the context?
Anybody ever hear of purposely biasing (!) an estimate in order to reduce variability? Do you know the context?
Multiple Linear Regression Add more parameters to the model Time for linear algebra!
Matrices
A matrix with 2 rows and 3 columns is a 2×3 matrix; an element a_ij is indexed by its row i and column j.
Matrices
Square matrix: same # of rows and columns.
Vector: a column (row) vector has 1 column (row).
Matrices
Transpose (written Aᵀ or A′): swap columns and rows.
Addition and subtraction are element-wise.
Matrices
Multiplication is trickier: the number of columns of the first matrix must match the number of rows of the second.
Matrices: Multiplication, worked example

[1 2] × [4 2; 1 4] = [1×4 + 2×1,  1×2 + 2×4] = [6, 10]
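The worked example above in NumPy, where `@` is matrix multiplication:

```python
# A 1x2 row vector times a 2x2 matrix: the 2 columns of the first
# operand match the 2 rows of the second, so the product is defined.
import numpy as np

row = np.array([[1, 2]])     # 1x2
A = np.array([[4, 2],
              [1, 4]])       # 2x2

product = row @ A            # [[1*4 + 2*1, 1*2 + 2*4]] = [[6, 10]]
```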
You try it out
- [1 2 3 4] × (1, 1, 1, 1)ᵀ = ?? A 1×4 row times a 4×1 column gives a 1×1 result: 1 + 2 + 3 + 4 = 10.
- (1, 1, 1, 1)ᵀ × [1 2 3 4]: a 4×1 column times a 1×4 row gives a 4×4 matrix.
- (1, 1, 1, 1)ᵀ × (1, 2, 3, 4)ᵀ: two 4×1 columns can't be multiplied; the inner dimensions don't match.
- The 4×2 matrix with rows (1, 1), (1, 2), (1, 3), (1, 4) times (β_0, β_1)ᵀ gives β_0 + β_1 x_i in each row: the design-matrix form.
Matrix Inverse
Denoted A⁻¹. Only defined for square matrices, and only exists if the matrix is full rank: all columns (rows) are linearly independent. (I'll spare the details.)
Rank Deficient Matrices
Examples: a matrix where 2 × column 1 = column 3, and a matrix where column 1 + column 2 = column 3.
SPM can handle rank deficiency, if the contrasts are specified properly.
Can you find the rank deficiency??
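You can check for rank deficiency numerically. These two hypothetical 3×3 matrices reproduce the patterns above (column 3 a multiple of column 1, and column 3 the sum of columns 1 and 2):

```python
# Detecting rank deficiency with NumPy: a full-rank 3x3 matrix would
# have rank 3; these two have rank 2, so they are not invertible.
import numpy as np

# column 3 = 2 * column 1
X1 = np.array([[1, 0, 2],
               [1, 1, 2],
               [1, 2, 2]])

# column 3 = column 1 + column 2
X2 = np.array([[1, 0, 1],
               [1, 1, 2],
               [1, 2, 3]])

rank1 = np.linalg.matrix_rank(X1)  # 2
rank2 = np.linalg.matrix_rank(X2)  # 2
```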
Inverting a rectangular matrix
If the columns (only) are linearly independent, then XᵀX is invertible.
Pseudoinverse: X⁺ = (XᵀX)⁻¹Xᵀ
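A sketch of the pseudoinverse for a full-column-rank rectangular matrix, compared against NumPy's built-in `np.linalg.pinv` (the 4×2 design matrix here is hypothetical):

```python
# Pseudoinverse X+ = (X^T X)^{-1} X^T for a tall matrix whose
# columns are linearly independent.
import numpy as np

X = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0],
              [1.0, 4.0]])   # 4x2, full column rank

pinv_manual = np.linalg.inv(X.T @ X) @ X.T   # 2x4
pinv_numpy = np.linalg.pinv(X)               # same result here
```

When X has full column rank the two agree, and X⁺X is the 2×2 identity, which is exactly what makes the GLM estimate below work.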
Back to linear regression

Y = Xβ + ε
Y: (n×1)   X: (n×4)   β: (4×1)   ε: (n×1)
Viewing the Design Matrix
Look at the actual numbers: columns for M, F, and age.
Or look at an image representation (darker = smaller #), with columns M, F, age.
Multiple Linear Regression
The distribution of Y is multivariate Normal:
Y ~ N(Xβ, σ²I), since ε has mean vector (0, 0, …, 0)ᵀ.
Multiple Linear Regression
β̂ = (XᵀX)⁻¹XᵀY is really easy to derive.
Same as least squares, but much easier to understand and write code for. Thanks, linear algebra!
Multiple Linear Regression
σ̂² = (Y − Xβ̂)ᵀ(Y − Xβ̂) / (N − p)
where N = length(Y) and p = length(β), or equivalently p = rank(X).
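Putting the pieces together, here is a minimal sketch of GLM estimation with matrix math, on simulated data (the design, true betas, and noise level are all made up for illustration):

```python
# GLM estimation via matrix algebra:
#   beta_hat   = (X^T X)^{-1} X^T Y
#   sigma2_hat = residual sum of squares / (N - p)
import numpy as np

rng = np.random.default_rng(0)
N = 20
age = rng.uniform(20, 60, size=N)
X = np.column_stack([np.ones(N), age])        # N x 2 design matrix
beta_true = np.array([0.3, 0.005])            # hypothetical true parameters
Y = X @ beta_true + rng.normal(0, 0.05, size=N)

beta_hat = np.linalg.inv(X.T @ X) @ X.T @ Y   # parameter estimates
resid = Y - X @ beta_hat                      # residuals
p = np.linalg.matrix_rank(X)                  # = 2 here (full column rank)
sigma2_hat = resid @ resid / (N - p)          # variance estimate
cov_beta = sigma2_hat * np.linalg.inv(X.T @ X)  # estimated Var(beta_hat)
```

A defining property of least squares is visible in the output: the residuals are orthogonal to every column of X (Xᵀe = 0).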
Statistical Properties
E(β̂) = β, so the estimate is unbiased.
Var(β̂) = σ²(XᵀX)⁻¹, but we don't know σ².
Take away
Matrix algebra makes GLM estimation waaay easier.
Make sure you're comfortable multiplying a matrix and a vector.
It's handy to know how to estimate the parameters.
Ask me some questions
Do you know the answers?
- What is least squares?
- What is a residual?
- How do you multiply a matrix and a vector?
- What are degrees of freedom?
- How do you obtain the estimates for the GLM using matrix math, including the variance?
Questions??