8. Modeling with higher-level variables

This chapter is concerned with predictor variables measured at a higher level; in this case at district level rather than house level. We consider how to replicate data from an input file, and how to estimate and display cross-level interactions involving categorical and continuous variables, as linear terms and as polynomials.

Model 12: cross-level interactions between Envq and Type

Open model10.wsz (to keep things simple we will ignore Size-5 in the random part).

The data for the higher-level variable are in an external file, together with the district number.

File on main menu
  ASCII text file input

Name the columns to reflect that there is a district number and an environmental quality index for each of the 50 districts.
To replicate the index to the house level, first sort on Distno and carry Envq. Then we need to merge the short data (the 50 values, one per district) onto the long data for every house; this is the opposite of Unreplicate (Take).

Data manipulation on main menu
  Merge(replicate)
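For readers who want to see what replication does to the data, the following is a minimal sketch in Python rather than MLwiN. The toy values and the name house_district are assumptions made for illustration; only Distno and Envq come from the text. It is not how MLwiN implements the merge, only the same idea on a small scale.

```python
# Hypothetical toy data: 3 districts instead of 50, 6 houses instead of 1126.
# Donor (short) data: one (Distno, Envq) pair per district, in file order.
donor = [(3, 40.0), (1, 55.0), (2, 62.0)]

# Receiver (long) data: the district identifier held for each house.
house_district = [1, 1, 2, 3, 3, 3]

# Sort the donor rows on Distno, carrying Envq along with each identifier.
# (The lookup below does not need sorted input; this mirrors the sort step.)
donor_sorted = sorted(donor, key=lambda row: row[0])

# Build a lookup from district identifier to its Envq value.
envq_by_district = dict(donor_sorted)

# Replicate: each house receives the Envq value of its own district.
envq_house = [envq_by_district[d] for d in house_district]

print(envq_house)  # [55.0, 55.0, 62.0, 40.0, 40.0, 40.0]
```

The replicated column has one value per house, and every house in the same district carries the same value.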
The Merge screen merges data from columns whose length corresponds to the number of units at a higher level into a column whose length equals the number of units at a specified lower level. The merging is based on matching the IDs in the higher- and lower-level columns; you must specify which columns hold these identifiers. In the above screen the 50 donor identifiers are the district numbers which we have just read in, and the 1126 receiver identifiers are the level 2 (district) identifiers held in the house-level data. The Envq index is replicated and stored back in the Envq column.

The revised data is shown by

To create the cross-level interactions between house type and the environmental index:

Model on main menu
  Equations
  Add term
    Envq (that is, as a main effect; uncentered, as zero means something)
  Add term
    Order 1 (for 1st-order interactions)
    Variable: Envq
    Variable: Type, with Terr as the reference category
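The terms added above can be pictured as extra columns in the fixed part of the model. The sketch below is an illustration in Python, not MLwiN output: the toy rows are hypothetical, Size-5 is left out for brevity, and the function name design_row is an assumption. It shows that each interaction column is the product of a house-type dummy and the district-level Envq, with Terr as the reference category.

```python
# Hypothetical toy rows: (house type, district Envq replicated to house level).
houses = [("Terr", 40.0), ("Semi", 40.0), ("Detached", 62.0), ("Semi", 55.0)]


def design_row(house_type, envq):
    """Fixed-part columns for Type, Envq and their first-order interaction."""
    # Terr is the reference category, so it has no dummy of its own.
    semi = 1.0 if house_type == "Semi" else 0.0
    detached = 1.0 if house_type == "Detached" else 0.0
    return {
        "Cons": 1.0,
        "Semi": semi,
        "Detached": detached,
        "Envq": envq,                       # main effect, uncentered
        "Semi.envq": semi * envq,           # differential slope, Semi vs Terr
        "Detached.envq": detached * envq,   # differential slope, Detached vs Terr
    }


rows = [design_row(house_type, envq) for house_type, envq in houses]
for row in rows:
    print(row)
```

For a terrace both interaction columns are zero, so the coefficient on Envq alone is the slope for terraces; the interaction coefficients are differences from that slope.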
More iterations to convergence, and use RIGLS. (The variance term for Detached was removed to get partial convergence, then re-inserted.)

Questions 8.1
What does the constant measure?
0.103envq?
0.170Semi.envq
0.103Detached.envq

We can test whether all three terms involving Envq are significant.
Clearly the differential term for Detached is not, but the other two terms are. It would be helpful to get a graph of this cross-level interaction.

Model on main menu
  Customised predictions

In the Setup window, consider each of the variables in turn and choose the values wanted for each predictor.
  Click on Size and Change Range to just 5 [making predictions for just a 5-roomed house]
  Click on Type; Change Range from Mean to Category and tick all three types
  Click on Envq; Change Range to Percentiles; specify the 5th, 25th, 50th, 75th and 95th percentiles
  Click on Cons; ensure the value is set to 1
  Click on District34.Cons; ensure the value is set to 0
  Fill Grid
  Predict [ignore warning]
  Predictions

This gives us the predicted house prices and the 95% confidence intervals for the combinations of 3 types of house and 5 percentiles: 15 sets of predictions in all.
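The prediction grid is simply every combination of the chosen values. As a rough sketch in Python (not MLwiN itself), the Envq values below are hypothetical stand-ins for the 50 district values, and the percentile rule used by statistics.quantiles may differ slightly from the one MLwiN applies.

```python
from itertools import product
from statistics import quantiles

# Hypothetical Envq values, one per district (21 districts here, not 50).
envq = [float(v) for v in range(0, 101, 5)]

# 99 cut points; cuts[p - 1] is the p-th percentile (inclusive method).
cuts = quantiles(envq, n=100, method="inclusive")
envq_values = [cuts[p - 1] for p in (5, 25, 50, 75, 95)]

types = ["Terr", "Semi", "Detached"]

# One row per combination; Size, Cons and District34.Cons are held fixed.
grid = [
    {"Size": 5, "Type": t, "Envq": e, "Cons": 1, "District34.Cons": 0}
    for t, e in product(types, envq_values)
]

print(len(grid))  # 15
```

Three types crossed with five percentiles give the 15 rows for which predictions and intervals are produced.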
Plot Grid
  Y: mean.pred
  X: envq.pred
  Grouped by: type.pred
  Tick 95% CIs as lines
to get the following graph.
For all three types of property, house prices increase with environmental quality.

Model 13: a second-order polynomial for Envq

All the models we have fitted so far have assumed a linear relationship between the response and the continuous predictor, be it size or environmental quality. Now we will fit a second-order polynomial to assess whether the relationship is curved.

In the Equations window
  Click on Envq
  Modify term
  Click on polynomial and choose Polynomial of degree 2 (see note 1)
  More to convergence

On inspection it looks as if there may be a curve for Detached.

Customised predictions
  Setup window: keep everything the same
  Predict [ignore warning]
  Predictions

Note 1: This will create a new variable, the square of Envq, and interact it with Semi and Detached, as Type and Envq are already involved in a 1st-order interaction.
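To make the note concrete, here is a small sketch in Python of the columns a degree 2 polynomial adds once Envq already interacts with Type. The column name envq2.det appears later in this chapter; envq2 and envq2.semi are assumed by analogy, and the function name quadratic_terms is hypothetical.

```python
def quadratic_terms(house_type, envq):
    """Envq columns for a degree-2 polynomial interacted with house type."""
    # Terr is the reference category, so it has no dummy of its own.
    semi = 1.0 if house_type == "Semi" else 0.0
    detached = 1.0 if house_type == "Detached" else 0.0
    envq_sq = envq ** 2  # the new variable: the square of Envq
    return {
        "Envq": envq,
        "Semi.envq": semi * envq,
        "Detached.envq": detached * envq,
        "envq2": envq_sq,                 # assumed name for the squared term
        "envq2.semi": semi * envq_sq,     # assumed name, by analogy
        "envq2.det": detached * envq_sq,  # name used later in the chapter
    }


print(quadratic_terms("Detached", 3.0))
```

Because each type gets its own squared term, the fitted relationship can curve for one type (for example Detached) while staying close to linear for another.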
Plot Grid
  Y: mean.pred
  X: envq.pred
  Grouped by: type.pred
  Tick 95% CIs as lines
to get the following graph.

There is indeed some evidence of a curve. The linear increase with Envq for Detached properties is not sustained at high quality, but there is a lot of uncertainty around this curve.

Model 14: cross-level interactions between Size and Envq

We can also fit, estimate and display a cross-level interaction involving two continuous variables: Size and Envq.

In the Equations window
  Click on envq2.det
  Delete term
  Confirm all 4 explanatory variables are to be removed
  Click on Envq
  Modify term
  Poly Degree 1
  Done
  Add term
    Order 1
    Variable: Size
    Variable: Envq
  More to convergence
  Done

Model on main menu
  Customised predictions

In the Setup window, consider each of the variables in turn and choose the values wanted for each predictor.
  Clear old settings
  Click on Size and Change Range to Range: Upper bound 8, Lower bound 2, Increment 2
  Click on Type; Change Range and change Category to Mean
  Click on Envq; Change Range to Percentiles; specify the 5th, 25th, 50th, 75th and 95th percentiles
  Click on Cons; ensure the value is set to 1
  Click on District34.Cons; ensure the value is set to 0
  Fill Grid
  Predict [ignore warning]
  Predictions

This gives us the predicted house prices and the 95% confidence intervals for the combinations of 4 sizes of house (2, 4, 6 and 8 rooms) and 5 percentiles: 20 sets of predictions in all.
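With two continuous variables the interaction is a product term, so the slope for Envq is no longer a single number but changes with Size. The sketch below illustrates that arithmetic in Python. The two coefficients are hypothetical values chosen only for illustration; they are not estimates from Model 14, and the function name envq_slope is an assumption.

```python
# Hypothetical coefficients, chosen only for illustration.
# They are NOT estimates from Model 14.
B_ENVQ = 0.02        # slope for Envq when Size is 0
B_SIZE_ENVQ = 0.03   # change in the Envq slope for each extra room


def envq_slope(size):
    """Slope of price on Envq at a given Size under a product-term interaction."""
    return B_ENVQ + B_SIZE_ENVQ * size


# The Size values requested in the Setup window: 2 to 8 in steps of 2.
sizes = range(2, 9, 2)
slopes = {size: envq_slope(size) for size in sizes}

for size, slope in slopes.items():
    print(size, round(slope, 3))
```

With a positive product-term coefficient, as assumed here, the Envq slope is steeper for larger houses; the trellis plot of the real predictions shows whether the fitted model behaves this way.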
Plot Grid
  Y: mean.pred
  X: envq.pred
  Trellis Y: Size.pred [columnar plots for 4 different values of Size]
  Tick 95% CIs as lines
to get the following graph.

Intriguingly, the increase in price with quality is only found for the larger properties.

To get a 3D plot, in the Command window type
  surf 1 'size.pred' 'envq.pred' 'mean.pred'
  show 1

Right-clicking on the graph brings up controls for contouring, animation and much else.
This view shows coloured contour bands, oriented to show the relationship with size for low values of environmental quality.

This view shows coloured contour bands, oriented to show the relationship with size for high values of environmental quality.
Some answers

Questions 8.1
What does the constant measure?
  The average price of a 5-roomed terrace house across London (except District 34) where the district environmental quality is zero.
0.103envq?
  This is the main effect for Envq (environmental quality): as Envq goes up by 10 points, the house price for a terrace goes up by about 1k.
0.170Semi.envq
  This is the differential slope for Envq for a Semi as opposed to a Terrace house; that is, the slope for Semi is 0.103 + 0.170.
0.103Detached.envq
  This is the differential slope for Envq for a Detached as opposed to a Terrace house; that is, the slope for Detached is 0.103 + 0.1103.

End of chapter 8
Written June 2009