Introduction


===============================================================================
mla.doc                               MLA                               /31/97
===============================================================================
Copyright (c) by Frank Busing

Introduction

Thank you for your interest in MLA, software for multilevel analysis of data with two levels. The main goal of MLA is to provide a program with an easy-to-use interface, alternative estimation methods and extensive resampling options.

This file contains information about the following topics:
- running MLA
- syntax
- comments
- statements
- summary

Running MLA

MLA runs as a stand-alone batch program that takes an input file and an output file as parameters. The program is started with the command

    MLA [-hHv] <inputfile> <outputfile>

where <inputfile> is replaced by the name of the input file and <outputfile> by the name of the output file. The options are help (-h), extended help (-H) and verbosity (-v), respectively. Both the input file and the output file are simple text (ASCII) files.

Syntax

The input file consists of statements, which are case insensitive. Every statement begins with a slash and a keyword (e.g., /TITLE). Every keyword may be abbreviated, but it must be at least three characters long to be recognized. Other text following the keyword and/or leading spaces will be ignored. The rest of a statement must follow on the lines below its keyword and must precede the next statement. These lines are called substatements and may also consist of one or more keywords (e.g., FILE). The last statement to be read is the /END statement. All other statements, and their substatements, may appear in any order (but before the /END statement if they are to be taken into account). A substatement may continue on the next line; in that case the first line must end with two backslashes (\\).

Comments

Comments are preceded by a percent sign (%) and may appear anywhere in the input file. All text on a line from the percent sign onward is treated as a comment and is ignored as program input.

Statements
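Before the individual statements are described, the comment and continuation rules given above can be illustrated with a short sketch. It is written in Python and is not part of MLA; the function name logical_lines is hypothetical, and the handling of blank lines is an assumption, since the text only states that comments are ignored and that two backslashes continue a substatement.

```python
def logical_lines(text):
    """Yield input lines with comments removed and continuations joined.

    Hypothetical helper, not MLA code: it only mirrors the two rules
    stated in the manual (percent-sign comments, '\\\\' continuation).
    """
    pending = ""
    for raw in text.splitlines():
        # Everything from the percent sign onward is a comment.
        line = raw.split("%", 1)[0].strip()
        if line.endswith("\\\\"):
            # Two trailing backslashes: the substatement continues.
            pending += line[:-2].strip() + " "
            continue
        line = (pending + line).strip()
        pending = ""
        if line:  # assumption: empty lines carry no information
            yield line
    if pending.strip():
        yield pending.strip()


example = "\n".join([
    r"b1 = g1 \\   % lines may be broken",
    r"   + g2*v6 \\ % using two backslashes",
    "   + u1      % intercept",
    "/END",
])
for line in logical_lines(example):
    print(line)
```

The three physical lines of the first equation come out as one logical line, followed by /END.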

/TITle (optional)

Following the keyword /TITle, the first non-blank line contains the title for the analysis. The title is repeated at the top of every part of the output.

/DATa (required)

The /DATa statement contains information about the data file. This statement has six substatements, three of which are required.

FILe (required)
This substatement gives the name of the data file. The name follows the equals sign and must satisfy the usual DOS conventions on filenames. The file itself is a free-field formatted, numbers-only ASCII file: values of variables must be separated by at least one blank, and the file must contain one case per line. Cases must be sorted by the level-2 identifier variable (see below).

VARiables (required)
The VARiables substatement specifies the number of variables in the data file.

ID1 (optional)
One of the variables in the data file MAY contain a code (number) that identifies the level-1 units. The number is used in the output to label level-1 units. The variable number follows the keyword ID1 and indicates the position of the identifier variable in the data file. It must be at least 1 and at most the number of variables given on the VARiables substatement. If ID1 is omitted, the order in which the level-1 units are read from the data file is used as label.

ID2 (required)
One of the variables in the data file MUST contain a code (number) that identifies the level-2 units. The number is essential for correctly distinguishing the level-2 units. Cases must be sorted by the level-2 identifier variable named on this substatement.

MISsing (optional)
For every variable, one missing value may be specified on this substatement. After the equals sign, the variable is given first, followed by the missing value in parentheses. Further variables and values are separated by commas.

CENTER (optional)
CENTER means Center Grand Mean (Kreft and de Leeuw, 1996). The variables listed after the CENTER substatement are centered (ignoring grouping) immediately after the data are read, before any analysis. Further variables are separated by commas.

/MODel (required)

The /MODel statement is followed by a set of equations that specify the model to be estimated. There is only one level-1 equation, but there may be one or more level-2 equations. The order in which the level-1 and level-2 equations appear is arbitrary.

The terms used in the level-1 equation are:
- V = variable, which is a variable in the data file. V indicates either the outcome variable (V before the equals sign) or a predictor variable (V after the equals sign).
- B = beta component. At level 1 these are the regression coefficients, which serve as outcome variables at the second level.
- E = the level-1 random term. This term is considered to be a residual or error term. Its variance has to be estimated from the data.

The level-2 equations partly consist of the same terms, but also of specific level-2 terms:
- B = beta component, corresponding to the level-1 regression coefficient. At this level, however, B can be viewed as an outcome variable.
- G = gamma component. These are the fixed parameters to be estimated in the multilevel model.
- V = one of the variables from the data file (as explained above).
- U = level-2 random term. As at the first level, this component is considered a residual or error term, but now for the second level. The second level may have more than one error term: one for each level-2 equation (i.e., for each beta element). The variances and covariances of these terms have to be estimated from the data.

In the equations each term is directly followed by a number (except for the level-1 random term E). For the V term this number is the variable number, the position of the variable in the data file (e.g., V4, the fourth variable in the data file). The other terms use a number only for identification, without any additional meaning (e.g., G3, one of the fixed parameters). The B terms have meaning in the equations of both levels. Every equation consists of one term before and at least one term after the equals sign. Terms on the right-hand side of an equation are connected by plus signs. A variable and its corresponding parameter are connected by an asterisk (*). This is used to connect a fixed parameter and a predictor variable in level-2 equations, and to connect a level-1 regression coefficient and a predictor variable in the level-1 equation.

/CONstraints (optional)

The /CONstraints statement allows fixed parameters to be held at a given value. Constraints can therefore only be applied to fixed parameters, and only in the form "fixed parameter equals value", for example G2=2.0. The parameter is held fixed during estimation and is not used in the estimation of the standard errors either. Its standard error will be zero and no t-test is performed for it.

/TEChnical (optional)

The /TEChnical statement provides ways to alter the estimation process. If this statement and its substatements are not specified, the program runs with default values.

ESTimation method (optional)
The substatement ESTimation method sets the estimation method. One can choose between FIMl and REMl. FIMl is the default method and represents full information maximum likelihood estimation. REMl is restricted maximum likelihood estimation.
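Returning briefly to the /MODel statement: the way the level-2 equations feed into the level-1 equation can be made concrete by substituting them into it. The following Python sketch is an illustration only, not MLA's parser; the function names parse_equation and combine are hypothetical, and the equations are those of the annotated example MLA.IN.

```python
def parse_equation(text):
    """Split 'lhs = t1 + t2*t3' into (lhs, [[factor, ...], ...])."""
    lhs, rhs = text.replace(" ", "").upper().split("=")
    return lhs, [term.split("*") for term in rhs.split("+")]


def combine(level1, level2_equations):
    """Substitute each level-2 equation for its B term in the level-1 equation."""
    outcome, terms = parse_equation(level1)
    betas = dict(parse_equation(eq) for eq in level2_equations)
    combined = []
    for factors in terms:
        beta = next((f for f in factors if f in betas), None)
        if beta is None:
            combined.append(factors)  # e.g. the level-1 error term E
            continue
        rest = [f for f in factors if f != beta]
        for sub in betas[beta]:
            # Multiply every level-2 term by the level-1 predictor, if any.
            combined.append(sub + rest)
    return outcome + "=" + "+".join("*".join(t) for t in combined)


print(combine("v4 = b1 + b2*v5 + e",
              ["b1 = g1 + g2*v6 + u1", "b2 = g3 + g4*v6 + u2"]))
```

The result contains the same terms as the "Single equation" line that MLA echoes in its output for this example, apart from the ordering of terms and the labelling of the level-1 error term.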

MINimization method (optional)
This substatement sets the minimization method. One can choose between BFGS, the Broyden-Fletcher-Goldfarb-Shanno variant of the Davidon-Fletcher-Powell quasi-Newton minimization method, and EM, the Expectation-Maximization method. The default minimization method is BFGS.

REParameterization (optional)
The level-2 covariance matrix should be a positive (semi-)definite matrix. To impose this restriction, the parameters can be written as C=LDL', where C is the covariance matrix, L is a lower triangular matrix and D a diagonal matrix. The elements of the diagonal matrix D may be expressed either through square ROOTs or as powers of e (the inverse of the natural LOGarithm). By default, ROOT reparameterization is performed.

WARnings (optional)
If the maximum number of warnings is reached, the program terminates execution. This substatement changes the default value of 25. The value must be an integer between 1 and [...]

MAXimum number of iterations (optional)
The default value of MAXiter is 100. This number should be sufficient for reaching convergence if the sample size is large enough and/or the number of parameters to be estimated is not too large. Changing the minimization method or the convergence criterion (see below) can make it necessary to raise the maximum number of iterations. The value must be an integer between 1 and [...]

CONvergence (optional)
After each iteration the new function value is compared to the previous function value, and the difference is compared to a convergence-related value. If

    (F[i-1]-F[i]) / {0.5*(F[i] + F[i-1])} <= convergence,

convergence is said to have been reached. In this formula, F[i] is the function value after the i-th iteration. The default value of CONvergence is 1.0E-08 and permitted values range from 0.0 to 1.0.

SEEd (optional)
For diagnostic purposes, one can provide an initial number (seed) for the random number generator with the substatement SEEd. With the same initial seed, the simulation samples will be identical. The seed value must be an integer between 1 and 1,073,735,823.

LUXury (optional)
Uniform deviates are obtained with the RANLUX pseudo-random number generator (Luscher, 1994). For this generator several levels of LUXury may be specified. Five standard levels are defined:
0 = very long period, but fails many tests,
1 = considerable improvement, but still fails some tests,
2 = passes all tests, but theoretically still defective,
3 = default,
4 = highest luxury, all 24 bits of mantissa thoroughly chaotic.
A higher luxury level also means slower generation of uniform deviates.

FILe (optional)
Results of the current analysis are written to a file. A valid filename must be specified after the FILe keyword. Only the essential information is written to this file. Its content changes over time, and some inspection of the file will show what is written where.

/SIMulation (optional)

Several options for simulation are available in MLA: jackknife, bootstrap and permutation. Theoretical details concerning the implementation of these resampling methods for the two-level model can be found in the MLA manual (Busing, Meijer, and van der Leeden, 1994).

KINd (required)
With this substatement the user chooses one of three options: BOOtstrap, JACkknife or PERmutation simulation. All types of simulation work as follows:
1. perform analysis
2. obtain a (new) sample
3. repeat the analysis
4. save the (new) estimates
The last three steps, together called a replication, are repeated a number of times. Afterwards, bias-corrected estimates of model parameters and nonparametric estimates of standard errors are computed from the set of saved estimates and the original maximum likelihood estimates.

METhod
This substatement specifies the bootstrap method to be performed. It is required whenever KINd = BOOtstrap. One can choose between three methods:
1. RESiduals (or ERRor). This method resamples the elements of the level-1 and level-2 residuals. A new outcome (dependent) variable is then computed from these residuals, the original predictor (independent) variables and the parameter estimates (fixed components).
2. CASes. With this method a bootstrap sample is created by resampling the original data: complete cases are randomly drawn (with replacement) from the original cases. The procedure follows the nested structure of the data by a nested resampling of cases: level-2 units are randomly drawn (with replacement), and cases within each drawn level-2 unit are drawn (with replacement).
3. PARametric. This method computes a new outcome (dependent) variable from the original predictor variables, the parameter estimates and a set of level-1 and level-2 residuals. The residuals are drawn from a normal distribution with mean zero and variance sigma squared for the level-1 residuals, and from a (multivariate) normal distribution with zero mean vector and covariance matrix theta for the level-2 residuals.

TYPe
The substatement TYPe is required only when KINd = BOOtstrap is used in combination with METhod = RESiduals. It specifies the type of estimation used to determine the level-1 and level-2 residuals. One can choose between RAW and SHRunken.

BALancing (optional)
For the bootstrap methods RESiduals and CASes, a balanced bootstrap can be requested on this substatement by specifying BALancing = BALanced. The default is BALancing = UNBalanced.
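The nested resampling described for METhod = CASes can be sketched as follows. This is a minimal Python illustration under stated assumptions, not MLA's implementation: the function name cases_bootstrap and the dictionary layout of the data are hypothetical, and the choice to draw as many units and cases as the original sample contains is an assumption, because the text does not state the resample sizes.

```python
import random


def cases_bootstrap(groups, rng):
    """Draw one nested cases-bootstrap sample.

    groups: dict mapping a level-2 identifier to its list of cases.
    Returns a list of (level-2 id, resampled cases) pairs.
    """
    unit_ids = list(groups)
    sample = []
    for _ in unit_ids:
        # Level-2 units are drawn with replacement ...
        unit = rng.choice(unit_ids)
        cases = groups[unit]
        # ... and cases within the drawn unit are drawn with replacement.
        sample.append((unit, [rng.choice(cases) for _ in cases]))
    return sample


data = {1: [(1, 2.0), (1, 2.5), (1, 3.1)],
        2: [(2, 4.0), (2, 3.7)],
        3: [(3, 1.2), (3, 0.9), (3, 1.5), (3, 1.1)]}
rng = random.Random(12345)  # a fixed seed makes the sample reproducible
for unit, cases in cases_bootstrap(data, rng):
    print(unit, cases)
```

Fixing the seed plays the same role here as the SEEd substatement: the same seed reproduces the same resample.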

RESample (optional)
The substatement RESample lets the user choose at which level units are resampled. The default is 0, which means that units are resampled at both levels. If KINd = JACkknife, or KINd = BOOtstrap and METhod = CASes, the user may choose 1 or 2, which means that only level-1 units or only level-2 units are resampled, respectively.

LINking
The level-1 and level-2 residuals can be drawn linked or unlinked during simulation. Linking the residuals means that the level-1 residuals are drawn from the same unit from which the level-2 residual was drawn. This is specified with LINking = LINked. Specifying LINking = UNLinked has the same result as not using the substatement at all; it is the default.

REPlications (optional)
The substatement REPlications specifies the number of bootstrap replications. It must be an integer value between 1 and [...] The default value is 100.

CONvergence (optional)
See the /TEChnical statement. Specifying the CONvergence substatement within the /SIMulation statement only affects convergence during simulation.

FILe (optional)
Results of the simulation analysis can be written to a file. A filename may be specified with the substatement FILe. Filenames must satisfy the usual DOS conventions on filenames.

/INTerval (optional)

Several options for confidence interval estimation are available in MLA: normal interval, percentile, bias-corrected percentile and bootstrap-t.

KINd (required)
With this substatement the user chooses one of four methods: NORmal, PERcentile, BIAs-corrected percentile or BOOtstrap-t.

WEIght (optional)
This substatement affects the internal bootstrap that is performed for bootstrap-t confidence interval estimation. A balanced bootstrap can be requested by specifying WEIght = BALanced. The default is WEIght = UNBalanced.
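As an illustration of the PERcentile kind listed above, a percentile interval can be read off the sorted replication estimates. The sketch below is written in Python and is not taken from MLA: the function name percentile_interval is hypothetical, and the simple order-statistic rule used to pick the two bounds is an assumption, since the text does not say how MLA interpolates quantiles. The coverage follows the two-sided ALPha convention of this statement, 100(1-2*alpha) percent.

```python
def percentile_interval(estimates, alpha):
    """Return (lower, upper, coverage in percent) from replication estimates."""
    if not 0.0 < alpha < 0.5:
        raise ValueError("alpha must lie between 0 and 0.5")
    ordered = sorted(estimates)
    n = len(ordered)
    # Assumed rule: cut off a fraction alpha of the estimates in each tail.
    lower = ordered[int(alpha * n)]
    upper = ordered[min(n - 1, int((1 - alpha) * n))]
    coverage = 100 * (1 - 2 * alpha)
    return lower, upper, coverage


# Eight made-up replication estimates of one parameter.
print(percentile_interval([0.7, 0.0, 0.3, 0.1, 0.6, 0.2, 0.5, 0.4], alpha=0.25))
```

With more replications the tails are estimated more precisely, which is one reason to raise REPlications when percentile intervals are requested.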
REPlications (optional)
Like the previous substatement, this substatement affects the bootstrap-t method. It specifies the number of internal bootstrap replications. It must be an integer value between 1 and [...] The default value is 25.

ALPha (optional)
ALPha is the (two-sided) confidence level: the confidence interval covers 100(1-2*ALPha) percent. For example, ALPha = 0.05 gives an interval width of 0.90. The default value of ALPha is [...]

CONvergence (optional)
See the /TEChnical statement. Specifying the CONvergence substatement within the /INTerval statement only affects convergence during interval estimation.

FILe (optional)
Results of the interval estimation can be written to a file. A filename may be specified with the substatement FILe. Filenames must satisfy the usual DOS conventions on filenames.

/PRInt (optional)

The /PRInt statement gives the user control over the output. Not all output is optional. The default output consists of a title page, an echo of the input, and system information. Output for the simulation analysis is generated whenever the /SIMulation statement is used.

INPut
If INPut = YES, the input information is digested and displayed in two parts: a required and an optional part. The default is NO.

DEScriptives
After the keyword DEScriptives the user may specify both variables and level-2 identification codes. For the total sample and for every level-2 unit specified, the following statistics are computed and displayed: mean, standard deviation, variance, skewness, kurtosis, Kolmogorov-Smirnov's Z, significance level of K-S's Z, minimum, 5th percentile, first quartile, median, third quartile, 95th percentile and maximum.

RANdom level-1 coefficients
The random level-1 coefficients, or level-2 outcomes, consist of ordinary least squares estimates per level-2 unit. After the keyword, B's and sigma may be specified.

OLSquares
This part contains the ordinary least squares estimates for the fixed parameters (gammas) and the random parameters (variances and covariances of U and E). A regression analysis is performed, ignoring grouping. For the level-1 error variance two estimates are displayed, the one-step (E(1)) and the two-step (E(2)) estimate.

RESiduals
After the keyword, both level-1 and level-2 residuals may be specified (U and E). For the first level, three types of residuals are displayed: the total, raw, and shrunken residuals. The level-2 residuals are the raw and shrunken residuals for the specified level-2 components. These estimates are based on the BFGS-FIML estimates.

POSterior means
The posterior means specified after the keyword are displayed. These estimates are based on the BFGS-FIML estimates.

DIAgnostics
For diagnostic purposes the Mahalanobis distances for the level-2 residuals are displayed.

/PLOt (optional)

The /PLOt statement gives the user control over some plot options.

HIStograms
This option is only in effect when /SIMulation is chosen. If so, all parameters may be specified, and histograms of the specified parameters will be displayed.

SCAtters
Scatterplots can be obtained for prediction and residuals. Specifying prediction produces a scatterplot of the response variable versus the predicted values based on the estimated fixed parameters. Specifying a variable produces a scatterplot of this variable versus all residuals associated with it.

Summary

Explanation of the codes used below.
<a>         means alphanumeric.
<filename>  means a filename has to be specified.
<i>         means an integer, possibly followed by the default value.
<f>         means a floating point number, possibly followed by the default value.
[a B]       means choose between a and B, with B as default.
,...        means that more of the same may occur.

/TITLE
    <a>
/DATA
    file                          = <filename>
    variables                     = <i>
    id1                           = <i>
    id2                           = <i>
    missing                       = V<i>(<f>),...
    center                        = V<i>,...
/MODEL
    V<i> = E
    V<i> = B<i> + E
    V<i> = B<i> + B<i>*V<i> + E            (one of these equations)
    B<i> = G<i>
    B<i> = G<i> + G<i>*V<i> +...
    B<i> = U<i>                            ((n)one or more of these equations)
    B<i> = G<i> + U<i>
    B<i> = G<i> + G<i>*V<i> + U<i> ...
/CONSTRAINTS
    G<i> = <f>
    ...
/TECHNICAL
    estimation method             = [FIML reml]
    minimization method           = [BFGS em]
    reparameterization            = [none ROOT logarithm]
    warnings                      = <i25>
    maximum number of iterations  = <i100>
    convergence                   = <f >
    seed                          = <i>
    luxury                        = <i3>
/SIMULATION
    kind                          = [bootstrap jackknife permutation]
    method                        = [residuals cases parametric]
    type                          = [raw shrunken]
    resample                      = [0 1 2]
    balancing                     = [UNBALANCED balanced]
    linking                       = [UNLINKED linked]
    replications                  = <i100>
    convergence                   = <f >
    file                          = <filename>
/INTERVAL
    kind                          = [normal percentile bias-corr. bootstrap-t]
    replications                  = <i25>
    alpha                         = <f0.10>
    convergence                   = <f >
    file                          = <filename>
/PRINT
    input                         = [yes NO]
    descriptives                  = V<i>,...,<i>,...
    random level-1 coefficients   = B<i>,...,sigma
    olsquares                     = [yes NO]
    residuals                     = U<i>,...,E
    posterior means               = B<i>,...
    diagnostics                   = [yes NO]
/PLOT
    histograms                    = G<i>,...,U<i>*U<i>,...,E
    scatters                      = prediction,v<i>,...
/END

An annotated example can be found in MLA.IN.

--- end of mla.doc

Dr. Wolfgang Langer - Methods V: Foundations of Multilevel Analysis - winter semester 2000/

MLA 3.2 syntax chart (MLA.IN)

    /TITLE                        % optional title
    MLA version 3.2: annotated example
    /DATA                         % specification of data
    file         = mla.dat        % data file
    vars         = 6              % total number of variables in data file
    id1          = 3              % Level-1 identification code variable number
    id2          = 2              % Level-2 identification code variable number
    missing      = v4( )          % missing value variable 4 =
    center       = v6             % center grand mean level-1 predictor variable 6
    /MODEL                        % model specification
    b1 = g1 \\                    % lines may be broken
       + g2*v6 \\                 % using two backslashes
       + u1                       % intercept: level-2 equation 1
    b2 = g3 + g4*v6 + u2          % slope: level-2 equation 2
    v4 = b1 + b2*v5 + e           % level-1 equation
    /TECHNICAL                    % additional, technical specifications
    estimation   = fiml           % full information maximum likelihood estimation
    minimization = bfgs           % minimization method is dfp:bfgs
    reparam      = root           % reparameterization c=ll'
    warnings     = 50             % maximum warnings raised to 50
    maximum      = 500            % raise maximum number of iterations
    seed         =                % initial seed to be used
    luxury       = 4              % increase luxury level for random number generator
    convergence  = 1.0E-12        % set convergence criterion
    /SIMULATION                   % specify resample simulation
    kind         = bootstrap      % bootstrap simulation analysis
    method       = residuals      % resample residuals
    type         = shrunken       % use shrunken residuals
    balance      = unbalanced     % no balance in resampling
    linking      = unlinked       % no linking of level-2 and level-1 residuals
    replications = 200            % set #replications to 200
    /INTERVAL                     % interval estimation
    kind         = bias-corrected % bias-corrected percentile interval
    alpha        = 0.05           % interval width of 0.90
    /PRINT                        % additional print specification
    inp          = yes            % print input specifications
    des          = v4,v5,v6,2,3   % print descriptives of v4,v5,v6, level-2 units 2,3
    ols          = yes            % print ordinary least squares estimates
    ran          = all            % print all random level-1 coefficients
    res          = u1,u2          % print residuals u1 and u2
    pos          = all            % print all available posterior means
    dia          = yes            % print diagnostics
    /PLOT                         % additional plot specification
    hist         = g2,g3,g4       % plot histograms of g2, g3 and g4
    scat         = pred,v6,v5     % plot scatterplots: prediction- and residual-plots
    /END                          % final statement: the end.
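The example above uses shortened keywords such as vars, reparam and inp. How such abbreviations might be matched against the full keywords can be sketched in Python. This is an assumption-laden illustration, not MLA's actual matching rule: it assumes that only the first three letters of a keyword are significant, which is suggested by the capitalised stems (/TITle, VARiables) and by forms like vars and balance, but is not stated outright. The function name resolve and the keyword lists are hypothetical.

```python
STATEMENTS = ["TITLE", "DATA", "MODEL", "CONSTRAINTS", "TECHNICAL",
              "SIMULATION", "INTERVAL", "PRINT", "PLOT", "END"]
DATA_SUBSTATEMENTS = ["FILE", "VARIABLES", "ID1", "ID2", "MISSING", "CENTER"]


def resolve(word, keywords):
    """Return the full keyword whose first three letters match `word`.

    Assumption: only the first three characters are compared and the
    rest of the word is ignored.
    """
    stem = word.strip().lstrip("/").upper()[:3]
    if len(stem) < 3:
        raise ValueError("a keyword needs at least three characters")
    matches = [k for k in keywords if k[:3] == stem]
    if len(matches) != 1:
        raise ValueError(f"unknown or ambiguous keyword: {word!r}")
    return matches[0]


print(resolve("/tit", STATEMENTS))
print(resolve("vars", DATA_SUBSTATEMENTS))
```

Under this assumption vars resolves to VARIABLES even though it is not a literal prefix of that word; a stricter prefix rule would reject it.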

Example data for MLA.IN (MLA.DAT)


Output log of MLA32 (MLA.OUT):

[MLA logo banner]

Multilevel Analysis for Two Level Data
Version 3.2

Developed by
Frank Busing
Erik Meijer
Rien van der Leeden

Published by
Leiden University
Faculty of Social and Behavioural Sciences
Department of Psychometrics and Research Methodology
Wassenaarseweg 52
P.O. Box RB Leiden
The Netherlands
Phone +31 (0)
Fax +31 (0)

MLA (R) Multilevel Analysis for Two Level Data Version
Copyright Leiden University All Rights Reserved
Part 1                                                  Wed Oct 11 14:07:

Inputfile statements

     1  /TITLE                        % optional title
     2  MLA version 3.2: annotated example
     3  /DATA                         % specification of data
     4  file         = mla.dat        % data file
     5  vars         = 6              % total number of variables in data file
     6  id1          = 3              % Level-1 identification code variable number
     7  id2          = 2              % Level-2 identification code variable number
     8  missing      = v4( )          % missing value variable 4 =
     9  center       = v6             % center grand mean level-1 predictor variable 6
    10  /MODEL                        % model specification
    11  b1 = g1 \\                    % lines may be broken
    12     + g2*v6 \\                 % using two backslashes
    13     + u1                       % intercept: level-2 equation 1
    14  b2 = g3 + g4*v6 + u2          % slope: level-2 equation 2
    15  v4 = b1 + b2*v5 + e           % level-1 equation
    16  /TECHNICAL                    % additional, technical specifications
    17  estimation   = fiml           % full information maximum likelihood estimation
    18  minimization = bfgs           % minimization method is dfp:bfgs
    19  reparam      = root           % reparameterization c=ll'
    20  warnings     = 50             % maximum warnings raised to 50
    21  maximum      = 500            % raise maximum number of iterations
    22  seed         =                % initial seed to be used
    23  luxury       = 4              % increase luxury level for random number generator
    24  convergence  = 1.0E-12        % set convergence criterion
    25  /SIMULATION                   % specify resample simulation
    26  kind         = bootstrap      % bootstrap simulation analysis
    27  method       = residuals      % resample residuals
    28  type         = shrunken       % use shrunken residuals
    29  balance      = unbalanced     % no balance in resampling
    30  linking      = unlinked       % no linking of level-2 and level-1 residuals
    31  replications = 200            % set #replications to 200
    32  /INTERVAL                     % interval estimation
    33  kind         = bias-corrected % bias-corrected percentile interval
    34  alpha        = 0.05           % interval width of 0.90
    35  /PRINT                        % additional print specification
    36  inp          = yes            % print input specifications
    37  des          = v4,v5,v6,2,3   % print descriptives of v4,v5,v6, level-2 units 2,3
    38  ols          = yes            % print ordinary least squares estimates
    39  ran          = all            % print all random level-1 coefficients
    40  res          = u1,u2          % print residuals u1 and u2
    41  pos          = all            % print all available posterior means
    42  dia          = yes            % print diagnostics
    43  /PLOT                         % additional plot specification
    44  hist         = g2,g3,g4       % plot histograms of g2, g3 and g4
    45  scat         = pred,v6,v5     % plot scatterplots: prediction- and residual-plots
    46  /END                          % final statement: the end.

46 lines read from "mla.in"

Part 2: Input information

Required
    Name of datafile        : MLA.DAT
    Number of variables     : 6
    Level-2 id. column      : 2
    Equation 1              : B1=G1+G2*V6+U1
    Equation 2              : B2=G3+G4*V6+U2
    Equation 3              : V4=B1+B2*V5+E
    Single equation         : V4=E0+G1+G2*V6+G3*V5+G4*V6*V5+U1+U2*V5

Optional
    Title of analysis       : MLA version 3.2: annotated example
    Level-1 id. column      : 3
    Missing for var 4       :
    Center variables        : V6
    Estimation method       : 1
    Minimization method     : 1
    Reparameterization      : 1
    Maximum iterations      : 500
    Convergence             : 1e-12
    Warnings (maximum)      : 50
    Kind of simulation      : 1
    Simulation method       : 3
    Simulation balance      : 0
    Simulation linking      : 0
    Residuals type          : 2
    Resampling type         : 0
    Luxury level            : 4
    Initial random seed     :
    Simulation convergence  : 1e-10
    Number of replications  : 200
    Simulation output file  :
    Kind of CI estimation   :
    CI alpha                :
    CI convergence          : 1e-10
    CI replications         : 25
    Print input             : 1
    Print explore           : 1, V4,V5,V6,2,3
    Print olsq              : 1
    Print outcomes          : ALL
    Compute residuals       : 1
    Print residuals         : U1,U2
    Print posterior means   : ALL
    Print diagnostics       : 1
    Print intervals         : 1
    Max equations           : 3
    Level-1 size            : 2
    Level-2                 : 2
    X-size                  : 4
    Z-size                  : 2
    Parameters              : 8
    Level-2 parameters      : 3
    Input file              : mla.in
    Output file             : mla.out
    Verbose                 : 0
    Monte Carlo             : 0

    Monte Carlo file        :
    Plot histograms         : G2,G3,G4
    Plot scatterplots       : PRED,V6,V5
    Response variable       : 4
    Explanatory variables   : 0(1) 6(2) 5(3) 6(4)
    Random level-2 vars.    : 0(1) 5(2)
    Random level-1 coeffs.  : 0(1) 5(2)
    Level-2 outcome 1       : 0(1)[1] 6(2)[1]
    Level-2 outcome 2       : 0(3)[2] 6(4)[2]

Part 3: Data descriptives

Data descriptives for all units
    # Level-1 units         = 231
    # missing Level-1 units = 1
    # correct Level-1 units = 230
    # correct Level-2 units = 15
    [table not preserved; columns: Var, Mean, Stddev, Variance, Skewness, Kurtosis, K-S Z, Prob(Z)]
    [table not preserved; columns: Var, Minimum, P5, Q1, Median, Q3, P95, Maximum]

Data descriptives for level-2 unit 2
    # Level-1 units = 16
    [table not preserved; columns: Var, Mean, Stddev, Variance, Skewness, Kurtosis, K-S Z, Prob(Z)]
    [table not preserved; columns: Var, Minimum, P5, Q1, Median, Q3, P95, Maximum]

Data descriptives for level-2 unit 3
    # Level-1 units = 18
    [table not preserved; columns: Var, Mean, Stddev, Variance, Skewness, Kurtosis, K-S Z, Prob(Z)]

    [table not preserved; columns: Var, Minimum, P5, Q1, Median, Q3, P95, Maximum]

Part 4: Random Level-1 coefficients: ordinary least squares estimates per level-2 unit

Parameter B1
    [table not preserved; columns: Unit, Size, Estimate, SE, T, Prob(T); summary rows: Mean, Variance]

Parameter B2
    [table not preserved; columns: Unit, Size, Estimate, SE, T, Prob(T); summary rows: Mean, Variance]

Parameter SIGMA
    [table not preserved; columns: Unit, Size, Estimate, SE, T, Prob(T); summary rows: Mean, Variance]

Note: random level-1 coefficients are also referred to as level-2 outcomes
See documentation for further elaboration on this subject

Part 5: Ordinary least squares estimates

Fixed parameters
    [table not preserved; columns: Label, Estimate, SE; rows: G1, G2, G3, G4]

Random parameters
    [table not preserved; columns: Label, Estimate, SE; rows: E(1), U1*U1, U2*U1, U2*U2, E(2)]

E(1): one-step estimate of sigma squared (ignoring grouping)
E(2): two-step estimate of sigma squared
See documentation for further elaboration on these subjects

Part 6: Full information maximum likelihood estimates (BFGS)

Fixed parameters
    [table not preserved; columns: Label, Estimate, SE, T, Prob(T); rows: G1, G2, G3, G4]

Random parameters
    [table not preserved; columns: Label, Estimate, SE, T, Prob(T); rows: U1*U1, U2*U1, U2*U2, E]

Conditional intra-class correlation = 0.47/( ) =
# iterations = 10
-2*Log(L) =

Part 7: Residuals

Level-2 residuals U1
    [table not preserved; columns: Unit, Raw, Shrunken; summary rows: Mean, Variance, Covariance]

    [summary rows, continued: K-S Z, Prob(Z)]

Level-2 residuals U2
    [table not preserved; columns: Unit, Raw, Shrunken; summary rows: Mean, Variance, Covariance, K-S Z, Prob(Z)]

Note: shrunken level-2 residuals are also referred to as conditional means
Note: covariance refers to covariance with corresponding level-1 residuals
See documentation for further elaboration on this subject

Part 8: Posterior means

Parameter B1
    [table not preserved; columns: Unit, Estimate; summary rows: Mean, Variance]

Parameter B2
    [table not preserved; columns: Unit, Estimate; summary rows: Mean, Variance]

Note: posterior means = shrunken estimates of random level-1 coefficients
See documentation for further elaboration on this subject

Part 9: Diagnostics

    Level-2 sample size        = 15
    Total sample size          = 230
    Mean Level-1 sample size   = 15
    Effective sample size      = 44

Squared correlation coefficients
    Norm based R-squared         =
    Grand mean based R-squared   =
    Context mean based R-squared =
    Trimmed mean based R-squared =

Level-1 outliers (sorted by Prob)
    [table not preserved; columns: Level-1 Unit, Level-2 Unit, Level-1 Unit, T, Prob]


Level-2 Mahalanobis distances (sorted by Prob(M))
Unit    M    Prob(M)

Effective sample size: N/(1+(N/J-1)*intra-class correlation)
Squared correlation coefficients (R-squared) are highly speculative in nature
Prob(M): probability - area under the curve of the chi-square distribution
See documentation for further elaboration on this subject

Part 10    Wed Oct 11 14:07:

Bootstrap estimates (unbalanced unlinked shrunken residuals)
Replications done = 200
Replications used = 200
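
The effective sample size formula quoted in the Part 9 notes, N/(1+(N/J-1)*intra-class correlation), can be sketched in a few lines of Python. This is only an illustration and not MLA's own code. The function names are made up for the example, and the intra-class correlation is assumed to follow the usual definition (level-2 variance divided by the sum of the level-2 and level-1 variances). The level-1 variance used in the demo is a placeholder, because the corresponding figure is missing from the listing above, so the result is not expected to reproduce the printed values.

```python
def intraclass_correlation(level2_var: float, level1_var: float) -> float:
    """Share of the total residual variance located at level 2.

    Assumes the usual definition: level2_var / (level2_var + level1_var).
    """
    total = level2_var + level1_var
    if total <= 0:
        raise ValueError("total variance must be positive")
    return level2_var / total


def effective_sample_size(n_total: int, n_groups: int, icc: float) -> float:
    """N / (1 + (N/J - 1) * icc), the formula quoted in the Part 9 notes.

    n_total  -- total sample size N
    n_groups -- level-2 sample size J
    icc      -- intra-class correlation, between 0 and 1
    """
    if n_groups <= 0 or n_total < n_groups:
        raise ValueError("need 0 < J <= N")
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie in [0, 1]")
    mean_group_size = n_total / n_groups
    return n_total / (1.0 + (mean_group_size - 1.0) * icc)


if __name__ == "__main__":
    # N = 230 and J = 15 are taken from the Part 9 diagnostics.
    # 0.47 is the numerator shown in the Part 6 output; the level-1
    # variance of 1.0 is a hypothetical placeholder, not a value from MLA.
    icc = intraclass_correlation(level2_var=0.47, level1_var=1.0)
    print("icc (hypothetical):", round(icc, 3))
    print("effective sample size:", round(effective_sample_size(230, 15, icc), 1))
```

Two limiting cases show how the formula behaves: with an intra-class correlation of 0 the effective sample size equals N, and with an intra-class correlation of 1 it shrinks to J, the number of level-2 units.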


More information

DESIGNING EXPERIMENTS AND ANALYZING DATA A Model Comparison Perspective

DESIGNING EXPERIMENTS AND ANALYZING DATA A Model Comparison Perspective DESIGNING EXPERIMENTS AND ANALYZING DATA A Model Comparison Perspective Second Edition Scott E. Maxwell Uniuersity of Notre Dame Harold D. Delaney Uniuersity of New Mexico J,t{,.?; LAWRENCE ERLBAUM ASSOCIATES,

More information

Descriptive Statistics

Descriptive Statistics *following creates z scores for the ydacl statedp traitdp and rads vars. *specifically adding the /SAVE subcommand to descriptives will create z. *scores for whatever variables are in the command. DESCRIPTIVES

More information

Analysis of Covariance (ANCOVA) with Two Groups

Analysis of Covariance (ANCOVA) with Two Groups Chapter 226 Analysis of Covariance (ANCOVA) with Two Groups Introduction This procedure performs analysis of covariance (ANCOVA) for a grouping variable with 2 groups and one covariate variable. This procedure

More information

EDF 7405 Advanced Quantitative Methods in Educational Research MULTR.SAS

EDF 7405 Advanced Quantitative Methods in Educational Research MULTR.SAS EDF 7405 Advanced Quantitative Methods in Educational Research MULTR.SAS The data used in this example describe teacher and student behavior in 8 classrooms. The variables are: Y percentage of interventions

More information

PLS205 Lab 2 January 15, Laboratory Topic 3

PLS205 Lab 2 January 15, Laboratory Topic 3 PLS205 Lab 2 January 15, 2015 Laboratory Topic 3 General format of ANOVA in SAS Testing the assumption of homogeneity of variances by "/hovtest" by ANOVA of squared residuals Proc Power for ANOVA One-way

More information

Mixed models in R using the lme4 package Part 5: Generalized linear mixed models

Mixed models in R using the lme4 package Part 5: Generalized linear mixed models Mixed models in R using the lme4 package Part 5: Generalized linear mixed models Douglas Bates Madison January 11, 2011 Contents 1 Definition 1 2 Links 2 3 Example 7 4 Model building 9 5 Conclusions 14

More information

Confidence Intervals, Testing and ANOVA Summary

Confidence Intervals, Testing and ANOVA Summary Confidence Intervals, Testing and ANOVA Summary 1 One Sample Tests 1.1 One Sample z test: Mean (σ known) Let X 1,, X n a r.s. from N(µ, σ) or n > 30. Let The test statistic is H 0 : µ = µ 0. z = x µ 0

More information