Statistics GIDP Ph.D. Qualifying Exam Methodology
May 26, 2017, 9:00am-1:00pm

Instructions: Put your ID (not your name) on each sheet. Complete exactly 5 of 6 problems; turn in only those sheets you wish to have graded. Each question, but not necessarily each part, is equally weighted. Provide answers on the supplied pads of paper and/or use a Microsoft Word document or equivalent to report your software code and outputs. Number each problem. You may turn in only one electronic document. Embed relevant code and output/graphics into your Word document. Write on only one side of each sheet if you use paper. You may use the computer and/or a calculator. Stay calm and do your best. Good luck!

1. A process engineer is testing the yield of a product manufactured on five machines. Each machine has two operators, one for the day shift and one for the night shift. Assume the operator factor is random. We take five samples from each machine for each operator and obtain the following data (machine.csv):

[Data table: yields by Machine, Day Operator, and Night Operator; numeric entries not recovered.]

(a) What design is this?
(b) State the statistical model and assumptions.
(c) Analyze the data and draw a conclusion.
(d) If these five machines were randomly selected from many machines in the factory, would the conclusion be the same as the one obtained in (c)? Explain (no calculation needed).
(e) Attach your SAS code here.

2. A nickel-titanium alloy is used to make components for jet turbine aircraft engines. Cracking is a potentially serious problem, as it can lead to non-recoverable failure. A test is run at the parts producer to determine the effects of four factors on cracks. The four factors are pouring temperature (A), titanium content (B), heat treatment method (C), and the amount of grain
refiner used (D). Each factor contains two levels and 16 runs are performed. Two operators need to take care of these 16 runs. There might be some variation between the two operators.

[Design table: columns A, B, C, D, Operator; 16 rows, to be filled in for part (a).]

(a) Help them to divide the workload equally by filling in the table above.
(b) Assume the response measurements in the above table (from top to bottom) are 25, 71, 48, 45, 68, 40, 60, 65, 43, 80, 25, 104, 55, 86, 70, 76 and the dataset is given in aircraft.csv. Use SAS code to estimate the factor effects. Which factor effects appear to be large? Is there a large variation between the two operators?
(c) Conduct an analysis of variance to verify the conclusion of (b).
(d) Attach your SAS code here.
3. A study was carried out to compare the writing lifetime of four premium brands of pens. It was thought that the writing surface would affect lifetime, so three different surfaces were used and the data are given below.

[Data table: lifetimes by Surface (3 levels) and Brand (4 levels), with surface averages and brand averages; numeric entries not recovered.]

(a) What design is this?
(b) State the statistical model with assumptions.
(c) How would you check whether there exists any significant interaction between the surfaces and brands of pens? State your hypothesis in mathematical notation.
(d) Analyze the data using the given dataset pen.csv.
(e) Attach your SAS code.
(f) Assume that in this study 3 observations were collected for each combination and each value in the above table was the average of 3 replicates. A two-way ANOVA model with interaction is fitted and the MSE is [value missing]. Complete the following ANOVA table and draw conclusions.

Source       DF   SS   MS   F-value   P-value
Brand
Surface
Interaction
Error
4. In a study of carbohydrate uptake (Y) as a function of other factors in male diabetics, observations were taken as follows:

[Data table: Y, Age (x1), Weight (x2), Dietary Protein (x3); numeric entries not recovered.]

Analyze these data to determine which (if any) of the predictor variables (including any appropriate interactions) appear to significantly affect carbohydrate uptake. Throughout, set α = 0.05, but for simplicity do not employ any adjustments for multiplicity/multiple inferences when assessing the effects of the predictor variables. Remember to assess the quality of the fit via standard diagnostics. Attach supporting components of your computer code. Report your findings. The data are found in the file diet.csv.

5. A large dataset of n = 1030 samples of concrete involving a total of p − 1 = 8 predictor variables was collected:

x1 = Age, x2 = Cement, x3 = Furnace Slag, x4 = Superplasticizer, x5 = Water, x6 = Fly Ash, x7 = Coarse Aggregate, x8 = Fine Aggregate,

along with Y = Compressive Strength. The data appear in the file concrete.csv. Consider a multiple linear regression (MLR) model for these data, with E[Y] = β0 + β1X1 + β2X2 + β3X3 + β4X4 + β5X5 + β6X6 + β7X7 + β8X8. Conduct a variable selection search to identify a possible reduced model among the eight predictor variables with this data set. Employ backward elimination and take minimum-BIC
as your selection criterion. Attach supporting components of your computer code. What is the recommended set of variables for further study?

6. Consider the simple linear regression model: Yi ~ indep. N(β0 + β1Xi, σ²), i = 1, ..., n, where in particular it is known that β0 = 1, so that E[Y] = 1 + β1X. Suppose interest exists in estimating the X value at which E[Y] = 0. Let this target parameter be ξ.

a) Find ξ as a function of β1. Also find the maximum likelihood estimator for ξ. Call this ξ̂.

b) Recall from Casella & Berger that the Delta Method can be used to determine the asymptotic features of a function of random variables. In particular, for a random variable U and a differentiable function g(u), where E[U] = θ, a first-order approximation to E[g(U)] is

E[g(U)] ≈ g(θ) + g′(θ) E(U − θ).

Use this to find a first-order approximation for E[ξ̂].

c) In part (b), a second-order approximation to E[g(U)] is also available from Casella & Berger's book:

E[g(U)] ≈ g(θ) + g′(θ) E(U − θ) + ½ g″(θ) E[(U − θ)²].

Use this to find a second-order approximation for E[ξ̂].
Solutions: 2017 May, Methodology

1. A process engineer is testing the yield of a product manufactured on five machines. Each machine has two operators, one for the day shift and one for the night shift. Assume the operator factor is random. We take five samples from each machine for each operator and obtain the following data (machine.csv):

[Data table: yields by Machine, Day Operator, and Night Operator; numeric entries not recovered.]

(a) What design is this?

Nested design: operator is nested within machine.

(b) State the statistical model and assumptions.

y_ijk = μ + τ_i + β_j(i) + ε_ijk, with Σ_i τ_i = 0, β_j(i) ~ N(0, σ_β²), and ε_ijk ~ N(0, σ²), where τ_i is the fixed effect of machine i and β_j(i) is the random effect of operator j within machine i.

(c) Analyze the data and draw a conclusion.

From the SAS output of the above model, machine has a significant effect (small p-value) while operator(machine) does not.

Type 1 Analysis of Variance (expected mean squares; numeric values not recovered):

Source              Expected Mean Square                                    Error Term
machine             Var(Residual) + 5 Var(operator(machine)) + Q(machine)   MS(operator(machine))
operator(machine)   Var(Residual) + 5 Var(operator(machine))                MS(Residual)
Residual            Var(Residual)

Or, using the default setting method=reml, we get:
Type 3 Tests of Fixed Effects: the F test for machine (numeric values not recovered) again shows a significant machine effect.

(d) If these five machines were randomly selected from many machines in the factory, would the conclusion be the same as the one obtained in (c)? Explain (no calculation needed).

Yes, as the F-statistic for testing the machine effect is the same as the F-statistic in (c): in both cases machine is tested against MS(operator(machine)).

(e) Attach your SAS code here.

data Q1;
  input operator machine y;
  datalines;
  /* data from machine.csv (values not recovered) */
;
run;

proc mixed method=type1 data=Q1;
  class operator machine;
  model y=machine;
  random operator(machine);
run;

2. A nickel-titanium alloy is used to make components for jet turbine aircraft engines. Cracking is a potentially serious problem, as it can lead to non-recoverable failure. A test is run at the parts producer to determine the effects of four factors on cracks. The four factors are pouring temperature (A), titanium content (B), heat treatment method (C), and the amount of grain refiner used (D). Each factor contains two levels and 16 runs are performed. Two operators need to take care of these 16 runs. There might be some variation between the two operators.

(a) Help them to divide the workload equally by filling in the table above.

Confound the operator assignment with the four-factor interaction ABCD: one operator runs the eight treatment combinations with ABCD = +1 and the other runs the eight with ABCD = −1. Each operator then performs exactly 8 runs, and only the (usually negligible) ABCD interaction is sacrificed to the operator-to-operator difference. In standard order (A varying fastest; the operator labels 1 and 2 are arbitrary):

A  B  C  D  Operator
-  -  -  -  1
+  -  -  -  2
-  +  -  -  2
+  +  -  -  1
-  -  +  -  2
+  -  +  -  1
-  +  +  -  1
+  +  +  -  2
-  -  -  +  2
+  -  -  +  1
-  +  -  +  1
+  +  -  +  2
-  -  +  +  1
+  -  +  +  2
-  +  +  +  2
+  +  +  +  1
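As a cross-check (not part of the exam solution), the assignment rule "confound operator with ABCD" can be sketched in Python; the operator labels 1 and 2 are an arbitrary choice:

```python
from itertools import product

# Sketch of the part (a) assignment: generate the 2^4 design and confound
# the operator factor with the ABCD interaction, so each operator performs
# 8 runs and no main effect or lower-order interaction is aliased with the
# operator difference. (Operator labels 1/2 are arbitrary.)
runs = [dict(zip("ABCD", levels)) for levels in product([-1, 1], repeat=4)]
for run in runs:
    run["operator"] = 1 if run["A"] * run["B"] * run["C"] * run["D"] == 1 else 2

counts = {1: 0, 2: 0}
for run in runs:
    counts[run["operator"]] += 1
print(counts)  # {1: 8, 2: 8} -- the workload is split evenly
```

Because operator is aliased only with ABCD, all main effects and two- and three-factor interactions remain estimable free of the operator difference.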
(b) Assume the response measurements in the above table (from top to bottom) are 25, 71, 48, 45, 68, 40, 60, 65, 43, 80, 25, 104, 55, 86, 70, 76 and the dataset is given in aircraft.csv. Use SAS code to estimate the factor effects. Which factor effects appear to be large? Is there a large variation between the two operators?

It appears that A, C, D, AC, and AD have large effects, and the operator factor is large as well. Sorted in ascending order of the estimated effect (numeric values not recovered in transcription), the output lists: operator, AC, BCD, ACD, CD, BD, AB, ABC, BC, B, ABD, C, D, AD, A.

(c) Conduct an analysis of variance to verify the conclusion of (b).

The ANOVA result below shows that the factors A and D and the interactions AC and AD are significant, as well as the operator.
Type III ANOVA table (structure only; the DF, SS, MS, and F values did not survive transcription). The rows are A, C, D, AC, AD, and operator; the p-values for A, AC, AD, and operator print below a small threshold (shown as "<" in the original output), confirming their significance.

(d) Attach your SAS code here.

data Q2;
  input A B C D operator y;
  datalines;
  /* data from aircraft.csv (values not recovered) */
;
run;

data inter; set Q2;
  AB=A*B; AC=A*C; AD=A*D; BC=B*C; BD=B*D; CD=C*D;
  ABC=AB*C; ABD=AB*D; ACD=AC*D; BCD=BC*D;
  block=ABC*D;  /* block = ABCD, the operator assignment */
run;

proc reg outest=effects data=inter;
  model y=A B C D AB AC AD BC BD CD ABC ABD ACD BCD block;
run;

data effect2; set effects;
  drop y Intercept _RMSE_;
run;

proc transpose data=effect2 out=effect3;
run;

data effect4; set effect3;
  effect=col1*2;
run;

proc sort data=effect4; by effect;
run;

proc print data=effect4;
run;

data effect5; set effect4;
  where _NAME_^='block';
run;

proc print data=effect5;
run;

proc rank data=effect5 normal=blom;
  var effect;
  ranks neff;
run;

proc gplot;
  plot effect*neff=_NAME_;
run;

proc glm data=inter;
  class A C D AC AD;
  model y=A C D AC AD;
run;
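The factor-effect arithmetic behind the SAS estimates can be cross-checked by hand. The sketch below assumes the 16 responses are listed in standard (Yates) order with A varying fastest; that assumption is not stated in the surviving text, but it reproduces the solution's finding that A, C, D, AC, AD, and the operator (aliased with ABCD) are the large effects:

```python
# Cross-check of the 2^4 factor-effect estimates, assuming standard
# (Yates) run order with A varying fastest -- an assumption, since the
# original design table did not survive transcription.
y = [25, 71, 48, 45, 68, 40, 60, 65, 43, 80, 25, 104, 55, 86, 70, 76]

A = [-1, 1] * 8                  # A varies fastest
B = [-1, -1, 1, 1] * 4
C = ([-1] * 4 + [1] * 4) * 2
D = [-1] * 8 + [1] * 8

def effect(signs, y):
    """Effect = (mean response at +1) - (mean at -1) = contrast / 8."""
    return sum(s * v for s, v in zip(signs, y)) / 8.0

AC = [a * c for a, c in zip(A, C)]
AD = [a * d for a, d in zip(A, D)]
ABCD = [a * b * c * d for a, b, c, d in zip(A, B, C, D)]  # aliased with operator

for name, signs in [("A", A), ("B", B), ("C", C), ("D", D),
                    ("AC", AC), ("AD", AD), ("operator=ABCD", ABCD)]:
    print(name, effect(signs, y))
# A 21.625, B 3.125, C 9.875, D 14.625, AC -18.125, AD 16.625, ABCD -18.625
```

Under this ordering the largest magnitudes are exactly A, AC, AD, ABCD (operator), D, and C, matching the sorted effect list in the solution.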
3. A study was carried out to compare the writing lifetime of four premium brands of pens. It was thought that the writing surface would affect lifetime, so three different surfaces were used and the data are given below.

[Data table: lifetimes by Surface and Brand, with surface and brand averages; numeric entries not recovered.]

(a) What design is this?

Randomized complete block design (RCBD), because the surfaces represent a known source of variation and therefore serve as the block factor.

(b) State the statistical model with assumptions.

y_ij = μ + τ_i + β_j + ε_ij, with Σ_i τ_i = 0, Σ_j β_j = 0, and ε_ij ~ N(0, σ²), where τ_i is the brand effect and β_j the surface (block) effect.

(c) How would you check whether there exists any significant interaction between the surfaces and brands of pens? State your hypothesis in mathematical notation.

Use Tukey's one-degree-of-freedom test for nonadditivity:

y_ij = μ + τ_i + β_j + γ τ_i β_j + ε_ij,
H0: γ = 0 vs. H1: γ ≠ 0.

(d) Analyze the data using the given dataset pen.csv.

Tukey's one-degree-of-freedom test shows that the interaction term q is not significant (p-value not recovered in transcription):

Source    DF   Type III SS   Mean Square   F Value   Pr > F
surface
brand
q
(numeric values not recovered)

So use the additive model y_ij = μ + τ_i + β_j + ε_ij. The Type III SS ANOVA table shows that both surface and brand have significant effects:
Source    DF   Type III SS   Mean Square   F Value   Pr > F
surface
brand
(numeric values not recovered)

Check model adequacy: the residual plot and QQ-plot show no unusual pattern, and none of the normality tests reported by PROC UNIVARIATE (Shapiro-Wilk, Kolmogorov-Smirnov, Cramer-von Mises, Anderson-Darling) flags a departure from normality of the residuals (test statistics and p-values not recovered).

(e) Attach your SAS code.

data Q3;
  input surface brand lifetime;
  datalines;
  /* data from pen.csv (values not recovered) */
;
run;

proc glm data=Q3;
  class surface brand;
  model lifetime=surface brand;
  output out=diag r=res p=pred;
run;

data two; set diag;
  q=pred*pred;
run;

proc glm data=two;
  class surface brand;
  model lifetime=surface brand q/ss3;
run;

proc sgplot data=diag;
  scatter x=pred y=res;
  refline 0;
run;

proc univariate data=diag normal;
  var res;
  qqplot res/normal (L=1 mu=est sigma=est);
run;

(f) Assume that in this study 3 observations were collected for each combination and each value in the above table was the average of 3 replicates. A two-way ANOVA model with interaction is fitted and the MSE is [value missing]. Complete the following ANOVA table and draw conclusions.
Completed ANOVA table (the SS, MS, and F entries did not survive transcription; the DF column follows from the design, with a = 4 brands, b = 3 surfaces, n = 3 replicates):

Source       DF   SS   MS   F-value   P-value
Brand        3                        <0.001
Surface      2                        <0.01
Interaction  6                        >0.1
Error        24

Both the brand and the surface have a significant effect on the lifetime, but their interaction does not.

Effect estimates (signs chosen so that each set of effects sums to zero; the subtracted means were lost in transcription):

μ̂ = 692
Brand: τ̂1 = 46, τ̂2 = -10, τ̂3 = -21, τ̂4 = -15
Surface: β̂1 = 13.25, β̂2 = 2.75, β̂3 = -16
Interaction: (τβ)11 = 9.25, (τβ)13 = 2, (τβ)21 = 9.75, (τβ)22 = 8.75, (τβ)23 = 1, (τβ)32 = 1.75, (τβ)33 = 10, (τβ)42 = 0.75, (τβ)43 = 13 (the remaining entries and the interaction signs were lost)

SS_Brand = 3·3·(46² + 10² + 21² + 15²) = 25,938
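The sum-of-squares arithmetic for part (f) can be verified numerically; the effect magnitudes are taken from the estimates above, with signs chosen only so that each set sums to zero (the signs do not affect the sums of squares):

```python
# Verify SS_Brand and SS_Surface from the estimated effects above.
# a = 4 brands, b = 3 surfaces, n = 3 replicates per cell.
a, b, n = 4, 3, 3
tau = [46, -10, -21, -15]        # brand effects (sum to zero)
beta = [13.25, 2.75, -16]        # surface effects (sum to zero)

ss_brand = b * n * sum(t ** 2 for t in tau)      # bn * sum(tau_i^2)
ss_surface = a * n * sum(s ** 2 for s in beta)   # an * sum(beta_j^2)
print(ss_brand, ss_surface)  # 25938 5269.5
```

The SS_Brand value matches the 25,938 computed in the solution, which also corroborates the recovered effect magnitudes.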
SS_Surface = 4·3·(13.25² + 2.75² + 16²) = 5,269.5
SS_BS = 3·Σ_i Σ_j (τβ)²_ij (several squared interaction terms were lost in transcription, so the numeric total is not recovered)

Or use the totals directly. If each table entry is the average lifetime over n = 3 replicates, the cell totals are recovered as y_ij. = 3·ȳ_ij., and likewise for the marginal and grand totals. Then

SS_Brand = Σ_i y_i..² / (bn) − y...² / (abn) = 25,938
SS_Surface = Σ_j y_.j.² / (an) − y...² / (abn)
SS_BS = Σ_i Σ_j y_ij.² / n − y...² / (abn) − SS_Brand − SS_Surface
SS_E = MSE · ab(n − 1)

4. In a study of carbohydrate uptake (Y) as a function of other factors in male diabetics, observations were taken as follows:

[Data table: Y, Age (x1), Weight (x2), Dietary Protein (x3); numeric entries not recovered.]
Analyze these data to determine which (if any) of the predictor variables (including any appropriate interactions) appear to significantly affect carbohydrate uptake. Throughout, set α = 0.05, but for simplicity do not employ any adjustments for multiplicity/multiple inferences when assessing the effects of the predictor variables. Remember to assess the quality of the fit via standard diagnostics. Attach supporting components of your computer code. Report your findings. The data are found in the file diet.csv.

To start, always plot the data! Sample R code:

diet.df = read.csv( file.choose() )
attach( diet.df )
Y = Y; X1 = Age; X2 = Weight; X3 = Dietary.Protein
pairs( cbind(Y,X1,X2,X3), pch=19 )
No disturbing patterns are seen in the scatterplot matrix. Now fit the model; sample R code follows (notice the use of centered predictors to properly accommodate the higher-order interaction terms):

x1 = Age - mean(Age); x2 = Weight - mean(Weight)
x3 = Dietary.Protein - mean(Dietary.Protein)
diet.lm = lm( Y ~ x1*x2*x3 )
anova( diet.lm )

This yields the following ANOVA table (output edited):
Analysis of Variance Table

Response: Y
            Df  Sum Sq  Mean Sq  F value  Pr(>F)
x1
x2
x3
x1:x2
x1:x3
x2:x3
x1:x2:x3
Residuals
(numeric values not recovered)

From the ANOVA table, the sequential sums of squares, read from the bottom up (where they coincide with the partial SS), show no significant interaction of any type (pointwise, at the 5% level). Formally, we test this via:

anova( lm(Y~x1+x2+x3), diet.lm )

producing

Model 1: Y ~ x1 + x2 + x3
Model 2: Y ~ x1 * x2 * x3
(edited output; numeric values not recovered)

The P-value for testing all four interaction terms simultaneously exceeds α = 0.05. Again, no interactions are significant. Move now to a reduced model with only main-effect terms (so return to the original, uncentered predictor variables):

dietrm.lm = lm( Y~Age+Weight+Dietary.Protein ); anova( dietrm.lm )

Analysis of Variance Table

Response: Y
                 Df  Sum Sq  Mean Sq  F value  Pr(>F)
Age
Weight
Dietary.Protein
Residuals
(numeric values not recovered)

Examining the partial SS shows that Dietary.Protein is significant at the (pointwise) 5% level. To study the other terms we can either (i) rearrange the sequential order of the reduced-model ANOVA to isolate Weight and then Age, and test their partial SS contributions, or (ii) since each is a 1 d.f. test, just examine the t-tests assessing each pointwise β-coefficient. The latter approach is faster:

summary( dietrm.lm )
producing (output edited)

Call:
lm(formula = Y ~ Age + Weight + Dietary.Protein)

Coefficients:
                 Estimate  Std. Error  t value  Pr(>|t|)
(Intercept)
Age
Weight
Dietary.Protein
(numeric values not recovered)

We see Age is insignificant (P > 0.05), but Weight is significant (P < 0.05), each at the (pointwise) 5% level. Thus a final reduced model retains only Weight and Dietary.Protein:

dietfinal.lm = lm( Y~Weight+Dietary.Protein ); summary( dietfinal.lm )

Call:
lm(formula = Y ~ Weight + Dietary.Protein)
(edited output; numeric values not recovered)

For diagnostic quality assessment:

(i) Check VIFs for multicollinearity between Weight and Dietary.Protein:

library( car )
vif( dietfinal.lm ); mean(vif( dietfinal.lm ))

Since both VIFs are below 10 and their mean is clearly below 6, no concerns with multicollinearity are evident.

(ii) Check the normal Q-Q plot: sample R code

qqnorm( resid(dietfinal.lm), main=NULL, pch=19 )
qqline( resid(dietfinal.lm) )

produces the following graphic (no substantive concerns are evidenced).
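For the two-predictor case in item (i), both VIFs reduce to 1/(1 − r²), where r is the sample correlation between the two predictors. A small sketch of that relationship (not the car::vif() implementation; the r values below are hypothetical):

```python
# With exactly two predictors, VIF_1 = VIF_2 = 1 / (1 - r^2), where r is
# the correlation between the predictors; weak correlation keeps the VIF
# near its floor of 1, far below the usual cutoff of 10.
def vif_two(r):
    return 1.0 / (1.0 - r ** 2)

print(vif_two(0.0))  # 1.0 (uncorrelated predictors)
print(vif_two(0.5))  # about 1.33
```

This makes the "mean VIF clearly below 6" check above easy to interpret: with two predictors it holds unless |r| is quite close to 1.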
(iii) Studentized residual plot (with outlier screen): sample R code is

n = length(Y); p = length( coef(dietfinal.lm) )
tcrit = qt( 1 - .5*(.05/n), n-p-1 )
plot( rstudent(dietfinal.lm) ~ fitted(dietfinal.lm), pch=19,
      ylim=c(-ceiling(tcrit), ceiling(tcrit)) )
abline( h=0 )
abline( h=tcrit, lty=2 ); abline( h=-tcrit, lty=2 )
From the residual plot, no troublesome patterns are seen, and no outliers extend past the screening limits of ±t(1 − 0.05/(2n); n − p − 1).

(iv) Influence measures: sample R code is

influence.measures( dietfinal.lm )

which produces the following output (edited; row values not recovered, with flagged rows marked by * in the "inf" column):

  dfb.1_  dfb.Wght  dfb.Dt.P  dffit  cov.r  cook.d  hat  inf
  ...                                                      *
We see observations at i = 4 and i = 12 are marked for further study:

- At i = 4 and i = 12 the hat matrix diagonals h_ii exceed 2p/n = 0.3, indicating high leverage at these points.
- At i = 4 the value of DFFITS exceeds 1 in absolute value, so this point again exhibits high influence.

The Cook's distance D_i values are available as the sixth column of the influence.measures object, so we can check their associated F-probability values via

Di = influence.measures(dietfinal.lm)$infmat[,6]
which( pf(Di, df1=p, df2=n-p) > 0.5 )

the result of which is null. Thus no influence is seen on the Cook's distance metric. Lastly, no values of DFBETAS for Weight or Dietary.Protein exceed 1 in absolute value, so no influence is seen on that measure.

5. A large dataset of n = 1030 samples of concrete involving a total of p − 1 = 8 predictor variables was collected:

x1 = Age, x2 = Cement, x3 = Furnace Slag, x4 = Superplasticizer, x5 = Water, x6 = Fly Ash, x7 = Coarse Aggregate, x8 = Fine Aggregate,

along with Y = Compressive Strength. The data appear in the file concrete.csv. Consider a multiple linear regression (MLR) model for these data, with E[Y] = β0 + β1X1 + β2X2 + β3X3 + β4X4 + β5X5 + β6X6 + β7X7 + β8X8. Conduct a variable selection search to identify a possible reduced model among the eight predictor variables with this data set. Employ backward elimination and take minimum-BIC as your selection criterion. Attach supporting components of your computer code. What is the recommended set of variables for further study?

Begin by loading the data and creating the X variables (notice that the response variable Y has already been recorded as Compressive Strength):
concrete.df = read.csv( file.choose() )
attach( concrete.df )
Y = Y
x1 = age
x2 = cement
x3 = slag
x4 = superplasticizer
x5 = water
x6 = fly.ash
x7 = coarse.aggregate
x8 = fine.aggregate

Always plot the data! The command

pairs( concrete.df )

produces a scatterplot matrix (see next page), in which a number of interesting patterns appear.
25 Next, build the regression fit and apply backward elimination with BIC control: library( leaps ) cement.lm = lm( Y ~x1+x2+x3+x4+x5+x6+x7+x8 ) n = length(y) step( cement.lm, direction="backward", k=log(n) ) #BIC This produces [output edited -- note that R writes AIC in the leaps output, but we did institute the stepwise regression with the BIC option k=log(n)]: Start: AIC=
Y ~ x1 + x2 + x3 + x4 + x5 + x6 + x7 + x8

(rows "- x1" through "- x8" and <none>, with Df, Sum of Sq, RSS, and AIC columns; numeric values not recovered. x8 is eliminated first.)

Step: AIC = (value not recovered)
Y ~ x1 + x2 + x3 + x4 + x5 + x6 + x7

(x7 is eliminated next; numeric values not recovered.)

Step: AIC = (value not recovered)
Y ~ x1 + x2 + x3 + x4 + x5 + x6

(<none> now gives the smallest criterion value, so elimination stops.)

Call:
lm(formula = Y ~ x1 + x2 + x3 + x4 + x5 + x6)

Coefficients: (Intercept), x1, x2, x3, x4, x5, x6 (numeric values not recovered)
We see that after two backward steps, a reduced model with only the first six predictors

x1 = age, x2 = cement, x3 = slag, x4 = superplasticizer, x5 = water, x6 = fly.ash

is recommended for further study.
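The criterion that step(..., k = log(n)) minimizes at each stage can be sketched as follows; the RSS values below are hypothetical placeholders (the real ones come from concrete.csv), so this illustrates only the comparison rule, not the actual fit:

```python
from math import log

# BIC (up to an additive constant) for a Gaussian linear model:
# n * log(RSS / n) + k * log(n), where k counts estimated coefficients.
def bic(n, rss, n_coef):
    return n * log(rss / n) + n_coef * log(n)

n = 1030
full = bic(n, 112000.0, 9)      # intercept + 8 predictors (hypothetical RSS)
reduced = bic(n, 112050.0, 8)   # one predictor dropped; RSS barely rises
print(reduced < full)  # True: backward elimination would take the drop
```

The log(n) penalty per coefficient is why a drop that barely increases RSS still lowers BIC; elimination stops once every candidate drop raises the criterion.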
6. Consider the simple linear regression model: Yi ~ indep. N(β0 + β1Xi, σ²), i = 1, ..., n, where in particular it is known that β0 = 1, so that E[Y] = 1 + β1X. Suppose interest exists in estimating the X value at which E[Y] = 0. Let this target parameter be ξ.

a) Find ξ as a function of β1. Also find the maximum likelihood estimator for ξ. Call this ξ̂.

This is essentially a one-parameter inverse regression problem. We have E[Y] = 1 + β1X. Clearly, at E[Y] = 0 we have 0 = 1 + β1X, so solving for X produces ξ = −1/β1.

To find the MLE ξ̂, appeal to ML invariance and first find β̂1. The fastest way to do so is to recognize that if E[Y] = 1 + β1X, then E[Y − 1] = β1X. That is, we essentially regress the new response variable (Yi − 1) against Xi through the origin! Referring to the equations in Sec. 4.4 of Kutner et al., the least squares estimator for β1 is β̂1 = Σ_{i=1}^n Xi(Yi − 1) / Σ_{i=1}^n Xi². Under the homogeneous-variance, normal-parent assumptions here, this estimator is identical to the MLE, so by invariance take ξ̂ = −1/β̂1 = −Σ_{i=1}^n Xi² / Σ_{i=1}^n Xi(Yi − 1).

b) Recall from Casella & Berger that the Delta Method can be used to determine the asymptotic features of a function of random variables. In particular, for a random variable U and a differentiable function g(u), where E[U] = θ, a first-order approximation to E[g(U)] is

E[g(U)] ≈ g(θ) + g′(θ) E(U − θ).

Use this to find a first-order approximation for E[ξ̂].

Let g(β1) = ξ = −1/β1. The MLE for β1 is unbiased, so E[β̂1] = β1. Then from the Delta Method,

E[ξ̂] = E[−1/β̂1] ≈ g(β1) + g′(β1) E(β̂1 − β1) = −1/β1 + g′(β1)·0 = −1/β1 = ξ,

i.e., ξ̂ is unbiased to first order.

c) In part (b), a second-order approximation to E[g(U)] is also available from Casella & Berger's book:

E[g(U)] ≈ g(θ) + g′(θ) E(U − θ) + ½ g″(θ) E[(U − θ)²].

Use this to find a second-order approximation for E[ξ̂].

Again, let g(β1) = ξ = −1/β1.
We know that the MLE for β1 is unbiased, so E(β̂1 − β1) = 0, and for the second-order Delta Method approximation we need E[(β̂1 − β1)²] = Var[β̂1]. This latter quantity is

Var[β̂1] = Var[ Σ_{i=1}^n Xi(Yi − 1) / Σ_{j=1}^n Xj² ] = Var[ Σ_{i=1}^n Xi(Yi − 1) ] / ( Σ_{j=1}^n Xj² )²
= Σ_{i=1}^n Xi² Var[Yi − 1] / (Σ_{j=1}^n Xj²)² = Σ_{i=1}^n Xi² Var[Yi] / (Σ_{j=1}^n Xj²)² = σ² Σ_{i=1}^n Xi² / (Σ_{j=1}^n Xj²)² = σ² / Σ_{j=1}^n Xj².

Collecting all this together, with g′(β1) = 1/β1² and g″(β1) = −2/β1³, yields

E[ξ̂] = E[−1/β̂1] ≈ g(β1) + g′(β1)·0 + ½ g″(β1) Var[β̂1]
      = −1/β1 + ½ (−2/β1³) σ² / Σ_{j=1}^n Xj²
      = −1/β1 − σ² / (β1³ Σ_{j=1}^n Xj²)
      = ξ + ξ³ σ² / Σ_{j=1}^n Xj².

(We see that to second order, a bias exists in the point estimator. However, it can in fact be shown that E[ξ̂] does not exist, as E[|ξ̂|] diverges. Thus, one must always be careful with these sorts of approximate expansions.)
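A toy-data check of parts (a) and (c) can make the formulas concrete; the data and σ² below are made up, constructed so that Y = 1 − X exactly (hence β1 = −1 and ξ = 1):

```python
# Toy check of problem 6: beta1-hat = sum(X*(Y-1)) / sum(X^2), and
# xi-hat = -1/beta1-hat. Data are constructed so Y = 1 - X exactly,
# i.e. the true beta1 = -1 and the true xi = 1.
X = [1.0, 2.0, 3.0, 4.0]
Y = [0.0, -1.0, -2.0, -3.0]

sxx = sum(x * x for x in X)
b1_hat = sum(x * (y - 1.0) for x, y in zip(X, Y)) / sxx
xi_hat = -1.0 / b1_hat

sigma2 = 0.25                           # hypothetical error variance
bias2 = xi_hat ** 3 * sigma2 / sxx      # second-order bias term, xi^3 * sigma^2 / sum(X^2)

print(b1_hat, xi_hat)  # -1.0 1.0
```

With noiseless data the estimator recovers β1 and ξ exactly, and the second-order bias term σ²ξ³/ΣXj² shrinks as ΣXj² grows, matching the derivation above.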
More informationNo other aids are allowed. For example you are not allowed to have any other textbook or past exams.
UNIVERSITY OF TORONTO SCARBOROUGH Department of Computer and Mathematical Sciences Sample Exam Note: This is one of our past exams, In fact the only past exam with R. Before that we were using SAS. In
More informationSAS Procedures Inference about the Line ffl model statement in proc reg has many options ffl To construct confidence intervals use alpha=, clm, cli, c
Inference About the Slope ffl As with all estimates, ^fi1 subject to sampling var ffl Because Y jx _ Normal, the estimate ^fi1 _ Normal A linear combination of indep Normals is Normal Simple Linear Regression
More informationSTAT 3900/4950 MIDTERM TWO Name: Spring, 2015 (print: first last ) Covered topics: Two-way ANOVA, ANCOVA, SLR, MLR and correlation analysis
STAT 3900/4950 MIDTERM TWO Name: Spring, 205 (print: first last ) Covered topics: Two-way ANOVA, ANCOVA, SLR, MLR and correlation analysis Instructions: You may use your books, notes, and SPSS/SAS. NO
More informationRegression Diagnostics
Diag 1 / 78 Regression Diagnostics Paul E. Johnson 1 2 1 Department of Political Science 2 Center for Research Methods and Data Analysis, University of Kansas 2015 Diag 2 / 78 Outline 1 Introduction 2
More informationSTATISTICS 174: APPLIED STATISTICS TAKE-HOME FINAL EXAM POSTED ON WEBPAGE: 6:00 pm, DECEMBER 6, 2004 HAND IN BY: 6:00 pm, DECEMBER 7, 2004 This is a
STATISTICS 174: APPLIED STATISTICS TAKE-HOME FINAL EXAM POSTED ON WEBPAGE: 6:00 pm, DECEMBER 6, 2004 HAND IN BY: 6:00 pm, DECEMBER 7, 2004 This is a take-home exam. You are expected to work on it by yourself
More informationSTATISTICS 110/201 PRACTICE FINAL EXAM
STATISTICS 110/201 PRACTICE FINAL EXAM Questions 1 to 5: There is a downloadable Stata package that produces sequential sums of squares for regression. In other words, the SS is built up as each variable
More informationMS&E 226: Small Data
MS&E 226: Small Data Lecture 15: Examples of hypothesis tests (v5) Ramesh Johari ramesh.johari@stanford.edu 1 / 32 The recipe 2 / 32 The hypothesis testing recipe In this lecture we repeatedly apply the
More informationInference for Regression
Inference for Regression Section 9.4 Cathy Poliak, Ph.D. cathy@math.uh.edu Office in Fleming 11c Department of Mathematics University of Houston Lecture 13b - 3339 Cathy Poliak, Ph.D. cathy@math.uh.edu
More informationLecture 3. Experiments with a Single Factor: ANOVA Montgomery 3.1 through 3.3
Lecture 3. Experiments with a Single Factor: ANOVA Montgomery 3.1 through 3.3 Fall, 2013 Page 1 Tensile Strength Experiment Investigate the tensile strength of a new synthetic fiber. The factor is the
More informationMultiple Linear Regression
Multiple Linear Regression University of California, San Diego Instructor: Ery Arias-Castro http://math.ucsd.edu/~eariasca/teaching.html 1 / 42 Passenger car mileage Consider the carmpg dataset taken from
More informationSAS Commands. General Plan. Output. Construct scatterplot / interaction plot. Run full model
Topic 23 - Unequal Replication Data Model Outline - Fall 2013 Parameter Estimates Inference Topic 23 2 Example Page 954 Data for Two Factor ANOVA Y is the response variable Factor A has levels i = 1, 2,...,
More informationCAS MA575 Linear Models
CAS MA575 Linear Models Boston University, Fall 2013 Midterm Exam (Correction) Instructor: Cedric Ginestet Date: 22 Oct 2013. Maximal Score: 200pts. Please Note: You will only be graded on work and answers
More informationBusiness Statistics. Tommaso Proietti. Linear Regression. DEF - Università di Roma 'Tor Vergata'
Business Statistics Tommaso Proietti DEF - Università di Roma 'Tor Vergata' Linear Regression Specication Let Y be a univariate quantitative response variable. We model Y as follows: Y = f(x) + ε where
More informationBooklet of Code and Output for STAC32 Final Exam
Booklet of Code and Output for STAC32 Final Exam December 8, 2014 List of Figures in this document by page: List of Figures 1 Popcorn data............................. 2 2 MDs by city, with normal quantile
More information5.3 Three-Stage Nested Design Example
5.3 Three-Stage Nested Design Example A researcher designs an experiment to study the of a metal alloy. A three-stage nested design was conducted that included Two alloy chemistry compositions. Three ovens
More information1 Introduction 1. 2 The Multiple Regression Model 1
Multiple Linear Regression Contents 1 Introduction 1 2 The Multiple Regression Model 1 3 Setting Up a Multiple Regression Model 2 3.1 Introduction.............................. 2 3.2 Significance Tests
More information14 Multiple Linear Regression
B.Sc./Cert./M.Sc. Qualif. - Statistics: Theory and Practice 14 Multiple Linear Regression 14.1 The multiple linear regression model In simple linear regression, the response variable y is expressed in
More informationSTAT 350: Summer Semester Midterm 1: Solutions
Name: Student Number: STAT 350: Summer Semester 2008 Midterm 1: Solutions 9 June 2008 Instructor: Richard Lockhart Instructions: This is an open book test. You may use notes, text, other books and a calculator.
More informationMATH 644: Regression Analysis Methods
MATH 644: Regression Analysis Methods FINAL EXAM Fall, 2012 INSTRUCTIONS TO STUDENTS: 1. This test contains SIX questions. It comprises ELEVEN printed pages. 2. Answer ALL questions for a total of 100
More informationGeneralized Linear Models
Generalized Linear Models Lecture 3. Hypothesis testing. Goodness of Fit. Model diagnostics GLM (Spring, 2018) Lecture 3 1 / 34 Models Let M(X r ) be a model with design matrix X r (with r columns) r n
More informationCh 2: Simple Linear Regression
Ch 2: Simple Linear Regression 1. Simple Linear Regression Model A simple regression model with a single regressor x is y = β 0 + β 1 x + ɛ, where we assume that the error ɛ is independent random component
More informationLecture 1: Linear Models and Applications
Lecture 1: Linear Models and Applications Claudia Czado TU München c (Claudia Czado, TU Munich) ZFS/IMS Göttingen 2004 0 Overview Introduction to linear models Exploratory data analysis (EDA) Estimation
More informationStat 5102 Final Exam May 14, 2015
Stat 5102 Final Exam May 14, 2015 Name Student ID The exam is closed book and closed notes. You may use three 8 1 11 2 sheets of paper with formulas, etc. You may also use the handouts on brand name distributions
More informationRegression Analysis for Data Containing Outliers and High Leverage Points
Alabama Journal of Mathematics 39 (2015) ISSN 2373-0404 Regression Analysis for Data Containing Outliers and High Leverage Points Asim Kumer Dey Department of Mathematics Lamar University Md. Amir Hossain
More informationChapter 1 Linear Regression with One Predictor
STAT 525 FALL 2018 Chapter 1 Linear Regression with One Predictor Professor Min Zhang Goals of Regression Analysis Serve three purposes Describes an association between X and Y In some applications, the
More informationChapter 5 Introduction to Factorial Designs Solutions
Solutions from Montgomery, D. C. (1) Design and Analysis of Experiments, Wiley, NY Chapter 5 Introduction to Factorial Designs Solutions 5.1. The following output was obtained from a computer program that
More informationLecture 10: Experiments with Random Effects
Lecture 10: Experiments with Random Effects Montgomery, Chapter 13 1 Lecture 10 Page 1 Example 1 A textile company weaves a fabric on a large number of looms. It would like the looms to be homogeneous
More informationMath 423/533: The Main Theoretical Topics
Math 423/533: The Main Theoretical Topics Notation sample size n, data index i number of predictors, p (p = 2 for simple linear regression) y i : response for individual i x i = (x i1,..., x ip ) (1 p)
More informationMultiple Linear Regression. Chapter 12
13 Multiple Linear Regression Chapter 12 Multiple Regression Analysis Definition The multiple regression model equation is Y = b 0 + b 1 x 1 + b 2 x 2 +... + b p x p + ε where E(ε) = 0 and Var(ε) = s 2.
More informationFractional Factorial Designs
Fractional Factorial Designs ST 516 Each replicate of a 2 k design requires 2 k runs. E.g. 64 runs for k = 6, or 1024 runs for k = 10. When this is infeasible, we use a fraction of the runs. As a result,
More informationLinear Regression. In this lecture we will study a particular type of regression model: the linear regression model
1 Linear Regression 2 Linear Regression In this lecture we will study a particular type of regression model: the linear regression model We will first consider the case of the model with one predictor
More informationLab 3 A Quick Introduction to Multiple Linear Regression Psychology The Multiple Linear Regression Model
Lab 3 A Quick Introduction to Multiple Linear Regression Psychology 310 Instructions.Work through the lab, saving the output as you go. You will be submitting your assignment as an R Markdown document.
More informationIES 612/STA 4-573/STA Winter 2008 Week 1--IES 612-STA STA doc
IES 612/STA 4-573/STA 4-576 Winter 2008 Week 1--IES 612-STA 4-573-STA 4-576.doc Review Notes: [OL] = Ott & Longnecker Statistical Methods and Data Analysis, 5 th edition. [Handouts based on notes prepared
More informationSuppose we needed four batches of formaldehyde, and coulddoonly4runsperbatch. Thisisthena2 4 factorial in 2 2 blocks.
58 2. 2 factorials in 2 blocks Suppose we needed four batches of formaldehyde, and coulddoonly4runsperbatch. Thisisthena2 4 factorial in 2 2 blocks. Some more algebra: If two effects are confounded with
More informationLecture 3: Inference in SLR
Lecture 3: Inference in SLR STAT 51 Spring 011 Background Reading KNNL:.1.6 3-1 Topic Overview This topic will cover: Review of hypothesis testing Inference about 1 Inference about 0 Confidence Intervals
More informationTopic 20: Single Factor Analysis of Variance
Topic 20: Single Factor Analysis of Variance Outline Single factor Analysis of Variance One set of treatments Cell means model Factor effects model Link to linear regression using indicator explanatory
More informationStat 500 Midterm 2 12 November 2009 page 0 of 11
Stat 500 Midterm 2 12 November 2009 page 0 of 11 Please put your name on the back of your answer book. Do NOT put it on the front. Thanks. Do not start until I tell you to. The exam is closed book, closed
More information1) Answer the following questions as true (T) or false (F) by circling the appropriate letter.
1) Answer the following questions as true (T) or false (F) by circling the appropriate letter. T F T F T F a) Variance estimates should always be positive, but covariance estimates can be either positive
More informationSTAT22200 Spring 2014 Chapter 14
STAT22200 Spring 2014 Chapter 14 Yibi Huang May 27, 2014 Chapter 14 Incomplete Block Designs 14.1 Balanced Incomplete Block Designs (BIBD) Chapter 14-1 Incomplete Block Designs A Brief Introduction to
More informationOverview Scatter Plot Example
Overview Topic 22 - Linear Regression and Correlation STAT 5 Professor Bruce Craig Consider one population but two variables For each sampling unit observe X and Y Assume linear relationship between variables
More informationChapter 1 Statistical Inference
Chapter 1 Statistical Inference causal inference To infer causality, you need a randomized experiment (or a huge observational study and lots of outside information). inference to populations Generalizations
More informationStatistics for exp. medical researchers Regression and Correlation
Faculty of Health Sciences Regression analysis Statistics for exp. medical researchers Regression and Correlation Lene Theil Skovgaard Sept. 28, 2015 Linear regression, Estimation and Testing Confidence
More informationSCHOOL OF MATHEMATICS AND STATISTICS
RESTRICTED OPEN BOOK EXAMINATION (Not to be removed from the examination hall) Data provided: Statistics Tables by H.R. Neave MAS5052 SCHOOL OF MATHEMATICS AND STATISTICS Basic Statistics Spring Semester
More informationLecture 4. Random Effects in Completely Randomized Design
Lecture 4. Random Effects in Completely Randomized Design Montgomery: 3.9, 13.1 and 13.7 1 Lecture 4 Page 1 Random Effects vs Fixed Effects Consider factor with numerous possible levels Want to draw inference
More informationChapter 6 The 2 k Factorial Design Solutions
Solutions from Montgomery, D. C. () Design and Analysis of Experiments, Wiley, NY Chapter 6 The k Factorial Design Solutions 6.. An engineer is interested in the effects of cutting speed (A), tool geometry
More informationExample: Poisondata. 22s:152 Applied Linear Regression. Chapter 8: ANOVA
s:5 Applied Linear Regression Chapter 8: ANOVA Two-way ANOVA Used to compare populations means when the populations are classified by two factors (or categorical variables) For example sex and occupation
More informationComparison of a Population Means
Analysis of Variance Interested in comparing Several treatments Several levels of one treatment Comparison of a Population Means Could do numerous two-sample t-tests but... ANOVA provides method of joint
More informationInstitutionen för matematik och matematisk statistik Umeå universitet November 7, Inlämningsuppgift 3. Mariam Shirdel
Institutionen för matematik och matematisk statistik Umeå universitet November 7, 2011 Inlämningsuppgift 3 Mariam Shirdel (mash0007@student.umu.se) Kvalitetsteknik och försöksplanering, 7.5 hp 1 Uppgift
More informationStat/F&W Ecol/Hort 572 Review Points Ané, Spring 2010
1 Linear models Y = Xβ + ɛ with ɛ N (0, σ 2 e) or Y N (Xβ, σ 2 e) where the model matrix X contains the information on predictors and β includes all coefficients (intercept, slope(s) etc.). 1. Number of
More informationChapter 10 Building the Regression Model II: Diagnostics
Chapter 10 Building the Regression Model II: Diagnostics 許湘伶 Applied Linear Regression Models (Kutner, Nachtsheim, Neter, Li) hsuhl (NUK) LR Chap 10 1 / 41 10.1 Model Adequacy for a Predictor Variable-Added
More informationLecture 3. Experiments with a Single Factor: ANOVA Montgomery 3-1 through 3-3
Lecture 3. Experiments with a Single Factor: ANOVA Montgomery 3-1 through 3-3 Page 1 Tensile Strength Experiment Investigate the tensile strength of a new synthetic fiber. The factor is the weight percent
More informationRegression Model Building
Regression Model Building Setting: Possibly a large set of predictor variables (including interactions). Goal: Fit a parsimonious model that explains variation in Y with a small set of predictors Automated
More informationIntroduction to Design and Analysis of Experiments with the SAS System (Stat 7010 Lecture Notes)
Introduction to Design and Analysis of Experiments with the SAS System (Stat 7010 Lecture Notes) Asheber Abebe Discrete and Statistical Sciences Auburn University Contents 1 Completely Randomized Design
More informationLecture 4. Checking Model Adequacy
Lecture 4. Checking Model Adequacy Montgomery: 3-4, 15-1.1 Page 1 Model Checking and Diagnostics Model Assumptions 1 Model is correct 2 Independent observations 3 Errors normally distributed 4 Constant
More informationWeek 7 Multiple factors. Ch , Some miscellaneous parts
Week 7 Multiple factors Ch. 18-19, Some miscellaneous parts Multiple Factors Most experiments will involve multiple factors, some of which will be nuisance variables Dealing with these factors requires
More informationLecture 1 Linear Regression with One Predictor Variable.p2
Lecture Linear Regression with One Predictor Variablep - Basics - Meaning of regression parameters p - β - the slope of the regression line -it indicates the change in mean of the probability distn of
More informationSimple Regression Model Setup Estimation Inference Prediction. Model Diagnostic. Multiple Regression. Model Setup and Estimation.
Statistical Computation Math 475 Jimin Ding Department of Mathematics Washington University in St. Louis www.math.wustl.edu/ jmding/math475/index.html October 10, 2013 Ridge Part IV October 10, 2013 1
More informationDiagnostics and Transformations Part 2
Diagnostics and Transformations Part 2 Bivariate Linear Regression James H. Steiger Department of Psychology and Human Development Vanderbilt University Multilevel Regression Modeling, 2009 Diagnostics
More informationEXST 7015 Fall 2014 Lab 08: Polynomial Regression
EXST 7015 Fall 2014 Lab 08: Polynomial Regression OBJECTIVES Polynomial regression is a statistical modeling technique to fit the curvilinear data that either shows a maximum or a minimum in the curve,
More informationSCHOOL OF MATHEMATICS AND STATISTICS Autumn Semester
RESTRICTED OPEN BOOK EXAMINATION (Not to be removed from the examination hall) Data provided: "Statistics Tables" by H.R. Neave PAS 371 SCHOOL OF MATHEMATICS AND STATISTICS Autumn Semester 2008 9 Linear
More informationSCHOOL OF MATHEMATICS AND STATISTICS. Linear and Generalised Linear Models
SCHOOL OF MATHEMATICS AND STATISTICS Linear and Generalised Linear Models Autumn Semester 2017 18 2 hours Attempt all the questions. The allocation of marks is shown in brackets. RESTRICTED OPEN BOOK EXAMINATION
More informationBE640 Intermediate Biostatistics 2. Regression and Correlation. Simple Linear Regression Software: SAS. Emergency Calls to the New York Auto Club
BE640 Intermediate Biostatistics 2. Regression and Correlation Simple Linear Regression Software: SAS Emergency Calls to the New York Auto Club Source: Chatterjee, S; Handcock MS and Simonoff JS A Casebook
More informationOne-way ANOVA Model Assumptions
One-way ANOVA Model Assumptions STAT:5201 Week 4: Lecture 1 1 / 31 One-way ANOVA: Model Assumptions Consider the single factor model: Y ij = µ + α }{{} i ij iid with ɛ ij N(0, σ 2 ) mean structure random
More information