Answer to exercise 'height vs. age' (Juul)
- Ann Bruce
Question 1

Fitting a straight line to height for males in the age range 5-20, and making the corresponding illustration, is done by writing:

    proc reg data=juul;
      where age ge 5 and age le 20 and sex='male';
      model height=age;
    run;

    proc gplot data=juul;
      where age ge 5 and age le 20 and sex='male';
      plot height*age / haxis=axis1 vaxis=axis2 frame;
      axis1 value=(h=3) minor=none label=(h=3);
      axis2 value=(h=3) minor=none label=(a=90 R=0 H=3);
      symbol1 v=circle i=rl c=black h=2 l=1 w=2;
    run;

The variation appears to increase somewhat with age, but for ease of interpretation we abstain from performing any transformation. More importantly, linearity does not look entirely adequate, at least not beyond the age of 16. This is not very surprising.
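Since linearity looks doubtful, a natural next step is to add polynomial terms in age. The glm models under Question 2 use variables named age2 and age3 whose definition is not shown in this answer; a minimal sketch, assuming they are simply the square and the cube of age:

```sas
/* Assumption: age2 and age3 are the square and cube of age.
   The original answer does not show how they were created. */
data juul;
  set juul;
  age2 = age**2;   /* quadratic term */
  age3 = age**3;   /* cubic term */
run;
```

With these variables in place, a statement such as model height=age age2; in proc reg would give a direct test of the quadratic term for males.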
Question 2

Graphical fits of parabolas and cubic functions may be obtained simply by using i=rq or i=rc in the symbol statement in gplot. We get

[Figure: quadratic fits]

[Figure: cubic fits]

In order to fit the models for males and females simultaneously, we may use two different parametrisations. First, we choose the estimation-friendly version:
    proc glm data=juul;
      where age ge 5 and age le 20;
      class sex;
      model height=sex sex*age sex*age2 sex*age3 / noint solution;
    run;

which gives

    The GLM Procedure
    Dependent Variable: height

    Source                  DF   Sum of Squares   Mean Square   F Value   Pr > F
    Model                                                                 <.0001
    Error
    Uncorrected Total

    R-Square   Coeff Var   Root MSE   height Mean

    Source                  DF   Type I SS        Mean Square   F Value   Pr > F
    sex                                                                   <.0001
    age*sex                                                               <.0001
    age2*sex                                                              <.0001
    age3*sex                                                              <.0001

    Source                  DF   Type III SS      Mean Square   F Value   Pr > F
    sex                                                                   <.0001
    age*sex
    age2*sex                                                              <.0001
    age3*sex                                                              <.0001

    Parameter               Estimate   Standard Error   t Value   Pr > |t|
    sex female                                                    <.0001
    sex male                                                      <.0001
    age*sex female
    age*sex male
    age2*sex female
    age2*sex male                                                 <.0001
    age3*sex female                                               <.0001
    age3*sex male                                                 <.0001

or with the test-friendly parametrisation:
    proc glm data=juul;
      where age ge 5 and age le 20;
      class sex;
      model height=age age2 age3 sex sex*age sex*age2 sex*age3 / solution;
    run;

which gives

    The GLM Procedure
    Dependent Variable: height

    Source                  DF   Sum of Squares   Mean Square   F Value   Pr > F
    Model                                                                 <.0001
    Error
    Corrected Total

    R-Square   Coeff Var   Root MSE   height Mean

    Source                  DF   Type I SS        Mean Square   F Value   Pr > F
    age                                                                   <.0001
    age2                                                                  <.0001
    age3                                                                  <.0001
    sex                                                                   <.0001
    age*sex                                                               <.0001
    age2*sex                                                              <.0001
    age3*sex

    Source                  DF   Type III SS      Mean Square   F Value   Pr > F
    age
    age2                                                                  <.0001
    age3                                                                  <.0001
    sex
    age*sex
    age2*sex
    age3*sex

    Parameter               Estimate     Standard Error   t Value   Pr > |t|
    Intercept                        B                              <.0001
    age                              B
    age2                             B                              <.0001
    age3                             B                              <.0001
    sex female                       B
    sex male                         B   .                .         .
    age*sex female                   B
    age*sex male                     B   .                .         .
    age2*sex female                  B
    age2*sex male                    B   .                .         .
    age3*sex female                  B
    age3*sex male                    B   .                .         .

    NOTE: The X'X matrix has been found to be singular, and a generalized
    inverse was used to solve the normal equations. Terms whose estimates
    are followed by the letter 'B' are not uniquely estimable.

We see that the two genders do not deviate in the coefficient of the third order term (P=0.32), but (seen from Type I) they do differ in the coefficient of the second order term (F=44.04, P < 0.0001). From the estimation-friendly parametrisation, we may also note that the linear term is not significant for girls (P=0.67), but this is rather hard to interpret.

Question 3

Automatic fit with e.g. i=sm40 gives the result

[Figure: smoothed fit, i=sm40]

which pretty much resembles the parabolic fits.

Question 4

We now consider cutpoints at the ages 10, 12, 13 and 15 and define variables
denoting the number of years above these respective thresholds.

    data juul;
      set juul;
      /* definition of new intercept, at the age 5 */
      age5=age-5;
      extra_age10=max(age-10,0);
      extra_age12=max(age-12,0);
      extra_age13=max(age-13,0);
      extra_age15=max(age-15,0);
    run;

We now use these new variables to define a linear spline, and at the same time we test linearity using a contrast statement:

    /* by-group processing assumes that juul is sorted by sex */
    proc glm data=juul;
      where age ge 5 and age le 20;
      by sex;
      model height=age5 extra_age15 extra_age13 extra_age12 extra_age10 / solution;
      contrast 'all' extra_age10 1, extra_age12 1, extra_age13 1, extra_age15 1;
    run;

From this we get

    sex=male

    The GLM Procedure
    Number of Observations Used   502
    Dependent Variable: height

    Source                  DF   Sum of Squares   Mean Square   F Value   Pr > F
    Model                                                                 <.0001
    Error
    Corrected Total

    R-Square   Coeff Var   Root MSE   height Mean
    Source                  DF   Type I SS        Mean Square   F Value   Pr > F
    age5                                                                  <.0001
    extra_age15                                                           <.0001
    extra_age13
    extra_age12
    extra_age10

    Source                  DF   Type III SS      Mean Square   F Value   Pr > F
    age5                                                                  <.0001
    extra_age15                                                           <.0001
    extra_age13
    extra_age12
    extra_age10

    Contrast                DF   Contrast SS      Mean Square   F Value   Pr > F
    all                                                                   <.0001

    Parameter               Estimate   Standard Error   t Value   Pr > |t|
    Intercept                                                     <.0001
    age5                                                          <.0001
    extra_age15                                                   <.0001
    extra_age13
    extra_age12
    extra_age10

Question 5

We find the first deviation from the straight line at the age of 13 (based on the Type I tests, P=0.0002).

Question 6

For girls, we get similarly

    sex=female

    The GLM Procedure
    Number of Observations Used   628
    Dependent Variable: height

    Source                  DF   Sum of Squares   Mean Square   F Value   Pr > F
    Model                                                                 <.0001
    Error
    Corrected Total

    R-Square   Coeff Var   Root MSE   height Mean
    Source                  DF   Type I SS        Mean Square   F Value   Pr > F
    age5                                                                  <.0001
    extra_age15                                                           <.0001
    extra_age13                                                           <.0001
    extra_age12
    extra_age10

    Source                  DF   Type III SS      Mean Square   F Value   Pr > F
    age5                                                                  <.0001
    extra_age15
    extra_age13
    extra_age12
    extra_age10

    Contrast                DF   Contrast SS      Mean Square   F Value   Pr > F
    all                                                                   <.0001

    Parameter               Estimate   Standard Error   t Value   Pr > |t|
    Intercept                                                     <.0001
    age5                                                          <.0001
    extra_age15
    extra_age13
    extra_age12
    extra_age10

and again, we see the first deviation at the age of 13 (F=17.95, P < 0.0001).

Question 7

A model for the two genders simultaneously is defined as

    proc glm data=juul;
      where age ge 5 and age le 20;
      class sex;
      model height=age5 extra_age10 extra_age12 extra_age13 extra_age15
                   sex sex*extra_age15 sex*extra_age13 sex*extra_age12
                   sex*extra_age10 sex*age5 / solution;
    run;

and yields
    The GLM Procedure
    Dependent Variable: height

    Source                  DF   Sum of Squares   Mean Square   F Value   Pr > F
    Model                                                                 <.0001
    Error
    Corrected Total

    R-Square   Coeff Var   Root MSE   height Mean

    Source                  DF   Type I SS        Mean Square   F Value   Pr > F
    age5                                                                  <.0001
    extra_age10                                                           <.0001
    extra_age12                                                           <.0001
    extra_age13                                                           <.0001
    extra_age15                                                           <.0001
    sex                                                                   <.0001
    extra_age15*sex                                                       <.0001
    extra_age13*sex                                                       <.0001
    extra_age12*sex
    extra_age10*sex
    age5*sex

    Source                  DF   Type III SS      Mean Square   F Value   Pr > F
    age5                                                                  <.0001
    extra_age10
    extra_age12
    extra_age13
    extra_age15                                                           <.0001
    sex
    extra_age15*sex
    extra_age13*sex
    extra_age12*sex
    extra_age10*sex
    age5*sex

    Parameter                 Estimate     Standard Error   t Value   Pr > |t|
    Intercept                          B                              <.0001
    age5                               B                              <.0001
    extra_age10                        B
    extra_age12                        B
    extra_age13                        B
    extra_age15                        B                              <.0001
    sex female                         B
    sex male                           B   .                .         .
    extra_age15*sex female             B
    extra_age15*sex male               B   .                .         .
    extra_age13*sex female             B
    extra_age13*sex male               B   .                .         .
    extra_age12*sex female             B
    extra_age12*sex male               B   .                .         .
    extra_age10*sex female             B
    extra_age10*sex male               B   .                .         .
    age5*sex female                    B
    age5*sex male                      B   .                .         .

    NOTE: The X'X matrix has been found to be singular, and a generalized
    inverse was used to solve the normal equations. Terms whose estimates
    are followed by the letter 'B' are not uniquely estimable.

We see some rather small signs that the genders differ already from the age of 10, although this does not reach significance. From the age of 13 and onwards, they clearly differ, since the boys grow more quickly than the girls. A scatter plot of the predictions looks like

[Figure: predicted values, full model]

If we reduce the model to include only changes in the slope from the age of 13 and onwards, we get

    The GLM Procedure

    Class Level Information
    Class   Levels   Values
    sex     2        female male

    Number of Observations Read   1142
    Number of Observations Used   1130

    Dependent Variable: height
    Source                  DF   Sum of Squares   Mean Square   F Value   Pr > F
    Model                                                                 <.0001
    Error
    Corrected Total

    R-Square   Coeff Var   Root MSE   height Mean

    Source                  DF   Type I SS        Mean Square   F Value   Pr > F
    age                                                                   <.0001
    extra_age                                                             <.0001
    extra_age                                                             <.0001
    sex                                                                   <.0001
    extra_age15*sex                                                       <.0001
    extra_age13*sex                                                       <.0001

    Source                  DF   Type III SS      Mean Square   F Value   Pr > F
    age                                                                   <.0001
    extra_age
    extra_age                                                             <.0001
    sex
    extra_age15*sex
    extra_age13*sex                                                       <.0001

    Parameter                 Estimate     Standard Error   t Value   Pr > |t|
    Intercept                          B                              <.0001
    age                                                               <.0001
    extra_age                          B                              <.0001
    extra_age                          B                              <.0001
    sex female                         B
    sex male                           B   .                .         .
    extra_age15*sex female             B
    extra_age15*sex male               B   .                .         .
    extra_age13*sex female             B                              <.0001
    extra_age13*sex male               B   .                .         .

    NOTE: The X'X matrix has been found to be singular, and a generalized
    inverse was used to solve the normal equations. Terms whose estimates
    are followed by the letter 'B' are not uniquely estimable.

with a new plot of predicted values:

[Figure: predicted values, reduced model]
From the output as well as from the figures, we note that between the ages 13 and 15, boys grow an extra 4.4 cm (s.e.=0.62 cm) a year on average, compared to the girls. Using the estimation-friendly version of this model gives us

    Parameter                 Estimate   Standard Error   t Value   Pr > |t|
    age                                                             <.0001
    sex female                                                      <.0001
    sex male                                                        <.0001
    extra_age15*sex female
    extra_age15*sex male                                            <.0001
    extra_age13*sex female                                          <.0001
    extra_age13*sex male                                            <.0001

Question 8

The Gompertz growth model takes the following form:

$$\text{height} = \alpha \exp\{-\beta \exp(-\gamma \cdot \text{age})\}$$

It is a bit tricky to determine starting values for this equation, since we have no possibility of transforming to linearity. First of all, we realize that when age increases, we approach an asymptote, which is α. Hence, a reasonable guess of α is a maximum height, i.e. around
200 cm. The parameter β is related to the timing of growth, whereas γ describes the growth velocity. Setting α to be 200, we may rearrange the above equation and transform with the (natural) logarithm:

$$\text{height}/200 \approx \exp\{-\beta \exp(-\gamma \cdot \text{age})\}$$
$$\log(200/\text{height}) \approx \beta \exp(-\gamma \cdot \text{age})$$
$$\log(\log(200/\text{height})) \approx \log(\beta) - \gamma \cdot \text{age}$$

By defining the transformed variable y=log(log(200/height)), we may therefore estimate β and γ using linear regression, e.g. for males:

    data juul;
      set juul;
      y=log(log(200/height));
    run;

    /* create rather crude starting values for nonlinear regression */
    proc reg data=juul;
      model y=age;
      where age ge 5 and age le 20 and sex='male';
    run;

from which we get

    The REG Procedure
    Dependent Variable: y

    Analysis of Variance
    Source                  DF   Sum of Squares   Mean Square   F Value   Pr > F
    Model                                                                 <.0001
    Error
    Corrected Total

    Root MSE          R-Square
    Dependent Mean    Adj R-Sq
    Coeff Var
    Parameter Estimates
    Variable     DF   Parameter Estimate   Standard Error   t Value   Pr > |t|
    Intercept                                                         <.0001
    age                                                               <.0001

We therefore take the starting values to be: α = 200, β = exp(…) = 1.54, γ = …

Question 9

We now fit the nonlinear model using these starting values:

    proc nlin data=juul;
      where age ge 5 and age le 20;
      by sex;
      parms alfa=200 beta=1.5 gamma=0.15;
      model height=alfa*exp(-beta*exp(-gamma*age));
      output out=ny p=yhat r=resid;
    run;

    sex=female

    The NLIN Procedure
    Dependent Variable height

    Parameter   Estimate   Approx Std Error   Approximate 95% Confidence Limits
    alfa
    beta
    gamma

    sex=male

    The NLIN Procedure

    Parameter   Estimate   Approx Std Error   Approximate 95% Confidence Limits
    alfa
    beta
    gamma
The output statement creates a new data set called ny, from which we may make residual plots or, as shown here, plots of the fitted relations:

[Figure: fitted Gompertz curves]

We note the slight increase in variance with age, which could be eliminated by a logarithmic transformation of height. We could therefore instead fit the model as

    data juul;
      set juul;
      lheight=log(height);
    run;

    proc nlin data=juul;
      where age ge 5 and age le 20;
      by sex;
      parms alfa=200 beta=1.5 gamma=0.15;
      model lheight=log(alfa)-beta*exp(-gamma*age);
      output out=ny p=yhat r=resid;
    run;
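The model statement for lheight follows from taking the natural logarithm of the Gompertz curve fitted in Question 9. As a short check of the algebra, writing α, β, γ for the nlin parameters alfa, beta, gamma:

```latex
\begin{aligned}
\text{height} &= \alpha \exp\{-\beta \exp(-\gamma \cdot \text{age})\} \\
\log(\text{height}) &= \log(\alpha) - \beta \exp(-\gamma \cdot \text{age})
\end{aligned}
```

The parameters keep their meaning, so the same starting values can be reused. What changes is the error structure: the errors are now assumed additive on the log scale, which corresponds to a roughly constant relative error in height itself.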
    sex=female

    The NLIN Procedure

    Parameter   Estimate   Approx Std Error   Approximate 95% Confidence Limits
    alfa
    beta
    gamma

    sex=male

    The NLIN Procedure

    Parameter   Estimate   Approx Std Error   Approximate 95% Confidence Limits
    alfa
    beta
    gamma

Question 10

We may compare males and females in various ways. From the estimates above, we may perform simple t-tests, e.g. for the parameter β, where the test statistic equals 1.11, which is not significant, no matter how many degrees of freedom this quantity might have (approximately). We may also perform a test in SAS by analysing the two genders simultaneously:

    proc nlin data=juul;
      where age ge 5 and age le 20;
      parms alfa=200 beta=1.5 gamma=0.15 alfadif=0 betadif=0 gammadif=0;
      if sex='male' then model lheight=log(alfa)-beta*exp(-gamma*age);
      if sex='female' then model lheight=log(alfa+alfadif)-
        (beta+betadif)*exp(-(gamma+gammadif)*age);
    run;
yielding

    The NLIN Procedure
    Dependent Variable lheight

    Parameter   Estimate   Approx Std Error   Approximate 95% Confidence Limits
    alfa
    beta
    gamma
    alfadif
    betadif
    gammadif

We may conclude that the αs and the γs differ between males and females, whereas the βs do not seem to differ. A cautious interpretation is that boys grow to become taller than girls (α), but that their growth appears to happen at a slower pace (γ). No difference can be found in the timing of growth between the two genders.
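The simple t-test mentioned under Question 10 can be written out as follows. This is a sketch that assumes the usual large-sample comparison of two independent estimates; the subscripts m and f (males, females) are notation introduced here, and the standard errors are the approximate ones printed by nlin for each sex.

```latex
t = \frac{\hat\beta_m - \hat\beta_f}
         {\sqrt{\operatorname{se}(\hat\beta_m)^2 + \operatorname{se}(\hat\beta_f)^2}}
```

Because the two sexes are fitted separately on disjoint sets of observations, the two estimates are independent and their variances add. The joint fit with alfadif, betadif and gammadif gives a closely related comparison directly, under a common residual variance: each difference parameter can be judged against its own standard error or confidence limits.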
Answer to additional questions for the BOC exercise

Question 1

We fit the nonlinear relation directly using the procedure nlin:

    proc nlin data=boc;
      parms beta=0.8 gamma=228;
      model boc=gamma*exp(-beta/days);
      output out=ny p=yhat r=resid;
    run;

which creates the following output (plus some more, which is not shown here):

    Dependent Variable boc
    Method: Gauss-Newton

    Iterative Phase
    Iter   beta   gamma   Sum of Squares

    NOTE: Convergence criterion met.
    NOTE: An intercept was not specified for this model.

    Source                  DF   Sum of Squares   Mean Square   F Value   Approx Pr > F
    Regression                                                            <.0001
    Residual
    Uncorrected Total
    Corrected Total
    Parameter   Estimate   Approx Std Error   Approximate 95% Confidence Limits
    beta
    gamma

This provides a direct estimation of both parameters, and we may compare the new confidence intervals with the ones obtained from linearisation of the data:

    method                  β                γ
    linearisation           (0.727, 0.888)   (219.6, 237.7)
    non-linear regression   (0.713, 0.952)   (221.2, 240.0)

The differences are not pronounced, but the nonlinear regression gives slightly wider confidence intervals, especially for β.

Question 2

The linear regression of the transformed relation is to be preferred, since the assumption of variance homogeneity is better satisfied for this relation. This may also be seen from the residual plot below, which is based on the nonlinear regression and shows a clear 'trumpet' pattern.

[Figure: residual plot from the nonlinear regression]
However, the fits for the two models do not differ much, as can be seen from the plots below:

[Figure: fitted curves for the two models]
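The linearisation referred to in Question 1 of the BOC exercise is not shown in this answer. A minimal sketch, assuming it is obtained by taking logs of boc = γ·exp(−β/days), so that log(boc) = log(γ) − β·(1/days); the data set name boc_lin and the variables lboc and invdays are hypothetical names introduced here:

```sas
/* Assumption: the linearised model is log(boc) = log(gamma) - beta*(1/days).
   boc_lin, lboc and invdays are illustrative names, not from the original. */
data boc_lin;
  set boc;
  lboc = log(boc);      /* response on the log scale */
  invdays = 1/days;     /* covariate: reciprocal of days */
run;

proc reg data=boc_lin;
  model lboc = invdays / clb;   /* clb: confidence limits for the estimates */
run;
quit;
```

Under this assumption the intercept estimates log(γ) and the slope estimates −β, so confidence limits for γ follow by exponentiating those of the intercept, and limits for β by changing the sign of those of the slope.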
More informationCourse Information Text:
Course Information Text: Special reprint of Applied Linear Statistical Models, 5th edition by Kutner, Neter, Nachtsheim, and Li, 2012. Recommended: Applied Statistics and the SAS Programming Language,
More informationSection 9c. Propensity scores. Controlling for bias & confounding in observational studies
Section 9c Propensity scores Controlling for bias & confounding in observational studies 1 Logistic regression and propensity scores Consider comparing an outcome in two treatment groups: A vs B. In a
More informationSTATISTICS 479 Exam II (100 points)
Name STATISTICS 79 Exam II (1 points) 1. A SAS data set was created using the following input statement: Answer parts(a) to (e) below. input State $ City $ Pop199 Income Housing Electric; (a) () Give the
More informationData Mining and Data Warehousing. Henryk Maciejewski. Data Mining Predictive modelling: regression
Data Mining and Data Warehousing Henryk Maciejewski Data Mining Predictive modelling: regression Algorithms for Predictive Modelling Contents Regression Classification Auxiliary topics: Estimation of prediction
More informationSTAT 3A03 Applied Regression Analysis With SAS Fall 2017
STAT 3A03 Applied Regression Analysis With SAS Fall 2017 Assignment 5 Solution Set Q. 1 a The code that I used and the output is as follows PROC GLM DataS3A3.Wool plotsnone; Class Amp Len Load; Model CyclesAmp
More informationPLS205!! Lab 9!! March 6, Topic 13: Covariance Analysis
PLS205!! Lab 9!! March 6, 2014 Topic 13: Covariance Analysis Covariable as a tool for increasing precision Carrying out a full ANCOVA Testing ANOVA assumptions Happiness! Covariable as a Tool for Increasing
More information1.) Fit the full model, i.e., allow for separate regression lines (different slopes and intercepts) for each species
Lecture notes 2/22/2000 Dummy variables and extra SS F-test Page 1 Crab claw size and closing force. Problem 7.25, 10.9, and 10.10 Regression for all species at once, i.e., include dummy variables for
More informationTopic 18: Model Selection and Diagnostics
Topic 18: Model Selection and Diagnostics Variable Selection We want to choose a best model that is a subset of the available explanatory variables Two separate problems 1. How many explanatory variables
More information5.3 Three-Stage Nested Design Example
5.3 Three-Stage Nested Design Example A researcher designs an experiment to study the of a metal alloy. A three-stage nested design was conducted that included Two alloy chemistry compositions. Three ovens
More informationChapter 12: Multiple Regression
Chapter 12: Multiple Regression 12.1 a. A scatterplot of the data is given here: Plot of Drug Potency versus Dose Level Potency 0 5 10 15 20 25 30 0 5 10 15 20 25 30 35 Dose Level b. ŷ = 8.667 + 0.575x
More informationComparison of a Population Means
Analysis of Variance Interested in comparing Several treatments Several levels of one treatment Comparison of a Population Means Could do numerous two-sample t-tests but... ANOVA provides method of joint
More information1. (Problem 3.4 in OLRT)
STAT:5201 Homework 5 Solutions 1. (Problem 3.4 in OLRT) The relationship of the untransformed data is shown below. There does appear to be a decrease in adenine with increased caffeine intake. This is
More informationBooklet of Code and Output for STAC32 Final Exam
Booklet of Code and Output for STAC32 Final Exam December 8, 2014 List of Figures in this document by page: List of Figures 1 Popcorn data............................. 2 2 MDs by city, with normal quantile
More informationThe program for the following sections follows.
Homework 6 nswer sheet Page 31 The program for the following sections follows. dm'log;clear;output;clear'; *************************************************************; *** EXST734 Homework Example 1
More informationLab 11. Multilevel Models. Description of Data
Lab 11 Multilevel Models Henian Chen, M.D., Ph.D. Description of Data MULTILEVEL.TXT is clustered data for 386 women distributed across 40 groups. ID: 386 women, id from 1 to 386, individual level (level
More informationLecture 13 Extra Sums of Squares
Lecture 13 Extra Sums of Squares STAT 512 Spring 2011 Background Reading KNNL: 7.1-7.4 13-1 Topic Overview Extra Sums of Squares (Defined) Using and Interpreting R 2 and Partial-R 2 Getting ESS and Partial-R
More information1: a b c d e 2: a b c d e 3: a b c d e 4: a b c d e 5: a b c d e. 6: a b c d e 7: a b c d e 8: a b c d e 9: a b c d e 10: a b c d e
Economics 102: Analysis of Economic Data Cameron Spring 2016 Department of Economics, U.C.-Davis Final Exam (A) Tuesday June 7 Compulsory. Closed book. Total of 58 points and worth 45% of course grade.
More informationST Correlation and Regression
Chapter 5 ST 370 - Correlation and Regression Readings: Chapter 11.1-11.4, 11.7.2-11.8, Chapter 12.1-12.2 Recap: So far we ve learned: Why we want a random sample and how to achieve it (Sampling Scheme)
More informationEXST 7015 Fall 2014 Lab 08: Polynomial Regression
EXST 7015 Fall 2014 Lab 08: Polynomial Regression OBJECTIVES Polynomial regression is a statistical modeling technique to fit the curvilinear data that either shows a maximum or a minimum in the curve,
More informationResiduals from regression on original data 1
Residuals from regression on original data 1 Obs a b n i y 1 1 1 3 1 1 2 1 1 3 2 2 3 1 1 3 3 3 4 1 2 3 1 4 5 1 2 3 2 5 6 1 2 3 3 6 7 1 3 3 1 7 8 1 3 3 2 8 9 1 3 3 3 9 10 2 1 3 1 10 11 2 1 3 2 11 12 2 1
More informationSimple Linear Regression
Chapter 2 Simple Linear Regression Linear Regression with One Independent Variable 2.1 Introduction In Chapter 1 we introduced the linear model as an alternative for making inferences on means of one or
More informationLecture 1 Linear Regression with One Predictor Variable.p2
Lecture Linear Regression with One Predictor Variablep - Basics - Meaning of regression parameters p - β - the slope of the regression line -it indicates the change in mean of the probability distn of
More informationSTA 6207 Practice Problems Nonlinear Regression
STA 6207 Practice Problems Nonlinear Regression Q.1. A study was conducted to measure the effects of pea density (X 1, in plants/m 2 ) and volunteer barley density (X 2, in plants/m 2 ) on pea seed yield
More informationStat 501, F. Chiaromonte. Lecture #8
Stat 501, F. Chiaromonte Lecture #8 Data set: BEARS.MTW In the minitab example data sets (for description, get into the help option and search for "Data Set Description"). Wild bears were anesthetized,
More informationProblem Set 10: Panel Data
Problem Set 10: Panel Data 1. Read in the data set, e11panel1.dta from the course website. This contains data on a sample or 1252 men and women who were asked about their hourly wage in two years, 2005
More informationStatistical Modelling in Stata 5: Linear Models
Statistical Modelling in Stata 5: Linear Models Mark Lunt Arthritis Research UK Epidemiology Unit University of Manchester 07/11/2017 Structure This Week What is a linear model? How good is my model? Does
More informationTopic 2. Chapter 3: Diagnostics and Remedial Measures
Topic Overview This topic will cover Regression Diagnostics Remedial Measures Statistics 512: Applied Linear Models Some other Miscellaneous Topics Topic 2 Chapter 3: Diagnostics and Remedial Measures
More informationEXAM IN TMA4255 EXPERIMENTAL DESIGN AND APPLIED STATISTICAL METHODS
Norges teknisk naturvitenskapelige universitet Institutt for matematiske fag Side 1 av 8 Contact during exam: Bo Lindqvist Tel. 975 89 418 EXAM IN TMA4255 EXPERIMENTAL DESIGN AND APPLIED STATISTICAL METHODS
More informationNon-Linear Models. Estimating Parameters from a Non-linear Regression Model
Non-Linear Models Mohammad Ehsanul Karim Institute of Statistical Research and training; University of Dhaka, Dhaka, Bangladesh Estimating Parameters from a Non-linear Regression Model
More informationQuestion 1a 1b 1c 1d 1e 2a 2b 2c 2d 2e 2f 3a 3b 3c 3d 3e 3f M ult: choice Points
Economics 102: Analysis of Economic Data Cameron Spring 2016 May 12 Department of Economics, U.C.-Davis Second Midterm Exam (Version A) Compulsory. Closed book. Total of 30 points and worth 22.5% of course
More informationTopic 29: Three-Way ANOVA
Topic 29: Three-Way ANOVA Outline Three-way ANOVA Data Model Inference Data for three-way ANOVA Y, the response variable Factor A with levels i = 1 to a Factor B with levels j = 1 to b Factor C with levels
More information