Solutions - Homework #2
- Priscilla Greene
1. Problem 1: Biological Recovery

(a) A scatterplot of the biological recovery percentages versus time is given to the right. In viewing this plot, there is a negative, slightly nonlinear relationship between recovery percentage and time. There also appears to be more variability in the recoveries at early times, in that the relationship is most clearly defined at greater times. Hence, there appears to be some variance heterogeneity present.

[Figure: Scatterplot of Recovery Percentage vs. Time; y-axis: Biological Recovery Percentage, x-axis: Time (in minutes)]

(b) A scatterplot of the log biological recovery percentages versus time is given to the right. In viewing this plot, the relationship is still negative as before, but is now fairly linear. Additionally, the amount of variability about this linear pattern appears to be homogeneous over all times, indicating variance homogeneity for this relationship. Hence, a simple linear regression model on this log scale seems more appropriate than fitting a nonlinear model on the original scale, due to the variance heterogeneity on the raw scale.

[Figure: Scatterplot of Log Recovery vs. Time; y-axis: Log Biological Recovery Percentage, x-axis: Time (in minutes)]

(c) The simple linear regression model yi = β0 + β1 xi + εi (with y = log biological recovery and x = time) was fit to these data, producing the following parameter estimates for the intercept and slope: β̂0 = 3.968 & β̂1 = -.0368. The R²-statistic as reported from the MatLab output was R² = .9287, meaning that 92.9% of the variation in log recovery percentages was explained by the model on time. The relevant output from the tstat regression output structure is shown below, and all code used in this problem is given at the end of these solutions.

Coefficients:   Value    Std. Error   t value   p-value
  Intercept     3.968    .1088                  < .0005
  time          -.0368   .0031                  < .0005

Multiple R-Squared: .9287
F-statistic on 1 and 11 degrees of freedom, p-value = e-7

(d) An estimate of σ² is given by the mean square error (MSE), which is reported in MatLab as out.mse in the out regression structure. Hence, MSE = .0431. The standard errors of β̂0 and β̂1 are given in the Coefficients table above as: SE(β̂0) = .1088 & SE(β̂1) = .0031. Using these standard errors and the t-critical value with n − p = 13 − 2 = 11 degrees
of freedom at the 99% level (t.995(11) = 3.1058), individual 99% confidence intervals for β0 & β1 were computed via MatLab as:

For β0: β̂0 ± t11 SE(β̂0) = 3.968 ± 3.1058(.1088) = 3.968 ± .3379 → (3.630, 4.306).
For β1: β̂1 ± t11 SE(β̂1) = -.0368 ± 3.1058(.0031) = -.0368 ± .0096 → (-.0464, -.0273).

Hence, we are 99% confident that the slope of the regression of log recovery percentage on time is between -.0464 and -.0273 log(%) per minute. We are also (individually) 99% confident that the log recovery percentage at time 0 is between 3.630 and 4.306 log(%). Since neither confidence interval contains the value 0, both β0 and β1 appear to be significant (significantly different from 0) in this model.

(e) The confidence bands were computed using the confregplot function from the course webpage, as shown in the scatterplot to the right below with the fitted regression line. The prediction bands are also plotted. To get at the gains in precision in estimating the mean of y for different values of x, the table to the left below gives the margin of error in the confidence interval for E(y) for several values of x. As expected, the further we get from the mean x-value of 30, the greater the variability in estimating E(y). To be more precise, the margin of error at either extreme of the times (0 or 60) is roughly 41% larger (.29 to .41) than that at the middle time (30 minutes).

[Table: x vs. Margin of Error; Figure: Log Recovery Percentage vs. Time (in minutes), showing the Fitted Line, Confidence Bands, and Prediction Bands]

2. Problem 2: Logistic reparameterization: Suppose we begin with the logistic growth model parameterized as:

    u(t) = M u0 / [u0 + (M − u0) exp(−kt)]

where (M, u0, k) are the model parameters. First, divide the numerator and denominator by u0. Doing so gives:

    u(t) = M / [1 + (M/u0 − 1) exp(−kt)]
         = M / (1 + exp{ log(M/u0 − 1) − kt })

since M/u0 − 1 = exp[log(M/u0 − 1)]. Defining a = log(M/u0 − 1) and b = −k, this can be written:

    u(t) = M / (1 + exp(a + bt)),

as desired.
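The algebra above can be checked numerically. The sketch below (Python standing in for the MatLab used elsewhere in these solutions, with arbitrary illustrative values of M, u0, and k that are not from the homework) confirms that the two parameterizations trace the same curve:

```python
import math

# Arbitrary illustrative parameter values (not from the homework data)
M, u0, k = 10.0, 0.5, 0.3

# Reparameterized constants: a = log(M/u0 - 1), b = -k
a = math.log(M / u0 - 1)
b = -k

for t in [0.0, 1.0, 5.0, 10.0]:
    orig = M * u0 / (u0 + (M - u0) * math.exp(-k * t))  # original form
    repar = M / (1 + math.exp(a + b * t))               # reparameterized form
    assert abs(orig - repar) < 1e-12
print("both parameterizations agree")
```

Note that at t = 0 both forms reduce to u0, as a logistic growth model should.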
3. Problem 3: Weibull fit (problem 14)

(a) A scatterplot of the growth amounts vs. time is given to the right. In viewing this plot, the growth amounts initially increase very rapidly but seem to level off around 10,000 orgs./0.5 ml. It is also worth noting that there seems to be more variability in these growth amounts as their values increase. The resulting pattern may be well-described by an exponential growth curve, although we may require the additional flexibility provided by the Weibull model.

[Figure: Growth vs. Time; y-axis: Growth (orgs./ml), x-axis: Time (in days)]

(b) Using the Weibull growth model parameterized as:

    yi = α{1 − exp[−(ti/σ)^γ]} + εi,

we first recognize that α is the upper asymptote of the curve, since as time increases, the exponential piece goes to zero. Eyeballing where this limit occurs from the scatterplot, we choose α = 11,000 as the starting value for α in a nonlinear least squares fit. Picking two points from the scatterplot, we can see that (x, y) = (2, 2480) & (6, 9440) roughly fit the curved pattern seen. Substituting these values into the Weibull model gives the following pair of equations:

    2480 = 11000{1 − exp[−(2/σ)^γ]}   and   9440 = 11000{1 − exp[−(6/σ)^γ]}.

Solving this system of two equations in the two unknowns (σ, γ) gives:

    8520 = 11000 exp[−(2/σ)^γ]        −ln(8520/11000) = (2/σ)^γ
    1560 = 11000 exp[−(6/σ)^γ]   =>   −ln(1560/11000) = (6/σ)^γ.

Dividing the equations gives ln(8520/11000)/ln(1560/11000) = (2/6)^γ = (1/3)^γ, so that

    γ = ln[ ln(8520/11000)/ln(1560/11000) ] / ln(1/3) = 1.85.

Then, since ln(−ln(1560/11000)) = γ ln(6/σ), we have

    σ = 6 exp[−ln(−ln(1560/11000))/γ] = 6 exp[−ln(−ln(1560/11000))/1.85] = 4.18.

Hence, the starting values I used to fit the Weibull model were: (α, σ, γ) = (11000, 4.18, 1.85).

(c) With these starting values found above, nlinfit was used to fit a Weibull growth curve to these data, resulting in the fitted curve plotted above. The resulting parameter estimates are: α̂ = 10,369, σ̂ = 3.9876, γ̂ = 2.4840.

(d) A residual plot and normal quantile plot are shown below.
In viewing these plots, the residual plot exhibits a potential pattern, with most of the data appearing at the largest predicted values. However, this is a byproduct of having roughly 10 of the values at the asymptote of 10,000. The points at this largest predicted value do have more variability, possibly indicating variance heterogeneity in the larger growth values. The normal quantile plot shows a reasonably linear relationship between the residuals and the standard normal quantiles, indicating no serious departures from normality for the residuals.
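The starting-value algebra in part (b) can be verified numerically. The sketch below (Python standing in for the MatLab used in these solutions) uses the asymptote guess and the two eyeballed points from the text and recovers the same γ ≈ 1.85 and σ ≈ 4.18:

```python
import math

# Asymptote guess and two eyeballed points from the scatterplot
alpha0 = 11000.0
(t1, y1), (t2, y2) = (2.0, 2480.0), (6.0, 9440.0)

# From y = alpha*(1 - exp(-(t/sigma)^gamma)):
#   alpha - y = alpha*exp(-(t/sigma)^gamma)  =>  -ln((alpha - y)/alpha) = (t/sigma)^gamma
r1 = -math.log((alpha0 - y1) / alpha0)  # = (t1/sigma)^gamma
r2 = -math.log((alpha0 - y2) / alpha0)  # = (t2/sigma)^gamma

# Dividing: r1/r2 = (t1/t2)^gamma  =>  gamma = ln(r1/r2)/ln(t1/t2)
gamma0 = math.log(r1 / r2) / math.log(t1 / t2)

# Back-solve sigma from r2 = (t2/sigma)^gamma
sigma0 = t2 / r2 ** (1 / gamma0)

print(round(gamma0, 2), round(sigma0, 2))  # prints: 1.85 4.18
```

These values are only starting points for the iterative nonlinear least squares fit; nlinfit then moves them to the final estimates reported in part (c).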
[Figure: Residual Plot (Residuals vs. Predicted Values) and Normal Quantile Plot (Residuals vs. Standard Normal Quantiles)]

(e) The nlparci function in MatLab was used to find individual 95% confidence intervals for the three model parameters. The resulting intervals are reported below along with the by-hand calculations using the standard errors reported in the next part. For each confidence interval to be at the 95% level individually, since there are n = 16 data pairs and p = 3 parameters being estimated, there are n − p = 13 degrees of freedom available. Under the assumption that the sampling distributions of the parameter estimates are normal, we use a t-based confidence interval for α, σ, and γ. The critical t-value is t* = t13(.975) = 2.1604, as found via MatLab. This resulted in the following set of 3 individual 95% confidence intervals for the 3 parameters:

For α: α̂ ± t* SE(α̂) = 10369 ± 2.1604(342.5) = 10369 ± 740 → (9629, 11109),
For σ: σ̂ ± t* SE(σ̂) = 3.9876 ± 2.1604(.2939) = 3.9876 ± .635 → (3.3526, 4.6226),
For γ: γ̂ ± t* SE(γ̂) = 2.4840 ± 2.1604(.5892) = 2.4840 ± 1.273 → (1.211, 3.757).

The confidence interval for α can be interpreted as: We are 95% confident that the true value for α in the Weibull model relating growth amount to time is between 9629 and 11109. More precisely, we are 95% confident that the maximum growth reached is between 9629 and 11109 orgs./ml. The others can be interpreted similarly.

To have 3 confidence intervals simultaneously at the 95% level, we need to use the Bonferroni correction as discussed briefly in class. To do this, instead of finding the critical t-value at the .975 percentile of the t-distribution with 13 degrees of freedom, we divide the lower tail (.025) by 3 (the number of CIs desired) to get a tail probability of .025/3 = .0083 and then compute t.9917(13) to get the critical value. Doing so gives t.9917(13) = 2.746 and a wider set of CIs:

For α: (9428, 11310)    For σ: (3.181, 4.795)    For γ: (.866, 4.102)

in which we are simultaneously 95% confident that the three parameters fall inside their respective intervals.
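The Bonferroni adjustment simply widens each interval by swapping t.975(13) for the t quantile at tail probability .025/3. A sketch of that comparison follows (Python standing in for MatLab); the critical values are hard-coded from the text rather than looked up, SE(α̂) ≈ 342.5 is back-computed from the reported margin of error of 740, and the Bonferroni t ≈ 2.746 is back-computed from the reported interval widths:

```python
# Parameter estimates and standard errors quoted in parts (c) and (f);
# SE for alpha is back-computed as 740 / 2.1604.
params = {
    "alpha": (10369.0, 342.5),
    "sigma": (3.9876, 0.2939),
    "gamma": (2.4840, 0.5892),
}

t_indiv = 2.1604  # t_.975(13): individual 95% intervals
t_bonf = 2.746    # ~t_.9917(13): tail .025/3, for 3 simultaneous intervals

for name, (est, se) in params.items():
    lo_i, hi_i = est - t_indiv * se, est + t_indiv * se  # individual CI
    lo_b, hi_b = est - t_bonf * se, est + t_bonf * se    # Bonferroni CI
    print(f"{name}: individual ({lo_i:.4g}, {hi_i:.4g}), "
          f"Bonferroni ({lo_b:.4g}, {hi_b:.4g})")
```

Each Bonferroni interval is wider by the fixed ratio t_bonf/t_indiv ≈ 1.27, which is why the interval for γ grows just enough to capture the value 1.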
(f) As indicated in the previous part, the operative t-critical value is t* = t13(.975) = 2.1604. The lengths of the three confidence intervals divided by 2 give the margins of error for the three intervals. Dividing these margins of error by t* gives the standard errors for each of the three parameter estimates. These calculations are summarized in the table below.
Parameter   Estimate   CI                  Margin of Error               Standard Error
α           10369      (9629, 11109)       (11109 − 9629)/2 = 740        740/t* = 342.5
σ           3.9876     (3.3526, 4.6226)    (4.6226 − 3.3526)/2 = .635    .635/t* = .2939
γ           2.4840     (1.211, 3.757)      (3.757 − 1.211)/2 = 1.273     1.273/t* = .5893

(g) Since the individual confidence interval for γ clearly does not include the value γ = 1, we have evidence at the .05 level that γ differs from 1 and that this extra Weibull parameter is significant in the model. If we instead consider the Bonferroni-corrected 95% confidence intervals (which is really the correct thing to do), since the CI for γ contains the value 1 (albeit barely), we would conclude that this parameter is unnecessary in the model. If we omit this parameter, which we are justified in doing, we are left with the 2-parameter exponential growth model, which was fully discussed in class. We would then conclude that the model is overspecified.

MatLab Code Used

% ============================================================ %
% Problem 1: Plot of biological recovery percentages vs. time  %
% ============================================================ %
clear all;
load ../data/cloud.mat                 % Reads in the data
time = cloud.time;                     % Renames the time variable
recov = cloud.recovery;                % Renames the recovery variable

figure(1)
plot(time,recov,'ko')                  % Plots recoveries vs. time
xlim([-5 65])                          % Sets x-limits on plot
xlabel('Time (in minutes)','fontsize',14)
ylabel('Biological Recovery Percentage','fontsize',14)
title('Scatterplot of Recovery Percentage vs. Time','fontsize',14,'fontweight','b')

figure(2)
logrecov = log(recov);                 % Log of recoveries
plot(time,logrecov,'ko')               % Plots log recovery vs. time
xlim([-5,65])                          % Sets x-limits on plot
xlabel('Time (in minutes)','fontsize',14)
ylabel('Log Biological Recovery Percentage','fontsize',14)
title('Scatterplot of Log Recovery vs. Time','fontsize',14,'fontweight','b')

% ===================================================================== %
% Problem 1: Regression of log biological recovery percentages vs. time %
% ===================================================================== %
out = regstats(logrecov,time);         % Regresses log recovery (y) on time (x)
out.tstat                              % Requests relevant parameter estimate info
out.rsquare                            % Multiple R-Squared value
out.mse                                % Mean Squared Error (MSE)
out.fstat                              % Model F-statistic, df, p-value

n = length(time);                      % Sample size
p = 2;                                 % Defines # parameters
bhat = out.beta;                       % Vector of parameter estimates
seb = sqrt(diag(out.covb));            % Vector of standard errors
tbonf = tinv(1-.005,n-p);              % 99% uncorrected t*-value
ci_b = [bhat-tbonf*seb, ...            % Confidence intervals for betas
        bhat+tbonf*seb];               % in 2 columns (lower,upper)
ci_b                                   % Prints confidence intervals

% ================================================= %
% Problem 1: Confidence bands for E(log recoveries) %
% ================================================= %
xlab = 'Time (in minutes)';            % X-axis label
ylab = 'Log Recovery Percentage';      % Y-axis label

% The confregplot function plots the confidence and prediction
% bands for E(y). This function requires as inputs the x & y
% variables, labels for these variables, and the confidence level.
confregplot(time,logrecov,xlab,ylab,99)

% =============================================== %
% Problem 3: Weibull fit to Growth Data with plot %
% =============================================== %
load ../data/paramecium.mat;           % Loads the growth data
growth = paramecium.growth;            % Defines growth vector
time = paramecium.time;                % Defines time vector

plot(time,growth,'ko')                 % Plots growth vs. time
xlim([0 16])                           % x-axis plotting limits
xlabel('Time (in days)','fontsize',14,'fontweight','b');
ylabel('Growth (orgs./ml)','fontsize',14,'fontweight','b');
title('Growth vs. Time','fontsize',14,'fontweight','b');
hold on;                               % Hold the current plot

% ======================================================= %
% Problem 3c - Nonlinear Weibull model fit to growth data %
% ======================================================= %
beta1 = [11000 4.18 1.85];             % Parameter starting values
[betahat1,resid1,J1] = nlinfit(time, ...   % Performs nonlinear Weibull fit,
    growth,@weibull,beta1);            % returning betahats, resids, Jacobian
time1 = 0:.1:16;                       % Vector of times from 0 to 16
yhat1 = betahat1(1)*(1-exp(-(time1./ ...   % Computes Weibull predicted
    betahat1(2)).^betahat1(3)));       % values (yhat's)
plot(time1,yhat1);                     % Plots the fitted line
hold off;                              % End hold on current plot
nlintool(time,growth,@weibull,beta1);  % Plots 95% confidence bands

% =========================================================== %
% Problem 3d - Residual and normal quantile plot of residuals %
% =========================================================== %
figure(1)                              % 1st figure
yhat.weib = growth - resid1;           % Computes Weibull predicted values
plot(yhat.weib,resid1,'ko');           % Plots residuals vs. predicted y-values
xlabel('Predicted Values','fontsize',14,'fontweight','b');
ylabel('Residuals','fontsize',14,'fontweight','b');
title('Residual Plot','fontsize',14,'fontweight','b');

figure(2);                             % 2nd figure
qqplot(resid1);                        % Normal quantile plot of residuals
xlabel('Standard Normal Quantiles','fontsize',14,'fontweight','b');
ylabel('Residuals','fontsize',14,'fontweight','b');
title('Normal Quantile Plot','fontsize',14,'fontweight','b');

% ======================================================================= %
% Problem 3e,f - Individual 95% confidence intervals for the 3 parameters %
% ======================================================================= %
ci = nlparci(betahat1,resid1,J1);      % Computes CIs for (alpha,sigma,gamma)
moe = (ci(:,2)-ci(:,1))/2;             % Margin of error from CIs
se = moe/tinv(.975,13);                % Standard errors from CIs
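For readers without MatLab, the interval computation from Problem 1(d) can be sketched in plain Python. The estimates, standard errors, and the t quantile t.995(11) = 3.1058 are taken from the solution text rather than computed from the raw data, so only the standard library is needed:

```python
# Reported estimates and standard errors (Problem 1, parts (c)-(d))
b0, se0 = 3.968, 0.1088    # intercept and its standard error
b1, se1 = -0.0368, 0.0031  # slope and its standard error
t_crit = 3.1058            # t_.995(11), for individual 99% intervals

for name, est, se in [("intercept", b0, se0), ("slope", b1, se1)]:
    moe = t_crit * se                  # margin of error: t* x SE
    print(f"{name}: {est} +/- {moe:.4f} -> ({est - moe:.4f}, {est + moe:.4f})")
```

This mirrors the ci_b computation in the MatLab code above, with tinv(.995, 11) replaced by the hard-coded quantile.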
More information13 Simple Linear Regression
B.Sc./Cert./M.Sc. Qualif. - Statistics: Theory and Practice 3 Simple Linear Regression 3. An industrial example A study was undertaken to determine the effect of stirring rate on the amount of impurity
More informationTABLES AND FORMULAS FOR MOORE Basic Practice of Statistics
TABLES AND FORMULAS FOR MOORE Basic Practice of Statistics Exploring Data: Distributions Look for overall pattern (shape, center, spread) and deviations (outliers). Mean (use a calculator): x = x 1 + x
More informationSTAT 350 Final (new Material) Review Problems Key Spring 2016
1. The editor of a statistics textbook would like to plan for the next edition. A key variable is the number of pages that will be in the final version. Text files are prepared by the authors using LaTeX,
More informationSTAT 3022 Spring 2007
Simple Linear Regression Example These commands reproduce what we did in class. You should enter these in R and see what they do. Start by typing > set.seed(42) to reset the random number generator so
More informationSAS Procedures Inference about the Line ffl model statement in proc reg has many options ffl To construct confidence intervals use alpha=, clm, cli, c
Inference About the Slope ffl As with all estimates, ^fi1 subject to sampling var ffl Because Y jx _ Normal, the estimate ^fi1 _ Normal A linear combination of indep Normals is Normal Simple Linear Regression
More informationStat 101 Exam 1 Important Formulas and Concepts 1
1 Chapter 1 1.1 Definitions Stat 101 Exam 1 Important Formulas and Concepts 1 1. Data Any collection of numbers, characters, images, or other items that provide information about something. 2. Categorical/Qualitative
More informationTopic 14: Inference in Multiple Regression
Topic 14: Inference in Multiple Regression Outline Review multiple linear regression Inference of regression coefficients Application to book example Inference of mean Application to book example Inference
More informationChapter 1 Linear Regression with One Predictor
STAT 525 FALL 2018 Chapter 1 Linear Regression with One Predictor Professor Min Zhang Goals of Regression Analysis Serve three purposes Describes an association between X and Y In some applications, the
More information15: Regression. Introduction
15: Regression Introduction Regression Model Inference About the Slope Introduction As with correlation, regression is used to analyze the relation between two continuous (scale) variables. However, regression
More informationMath 3330: Solution to midterm Exam
Math 3330: Solution to midterm Exam Question 1: (14 marks) Suppose the regression model is y i = β 0 + β 1 x i + ε i, i = 1,, n, where ε i are iid Normal distribution N(0, σ 2 ). a. (2 marks) Compute the
More informationSTATISTICS 479 Exam II (100 points)
Name STATISTICS 79 Exam II (1 points) 1. A SAS data set was created using the following input statement: Answer parts(a) to (e) below. input State $ City $ Pop199 Income Housing Electric; (a) () Give the
More information9. Linear Regression and Correlation
9. Linear Regression and Correlation Data: y a quantitative response variable x a quantitative explanatory variable (Chap. 8: Recall that both variables were categorical) For example, y = annual income,
More informationRecent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data
Recent Advances in the Field of Trade Theory and Policy Analysis Using Micro-Level Data July 2012 Bangkok, Thailand Cosimo Beverelli (World Trade Organization) 1 Content a) Classical regression model b)
More informationAMS 7 Correlation and Regression Lecture 8
AMS 7 Correlation and Regression Lecture 8 Department of Applied Mathematics and Statistics, University of California, Santa Cruz Suumer 2014 1 / 18 Correlation pairs of continuous observations. Correlation
More information10 Model Checking and Regression Diagnostics
10 Model Checking and Regression Diagnostics The simple linear regression model is usually written as i = β 0 + β 1 i + ɛ i where the ɛ i s are independent normal random variables with mean 0 and variance
More informationTOPIC 9 SIMPLE REGRESSION & CORRELATION
TOPIC 9 SIMPLE REGRESSION & CORRELATION Basic Linear Relationships Mathematical representation: Y = a + bx X is the independent variable [the variable whose value we can choose, or the input variable].
More informationECON 497 Midterm Spring
ECON 497 Midterm Spring 2009 1 ECON 497: Economic Research and Forecasting Name: Spring 2009 Bellas Midterm You have three hours and twenty minutes to complete this exam. Answer all questions and explain
More informationST 512-Practice Exam I - Osborne Directions: Answer questions as directed. For true/false questions, circle either true or false.
ST 512-Practice Exam I - Osborne Directions: Answer questions as directed. For true/false questions, circle either true or false. 1. A study was carried out to examine the relationship between the number
More informationIntroduction to Linear Regression
Introduction to Linear Regression James H. Steiger Department of Psychology and Human Development Vanderbilt University James H. Steiger (Vanderbilt University) Introduction to Linear Regression 1 / 46
More informationCHAPTER 5 FUNCTIONAL FORMS OF REGRESSION MODELS
CHAPTER 5 FUNCTIONAL FORMS OF REGRESSION MODELS QUESTIONS 5.1. (a) In a log-log model the dependent and all explanatory variables are in the logarithmic form. (b) In the log-lin model the dependent variable
More informationCh 2: Simple Linear Regression
Ch 2: Simple Linear Regression 1. Simple Linear Regression Model A simple regression model with a single regressor x is y = β 0 + β 1 x + ɛ, where we assume that the error ɛ is independent random component
More informationExample: Four levels of herbicide strength in an experiment on dry weight of treated plants.
The idea of ANOVA Reminders: A factor is a variable that can take one of several levels used to differentiate one group from another. An experiment has a one-way, or completely randomized, design if several
More informationTerminology Suppose we have N observations {x(n)} N 1. Estimators as Random Variables. {x(n)} N 1
Estimation Theory Overview Properties Bias, Variance, and Mean Square Error Cramér-Rao lower bound Maximum likelihood Consistency Confidence intervals Properties of the mean estimator Properties of the
More informationChapter 1 Statistical Inference
Chapter 1 Statistical Inference causal inference To infer causality, you need a randomized experiment (or a huge observational study and lots of outside information). inference to populations Generalizations
More informationInteraction effects for continuous predictors in regression modeling
Interaction effects for continuous predictors in regression modeling Testing for interactions The linear regression model is undoubtedly the most commonly-used statistical model, and has the advantage
More informationSimple Linear Regression
Simple Linear Regression MATH 282A Introduction to Computational Statistics University of California, San Diego Instructor: Ery Arias-Castro http://math.ucsd.edu/ eariasca/math282a.html MATH 282A University
More informationLecture 10: F -Tests, ANOVA and R 2
Lecture 10: F -Tests, ANOVA and R 2 1 ANOVA We saw that we could test the null hypothesis that β 1 0 using the statistic ( β 1 0)/ŝe. (Although I also mentioned that confidence intervals are generally
More informationSupplementary materials Quantitative assessment of ribosome drop-off in E. coli
Supplementary materials Quantitative assessment of ribosome drop-off in E. coli Celine Sin, Davide Chiarugi, Angelo Valleriani 1 Downstream Analysis Supplementary Figure 1: Illustration of the core steps
More informationConsider fitting a model using ordinary least squares (OLS) regression:
Example 1: Mating Success of African Elephants In this study, 41 male African elephants were followed over a period of 8 years. The age of the elephant at the beginning of the study and the number of successful
More informationTest 3 Practice Test A. NOTE: Ignore Q10 (not covered)
Test 3 Practice Test A NOTE: Ignore Q10 (not covered) MA 180/418 Midterm Test 3, Version A Fall 2010 Student Name (PRINT):............................................. Student Signature:...................................................
More information