Solutions - Homework #2


Figure 1: Scatterplot of Abundance vs. Relative Density (Parasite Abundance against Relative Host Population Density).

Figure 2: Scatterplot of Log Abundance vs. Log RD (Log Parasite Abundance against Log Relative Host Population Density).

1. Problem 1: Parasite Abundance

(a) A scatterplot of the parasite abundances versus relative host population densities is given in Figure 1. In viewing this plot, there is an L-shaped relationship between the two variables, with the majority of values small for both abundance and density. Truthfully, it is hard to discern much of any relationship because so many of the species are clustered at these smaller values, owing to the few values that are orders of magnitude larger than the others in both variables.

(b) A scatterplot of the log abundances versus the log relative densities is given in Figure 2. In viewing this plot, the relationship is positive, fairly linear, and moderate in strength. The log transformation has effectively reduced the magnitude of the larger values in each variable, allowing us to see the relationship between the two variables much more clearly.

(c) The simple linear regression model y_i = β_0 + β_1 x_i + ε_i (with y = log parasite abundance and x = log relative density) was fit to these data, producing the following parameter estimates for the intercept and slope: β̂_0 = .4767 and β̂_1 = .763. The R²-statistic as reported in the MATLAB output was R² = .30, meaning that roughly 30% of the variation in log parasite abundances is explained by the regression on log relative density. The relevant output from the tstat regression output structure is shown below, and all code used in this problem is given at the end of these solutions.
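One way to read the fitted log-log model: exponentiating both sides of the fitted equation gives, approximately on the original scale,

abundance ≈ exp(β̂_0) (relative density)^β̂_1,

so the estimated slope β̂_1 acts as the exponent in a rough power-law relationship between parasite abundance and relative host density.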

Coefficients:      Value     Std. Error    t value    p-value
  Intercept        .4767       .67
  Log Density      .763        .8

Multiple R-Squared: .30
F-statistic: 7.34 on 1 and 17 degrees of freedom, p-value = .015

(d) An estimate of σ² is given by the mean square error (MSE), which is reported in MATLAB as out.mse in the out regression structure. Hence, MSE = .39. The standard errors of β̂_0 and β̂_1 are given in the Coefficients table above as SE(β̂_0) = .67 and SE(β̂_1) = .8. Using these standard errors and computing the t-critical value with n − p = 19 − 2 = 17 degrees of freedom at the 99% level (t_.995(17) = 2.898), individual 99% confidence intervals for β_0 and β_1 were computed via MATLAB as:

For β_0: β̂_0 ± t_17 SE(β̂_0) = .4767 ± 2.898(.67) = .4767 ± .7583 → (-.8, .4).
For β_1: β̂_1 ± t_17 SE(β̂_1) = .763 ± 2.898(.8) = .763 ± .873 → (-.5, .58).

Hence, we are 99% confident that the slope of the regression of log parasite abundance on log relative density is between -.5 and .58. We are also (individually) 99% confident that the mean log abundance at a log relative density of 0 (relative density = 1) is between -.8 and .4. Since both confidence intervals contain the value 0, both β_0 and β_1 appear to be insignificant at the .01 level in this model. This is confirmed by the p-values, as both are greater than .01.

(e) The confidence bands were computed using the confregplot function from the course webpage and are shown, along with the fitted regression line and the prediction bands, in the scatterplot in Figure 3. To get at the gains in precision in estimating the mean of y for different values of x, margins of error of the confidence interval for E(y) were computed at several values of x. As expected, the further we get from the mean x-value, the greater the variability in estimating E(y); the margin of error at either extreme of the observed log densities (.74) is noticeably larger than that at the middle log density (.65).

2. Problem 2: Logistic Reparameterization

Suppose we begin with the logistic growth model parameterized as

u(t) = M u_0 / [u_0 + (M − u_0) exp(−kt)],

where (M, u_0, k) are the model parameters. First, divide all terms in this model (numerator and denominator) by u_0. Doing so gives

u(t) = M / [1 + ((M − u_0)/u_0) exp(−kt)]
     = M / {1 + exp[log((M − u_0)/u_0) − kt]},

since (M − u_0)/u_0 = exp[log((M − u_0)/u_0)]. Defining a = log((M − u_0)/u_0) and b = −k, this can be written

u(t) = M / [1 + exp(a + bt)],

as desired.
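A quick numerical check of this reparameterization can be done in MATLAB; the parameter values below are arbitrary and chosen only for illustration.

M = 5; u0 = .5; k = .3;                        % arbitrary illustrative parameter values
a = log((M - u0)/u0); b = -k;                  % reparameterized values
t = linspace(0,20,50);                         % grid of times
u1 = M*u0 ./ (u0 + (M - u0).*exp(-k*t));       % original parameterization
u2 = M ./ (1 + exp(a + b*t));                  % reparameterized form
max(abs(u1 - u2))                              % the two forms agree up to rounding error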

Figure 3: Scatterplot of log parasite abundance vs. log relative density with the fitted line, confidence bands, and prediction bands.

Figure 4: Calcium (nmoles/mg) vs. Time (in minutes), with the fitted Weibull growth curve.

3. Problem 3: Weibull fit problem

(a) A scatterplot of the calcium amounts vs. time is given in Figure 4. In viewing this plot, the calcium amounts initially increase very rapidly but seem to level off around 5 nmoles/mg. It is also worth noting that there seems to be more variability in these calcium amounts as their values increase. The resulting pattern may be well described by an exponential growth curve, although the pattern seems to change somewhat abruptly around a time of 4 minutes.

(b) Using the Weibull growth model parameterized as y_i = α{1 − exp[−(t_i/σ)^γ]} + ε_i, we first recognize that α is the upper asymptote of the curve, since as time increases the exponential piece goes to zero. Eyeballing where this limit occurs from the scatterplot, we choose α = 4 as the starting value for α in a nonlinear least squares fit. Picking two points from the scatterplot, we can see that (x, y) = (1, 1) and (6, 3) roughly fit the curved pattern seen. Substituting these values into the Weibull model gives the following pair of equations:

1 = 4{1 − exp[−(1/σ)^γ]}   and   3 = 4{1 − exp[−(6/σ)^γ]}.

Solving this system of two equations in the two unknowns (σ, γ): the first equation gives exp[−(1/σ)^γ] = 3/4, so (1/σ)^γ = ln(4/3); the second gives exp[−(6/σ)^γ] = 1/4, so (6/σ)^γ = ln(4). Dividing the second of these by the first yields 6^γ = ln(4)/ln(4/3), so that

γ = ln[ln(4)/ln(4/3)] / ln(6) = .88.

Substituting back into (6/σ)^γ = ln(4) gives γ ln(6/σ) = ln(ln(4)), so that

σ = 6 exp[−ln(ln(4))/γ] = 6 exp[−ln(ln(4))/.88] = 4.14.

Hence, the starting values I used to fit the Weibull model were (α, σ, γ) = (4, 4.14, .88); a short MATLAB sketch of this starting-value calculation is given below.
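The calculation above can be reproduced in a few lines of MATLAB. This is only a sketch of the hand calculation; the eyeballed points and the asymptote guess are read off the scatterplot and hard-coded.

alpha0 = 4;                            % eyeballed upper asymptote
t1 = 1; y1 = 1;                        % first eyeballed point from the plot
t2 = 6; y2 = 3;                        % second eyeballed point from the plot
c1 = -log(1 - y1/alpha0);              % so that (t1/sigma)^gamma = c1
c2 = -log(1 - y2/alpha0);              % so that (t2/sigma)^gamma = c2
gamma0 = log(c2/c1)/log(t2/t1);        % divide the two equations and solve for gamma
sigma0 = t2*exp(-log(c2)/gamma0);      % back-solve the second equation for sigma
beta0 = [alpha0 sigma0 gamma0]         % starting vector for nlinfit: (4, 4.14, .88)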

Figure 5: Residual plot (residuals vs. predicted values) and normal quantile plot of the residuals from the Weibull fit.

(c) With the starting values found above, nlinfit was used to fit a Weibull growth curve to these data, resulting in the fitted curve plotted in Figure 4. The resulting parameter estimates are: α̂ = 4.835, σ̂ = 4.736, γ̂ = .56.

(d) A residual plot and a normal quantile plot are shown in Figure 5. In viewing these plots, the residual plot appears to exhibit random scatter about the 0-line, indicating variance homogeneity among the model residuals. There are a couple of large residuals at larger predicted values, but nothing systematic enough to worry about heterogeneity issues. The normal quantile plot shows a reasonably linear relationship between the residuals and the standard normal quantiles, indicating no serious departures from normality for the residuals. These two assumptions justify our use of t-based inferences in later parts of this problem.

(e) The nlparci function in MATLAB was used to find individual 95% confidence intervals for the three model parameters. The resulting intervals are reported below, along with the by-hand calculations using the standard errors reported in the next part. For each confidence interval to be at the 95% level individually, since there are n = 27 data pairs and p = 3 parameters being estimated, there are n − p = 24 degrees of freedom available. Under the assumption that the sampling distributions of the parameter estimates are normal, we use a t-based confidence interval for each of α, σ, and γ. The critical t-value is t* = t_.975(24) = 2.064, as found via MATLAB. This results in the following set of 3 individual 95% confidence intervals for the 3 parameters:

For α: α̂ ± t* SE(α̂) = 4.83 ± 2.064(.4745) = 4.83 ± .979 = (3.34, 5.6),
For σ: σ̂ ± t* SE(σ̂) = 4.73 ± 2.064(.76) = 4.73 ± .6 = (.9, 7.354),
For γ: γ̂ ± t* SE(γ̂) = .56 ± 2.064(.7) = .6 ± .469 = (.547, .485).
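For reference, the standard errors underlying these intervals can be approximated directly from the nlinfit output (the residual vector resid and Jacobian J in the code at the end of these solutions), using the usual large-sample approximation cov(β̂) ≈ MSE*(J'J)^(-1); this is a sketch of that calculation rather than the exact internals of nlparci.

n = length(resid); p = 3;              % sample size and number of parameters
mse = sum(resid.^2)/(n - p);           % estimate of the error variance
covb = mse*inv(J'*J);                  % approximate covariance matrix of the estimates
se = sqrt(diag(covb));                 % standard errors for (alpha, sigma, gamma)
ci = [betahat(:) - tinv(.975,n-p)*se, ...   % individual 95% t-based intervals,
      betahat(:) + tinv(.975,n-p)*se]       % comparable to the nlparci output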

The confidence interval for α can be interpreted as: we are 95% confident that the true value of α in the Weibull model relating calcium amount to suspension time is between 3.34 and 5.6. The others can be interpreted similarly.

To have all 3 confidence intervals simultaneously at the 95% level, we need to use the Bonferroni correction, as discussed briefly in class. To do this, instead of finding the critical t-value at the .975 percentile of the t-distribution with 24 degrees of freedom, we divide the lower tail (.025) by 3 (the number of CIs desired) to get a tail probability of .025/3 = .0083 and then compute t_.9917(24) to get the critical value. Doing so gives t_.9917(24) = 2.574 and a wider set of CIs:

For α: (3.6, 5.54)    For σ: (.46, 8.)    For γ: (.43, .6)

in which we are simultaneously 95% confident that the three parameters fall inside their respective intervals.

(f) As indicated in the previous part, the operative t-critical value is t* = t_.975(24) = 2.064. The lengths of the three confidence intervals divided by 2 give the margins of error for the three intervals, and dividing these margins of error by t* gives the standard errors of the three parameter estimates. These calculations are summarized in the table below.

Parameter   Estimate   CI               Margin of Error   Standard Error (= MoE/t*)
α           4.83       (3.34, 5.6)      .98               .475
σ           4.73       (.9, 7.354)      .64               .7
γ           .6         (.547, .485)                       .7

(g) Since the confidence interval for γ clearly includes the value γ = 1 (which makes sense, since γ̂ was so close to 1), we have no evidence that γ differs from 1, and it does seem that this parameter is unnecessary. If we omit this parameter, which we are clearly justified in doing, we are left with the 2-parameter exponential growth model, which was fully discussed in class. As it stands, the model is overspecified.

MATLAB Code Used

clear all;

% Problem 1: Plots of parasite abundances vs. relative densities %
load ../data/arneberg;                              % Load the data
figure(1)
reldens = arneberg.reldens;
abund = arneberg.abund;
plot(reldens,abund,'ko')                            % Plots abundance vs. rel. density
xlabel('Relative Host Population Density','fontsize',14)
ylabel('Parasite Abundance','fontsize',14)
title('Scatterplot of Abundance vs. Relative Density','fontsize',14,'fontweight','b')

figure(2)
logreldens = log(reldens);                          % Log of relative densities
logabund = log(abund);                              % Log of abundances
plot(logreldens,logabund,'ko')                      % Plots log abundance vs. log RD
xlabel('Log Relative Host Population Density','fontsize',14)
ylabel('Log Parasite Abundance','fontsize',14)

title('Scatterplot of Log Abundance vs. Log RD','fontsize',14,'fontweight','b')

% Problem 1: Regression of log abundance on log relative density %
out = regstats(logabund,logreldens);                % Regresses abundance (y) on density (x)
out.tstat                                           % Requests relevant parameter estimate info
out.rsquare                                         % Multiple R-Squared value
out.mse                                             % Mean Squared Error (MSE)
out.fstat                                           % Model F-statistic, df, pval
n = length(abund);                                  % Sample size
p = 2;                                              % Defines # parameters
bhat = out.beta;                                    % Vector of parameter estimates
seb = sqrt(diag(out.covb));                         % Vector of standard errors
tbonf = tinv(1-.005,n-p);                           % 99% uncorrected t*-value
ci_b = [bhat-tbonf*seb,...                          % Confidence intervals for betas
        bhat+tbonf*seb];                            % in columns (lower,upper)
ci_b                                                % Prints confidence intervals

% ================================================= %
% Problem 1: Confidence bands for E(log abundances) %
% ================================================= %
xlab = 'Log Relative Density';                      % X-axis label
ylab = 'Log Parasite Abundance';                    % Y-axis label
% The confregplot function plots the confidence and prediction
% bands for E(y). This function requires as inputs the x & y
% variables, labels for these variables, and the confidence level.
confregplot(logreldens,logabund,xlab,ylab,99)

% ================================================ %
% Problem 3: Weibull fit to Calcium Data with plot %
% ================================================ %
load ../data/calcium.mat;                           % Loads the calcium data
calc = calcium.calcium;
time = calcium.time;                                % Defines calcium & time vectors
plot(time,calc,'ko')                                % Plots calcium vs. time
xlim([0 16])                                        % x-axis plotting limits
xlabel('Time (in minutes)','fontsize',14,'fontweight','b');
ylabel('Calcium (nmoles/mg)','fontsize',14,'fontweight','b');
title('Calcium vs. Time','fontsize',14,'fontweight','b');
hold on;                                            % Hold the current plot

% ======================================================== %
% Problem 3c - Nonlinear Weibull model fit to calcium data %
% ======================================================== %
beta0 = [4 4.14 .88];                               % Parameter starting values
[betahat resid J] = nlinfit(time,...                % Performs nonlinear Weibull fit
    calc,@weibull,beta0);                           % returning betahats, resids, Jacobian
time1 = 0:.1:15;                                    % Vector of times from 0 to 15

yhat1 = betahat(1)*(1-exp(-(time1./...              % Computes Weibull predicted
    betahat(2)).^betahat(3)));                      % values (yhat1's)
plot(time1,yhat1);                                  % Plots the fitted line
hold off;                                           % End hold on current plot
%nlintool(calcium.time,calcium.calcium,@weibull,beta0);

% =========================================================== %
% Problem 3d - Residual and normal quantile plot of residuals %
% =========================================================== %
figure(1)                                           % 1st figure
yhat.weib = calc - resid;                           % Computes Weibull predicted values
plot(yhat.weib,resid,'ko');                         % Plots residuals vs. predicted y-values
xlabel('Predicted Values','fontsize',14,'fontweight','b');
ylabel('Residuals','fontsize',14,'fontweight','b');
title('Residual Plot','fontsize',14,'fontweight','b');
figure(2);                                          % 2nd figure
qqplot(resid);                                      % Normal quantile plot of residuals
xlabel('Standard Normal Quantiles','fontsize',14,'fontweight','b');
ylabel('Residuals','fontsize',14,'fontweight','b');
title('Normal Quantile Plot','fontsize',14,'fontweight','b');

% ======================================================================= %
% Problem 3e,f - Individual 95% Confidence intervals for the 3 parameters %
% ======================================================================= %
ci = nlparci(betahat,resid,J);                      % Computes CIs for (alpha,sigma,gamma)
moe = (ci(:,2)-ci(:,1))/2;                          % Margin of error from CIs
se = moe/tinv(.975,24);                             % Standard errors from CIs
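The Bonferroni-corrected simultaneous intervals discussed in part (e) can be obtained from the same quantities; a minimal sketch, assuming the betahat and se computed above:

k = 3;                                              % number of simultaneous intervals
tb = tinv(1 - .025/k, 24);                          % Bonferroni critical value, t_.9917(24) = 2.574
ci_bonf = [betahat(:) - tb*se, ...                  % wider simultaneous 95% intervals,
           betahat(:) + tb*se]                      % one row per parameter (alpha, sigma, gamma)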
