Estimating µ: the beginning of all regression models


Estimating µ: the beginning of all regression models

Three examples:
- Cavendish's 29 measurements of the earth's density
- Heights (inches) of 14 eleven-year-old males from the Alberta study
- Half-life of caffeine (hours) in n = 13 healthy non-smokers

Note about shorthand: the shorthand ?(0, σ) is used for some (unspecified) distribution with mean 0 and standard deviation σ. Some authors use the variance rather than the SD: notation ?(0, σ²).

Note about "minimum requirements" for least squares estimation: there is no requirement that ε ~ N(0, σ), i.e., that the ε's be "Normal", i.e., Gaussian. Later, for statistical inferences about the parameters being estimated, the inferences may be somewhat inaccurate if n is small and the distribution of the ε's is not N(0, σ), or if the ε's are not independent of each other.

[Dot plots of the three samples: density, height (inches), caffeine half-life (hours).]

Summary statistics:

              Density   Height   Half-life
  n             29        14        13
  MIN          4.88     53.00      (?)
  MAX          5.85     61.00     9.40
  MEAN (ȳ)     5.45     57.21     5.95
  VARIANCE s²  0.0488    4.96     3.94
  SD (s)       0.22      2.22     1.99

Statistical model:  y = µ + ε,   ε ~ ?(0, σ)

Fitting (i.e., calculating the parameter estimates of) the model for height

By calculator, or a means procedure:

  ȳ = Σy / n = 57.21
  s² = Σ(y − ȳ)² / (n − 1) = 64.357 / 13 = 4.95   (s = 2.22)

By the Mystat/SYSTAT regression program:  MODEL HEIGHT = CONSTANT

Output from the SYSTAT regression program:

  DEP VAR: HEIGHT   N: 14   MULTIPLE R: 0.0   SQUARED MULTIPLE R: 0.0
  ADJUSTED SQUARED MULTIPLE R: 0.0   STANDARD ERROR OF ESTIMATE: 2.22

  VARIABLE  COEFFICIENT  STD ERROR  STD COEF  TOLERANCE  T     P(2 TAIL)
  CONSTANT  57.21        0.59465    0.00000   .          96.0  0.00

Least squares estimate of µ

Σ(y − ȳ)² is smaller than Σ(y − any other measure of the centre)². That is why we can call the statistic ȳ the least squares estimator of µ. (See the applet on the best location to wait for an elevator, and the 'elevator article', in the Ch. 1 Resources for Course 697; see also the other applets there.)

Finding parameter estimates in the output of statistical packages

If you compare with the calculations above, you will readily identify the estimate ȳ = 57.21 for the µ parameter. But what is the estimate of the σ² or σ parameter? We know from our calculator that s = 2.22. In the SYSTAT output (SAS output later!), this estimate is given under the not-very-informative term STANDARD ERROR OF ESTIMATE, i.e.,

  √[ Σ(y − ȳ)² / (n − 1) ] = STANDARD ERROR OF ESTIMATE = 2.22

(SPSS uses this "SEE" terminology too!) You can think of each (y − ȳ) as the 'residual' variation from the mean, and you can therefore call Σ(y − ȳ)² the Sum of Squares of the Residuals, or Residual Sum of Squares for short.

Bridge from Chapter 7 (single mean, or differences of 2 means) to the regression chapter: means indexed by (arrayed along) the values of an 'x' variable.
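The "least squares estimate of µ" claim can be checked numerically. A minimal sketch in Python (not the course's SYSTAT/SAS, and with invented heights, not the Alberta data): the sample mean beats every other candidate centre on the sum of squared deviations, and the "standard error of estimate" of the one-parameter model is just the sample SD.

```python
# Sketch (hypothetical data): verify that the sample mean minimizes
# the sum of squared deviations, and compute the "SEE" (sample SD).
def sum_sq(y, c):
    """Sum of squared deviations of the y's from a candidate centre c."""
    return sum((v - c) ** 2 for v in y)

y = [53, 55, 56, 57, 57, 58, 59, 61]   # made-up heights (inches)
ybar = sum(y) / len(y)

# Any other centre gives a larger sum of squares than ybar does:
for c in [ybar - 1, ybar - 0.1, ybar + 0.1, ybar + 1]:
    assert sum_sq(y, c) > sum_sq(y, ybar)

# "Standard error of estimate" in the mean-only model = sample SD:
n = len(y)
see = (sum_sq(y, ybar) / (n - 1)) ** 0.5
print(round(ybar, 2), round(see, 2))
```

The same comparison works for any data set; replacing `y` with the actual Alberta heights would reproduce ȳ = 57.21 and SEE = 2.22.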

What of the other items output by the regression program?

What is STD ERROR = 0.59465? It is the SE of CONSTANT, i.e., of ȳ. It is what we called the SEM in Chapter 7, where it was given by the formula

  Standard Error of the Mean = SEM = SE(ȳ) = s / √n = 2.22 / √14 = 0.59.

What is T = 96? (Actually it was t = .96E+02 before I translated it to the more friendly 0.96 × 10² = 96.) It is the test statistic corresponding to the test of whether the underlying parameter (µ in our case) is ZERO, i.e., of H0: µ = 0. Of course, the silly computer programmer doesn't know what µ refers to, or that the mean height of 11-year-old boys in Alberta cannot be zero. Since we might have a case where there was genuine interest in the H0 that µ = 0, we will show where t = 96 came from: remember the 1-sample t-test from earlier, and the formula

  t = (ȳ − 0) / SE(ȳ) = 57.21 / 0.59 = 96   (if we don't fuss about rounding errors).

What is P(2 TAIL) = 0.00? (It was P(2 TAIL) = 0.00000 before I truncated it.) It is the P-value obtained by calculating the probability that an observation from the t distribution with n − 1 = 13 df would exceed 96 in absolute value.

What are STD COEF and TOLERANCE? Let's not worry about them for now!

Fitting the beginning of all regression models using SAS

  proc reg data=sasuser.alberta;
    where ( I_Female = 0 and age = 11 );
    model height = ;
  run;

(I discovered this way of calculating ȳ by accident: I forgot to put terms on the right-hand side of a regression equation! It works the same way in INSIGHT.)

The model is simply y = µ + ε, but it can be thought of as

  y = µ·1 + ε
  y = µ·x0 + ε,   where x0 ≡ 1 (a constant);

it is as though we have set it up so that the "predictor variable" x0 in the regression equation is always 1. Then µ is the parameter to be estimated. Some software programs insist that you specify the constant; others assume it unless told otherwise.

Output from SAS PROC REG (see "fitting µ via regression" under the chapter Resources):

  Dependent Variable: HEIGHT        Analysis of Variance*

  Source    DF   Sum of Squares   Mean Square
  Model      0        0.000            .
  Error     13       64.357          4.95
  C Total   13       64.357

  Root MSE   2.22    R-square   0.0000
  Dep Mean  57.21    Adj R-sq   0.0000
  C.V.       3.89

  Variable   DF   Estimate   Std Error   T      Prob > |T|
  INTERCEP    1    57.21      0.59       96.2   0.0001

Notice SAS uses the word "INTERCEP" rather than CONSTANT... and because all names before SAS version 8 were restricted to 8 letters, INTERCEPT gets shortened to "INTERCEP". More importantly, note the name SAS gives to the square root of the average of the squared residuals: ROOT MEAN SQUARE ERROR, shortened to ROOT MSE, i.e., average squared deviation = 64.357/13 = 4.95, and √4.95 = 2.22. Here SAS is less confusing than SPSS and SYSTAT (to be fair, SEE is used a lot in measurement and psychophysics for the variation of measurements on individuals [i.e., no n involved], rather than of statistics).

(*) The ANOVA table. Usually we are not interested in the overall mean µ of Y, but rather in the 'effects' of variables x1, x2, x3 on the mean of Y at each X configuration. In such situations, the 'remaining' y variation is measured from the fitted mean for each configuration of x's; here we have no explanatory variables x1, x2, x3. We cannot "explain" the variation in height reflected in the Error Sum of Squares (64.357), the Error Mean Square (64.357/13 = 4.95), or the Root Mean Square Error (√4.95 = 2.22). In analyses where there are explanatory variables x1, x2, x3, ... (rather than the constant x0 we used here), the ANOVA table will use the overall Σ(y − ȳ)², which SAS calls the "Corrected Total Sum of Squares", as the bottom-line SStotal, and R-square will be the Model SS as a proportion of this SStotal. If we add variables x1, x2, x3, ... to the regression above, then the 64.357 will become the SStotal, to be further partitioned into an SSmodel and an SSerror = Σ(y − µ̂x1,x2,x3)². For "µ via regression" for the density and caffeine half-life examples, see the chapter Resources.
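The `model height = ;` trick can be imitated outside SAS. A sketch in Python with made-up data (not the Alberta heights): fitting the intercept-only model y = µ + ε by least squares reproduces exactly the one-sample summaries that the SAS output displays, including the SEM and the t statistic for H0: µ = 0.

```python
# Sketch (hypothetical heights): the intercept-only least squares fit
# is the sample mean; its Root MSE is the sample SD; the SE of the
# intercept is the SEM of Chapter 7.
import math

y = [54, 55, 56, 57, 58, 59, 60]        # made-up heights (inches)
n = len(y)

mu_hat = sum(y) / n                      # least squares estimate = ybar
ss_error = sum((v - mu_hat) ** 2 for v in y)   # residual sum of squares
mse = ss_error / (n - 1)                 # n - 1 df: one parameter fitted
root_mse = math.sqrt(mse)                # SAS "Root MSE" = sample SD

sem = root_mse / math.sqrt(n)            # SE of the intercept (the SEM)
t = mu_hat / sem                         # t statistic for H0: mu = 0
print(round(mu_hat, 2), round(root_mse, 2), round(t, 1))
```

With the real Alberta data this would give 57.21, 2.22, and 96, matching the INTERCEP row above.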

Estimating a difference in means: 11-year-old boys vs. girls; boys at ages 11 vs. 12; caffeine half-life in non-smokers vs. smokers

[Dot plots: heights (inches) of 11-year-old males vs. females; heights of males at ages 11 vs. 12; caffeine half-life (hours) in non-smokers vs. smokers.]

Statistical model for the difference in average male vs. average female height (see M&M p. 663):

  Males:    y = µMALE + ε,     ε ~ N(0, σ)
  Females:  y = µFEMALE + ε,   ε ~ N(0, σ)
  All:      y = µMALE + (µFEMALE − µMALE)·IfFemale + ε;   ε ~ N(0, σ)

Writing Δ = µFEMALE − µMALE:

  Males:    y = µMALE + Δ·0 + ε
  Females:  y = µMALE + Δ·1 + ε
  All:      y = µMALE + Δ·I + ε,   I = "indicator" of female = 0 if male, 1 if female.

Or, in the more conventional Greek letters (β's rather than µ and Δ) used in regression:

  y = β0 + β1·Indicator_of_Female + ε,   i.e.,   y = β0 + β1·"x" + ε.

Summary statistics for the three comparisons:

       Boys 11  Girls 11  Diff  | Boys 11  Boys 12  Diff  | Non-smk  Smokers  Diff
  n      14       16              14        33               13       13
  ȳ     57.21    57.25    0.04   57.21     59.00    1.79     5.95     3.53    −2.42
  s      2.22     3.41            2.22      3.05             1.99     1.43

INDEPENDENT SAMPLES T-TESTS based on ȳ − ȳ:

                        t     df    Prob   |  t     df    Prob   |   t     df    Prob
  S (separate var.)   0.03  26.0  0.9729    2.24  33.4  0.0319    −3.56  21.8  0.0018
  P (pooled var.*)    0.03  28    0.9736    1.97  45    0.0547    −3.56  24    0.0016

* (for later) For the first panel (heights of males vs. females):

  Pooled variance = [ 13(2.22)² + 15(3.41)² ] / (13 + 15) = 8.5

Fitting (i.e., calculating the parameter estimates of) the model

By calculator:

  b1 = β̂1 = "slope"     = r_xy · SD(y) / SD(x)
  b0 = β̂0 = "intercept" = ȳ − b1·x̄
  σ̂² = MSE = average squared residual = Σ(y − ŷ)² / (n − 2)
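The equivalence between the indicator-variable regression and the Chapter 7 two-sample machinery can be seen in a few lines. A Python sketch with toy numbers (not the Alberta data): with a 0/1 indicator x, the least squares slope is exactly the difference in group means, the intercept is the mean of the x = 0 group, and the regression MSE (on n − 2 df) is exactly the pooled variance.

```python
# Sketch (toy data): indicator-variable regression recovers the two
# group means, and its MSE equals the pooled variance of Chapter 7.
males   = [55.0, 56.0, 58.0, 59.0]      # hypothetical heights, x = 0
females = [54.0, 57.0, 58.0, 61.0]      # hypothetical heights, x = 1

x = [0.0] * len(males) + [1.0] * len(females)
y = males + females
n = len(y)

xbar, ybar = sum(x) / n, sum(y) / n
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
     sum((xi - xbar) ** 2 for xi in x)           # slope
b0 = ybar - b1 * xbar                             # intercept

mean_m = sum(males) / len(males)
mean_f = sum(females) / len(females)

# Residuals are deviations from each group's own mean, so
# MSE = pooled variance, with (n1 - 1) + (n2 - 1) = n - 2 df.
fitted = [b0 + b1 * xi for xi in x]
mse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted)) / (n - 2)
print(b1, b0, mse)
```

Here b1 = mean_f − mean_m and b0 = mean_m, mirroring β1 = Δ and β0 = µMALE in the model above.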

Fitting the model by the SYSTAT computer package:  MODEL HEIGHT = CONSTANT + I_FEMALE

OUTPUT (I've put the parameter estimates in italics):

  N: 30   MULTIPLE R: 0.006   SQUARED MULTIPLE R: 0.000
  ADJ. SQUARED MULTIPLE R: 0.00   STANDARD ERROR OF ESTIMATE: 2.92

  VARIABLE  COEFFICIENT  STD ERROR  STD COEF  TOL.  T        P(2 TAIL)
  CONSTANT  57.21        0.78       0.0000    .     .73E+02  0.0000
  I_FEMALE   0.04        1.07       0.0063    1.00  0.03338  0.9736

  ANALYSIS OF VARIANCE
  SOURCE       SUM-OF-SQUARES  DF  MEAN-SQUARE  F-RATIO  P
  REGRESSION       0.00952      1    0.0095     0.00111  0.9736
  RESIDUAL       239.35714     28    8.5485*

Translation of the OUTPUT ("matching up" the parameter estimates):

  β̂1  (COEFFICIENT for I_FEMALE)    = 0.04
  β̂0  (COEFFICIENT for CONSTANT)    = 57.21
  σ̂²  (MEAN-SQUARE RESIDUAL)        = 8.5485
  σ̂   (ROOT MEAN-SQUARE RESIDUAL)   = √8.5485 = 2.92

Remember what the Greek letters stood for in our statistical model:

  β1 = Δ = µFEMALE − µMALE in our case:  β̂1 = 0.04
  β0 = µMALE:                            β̂0 = 57.21

So the estimate of µFEMALE = 57.21 + 0.04 = 57.25.

* Residuals are calculated by squaring the deviation of each y from the estimated (fitted) mean for persons with the same "X" value (in this case, those of the same sex), summing them to give 239.357, and dividing the sum by 28 to get 8.5485. This is the same procedure used in Ch. 7 to calculate a pooled variance! (If I do the pooled-variance calculations without rounding, I get the same 8.5485. So the regression model 'recovers' the original means and pooled variance.)

What of the other items output by the regression program?

STD ERROR(β̂1) = 1.07 is the SE of Δ̂, the estimate of µFEMALE − µMALE. In Chapter 7, we would have calculated it by the formula

  SE of difference in means = SE(ȳFEMALE − ȳMALE) = √[ s²(1/n1 + 1/n2) ] = s·√(1/n1 + 1/n2)

if we use pooled variances. You can check that the pooled variance = 8.5485, so that s = √8.5485 = 2.92, and thus

  SE(µ̂FEMALE − µ̂MALE) = 2.92 · √(1/14 + 1/16) = 1.07.

[It is no accident that the regression gives the same values, since the residuals are the variation of the individuals from the mean in their own gender group: exactly the same procedure as is used in 'pooling' the variances for the t-test.]

T = 0.03338 in the I_FEMALE row is the test statistic corresponding to the test of whether the underlying parameter is ZERO. It is formed by taking the ratio of the parameter estimate to its SE:

  t = β̂1 / SE[β̂1] = 0.04 / 1.07 = 0.03   (up to rounding).

P(2 TAIL) = 0.9736 is the P-value obtained by calculating the probability that an observation from the "central" or "null" t distribution with 14 + 16 − 2 = 28 df would exceed 0.03338 in absolute value.

100(1 − α)% confidence interval for β1 (and the other β's): use β̂1 and SE[β̂1], together with the 100(α/2)% and 100(1 − α/2)% percentiles of the t distribution with 28 df, to form a 100(1 − α)% confidence interval (CI) for β1. Most modern software packages print CIs, or will if the user requests them! (See the examples under the chapter Resources.)
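The SE and CI arithmetic for β1 can be replayed from the numbers quoted in the notes. A Python sketch (n1 = 14 males, n2 = 16 females, pooled variance 8.5485; the t critical value 2.048 is the approximate 97.5th percentile of the t distribution with 28 df):

```python
# Sketch using the notes' quoted numbers: SE(beta1) via the pooled
# variance, and a 95% CI for beta1 = muFEMALE - muMALE.
import math

n1, n2 = 14, 16
pooled_var = 8.5485                 # pooled variance from the notes
b1 = 0.04                           # fitted difference in means (F - M)

se_b1 = math.sqrt(pooled_var * (1 / n1 + 1 / n2))  # ~1.07, as printed
t_crit = 2.048                      # ~97.5th pctile of t with 28 df
ci = (b1 - t_crit * se_b1, b1 + t_crit * se_b1)
print(round(se_b1, 2), [round(v, 2) for v in ci])
```

The wide interval (roughly −2.15 to 2.23 inches) matches the tiny t statistic: the data are consistent with anything from girls being two inches shorter to two inches taller.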

Other items output by the regression program, continued: the Analysis of Variance (ANOVA) table

Since we are not usually interested in the overall mean µ of the two genders, but rather in their difference, represented by the parameter β1 = Δ = µFEMALE − µMALE, the regression program uses the overall mean as a starting point, and calculates the overall variation of the 30 observations from the mean of the 30 observations. If you had taken a calculator and calculated the variance of the 30 observations, you would have calculated

  s² = Σ(y − ȳ)² / (30 − 1) = 239.36666 / 29 = SStotal / 29.

The program then partitions the SStotal, based on 29 df, into:

  REGRESSION SS =   0.00952   based on  1 df
  RESIDUAL SS   = 239.35714   based on 28 df
  -----------------------------------------
  TOTAL SS      = 239.36666   based on 29 df

The regression has 1 term, x, whose coefficient represents the height variation across gender; that's why the df for the regression is 1. As explained above, the Error Sum of Squares is calculated as

  Error Sum of Squares = Σ(y − ŷ)²,   based on 30 − 2 df,

where ŷ = ȳMALE = 57.21 in the case of males and ŷ = ȳFEMALE = 57.25 in the case of females. So in effect

  Mean Square Error = [ Σ(y − ȳMALE)² + Σ(y − ȳFEMALE)² ] / [ {14 − 1} + {16 − 1} ].

Fitting the above regression in SAS (PROC GLM can also be used; see the full analysis in a separate document under the chapter Resources):

  PROC REG DATA=sasuser.alberta;
    where (age = 11);
    MODEL height = I_Female;
  run;

  Dependent Variable: HEIGHT        Analysis of Variance

  Source    DF   Sum of Squares   Mean Square   F       Prob > F
  Model      1       0.00952        0.00952     0.001   0.9736
  Error     28     239.35714        8.54847
  C Total   29     239.36667

  Root MSE   2.92    R-square    0.0000
  Dep Mean  57.23    Adj R-sq   -0.0357
  C.V.       5.10

  Variable   DF   Estimate   Std Error   T       Prob > |T|
  INTERCEP    1    57.21     0.781412    73.219  0.0001
  I_FEMALE    1     0.04     1.069992     0.033  0.9736

Note the identical P-values from the pooled-variances t-test of the difference in two means and from the test of whether the regression parameter which represents this difference is zero.

I can't get SAS to directly print a CI to accompany each estimate, but one could use

  parameter estimate ± z(α/2) · SE[parameter estimate]

for a 100(1 − α)% CI in this instance, since the df for t are large enough (28) that t can be approximated by z.
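The partition described above, Total SS = Regression SS + Residual SS, is an algebraic identity, and it is easy to check numerically. A Python sketch on toy data (not the Alberta heights): for a 0/1 indicator model the fitted values are the group means, and the three sums of squares add up exactly.

```python
# Sketch (toy data): check the ANOVA identity
#   Total SS about ybar = Regression SS + Residual SS
# for a 0/1 indicator model, whose fitted values are group means.
group0 = [4.0, 5.0, 7.0, 8.0]        # hypothetical y's with x = 0
group1 = [5.0, 6.0, 8.0, 9.0]        # hypothetical y's with x = 1
y = group0 + group1
ybar = sum(y) / len(y)

m0 = sum(group0) / len(group0)
m1 = sum(group1) / len(group1)
fitted = [m0] * len(group0) + [m1] * len(group1)

ss_total = sum((v - ybar) ** 2 for v in y)               # n - 1 = 7 df
ss_resid = sum((v - f) ** 2 for v, f in zip(y, fitted))  # n - 2 = 6 df
ss_model = sum((f - ybar) ** 2 for f in fitted)          # 1 df
print(ss_model, ss_resid, ss_total)
```

R-square is then ss_model / ss_total, exactly as the SAS output defines it (0.00952 / 239.36667 ≈ 0.0000 in the height example, since gender explains almost nothing).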

Height vs. age: fitting the regression using PROC REG in SAS

  PROC REG DATA=sasuser.alberta;
    WHERE (I_Female = 0 and age < 13);
    MODEL height = age;
  RUN;

  Dependent Variable: HEIGHT        Analysis of Variance

  Source    DF   Sum of Squares   Mean Square   F       Prob > F
  Model      1      31.34498       31.34498     3.893   0.0547
  Error     45     362.35714        8.05238
  C Total   46     393.70213

  Root MSE   2.83767   R-square   0.0796
  Dep Mean  58.46809   Adj R-sq   0.0592
  C.V.       4.85337

  Variable   DF   Estimate   Std Error   T       Prob > |T|
  INTERCEP    1    37.57     10.59       3.545   0.0009
  AGE         1     1.78      0.90       1.973   0.0547

Re-fitting with a re-centred age variable:

  DATA from11;
    SET sasuser.alberta;
    /* create a new 'age' variable */
    after11 = age - 11;   /* age 11 --> 0 */
  PROC REG DATA=from11;
    WHERE (I_Female = 0 and age < 13);
    MODEL height = after11;
  RUN;

Output from the SAS program: same as above, except...

  Variable   DF   Estimate   Std Error   T       Prob > |T|
  INTERCEP    1    57.21      0.75       75.441  0.0001
  AFTER11     1     1.78      0.90       1.973   0.0547

Note the much smaller SE for the INTERCEPT, which now has an interpretation: the estimated mean height at age 11. Why is the intercept the only item to change?

["Slope" = 1.79 inches per year; the fitted line passes through the mean heights 57.21 at age 11 and 59.00 at age 12; the intercept is 37.57 when age is measured from 0, and 57.21 when age is measured from age 11.]

Half-life of caffeine in smokers and non-smokers, via 2 different SAS PROCedures: GLM (General Linear Model) and REG. GLM is often used when some variables are categorical with several (k > 2) levels, and the user is too lazy to create k − 1 indicator variables. With k = 2 categories, any 2 numerical values will suffice, as long as the user remembers how far apart the 2 values are!

  data a;
    infile 'halflife.dat';
    input halflife smoking;   /* smoking = 1 if smoker, 0 if not */
  PROC GLM;
    model halflife = smoking;
  run;

Output from the SAS PROC GLM program:

  Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
  Model              1       37.92           37.92       12.65    0.0016
  Error             24       71.92            2.99
  Corrected Total   25      109.84

  R-Square   C.V. (%)   Root MSE   HALFLIFE Mean
  0.34       36.47      1.73       4.74

  Parameter   Estimate   T for H0: Parameter=0   Pr > |T|   Std Error of Estimate*
  INTERCEPT    5.95      12.40                   0.0001     0.48
  SMOKING     -2.41      -3.56                   0.0016     0.67

  data a;
    infile 'halflife.dat';
    input halflife smoking;
  proc REG;
    model halflife = smoking;
  run;

  Analysis of Variance

  Source    DF   Sum of Squares   Mean Square   F        Prob > F
  Model      1       37.92           37.92      12.654   0.0016
  Error     24       71.92            2.99
  C Total   25      109.84

  Root MSE   1.73    R-square   0.3452
  Dep Mean   4.74615

  Variable   DF   Estimate   Std Error   T        Prob > |T|
  INTERCEP    1    5.95       0.48       12.401   0.0001
  SMOKING     1   -2.41       0.67       -3.557   0.0016

* "Std Error of Estimate" here means the same thing as PROC REG's SEs for the parameter estimates. Do not confuse it with the same term used by some packages for the RMSE. Even within the SAS Institute, different teams use different terms!

DO NOT USE AS MANY DECIMAL PLACES AS SAS REPORTS!!! In most packages, you can specify the number of decimals; if not, TRUNCATE!
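The "why is the intercept the only item to change?" question can be answered by direct computation. A Python sketch with toy numbers (not the Alberta data): re-centring x (like `after11 = age - 11`) leaves the slope, fitted values, and residuals unchanged; only the intercept shifts, to the fitted mean at the new zero point.

```python
# Sketch (toy data): re-centring the predictor changes only the
# intercept; the slope is untouched, and the new intercept equals
# the old fitted value at the new origin.
def ols(x, y):
    """Simple least squares: returns (intercept, slope)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
         sum((xi - xbar) ** 2 for xi in x)
    return ybar - b1 * xbar, b1

age    = [11.0, 11.0, 12.0, 12.0, 13.0, 13.0]   # made-up ages
height = [56.0, 58.0, 58.0, 60.0, 60.0, 62.0]   # made-up heights

b0, b1 = ols(age, height)                        # age measured from 0
c0, c1 = ols([a - 11 for a in age], height)      # age measured from 11

# Same slope; new intercept = fitted mean height at age 11.
print(b1, c1, c0, b0 + 11 * b1)
```

The intercept's SE shrinks for the same reason: the fitted value is far more precisely determined at the centre of the observed x's (age 11) than at the extrapolated point age = 0.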