STATGRAPHICS Rev. 9/13/2013

Calibration Models

Summary
Data Input
Analysis Summary
Analysis Options
Plot of Fitted Model
Predicted Values
Confidence Intervals
Observed versus Predicted
Residual Plots
Comparison of Alternative Models
Unusual Residuals
Influential Points
Save Results
Calculations

Summary

The Calibration Models procedure is designed to construct a statistical model describing the relationship between two variables, X and Y, where the intent of the model-building is to construct an equation that can be used to predict X given Y. In a typical application, X represents the true value of some important quantity, while Y is the measured value. Initially, a set of samples with known X values is used to calibrate the model. Later, when samples with unknown X values are measured, the fitted model is used to make an inverse prediction of X from the measured values Y.

Any of 27 linear and nonlinear models may be fit. The output parallels that of the Simple Regression procedure.

Sample StatFolio: calibration.sgp

© 2013 by StatPoint Technologies, Inc.
Sample Data:

The file galactose.sgd contains data on an experiment performed using a new method for measuring the concentration of galactose in blood. The data are similar to those reported by Neter et al. (1998). n = 12 samples with known galactose concentrations X ranging between 1.0 and 10.0 were measured. The data consist of two columns, Known and Measured.

An additional sample of unknown concentration was also measured. An estimate of the actual concentration of that sample is desired, together with a 95% confidence interval.
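The calibration workflow just described can be sketched in a few lines of Python. This is an illustrative sketch, not STATGRAPHICS code, and the data below are made up rather than the galactose values from the file:

```python
# Fit the calibration line Y = a + b*X on standards with known X, then
# invert the fitted line to predict X from a new measured Y.
# Illustrative data only -- not the values stored in galactose.sgd.

def ols_fit(x, y):
    """Ordinary least squares fit of y = a + b*x; returns (a, b)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b = sxy / sxx
    return ybar - b * xbar, b

def predict_x(a, b, y_new):
    """Inverse prediction: solve y_new = a + b*x for x."""
    return (y_new - a) / b

known    = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
measured = [1.1, 1.9, 3.2, 3.9, 5.1, 6.0, 6.8, 8.1, 9.0, 10.1]

a, b = ols_fit(known, measured)
x_hat = predict_x(a, b, 6.52)   # estimated true concentration for a new sample
```

The point estimate is simple; the harder part, covered under Calculations at the end of this chapter, is attaching confidence limits to the inverse prediction.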
Data Input

The data input dialog box can be used in two ways:

1. Given measurements of samples with known values of X, it can be used to fit the calibration model. The coefficients of the model may be saved for later use.
2. If new measurements are made, the stored coefficients can be used to predict the true value of X.

Fitting the Calibration Model

Y (measured): numeric column containing the n measured values of the quantity to be predicted.

X (actual): numeric column containing the n known values of that quantity.

Fitted Model Statistics: left blank when fitting a new model.

Weights: optional numeric column containing weights to be applied to the residuals if performing a weighted least squares fit. If the variability of Y changes as a function of X, these weights can be used to compensate for the different levels of variability.

Select: subset selection.
Action: select Fit New Model to estimate a new model from Y and X.

Using a Stored Model

Y (measured): numeric column (or single number) containing the measured values of the quantity to be predicted.

Fitted Model Statistics: column containing the statistics saved from the original model estimation. This column would normally have been created using the Save Results option when the model was calibrated. It consists of the estimated intercept, slope, and other relevant information.

Action: select Predict X from Y.
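When a Weights column is supplied, the fit minimizes the weighted sum of squared residuals rather than the ordinary sum. A minimal sketch of weighted least squares for the straight-line case (a textbook formulation, not necessarily STATGRAPHICS' internal code):

```python
# Weighted least squares for y = a + b*x: each squared residual is
# multiplied by its weight w_i, so observations with higher variability
# can be down-weighted.  Returns (a, b).
def wls_fit(x, y, w):
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    return ybar - b * xbar, b
```

With all weights equal, this reduces to the ordinary least squares fit, and rescaling the weights by a constant does not change the estimates.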
Analysis Summary

When fitting a new calibration model, the Analysis Summary shows information about the fitted model.

Calibration Models - measured vs. known
Y (measured): measured
X (actual): known
Number of observations: 12
Linear model: Y = a + b*X

                Least Squares   Standard   T
Parameter       Estimate        Error      Statistic   P-Value
Intercept
Slope

Analysis of Variance
Source          Sum of Squares   Df   Mean Square   F-Ratio   P-Value
Model
Residual
  Lack-of-Fit
  Pure Error
Total (Corr.)

Correlation Coefficient =
R-Squared = percent
R-Squared (adjusted for d.f.) = percent
Standard Error of Est. =
Mean absolute error =
Durbin-Watson statistic = (P=0.942)
Lag 1 residual autocorrelation =

Residual Analysis
        Estimation   Validation
n       12
MSE
MAE
MAPE
ME
MPE

Included in the output are:

Variables and model: identification of the input variables and the model that was fit. By default, a linear model of the form

  Y = a + b X   (1)

is fit, although a different model may be selected using Analysis Options.

Coefficients: the estimated coefficients, standard errors, t-statistics, and P-values. The estimates of the model coefficients can be used to write the fitted equation, which in the example takes the form

  measured = a + b × known   (2)

with the estimated intercept and slope in place of a and b. The t-statistic tests the null hypothesis that the corresponding model parameter equals 0, versus the alternative hypothesis that it does not equal 0. Small P-values (less than 0.05 if
operating at the 5% significance level) indicate that a model coefficient is significantly different from 0. In the sample data, the slope is significantly different from 0 but the intercept is not.

Analysis of Variance: decomposition of the variability of the dependent variable Y into a model sum of squares and a residual or error sum of squares. The residual sum of squares is further partitioned into a lack-of-fit component and a pure-error component. Of particular interest are the F-tests and the associated P-values.

The F-test on the Model line tests the statistical significance of the fitted model. A small P-value (less than 0.05 if operating at the 5% significance level) indicates that a significant relationship of the form specified exists between Y and X. In the sample data, the model is highly significant.

The F-test on the Lack-of-Fit line tests the adequacy of the selected linear model in describing the observed relationship between Y and X. A small P-value indicates that the selected model does not adequately describe the relationship. In such cases, a nonlinear model could be selected using Analysis Options. For the sample data, the large P-value indicates that the linear model is adequate. Note: the lack-of-fit test is available only when more than one measurement has been obtained at the same value of X.

Statistics: summary statistics for the fitted model, including:

Correlation coefficient - measures the strength of the linear relationship between Y and X on a scale ranging from -1 (perfect negative linear correlation) to +1 (perfect positive linear correlation). In the sample data, the correlation is very strong.

R-squared - represents the percentage of the variability in Y which has been explained by the fitted regression model, ranging from 0% to 100%. For the sample data, the regression has accounted for about 99.9% of the variability amongst the measurements.
Adjusted R-squared - the R-squared statistic, adjusted for the number of coefficients in the model. This value is often used to compare models with different numbers of coefficients.

Standard Error of Est. - the estimated standard deviation of the residuals (the deviations around the model). This value is used to create prediction limits for new observations.

Mean Absolute Error - the average absolute value of the residuals.

Durbin-Watson Statistic - a measure of serial correlation in the residuals. If the residuals vary randomly, this value should be close to 2. A small P-value indicates a non-random pattern in the residuals. For data recorded over time, a small P-value could indicate that some trend over time has not been accounted for.

Lag 1 Residual Autocorrelation - the estimated correlation between consecutive residuals, on a scale of -1 to 1. Values far from 0 indicate that significant structure remains unaccounted for by the model.

Residual Analysis - if a subset of the rows in the datasheet has been excluded from the analysis using the Select field on the data input dialog box, the fitted model is used to make predictions of the Y values for those rows. This table shows statistics on the prediction errors, defined by
  e_i = y_i - ŷ_i   (3)

Included are the mean squared error (MSE), the mean absolute error (MAE), the mean absolute percentage error (MAPE), the mean error (ME), and the mean percentage error (MPE). These validation statistics can be compared to the statistics for the fitted model to determine how well that model predicts observations outside of the data used to fit it.

Analysis Options

Type of Model: the model to be estimated. All of the models displayed can be linearized by transforming either X, Y, or both. When fitting a nonlinear model, STATGRAPHICS first transforms the data, then fits the model, and then inverts the transformation to display the results.

Include Constant: whether to include a constant term or intercept in the model. If the constant is removed, the fitted model will pass through the origin at (X,Y) = (0,0).

The available models are shown in the following table:
Model                         Equation                              Transformation on Y     Transformation on X
Linear                        y = a + b·x                           none                    none
Square root-Y                 y = (a + b·x)²                        square root             none
Exponential                   y = exp(a + b·x)                      log                     none
Reciprocal-Y                  y = 1/(a + b·x)                       reciprocal              none
Squared-Y                     y = √(a + b·x)                        square                  none
Square root-X                 y = a + b·√x                          none                    square root
Double square root            y = (a + b·√x)²                       square root             square root
Log-Y square root-X           y = exp(a + b·√x)                     log                     square root
Reciprocal-Y square root-X    y = 1/(a + b·√x)                      reciprocal              square root
Squared-Y square root-X       y = √(a + b·√x)                       square                  square root
Logarithmic-X                 y = a + b·ln(x)                       none                    log
Square root-Y log-X           y = (a + b·ln(x))²                    square root             log
Multiplicative                y = a·x^b                             log                     log
Reciprocal-Y log-X            y = 1/(a + b·ln(x))                   reciprocal              log
Squared-Y log-X               y = √(a + b·ln(x))                    square                  log
Reciprocal-X                  y = a + b/x                           none                    reciprocal
Square root-Y reciprocal-X    y = (a + b/x)²                        square root             reciprocal
S-curve                       y = exp(a + b/x)                      log                     reciprocal
Double reciprocal             y = 1/(a + b/x)                       reciprocal              reciprocal
Squared-Y reciprocal-X        y = √(a + b/x)                        square                  reciprocal
Squared-X                     y = a + b·x²                          none                    square
Square root-Y squared-X       y = (a + b·x²)²                       square root             square
Log-Y squared-X               y = exp(a + b·x²)                     log                     square
Reciprocal-Y squared-X        y = 1/(a + b·x²)                      reciprocal              square
Double squared                y = √(a + b·x²)                       square                  square
Logistic                      y = exp(a + b·x)/(1 + exp(a + b·x))   ln(y/(1-y))             none
Log probit                    y = Φ(a + b·ln(x))                    Φ⁻¹(y) (inv. normal)    log
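The transform-fit-invert recipe behind this table can be sketched directly. The same machinery also supports ranking candidate models by R-squared, as the Comparison of Alternative Models pane described later does. This sketch covers only a handful of the 27 transformation pairs and is an illustration, not STATGRAPHICS' code:

```python
import math

def r_squared(x, y):
    """R-squared of a straight-line fit to (x, y)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    syy = sum((yi - ybar) ** 2 for yi in y)
    return sxy * sxy / (sxx * syy)

# A few of the (Y-transform, X-transform) pairs from the table above:
MODELS = {
    "Linear":         (lambda v: v, lambda v: v),
    "Multiplicative": (math.log,    math.log),
    "Square root-X":  (lambda v: v, math.sqrt),
    "Exponential":    (math.log,    lambda v: v),
    "Reciprocal-X":   (lambda v: v, lambda v: 1.0 / v),
}

def compare_models(x, y):
    """Rank candidate models by R-squared on the transformed data."""
    ranked = []
    for name, (fy, fx) in MODELS.items():
        try:
            ranked.append((r_squared([fx(v) for v in x],
                                     [fy(v) for v in y]), name))
        except ValueError:
            # Transformation undefined (e.g. log of a non-positive value):
            # reported as <no fit> in the comparison table.
            pass
    return sorted(ranked, reverse=True)
```

For example, fitting the Multiplicative model amounts to regressing ln(y) on ln(x) and exponentiating the intercept to recover a.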
To determine which model to fit to the data, the output in the Comparison of Alternative Models pane described below can be helpful, since it fits all of the models and lists them in decreasing order of R-squared.

Plot of Fitted Model

This pane shows the fitted model or models, together with confidence limits and prediction limits if desired.

[Plot of Fitted Model: measured versus known, showing the fitted line with inner confidence limits, outer prediction limits, and a highlighted inverse prediction]

The plot includes:

The line of best fit or prediction equation:

  ŷ = â + b̂·x   (4)

This is the equation that would be used to predict values of the dependent variable Y given values of the independent variable X, or vice versa.

Confidence intervals for the mean response at X. These are the inner bounds in the above plot and describe how well the location of the line has been estimated given the available data sample. As the size of the sample n increases, these bounds become tighter. Note also that the width of the bounds varies as a function of X, with the line estimated most precisely near the average value x̄.

Prediction limits for new observations. These are the outer bounds in the above plot and describe how precisely one could predict where a single new observation would lie. Regardless of the size of the sample, new observations will vary around the true line with a standard deviation equal to σ.

Prediction of a single value. Using Pane Options, a single prediction can be made and plotted. For example, the above plot predicts the value of X given a sample with a specified measured value Y. The predicted value of X equals 6.516, with a 95% confidence interval whose lower limit is 6.25.
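The widths of the two sets of bounds can be sketched as follows; the caller supplies the mean squared error from the ANOVA table and the critical value t_{α/2, n−2}. This is the standard simple-regression formula, shown for illustration:

```python
import math

# Half-widths at a point x0 of (a) the confidence band for the mean
# response and (b) the prediction band for a single new observation.
# The prediction band adds the extra "1" inside the square root, which
# is why it cannot shrink to zero no matter how large n becomes.
def band_halfwidths(x, mse, x0, t_crit):
    n = len(x)
    xbar = sum(x) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    conf = t_crit * math.sqrt(mse * (1 / n + (x0 - xbar) ** 2 / sxx))
    pred = t_crit * math.sqrt(mse * (1 + 1 / n + (x0 - xbar) ** 2 / sxx))
    return conf, pred
```

The (x0 − x̄)² term is what makes both bands narrowest at the mean of X and flare outward toward the extremes of the calibration range.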
Pane Options

Include: the limits to include on the plot.

Confidence Level: the confidence percentage for the limits.

Predict: whether to predict Y or X. Enter the value of the other variable in the At field.

Mean Size or Weight: if the measured value is the average of more than one sample, enter the number of samples m used to calculate the average.

Predicted Values

The model can be used to predict X given Y or Y given X. In the first case, the output is shown below:

Predicted Values for X
                         95.0% Prediction Limits
Y-bar    Predicted X     Lower         Upper

Included in the table are:

Y - the measured value at which the prediction is to be made.

Predicted X - the predicted value of X using the fitted model.

Prediction limits - prediction limits for X at the selected level of confidence. These are the same values displayed on the plot of the fitted model.
Pane Options

Predict: whether to predict Y or X.

Confidence Level: the confidence percentage for the limits.

Mean Size or Weight: if the measured value is the average of more than one sample, enter the number of samples m used to calculate the average.

Predict At: up to 10 values at which to make predictions.

Confidence Intervals

The Confidence Intervals pane shows the potential estimation error associated with each coefficient in the model.

95.0% confidence intervals for coefficient estimates
                         Standard
Parameter    Estimate    Error       Lower Limit    Upper Limit
CONSTANT
SLOPE

Pane Options
Type of Interval: either a two-sided confidence interval or a one-sided confidence bound may be created.

Confidence Level: percentage level for the interval or bound.

Hypothesis Tests

The Hypothesis Tests pane can be used to test hypotheses about the model coefficients. In each case, a t-test is performed. The default tests are shown below:

Hypothesis Tests
Null hypothesis: intercept = 0
Alternative hypothesis: intercept not equal 0
Computed t statistic =
P-value =
Do not reject the null hypothesis for alpha = 0.05.

Null hypothesis: slope = 1
Alternative hypothesis: slope not equal 1
Computed t statistic =
P-value =
Do not reject the null hypothesis for alpha = 0.05.

The first test concerns whether or not the intercept equals 0. If so, the model goes through the origin. A small P-value (less than 0.05 if operating at the 5% significance level) would indicate that the intercept is not equal to 0. In this case, the result is not significant, so the line may well go through the origin. If the slope of the line equals 1, a non-zero intercept would correspond to bias in the measurements.

The second test concerns whether or not the slope equals 1. For a linear model, a slope of 1 indicates that when the known value changes, the measured value changes by the same amount. A small P-value would indicate that the slope is significantly different from 1. In the current case, neither null hypothesis is rejected, indicating that a plausible equation for the calibration curve is measured = known.

Pane Options

Intercept: the value of the intercept specified by the null hypothesis.

Slope: the value of the slope specified by the null hypothesis.
Alternative: the type of alternative hypothesis. If Not Equal is selected, a two-sided P-value is calculated. Otherwise, a one-sided P-value is calculated.

Alpha: the probability of a Type I error (rejecting the null hypothesis when it is true). This does not affect the P-value, only the conclusion stated beneath it.

Observed versus Predicted

The Observed versus Predicted plot shows the observed values of Y on the vertical axis and the predicted values Ŷ on the horizontal axis.

[Plot of measured versus predicted values, with the points scattered around the diagonal line]

If the model fits well, the points should be randomly scattered around the diagonal line. It is sometimes possible to see curvature in this plot, which would indicate the need for a curvilinear model rather than a linear model. Any change in variability from low values of Y to high values of Y might also indicate the need to transform the dependent variable before fitting a model to the data.

Residual Plots

As with all statistical models, it is good practice to examine the residuals. In a regression, the residuals are defined by

  e_i = y_i - ŷ_i   (5)

i.e., the residuals are the differences between the observed data values and the fitted model. The Calibration Models procedure creates various types of residual plots, depending on Pane Options.
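The Studentized residuals and leverages used later under Unusual Residuals and Influential Points can be sketched with the standard textbook formulas for simple regression (not necessarily the exact computations STATGRAPHICS performs):

```python
import math

# Leverage h_i measures how strongly x_i pulls the fitted line; the
# average leverage in a simple regression is 2/n.  The externally
# Studentized residual rescales e_i using an error variance estimated
# with observation i deleted, so one large outlier cannot mask itself.
def leverages(x):
    n = len(x)
    xbar = sum(x) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    return [1 / n + (xi - xbar) ** 2 / sxx for xi in x]

def studentized_residuals(x, y):
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = ybar - b * xbar
    e = [yi - a - b * xi for xi, yi in zip(x, y)]
    sse = sum(ei ** 2 for ei in e)
    out = []
    for ei, hi in zip(e, leverages(x)):
        s2_del = (sse - ei ** 2 / (1 - hi)) / (n - 3)  # MSE with point i deleted
        out.append(ei / math.sqrt(s2_del * (1 - hi)))
    return out
```

The deletion shortcut in `s2_del` avoids refitting the model n times; it requires n of at least 4 so that the deleted fit still has error degrees of freedom.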
Scatterplot versus X

This plot is helpful in visualizing any need for a curvilinear model.

[Residual plot: Studentized residuals versus known]

Normal Probability Plot

This plot can be used to determine whether or not the deviations around the line follow a normal distribution, which is the assumption used to form the prediction intervals.

[Normal probability plot for measured: percentage versus Studentized residual]

If the deviations follow a normal distribution, they should fall approximately along a straight line.
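The quantities behind these residual diagnostics, including the Durbin-Watson statistic from the Analysis Summary and the autocorrelation plot described next, can be sketched as follows. The plotting positions shown are one common convention; STATGRAPHICS' exact positions may differ:

```python
from statistics import NormalDist

# Coordinates for a normal probability plot: sorted residuals paired
# with normal quantiles at plotting positions (i - 0.375)/(n + 0.25).
def normal_plot_points(residuals):
    n = len(residuals)
    nd = NormalDist()
    return [(nd.inv_cdf((i - 0.375) / (n + 0.25)), e)
            for i, e in enumerate(sorted(residuals), start=1)]

# Lag-k autocorrelation of the residual series, as shown in the
# Residual Autocorrelations plot.
def residual_autocorr(e, k):
    n = len(e)
    ebar = sum(e) / n
    den = sum((ei - ebar) ** 2 for ei in e)
    num = sum((e[i] - ebar) * (e[i + k] - ebar) for i in range(n - k))
    return num / den

# Durbin-Watson statistic: near 2 for random residuals, near 0 for
# positive serial correlation, near 4 for negative serial correlation.
def durbin_watson(e):
    num = sum((e[i] - e[i - 1]) ** 2 for i in range(1, len(e)))
    return num / sum(ei ** 2 for ei in e)
```

If the residuals are normal, the points from `normal_plot_points` fall near a straight line whose slope estimates the residual standard deviation.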
Residual Autocorrelations

This plot calculates the autocorrelation between residuals as a function of the number of rows between them in the datasheet.

[Residual autocorrelations for measured: autocorrelation versus lag, with probability limits]

It is only relevant if the data have been collected sequentially. Any bars extending beyond the probability limits would indicate significant dependence between residuals separated by the indicated lag, which would violate the assumption of independence made when fitting the regression model.

Pane Options

Plot: the type of residuals to plot:

1. Residuals - the residuals from the least squares fit.
2. Studentized residuals - the difference between the observed values y_i and the predicted values ŷ_i when the model is fit using all observations except the i-th, divided by the estimated standard error. These residuals are sometimes called externally deleted
residuals, since they measure how far each value is from the fitted model when that model is fit using all of the data except the point being considered. This is important, since a large outlier might otherwise affect the model so much that it would not appear to be unusually far away from the line.

Type: the type of plot to be created. A Scatterplot is used to test for curvature. A Normal Probability Plot is used to determine whether the model residuals come from a normal distribution. An Autocorrelation Function is used to test for dependence between consecutive residuals.

Plot Versus: for a Scatterplot, the quantity to plot on the horizontal axis.

Number of Lags: for an Autocorrelation Function, the maximum number of lags. For small data sets, the number of lags plotted may be less than this value.

Confidence Level: for an Autocorrelation Function, the level used to create the probability limits.

Comparison of Alternative Models

The Comparison of Alternative Models pane shows the R-squared values obtained when fitting each of the 27 available models:

Comparison of Alternative Models
Model                          Correlation   R-Squared
Linear                                       %
Double square root                           %
Double squared                               %
Double reciprocal                            %
Square root-Y logarithmic-X                  %
Multiplicative                               %
Square root-X                                %
Square root-Y                                %
Logarithmic-Y square root-X                  %
S-curve model                                %
Squared-Y                                    %
Squared-X                                    %
Logarithmic-X                                %
Exponential                                  %
Squared-Y square root-X                      %
Square root-Y squared-X                      %
Reciprocal-X                                 %
Squared-Y logarithmic-X                      %
Logarithmic-Y squared-X                      %
Squared-Y reciprocal-X                       %
Reciprocal-Y squared-X                       %
Reciprocal-Y                   <no fit>
Reciprocal-Y square root-X     <no fit>
Reciprocal-Y logarithmic-X     <no fit>
Square root-Y reciprocal-X     <no fit>
Logistic                       <no fit>
Log probit                     <no fit>

The models are listed in decreasing order of R-squared. When selecting an alternative model, consideration should be given to those models near the top of the list. However, since the R-squared statistics are calculated after transforming X and/or Y, the model with the highest R-squared may not be the best. You should always plot the fitted model to see whether it does a good job for your data.

Unusual Residuals

Once the model has been fit, it is useful to study the residuals to determine whether any outliers exist that should be removed from the data. The Unusual Residuals pane lists all observations that have Studentized residuals of 2.0 or greater in absolute value.

Unusual Residuals
                     Predicted                 Studentized
Row    X       Y     Y             Residual    Residual

Studentized residuals greater than 3 in absolute value correspond to points more than 3 standard deviations from the fitted model, which is an extremely rare event for a normal distribution.

Note: points can be removed from the fit while examining the Plot of the Fitted Model by clicking on a point and then pressing the Exclude/Include button on the analysis toolbar. Excluded points are marked with an X.

Influential Points

In fitting a regression model, all observations do not have an equal influence on the parameter estimates in the fitted model. In a simple regression, points located at very low or very high values of X have greater influence than those located nearer to the mean of X. The Influential Points pane displays any observations that have high influence on the fitted model:

Influential Points
                     Predicted                 Studentized
Row    X       Y     Y             Residual    Leverage
Average leverage of single data point =

The table shows every point with leverage equal to 3 or more times that of an average data point, where the leverage of an observation is a measure of its influence on the estimated model coefficients. In general, values with leverage exceeding 5 times that of an average data value should be examined closely, since they have an unusually large impact on the fitted model.

Save Results

The following results may be saved to the datasheet:

1.
Model Statistics - a column of numeric values with information about the fitted model. This column can be used later to predict values of X by selecting Predict X from Y on the data input dialog box.
2. Predicted Values - the predicted value of Y corresponding to each of the n observations.
3. Lower Limits for Predictions - the lower prediction limits for each predicted value.
4. Upper Limits for Predictions - the upper prediction limits for each predicted value.
5. Lower Limits for Forecast Means - the lower confidence limits for the mean value of Y at each of the n values of X.
6. Upper Limits for Forecast Means - the upper confidence limits for the mean value of Y at each of the n values of X.
7. Residuals - the n residuals.
8. Studentized Residuals - the n Studentized residuals.
9. Leverages - the leverage values corresponding to the n values of X.
10. Coefficients - the estimated model coefficients.

Calculations

Inverse Predictions

  x̂_new = (y_new - â) / b̂   (6)

Lower and upper limits for x_new are found using Fieller's approach, which solves for the values of x̂_new at which the prediction limits

  ŷ ± t_{α/2, n-2} · sqrt( MSE · [ 1/m + 1/n + (x̂_new - x̄)² / S_XX ] )   (7)

are equal to y_new, where m is the mean size or weight and

  S_XX = Σ_{i=1}^{n} (x_i - x̄)²   (8)

Additional calculations may be found in the Simple Regression documentation.
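A numerical sketch of the limits defined by (6)-(8): the limits are the x values at which the prediction bands around the fitted line cross y_new. Bisection over an assumed search span is used here for simplicity; this illustrates the idea and is not STATGRAPHICS' exact algorithm:

```python
import math

# Fieller-style inverse-prediction limits: find the x values where the
# prediction bands of equation (7) equal y_new.  Assumes a positive
# slope b and bands that increase in x over the search span (true when
# the slope term dominates the band curvature).
def fieller_limits(a, b, x, mse, t_crit, y_new, m=1, span=100.0):
    n = len(x)
    xbar = sum(x) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)

    def band(xv, sign):
        half = t_crit * math.sqrt(mse * (1 / m + 1 / n + (xv - xbar) ** 2 / sxx))
        return a + b * xv + sign * half

    def solve(sign, lo, hi):
        for _ in range(200):            # bisection on band(x, sign) = y_new
            mid = (lo + hi) / 2
            if band(mid, sign) < y_new:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    x_hat = (y_new - a) / b             # point estimate, equation (6)
    return (solve(+1, x_hat - span, x_hat),   # lower limit: upper band crosses y_new
            solve(-1, x_hat, x_hat + span))   # upper limit: lower band crosses y_new
```

Because the band half-width in (7) grows away from x̄, the resulting interval is generally not symmetric about the point estimate x̂_new.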
More informationThe Model Building Process Part I: Checking Model Assumptions Best Practice (Version 1.1)
The Model Building Process Part I: Checking Model Assumptions Best Practice (Version 1.1) Authored by: Sarah Burke, PhD Version 1: 31 July 2017 Version 1.1: 24 October 2017 The goal of the STAT T&E COE
More informationHow To: Analyze a Split-Plot Design Using STATGRAPHICS Centurion
How To: Analyze a SplitPlot Design Using STATGRAPHICS Centurion by Dr. Neil W. Polhemus August 13, 2005 Introduction When performing an experiment involving several factors, it is best to randomize the
More informationAnalysis of Covariance (ANCOVA) with Two Groups
Chapter 226 Analysis of Covariance (ANCOVA) with Two Groups Introduction This procedure performs analysis of covariance (ANCOVA) for a grouping variable with 2 groups and one covariate variable. This procedure
More informationArrhenius Plot. Sample StatFolio: arrhenius.sgp
Summary The procedure is designed to plot data from an accelerated life test in which failure times have been recorded and percentiles estimated at a number of different temperatures. The percentiles P
More informationBusiness Statistics. Lecture 10: Course Review
Business Statistics Lecture 10: Course Review 1 Descriptive Statistics for Continuous Data Numerical Summaries Location: mean, median Spread or variability: variance, standard deviation, range, percentiles,
More informationEstimating σ 2. We can do simple prediction of Y and estimation of the mean of Y at any value of X.
Estimating σ 2 We can do simple prediction of Y and estimation of the mean of Y at any value of X. To perform inferences about our regression line, we must estimate σ 2, the variance of the error term.
More informationReview 6. n 1 = 85 n 2 = 75 x 1 = x 2 = s 1 = 38.7 s 2 = 39.2
Review 6 Use the traditional method to test the given hypothesis. Assume that the samples are independent and that they have been randomly selected ) A researcher finds that of,000 people who said that
More information1 A Review of Correlation and Regression
1 A Review of Correlation and Regression SW, Chapter 12 Suppose we select n = 10 persons from the population of college seniors who plan to take the MCAT exam. Each takes the test, is coached, and then
More informationAMS 7 Correlation and Regression Lecture 8
AMS 7 Correlation and Regression Lecture 8 Department of Applied Mathematics and Statistics, University of California, Santa Cruz Suumer 2014 1 / 18 Correlation pairs of continuous observations. Correlation
More informationRegression Analysis. Table Relationship between muscle contractile force (mj) and stimulus intensity (mv).
Regression Analysis Two variables may be related in such a way that the magnitude of one, the dependent variable, is assumed to be a function of the magnitude of the second, the independent variable; however,
More information1. Define the following terms (1 point each): alternative hypothesis
1 1. Define the following terms (1 point each): alternative hypothesis One of three hypotheses indicating that the parameter is not zero; one states the parameter is not equal to zero, one states the parameter
More informationInference for the Regression Coefficient
Inference for the Regression Coefficient Recall, b 0 and b 1 are the estimates of the slope β 1 and intercept β 0 of population regression line. We can shows that b 0 and b 1 are the unbiased estimates
More informationMultiple Regression Basic
Chapter 304 Multiple Regression Basic Introduction Multiple Regression Analysis refers to a set of techniques for studying the straight-line relationships among two or more variables. Multiple regression
More informationMidterm 2 - Solutions
Ecn 102 - Analysis of Economic Data University of California - Davis February 23, 2010 Instructor: John Parman Midterm 2 - Solutions You have until 10:20am to complete this exam. Please remember to put
More informationChapter 9. Correlation and Regression
Chapter 9 Correlation and Regression Lesson 9-1/9-2, Part 1 Correlation Registered Florida Pleasure Crafts and Watercraft Related Manatee Deaths 100 80 60 40 20 0 1991 1993 1995 1997 1999 Year Boats in
More informationBusiness Statistics. Lecture 9: Simple Regression
Business Statistics Lecture 9: Simple Regression 1 On to Model Building! Up to now, class was about descriptive and inferential statistics Numerical and graphical summaries of data Confidence intervals
More informationDr. Maddah ENMG 617 EM Statistics 11/28/12. Multiple Regression (3) (Chapter 15, Hines)
Dr. Maddah ENMG 617 EM Statistics 11/28/12 Multiple Regression (3) (Chapter 15, Hines) Problems in multiple regression: Multicollinearity This arises when the independent variables x 1, x 2,, x k, are
More informationSTA 108 Applied Linear Models: Regression Analysis Spring Solution for Homework #6
STA 8 Applied Linear Models: Regression Analysis Spring 011 Solution for Homework #6 6. a) = 11 1 31 41 51 1 3 4 5 11 1 31 41 51 β = β1 β β 3 b) = 1 1 1 1 1 11 1 31 41 51 1 3 4 5 β = β 0 β1 β 6.15 a) Stem-and-leaf
More informationDensity Temp vs Ratio. temp
Temp Ratio Density 0.00 0.02 0.04 0.06 0.08 0.10 0.12 Density 0.0 0.2 0.4 0.6 0.8 1.0 1. (a) 170 175 180 185 temp 1.0 1.5 2.0 2.5 3.0 ratio The histogram shows that the temperature measures have two peaks,
More information28. SIMPLE LINEAR REGRESSION III
28. SIMPLE LINEAR REGRESSION III Fitted Values and Residuals To each observed x i, there corresponds a y-value on the fitted line, y = βˆ + βˆ x. The are called fitted values. ŷ i They are the values of
More informationSTA 302 H1F / 1001 HF Fall 2007 Test 1 October 24, 2007
STA 302 H1F / 1001 HF Fall 2007 Test 1 October 24, 2007 LAST NAME: SOLUTIONS FIRST NAME: STUDENT NUMBER: ENROLLED IN: (circle one) STA 302 STA 1001 INSTRUCTIONS: Time: 90 minutes Aids allowed: calculator.
More informationCh 2: Simple Linear Regression
Ch 2: Simple Linear Regression 1. Simple Linear Regression Model A simple regression model with a single regressor x is y = β 0 + β 1 x + ɛ, where we assume that the error ɛ is independent random component
More informationParametric Estimating Nonlinear Regression
Parametric Estimating Nonlinear Regression The term nonlinear regression, in the context of this job aid, is used to describe the application of linear regression in fitting nonlinear patterns in the data.
More informationAnalysis of Bivariate Data
Analysis of Bivariate Data Data Two Quantitative variables GPA and GAES Interest rates and indices Tax and fund allocation Population size and prison population Bivariate data (x,y) Case corr® 2 Independent
More informationLOOKING FOR RELATIONSHIPS
LOOKING FOR RELATIONSHIPS One of most common types of investigation we do is to look for relationships between variables. Variables may be nominal (categorical), for example looking at the effect of an
More informationDiagnostics and Remedial Measures
Diagnostics and Remedial Measures Yang Feng http://www.stat.columbia.edu/~yangfeng Yang Feng (Columbia University) Diagnostics and Remedial Measures 1 / 72 Remedial Measures How do we know that the regression
More informationRatio of Polynomials Fit Many Variables
Chapter 376 Ratio of Polynomials Fit Many Variables Introduction This program fits a model that is the ratio of two polynomials of up to fifth order. Instead of a single independent variable, these polynomials
More informationBusiness Statistics. Chapter 14 Introduction to Linear Regression and Correlation Analysis QMIS 220. Dr. Mohammad Zainal
Department of Quantitative Methods & Information Systems Business Statistics Chapter 14 Introduction to Linear Regression and Correlation Analysis QMIS 220 Dr. Mohammad Zainal Chapter Goals After completing
More informationMathematics for Economics MA course
Mathematics for Economics MA course Simple Linear Regression Dr. Seetha Bandara Simple Regression Simple linear regression is a statistical method that allows us to summarize and study relationships between
More informationInference for Regression Inference about the Regression Model and Using the Regression Line
Inference for Regression Inference about the Regression Model and Using the Regression Line PBS Chapter 10.1 and 10.2 2009 W.H. Freeman and Company Objectives (PBS Chapter 10.1 and 10.2) Inference about
More informationInference for Regression Simple Linear Regression
Inference for Regression Simple Linear Regression IPS Chapter 10.1 2009 W.H. Freeman and Company Objectives (IPS Chapter 10.1) Simple linear regression p Statistical model for linear regression p Estimating
More informationChapter 27 Summary Inferences for Regression
Chapter 7 Summary Inferences for Regression What have we learned? We have now applied inference to regression models. Like in all inference situations, there are conditions that we must check. We can test
More informationLecture 48 Sections Mon, Nov 16, 2009
and and Lecture 48 Sections 13.4-13.5 Hampden-Sydney College Mon, Nov 16, 2009 Outline and 1 2 3 4 5 6 Outline and 1 2 3 4 5 6 and Exercise 13.4, page 821. The following data represent trends in cigarette
More informationy response variable x 1, x 2,, x k -- a set of explanatory variables
11. Multiple Regression and Correlation y response variable x 1, x 2,, x k -- a set of explanatory variables In this chapter, all variables are assumed to be quantitative. Chapters 12-14 show how to incorporate
More informationChapter 2: Looking at Data Relationships (Part 3)
Chapter 2: Looking at Data Relationships (Part 3) Dr. Nahid Sultana Chapter 2: Looking at Data Relationships 2.1: Scatterplots 2.2: Correlation 2.3: Least-Squares Regression 2.5: Data Analysis for Two-Way
More informationRegression Diagnostics Procedures
Regression Diagnostics Procedures ASSUMPTIONS UNDERLYING REGRESSION/CORRELATION NORMALITY OF VARIANCE IN Y FOR EACH VALUE OF X For any fixed value of the independent variable X, the distribution of the
More informationCanonical Correlations
Canonical Correlations Summary The Canonical Correlations procedure is designed to help identify associations between two sets of variables. It does so by finding linear combinations of the variables in
More informationDr. Junchao Xia Center of Biophysics and Computational Biology. Fall /1/2016 1/46
BIO5312 Biostatistics Lecture 10:Regression and Correlation Methods Dr. Junchao Xia Center of Biophysics and Computational Biology Fall 2016 11/1/2016 1/46 Outline In this lecture, we will discuss topics
More informationChapter 7. Scatterplots, Association, and Correlation
Chapter 7 Scatterplots, Association, and Correlation Bin Zou (bzou@ualberta.ca) STAT 141 University of Alberta Winter 2015 1 / 29 Objective In this chapter, we study relationships! Instead, we investigate
More informationUnit 11: Multiple Linear Regression
Unit 11: Multiple Linear Regression Statistics 571: Statistical Methods Ramón V. León 7/13/2004 Unit 11 - Stat 571 - Ramón V. León 1 Main Application of Multiple Regression Isolating the effect of a variable
More informationSTAT 212 Business Statistics II 1
STAT 1 Business Statistics II 1 KING FAHD UNIVERSITY OF PETROLEUM & MINERALS DEPARTMENT OF MATHEMATICAL SCIENCES DHAHRAN, SAUDI ARABIA STAT 1: BUSINESS STATISTICS II Semester 091 Final Exam Thursday Feb
More informationLecture 11: Simple Linear Regression
Lecture 11: Simple Linear Regression Readings: Sections 3.1-3.3, 11.1-11.3 Apr 17, 2009 In linear regression, we examine the association between two quantitative variables. Number of beers that you drink
More informationRegression used to predict or estimate the value of one variable corresponding to a given value of another variable.
CHAPTER 9 Simple Linear Regression and Correlation Regression used to predict or estimate the value of one variable corresponding to a given value of another variable. X = independent variable. Y = dependent
More informationBusiness Statistics. Lecture 10: Correlation and Linear Regression
Business Statistics Lecture 10: Correlation and Linear Regression Scatterplot A scatterplot shows the relationship between two quantitative variables measured on the same individuals. It displays the Form
More informationFractional Polynomial Regression
Chapter 382 Fractional Polynomial Regression Introduction This program fits fractional polynomial models in situations in which there is one dependent (Y) variable and one independent (X) variable. It
More informationCan you tell the relationship between students SAT scores and their college grades?
Correlation One Challenge Can you tell the relationship between students SAT scores and their college grades? A: The higher SAT scores are, the better GPA may be. B: The higher SAT scores are, the lower
More information2. Outliers and inference for regression
Unit6: Introductiontolinearregression 2. Outliers and inference for regression Sta 101 - Spring 2016 Duke University, Department of Statistical Science Dr. Çetinkaya-Rundel Slides posted at http://bit.ly/sta101_s16
More informationCHAPTER EIGHT Linear Regression
7 CHAPTER EIGHT Linear Regression 8. Scatter Diagram Example 8. A chemical engineer is investigating the effect of process operating temperature ( x ) on product yield ( y ). The study results in the following
More informationTHE ROYAL STATISTICAL SOCIETY 2008 EXAMINATIONS SOLUTIONS HIGHER CERTIFICATE (MODULAR FORMAT) MODULE 4 LINEAR MODELS
THE ROYAL STATISTICAL SOCIETY 008 EXAMINATIONS SOLUTIONS HIGHER CERTIFICATE (MODULAR FORMAT) MODULE 4 LINEAR MODELS The Society provides these solutions to assist candidates preparing for the examinations
More informationPassing-Bablok Regression for Method Comparison
Chapter 313 Passing-Bablok Regression for Method Comparison Introduction Passing-Bablok regression for method comparison is a robust, nonparametric method for fitting a straight line to two-dimensional
More informationASSIGNMENT 3 SIMPLE LINEAR REGRESSION. Old Faithful
ASSIGNMENT 3 SIMPLE LINEAR REGRESSION In the simple linear regression model, the mean of a response variable is a linear function of an explanatory variable. The model and associated inferential tools
More informationMath 3330: Solution to midterm Exam
Math 3330: Solution to midterm Exam Question 1: (14 marks) Suppose the regression model is y i = β 0 + β 1 x i + ε i, i = 1,, n, where ε i are iid Normal distribution N(0, σ 2 ). a. (2 marks) Compute the
More informationMULTIPLE LINEAR REGRESSION IN MINITAB
MULTIPLE LINEAR REGRESSION IN MINITAB This document shows a complicated Minitab multiple regression. It includes descriptions of the Minitab commands, and the Minitab output is heavily annotated. Comments
More information4.1 Least Squares Prediction 4.2 Measuring Goodness-of-Fit. 4.3 Modeling Issues. 4.4 Log-Linear Models
4.1 Least Squares Prediction 4. Measuring Goodness-of-Fit 4.3 Modeling Issues 4.4 Log-Linear Models y = β + β x + e 0 1 0 0 ( ) E y where e 0 is a random error. We assume that and E( e 0 ) = 0 var ( e
More informationEcn Analysis of Economic Data University of California - Davis February 23, 2010 Instructor: John Parman. Midterm 2. Name: ID Number: Section:
Ecn 102 - Analysis of Economic Data University of California - Davis February 23, 2010 Instructor: John Parman Midterm 2 You have until 10:20am to complete this exam. Please remember to put your name,
More information1 Introduction to Minitab
1 Introduction to Minitab Minitab is a statistical analysis software package. The software is freely available to all students and is downloadable through the Technology Tab at my.calpoly.edu. When you
More informationRegression. Marc H. Mehlman University of New Haven
Regression Marc H. Mehlman marcmehlman@yahoo.com University of New Haven the statistician knows that in nature there never was a normal distribution, there never was a straight line, yet with normal and
More informationMultiple Variable Analysis
Multiple Variable Analysis Revised: 10/11/2017 Summary... 1 Data Input... 3 Analysis Summary... 3 Analysis Options... 4 Scatterplot Matrix... 4 Summary Statistics... 6 Confidence Intervals... 7 Correlations...
More informationCircle a single answer for each multiple choice question. Your choice should be made clearly.
TEST #1 STA 4853 March 4, 215 Name: Please read the following directions. DO NOT TURN THE PAGE UNTIL INSTRUCTED TO DO SO Directions This exam is closed book and closed notes. There are 31 questions. Circle
More informationNCSS Statistical Software. Harmonic Regression. This section provides the technical details of the model that is fit by this procedure.
Chapter 460 Introduction This program calculates the harmonic regression of a time series. That is, it fits designated harmonics (sinusoidal terms of different wavelengths) using our nonlinear regression
More informationHomework 2: Simple Linear Regression
STAT 4385 Applied Regression Analysis Homework : Simple Linear Regression (Simple Linear Regression) Thirty (n = 30) College graduates who have recently entered the job market. For each student, the CGPA
More information