1 Chapter 2, Problem Set 1
- Coleen Adams
- 5 years ago
1. The first model is the smoothest because it imposes a straight line. Only two degrees of freedom are lost. The second model exhibits the most jagged fit because each distinct value of temperature has its own conditional mean. Thirty-nine degrees of freedom are lost. There is no smoothing at all. The third model is a compromise between the first and second models, with four degrees of freedom used up. The compromise is far closer to a straight line than to the unsmoothed fit.

[Figure: partial plots for the three fits: the linear term, the as.factor() term, and the s() term.]

The AIC statistics for the three models are [value missing], [value missing], and [value missing], respectively. By the AIC criterion, the third model has the best fit. The residual deviance statistics for the three models are [value missing], [value missing], and [value missing], respectively. The second model has the smallest residual deviance and by that criterion has the best fit. However, unlike the AIC, the residual deviance does not take the flexibility of the model into account. Failing to account for the flexibility built into a fitting function can cause misleading conclusions about fit quality.

From the GAM plots, the third model seems to be the most useful because it strikes the best balance between simplicity and quality of fit. The first model is the simplest but has the worst fit. The second model fits the data well (indeed, it is the best one can do) but would likely not perform well if it were applied to a test data set, and it would be difficult to interpret in any case. It is too complex. If there were good substantive reasons to favor a linear fit, the first model might well be selected even though it did not perform as well as the third model by statistical criteria.

2. The span parameter seems to have the most influence on the fit of the smoother. A smaller span means that each local fit uses a smaller fraction of the data, which produces a more jagged fit. As the span approaches 1, the smoother becomes more linear. The degree and family tuning parameters influence the path of the smoother, but not to the same extent as the span. Below are the different smoothers with a span of 0.25, 0.50, or 0.75, and a degree of 0, 1, or 2:

[Figure: nine loess fits, one panel for each combination of Span = 0.25, 0.50, 0.75 and Degree = 0, 1, 2.]

3. There are numerous tuning parameter settings that suggest a positive, monotonic relationship and fit the data well. Any of these settings works well:

span=0.50, degree=1, family=gaussian or symmetric
span=0.75, degree=1, family=gaussian or symmetric
span=0.75, degree=2, family=gaussian or symmetric

4. The slope is relatively flat (though still positive) for temperatures less than roughly 75 degrees. At roughly 75 degrees, the slope becomes considerably steeper, and the fit continues to increase monotonically. Below is a smoother using span=0.75 and degree=1.

[Figure: loess smoother with Span = 0.75, Degree = 1.]

5. The quantity and local variation of the data points dictate the width of the confidence intervals. With respect to temperature, instability is greatest at the tail ends of the temperature values because there are relatively few observations at the boundaries of the temperature distribution. Conversely, stability is greatest for temperatures in the mid-60s, mid-70s, and mid-to-high 80s, where there are a number of points and relatively little variation in the values of the response.
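The influence of span and degree described in questions 2 through 5 can be sketched with base R's loess(). This is a minimal illustration on synthetic data, not the problem-set data: the variable names x and y, the simulated curve, and the seed are all assumptions. The enp component of a loess fit reports the equivalent number of parameters, so larger values indicate a more flexible, more jagged fit. The last lines print pointwise standard errors at the sparse tails and at the dense middle of x.

```r
# Synthetic data (an assumption for illustration, not the problem-set data):
# x is dense in the middle and sparse in the tails; y rises faster for larger x.
set.seed(1)
x <- rnorm(150, mean = 75, sd = 8)
y <- 20 + 0.05 * pmax(x - 60, 0)^2 + rnorm(150, sd = 10)
dat <- data.frame(x = x, y = y)

# Fit loess over the grid of tuning parameters discussed in question 2
# and record the equivalent number of parameters for each fit.
grid <- expand.grid(span = c(0.25, 0.50, 0.75), degree = c(0, 1, 2))
grid$enp <- mapply(function(s, d) {
  loess(y ~ x, data = dat, span = s, degree = d, family = "gaussian")$enp
}, grid$span, grid$degree)
print(grid)

# Pointwise standard errors for one setting (span = 0.75, degree = 1),
# evaluated at the smallest, median, and largest x values.
fit <- loess(y ~ x, data = dat, span = 0.75, degree = 1)
at <- data.frame(x = c(min(x), median(x), max(x)))
pred <- predict(fit, newdata = at, se = TRUE)
se <- setNames(pred$se.fit, c("min", "median", "max"))
print(round(se, 2))
```

On data like these, one would expect enp to fall as the span grows and the standard errors to be largest at the extremes of x, where observations are sparse. Setting family = "symmetric" instead of "gaussian" would request the robust fit mentioned in question 3.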
2 Chapter 2, Problem Set 2

1. Summary statistics for four different generalized additive models with loess polynomials are provided below:

Model                                AIC              Deviance
lo(, degree=1) + lo(, degree=1)      [value missing]  [value missing]
lo(, degree=1) + lo(, degree=2)      [value missing]  [value missing]
lo(, degree=2) + lo(, degree=1)      [value missing]  [value missing]
lo(, degree=2) + lo(, degree=2)      [value missing]  [value missing]

The AIC and deviance statistics exhibit little variation. All AIC values are on the order of 161,700, and the deviance statistics range from 11,743,705 to 11,764,933.

The partial plots show some differences in specific local regions of the predictor of interest. When a degree-1 polynomial is applied to latitude, the one-dimensional smoother of latitude shows a negative and monotonic relationship with temperature. The negative slope becomes steeper around a latitude value of 29. When a degree-2 polynomial is applied to latitude, the smoother shows two relative maxima at latitude values of roughly 29 and [value missing]. Aside from these small peaks, the relationship between temperature and latitude is negative.

When a degree-1 polynomial is applied to longitude, the suggested relationship between temperature and longitude is jagged and exhibits several inflection points. For longitude less than roughly -103, the relationship is negative. Between longitudes -103 and -101, temperature increases. After a brief, slight decrease at -101, temperature increases until roughly -97. For longitude values greater than -97, temperature decreases once again. However, the inflection points occur where the data become quite sparse; there is not much support. Unless there are very good subject-matter reasons for those inflections, they might usefully be ignored. When a degree-2 polynomial is applied to longitude, we again observe inflection points near -103 and -97. In addition, there are pronounced peaks at roughly -102 and [value missing]. But the same caveats apply.

[Figure: pairs of partial plots for the four models: lo(, degree = 1) with lo(, degree = 1); lo(, degree = 1) with lo(, degree = 2); lo(, degree = 2) with lo(, degree = 1); lo(, degree = 2) with lo(, degree = 2).]

2. When we construct a generalized additive model using a two-dimensional loess of latitude and longitude, we are able to observe one 3-D perspective plot (rather than two 2-D plots, as in #1). Using span=0.75, we observe that temperature is highest for relatively smaller values of latitude and larger values of longitude. Temperature decreases essentially monotonically with respect to latitude. The surface is not substantially torqued, so the earlier additive model may be adequate. There do not seem to be any interaction effects. Those curious inflection points remain along the front two edges of the figure.
[Figure: perspective plot of the fitted surface for lo(,, span = 0.5).]

3 Chapter 2, Problem Set 3

1. Using span=0.5, the partial plots for latitude and longitude closely resemble the partial plots constructed in Problem Set 2, question 1. The partial plot on year shows two pronounced peaks, in the mid-1960s and the late 1980s. Temperatures decrease from the mid-1960s to the mid-1970s, then increase from the mid-1970s until the early 1980s. After 1980, temperatures decline once again. Finally, the partial plot on month reveals nonlinear behavior, with a maximum during the summer (around July). Temperatures increase from January until the summer and then begin to decline once again. Note that if one increases the span parameter (to, say, 0.75 rather than 0.5), the smoothers become more linear and many of the local peaks disappear.

[Figure: partial plots for lo(, span = 0.5), lo(, span = 0.5), lo(year, span = 0.5), and lo(month, span = 0.5).]

2. Using the default setting for degrees of freedom (df=4), the penalized smoothing spline smoothers are less jagged than the loess smoothers (with span=0.5). Also, the relative minima and maxima are less pronounced when employing penalized splines. The greater smoothness at a fine-grained level may result from the penalized approach's explicit imposition of smoothness requirements at the knots. However, in comparison to the plots produced in #1, the overall conclusions regarding the relationships between temperature and the predictors are generally the same.

The relationship between temperature and latitude is negative and monotonic, with a steeper decrease in temperature for latitude values greater than 28. In the partial plot for longitude, there are two sharp inflection points, around -103 and at -96. With respect to year, the penalized smoothing splines suggest a relative minimum around 1975 and relative maxima around 1966 and 1981 (approximately).
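The contrast in question 2 between a loess smoother with span=0.5 and a penalized smoothing spline with df=4 can be sketched in base R. This is only an illustration under stated assumptions: the data are synthetic, smooth.spline() and loess() stand in for the s() and lo() terms of the gam package (they are related but not identical fitting routines), and the roughness() helper is a hypothetical summary introduced here, not something the problem set uses.

```r
# Synthetic data (an assumption): a noisy sine curve on [0, 10].
set.seed(2)
x <- sort(runif(200, 0, 10))
y <- sin(x) + rnorm(200, sd = 0.4)

# Loess with span = 0.5 (degree = 2 is the loess() default) and a
# penalized smoothing spline constrained to about 4 degrees of freedom.
lo_fit <- loess(y ~ x, span = 0.5, degree = 2)
ss_fit <- smooth.spline(x, y, df = 4)

# Evaluate both fits on a common interior grid.
grid <- seq(1, 9, length.out = 200)
lo_hat <- predict(lo_fit, newdata = data.frame(x = grid))
ss_hat <- predict(ss_fit, x = grid)$y

# Hypothetical roughness summary: sum of squared second differences.
roughness <- function(f) sum(diff(f, differences = 2)^2)
print(c(loess = roughness(lo_hat), spline = roughness(ss_hat)))

# Effective flexibility of each fit.
print(c(loess_enp = lo_fit$enp, spline_df = ss_fit$df))
```

Comparing the two flexibility measures side by side helps separate two explanations for the smoother spline fits: the smoothness penalty itself, and the fact that df=4 may simply allow less flexibility than span=0.5 does.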
[Figure: partial plots for s(), s(), s(year), and s(month).]

4 Chapter 2, Problem Set 4

1. In the generalized linear model, the AIC is 992 and the residual deviance is [value missing]. Age and family income are negatively related to a wife's labor force participation. The log of the expected wage is positively related to labor force participation. In the generalized additive model, the AIC is [value missing] and the residual deviance is [value missing]. The GAM fits better even after adjusting for the degrees of freedom used up. Additionally, the partial plots generated by plot.gam() suggest that the relationships between the response and each of the predictors are not linear. For example, labor force participation increases until around age 45 and then declines. This pattern is totally missed by the generalized linear model. The GAM is a better procedure in this case.

[Figure: partial plots for s(age), s(inc), and s(lwg).]
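The comparison of AIC and residual deviance between a linear-in-predictors logistic model and a more flexible one can be sketched as follows. This is a hedged illustration, not the analysis behind the answer above: the data are simulated, the response name lfp is hypothetical, the peak in participation near age 45 is built into the simulation by assumption, and natural cubic splines from ns() are swapped in for the s() terms of the gam package so that the example runs with base R alone.

```r
library(splines)  # ships with R; ns() builds a natural cubic spline basis

# Simulated data (an assumption): participation peaks near age 45 and
# rises with the log expected wage.
set.seed(3)
n <- 750
age <- runif(n, 30, 60)
lwg <- rnorm(n, mean = 1, sd = 0.5)
eta <- 0.5 - 0.01 * (age - 45)^2 + 0.8 * lwg
lfp <- rbinom(n, size = 1, prob = plogis(eta))
dat <- data.frame(lfp, age, lwg)

# Logistic regression with linear terms versus spline terms.
linear_fit <- glm(lfp ~ age + lwg, family = binomial, data = dat)
smooth_fit <- glm(lfp ~ ns(age, df = 4) + ns(lwg, df = 4),
                  family = binomial, data = dat)

# Deviance can only fall as terms are added; AIC adds 2 per coefficient.
comparison <- data.frame(
  model = c("linear", "smooth"),
  df = c(length(coef(linear_fit)), length(coef(smooth_fit))),
  deviance = c(deviance(linear_fit), deviance(smooth_fit)),
  AIC = c(AIC(linear_fit), AIC(smooth_fit))
)
print(comparison)
```

Because the linear model is nested within the spline model, the residual deviance of the flexible fit cannot be larger. The AIC comparison is the informative one, since it charges the flexible model for the extra coefficients it uses.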
More informationAdditional Notes: Investigating a Random Slope. When we have fixed level-1 predictors at level 2 we show them like this:
Ron Heck, Summer 01 Seminars 1 Multilevel Regression Models and Their Applications Seminar Additional Notes: Investigating a Random Slope We can begin with Model 3 and add a Random slope parameter. If
More informationSwarthmore Honors Exam 2012: Statistics
Swarthmore Honors Exam 2012: Statistics 1 Swarthmore Honors Exam 2012: Statistics John W. Emerson, Yale University NAME: Instructions: This is a closed-book three-hour exam having six questions. You may
More informationSoil Phosphorus Discussion
Solution: Soil Phosphorus Discussion Summary This analysis is ambiguous: there are two reasonable approaches which yield different results. Both lead to the conclusion that there is not an independent
More information3 Results. Part I. 3.1 Base/primary model
3 Results Part I 3.1 Base/primary model For the development of the base/primary population model the development dataset (for data details see Table 5 and sections 2.1 and 2.2), which included 1256 serum
More informationModel comparison and selection
BS2 Statistical Inference, Lectures 9 and 10, Hilary Term 2008 March 2, 2008 Hypothesis testing Consider two alternative models M 1 = {f (x; θ), θ Θ 1 } and M 2 = {f (x; θ), θ Θ 2 } for a sample (X = x)
More informationFrom Data To Functions Howdowegofrom. Basis Expansions From multiple linear regression: The Monomial Basis. The Monomial Basis
From Data To Functions Howdowegofrom Basis Expansions From multiple linear regression: data to functions? Or if there is curvature: y i = β 0 + x 1i β 1 + x 2i β 2 + + ɛ i y i = β 0 + x i β 1 + xi 2 β
More informationDensity Temp vs Ratio. temp
Temp Ratio Density 0.00 0.02 0.04 0.06 0.08 0.10 0.12 Density 0.0 0.2 0.4 0.6 0.8 1.0 1. (a) 170 175 180 185 temp 1.0 1.5 2.0 2.5 3.0 ratio The histogram shows that the temperature measures have two peaks,
More informationIntroducing Generalized Linear Models: Logistic Regression
Ron Heck, Summer 2012 Seminars 1 Multilevel Regression Models and Their Applications Seminar Introducing Generalized Linear Models: Logistic Regression The generalized linear model (GLM) represents and
More informationISyE 691 Data mining and analytics
ISyE 691 Data mining and analytics Regression Instructor: Prof. Kaibo Liu Department of Industrial and Systems Engineering UW-Madison Email: kliu8@wisc.edu Office: Room 3017 (Mechanical Engineering Building)
More information2.2 Classical Regression in the Time Series Context
48 2 Time Series Regression and Exploratory Data Analysis context, and therefore we include some material on transformations and other techniques useful in exploratory data analysis. 2.2 Classical Regression
More information