Stat 401B Exam 3 Fall 2016 (Corrected Version)
I have neither given nor received unauthorized assistance on this exam.

Name Signed                                Date
Name Printed

ATTENTION! Incorrect numerical answers unaccompanied by supporting reasoning will receive NO partial credit. Correct numerical answers to difficult questions unaccompanied by supporting reasoning may not receive full credit. SHOW YOUR WORK/EXPLAIN YOURSELF!
1. There are data on the UCI Machine Learning Repository, due originally to P. Tüfekci and H. Kaya, concerning the running of a power plant. Hourly information on atmospheric conditions and a plant operating variable was collected over a number of years, along with the hourly energy output of the plant. This question concerns MLR analyses of a random sample of 200 of the hourly periods, made treating mean "PE" (electrical power) as a function of the variables "AT" (ambient temperature in °C), "AP" (ambient pressure in millibars), "RH" (relative humidity in %), and "V" (exhaust vacuum in cm Hg).

5 pts a) Below is a graphic from the "leaps" package function regsubsets() for the n = 200 periods. Which 2 predictors seem to be most effective in predicting PE? What fraction of the raw variability in PE do they account for?

8 pts b) Give the value of and degrees of freedom for an F statistic for comparing the full model involving all predictors to the best 2-predictor model. (While it is not really needed to answer this question, SSTot = for these n = 200 cases.)

F =                d.f. =      ,
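As a hedged illustration of the computation part b) asks for, the sketch below forms a partial F statistic from the R-squared values of a full and a reduced (nested) MLR model. The function name `partial_f` and the R-squared values passed to it are hypothetical and are not taken from the exam graphic. Because SSE = (1 - R^2) SSTot, SSTot cancels from the ratio, which is why its value is not really needed.

```r
# Partial F statistic for comparing a full MLR model to a nested reduced model.
# r2_full, r2_reduced: R-squared values of the two fits (hypothetical inputs here)
# n: number of cases; p_full, p_reduced: numbers of predictors (intercept excluded)
partial_f <- function(r2_full, r2_reduced, n, p_full, p_reduced) {
  df1 <- p_full - p_reduced          # numerator d.f.: predictors dropped
  df2 <- n - p_full - 1              # denominator d.f.: error d.f. of the full model
  f <- ((r2_full - r2_reduced) / df1) / ((1 - r2_full) / df2)
  c(F = f, df1 = df1, df2 = df2)
}

# Made-up R-squared values, with n = 200 cases and 4 versus 2 predictors
res <- partial_f(r2_full = 0.93, r2_reduced = 0.92, n = 200,
                 p_full = 4, p_reduced = 2)
print(res)
```

With n = 200, 4 predictors in the full model, and 2 in the reduced model, the degrees of freedom are 2 and 195 whatever the R-squared values turn out to be.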
Below are some results (cross-validation root mean squared prediction error) from repeated 10-fold cross-validation, and values of $\sqrt{MSE}$ and $R^2$ for several MLR models for PE.

Model   Predictors Included   CV-RMSPE   RMSE   R-Squared
1       V
2       AT
3       AT,V
4       AT,RH
5       AT,AP,RH
6       AT,V,RH
7       AT,V,AP,RH

pts c) Which of models 1-7 is most attractive on the basis of the table above? Explain.

4 pts d) What about the table above suggests that none of the models fit there suffers dramatic overfitting?

Below are some scatterplots of the data from the 200 sample hours.

4 pts e) Is there evidence of multicollinearity in these plots? If so, what is it?
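The kind of comparison summarized in the table can be sketched as follows. This is a minimal base-R version of repeated 10-fold cross-validation run on simulated data. The data frame `sim`, the helper `cv_rmspe`, and the choice of 5 repeats are assumptions made for illustration and are not the code used to produce the exam table.

```r
set.seed(401)

# Simulated stand-in data (NOT the power plant data)
n <- 200
sim <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
sim$y <- 1 + 2 * sim$x1 - sim$x2 + rnorm(n)

# Repeated k-fold cross-validation root mean squared prediction error for lm()
cv_rmspe <- function(formula, data, k = 10, repeats = 5) {
  response <- all.vars(formula)[1]   # name of the response variable
  n <- nrow(data)
  sq_err <- numeric(0)
  for (r in seq_len(repeats)) {
    fold <- sample(rep(seq_len(k), length.out = n))  # random fold labels
    for (j in seq_len(k)) {
      fit <- lm(formula, data = data[fold != j, ])   # fit without fold j
      held <- data[fold == j, ]                      # predict the held-out fold
      sq_err <- c(sq_err, (held[[response]] - predict(fit, newdata = held))^2)
    }
  }
  sqrt(mean(sq_err))
}

full <- lm(y ~ x1 + x2, data = sim)
print(c(CV_RMSPE = cv_rmspe(y ~ x1 + x2, sim), RMSE = summary(full)$sigma))
```

A CV-RMSPE close to the in-sample RMSE is the pattern one would expect from a model that is not badly overfit.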
2. There is an interesting "Banknote Authentication" data set on the UCI Machine Learning Repository that consists of 4 numerical features extracted from grey scale images of real and counterfeit banknotes. There are 610 counterfeit and 762 real notes represented in the data set. There is a printout beginning on Page 8 of this exam from an attempt to model the probability that a note is counterfeit (V5=1) as a function of the features (V1,V2,V3,V4). Use it to answer the following questions.

a) Which of the features V1,V2,V3,V4 appears to be least important in modeling the probability that V5=1 (the note is counterfeit)? Explain.

b) Recall that if $p(u) = \exp(u) / (1 + \exp(u))$ then the "log odds" are $\ln\left( p(u) / (1 - p(u)) \right) = u$. Give approximately 95% confidence limits for the increase in log odds that a banknote is counterfeit accompanying a unit increase in V1 if the other features V2,V3,V4 are held fixed.

c) Give 2-sided approximately 95% confidence limits for the probability that a banknote with features V1=.2,V2=.8,V3=.4,V4=-.6 is counterfeit.
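Parts b) and c) both rest on Wald-type limits computed on the log odds scale. The sketch below shows one way such limits might be computed in R on simulated data. The data frame `toy`, its two features, and the coefficient values used to generate it are hypothetical stand-ins for the banknote data.

```r
set.seed(1)

# Simulated stand-in data with two features (NOT the banknote data)
n <- 500
toy <- data.frame(V1 = rnorm(n), V2 = rnorm(n))
toy$V5 <- rbinom(n, 1, plogis(0.5 - 1.0 * toy$V1 + 0.5 * toy$V2))
fit <- glm(V5 ~ V1 + V2, data = toy, family = binomial())

z <- qnorm(0.975)  # about 1.96

# b) limits for the change in log odds per unit increase in V1:
#    estimate +/- z * standard error
est <- coef(summary(fit))["V1", "Estimate"]
se  <- coef(summary(fit))["V1", "Std. Error"]
ci_logodds <- est + c(-1, 1) * z * se

# c) limits for a probability: build limits on the log odds (link) scale,
#    then map both endpoints through p(u) = exp(u) / (1 + exp(u))
new <- data.frame(V1 = 0.2, V2 = 0.8)
pr <- predict(fit, newdata = new, se.fit = TRUE)  # default type is "link"
ci_prob <- plogis(pr$fit + c(-1, 1) * z * pr$se.fit)

print(ci_logodds)
print(ci_prob)
```

Transforming the endpoints, rather than attaching a standard error directly to the fitted probability, keeps the limits inside (0, 1).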
3. A data set in the book Regression Analysis by Graybill and Iyer concerns how an optical reading, y, measuring light transmitted through a chemical solution depends upon the concentration of a chemical, x (in mg/l). A possible nonlinear (in coefficients $\beta_1$, $\beta_2$, and $\beta_3$) form for the relationship between x and mean y is

$\mu_{y|x} = \beta_1 + \beta_2 \exp(-\beta_3 x)$   (*)

A printout beginning on Page 9 summarizes an analysis of the n = 12 pairs in the data set.

a) Suppose relationship (*) above holds and that for a given concentration the optical reading is normally distributed with standard deviation $\sigma$. Give approximate 95% two-sided confidence limits for this model parameter.

5 pts b) According to the relationship (*), as concentration, x, goes from 0 to $\infty$, the mean light transmitted goes from $\beta_1 + \beta_2$ to $\beta_1$. The value of concentration, x, at which half of the decrease in light transmission has been realized might be of interest. What is this in terms of the model parameters? Give 95% two-sided confidence limits for this value of x.

4. On page 217 of the white Vardeman and Jobe text there are data of Koh, Morden, and Ogbourne that concern axial breaking strengths of wooden dowel rods of 3 different lengths and 3 different diameters. A printout beginning on Page 9 of this exam summarizes some computations with these data.

a) What about the printed analyses of dowel strength makes direct analysis of y under the usual one-way normal model assumptions seem inappropriate? Instead we will henceforth consider analysis of $y' = \ln(y)$.
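For question 3, the two interval computations might be sketched as below. Under relationship (*) with a decaying exponential, half of the decrease from $\beta_1 + \beta_2$ to $\beta_1$ has occurred when $\exp(-\beta_3 x) = 1/2$, that is at $x = \ln(2)/\beta_3$, so limits for $\beta_3$ (assumed positive) can be transformed into limits for that x. The function names and every numerical input here are hypothetical. The chi-square interval for $\sigma$ is the usual normal-theory one, applied only approximately in the nonlinear setting.

```r
# Approximate confidence limits for sigma from a residual standard error s
# with df error degrees of freedom (usual chi-square interval)
sigma_ci <- function(s, df, level = 0.95) {
  a <- 1 - level
  # the upper chi-square quantile gives the lower limit, and vice versa
  s * sqrt(df / qchisq(c(1 - a / 2, a / 2), df))
}

# Limits for the half-decrease concentration x = log(2) / b3, obtained by
# transforming limits for b3 (both limits assumed positive)
half_point_ci <- function(b3_lower, b3_upper) {
  stopifnot(b3_lower > 0, b3_upper > b3_lower)
  log(2) / c(b3_upper, b3_lower)   # a decreasing map swaps the endpoints
}

# Made-up inputs: s = 0.05 on 9 d.f., and b3 limits of 0.8 and 1.2
print(sigma_ci(s = 0.05, df = 9))
print(half_point_ci(0.8, 1.2))
```

The transformation approach in `half_point_ci` relies on $\ln(2)/\beta_3$ being monotone in $\beta_3$. A delta-method standard error would be an alternative route.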
b) Make an interaction plot enhanced with error bars based on 95% confidence limits for combination mean log strengths. What are your "margins of error" for this plotting? (Give a number.)

+/- margin:

c) Based on the plot above, which effects appear to be both statistically detectable and most important? (Consider diameter and length main effects and interactions. List an order of importance.)

d) What items on the printout support your judgment in c)? Explain how they lend support.
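The margin of error asked for in part b) has the form $t \cdot s_{\text{pooled}} / \sqrt{m}$, where m is the common sample size per combination. A minimal sketch follows. The value passed for `s_pooled` is made up, and m = 4 assumes the 36 dowels are spread evenly over the 9 diameter-by-length combinations, which is consistent with the 27 error degrees of freedom on the printout.

```r
# Margin of error for a single combination (cell) mean in a balanced study:
# t quantile times pooled standard deviation over sqrt(per-cell sample size)
margin_of_error <- function(s_pooled, df, n_cell, level = 0.95) {
  qt(1 - (1 - level) / 2, df) * s_pooled / sqrt(n_cell)
}

# Hypothetical pooled standard deviation of 0.1 on the log scale,
# 27 error d.f., 4 observations per combination (assumed balance)
moe <- margin_of_error(s_pooled = 0.1, df = 27, n_cell = 4)
print(moe)
```

Each plotted mean log strength would then carry an error bar extending this far above and below it.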
5. Beginning on Page 12 there is R code and output corresponding to a balanced experiment on paper airplane flight distances (carried out in an undergraduate engineering statistics class). There are 3 levels of the factor "Design," 2 levels of the factor (nose) "Weight," and 3 levels of the factor "Paper" (type) in the study. Use the R output to answer the rest of the questions on this exam.

a) What is the value of $s_{\text{pooled}}$ for this data set? (Say where you found your value.) What does this measure in the present context?

b) What is the relatively simple interpretation that is possible for these data? (What factorial effect(s) dominate(s) and what does that mean about the flying of paper airplanes?) What on the output tells you that this is so?

c) What type or types of airplanes fly furthest (according to the outcome of this study)? Explain.

d) What do you predict for the average flight distance of the type or types of planes you identified in part c) based on a good simple model here?
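The ideas behind parts a) and d) can be illustrated on simulated data. In the sketch below the data frame `toy`, its numbers of levels and replicates, and the "true" design means are all made up to give a small balanced example in which only one factor matters. It is not the class data.

```r
set.seed(12)

# Simulated balanced factorial data (NOT the paper airplane data)
toy <- expand.grid(Design = factor(1:3), Weight = factor(1:2),
                   Paper = factor(1:2), rep = 1:3)
toy$dist <- c(15, 22, 18)[as.integer(toy$Design)] + rnorm(nrow(toy), sd = 2)

# Full factorial fit: its residual standard error is the pooled
# within-combination standard deviation
full <- lm(dist ~ Design * Weight * Paper, data = toy)
s_pooled <- summary(full)$sigma

# Simple model with one dominant main effect: the prediction for a level
# is the average response at that level
simple <- lm(dist ~ Design, data = toy)
pred <- predict(simple, newdata = data.frame(Design = factor(2, levels = 1:3)))

print(anova(full))
print(c(s_pooled = s_pooled, predicted = unname(pred)))
```

When only one main effect is large in the ANOVA table, the one-factor fit gives predictions that pool over the levels of the other factors.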
Page 8

R Code and Output for the Banknote Data

> Banknote[1:5,]
  V1 V2 V3 V4 V5
> summary(Banknote)
       V1          V2          V3          V4          V5
 Min.   :    Min.   :    Min.   :    Min.   :    Min.   :
 1st Qu.:    1st Qu.:    1st Qu.:    1st Qu.:    1st Qu.:
 Median :    Median :    Median :    Median :    Median :
 Mean   :    Mean   :    Mean   :    Mean   :    Mean   :
 3rd Qu.:    3rd Qu.:    3rd Qu.:    3rd Qu.:    3rd Qu.:
 Max.   :    Max.   :    Max.   :    Max.   :    Max.   :
> bank.out<-glm(as.factor(V5)~V1+V2+V3+V4,data=Banknote,family=binomial())
Warning message:
glm.fit: fitted probabilities numerically 0 or 1 occurred
> summary(bank.out)

Call:
glm(formula = as.factor(V5) ~ V1 + V2 + V3 + V4, family = binomial(), data = Banknote)

Deviance Residuals:
 Min 1Q Median 3Q Max

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)                                 e-06 ***
V1                                          e-06 ***
V2                                          e-06 ***
V3                                          e-06 ***
V4

(Dispersion parameter for binomial family taken to be 1)

Null deviance: on 1371 degrees of freedom
Residual deviance: on 1367 degrees of freedom
AIC:

Number of Fisher Scoring iterations: 12

> unknown<-data.frame(V1=.2,V2=.8,V3=.4,V4=-.6)
> predict(bank.out,newdata=unknown,se.fit=TRUE)
$fit

$se.fit
[1]

$residual.scale
[1] 1
Page 9

R Code and Output for the Optical Data

> optical.out<-nls(y~b1+b2*exp(-b3*x),start=c(b1=0,b2=3,b3=1),trace=T)
 :
 :
 :
 :
 :
> summary(optical.out)

Formula: y ~ b1 + b2 * exp(-b3 * x)

Parameters:
   Estimate Std. Error t value Pr(>|t|)
b1
b2                                e-07 ***
b3                                     ***

Residual standard error: on 9 degrees of freedom

Number of iterations to convergence: 4
Achieved convergence tolerance: 7.998e-07

> confint(optical.out)
Waiting for profiling to be done
 :
 :
 :
 :
   2.5% 97.5%
b1
b2
b3
> predict(optical.out)
[1]
[9]

R Code and Output for the Dowel Strength Data

> cbind(type,diam,length,strength)
      type diam length strength
 [1,] [2,] [3,] [4,] [5,] [6,] [7,] [8,] [9,] [10,] [11,] [12,] [13,] [14,] [15,] [16,] [17,] [18,]
Page 10

 [19,] [20,] [21,] [22,] [23,] [24,] [25,] [26,] [27,] [28,] [29,] [30,] [31,] [32,] [33,] [34,] [35,] [36,]
>
> options(contrasts = rep("contr.sum", 2))
>
> aggregate(strength,by=list(type),FUN=mean)
  Group.1 x
> aggregate(strength,by=list(type),FUN=sd)
  Group.1 x
> summary(lm(strength~as.factor(type)))

Call:
lm(formula = strength ~ as.factor(type))

Residuals:
 Min 1Q Median 3Q Max

Coefficients:
                 Estimate Std. Error t value Pr(>|t|)
(Intercept)                                   < 2e-16 ***
as.factor(type)1                                 e-07 ***
as.factor(type)2                                 e-13 ***
as.factor(type)3                                 e-13 ***
as.factor(type)4                                 e-15 ***
as.factor(type)5                                      ***
as.factor(type)6                                 e-12 ***
as.factor(type)7                              < 2e-16 ***
as.factor(type)8                                 e-07 ***

Residual standard error: on 27 degrees of freedom
Multiple R-squared: , Adjusted R-squared:
F-statistic: 219 on 8 and 27 DF, p-value: < 2.2e-16
Page 11

>
> logstrength<-log(strength)
> logstrength
 [1]
 [9]
[17]
[25]
[33]
>
> aggregate(logstrength,by=list(type),FUN=mean)
  Group.1 x
> aggregate(logstrength,by=list(type),FUN=sd)
  Group.1 x
> summary(lm(logstrength~as.factor(type)))

Call:
lm(formula = logstrength ~ as.factor(type))

Residuals:
 Min 1Q Median 3Q Max

Coefficients:
                 Estimate Std. Error t value Pr(>|t|)
(Intercept)                                   < 2e-16 ***
as.factor(type)1
as.factor(type)2                              < 2e-16 ***
as.factor(type)3                              < 2e-16 ***
as.factor(type)4                                 e-16 ***
as.factor(type)5                                 e-05 ***
as.factor(type)6                                 e-10 ***
as.factor(type)7                              < 2e-16 ***
as.factor(type)8                                 e-13 ***

Residual standard error: on 27 degrees of freedom
Multiple R-squared: , Adjusted R-squared:
F-statistic: on 8 and 27 DF, p-value: < 2.2e-16

>
> summary(lm(logstrength~as.factor(diam)*as.factor(length)))

Call:
lm(formula = logstrength ~ as.factor(diam) * as.factor(length))

Residuals:
 Min 1Q Median 3Q Max
Page 12

Coefficients:
                                    Estimate Std. Error t value Pr(>|t|)
(Intercept)                                                      < 2e-16 ***
as.factor(diam)1                                                 < 2e-16 ***
as.factor(diam)2                                                    e-09 ***
as.factor(length)1                                               < 2e-16 ***
as.factor(length)2
as.factor(diam)1:as.factor(length)1                                 e-07 ***
as.factor(diam)2:as.factor(length)1
as.factor(diam)1:as.factor(length)2                                      ***
as.factor(diam)2:as.factor(length)2

Residual standard error: on 27 degrees of freedom
Multiple R-squared: , Adjusted R-squared:
F-statistic: on 8 and 27 DF, p-value: < 2.2e-16

> anova(lm(logstrength~as.factor(diam)*as.factor(length)))
Analysis of Variance Table

Response: logstrength
                                  Df Sum Sq Mean Sq F value Pr(>F)
as.factor(diam)                                             < 2.2e-16 ***
as.factor(length)                                           < 2.2e-16 ***
as.factor(diam):as.factor(length)                                e-06 ***
Residuals

R Code and Output for the Paper Airplane Data

> cbind(design,weight,paper,dist)
      design weight paper dist
 [1,] [2,] [3,] [4,] [5,] [6,] [7,] [8,] [9,] [10,] [11,] [12,] [13,] [14,] [15,] [16,] [17,] [18,] [19,] [20,] [21,] [22,] [23,] [24,] [25,] [26,] [27,] [28,] [29,] [30,] [31,] [32,]
Page 13

 [33,] [34,] [35,] [36,]
>
> Design<-as.factor(design)
> Weight<-as.factor(weight)
> Paper<-as.factor(paper)
>
> summary(lm(dist~Design*Weight*Paper))

Call:
lm(formula = dist ~ Design * Weight * Paper)

Residuals:
 Min 1Q Median 3Q Max

Coefficients:
                       Estimate Std. Error t value Pr(>|t|)
(Intercept)                                            e-13 ***
Design1
Design2                                                e-05 ***
Weight1
Paper1
Design1:Weight1
Design2:Weight1
Design1:Paper1
Design2:Paper1
Weight1:Paper1
Design1:Weight1:Paper1
Design2:Weight1:Paper1

Residual standard error: on 24 degrees of freedom
Multiple R-squared: , Adjusted R-squared:
F-statistic: on 11 and 24 DF, p-value:

> anova(lm(dist~Design*Weight*Paper))
Analysis of Variance Table

Response: dist
                    Df Sum Sq Mean Sq F value Pr(>F)
Design                                            ***
Weight
Paper
Design:Weight
Design:Paper
Weight:Paper
Design:Weight:Paper
Residuals
More informationPAPER 206 APPLIED STATISTICS
MATHEMATICAL TRIPOS Part III Thursday, 1 June, 2017 9:00 am to 12:00 pm PAPER 206 APPLIED STATISTICS Attempt no more than FOUR questions. There are SIX questions in total. The questions carry equal weight.
More information> nrow(hmwk1) # check that the number of observations is correct [1] 36 > attach(hmwk1) # I like to attach the data to avoid the '$' addressing
Homework #1 Key Spring 2014 Psyx 501, Montana State University Prof. Colleen F Moore Preliminary comments: The design is a 4x3 factorial between-groups. Non-athletes do aerobic training for 6, 4 or 2 weeks,
More informationCh Inference for Linear Regression
Ch. 12-1 Inference for Linear Regression ACT = 6.71 + 5.17(GPA) For every increase of 1 in GPA, we predict the ACT score to increase by 5.17. population regression line β (true slope) μ y = α + βx mean
More informationMultiple Regression Introduction to Statistics Using R (Psychology 9041B)
Multiple Regression Introduction to Statistics Using R (Psychology 9041B) Paul Gribble Winter, 2016 1 Correlation, Regression & Multiple Regression 1.1 Bivariate correlation The Pearson product-moment
More informationSTA 101 Final Review
STA 101 Final Review Statistics 101 Thomas Leininger June 24, 2013 Announcements All work (besides projects) should be returned to you and should be entered on Sakai. Office Hour: 2 3pm today (Old Chem
More informationGeneralised linear models. Response variable can take a number of different formats
Generalised linear models Response variable can take a number of different formats Structure Limitations of linear models and GLM theory GLM for count data GLM for presence \ absence data GLM for proportion
More informationCherry.R. > cherry d h v <portion omitted> > # Step 1.
Cherry.R ####################################################################### library(mass) library(car) cherry < read.table(file="n:\\courses\\stat8620\\fall 08\\trees.dat",header=T) cherry d h v 1
More informationIntroduction and Background to Multilevel Analysis
Introduction and Background to Multilevel Analysis Dr. J. Kyle Roberts Southern Methodist University Simmons School of Education and Human Development Department of Teaching and Learning Background and
More informationSCHOOL OF MATHEMATICS AND STATISTICS
RESTRICTED OPEN BOOK EXAMINATION (Not to be removed from the examination hall) Data provided: Statistics Tables by H.R. Neave MAS5052 SCHOOL OF MATHEMATICS AND STATISTICS Basic Statistics Spring Semester
More informationMultiple Regression Part I STAT315, 19-20/3/2014
Multiple Regression Part I STAT315, 19-20/3/2014 Regression problem Predictors/independent variables/features Or: Error which can never be eliminated. Our task is to estimate the regression function f.
More informationR Hints for Chapter 10
R Hints for Chapter 10 The multiple logistic regression model assumes that the success probability p for a binomial random variable depends on independent variables or design variables x 1, x 2,, x k.
More informationLecture 10 Multiple Linear Regression
Lecture 10 Multiple Linear Regression STAT 512 Spring 2011 Background Reading KNNL: 6.1-6.5 10-1 Topic Overview Multiple Linear Regression Model 10-2 Data for Multiple Regression Y i is the response variable
More informationChecking the Poisson assumption in the Poisson generalized linear model
Checking the Poisson assumption in the Poisson generalized linear model The Poisson regression model is a generalized linear model (glm) satisfying the following assumptions: The responses y i are independent
More informationMODULE 6 LOGISTIC REGRESSION. Module Objectives:
MODULE 6 LOGISTIC REGRESSION Module Objectives: 1. 147 6.1. LOGIT TRANSFORMATION MODULE 6. LOGISTIC REGRESSION Logistic regression models are used when a researcher is investigating the relationship between
More information9. Linear Regression and Correlation
9. Linear Regression and Correlation Data: y a quantitative response variable x a quantitative explanatory variable (Chap. 8: Recall that both variables were categorical) For example, y = annual income,
More informationRegression on Faithful with Section 9.3 content
Regression on Faithful with Section 9.3 content The faithful data frame contains 272 obervational units with variables waiting and eruptions measuring, in minutes, the amount of wait time between eruptions,
More informationLog-linear Models for Contingency Tables
Log-linear Models for Contingency Tables Statistics 149 Spring 2006 Copyright 2006 by Mark E. Irwin Log-linear Models for Two-way Contingency Tables Example: Business Administration Majors and Gender A
More informationPoisson Regression. The Training Data
The Training Data Poisson Regression Office workers at a large insurance company are randomly assigned to one of 3 computer use training programmes, and their number of calls to IT support during the following
More informationSTAT 420: Methods of Applied Statistics
STAT 420: Methods of Applied Statistics Model Diagnostics Transformation Shiwei Lan, Ph.D. Course website: http://shiwei.stat.illinois.edu/lectures/stat420.html August 15, 2018 Department
More information