Stat 401B Final Exam Fall 2016
- Kelley McGee
1 Stat 401B Final Exam Fall 2016

I have neither given nor received unauthorized assistance on this exam.

Name Signed    Date    Name Printed

ATTENTION! Incorrect numerical answers unaccompanied by supporting reasoning will receive NO partial credit. Correct numerical answers to difficult questions unaccompanied by supporting reasoning may not receive full credit. SHOW YOUR WORK/EXPLAIN YOURSELF!
2 1. A manufacturing process produces cylindrical parts.

pts a) Suppose that parts are L inches long with diameter D (also in inches). Part volume is then V = πLD²/4. Suppose that L and D can be modeled as independent continuous random variables, L ~ U 9.99,0.0 and D ~ U.99,.00. The mean and standard deviation of part volume (EV and √(VarV), respectively) are of interest and simulation will be used to evaluate these. Provide a few lines of R code that will do this. (The R syntax for density, cdf, random variable, etc. calls associated with the U(a,b) distribution is "_unif(..,min=a,max=b,..)".)

8 pts b) Suppose that the parts are steel and as manufactured have weights with mean 0. oz and standard deviation. oz. Use the central limit theorem and approximate the probability that such parts have a total weight above oz. (Hint: Rephrase the question in terms of sample average weight.)

2. The 2013 International Journal of Microbiology article "Design and Optimization of a Process for Sugarcane Molasses Fermentation by Saccharomyces cerevisiae Using Response Surface Methodology" by El-Gendy, Madian, and Amr presents results of a study made to optimize the performance of a bioethanol production process. This question employs some results in that paper. Under a first single set of process conditions, n = runs of the process produce yields (in gm/l) of bioethanol with sample mean ȳ =.9 and sample standard deviation s =.03.

a) Give 95% two-sided confidence limits for the standard deviation of process yield under these conditions. (Plug in completely, but you need not simplify.)
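For Problem 1, a simulation and a central limit theorem calculation might be sketched as follows. Every number here is a placeholder chosen for illustration, since the exam's interval endpoints, weights, and part count are not legible in this transcription; the variable names (n_sim, L_min, n_parts, and so on) are likewise assumptions.

```r
# Problem 1 a): Monte Carlo estimate of the mean and SD of V = pi * L * D^2 / 4.
# The uniform endpoints below are HYPOTHETICAL, not the exam's values.
set.seed(401)
n_sim <- 100000
L_min <- 9.99; L_max <- 10.01
D_min <- 1.99; D_max <- 2.01

L <- runif(n_sim, min = L_min, max = L_max)   # simulated lengths
D <- runif(n_sim, min = D_min, max = D_max)   # simulated diameters
V <- pi * L * D^2 / 4                         # simulated volumes

mean_V <- mean(V)   # approximates EV
sd_V   <- sd(V)     # approximates sqrt(VarV)

# Problem 1 b): CLT approximation to P(total weight of n_parts parts > total_limit),
# rephrased as a statement about the sample average weight.
# All four inputs are HYPOTHETICAL.
n_parts <- 50; mu <- 10; sigma <- 0.5; total_limit <- 505
p_above <- 1 - pnorm((total_limit / n_parts - mu) / (sigma / sqrt(n_parts)))
```

Dividing the total by n_parts turns the event into one about the sample mean, whose approximate distribution is normal with mean mu and standard deviation sigma/sqrt(n_parts).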
3 b) Provide a number, #, such that you are 95% sure that 99% of all process yields under these conditions are at least the value #. (Plug in completely, but you need not simplify.)

c) Suppose that a single additional run of the process made under a second set of conditions produces a yield of y =. Assuming that the standard deviation of yields is the same for both sets of conditions, significance testing will be used to evaluate whether this new setup has a different mean yield than the first. Give the value of an appropriate test statistic and name an appropriate reference distribution to be used in finding a p-value.

Value of the test statistic
Reference distribution

pts d) Suppose that several process conditions of interest differ only in the value of incubation period (x in h). What model assumptions for a set of (x, y) data pairs are needed to make confidence limits for the rate of change of mean y with respect to x?

Beginning on Page 8 there is some R code and output for a MLR analysis of n = process runs. These are potentially useful for describing yield, y (in gm/l), as a function of the process variables
x1 = incubation period (h)
x2 = initial pH
x3 = incubation temperature (°C)
x4 = molasses concentration (wt %)
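The one-sample calculations in Problem 2 a) through c) can be sketched in R as below. The summary values n, ybar, s, and y_new are hypothetical stand-ins (the exam's numbers are not legible here), and the confidence level is passed in as a parameter. The tolerance factor uses the noncentral t form of the one-sided normal tolerance bound, which is one standard way to obtain what the course tables would give.

```r
# HYPOTHETICAL summary statistics for the first set of conditions
n <- 10; ybar <- 50; s <- 2
y_new <- 55          # HYPOTHETICAL single yield under the second conditions
conf  <- 0.95        # confidence level
p     <- 0.99        # fraction of the yield distribution to be covered

alpha <- 1 - conf

# a) two-sided chi-square confidence limits for sigma (lower, upper)
sigma_limits <- s * sqrt((n - 1) / qchisq(c(1 - alpha / 2, alpha / 2), df = n - 1))

# b) one-sided lower tolerance bound: ybar - k * s,
#    with k from the noncentral t distribution
k_tol <- qt(conf, df = n - 1, ncp = qnorm(p) * sqrt(n)) / sqrt(n)
lower_tolerance_bound <- ybar - k_tol * s

# c) one new observation versus the old sample mean, common sigma assumed:
#    t = (y_new - ybar) / (s * sqrt(1 + 1/n)), referred to t with n - 1 df
t_stat  <- (y_new - ybar) / (s * sqrt(1 + 1 / n))
p_value <- 2 * pt(-abs(t_stat), df = n - 1)
```

In part c) the only estimate of the common standard deviation comes from the first sample, which is why the reference distribution has n - 1 degrees of freedom.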
4 e) What is estimated by the value "Period 0.43" reported in the table on the output? (What does the value .43 represent in the context of the problem?)

f) Does a model linear in the predictors x1, x2, x3, and x4 provide useful ability to predict yield? Provide some quantitative support for your answer.

YES or NO (circle one)

As it turns out, a MLR with the (14) predictors x1, x2, x3, x4, x1², x2², x3², x4², x1x2, x1x3, x1x4, x2x3, x2x4, x3x4 has R² =.98 and a LOOCV RMSPE.8. The (quadratic) model fit by least squares predicts (an optimal) yield for x1 =, x2 =., x3 = 40, and x4 = 8.. This set of processing conditions has ŷ = 9. and se_ŷ =

pts g) Based on the information above and the R printout, fill out the ANOVA table for computing the overall F for the quadratic model and provide s_SF for the quadratic model.

ANOVA Table
Source        SS    df    MS    F
Regression
Error
Total

s_SF =

h) What do comparisons between s =.03 (from the bottom of Page 2), s_SF from above, and the LOOCV RMSPE of.8 suggest about this situation?

s versus s_SF

s_SF versus LOOCV RMSPE
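The ANOVA table in part g) follows from the total sum of squares (obtainable by adding the Sum Sq column of the anova() printout, since the response is the same), R², the sample size, and the number of predictors. A minimal sketch, where the helper name overall_f_table and the example inputs are hypothetical:

```r
# Build the overall-F ANOVA table for a regression with k predictors
# from SSTot, R^2, and n.  s_sf is the square root of the error mean square.
overall_f_table <- function(ss_total, r_squared, n, k) {
  ss_reg <- r_squared * ss_total        # SSR = R^2 * SSTot
  ss_err <- ss_total - ss_reg           # SSE = SSTot - SSR
  df_reg <- k
  df_err <- n - k - 1
  ms_reg <- ss_reg / df_reg
  ms_err <- ss_err / df_err
  list(
    table = data.frame(
      Source = c("Regression", "Error", "Total"),
      SS     = c(ss_reg, ss_err, ss_total),
      df     = c(df_reg, df_err, n - 1),
      MS     = c(ms_reg, ms_err, NA),
      F      = c(ms_reg / ms_err, NA, NA)
    ),
    s_sf = sqrt(ms_err)
  )
}

# HYPOTHETICAL inputs, only to show the call; for the exam's quadratic model
# k would be 14 and ss_total, r_squared, n would come from the problem.
res <- overall_f_table(ss_total = 100, r_squared = 0.9, n = 21, k = 4)
res$table
res$s_sf
```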
5 i) Based on the (quadratic) MLR model, give 95% prediction limits for the next yield at process conditions x1 =, x2 =., x3 = 40, and x4 = 8.. (Plug in completely, but do not simplify.)

j) Before recommending adoption of process conditions from part i), what steps would you take, and why?

3. The "Appendicitis data set" on the KEEL website provides measured values of medical variables for N = 106 patients and values of a 0-1 variable indicating whether the patient had an appendicitis. Beginning on Page 9 there is R code and output for a logistic regression analysis of these data.

a) Which of the medical variables seems least helpful in predicting whether or not a patient has an appendicitis? Why?

Using bestglm() in the bestglm package and cross validation (presumably on the log likelihood criterion) it is possible to identify a good "reduced" logistic regression model as one with the two predictor variables At3 and At4. There is some R code and output for this model included.

b) Give two-sided 95% confidence limits for the log odds ratio of the probability that a patient with At3 =.0 and At4 = 0 has an appendicitis. (Plug in completely, but do not simplify.)
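Problems 2 i) and 3 b) both reduce to "estimate ± quantile × standard error". The sketch below shows the two forms; the function names and the example numbers are hypothetical, and the log-odds version assumes the fitted value and its standard error are on the linear-predictor (logit) scale, as returned by predict(..., se.fit=TRUE) for a logistic fit.

```r
# Prediction limits for one new response in MLR:
# y_hat +/- t * sqrt(s_sf^2 + se_fit^2), t based on the error df (n - k - 1)
prediction_limits <- function(y_hat, se_fit, s_sf, df_error, conf = 0.95) {
  margin <- qt(1 - (1 - conf) / 2, df = df_error) * sqrt(s_sf^2 + se_fit^2)
  c(lower = y_hat - margin, upper = y_hat + margin)
}

# Large-sample limits for a log odds: eta_hat +/- z * se_eta
log_odds_limits <- function(eta_hat, se_eta, conf = 0.95) {
  margin <- qnorm(1 - (1 - conf) / 2) * se_eta
  c(lower = eta_hat - margin, upper = eta_hat + margin)
}

# HYPOTHETICAL inputs, only to show the calls
pl <- prediction_limits(y_hat = 90, se_fit = 1, s_sf = 2, df_error = 10)
ll <- log_odds_limits(eta_hat = 0.4, se_eta = 0.25)
```

The prediction margin is wider than a confidence margin for the mean response because it adds the single-observation variance s_sf^2 to the squared standard error of the fitted value.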
6 4. Beginning on Page 10 there is R code and output for a factorial analysis of some experimental data on the charge lives of batteries made of 3 materials at 3 different temperatures taken from an experimental design book of Montgomery. (Though the temperature is clearly quantitative, here treat both factors as qualitative.) The data for this study comprise a balanced 3 × 3 factorial data set.

a) Are there statistically detectable interactions between Material and Temperature? Explain.

YES or NO (circle one)

b) Give 95% two-sided confidence limits for the difference between the Material and Material main effects. (Plug in completely, but you need not simplify.)

c) Find the fitted/predicted value of battery life for a battery of Material under Temperature for a "main effects only" model of life. (If this is not possible based on the given information, say why.)

5. There is a famous "Boston Housing" data set on the UCI ML Data Repository. It concerns the median home price in counties around Boston in the late 1970's as predicted by 13 measures of community composition. This question concerns use of data from 4 counties with complete records and prediction of "MEDV". There is R code and output provided (in pretty much the same format as for Lab #) beginning on Page 11. (As a baseline, MLR of MEDV on 13 predictors produces s_SF = 4.3 and R² =.404.) Use the output to answer the following questions.

a) Which of the predictors of MEDV do you like the best, and why?
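For Problem 4 c), it may help to recall that in a balanced two-way factorial the least squares fit of a main-effects-only model at level i of one factor and level j of the other is (row mean) + (column mean) - (grand mean). The sketch below checks this on made-up balanced data; the data frame toy and its values are hypothetical and are not the Montgomery battery data.

```r
# HYPOTHETICAL balanced 3 x 3 factorial with 4 replicates per cell
set.seed(1)
toy <- expand.grid(rep = 1:4,
                   Material = factor(1:3),
                   Temp = factor(1:3))
toy$Life <- round(rnorm(nrow(toy), mean = 100, sd = 20))

# Main-effects-only (additive) model
fit_additive <- lm(Life ~ Material + Temp, data = toy)

# Hand calculation for Material 1 at Temp 2
grand     <- mean(toy$Life)
mat_mean  <- tapply(toy$Life, toy$Material, mean)
temp_mean <- tapply(toy$Life, toy$Temp, mean)
by_hand   <- unname(mat_mean["1"] + temp_mean["2"] - grand)

# Same quantity from the fitted model
new_cell <- data.frame(Material = factor("1", levels = c("1", "2", "3")),
                       Temp     = factor("2", levels = c("1", "2", "3")))
from_lm  <- unname(predict(fit_additive, newdata = new_cell))
```

The agreement between by_hand and from_lm depends on the balance of the design; with unequal cell counts the additive fit generally differs from this simple combination of means.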
7 b) What (if anything) convinces you that you can do better here than MLR for prediction purposes?

c) Why/how is it obvious that the ordinary MLR (OLS) predictor differs very little from the elastic net predictor in this case? What about the particular elastic net fit (chosen by repeated cross-validation) makes this similarity unsurprising?

c) What is the origin of the "vertical stripes" appearance of the plots in the "Tree" column of the matrix of scatterplots of predicted values?

d) Below is a schematic of the tree predictor (plotting is "condition TRUE to the LEFT"). Give a simple description of the conditions (values of the predictors) producing the largest predicted MEDV.

e) If you were going to "stack" two of the predictors here, which two would you consider and why?
8 R Code and Output for Bioethanol Analyses

Biofuels
Period InitialpH Temp Conc Yield

summary(lm(Yield~.,data=Biofuels))

Call:
lm(formula = Yield ~ ., data = Biofuels)

Residuals:
Min 1Q Median 3Q Max

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept)
Period
InitialpH
Temp
Conc

Residual standard error: 8.8 on degrees of freedom
Multiple R-squared: 0.009, Adjusted R-squared: -0.3
F-statistic: on 4 and DF, p-value: 0.99

anova(lm(Yield~.,data=Biofuels))

Analysis of Variance Table
Response: Yield
Df Sum Sq Mean Sq F value Pr(>F)
Period
InitialpH
Temp
Conc
Residuals
9 R Code and Output for the Appendicitis Data Analyses

Appendicitis[:0,]
At1 At2 At3 At4 At5 At6 At7 Class

summary(glm(Class~.,data=Appendicitis))

Call:
glm(formula = Class ~ ., data = Appendicitis)

Deviance Residuals:
Min 1Q Median 3Q Max

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) At e-08 *** ** At At3 At At At At
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for gaussian family taken to be 0.038)

Null deviance:.840 on 105 degrees of freedom
Residual deviance: 0.09 on 98 degrees of freedom
AIC: 3.8

Number of Fisher Scoring iterations:

summary(glm(Class~At3+At4,data=Appendicitis))

Call:
glm(formula = Class ~ At3 + At4, data = Appendicitis)

Deviance Residuals:
Min 1Q Median 3Q Max

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) e- *** At e-0 *** At **
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for gaussian family taken to be 0.883)

Null deviance:.84 on 105 degrees of freedom
Residual deviance:.4 on 103 degrees of freedom
AIC: 9.98

Number of Fisher Scoring iterations:
10 Predlogit<-predict(glm(Class~At3+At4,data=Appendicitis),se.fit=TRUE)$fit
SEPredlogit<-predict(glm(Class~At3+At4,data=Appendicitis),
  se.fit=TRUE)$se.fit
cbind(Appendicitis$At3,Appendicitis$At4,Appendicitis$Class,
  round(Predlogit,3),round(SEPredlogit,3))[:0,]
[,1] [,2] [,3] [,4] [,5]

R Code and Output for Battery Life Study

Batteries
Life MaterialA TempB
11 summary(lm(Life~MaterialA*TempB,data=Batteries))

Call:
lm(formula = Life ~ MaterialA * TempB, data = Batteries)

Residuals:
Min 1Q Median 3Q Max

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) < e- *** MaterialA ** MaterialA TempB e-0 *** TempB MaterialA:TempB MaterialA:TempB MaterialA:TempB ** MaterialA:TempB
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error:.98 on degrees of freedom
Multiple R-squared: 0., Adjusted R-squared: 0.9
F-statistic: on 8 and DF, p-value: 9.4e-0

anova(lm(Life~MaterialA*TempB,data=Batteries))

Analysis of Variance Table
Response: Life
Df Sum Sq Mean Sq F value Pr(>F)
MaterialA **
TempB e-0 ***
MaterialA:TempB *
Residuals
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R Code and Output for Boston Housing Data Analysis

Boston[:,]
CRIM ZN INDUS CHAS NOX RM AGE DIS RAD TAX PTRATIO B LSAT MEDV

summary(Boston)

CRIM            ZN              INDUS           CHAS            NOX
Min. :0.003     Min. : 0.00     Min. : 0.4      Min. :          Min. :0.380
1st Qu.:        1st Qu.: 0.00   1st Qu.: 4.93   1st Qu.:        1st Qu.:0.440
Median :0.903   Median : 0.00   Median : 8.4    Median :        Median :0.90
Mean :.4083     Mean :.         Mean :0.        Mean :0.043     Mean :
3rd Qu.:.4      3rd Qu.:        3rd Qu.:8.0     3rd Qu.:        3rd Qu.:0.00
Max. :9.94      Max. :00.00     Max. :.4        Max. :          Max. :0.80

RM              AGE             DIS             RAD             TAX
Min. :3.        Min. :.90       Min. :.         Min. :.000      Min. :8.0
1st Qu.:.9      1st Qu.: 40.9   1st Qu.:.3      1st Qu.:        1st Qu.:.8
Median :.9      Median :.80     Median : 3.0    Median :.000    Median :.0
Mean :.344      Mean :.         Mean :          Mean :.83       Mean :3.4
3rd Qu.:.3      3rd Qu.: 9.     3rd Qu.:.40     3rd Qu.:.000    3rd Qu.:4.0
Max. :8.80      Max. :00.00     Max. :.         Max. :4.000     Max. :.0

PTRATIO         B               LSAT            MEDV
Min. :.0        Min. : 0.3      Min. :.         Min. :.
1st Qu.:.80     1st Qu.:3.      1st Qu.:.88     1st Qu.:8.0
Median :8.0     Median :39.08   Median :0.0     Median :.9
Mean :8.        Mean :39.83     Mean :.44       Mean :3.
3rd Qu.:0.0     3rd Qu.:39.     3rd Qu.:.0      3rd Qu.:.0
Max. :.00       Max. :39.90     Max. :34.40     Max. :0.00
12 #k= for knn prediction is chosen by repeated CV
sqrt(knn.reg(train=Boston,y=Boston[,14],k=)$PRESS/4)
[1] 4.3
knnpred<-knn.reg(train=Boston,y=Boston[,14],k=)$pred
cbind(Boston$MEDV[1:10],knnpred[1:10])
[,1] [,2]
[1,] 4.0.
[2,]
[3,]
[4,]
[5,]
[6,] 8..4
[7,].9.
[8,]. 8.3
[9,]..0
[10,]

#alpha=.00 and lambda=.3 for the elastic net are chosen by repeated CV
#producing CV RMSPE
x<-as.matrix(Boston[,1:13])
y<-as.matrix(Boston[,14])
BostonNet<-glmnet(x,y,family="gaussian",alpha=.00,lambda=.3)
ENetPred<-predict(BostonNet,newx=x)
cbind(Boston$MEDV[1:10],ENetPred[1:10])
[,1] [,2]
[1,]
[2,]
[3,]
[4,]
[5,]
[6,]
[7,].9.49
[8,]
[9,]
[10,]

#cp=.003 is a good choice of regression tree complexity parameter
#chosen by CV and producing RMSPE
BestTree<-rpart(MEDV~.,data=Boston,method="anova",
  control=rpart.control(cp=.003))
cbind(Boston$MEDV[1:10],predict(BestTree)[1:10])
[,1] [,2]

#mtry= is a good choice for random forest parameter, chosen by CV on OOB error
BostonRf<-randomForest(MEDV~.,data=Boston,
  type="regression",ntree=000,mtry=)
sqrt(BostonRf$mse[000])
[1] 3.009
13 comppred<-cbind(y,lm(MEDV~.,data=Boston)$fitted.values,knnpred,ENetPred,
  predict(BestTree),BostonRf$predicted)
colnames(comppred)<-c("MEDV","OLS","NN","ENET","Tree","RF")
pairs(comppred,panel=function(x,y,...){
  points(x,y)
  abline(0,1)},xlim=c(0,0),ylim=c(0,0))
round(cor(as.matrix(comppred)),)
      MEDV OLS NN ENET Tree RF
MEDV
OLS
NN
ENET
Tree
RF
More informationConsider fitting a model using ordinary least squares (OLS) regression:
Example 1: Mating Success of African Elephants In this study, 41 male African elephants were followed over a period of 8 years. The age of the elephant at the beginning of the study and the number of successful
More informationTA: Sheng Zhgang (Th 1:20) / 342 (W 1:20) / 343 (W 2:25) / 344 (W 12:05) Haoyang Fan (W 1:20) / 346 (Th 12:05) FINAL EXAM
STAT 301, Fall 2011 Name Lec 4: Ismor Fischer Discussion Section: Please circle one! TA: Sheng Zhgang... 341 (Th 1:20) / 342 (W 1:20) / 343 (W 2:25) / 344 (W 12:05) Haoyang Fan... 345 (W 1:20) / 346 (Th
More informationNATIONAL UNIVERSITY OF SINGAPORE EXAMINATION (SOLUTIONS) ST3241 Categorical Data Analysis. (Semester II: )
NATIONAL UNIVERSITY OF SINGAPORE EXAMINATION (SOLUTIONS) Categorical Data Analysis (Semester II: 2010 2011) April/May, 2011 Time Allowed : 2 Hours Matriculation No: Seat No: Grade Table Question 1 2 3
More informationStat 231 Exam 2 Fall 2013
Stat 231 Exam 2 Fall 2013 I have neither given nor received unauthorized assistance on this exam. Name Signed Date Name Printed 1 1. Some IE 361 students worked with a manufacturer on quantifying the capability
More informationSTA 101 Final Review
STA 101 Final Review Statistics 101 Thomas Leininger June 24, 2013 Announcements All work (besides projects) should be returned to you and should be entered on Sakai. Office Hour: 2 3pm today (Old Chem
More informationModule 4: Regression Methods: Concepts and Applications
Module 4: Regression Methods: Concepts and Applications Example Analysis Code Rebecca Hubbard, Mary Lou Thompson July 11-13, 2018 Install R Go to http://cran.rstudio.com/ (http://cran.rstudio.com/) Click
More informationcor(dataset$measurement1, dataset$measurement2, method= pearson ) cor.test(datavector1, datavector2, method= pearson )
Tutorial 7: Correlation and Regression Correlation Used to test whether two variables are linearly associated. A correlation coefficient (r) indicates the strength and direction of the association. A correlation
More informationRegression and the 2-Sample t
Regression and the 2-Sample t James H. Steiger Department of Psychology and Human Development Vanderbilt University James H. Steiger (Vanderbilt University) Regression and the 2-Sample t 1 / 44 Regression
More informationA Handbook of Statistical Analyses Using R. Brian S. Everitt and Torsten Hothorn
A Handbook of Statistical Analyses Using R Brian S. Everitt and Torsten Hothorn CHAPTER 6 Logistic Regression and Generalised Linear Models: Blood Screening, Women s Role in Society, and Colonic Polyps
More informationSTAT420 Midterm Exam. University of Illinois Urbana-Champaign October 19 (Friday), :00 4:15p. SOLUTIONS (Yellow)
STAT40 Midterm Exam University of Illinois Urbana-Champaign October 19 (Friday), 018 3:00 4:15p SOLUTIONS (Yellow) Question 1 (15 points) (10 points) 3 (50 points) extra ( points) Total (77 points) Points
More informationIE 316 Exam 1 Fall 2011
IE 316 Exam 1 Fall 2011 I have neither given nor received unauthorized assistance on this exam. Name Signed Date Name Printed 1 1. Suppose the actual diameters x in a batch of steel cylinders are normally
More informationIE 361 Exam 3 Fall I have neither given nor received unauthorized assistance on this exam.
IE 361 Exam 3 Fall 2012 I have neither given nor received unauthorized assistance on this exam. Name Date 1 1. I wish to measure the density of a small rock. My method is to read the volume of water in
More informationSTAT 212: BUSINESS STATISTICS II Third Exam Tuesday Dec 12, 6:00 PM
STAT212_E3 KING FAHD UNIVERSITY OF PETROLEUM & MINERALS DEPARTMENT OF MATHEMATICS & STATISTICS Term 171 Page 1 of 9 STAT 212: BUSINESS STATISTICS II Third Exam Tuesday Dec 12, 2017 @ 6:00 PM Name: ID #:
More informationMultiple Regression: Example
Multiple Regression: Example Cobb-Douglas Production Function The Cobb-Douglas production function for observed economic data i = 1,..., n may be expressed as where O i is output l i is labour input c
More informationSample solutions. Stat 8051 Homework 8
Sample solutions Stat 8051 Homework 8 Problem 1: Faraway Exercise 3.1 A plot of the time series reveals kind of a fluctuating pattern: Trying to fit poisson regression models yields a quadratic model if
More informationSimple Linear Regression
Simple Linear Regression MATH 282A Introduction to Computational Statistics University of California, San Diego Instructor: Ery Arias-Castro http://math.ucsd.edu/ eariasca/math282a.html MATH 282A University
More informationLab 3 A Quick Introduction to Multiple Linear Regression Psychology The Multiple Linear Regression Model
Lab 3 A Quick Introduction to Multiple Linear Regression Psychology 310 Instructions.Work through the lab, saving the output as you go. You will be submitting your assignment as an R Markdown document.
More informationDealing with Heteroskedasticity
Dealing with Heteroskedasticity James H. Steiger Department of Psychology and Human Development Vanderbilt University James H. Steiger (Vanderbilt University) Dealing with Heteroskedasticity 1 / 27 Dealing
More informationChecking the Poisson assumption in the Poisson generalized linear model
Checking the Poisson assumption in the Poisson generalized linear model The Poisson regression model is a generalized linear model (glm) satisfying the following assumptions: The responses y i are independent
More informationIE 316 Exam 1 Fall 2011
IE 316 Exam 1 Fall 2011 I have neither given nor received unauthorized assistance on this exam. Name Signed Date Name Printed 1 1. Suppose the actual diameters x in a batch of steel cylinders are normally
More informationSTATISTICS 110/201 PRACTICE FINAL EXAM
STATISTICS 110/201 PRACTICE FINAL EXAM Questions 1 to 5: There is a downloadable Stata package that produces sequential sums of squares for regression. In other words, the SS is built up as each variable
More informationPAPER 206 APPLIED STATISTICS
MATHEMATICAL TRIPOS Part III Thursday, 1 June, 2017 9:00 am to 12:00 pm PAPER 206 APPLIED STATISTICS Attempt no more than FOUR questions. There are SIX questions in total. The questions carry equal weight.
More information