We'd like to know the equation of the line shown (the so-called best fit or regression line).
- Willis Neal Smith
- 5 years ago
Transcription
Linear Regression in R.

Example. Let's create a data frame.

> exam1 = c(100,90,90,85,80,75,60)
> exam2 = c(95,100,90,80,95,60,40)
> students = c("Asuka", "Rei", "Shinji", "Mari", "Hikari", "Toji", "Kensuke")
> class = data.frame(students,exam1,exam2)

As seen in the previous section, a scatterplot of the data is:

We'd like to know the equation of the line shown (the so-called best fit or regression line). The simplest way to do so is to use the lm (linear model) command.

> lm(exam2~exam1, data=class)

produces
Call:
lm(formula = exam2 ~ exam1, data = class)

Coefficients:
(Intercept)  exam1

exam2 ~ exam1 indicates we want to think of exam 2 as being the dependent variable Y, and exam 1 as the independent variable X. data=class indicates we are looking in the data frame we've named class.

The output indicates that the regression line is given by the formula Y = (Intercept) + (exam1 coefficient)*X. That is, exam2 = (Intercept) + (exam1 coefficient)*exam1, using the two values listed under Coefficients.

More detailed information can be obtained by creating an object and calling for its summary statistics.

> linearmodel1 = lm(exam2~exam1, data=class)

creates an object named linearmodel1 containing the regression information. (R is case sensitive, so the name must be typed the same way every time.)

> summary(linearmodel1)

provides additional information about the regression.

Call:
lm(formula = exam2 ~ exam1, data = class)
Residuals:
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)
exam1                                            **
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: on 5 degrees of freedom
Multiple R-squared: , Adjusted R-squared:
F-statistic: on 1 and 5 DF, p-value:

If you just want the correlation coefficient between exam 2 and exam 1, of course you could just take the square root of R squared, or, if you don't want to go to that trouble,

> cor(class$exam2, class$exam1)

does the trick.
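As a cross-check on the lm output, the slope and intercept can be recomputed from the usual least-squares formulas, and the squared correlation compared with R squared. This is a minimal sketch rather than part of the original handout; the object names fit, slope and intercept are chosen here purely for illustration.

```r
# Exam data from the example above
exam1 <- c(100, 90, 90, 85, 80, 75, 60)
exam2 <- c(95, 100, 90, 80, 95, 60, 40)
class <- data.frame(exam1, exam2)

fit <- lm(exam2 ~ exam1, data = class)

# Least-squares formulas:
#   slope     = cov(x, y) / var(x)
#   intercept = mean(y) - slope * mean(x)
slope <- cov(class$exam1, class$exam2) / var(class$exam1)
intercept <- mean(class$exam2) - slope * mean(class$exam1)

# Compare the hand-computed values with what lm() reports
print(all.equal(unname(coef(fit)), c(intercept, slope)))

# For a one-predictor model, R squared is the squared correlation
print(all.equal(cor(class$exam2, class$exam1)^2, summary(fit)$r.squared))
```

One caution on the square-root shortcut: the square root of R squared only gives the size of the correlation, while cor() also gives its sign, which matches the sign of the slope.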
Multiple regression.

Example. Let's take our existing data frame from the previous section and add some additional variables, namely a third exam and each student's final grade.

> exam3 = c(95, 90, 88, 75, 85, 60, 50)
> final = c(95, 91, 86, 77, 86, 55, 50)
> class = data.frame(students, exam1, exam2, exam3, final)
> class
  students exam1 exam2 exam3 final
1    Asuka   100    95    95    95
2      Rei    90   100    90    91
3   Shinji    90    90    88    86
4     Mari    85    80    75    77
5   Hikari    80    95    85    86
6     Toji    75    60    60    55
7  Kensuke    60    40    50    50

Now suppose we wanted to create a model which relates the student's final grade to their performance on the first three exams.

> grademodel = lm(final~exam1+exam2+exam3, data=class)

As before, lm denotes that we're making a linear model, the variable before the ~ is the dependent (response) variable, and the variables after ~ are the independent (explanatory) variables. Again, the summary command will provide more information. Here's the output.
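For readers following along at the console, the summary and its individual pieces can be pulled out as follows. This is a sketch under the assumption that the data are entered exactly as above; model_summary and coef_table are illustrative names, not ones used in the handout.

```r
# Data frame from the example above
students <- c("Asuka", "Rei", "Shinji", "Mari", "Hikari", "Toji", "Kensuke")
exam1 <- c(100, 90, 90, 85, 80, 75, 60)
exam2 <- c(95, 100, 90, 80, 95, 60, 40)
exam3 <- c(95, 90, 88, 75, 85, 60, 50)
final <- c(95, 91, 86, 77, 86, 55, 50)
class <- data.frame(students, exam1, exam2, exam3, final)

grademodel <- lm(final ~ exam1 + exam2 + exam3, data = class)

# Full printed summary (residuals, coefficients, R squared, F statistic)
model_summary <- summary(grademodel)
print(model_summary)

# The coefficient table as a matrix: Estimate, Std. Error, t value, Pr(>|t|)
coef_table <- model_summary$coefficients
print(round(coef_table, 4))

# R squared and adjusted R squared as plain numbers
print(model_summary$r.squared)
print(model_summary$adj.r.squared)
```

Extracting the pieces this way is handy when you want to compare several models side by side instead of reading each printed summary.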
The value of R squared is good (98.65% of the variation in final grades is explained by the predictors in our model), but there are some things that are not so good. The entries in the column Pr(>|t|) are called p values. At this point in the course, we're not going to go into all the details of what a p value is, but for now all you need to know is that the lower the p value, the better that predictor is. The residuals tell you how much our model over- or under-predicts the final grade in the known data. The residual standard error can be thought of as a sort of standard deviation of the residuals or errors (so generally, the lower the standard error, the better).

In the output above, note that exam 3 looks like a good predictor of final grade, and the high p values on exam 1 and exam 2 suggest we might not really need them in our model. So, let's try again, throwing out exams 1 and 2.

Note that on its own, exam 3 looks like a good predictor of final grade. Tossing exam 1 and exam 2 actually reduces the standard error (which is good!), R squared is barely changed, and the adjusted R squared (which takes into account the number of predictors used in the model) actually goes up. So, a pretty good linear model to predict the final grade would be final = (Intercept) + (exam3 coefficient)*exam3, with both values read from the Estimate column of the summary.
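The comparison described above (full model versus exam 3 alone) can be laid out in one small table. The following is a sketch only; the helper fit_stats and the object names full, reduced and comparison are made up for this illustration.

```r
# Data from the example above
exam1 <- c(100, 90, 90, 85, 80, 75, 60)
exam2 <- c(95, 100, 90, 80, 95, 60, 40)
exam3 <- c(95, 90, 88, 75, 85, 60, 50)
final <- c(95, 91, 86, 77, 86, 55, 50)
class <- data.frame(exam1, exam2, exam3, final)

full <- lm(final ~ exam1 + exam2 + exam3, data = class)
reduced <- lm(final ~ exam3, data = class)

# Hypothetical helper: collect the three numbers discussed in the text
fit_stats <- function(model) {
  s <- summary(model)
  c(r.squared = s$r.squared,
    adj.r.squared = s$adj.r.squared,
    resid.std.error = s$sigma)
}

# One row per model, one column per statistic
comparison <- rbind(full = fit_stats(full), reduced = fit_stats(reduced))
print(round(comparison, 4))
```

Plain R squared can never go up when predictors are removed, which is why the adjusted R squared and the residual standard error are the more useful columns to read here.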
Example. At spots.gru.edu/nsmith12/rexamples/georgiacolleges.csv is a file containing data about colleges and universities in the great state of Georgia. This data set contains the name of each institution, the student count, the percentage of the students who are full time (ft.pct), the median SAT score (math+verbal) at each school (med.sat), and the six- and four-year graduation rates (six.yr.grad, four.yr.grad).

Now, suppose we wanted to try to figure out why, say, UGA's graduation rate is high, and Augusta State's isn't so high. Let's create an object called gradmodel to do this. At the command line, we'll enter

> gradmodel = lm(six.yr.grad ~ ft.pct + med.sat, data=ga)

(here ga is the data frame holding the imported file). This model will look for a relationship between the six-year graduation rate, the percentage of full-time students, and the median SAT score.
This says that the best fit is given by:

Six-year graduation rate = (Intercept) + 0.87374*ft.pct + (med.sat coefficient)*med.sat

Let's now say you wanted to see how Augusta State should be doing according to this model.

ASU's predicted graduation rate = (Intercept) + 0.87374(74.5) + (med.sat coefficient)(965), about 31.3 (percent).

ASU only graduated 24.5 percent, a little worse (about 6.78 percentage points lower) than the model predicts. If you don't want to do all this by hand:

> predict(gradmodel, list(ft.pct = 74.5, med.sat = 965))

The syntax is predict(name of the model you want, list(values of the variables in question)).

In the output, a residual tells you how far each data point differs from what is predicted by the regression equation (actual value minus predicted value). If you want to see all the residuals:

> gradmodel$residuals

This tells you that for data point 1 (ASU), the (actual) percentage of students graduating in 6 years is 6.78 percentage points lower than the model predicts. For school 2 (Fort Valley State), the actual graduation rate is about 0.8 percentage points higher than the model predicts, and so on.

Maybe in the future, the undergraduate side of the former ASU tightens up its standards so that the median SAT score is 1100 and 90% of the students are full time. All things being equal, we'd predict a (six-year) graduation rate of

> predict(gradmodel, list(ft.pct = 90, med.sat = 1100))

about 56 percent.
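The "by hand" calculation and the predict command are the same computation, which is easy to confirm. Since the Georgia file is not reproduced here, this sketch reuses the exam data from earlier as a stand-in; the new student's scores (80, 85, 90) are invented purely for illustration.

```r
# Stand-in data: the exam data frame from earlier in the handout
exam1 <- c(100, 90, 90, 85, 80, 75, 60)
exam2 <- c(95, 100, 90, 80, 95, 60, 40)
exam3 <- c(95, 90, 88, 75, 85, 60, 50)
final <- c(95, 91, 86, 77, 86, 55, 50)
class <- data.frame(exam1, exam2, exam3, final)

grademodel <- lm(final ~ exam1 + exam2 + exam3, data = class)

# Hypothetical new student (scores are made up)
new_student <- data.frame(exam1 = 80, exam2 = 85, exam3 = 90)

# By hand: intercept plus each coefficient times its predictor value
b <- coef(grademodel)
by_hand <- b["(Intercept)"] + b["exam1"] * 80 + b["exam2"] * 85 + b["exam3"] * 90

# Same thing via predict()
auto <- predict(grademodel, newdata = new_student)
print(all.equal(unname(by_hand), unname(auto)))

# Residuals are actual minus fitted values
manual_resid <- class$final - unname(fitted(grademodel))
print(all.equal(unname(residuals(grademodel)), manual_resid))
```

A negative residual therefore means the observed value fell below the model's prediction, which is how the ASU figure above should be read.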
Note! Whenever trying to do predictions from a regression line, there is some amount of uncertainty in the result. For instance, in the example above, we were trying to predict ASU's graduation rate, assuming a median SAT score of 965 and a full-time percentage of 74.5%. If you ask R

> predict(gradmodel, list(ft.pct = 74.5, med.sat = 965), interval="predict", level=.95)

the output is

fit lwr upr

interval="predict" means that we want to construct a so-called prediction interval about the point in question. Setting level=.95 means that we are using a confidence level of 95%; more on that later in the course. For now, the output says: at a 95% confidence level, a range of plausible estimates for the 6-year graduation rate at an institution like ASU is 16.2% to 46.4%. Intuitively, this gives upper and lower estimates for the graduation rate based on the model we are using and based on the available data.

Note that increasing the confidence level to 99%,

> predict(gradmodel, list(ft.pct = 74.5, med.sat = 965), interval="predict", level=.99)

fit lwr upr

gives a somewhat wider prediction interval. For now, the idea is that to have greater confidence in a prediction, you need more latitude in the range of predicted values.

Note! As Spock would say, "All things being equal, I'd agree. However, all things are not equal," and one must be careful when messing around in this fashion with a regression model. To see why, try computing predict(gradmodel, list(ft.pct = 100, med.sat = 1600)) (the ideal scenario!) and make sure you see why the result doesn't make any damn sense. (Hint: those values lie far outside the range of the data the model was fitted to.)
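The widening of the interval with the confidence level, and a simple guard against extrapolating, can be sketched as follows. Again the exam data stand in for the Georgia file, and the exam 3 score of 80 is an invented input; the names examfit, new_point, p95, p99 and inside_range are illustrative.

```r
# Stand-in data: exam 3 and final grade from earlier in the handout
exam3 <- c(95, 90, 88, 75, 85, 60, 50)
final <- c(95, 91, 86, 77, 86, 55, 50)
class <- data.frame(exam3, final)

examfit <- lm(final ~ exam3, data = class)
new_point <- data.frame(exam3 = 80)  # hypothetical score

# Prediction intervals at two confidence levels
# (each result is a matrix with columns fit, lwr, upr)
p95 <- predict(examfit, new_point, interval = "prediction", level = 0.95)
p99 <- predict(examfit, new_point, interval = "prediction", level = 0.99)

width95 <- unname(p95[, "upr"] - p95[, "lwr"])
width99 <- unname(p99[, "upr"] - p99[, "lwr"])
print(c(width95 = width95, width99 = width99))

# Guard against extrapolation: is the new value inside the observed range?
inside_range <- new_point$exam3 >= min(class$exam3) &&
  new_point$exam3 <= max(class$exam3)
print(inside_range)
```

The point estimate is the same at both levels; only the bounds move outward. A range check like the last one is a cheap habit before trusting any prediction.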
Dummy variables.

In many applications, there are qualitative factors that you might want to include in a regression.

Example. At a certain university, there is a College of Science that houses the departments of Biology, Chemistry, and Mathematics. Here is some salary data about a sample of professors from that college. Let's import the data at

This data tells you the salaries of faculty members in some (fictional) academic department, based on the faculty member's rank (Assistant, Associate, or Full professor), their gender, and how long in years it has been since the faculty member obtained their Ph.D.

> salary = read.csv("
> head(salary)

[head(salary) output: columns Salary, Rank, Gender, Yrs; the first six rows are Assistant professors (M, M, M, M, F, F).]

Now, the difficulty is how to incorporate the predictors Rank and Gender into a regression model. The trick is to use what are called dummy variables. Gender is a binary variable, so let's add a dummy variable G to the mix. G will equal 1 if the gender is male, and 0 if female (or vice versa!). G simply encodes the gender of each faculty member in a way that can be handled numerically.

Important! Rank is a little trickier. We don't want to do something like let R = 0 if the faculty member is an assistant professor, R = 1 if associate, and R = 2 if full. We'd be assuming that the faculty member gets the same salary bump when they go from level 0 to 1 (assistant to associate) as they do when going from level 1 to 2 (associate to full professor). That seems like a really dicey assumption!

So, what we'll do is incorporate two dummy variables. Let Assoc = 1 if the faculty member holds the rank of Associate (and 0 otherwise), and let Full = 1 if they're a full professor (0 otherwise). Thus, Assistant Professors have Assoc=0 and Full=0. In general, if we want to model a qualitative factor, we need one fewer dummy variable than there are levels of the qualitative variable.

Now, we just need to code the dummy variables and get the data ready to go.
> G = c(1,1,1,1,0,0,1,0,1,1,0,1,0,1,1,0,0,1,1,1,1,1,0,1,1,0,1,0,0,1)
> Assoc = c(0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0)
> Full = c(0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1)
> salary = data.frame(salary, G, Assoc, Full)
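Typing thirty 0s and 1s by hand invites mistakes, so it may be worth deriving the dummy variables from the Rank and Gender columns instead. The salary file itself is not reproduced here, so this sketch rebuilds just those two columns from the vectors above (9 Assistant, 9 Associate and 12 Full professors, with M wherever G is 1); the data frame name demo and the columns G2, Assoc2 and Full2 are illustrative.

```r
# Dummy vectors exactly as typed in the handout
G <- c(1,1,1,1,0,0,1,0,1,1,0,1,0,1,1,0,0,1,1,1,1,1,0,1,1,0,1,0,0,1)
Assoc <- c(0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0)
Full <- c(0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1)

# Reconstructed stand-in for the Rank and Gender columns (an assumption:
# rows are ordered Assistant, Associate, Full, as the vectors imply)
demo <- data.frame(
  Rank = rep(c("Assistant", "Associate", "Full"), times = c(9, 9, 12)),
  Gender = ifelse(G == 1, "M", "F")
)

# Derive the dummies from the qualitative columns
demo$G2 <- as.integer(demo$Gender == "M")
demo$Assoc2 <- as.integer(demo$Rank == "Associate")
demo$Full2 <- as.integer(demo$Rank == "Full")

# They should match the hand-typed versions
print(identical(demo$Assoc2, as.integer(Assoc)))
print(identical(demo$Full2, as.integer(Full)))

# model.matrix() builds the same columns automatically;
# Assistant becomes the baseline level (first alphabetically)
mm <- model.matrix(~ Rank, data = demo)
print(colnames(mm))
```

In practice lm does this conversion itself when a predictor is a factor or character column, so a formula that names Rank directly would generally give the same fit as one using Assoc and Full.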
> salary

[salary output: columns Salary, Rank, Gender, Yrs, G, Assoc, Full; rows 1 to 9 are Assistant professors and the Associate professors begin at row 10.]

Now, let's create a model that takes all of these variables into account. For the moment, let's conveniently ignore that Rank and Yrs strongly correlate (which is not ideal, since strongly correlated predictors make the individual coefficients harder to interpret). It's time to create a linear model.

It looks like Yrs, Assoc, and Full are good predictors of salary. The middling p value on the gender variable suggests that it is the weakest predictor of salary. Our tentative model is thus

salary = (Intercept) + 469*Yrs + (G coefficient)*G + (Assoc coefficient)*Assoc + (Full coefficient)*Full
Now, one could argue that the variables Yrs, Assoc, and Full are measuring similar things (to get promoted to associate or full professor you have to be on the job for a while!), so it might be worth investigating what happens if we drop some of these variables from the model. If you look at the output below, note that dropping Yrs, or dropping Assoc and Full, from the model causes a fairly large dip in R squared and a jump in the standard error, so I'd probably also leave those variables in the model.
Lecture 14: Introduction to Poisson Regression Ani Manichaikul amanicha@jhsph.edu 8 May 2007 1 / 52 Overview Modelling counts Contingency tables Poisson regression models 2 / 52 Modelling counts I Why
More informationModelling counts. Lecture 14: Introduction to Poisson Regression. Overview
Modelling counts I Lecture 14: Introduction to Poisson Regression Ani Manichaikul amanicha@jhsph.edu Why count data? Number of traffic accidents per day Mortality counts in a given neighborhood, per week
More informationMultiple Regression and Regression Model Adequacy
Multiple Regression and Regression Model Adequacy Joseph J. Luczkovich, PhD February 14, 2014 Introduction Regression is a technique to mathematically model the linear association between two or more variables,
More informationMultiple Representations: Equations to Tables and Graphs Transcript
Algebra l Teacher: It s good to see you again. Last time we talked about multiple representations. If we could, I would like to continue and discuss the subtle differences of multiple representations between
More informationHypothesis testing, part 2. With some material from Howard Seltman, Blase Ur, Bilge Mutlu, Vibha Sazawal
Hypothesis testing, part 2 With some material from Howard Seltman, Blase Ur, Bilge Mutlu, Vibha Sazawal 1 CATEGORICAL IV, NUMERIC DV 2 Independent samples, one IV # Conditions Normal/Parametric Non-parametric
More informationHOMEWORK (due Wed, Jan 23): Chapter 3: #42, 48, 74
ANNOUNCEMENTS: Grades available on eee for Week 1 clickers, Quiz and Discussion. If your clicker grade is missing, check next week before contacting me. If any other grades are missing let me know now.
More informationLecture 4: Constructing the Integers, Rationals and Reals
Math/CS 20: Intro. to Math Professor: Padraic Bartlett Lecture 4: Constructing the Integers, Rationals and Reals Week 5 UCSB 204 The Integers Normally, using the natural numbers, you can easily define
More informationLinear regression and correlation
Faculty of Health Sciences Linear regression and correlation Statistics for experimental medical researchers 2018 Julie Forman, Christian Pipper & Claus Ekstrøm Department of Biostatistics, University
More informationR 2 and F -Tests and ANOVA
R 2 and F -Tests and ANOVA December 6, 2018 1 Partition of Sums of Squares The distance from any point y i in a collection of data, to the mean of the data ȳ, is the deviation, written as y i ȳ. Definition.
More informationRegression 1: Linear Regression
Regression 1: Linear Regression Marco Baroni Practical Statistics in R Outline Classic linear regression Linear regression in R Outline Classic linear regression Introduction Constructing the model Estimation
More informationHomework 1 Solutions
Homework 1 Solutions January 18, 2012 Contents 1 Normal Probability Calculations 2 2 Stereo System (SLR) 2 3 Match Histograms 3 4 Match Scatter Plots 4 5 Housing (SLR) 4 6 Shock Absorber (SLR) 5 7 Participation
More informationECONOMETRIC MODEL WITH QUALITATIVE VARIABLES
ECONOMETRIC MODEL WITH QUALITATIVE VARIABLES How to quantify qualitative variables to quantitative variables? Why do we need to do this? Econometric model needs quantitative variables to estimate its parameters
More informationChapter 27 Summary Inferences for Regression
Chapter 7 Summary Inferences for Regression What have we learned? We have now applied inference to regression models. Like in all inference situations, there are conditions that we must check. We can test
More informationSwarthmore Honors Exam 2015: Statistics
Swarthmore Honors Exam 2015: Statistics 1 Swarthmore Honors Exam 2015: Statistics John W. Emerson, Yale University NAME: Instructions: This is a closed-book three-hour exam having 7 questions. You may
More informationCHAPTER 4 & 5 Linear Regression with One Regressor. Kazu Matsuda IBEC PHBU 430 Econometrics
CHAPTER 4 & 5 Linear Regression with One Regressor Kazu Matsuda IBEC PHBU 430 Econometrics Introduction Simple linear regression model = Linear model with one independent variable. y = dependent variable
More informationCS 301. Lecture 18 Decidable languages. Stephen Checkoway. April 2, 2018
CS 301 Lecture 18 Decidable languages Stephen Checkoway April 2, 2018 1 / 26 Decidable language Recall, a language A is decidable if there is some TM M that 1 recognizes A (i.e., L(M) = A), and 2 halts
More informationOne-sample categorical data: approximate inference
One-sample categorical data: approximate inference Patrick Breheny October 6 Patrick Breheny Biostatistical Methods I (BIOS 5710) 1/25 Introduction It is relatively easy to think about the distribution
More informationChs. 15 & 16: Correlation & Regression
Chs. 15 & 16: Correlation & Regression With the shift to correlational analyses, we change the very nature of the question we are asking of our data. Heretofore, we were asking if a difference was likely
More information