Regression Estimation: Least Squares and Maximum Likelihood
1 Regression Estimation: Least Squares and Maximum Likelihood. Dr. Frank Wood (fwood@stat.columbia.edu). Linear Regression Models, Lecture 3.
2 Least Squares Max(min)imization. The function to minimize with respect to $\beta_0, \beta_1$ is
$Q = \sum_{i=1}^{n} (Y_i - (\beta_0 + \beta_1 X_i))^2$
Minimize $Q$ by finding the partial derivatives with respect to $\beta_0$ and $\beta_1$ and setting both equal to zero.
3 Normal Equations. The result of this minimization step is called the normal equations; $b_0$ and $b_1$ are called point estimators of $\beta_0$ and $\beta_1$, respectively:
$\sum Y_i = n b_0 + b_1 \sum X_i$
$\sum X_i Y_i = b_0 \sum X_i + b_1 \sum X_i^2$
This is a system of two equations in two unknowns.
4 Solution to Normal Equations. After a fair amount of algebra one arrives at
$b_1 = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum (X_i - \bar{X})^2}, \qquad b_0 = \bar{Y} - b_1 \bar{X}$
where $\bar{X} = \frac{\sum X_i}{n}$ and $\bar{Y} = \frac{\sum Y_i}{n}$.
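The closed-form solution above can be sketched in a few lines of NumPy; the data values here are made up purely for illustration:

```python
import numpy as np

# Toy data (assumed for illustration, not from the slides)
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([11.2, 12.9, 15.1, 16.8, 19.2])

Xbar, Ybar = X.mean(), Y.mean()

# b1 = sum((Xi - Xbar)(Yi - Ybar)) / sum((Xi - Xbar)^2)
b1 = np.sum((X - Xbar) * (Y - Ybar)) / np.sum((X - Xbar) ** 2)
# b0 = Ybar - b1 * Xbar
b0 = Ybar - b1 * Xbar
```

On this toy data the estimates come out close to a slope of 2 and an intercept of 9.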
5 Least Squares Fit? [Figure: scatter of predictor/input vs. response/output with fitted line. Estimate: $y = 2.09x + \ldots$ (intercept not recovered), MSE 4.15. True: $y = 2x + 9$, MSE 4.22.]
6 Guess #… [Figure: same data. Guess: $y = 0x + \ldots$ (intercept not recovered), MSE 37.1. True: $y = 2x + 9$, MSE 4.22.]
7 Guess #… [Figure: same data. Guess: $y = 1.5x + 13$, MSE 7.84. True: $y = 2x + 9$, MSE 4.22.]
8 Looking Ahead: Matrix Least Squares.
$\begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix} = \begin{bmatrix} X_1 & 1 \\ X_2 & 1 \\ \vdots & \vdots \\ X_n & 1 \end{bmatrix} \begin{bmatrix} \beta_1 \\ \beta_0 \end{bmatrix}$
The least-squares solution to this equation is the solution to least squares linear regression (and to maximum likelihood under the normal error distribution assumption).
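A minimal sketch of the matrix formulation, solved with NumPy's least-squares routine; the data values are made up for illustration:

```python
import numpy as np

# Toy data (assumed for illustration)
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([11.2, 12.9, 15.1, 16.8, 19.2])

# Design matrix with columns [X_i, 1], matching the [beta1; beta0] ordering
A = np.column_stack([X, np.ones_like(X)])
beta, *_ = np.linalg.lstsq(A, Y, rcond=None)
b1, b0 = beta
```

The solver returns the same slope and intercept as the closed-form normal-equations solution on the same data.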
9 Questions to Ask. Is the relationship really linear? What is the distribution of the errors? Is the fit good? How much of the variability of the response is accounted for by including the predictor variable? Is the chosen predictor variable the best one?
10 Is This Better? [Figure: a 7th-order polynomial fit to the same predictor/input vs. response/output data; MSE value not recovered.]
11 Goals for First Half of Course. How to do linear regression; self-familiarization with software tools; how to interpret standard linear regression results; how to derive tests; how to assess and address deficiencies in regression models.
12 Properties of Solution. The $i^{th}$ residual is defined to be $e_i = Y_i - \hat{Y}_i$. The sum of the residuals is zero:
$\sum_i e_i = \sum_i (Y_i - b_0 - b_1 X_i) = \sum_i Y_i - n b_0 - b_1 \sum_i X_i = 0$
by the first normal equation.
13 Properties of Solution. The sum of the observed values $Y_i$ equals the sum of the fitted values $\hat{Y}_i$:
$\sum_i \hat{Y}_i = \sum_i (b_1 X_i + b_0) = \sum_i (b_1 X_i + \bar{Y} - b_1 \bar{X}) = b_1 \sum_i X_i + n\bar{Y} - b_1 n \bar{X} = n\bar{Y} = \sum_i Y_i$
since $\sum_i X_i = n\bar{X}$.
14 Properties of Solution. The sum of the weighted residuals is zero when the residual in the $i^{th}$ trial is weighted by the level of the predictor variable in the $i^{th}$ trial:
$\sum_i X_i e_i = \sum_i X_i (Y_i - b_0 - b_1 X_i) = \sum_i X_i Y_i - b_0 \sum_i X_i - b_1 \sum_i X_i^2 = 0$
by the second normal equation.
15 Properties of Solution. The sum of the weighted residuals is zero when the residual in the $i^{th}$ trial is weighted by the fitted value of the response variable for the $i^{th}$ trial:
$\sum_i \hat{Y}_i e_i = \sum_i (b_0 + b_1 X_i) e_i = b_0 \sum_i e_i + b_1 \sum_i e_i X_i = 0$
by the previous two properties.
16 Properties of Solution. The regression line always goes through the point $(\bar{X}, \bar{Y})$.
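The properties on the preceding slides can be verified numerically. This sketch reuses the slides' true line $y = 2x + 9$ but assumes its own noise level and sample size:

```python
import numpy as np

# Synthetic data from the slides' true line y = 2x + 9
# (noise level and sample size are assumed for illustration)
rng = np.random.default_rng(0)
X = np.linspace(0.0, 10.0, 50)
Y = 9.0 + 2.0 * X + rng.normal(0.0, 2.0, size=X.shape)

# Least squares fit via the normal-equations solution
b1 = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)
b0 = Y.mean() - b1 * X.mean()
Yhat = b0 + b1 * X
e = Y - Yhat

sum_e = np.sum(e)                      # property: zero
sum_Xe = np.sum(X * e)                 # property: zero
sum_Yhat_e = np.sum(Yhat * e)          # property: zero
fitted_gap = np.sum(Y) - np.sum(Yhat)  # property: zero
line_at_mean = b0 + b1 * X.mean()      # property: equals Ybar
```

Each of these quantities comes out at zero up to floating-point roundoff.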
17 Estimating Error Term Variance $\sigma^2$. First review estimation in the non-regression setting, then show the estimation results for the regression setting.
18 Estimation Review. An estimator is a rule that tells how to calculate the value of an estimate based on the measurements contained in a sample, e.g. the sample mean
$\bar{Y} = \frac{1}{n}\sum_{i=1}^{n} Y_i$
19 Point Estimators and Bias. A point estimator $\hat{\theta} = f(\{Y_1, \ldots, Y_n\})$ estimates an unknown quantity / parameter $\theta$. Definition: the bias of estimator $\hat{\theta}$ is $B(\hat{\theta}) = E(\hat{\theta}) - \theta$.
20 One Sample Example. [Figure: samples drawn with $\mu = 5$, $\sigma = 0.75$, showing $\theta$ and the estimate $\hat{\theta}$; generated by bias_example_plot.m.]
21 Distribution of Estimator. If the estimator is a function of the samples and the distribution of the samples is known, then the distribution of the estimator can (often) be determined. Methods: distribution (CDF) functions; transformations; moment generating functions; Jacobians (change of variable).
22 Example. Samples from a Normal$(\mu, \sigma^2)$ distribution: $Y_i \sim \mathrm{Normal}(\mu, \sigma^2)$. Estimate the population mean: $\theta = \mu$, $\hat{\theta} = \bar{Y} = \frac{1}{n}\sum_{i=1}^{n} Y_i$.
23 Sampling Distribution of the Estimator. First moment:
$E(\hat{\theta}) = E\left(\frac{1}{n}\sum_{i=1}^{n} Y_i\right) = \frac{1}{n}\sum_{i=1}^{n} E(Y_i) = \frac{n\mu}{n} = \theta$
This is an example of an unbiased estimator: $B(\hat{\theta}) = E(\hat{\theta}) - \theta = 0$.
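A quick Monte Carlo sketch of unbiasedness, borrowing the $\mu = 5$, $\sigma = 0.75$ setting from the one-sample example; the sample size and repetition count are assumed:

```python
import numpy as np

# mu, sigma from the slides' one-sample example; n and reps are assumed
rng = np.random.default_rng(1)
mu, sigma, n, reps = 5.0, 0.75, 20, 20000

# One sample mean per repetition: shape (reps,)
theta_hat = rng.normal(mu, sigma, size=(reps, n)).mean(axis=1)

# Empirical bias: average estimate minus true parameter (should be ~0)
bias_est = theta_hat.mean() - mu
```

The empirical bias shrinks toward zero as the number of repetitions grows, as the unbiasedness argument predicts.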
24 Variance of Estimator. Definition: the variance of estimator $\hat{\theta}$ is $V(\hat{\theta}) = E([\hat{\theta} - E(\hat{\theta})]^2)$. Remember: $V(cY) = c^2 V(Y)$, and $V(\sum_{i=1}^{n} Y_i) = \sum_{i=1}^{n} V(Y_i)$ only if the $Y_i$ are independent with finite variance.
25 Example Estimator Variance. For the normal mean estimator,
$V(\hat{\theta}) = V\left(\frac{1}{n}\sum_{i=1}^{n} Y_i\right) = \frac{1}{n^2}\sum_{i=1}^{n} V(Y_i) = \frac{n\sigma^2}{n^2} = \frac{\sigma^2}{n}$
Note the assumptions (independence, finite variance).
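A simulation sketch of the $\sigma^2/n$ result; all parameter values here are assumed for illustration:

```python
import numpy as np

# Assumed toy parameters
rng = np.random.default_rng(2)
sigma, n, reps = 2.0, 25, 50000

# Sample mean of n iid N(0, sigma^2) draws, repeated reps times
theta_hat = rng.normal(0.0, sigma, size=(reps, n)).mean(axis=1)

empirical_var = theta_hat.var()
theoretical_var = sigma ** 2 / n  # = 0.16 with these numbers
```

The empirical variance of the sample-mean estimator lands very close to $\sigma^2/n$.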
26 Distribution of Sample Mean Estimator. [Figure: histogram of the sample mean estimator over repeated samples.]
27 Bias Variance Trade-off. The mean squared error of an estimator is $MSE(\hat{\theta}) = E([\hat{\theta} - \theta]^2)$, which can be re-expressed as $MSE(\hat{\theta}) = V(\hat{\theta}) + (B(\hat{\theta}))^2$.
28 MSE = VAR + BIAS² Proof.
$MSE(\hat{\theta}) = E((\hat{\theta} - \theta)^2)$
$= E(([\hat{\theta} - E(\hat{\theta})] + [E(\hat{\theta}) - \theta])^2)$
$= E([\hat{\theta} - E(\hat{\theta})]^2) + 2E([E(\hat{\theta}) - \theta][\hat{\theta} - E(\hat{\theta})]) + E([E(\hat{\theta}) - \theta]^2)$
$= V(\hat{\theta}) + 2(E(\hat{\theta}) - \theta)\,E([\hat{\theta} - E(\hat{\theta})]) + (B(\hat{\theta}))^2$
$= V(\hat{\theta}) + 2(E(\hat{\theta}) - \theta)\cdot 0 + (B(\hat{\theta}))^2$
$= V(\hat{\theta}) + (B(\hat{\theta}))^2$
since $E(\hat{\theta}) - \theta$ is a constant and $E([\hat{\theta} - E(\hat{\theta})]) = 0$.
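The decomposition can be checked numerically with a deliberately biased estimator, here a hypothetical shrunken sample mean $c\bar{Y}$; all numbers are assumed for illustration:

```python
import numpy as np

# Assumed toy setup; c < 1 makes the estimator biased toward zero
rng = np.random.default_rng(3)
mu, sigma, n, reps = 5.0, 2.0, 10, 100000
c = 0.8

# Shrunken sample mean c * Ybar, computed reps times
theta_hat = c * rng.normal(mu, sigma, size=(reps, n)).mean(axis=1)

# Empirical MSE vs. empirical Var + Bias^2 (identical up to roundoff)
mse = np.mean((theta_hat - mu) ** 2)
decomposition = theta_hat.var() + (theta_hat.mean() - mu) ** 2
```

With the population-style variance (`np.var` with its default divisor), the two quantities agree exactly as an algebraic identity, not just in expectation.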
29 Trade-off. Think of variance as confidence and bias as correctness; the intuitions (largely) apply. Sometimes a biased estimator can produce lower MSE if it lowers the variance.
30 Estimating Error Term Variance $\sigma^2$. In the regression model, the variance of each observation $Y_i$ is $\sigma^2$ (the same as for the error term $\epsilon_i$), but each $Y_i$ comes from a different probability distribution, with a mean that depends on the level $X_i$. The deviation of an observation $Y_i$ must therefore be calculated around its own estimated mean.
31 $s^2$ Estimator for $\sigma^2$.
$s^2 = MSE = \frac{SSE}{n-2} = \frac{\sum (Y_i - \hat{Y}_i)^2}{n-2} = \frac{\sum e_i^2}{n-2}$
MSE is an unbiased estimator of $\sigma^2$: $E(MSE) = \sigma^2$. The sum of squares SSE has $n-2$ degrees of freedom associated with it.
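A simulation sketch suggesting that $s^2 = SSE/(n-2)$ is unbiased for $\sigma^2$; the true parameters below are assumed for illustration (the line $y = 2x + 9$ echoes the slides' running example):

```python
import numpy as np

# Assumed true parameters; sigma^2 = 4 is the target
rng = np.random.default_rng(4)
beta0, beta1, sigma, n, reps = 9.0, 2.0, 2.0, 30, 5000
X = np.linspace(0.0, 10.0, n)

s2 = np.empty(reps)
for r in range(reps):
    Y = beta0 + beta1 * X + rng.normal(0.0, sigma, n)
    b1 = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)
    b0 = Y.mean() - b1 * X.mean()
    e = Y - (b0 + b1 * X)
    s2[r] = np.sum(e ** 2) / (n - 2)  # SSE / (n - 2)

mean_s2 = s2.mean()  # should be close to sigma^2 = 4
```

Averaged over many simulated datasets, $s^2$ centers on the true $\sigma^2$.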
32 Normal Error Regression Model. No matter how the error terms $\epsilon_i$ are distributed, the least squares method provides unbiased point estimators of $\beta_0$ and $\beta_1$ that also have minimum variance among all unbiased linear estimators. To set up interval estimates and make tests, we need to specify the distribution of the $\epsilon_i$. We will assume that the $\epsilon_i$ are normally distributed.
33 Normal Error Regression Model. $Y_i = \beta_0 + \beta_1 X_i + \epsilon_i$, where $Y_i$ is the value of the response variable in the $i^{th}$ trial, $\beta_0$ and $\beta_1$ are parameters, $X_i$ is a known constant (the value of the predictor variable in the $i^{th}$ trial), and $\epsilon_i \sim \text{iid } N(0, \sigma^2)$, $i = 1, \ldots, n$.
34 Notational Convention. When you see $\epsilon_i \sim \text{iid } N(0, \sigma^2)$, it is read as "$\epsilon_i$ is distributed independently and identically according to a normal distribution with mean 0 and variance $\sigma^2$." Examples: $\theta \sim \mathrm{Poisson}(\lambda)$, $z \sim G(\theta)$.
35 Maximum Likelihood Principle. The method of maximum likelihood chooses as estimates those values of the parameters that are most consistent with the sample data.
36 Likelihood Function. If $X_i \sim F(\Theta)$, $i = 1 \ldots n$, then the likelihood function is
$L(\{X_i\}_{i=1}^{n}, \Theta) = \prod_{i=1}^{n} F(X_i; \Theta)$
37 Example, N(10,3) Density, Single Obs. [Figure: samples against the N(10, 3) density, N = 10; negative log likelihood value not recovered.]
38 Example, N(10,3) Density, Single Obs. Again. [Figure: as above with a different observation; negative log likelihood value not recovered.]
39 Example, N(10,3) Density, Multiple Obs. [Figure: multiple observations against the N(10, 3) density; negative log likelihood value not recovered.]
40 Maximum Likelihood Estimation. The likelihood function can be maximized with respect to the parameter(s) $\Theta$; doing so, one can arrive at estimators for the parameters:
$L(\{X_i\}_{i=1}^{n}, \Theta) = \prod_{i=1}^{n} F(X_i; \Theta)$
To do this, find solutions (analytically or by following the gradient) to
$\frac{dL(\{X_i\}_{i=1}^{n}, \Theta)}{d\Theta} = 0$
41 Important Trick. (Almost) never maximize the likelihood function directly; maximize the log likelihood function instead:
$\log(L(\{X_i\}_{i=1}^{n}, \Theta)) = \log\left(\prod_{i=1}^{n} F(X_i; \Theta)\right) = \sum_{i=1}^{n} \log(F(X_i; \Theta))$
Quite often the log of the density is easier to work with mathematically.
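Beyond mathematical convenience, there is a numerical reason for the trick: with many observations the raw likelihood underflows in floating point, while the log likelihood stays well scaled. A sketch with assumed standard-normal data:

```python
import numpy as np

# Assumed toy data: 2000 standard normal observations
rng = np.random.default_rng(5)
x = rng.normal(0.0, 1.0, size=2000)

# Per-observation N(0,1) density values, each well below 1
dens = np.exp(-0.5 * x ** 2) / np.sqrt(2.0 * np.pi)

likelihood = np.prod(dens)             # underflows to exactly 0.0
log_likelihood = np.sum(np.log(dens))  # a finite, usable number
```

The product of 2000 values each below 0.4 is far smaller than the smallest representable double, so the likelihood is literally zero in floating point while its log remains perfectly workable.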
42 ML Normal Regression. Likelihood function:
$L(\beta_0, \beta_1, \sigma^2) = \prod_{i=1}^{n} \frac{1}{(2\pi\sigma^2)^{1/2}} e^{-\frac{1}{2\sigma^2}(Y_i - \beta_0 - \beta_1 X_i)^2} = \frac{1}{(2\pi\sigma^2)^{n/2}} e^{-\frac{1}{2\sigma^2}\sum_{i=1}^{n}(Y_i - \beta_0 - \beta_1 X_i)^2}$
which, if you maximize (how?) with respect to the parameters, yields:
43 Maximum Likelihood Estimator(s). For $\beta_0$: $b_0$, same as in the least squares case. For $\beta_1$: $b_1$, same as in the least squares case. For $\sigma^2$: $\hat{\sigma}^2 = \frac{\sum_i (Y_i - \hat{Y}_i)^2}{n}$. Note that the ML estimator of $\sigma^2$ is biased, since $s^2$ is unbiased and $s^2 = MSE = \frac{n}{n-2}\hat{\sigma}^2$.
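The bias can be seen by simulation: $\hat{\sigma}^2 = SSE/n$ averages below $\sigma^2$ while $s^2 = SSE/(n-2)$ does not. Toy parameters are assumed, with a small $n$ so the factor $(n-2)/n$ is far from 1:

```python
import numpy as np

# Assumed true parameters; n = 10 makes the bias factor (n-2)/n = 0.8
rng = np.random.default_rng(6)
beta0, beta1, sigma, n, reps = 9.0, 2.0, 2.0, 10, 20000
X = np.linspace(0.0, 10.0, n)

sse = np.empty(reps)
for r in range(reps):
    Y = beta0 + beta1 * X + rng.normal(0.0, sigma, n)
    b1 = np.sum((X - X.mean()) * (Y - Y.mean())) / np.sum((X - X.mean()) ** 2)
    b0 = Y.mean() - b1 * X.mean()
    sse[r] = np.sum((Y - (b0 + b1 * X)) ** 2)

ml_mean = np.mean(sse / n)        # averages near sigma^2 * (n-2)/n = 3.2
s2_mean = np.mean(sse / (n - 2))  # averages near sigma^2 = 4
```

The ML estimator concentrates around $\sigma^2 (n-2)/n$, confirming both the bias and its exact size.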
44 Comments. Least squares minimizes the squared error between the prediction and the true output. The normal distribution is fully characterized by its first two central moments (mean and variance). Food for thought: what does the bias in the ML estimator of the error variance mean, and where does it come from?
IEOR 165 Lecture 7 Bias-Variance Tradeoff 1 Bias-Variance Tradeoff Consider the case of parametric regression with β R, and suppose we would like to analyze the error of the estimate ˆβ in comparison to
More informationWooldridge, Introductory Econometrics, 4th ed. Appendix C: Fundamentals of mathematical statistics
Wooldridge, Introductory Econometrics, 4th ed. Appendix C: Fundamentals of mathematical statistics A short review of the principles of mathematical statistics (or, what you should have learned in EC 151).
More informationSTAT 730 Chapter 4: Estimation
STAT 730 Chapter 4: Estimation Timothy Hanson Department of Statistics, University of South Carolina Stat 730: Multivariate Analysis 1 / 23 The likelihood We have iid data, at least initially. Each datum
More informationSTAT420 Midterm Exam. University of Illinois Urbana-Champaign October 19 (Friday), :00 4:15p. SOLUTIONS (Yellow)
STAT40 Midterm Exam University of Illinois Urbana-Champaign October 19 (Friday), 018 3:00 4:15p SOLUTIONS (Yellow) Question 1 (15 points) (10 points) 3 (50 points) extra ( points) Total (77 points) Points
More informationp y (1 p) 1 y, y = 0, 1 p Y (y p) = 0, otherwise.
1. Suppose Y 1, Y 2,..., Y n is an iid sample from a Bernoulli(p) population distribution, where 0 < p < 1 is unknown. The population pmf is p y (1 p) 1 y, y = 0, 1 p Y (y p) = (a) Prove that Y is the
More informationPractice Problems Section Problems
Practice Problems Section 4-4-3 4-4 4-5 4-6 4-7 4-8 4-10 Supplemental Problems 4-1 to 4-9 4-13, 14, 15, 17, 19, 0 4-3, 34, 36, 38 4-47, 49, 5, 54, 55 4-59, 60, 63 4-66, 68, 69, 70, 74 4-79, 81, 84 4-85,
More informationLinear Methods for Prediction
Chapter 5 Linear Methods for Prediction 5.1 Introduction We now revisit the classification problem and focus on linear methods. Since our prediction Ĝ(x) will always take values in the discrete set G we
More informationBrief Review on Estimation Theory
Brief Review on Estimation Theory K. Abed-Meraim ENST PARIS, Signal and Image Processing Dept. abed@tsi.enst.fr This presentation is essentially based on the course BASTA by E. Moulines Brief review on
More informationStatistics: Learning models from data
DS-GA 1002 Lecture notes 5 October 19, 2015 Statistics: Learning models from data Learning models from data that are assumed to be generated probabilistically from a certain unknown distribution is a crucial
More informationIntroduction to Estimation Methods for Time Series models. Lecture 1
Introduction to Estimation Methods for Time Series models Lecture 1 Fulvio Corsi SNS Pisa Fulvio Corsi Introduction to Estimation () Methods for Time Series models Lecture 1 SNS Pisa 1 / 19 Estimation
More informationMethods of evaluating estimators and best unbiased estimators Hamid R. Rabiee
Stochastic Processes Methods of evaluating estimators and best unbiased estimators Hamid R. Rabiee 1 Outline Methods of Mean Squared Error Bias and Unbiasedness Best Unbiased Estimators CR-Bound for variance
More informationLecture 3: Inference in SLR
Lecture 3: Inference in SLR STAT 51 Spring 011 Background Reading KNNL:.1.6 3-1 Topic Overview This topic will cover: Review of hypothesis testing Inference about 1 Inference about 0 Confidence Intervals
More informationLinear models and their mathematical foundations: Simple linear regression
Linear models and their mathematical foundations: Simple linear regression Steffen Unkel Department of Medical Statistics University Medical Center Göttingen, Germany Winter term 2018/19 1/21 Introduction
More informationELEG 5633 Detection and Estimation Minimum Variance Unbiased Estimators (MVUE)
1 ELEG 5633 Detection and Estimation Minimum Variance Unbiased Estimators (MVUE) Jingxian Wu Department of Electrical Engineering University of Arkansas Outline Minimum Variance Unbiased Estimators (MVUE)
More informationLectures on Simple Linear Regression Stat 431, Summer 2012
Lectures on Simple Linear Regression Stat 43, Summer 0 Hyunseung Kang July 6-8, 0 Last Updated: July 8, 0 :59PM Introduction Previously, we have been investigating various properties of the population
More informationMaximum Likelihood, Logistic Regression, and Stochastic Gradient Training
Maximum Likelihood, Logistic Regression, and Stochastic Gradient Training Charles Elkan elkan@cs.ucsd.edu January 17, 2013 1 Principle of maximum likelihood Consider a family of probability distributions
More informationNonparametric Regression and Bonferroni joint confidence intervals. Yang Feng
Nonparametric Regression and Bonferroni joint confidence intervals Yang Feng Simultaneous Inferences In chapter 2, we know how to construct confidence interval for β 0 and β 1. If we want a confidence
More informationBTRY 4090: Spring 2009 Theory of Statistics
BTRY 4090: Spring 2009 Theory of Statistics Guozhang Wang September 25, 2010 1 Review of Probability We begin with a real example of using probability to solve computationally intensive (or infeasible)
More information