Advanced Statistics I : Gaussian Linear Model (and beyond)
1 Advanced Statistics I: Gaussian Linear Model (and beyond). Aurélien Garivier, CNRS / Telecom ParisTech, Centrale.
2 Outline: One- and Two-Sample Statistics; Linear Gaussian Model; Model Reduction and Model Selection; Exercises.
3 Discrete and continuous distributions
Discrete distribution: $P = \sum_{i=1}^n p_i \delta_{x_i}$. Ex: Binomial, Poisson distributions.
Continuous distribution: $Q(dx) = f(x)\,dx$. Ex: Exponential distribution.
A distribution can be neither purely discrete nor purely continuous! Ex: $Z = \min\{X, 1\}$, where $X \sim \mathcal{E}(\lambda)$.
4 Descriptive properties
Expectation: $\mu = E[X]$.
Variance: $\sigma^2 = E[(X-\mu)^2] = E[X^2] - \mu^2$.
Skewness: $\gamma = E[(X-\mu)^3]/\sigma^3$.
Kurtosis: $\kappa = E[(X-\mu)^4]/\sigma^4 - 3$.
5 Some remarkable distributions
Normal: scale- and shift-stable family.
Chi-square: if $X_1,\dots,X_n$ is a $N(0,1)$-sample, then $Z = X_1^2 + \dots + X_n^2 \sim \chi^2(n)$.
Student: if $X \sim N(0,1)$ is independent of $Z \sim \chi^2(n)$, then $T = X/\sqrt{Z/n} \sim \mathcal{T}(n)$.
Fisher: if $X \sim \chi^2(n)$ is independent of $Y \sim \chi^2(m)$, then $F = (X/n)/(Y/m) \sim F(n,m)$.
6 Empirical distribution and statistics
Let $X_1,\dots,X_n$ be a $P$-sample.
Empirical mean: $\bar{X}_n = \frac{1}{n}\sum_{i=1}^n X_i$.
Empirical variance: $\Sigma_n^2 = \frac{1}{n}\sum_{i=1}^n (X_i - \bar{X}_n)^2 = \frac{1}{n}\sum_{i=1}^n X_i^2 - \bar{X}_n^2$.
Empirical distribution: $P_n = \frac{1}{n}\sum_{i=1}^n \delta_{X_i}$.
Unbiased variance estimator: $\hat{\sigma}_n^2 = \frac{1}{n-1}\sum_{i=1}^n (X_i - \bar{X}_n)^2 = \frac{1}{n-1}\left(\sum_{i=1}^n X_i^2 - n\bar{X}_n^2\right)$.
If $P = N(0,\sigma^2)$, using Cochran's Theorem we get $\bar{X}_n \sim N(0,\sigma^2/n)$ and $\sum_{i=1}^n (X_i - \bar{X}_n)^2 \sim \sigma^2 \chi^2(n-1)$.
7 Convergence properties
Theorem (LLN): if $E[|X_i|] < \infty$, then $\bar{X}_n \to \mu$ (in probability, almost surely).
Application to $\Sigma_n^2$, $\hat{\sigma}_n^2$, etc.
Convergence of the empirical distribution $P_n \to P$ under appropriate hypotheses.
8 Central Limit Theorem
Theorem (CLT): if $E[X_i^2] < \infty$, then $\frac{\sqrt{n}(\bar{X}_n - \mu)}{\sigma} \Rightarrow N(0,1)$.
By Slutsky's Lemma, $\frac{\sqrt{n}(\bar{X}_n - \mu)}{\hat{\sigma}_n} \Rightarrow N(0,1)$.
Student statistic: if $X_i \sim N(\mu,\sigma^2)$, then $\frac{\sqrt{n}(\bar{X}_n - \mu)}{\hat{\sigma}_n} \sim \mathcal{T}(n-1)$.
9 Confidence interval for the mean
If $\sigma$ is known: $I_\alpha(\mu) = \left[\bar{X}_n \pm \frac{\sigma\,\phi_{1-\alpha/2}}{\sqrt{n}}\right]$.
If $\sigma$ is unknown: $I_\alpha(\mu) = \left[\bar{X}_n \pm \frac{\hat{\sigma}_n\,t^{n-1}_{1-\alpha/2}}{\sqrt{n}}\right]$.
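As an illustration (not from the slides), here is a minimal R sketch of the Student confidence interval for the mean; the simulated sample and the level $1-\alpha = 0.95$ are assumptions made for the example.

```r
set.seed(1)
x <- rnorm(30, mean = 5, sd = 2)    # simulated N(5, 4) sample (assumption)
n <- length(x); alpha <- 0.05
xbar <- mean(x)
s <- sd(x)                          # unbiased standard deviation (divides by n - 1)
# Student interval: xbar +/- t_{1-alpha/2}^{n-1} * s / sqrt(n)
xbar + c(-1, 1) * qt(1 - alpha/2, df = n - 1) * s / sqrt(n)
```

The same interval is returned by t.test(x)$conf.int.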
10 Confidence interval for the variance
If $\mu$ is known, as $\sum_{i=1}^n \left(\frac{X_i-\mu}{\sigma}\right)^2 \sim \chi^2(n)$:
$I_\alpha(\sigma^2) = \left[\frac{\sum_{i=1}^n (X_i-\mu)^2}{\chi^{n}_{1-\alpha/2}}, \frac{\sum_{i=1}^n (X_i-\mu)^2}{\chi^{n}_{\alpha/2}}\right]$.
If $\mu$ is unknown, as $\sum_{i=1}^n \left(\frac{X_i-\bar{X}_n}{\sigma}\right)^2 \sim \chi^2(n-1)$:
$I_\alpha(\sigma^2) = \left[\frac{\hat{\sigma}_n^2}{\chi^{n-1}_{1-\alpha/2}/(n-1)}, \frac{\hat{\sigma}_n^2}{\chi^{n-1}_{\alpha/2}/(n-1)}\right]$.
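A matching sketch (same simulated-data assumptions) for the chi-square interval for $\sigma^2$ when $\mu$ is unknown:

```r
set.seed(2)
x <- rnorm(30, mean = 5, sd = 2)    # simulated sample (assumption)
n <- length(x); alpha <- 0.05
s2 <- var(x)                        # unbiased variance estimator (divides by n - 1)
# Interval: [ (n-1) s2 / chi^2_{1-alpha/2}(n-1) , (n-1) s2 / chi^2_{alpha/2}(n-1) ]
(n - 1) * s2 / qchisq(c(1 - alpha/2, alpha/2), df = n - 1)
```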
11 Comparison of two variances
Let $X_{1,1},\dots,X_{1,n_1}$ be a sample of $N(\mu_1,\sigma_1^2)$ and $X_{2,1},\dots,X_{2,n_2}$ an independent sample of $N(\mu_2,\sigma_2^2)$.
In order to test $H_0: \sigma_1 = \sigma_2$ versus $H_1: \sigma_1 \neq \sigma_2$, use the statistic $F = \hat{\sigma}_1^2/\hat{\sigma}_2^2 \overset{H_0}{\sim} F(n_1-1, n_2-1)$.
12 Comparison of two means
Theorem: let $X_{1,1},\dots,X_{1,n_1}$ be a sample of $N(\mu_1,\sigma^2)$ and $X_{2,1},\dots,X_{2,n_2}$ an independent sample of $N(\mu_2,\sigma^2)$.
To estimate the common variance, use
$\hat{\sigma}_{12}^2 = \frac{1}{n_1+n_2-2}\left(\sum_{i=1}^{n_1}(X_{1,i}-\bar{X}_1)^2 + \sum_{i=1}^{n_2}(X_{2,i}-\bar{X}_2)^2\right) \sim \frac{\sigma^2}{n_1+n_2-2}\,\chi^2(n_1+n_2-2)$.
In order to test $H_0: \mu_1 = \mu_2$ versus $H_1: \mu_1 \neq \mu_2$, use the statistic
$T = \sqrt{\frac{n_1 n_2}{n_1+n_2}}\,\frac{\bar{X}_1 - \bar{X}_2}{\hat{\sigma}_{12}} \overset{H_0}{\sim} \mathcal{T}(n_1+n_2-2)$.
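Both two-sample tests have standard R implementations; a sketch on simulated samples (sizes and parameters are arbitrary assumptions): var.test performs the F test of slide 11 and t.test with var.equal = TRUE the pooled-variance Student test of slide 12.

```r
set.seed(3)
x1 <- rnorm(25, mean = 10, sd = 1.5)   # N(mu1, sigma1^2) sample
x2 <- rnorm(30, mean = 11, sd = 1.5)   # independent N(mu2, sigma2^2) sample
# F test of H0: sigma1 = sigma2, based on sigma1_hat^2 / sigma2_hat^2
var.test(x1, x2)
# Pooled t test of H0: mu1 = mu2, assuming a common variance
t.test(x1, x2, var.equal = TRUE)
```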
13 Outline: One- and Two-Sample Statistics; Linear Gaussian Model; Model Reduction and Model Selection; Exercises.
14 Generic formulation
$Y_i = \alpha_1 x_i^1 + \dots + \alpha_p x_i^p + \sigma Z_i$, with $Z_i \sim N(0,1)$.
Matrix form: $Y = X\theta + \sigma Z$, $Z \sim N(0_n, I_n)$.
Ex: ANOVA, regression, change-points (ruptures) in time series.
15 Cochran's Theorem
Theorem: let $X = (X_1,\dots,X_n)$ be a standard centered normal sample; let $E_1,\dots,E_p$ be a decomposition of $\mathbb{R}^n$ into pairwise orthogonal subspaces of dimensions $\dim E_j = d_j$; for $1 \le j \le p$, let $v_1^j,\dots,v_{d_j}^j$ be an orthonormal basis of $E_j$. Then:
the components of $X$ in the basis $(v_1,\dots,v_n)$ form another standard centered normal sample;
the random vectors $X_{E_1},\dots,X_{E_p}$ obtained by projecting $X$ onto $E_1,\dots,E_p$ are independent;
so are $\|X_{E_1}\|,\dots,\|X_{E_p}\|$, and they satisfy $\|X_{E_i}\|^2 \sim \chi^2(d_i)$.
16 Generic solution
Theorem: the maximum-likelihood estimator and the least-squares estimator coincide and are given by
$\hat{\theta} = ({}^t\!X X)^{-1}\,{}^t\!X Y \sim N\big(\theta, \sigma^2 ({}^t\!X X)^{-1}\big)$.
The variance $\sigma^2$ is estimated (without bias) by
$\hat{\sigma}^2 = \frac{\|Y - X\hat{\theta}\|^2}{n-p} \sim \frac{\sigma^2}{n-p}\,\chi^2(n-p)$.
$\hat{\theta}$ and $\hat{\sigma}^2$ are independent.
Incremental Gram-Schmidt procedure.
Theorem (Gauss-Markov): $\hat{\theta}$ has minimal variance among all linear unbiased estimators of $\theta$.
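A sketch (simulated design and parameters, chosen only for illustration) checking the closed-form estimators against R's lm:

```r
set.seed(4)
n <- 100; p <- 3
X <- cbind(1, rnorm(n), rnorm(n))              # design matrix with an intercept column
theta <- c(2, -1, 0.5); sigma <- 0.7           # true parameters (assumptions)
Y <- drop(X %*% theta) + sigma * rnorm(n)
theta_hat  <- solve(t(X) %*% X, t(X) %*% Y)    # (X'X)^{-1} X'Y
sigma2_hat <- sum((Y - X %*% theta_hat)^2) / (n - p)   # ||Y - X theta_hat||^2 / (n - p)
cov_theta_hat <- sigma2_hat * solve(t(X) %*% X)        # estimated covariance of theta_hat
theta_hat; sigma2_hat
coef(lm(Y ~ X - 1))                            # lm reproduces the same estimates
```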
17-19 [Figures: geometric picture of the Gaussian linear model $Y = X\theta + \sigma Z$. $X\theta$ lies in the model space $M$ and $Y_M = X\hat{\theta}$ is the orthogonal projection of $Y$ onto $M$; by Cochran's Theorem, $\|Y - X\theta\|^2 \sim \sigma^2\chi^2(n)$, $\|Y - Y_M\|^2 \sim \sigma^2\chi^2(n-p)$ and $\|Y_M - X\theta\|^2 \sim \sigma^2\chi^2(p)$.]
20 Simple regression: $Y_i = \alpha + \beta x_i + \sigma Z_i$
Theorem: the ML estimators are given by
$\hat{\alpha} = \bar{Y} - \hat{\beta}\bar{x} \sim N\!\left(\alpha, \frac{\sigma^2 E[x^2]}{n\,\mathrm{Var}(x)}\right)$ and $\hat{\beta} = \frac{\mathrm{Cov}(x,Y)}{\mathrm{Var}(x)} \sim N\!\left(\beta, \frac{\sigma^2}{n\,\mathrm{Var}(x)}\right)$,
where $E[x^2]$, $\mathrm{Var}(x)$ and $\mathrm{Cov}(x,Y)$ denote empirical moments of the design.
They are correlated: $\mathrm{Cov}(\hat{\alpha},\hat{\beta}) = -\frac{\sigma^2 \bar{x}}{n\,\mathrm{Var}(x)}$.
The variance can be estimated by $\hat{\sigma}_n^2 = \frac{1}{n-2}\sum_i (Y_i - \hat{\alpha} - \hat{\beta} x_i)^2 \sim \frac{\sigma^2}{n-2}\,\chi^2(n-2)$.
Smart reparameterization: $Y_i = \delta + \beta(x_i - \bar{x}) + \sigma Z_i$.
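A short sketch of the explicit formulas on simulated data (the true coefficients are assumptions); summary(lm(Y ~ x)) returns the same numbers together with the Student tests of the next section.

```r
set.seed(5)
n <- 50
x <- runif(n, 0, 10)
Y <- 1 + 2 * x + rnorm(n)                     # alpha = 1, beta = 2, sigma = 1 (assumptions)
beta_hat   <- cov(x, Y) / var(x)              # Cov(x, Y) / Var(x)
alpha_hat  <- mean(Y) - beta_hat * mean(x)    # Ybar - beta_hat * xbar
sigma2_hat <- sum((Y - alpha_hat - beta_hat * x)^2) / (n - 2)
c(alpha_hat, beta_hat, sigma2_hat)
coef(summary(lm(Y ~ x)))                      # same estimates, with standard errors and t tests
```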
21 Polynomial regression
The design matrix has columns $1, x, x^2, \dots, x^p$ and the parameter vector is $(\alpha_0, \alpha_1, \alpha_2, \dots, \alpha_p)$, so that $Y = X\theta + \sigma Z$.
Can also be used for exponential growth models $y_i = \exp(a x_i + b + \epsilon_i)$, or to determine $\beta$ such that $E[Y] = \alpha x^\beta$, etc. (after taking logarithms).
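A sketch of both uses (degrees and true models are assumptions for the example): polynomial regression with an explicit power basis, and the power-law model fitted after taking logarithms.

```r
set.seed(6)
x <- seq(0, 1, length.out = 60)
y <- 1 + 2 * x - 3 * x^2 + rnorm(60, sd = 0.1)       # true degree-2 polynomial (assumption)
fit_poly <- lm(y ~ poly(x, degree = 3, raw = TRUE))  # regressors x, x^2, x^3 plus the intercept
summary(fit_poly)

x2 <- runif(60, 1, 10)
y2 <- 2 * x2^1.5 * exp(rnorm(60, sd = 0.1))          # E[Y] = alpha * x^beta with noise
fit_pow <- lm(log(y2) ~ log(x2))                     # slope estimates beta, intercept log(alpha)
coef(fit_pow)
```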
22 Outline: One- and Two-Sample Statistics; Linear Gaussian Model; Model Reduction and Model Selection; Exercises.
23 Student test on a regressor
Theorem: in order to test $H_0: \theta_k = a$ versus $H_1: \theta_k \neq a$,
estimate the variance of $\hat{\theta}_k$ by $\hat{\sigma}^2(\hat{\theta}_k) = \hat{\sigma}^2 \{({}^t\!XX)^{-1}\}_{k,k}$,
and use the statistic $T = \frac{\hat{\theta}_k - a}{\hat{\sigma}(\hat{\theta}_k)} \overset{H_0}{\sim} \mathcal{T}(n-p)$.
Generalization: to test $H_0: {}^t b\,\theta = a$ versus $H_1: {}^t b\,\theta \neq a$, use
$T = \frac{{}^t b\,\hat{\theta} - a}{\hat{\sigma}\sqrt{{}^t b\,({}^t\!XX)^{-1} b}} \overset{H_0}{\sim} \mathcal{T}(n-p)$.
24 Fisher test: model vs submodel
Theorem: let $H \subset E \subset \mathbb{R}^n$ with $\dim H = q$ and $\dim E = p$. To test $H_0: \theta \in H$ versus $H_1: \theta \in E \setminus H$, use the statistic
$F = \frac{\|Y_E - Y_H\|^2/(p-q)}{\|Y - Y_E\|^2/(n-p)} \overset{H_0}{\sim} F(p-q, n-p)$;
reject if $F > F^{p-q,\,n-p}_{1-\alpha}$.
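In R the model-vs-submodel F test is the two-argument form of anova; a sketch on simulated data in which the extra directions are pure noise (an assumption of the example):

```r
set.seed(7)
n <- 80
x1 <- rnorm(n); x2 <- rnorm(n); x3 <- rnorm(n)
Y <- 1 + 2 * x1 + rnorm(n)            # x2 and x3 are irrelevant (assumption)
fit_H <- lm(Y ~ x1)                   # submodel H
fit_E <- lm(Y ~ x1 + x2 + x3)         # full model E
# F = (||Y_E - Y_H||^2 / (p - q)) / (||Y - Y_E||^2 / (n - p))
anova(fit_H, fit_E)
```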
25-27 [Figures: geometric picture of the Fisher test. $Y_E = X\hat{\theta}_E$ and $Y_H = X\hat{\theta}_H$ are the projections of $Y$ onto $E$ and $H$; under $H_0$, $\|Y - Y_E\|^2 \sim \sigma^2\chi^2(n-p)$, $\|Y_H - X\theta\|^2 \sim \sigma^2\chi^2(q)$ and $\|Y_E - Y_H\|^2 \sim \sigma^2\chi^2(p-q)$.]
28 Sum-of-squares notations and $R^2$
For a model $M$ (relative to a matrix $X$), define:
total variance $SSY = \|Y - \bar{Y}\mathbf{1}_n\|^2 = SSE(\mathbf{1}_n)$;
residual variance $SSE(M) = \|Y - X\hat{\theta}\|^2$;
explained variance $SSR(M) = \|X\hat{\theta} - \bar{Y}\mathbf{1}_n\|^2$;
so that $SSY = SSE(M) + SSR(M)$.
The quality of fit is quantified by $R^2(M) = \frac{SSR(M)}{SSY}$.
The Fisher statistic can be written
$F = \frac{(SSE(H) - SSE(E))/(p-q)}{SSE(E)/(n-p)} = \frac{n - \dim(E)}{\dim(E) - \dim(H)}\cdot\frac{R^2(E) - R^2(H)}{1 - R^2(E)}$.
29 ANOVA
The model can be written $Y_{i,k} = \theta_i + \sigma\epsilon_{i,k}$, $1 \le i \le p$, $1 \le k \le n_i$.
Let $Y_{i,\cdot} = \frac{1}{n_i}\sum_k Y_{i,k}$ and $Y_{\cdot,\cdot} = \frac{1}{n}\sum_{i,k} Y_{i,k}$.
The variance can be decomposed as
$SSY = SSR(M) + SSE(M) = \sum_i n_i (Y_{i,\cdot} - Y_{\cdot,\cdot})^2 + \sum_{i,k} (Y_{i,k} - Y_{i,\cdot})^2$.
To test $H_0: \theta_1 = \dots = \theta_p$ versus $H_1 = \neg H_0$, the Fisher statistic is
$F = \frac{n-p}{p-1}\cdot\frac{\sum_i n_i (Y_{i,\cdot} - Y_{\cdot,\cdot})^2}{\sum_{i,k} (Y_{i,k} - Y_{i,\cdot})^2} \overset{H_0}{\sim} F(p-1, n-p)$.
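A one-way ANOVA sketch in R (group sizes and group means are assumptions); anova(lm(...)) reports exactly the $F(p-1, n-p)$ statistic above.

```r
set.seed(8)
group <- factor(rep(c("A", "B", "C"), times = c(20, 25, 15)))
theta <- c(A = 10, B = 10.5, C = 12)            # group means (assumption)
Y <- theta[as.character(group)] + rnorm(length(group))
fit <- lm(Y ~ group)
anova(fit)                                      # F test of H0: theta_A = theta_B = theta_C
```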
30 Exhaustive, Forward, Backward and Stepwise selection
Exhaustive search: for each size $1 \le k \le p$, find the combination of directions with the highest $R^2$.
Forward selection: at each step, add the direction most correlated with $Y$; stop when the Fisher test for the added direction is no longer significant.
Backward selection: start with the full model and remove the direction with the smallest t-statistic; stop when all remaining t-statistics are significant.
Stepwise selection: like forward selection, but after each inclusion remove all directions with a non-significant F-statistic.
Note: unless specified otherwise, $\mathbf{1}_n$ is always included in the models.
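A sketch of these searches with R's step function; note that step ranks models by AIC rather than by the F/t stopping rules described above, but it explores the same forward, backward and stepwise paths. The data are simulated (an assumption).

```r
set.seed(9)
n <- 100
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n), x4 = rnorm(n))
d$y <- 1 + 2 * d$x1 - d$x3 + rnorm(n)           # x2 and x4 are pure noise (assumption)
full <- lm(y ~ x1 + x2 + x3 + x4, data = d)
null <- lm(y ~ 1, data = d)                     # intercept-only model (1_n always included)
step(null, scope = formula(full), direction = "forward")
step(full, direction = "backward")
step(null, scope = formula(full), direction = "both")
```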
31 Quadratic risk: bias-variance decomposition
To simplify the discussion, we consider the model $Y = \theta + \sigma Z$, where $\theta$ is arbitrary but is to be approximated within the family of models $\mathcal{M}$.
The quadratic risk of a model $M \in \mathcal{M}$ is defined as $r(M) = E\big[\|\theta - \hat{\theta}_M\|^2\big]$.
It can be decomposed as $r(M) = \|\theta - \theta_M\|^2 + \sigma^2\dim(M)$.
32 Risk estimation and Mallows' criterion
Goal: choose the model $M \in \mathcal{M}$ with minimal quadratic risk $r(M)$.
Problem: the bias $\|\theta - \theta_M\|^2$ is unknown.
Idea: penalize the complexity $\dim(M)$.
Mallows' criterion: choose the model $M$ minimizing $C_p(M) = SSE(M) + 2\sigma^2\dim(M)$.
Heuristic: $r(M) = \|\theta\|^2 - \|\theta_M\|^2 + \sigma^2\dim(M)$, but $E\big[\|\hat{\theta}_M\|^2\big] = \|\theta_M\|^2 + \sigma^2\dim(M)$, hence
$\hat{r}(M) = \|\theta\|^2 - \big(\|\hat{\theta}_M\|^2 - \sigma^2\dim(M)\big) + \sigma^2\dim(M)$
has expectation $r(M)$; minimizing $\hat{r}(M)$ over $\mathcal{M}$ is equivalent to minimizing
$\hat{r}(M) - \|\theta\|^2 + \|Y\|^2 = \|Y - \hat{\theta}_M\|^2 + 2\sigma^2\dim(M) = C_p(M)$.
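A sketch computing $C_p(M) = SSE(M) + 2\sigma^2\dim(M)$ for a few nested candidates; since $\sigma^2$ is unknown in practice, it is replaced here by the estimate from the largest model, an assumption of this example rather than part of the slide.

```r
set.seed(10)
n <- 100
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
d$y <- 1 + 2 * d$x1 + rnorm(n)                  # only x1 matters (assumption)
models <- list(m1 = lm(y ~ x1, data = d),
               m2 = lm(y ~ x1 + x2, data = d),
               m3 = lm(y ~ x1 + x2 + x3, data = d))
sigma2 <- summary(models$m3)$sigma^2            # sigma^2 estimated on the largest model
cp <- sapply(models, function(m)
  sum(residuals(m)^2) + 2 * sigma2 * length(coef(m)))   # SSE(M) + 2 sigma^2 dim(M)
cp                                              # choose the model with the smallest value
```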
33-39 [Figures: geometric picture of the bias-variance decomposition for $Y = \theta + \sigma Z$. $\theta_M = \Pi_M(\theta)$ is the projection of $\theta$ onto $M$ and $\hat{\theta}_M = \Pi_M(Y)$ is the estimator; the squared bias is $\|\theta - \theta_M\|^2$, the variance term is $\sigma^2\dim(M)$, and the risk is their sum. The successive panels illustrate the trade-off: a small model has large bias and small variance, a large model the opposite.]
40 Other criteria
Adjusted $R^2$: $R_a^2(M) = 1 - \frac{n-1}{n-\dim(M)}\big(1 - R^2(M)\big) = 1 - \frac{n-1}{n-\dim(M)}\cdot\frac{SSE(M)}{SSY}$.
Bayesian Information Criterion: $BIC(M) = SSE(M) + \sigma^2\dim(M)\log n$.
41 Application: denoising a signal
Discretized and noisy version of $f: [0,1] \to \mathbb{R}$: $Y_k = f(k/n) + \sigma Z_k$, $0 \le k \le n-1$.
Choice of an orthogonal basis of $\mathbb{R}^n$: the Fourier basis
$\Omega_n = \left\{\Big[\sin\Big(\tfrac{2\pi k l}{n}\Big)\Big]_{0\le l\le n-1},\ 1 \le k \le \tfrac{n-1}{2}\right\} \cup \left\{\Big[\cos\Big(\tfrac{2\pi k l}{n}\Big)\Big]_{0\le l\le n-1},\ 0 \le k \le \tfrac{n}{2}\right\}$.
Nested models with an increasing number of non-zero Fourier coefficients.
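A sketch of the denoising procedure (signal, noise level and normalization of the basis are assumptions of the example): build the trigonometric design with the first K frequencies, fit the nested models and select K by a Mallows-type penalty.

```r
set.seed(11)
n <- 128
f <- function(t) sin(2 * pi * t) + 0.5 * cos(6 * pi * t)   # unknown signal (assumption)
l <- 0:(n - 1)
Y <- f(l / n) + 0.3 * rnorm(n)
fourier_design <- function(K) {                 # intercept plus K cos/sin pairs
  cols <- lapply(1:K, function(k)
    cbind(cos(2 * pi * k * l / n), sin(2 * pi * k * l / n)))
  cbind(1, do.call(cbind, cols))
}
Kmax <- 20; sigma2 <- 0.3^2                     # noise variance assumed known here
crit <- sapply(1:Kmax, function(K) {
  XK <- fourier_design(K)
  sum(residuals(lm(Y ~ XK - 1))^2) + 2 * sigma2 * ncol(XK)
})
which.min(crit)                                 # selected number of frequencies
```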
42 Logistic regression
The Gaussian model obviously does not apply everywhere; think e.g. of a regression of heart disease on age.
Logistic model: $Y_i \sim \mathcal{B}\big(\mu({}^t\!X_i\,\theta)\big)$, where $\mu(\eta) = \frac{\exp(\eta)}{1+\exp(\eta)}$ is the inverse logit function.
Maximum-likelihood estimation is possible numerically (Newton-Raphson method).
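A minimal R sketch of the logistic model on simulated data (the age/disease relationship is invented for the example); glm with the binomial family maximizes the likelihood by iteratively reweighted least squares, a Newton-Raphson variant.

```r
set.seed(12)
n <- 200
age <- runif(n, 30, 80)
eta <- -6 + 0.1 * age                          # linear predictor (assumed coefficients)
p <- exp(eta) / (1 + exp(eta))                 # inverse logit mu(eta)
disease <- rbinom(n, size = 1, prob = p)
fit <- glm(disease ~ age, family = binomial)   # ML fit of the logistic model
summary(fit)
predict(fit, newdata = data.frame(age = 50), type = "response")  # estimated P(Y = 1 | age = 50)
```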
43 Outline: One- and Two-Sample Statistics; Linear Gaussian Model; Model Reduction and Model Selection; Exercises.
44 Discovery of R
Understand and modify the source code available on the website.
The data frame called cars contains two arrays: cars$dist and cars$speed. It gives the speed of cars and the distances taken to stop (recorded in the 1920s). A relation $\mathrm{dist} = A\,\mathrm{speed}^B$ is expected. How can A and B be estimated? Test whether B = 0, and then whether B = 1.
Find out how logistic regression can be done with R. Illustrate it on some data of your choosing.
45 Simple exercises
Show that if $X_1,\dots,X_n$ is a $N(0,1)$-sample, then $\bar{X}_n \sim N(0,1/n)$ and $\sum_{i=1}^n (X_i - \bar{X}_n)^2 \sim \chi^2(n-1)$.
Re-derive the formulas giving $\hat{\alpha}$ and $\hat{\beta}$ in the simple regression model by analytic minimization of the total squared error $\sum_{i=1}^n (y_i - \alpha - \beta x_i)^2$.
Compute the squared prediction error $E\big[(\hat{y} - \alpha - \beta x)^2\big]$ for a new observation at point $x$ in the simple regression model.
Same exercises for the general Gaussian linear model.
46-47 Exercise: weighing methods
A two-plate weighing machine is called unbiased with precision $\sigma^2$ if an object of true weight $m$ placed on the left plate is balanced by a random weight $y = m + \sigma\epsilon$ on the right plate, where $\epsilon$ is a centered standard normal variable. Mister M. has three objects of masses $a$, $b$ and $c$ to weigh with such a machine, and he is allowed to make three measurements. He thinks of three possibilities:
weighing each object separately: (a), (b), (c);
weighing the objects two at a time: (ab), (ac) and (bc);
putting each object once on the right plate alone and twice with another: (ab c), (ac b), (bc a).
What would you advise him?
More precisely: compute the variance of each individual weight estimate for each possibility and give a first conclusion. Does the conclusion still hold if one is interested in linear combinations of the weights?
48 Exercise: multi-intercept regression
Botanists want to quantify the average difference in height between the trees of two forests A and B. In their model, the height of a tree is the sum of three terms:
a term $q$ depending on the quality of the ground, assumed constant within each forest: $q_A$ for the trees of forest A, $q_B$ for the trees of forest B;
an unknown biological constant times the quantity of humus around the tree;
a random term specific to each tree.
Precisely, they want to estimate the difference $D = q_A - q_B$. For their study, they have collected the heights of $n_A$ trees in forest A and $n_B$ trees in forest B, as well as the quantities $(h^A_j)_{1\le j\le n_A}$ and $(h^B_j)_{1\le j\le n_B}$ of humus at the base of all those trees. Tell them how to do it.