Statistics for Applications. Chapter 7: Regression


Heuristics of the linear regression (1)
Consider a cloud of i.i.d. random points (X_i, Y_i), i = 1, ..., n.

Heuristics of the linear regression (2)
- Idea: fit the best line to the data.
- Approximation: Y_i ≈ a + b X_i, i = 1, ..., n, for some (unknown) a, b ∈ ℝ.
- Find estimators â, b̂ that approach a and b.
- More generally: Y_i ∈ ℝ, X_i ∈ ℝ^d, Y_i ≈ a + X_i^T b, a ∈ ℝ, b ∈ ℝ^d.
- Goal: write a rigorous model and estimate a and b.

Heuristics of the linear regression (3)
Examples:
- Economics: demand and price, D_i ≈ a + b p_i, i = 1, ..., n.
- Ideal gas law: PV = nRT, so log P_i ≈ a + b log V_i + c log T_i, i = 1, ..., n.

Linear regression of a r.v. Y on a r.v. X (1)
Let X and Y be two real random variables (not necessarily independent) with finite second moments and Var(X) ≠ 0. The theoretical linear regression of Y on X is the best approximation in quadratic mean of Y by an affine function of X, i.e., the r.v. a + bX, where a and b are the two real numbers minimizing E[(Y − a − bX)²]. By some simple algebra:
- b = Cov(X, Y) / Var(X),
- a = E[Y] − b E[X] = E[Y] − (Cov(X, Y) / Var(X)) E[X].
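As an aside (not in the original slides), these population formulas suggest the obvious plug-in estimator. A minimal numpy sketch, with invented data, seed, and parameter values:

    import numpy as np

    rng = np.random.default_rng(0)
    n = 10_000
    X = rng.normal(size=n)
    Y = 2.0 + 3.0 * X + rng.normal(size=n)      # true a = 2, b = 3

    # Plug sample moments into b = Cov(X, Y) / Var(X), a = E[Y] - b E[X]
    b_hat = np.cov(X, Y, bias=True)[0, 1] / np.var(X)
    a_hat = Y.mean() - b_hat * X.mean()
    print(a_hat, b_hat)                         # close to 2 and 3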

Linear regression of a r.v. Y on a r.v. X (2)
If ε = Y − (a + bX), then Y = a + bX + ε, with E[ε] = 0 and Cov(X, ε) = 0.
Conversely, assume that Y = a + bX + ε for some a, b ∈ ℝ and some centered r.v. ε that satisfies Cov(X, ε) = 0 (e.g., if X ⊥ ε or if E[ε | X] = 0, then Cov(X, ε) = 0). Then a + bX is the theoretical linear regression of Y on X.

Linear regression of a r.v. Y on a r.v. X (3)
A sample of n i.i.d. random pairs (X_1, Y_1), ..., (X_n, Y_n) with the same distribution as (X, Y) is available. We want to estimate a and b.

Linear regression of a r.v. Y on a r.v. X (4)
Definition: The least squared error (LSE) estimator of (a, b) is the minimizer of the sum of squared errors:
Σ_{i=1}^n (Y_i − a − b X_i)².
(â, b̂) is given by
- b̂ = ( (1/n) Σ_i X_i Y_i − X̄ Ȳ ) / ( (1/n) Σ_i X_i² − X̄² ),
- â = Ȳ − b̂ X̄,
where X̄ = (1/n) Σ_i X_i and Ȳ = (1/n) Σ_i Y_i.
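A minimal numpy check of this closed form (not from the slides; data and seed are invented), cross-checked against numpy's own least squares fit:

    import numpy as np

    rng = np.random.default_rng(1)
    n = 50
    X = rng.uniform(size=n)
    Y = 1.0 - 2.0 * X + 0.1 * rng.normal(size=n)

    # Closed-form LSE from the slide
    b_hat = ((X * Y).mean() - X.mean() * Y.mean()) / ((X**2).mean() - X.mean()**2)
    a_hat = Y.mean() - b_hat * X.mean()

    # Cross-check: np.polyfit returns [slope, intercept] for deg=1
    b_np, a_np = np.polyfit(X, Y, deg=1)
    assert np.allclose([a_hat, b_hat], [a_np, b_np])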

Linear regression of a r.v. Y on a r.v. X (5)
[figure]

Multivariate case (1)
Y_i = X_i^T β + ε_i, i = 1, ..., n.
- Vector of explanatory variables, or covariates: X_i ∈ ℝ^p (w.l.o.g., assume its first coordinate is 1).
- Dependent variable: Y_i.
- β = (a, b^T)^T; β_1 (= a) is called the intercept.
- {ε_i}_{i=1,...,n}: noise terms satisfying Cov(X_i, ε_i) = 0.
Definition: The least squared error (LSE) estimator of β is the minimizer of the sum of squared errors:
β̂ = argmin_{t ∈ ℝ^p} Σ_{i=1}^n (Y_i − X_i^T t)².

Multivariate case (2)
LSE in matrix form:
- Let Y = (Y_1, ..., Y_n)^T ∈ ℝ^n.
- Let X be the n × p matrix whose rows are X_1^T, ..., X_n^T (X is called the design matrix).
- Let ε = (ε_1, ..., ε_n)^T ∈ ℝ^n (unobserved noise), so that Y = Xβ + ε.
- The LSE β̂ satisfies β̂ = argmin_{t ∈ ℝ^p} ‖Y − Xt‖₂².

Multivariate case (3)
Assume that rank(X) = p. Analytic computation of the LSE:
β̂ = (X^T X)^{−1} X^T Y.
Geometric interpretation of the LSE: Xβ̂ is the orthogonal projection of Y onto the subspace spanned by the columns of X:
Xβ̂ = P Y, where P = X (X^T X)^{−1} X^T.
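A minimal numpy sketch of both formulas (not from the slides; data and seed are invented): the normal-equations solution and the projection P Y agree with numpy's least squares routine.

    import numpy as np

    rng = np.random.default_rng(2)
    n, p = 100, 3
    X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])  # first covariate = 1
    Y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(size=n)

    # Normal equations (valid when rank(X) = p)
    beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)

    # Numerically preferable equivalent
    beta_ls, *_ = np.linalg.lstsq(X, Y, rcond=None)
    assert np.allclose(beta_hat, beta_ls)

    # Orthogonal projection: X beta_hat = P Y with P = X (X^T X)^{-1} X^T
    P = X @ np.linalg.solve(X.T @ X, X.T)
    assert np.allclose(X @ beta_hat, P @ Y)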

Linear regression with deterministic design and Gaussian noise (1)
Assumptions:
- The design matrix X is deterministic and rank(X) = p.
- The model is homoscedastic: ε_1, ..., ε_n are i.i.d.
- The noise vector ε is Gaussian: ε ~ N_n(0, σ² I_n), for some known or unknown σ² > 0.

Linear regression with deterministic design and Gaussian noise (2)
- LSE = MLE: β̂ ~ N_p(β, σ² (X^T X)^{−1}).
- Quadratic risk of β̂: E[‖β̂ − β‖₂²] = σ² tr((X^T X)^{−1}).
- Prediction error: E[‖Y − Xβ̂‖₂²] = σ² (n − p).
- Unbiased estimator of σ²: σ̂² = ‖Y − Xβ̂‖₂² / (n − p).
Theorem:
- (n − p) σ̂² / σ² ~ χ²_{n−p};
- β̂ ⊥ σ̂².
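A short simulation (not from the slides; data, seed, and σ are invented) of the unbiased variance estimator σ̂² = ‖Y − Xβ̂‖₂² / (n − p):

    import numpy as np

    rng = np.random.default_rng(3)
    n, p, sigma = 200, 4, 0.5
    X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
    Y = X @ np.arange(1.0, p + 1) + sigma * rng.normal(size=n)

    beta_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
    resid = Y - X @ beta_hat
    sigma2_hat = resid @ resid / (n - p)    # unbiased for sigma^2 = 0.25
    print(sigma2_hat)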

Significance tests (1)
Test whether the j-th explanatory variable is significant in the linear regression (1 ≤ j ≤ p):
H_0: β_j = 0 vs. H_1: β_j ≠ 0.
Let γ_j be the j-th diagonal coefficient of (X^T X)^{−1} (γ_j > 0). Then
(β̂_j − β_j) / sqrt(σ̂² γ_j) ~ t_{n−p}.
Let T_n^{(j)} = β̂_j / sqrt(σ̂² γ_j). Test with non-asymptotic level α ∈ (0, 1):
δ_α^{(j)} = 1{|T_n^{(j)}| > q_{α/2}(t_{n−p})},
where q_{α/2}(t_{n−p}) is the (1 − α/2)-quantile of t_{n−p}.
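A sketch of this t-test in Python (not from the slides; it assumes scipy is available and uses invented data with the second coefficient equal to 0):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(4)
    n, p = 100, 3
    X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
    Y = X @ np.array([1.0, 0.0, 2.0]) + rng.normal(size=n)  # second coefficient is 0

    XtX_inv = np.linalg.inv(X.T @ X)
    beta_hat = XtX_inv @ X.T @ Y
    sigma2_hat = np.sum((Y - X @ beta_hat) ** 2) / (n - p)

    alpha, j = 0.05, 1                      # H0: beta_j = 0 (0-based index)
    T = beta_hat[j] / np.sqrt(sigma2_hat * XtX_inv[j, j])
    reject = abs(T) > stats.t.ppf(1 - alpha / 2, df=n - p)  # usually False here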

Significance tests (2)
Test whether a group of explanatory variables is significant in the linear regression:
H_0: β_j = 0 for all j ∈ S vs. H_1: β_j ≠ 0 for some j ∈ S,
where S ⊆ {1, ..., p}.
Bonferroni's test: δ_α^B = max_{j ∈ S} δ_{α/k}^{(j)}, where k = |S|.
δ_α^B has non-asymptotic level at most α.

More tests (1)
Let G be a k × p matrix with rank(G) = k (k ≤ p) and λ ∈ ℝ^k. Consider the hypotheses:
H_0: Gβ = λ vs. H_1: Gβ ≠ λ.
The setup of the previous slide is a particular case. If H_0 is true, then:
Gβ̂ − λ ~ N_k(0, σ² G (X^T X)^{−1} G^T),
and
(1/σ²) (Gβ̂ − λ)^T (G (X^T X)^{−1} G^T)^{−1} (Gβ̂ − λ) ~ χ²_k.

More tests (2)
Let S_n = (1/(k σ̂²)) (Gβ̂ − λ)^T (G (X^T X)^{−1} G^T)^{−1} (Gβ̂ − λ).
If H_0 is true, then S_n ~ F_{k, n−p}.
Test with non-asymptotic level α ∈ (0, 1):
δ_α = 1{S_n > q_α(F_{k, n−p})},
where q_α(F_{k, n−p}) is the (1 − α)-quantile of F_{k, n−p}.
Definition: The Fisher distribution with p and q degrees of freedom, denoted F_{p,q}, is the distribution of (U/p) / (V/q), where U ~ χ²_p, V ~ χ²_q, and U ⊥ V.
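A hedged Python sketch of this F-test for H_0: Gβ = λ (not from the slides; assumes scipy; G, λ, and the data are invented):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(5)
    n, p = 100, 4
    X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
    Y = X @ np.array([1.0, 0.0, 0.0, 2.0]) + rng.normal(size=n)

    beta_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
    sigma2_hat = np.sum((Y - X @ beta_hat) ** 2) / (n - p)

    # H0: G beta = lam, here "beta_2 = beta_3 = 0" (k = 2 linear constraints)
    G = np.array([[0.0, 1.0, 0.0, 0.0],
                  [0.0, 0.0, 1.0, 0.0]])
    lam = np.zeros(2)
    k = G.shape[0]
    V = G @ np.linalg.inv(X.T @ X) @ G.T
    d = G @ beta_hat - lam
    S_n = d @ np.linalg.solve(V, d) / (k * sigma2_hat)

    alpha = 0.05
    reject = S_n > stats.f.ppf(1 - alpha, dfn=k, dfd=n - p)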

Concluding remarks
- Linear regression exhibits correlations, NOT causality.
- Normality of the noise: one can use goodness-of-fit tests to test whether the residuals ε̂_i = Y_i − X_i^T β̂ are Gaussian.
- Deterministic design: if X is not deterministic, all of the above can be understood conditionally on X, provided the noise is assumed to be Gaussian conditionally on X.

Linear regression and lack of identifiability (1)
Consider the following model: Y = Xβ + ε, with:
1. Y ∈ ℝ^n (dependent variables), X ∈ ℝ^{n×p} (deterministic design);
2. β ∈ ℝ^p, unknown;
3. ε ~ N_n(0, σ² I_n).
Previously, we assumed that X had rank p, so we could invert X^T X. What if X is not of rank p, e.g., if p > n? Then β is no longer identified: estimation of β is vain (unless we add more structure).

Linear regression and lack of identifiability (2)
What about prediction? Xβ is still identified.
Ŷ: orthogonal projection of Y onto the linear span of the columns of X:
Ŷ = Xβ̂ = X (X^T X)^† X^T Y,
where A^† stands for the (Moore–Penrose) pseudo-inverse of a matrix A.
Similarly as before, if k = rank(X):
‖Ŷ − Y‖₂² / σ² ~ χ²_{n−k}, and ‖Ŷ − Y‖₂² ⊥ Ŷ.

Linear regression and lack of identifiability (3)
In particular: E[‖Ŷ − Y‖₂²] = (n − k) σ².
Unbiased estimator of the variance: σ̂² = ‖Ŷ − Y‖₂² / (n − k).
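A small numpy sketch (invented rank-deficient design and data) of prediction via the pseudo-inverse; here X (X^T X)^† X^T coincides with the projector X X^†:

    import numpy as np

    rng = np.random.default_rng(6)
    n = 20
    X = rng.normal(size=(n, 3)) @ rng.normal(size=(3, 10))  # n x 10 design of rank 3
    Y = X @ rng.normal(size=10) + 0.5 * rng.normal(size=n)

    Y_hat = X @ np.linalg.pinv(X) @ Y          # orthogonal projection onto col(X)
    k = np.linalg.matrix_rank(X)               # here k = 3
    sigma2_hat = np.sum((Y - Y_hat) ** 2) / (n - k)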

Linear regression in high dimension (1)
Consider again the following model: Y = Xβ + ε, with:
1. Y ∈ ℝ^n (dependent variables), X ∈ ℝ^{n×p} (deterministic design);
2. β ∈ ℝ^p, unknown: to be estimated;
3. ε ~ N_n(0, σ² I_n).
For each i, X_i ∈ ℝ^p is the vector of covariates of the i-th individual. If p is too large (p > n), there are too many parameters to estimate (overfitting), although some covariates may be irrelevant. Solution: reduction of the dimension.

Linear regression in high dimension (2)
Idea: assume that only a few coordinates of β are nonzero (but we do not know which ones). Based on the sample, select a subset of covariates and estimate the corresponding coordinates of β.
For S ⊆ {1, ..., p}, let
β̂_S ∈ argmin_{t ∈ ℝ^S} ‖Y − X_S t‖₂²,
where X_S is the submatrix of X obtained by keeping only the covariates indexed in S.

Linear regression in high dimension (3)
Select a subset S that minimizes the prediction error penalized by the complexity (or size) of the model:
‖Y − X_S β̂_S‖₂² + λ|S|,
where λ > 0 is a tuning parameter.
- If λ = 2σ̂², this is Mallows' C_p, or the AIC criterion.
- If λ = σ̂² log n, this is the BIC criterion.
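A brute-force sketch of this penalized subset selection (not from the slides; the data and the pilot σ̂² taken from the full model are invented choices). It is only feasible for small p, which is exactly the 2^p explosion discussed on the next slide:

    import numpy as np
    from itertools import combinations

    rng = np.random.default_rng(7)
    n, p = 50, 6
    X = rng.normal(size=(n, p))
    Y = X[:, 0] - 2.0 * X[:, 2] + rng.normal(size=n)   # only covariates 0 and 2 matter

    beta_full, *_ = np.linalg.lstsq(X, Y, rcond=None)
    sigma2_hat = np.sum((Y - X @ beta_full) ** 2) / (n - p)
    lam = 2 * sigma2_hat          # Mallows' C_p / AIC; use sigma2_hat * np.log(n) for BIC

    best_crit, best_S = np.inf, ()
    for k in range(p + 1):
        for S in combinations(range(p), k):
            if S:
                XS = X[:, list(S)]
                b, *_ = np.linalg.lstsq(XS, Y, rcond=None)
                rss = np.sum((Y - XS @ b) ** 2)
            else:
                rss = np.sum(Y ** 2)
            crit = rss + lam * len(S)
            if crit < best_crit:
                best_crit, best_S = crit, S
    print(best_S)                 # typically (0, 2)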

Linear regression in high dimension (4)
Each of these criteria is equivalent to finding b ∈ ℝ^p that minimizes
‖Y − Xb‖₂² + λ‖b‖₀,
where ‖b‖₀ = Σ_{j=1}^p 1{b_j ≠ 0} is the number of nonzero coefficients of b.
This is a computationally hard problem: it is nonconvex and requires computing 2^p estimators (all the β̂_S, for S ⊆ {1, ..., p}).
Lasso estimator: replace ‖b‖₀ with ‖b‖₁ = Σ_{j=1}^p |b_j|, and the problem becomes convex:
β̂^L ∈ argmin_{b ∈ ℝ^p} ‖Y − Xb‖₂² + λ‖b‖₁,
where λ > 0 is a tuning parameter.
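In practice the Lasso is rarely coded by hand; a sketch using scikit-learn (an assumption: the slides do not prescribe a solver). Note that sklearn's Lasso minimizes ‖Y − Xb‖₂² / (2n) + α‖b‖₁, i.e., the slide's λ up to a 1/(2n) rescaling:

    import numpy as np
    from sklearn.linear_model import Lasso

    rng = np.random.default_rng(8)
    n, p = 50, 200                           # high dimension: p > n
    X = rng.normal(size=(n, p))
    beta = np.zeros(p)
    beta[:3] = [3.0, -2.0, 1.5]              # sparse truth
    Y = X @ beta + 0.5 * rng.normal(size=n)

    model = Lasso(alpha=0.1, fit_intercept=False).fit(X, Y)
    print(np.flatnonzero(model.coef_))       # nonzero coordinates, ideally [0 1 2]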

Linear regression in high dimension (5)
How to choose λ? This is a difficult question (see the grad course High-dimensional statistics, Spring 2017). A good choice of λ will lead to an estimator β̂ that is very close to β and will allow us to recover, with high probability, the subset S of all j ∈ {1, ..., p} for which β_j ≠ 0.

Linear regression in high dimension (6)
[figure]

Nonparametric regression (1)
In the linear setup, we assumed that Y_i = X_i^T β + ε_i, where the X_i are deterministic. This has to be understood as working conditionally on the design. It amounts to assuming that E[Y_i | X_i] is a linear function of X_i, which is not true in general.
Let f(x) = E[Y_i | X_i = x], x ∈ ℝ^p. How can we estimate the function f?

Nonparametric regression (2)
Let p = 1 in the sequel. One can make a parametric assumption on f, e.g., f(x) = a + bx, f(x) = a + bx + cx², f(x) = e^{a+bx}, ...
The problem then reduces to the estimation of a finite number of parameters; LSE, MLE, and all the previous theory for the linear case can be adapted.
What if we do not make any such parametric assumption on f?

Nonparametric regression (3)
Assume f is smooth enough: f can be well approximated by a piecewise constant function.
Idea: local averages. For x ∈ ℝ, f(t) ≈ f(x) for t close to x, so for all i such that X_i is close enough to x, Y_i ≈ f(x) + ε_i.
Estimate f(x) by the average of all Y_i's for which X_i is close enough to x.

Nonparametric regression (4)
Let h > 0 be the window size (or bandwidth), and let I_x = {i = 1, ..., n : |X_i − x| < h}. Define f̂_{n,h}(x) as the average of {Y_i : i ∈ I_x}:
f̂_{n,h}(x) = (1/|I_x|) Σ_{i ∈ I_x} Y_i if I_x ≠ ∅, and f̂_{n,h}(x) = 0 otherwise.
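A direct Python transcription of this estimator; only the data generation is invented, echoing the slides' later example f(x) = x(1 − x):

    import numpy as np

    def f_hat(x, X, Y, h):
        """Local average of the Y_i with |X_i - x| < h; 0 if the window is empty."""
        mask = np.abs(X - x) < h
        return Y[mask].mean() if mask.any() else 0.0

    rng = np.random.default_rng(9)
    n = 100
    X = rng.uniform(size=n)
    Y = X * (1 - X) + 0.05 * rng.normal(size=n)
    print(f_hat(0.6, X, Y, h=0.1))           # compare with f(0.6) = 0.24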

Nonparametric regression (5)
[figure: scatter plot of Y against X]

Nonparametric regression (6)
[figure: the same scatter plot, with the local average f̂(x) at x = 0.6 for bandwidth h = 0.1]

Nonparametric regression (7)
How to choose h?
- If h → 0: overfitting the data.
- If h → ∞: underfitting; f̂_{n,h}(x) = Ȳ_n.

Nonparametric regression (8)
Example: n = 100, f(x) = x(1 − x), h = …
[figure: data (X, Y) and estimate f̂_{n,h}]

Nonparametric regression (9)
Example: n = 100, f(x) = x(1 − x), h = …
[figure: data (X, Y) and estimate f̂_{n,h}]

Nonparametric regression (10)
Example: n = 100, f(x) = x(1 − x), h = …
[figure: data (X, Y) and estimate f̂_{n,h}]

Nonparametric regression (11)
Choice of h?
- If the smoothness of f is known (i.e., the quality of the local approximation of f by piecewise constant functions): there is a good choice of h depending on that smoothness.
- If the smoothness of f is unknown: other techniques, e.g., cross-validation.
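One standard way to implement that last suggestion is leave-one-out cross-validation over a grid of bandwidths; a minimal sketch (invented grid and data, reusing the local-average estimator from above):

    import numpy as np

    def f_hat(x, X, Y, h):
        mask = np.abs(X - x) < h
        return Y[mask].mean() if mask.any() else 0.0

    def loo_cv_score(X, Y, h):
        """Leave-one-out prediction error of the local-average estimator."""
        n = len(X)
        preds = [f_hat(X[i], np.delete(X, i), np.delete(Y, i), h) for i in range(n)]
        return np.mean((Y - np.array(preds)) ** 2)

    rng = np.random.default_rng(10)
    X = rng.uniform(size=100)
    Y = X * (1 - X) + 0.05 * rng.normal(size=100)
    grid = [0.01, 0.05, 0.1, 0.2, 0.5]
    h_best = min(grid, key=lambda h: loo_cv_score(X, Y, h))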
