Quantitative Methods in Economics: Conditional Expectations

Maximilian Kasy, Harvard University, fall 2016

Roadmap, Part I
1. Linear predictors and least squares regression
2. Conditional expectations
3. Some functional forms for linear regression
4. Regression with controls and residual regression
5. Panel data and generalized least squares

Takeaways for these slides
- Probability review
- Conditional probability: the share within a subpopulation
- Conditional expectations minimize average squared prediction error (a population concept)
- Like the best linear predictor, but dropping the linearity restriction

Probability review
Assume for simplicity a finite state space.
Set of states of nature: $S = \{s_1, \dots, s_M\}$.
Probability distribution (measure): $P(s_j) \geq 0$, $\sum_{j=1}^M P(s_j) = 1$.
An event $A$ is a subset of $S$ and has probability $P(A) = \sum_{s \in A} P(s)$.
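As a concrete illustration, here is a minimal Python sketch of these definitions on a hypothetical three-state space (the states and probabilities are made up for illustration):

    # A finite state space with a probability measure.
    S = ["s1", "s2", "s3"]
    P = {"s1": 0.5, "s2": 0.3, "s3": 0.2}  # P(s_j) >= 0 and the values sum to 1

    # An event is a subset of S; its probability is the sum over its states.
    A = {"s1", "s3"}
    print(sum(P[s] for s in A))  # P(A) = 0.7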

A random variable $Y$ is a mapping from $S$ to $\mathbb{R}$. Its expectation is
$E(Y) = \sum_{j=1}^M Y(s_j) P(s_j)$.
Distribution of $Y$: $F_Y(y) = \sum_{s : Y(s) = y} P(s)$.
Questions for you: Express $E(Y)$ in terms of $F_Y$.

Solution:
$E(Y) = \sum_{j=1}^M Y(s_j) P(s_j) = \sum_{l=1}^L y_l \left( \sum_{j : Y(s_j) = y_l} P(s_j) \right) = \sum_{l=1}^L y_l F_Y(y_l)$.
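This identity is easy to check numerically; a sketch continuing the hypothetical state space above, with arbitrary values for $Y$:

    # Y maps states to real values; Y deliberately takes the value 1.0 twice.
    P = {"s1": 0.5, "s2": 0.3, "s3": 0.2}
    Y = {"s1": 1.0, "s2": 2.0, "s3": 1.0}

    # E(Y) as a sum over states.
    EY_states = sum(Y[s] * P[s] for s in P)

    # Induced distribution: F_Y(y) = sum of P(s) over states with Y(s) = y.
    F_Y = {}
    for s in P:
        F_Y[Y[s]] = F_Y.get(Y[s], 0.0) + P[s]

    # E(Y) as a sum over the distinct values y_l.
    EY_values = sum(y * p for y, p in F_Y.items())
    print(EY_states, EY_values)  # both 1.3, up to float rounding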

Conditional Probability
If $P(B) > 0$ and $s_j \in B$,
$P(s_j \mid B) = P(s_j) / \sum_{s \in B} P(s) = P(s_j) / P(B)$; $P(s_j \mid B) = 0$ if $s_j \notin B$.
Same properties as unconditional probability: $P(s_j \mid B) \geq 0$, $\sum_{j=1}^M P(s_j \mid B) = 1$.
Conditional probability for events:
$P(A \mid B) = \sum_{s \in A} P(s \mid B) = \sum_{s \in A \cap B} P(s) / P(B) = P(A \cap B) / P(B)$.

Properties of conditional probability
Partition formula: If the collection of disjoint subsets $\{B_j\}$ forms a partition of $S$, then
$P(A) = \sum_j P(A \mid B_j) P(B_j)$.
Bayes rule:
$P(B_i \mid A) = \frac{P(A \mid B_i) P(B_i)}{P(A)} = \frac{P(A \mid B_i) P(B_i)}{\sum_j P(A \mid B_j) P(B_j)}$.
Questions for you: Prove these formulas.
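Both formulas can be verified numerically; a sketch with a made-up two-set partition of the state space above:

    P = {"s1": 0.5, "s2": 0.3, "s3": 0.2}
    A = {"s1", "s2"}
    B1, B2 = {"s1"}, {"s2", "s3"}  # a partition of S

    def prob(E):
        return sum(P[s] for s in E)

    def cond(E, B):  # P(E | B) = P(E and B) / P(B)
        return prob(E & B) / prob(B)

    # Partition formula: P(A) = sum_j P(A | B_j) P(B_j).
    print(prob(A), cond(A, B1) * prob(B1) + cond(A, B2) * prob(B2))  # both 0.8

    # Bayes rule: P(B1 | A) = P(A | B1) P(B1) / P(A).
    print(cond(B1, A), cond(A, B1) * prob(B1) / prob(A))  # both 0.625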

Conditional Expectation
Conditional expectation given an event $B$: $E(Y \mid B) = \sum_{j=1}^M Y(s_j) P(s_j \mid B)$.
Let $X : S \to \{x_1, \dots, x_K\}$ be another random variable. The regression function $r : \{x_1, \dots, x_K\} \to \mathbb{R}$ has
$r(x) = E(Y \mid X = x) = E(Y \mid \{s \in S : X(s) = x\})$.
The conditional expectation given the random variable $X$ is then the random variable $E(Y \mid X) = r(X)$.
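In code, $r(x)$ is just the average of $Y$ within the subpopulation where $X(s) = x$; the values of $X$ and $Y$ below are made up for illustration:

    P = {"s1": 0.5, "s2": 0.3, "s3": 0.2}
    X = {"s1": 0, "s2": 1, "s3": 1}
    Y = {"s1": 1.0, "s2": 2.0, "s3": 4.0}

    def r(x):
        # E(Y | X = x): expectation under P(. | {s : X(s) = x}).
        B = [s for s in P if X[s] == x]
        PB = sum(P[s] for s in B)
        return sum(Y[s] * P[s] for s in B) / PB

    print(r(0), r(1))  # 1.0 and (2.0 * 0.3 + 4.0 * 0.2) / 0.5 = 2.8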

Joint Distribution: The random vector $(X, Y)$ maps $S$ into
$\mathcal{X} \times \mathcal{Y} = \{(x, y) : x \in \{x_1, \dots, x_K\},\ y \in \{y_1, \dots, y_L\}\}$,
and induces
$F_{XY}(x, y) = P(X = x, Y = y) = \sum_{s \in S : X(s) = x, Y(s) = y} P(s)$ for $(x, y) \in \mathcal{X} \times \mathcal{Y}$.
Marginal distributions: $F_X : \mathcal{X} \to [0, 1]$ and $F_Y : \mathcal{Y} \to [0, 1]$, with
$F_X(x) = \sum_{y \in \mathcal{Y}} F_{XY}(x, y)$, $F_Y(y) = \sum_{x \in \mathcal{X}} F_{XY}(x, y)$.
Conditional distribution: If $P(X = x) \neq 0$,
$F_{Y|X}(y \mid x) = F_{XY}(x, y) / F_X(x)$.
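These objects are easy to compute for a small joint pmf; the table below is made up for illustration:

    # A hypothetical joint distribution F_XY on {0, 1} x {1, 2}.
    F_XY = {(0, 1): 0.2, (0, 2): 0.3, (1, 1): 0.4, (1, 2): 0.1}

    # Marginals: sum the joint pmf over the other coordinate.
    F_X, F_Y = {}, {}
    for (x, y), p in F_XY.items():
        F_X[x] = F_X.get(x, 0.0) + p
        F_Y[y] = F_Y.get(y, 0.0) + p

    # Conditional distribution: F_{Y|X}(y | x) = F_XY(x, y) / F_X(x).
    def F_Y_given_X(y, x):
        return F_XY.get((x, y), 0.0) / F_X[x]

    print(F_X, F_Y)           # {0: 0.5, 1: 0.5} and {1: 0.6, 2: 0.4}, up to rounding
    print(F_Y_given_X(1, 0))  # 0.2 / 0.5 = 0.4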

Optimal Prediction
Recall the definition of the best linear predictor:
$\hat{Y} = \beta_0 + \beta_1 X$, with $\min_{\beta_0, \beta_1} E(Y - \hat{Y})^2$.
Now consider the same problem, without linearity:
$\min_g E[Y - g(X)]^2$.
Questions for you:
1. Rewrite $E[Y - g(X)]^2$ in terms of $F_{XY}(x, y)$.
2. Now rewrite it using $F_{Y|X}(y \mid x)$ and $F_X(x)$.
3. Solve for the optimal $g$.

Solution:
Rewriting:
$E[Y - g(X)]^2 = \sum_{x, y} [y - g(x)]^2 F_{XY}(x, y) = \sum_x \left( \sum_y [y - g(x)]^2 F_{Y|X}(y \mid x) \right) F_X(x)$.
Thus:
$g(x) = \arg\min_c \sum_y (y - c)^2 F_{Y|X}(y \mid x) = E(Y \mid X = x)$.
So the optimal choice of the function $g$ is the regression function $r$.
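A numeric check that the regression function beats other predictors in mean squared error; this sketch reuses the hypothetical joint pmf from above and compares against two arbitrary alternatives:

    F_XY = {(0, 1): 0.2, (0, 2): 0.3, (1, 1): 0.4, (1, 2): 0.1}
    F_X = {0: 0.5, 1: 0.5}

    def r(x):  # the regression function E(Y | X = x)
        return sum(y * p for (xx, y), p in F_XY.items() if xx == x) / F_X[x]

    def mse(g):  # E[Y - g(X)]^2 as a sum over the joint pmf
        return sum((y - g(x)) ** 2 * p for (x, y), p in F_XY.items())

    print(mse(r))                  # approx 0.2, the minimum
    print(mse(lambda x: 1.5))      # approx 0.25: a constant predictor does worse
    print(mse(lambda x: 2.0 - x))  # approx 0.3: so does this arbitrary g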

General state spaces and continuous distributions
So far, we considered discrete state spaces. For continuous distributions, we can derive most of the same results.
Let's start with the uniform distribution on $[a, b]$:
$P([c, d]) = \frac{1}{b - a} \int_c^d dx = \frac{d - c}{b - a}$ for $a \leq c < d \leq b$.
More generally, if $A \subseteq [a, b]$, then $P(A) = \frac{1}{b - a} \int_A dx$.

Expectation
General notation:
$E(Y) = \int_S Y(s) \, dP(s) = \int y \, dF_Y(y)$, with $F_Y(B) = P\{s : Y(s) \in B\}$.
If the induced distribution for $Y$ is uniform on $[a, b]$, then
$E(Y) = \frac{1}{b - a} \int_a^b y \, dy$.
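That integral evaluates to $(a + b)/2$, which a quick Monte Carlo sketch confirms:

    import random

    a, b = 2.0, 5.0
    draws = [random.uniform(a, b) for _ in range(100_000)]
    print(sum(draws) / len(draws))  # approx (a + b) / 2 = 3.5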

Generalizing our previous definitions
Joint distribution: $F_{XY}(B \times A) = P\{s : X(s) \in B, Y(s) \in A\}$.
Marginal distributions: $F_X(B) = F_{XY}(B \times \mathbb{R})$, $F_Y(A) = F_{XY}(\mathbb{R} \times A)$.
Conditional distribution: $F_{Y|X}(A \mid x) = P(Y \in A \mid X = x)$.

Partition formula:
$F_{XY}(B \times A) = \int_B F_{Y|X}(A \mid x) \, dF_X(x)$.
Bayes rule:
$P(X \in B \mid Y \in A) = \frac{\int_B P(Y \in A \mid X = x) \, dF_X(x)}{\int P(Y \in A \mid X = x) \, dF_X(x)}$.
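A simulation sketch of the continuous partition formula, under an assumed model chosen for illustration ($X \sim N(0, 1)$, $Y \mid X = x \sim N(x, 1)$, with $B = A = [0, \infty)$):

    import math
    import random

    def Phi(z):  # standard normal cdf
        return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

    n = 200_000
    xs = [random.gauss(0.0, 1.0) for _ in range(n)]
    ys = [random.gauss(x, 1.0) for x in xs]

    # Left side: P(X >= 0, Y >= 0) by simulation.
    lhs = sum(1 for x, y in zip(xs, ys) if x >= 0 and y >= 0) / n

    # Right side: integral over B of F_{Y|X}(A | x) dF_X(x),
    # here E[1{X >= 0} * P(Y >= 0 | X)], with P(Y >= 0 | X = x) = Phi(x).
    rhs = sum(Phi(x) for x in xs if x >= 0) / n

    print(lhs, rhs)  # both approx 0.375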

Densities
Joint density $f_{XY}$ (with respect to Lebesgue measure on $\mathbb{R}^2$):
$F_{XY}(B \times A) = \int_{B \times A} f_{XY}(x, y) \, dx \, dy = \int_A \left( \int_B f_{XY}(x, y) \, dx \right) dy = \int_B \left( \int_A f_{XY}(x, y) \, dy \right) dx$.
Marginal densities:
$f_X(x) = \int f_{XY}(x, y) \, dy$, $f_Y(y) = \int f_{XY}(x, y) \, dx$.
Conditional density:
$f_{Y|X}(y \mid x) = f_{XY}(x, y) / f_X(x)$, so that $F_{Y|X}(A \mid x) = \int_A f_{Y|X}(y \mid x) \, dy$.

The joint density factors into the product of the conditional density and the marginal density:
$f_{XY}(x, y) = f_{Y|X}(y \mid x) f_X(x)$.
Partition formula:
$f_Y(y) = \int f_{XY}(x, y) \, dx = \int f_{Y|X}(y \mid x) f_X(x) \, dx$.
Bayes rule:
$f_{X|Y}(x \mid y) = \frac{f_{Y|X}(y \mid x) f_X(x)}{\int f_{Y|X}(y \mid x) f_X(x) \, dx}$.
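A grid-based sketch of Bayes rule for densities, with a hypothetical normal marginal $f_X$ and normal conditional $f_{Y|X}$ (the model and the observed $y$ are chosen purely for illustration):

    import math

    def normal_pdf(z, mean, sd):
        return math.exp(-0.5 * ((z - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

    # Assumed model: X ~ N(0, 1) and Y | X = x ~ N(x, 1); we observe y = 1.
    y, dx = 1.0, 0.01
    grid = [-5 + dx * i for i in range(1001)]  # grid over x

    # Numerator f_{Y|X}(y|x) f_X(x), and the denominator as its integral over x.
    num = [normal_pdf(y, x, 1.0) * normal_pdf(x, 0.0, 1.0) for x in grid]
    denom = sum(num) * dx

    posterior = [v / denom for v in num]  # f_{X|Y}(x | y) on the grid
    post_mean = sum(x * p for x, p in zip(grid, posterior)) * dx
    print(post_mean)  # approx 0.5, matching the known normal-normal result y/2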

Law of iterated expectations
For random variables $X$ and $Y$,
$E[E[Y \mid X]] = E[Y]$.
Proof for the continuous case:
$E[E[Y \mid X]] = \int \left( \int y f_{Y|X}(y \mid x) \, dy \right) f_X(x) \, dx$
$= \int \int y f_{Y|X}(y \mid x) f_X(x) \, dy \, dx$
$= \int \int y f_{XY}(x, y) \, dy \, dx$
$= \int y \left( \int f_{XY}(x, y) \, dx \right) dy$
$= \int y f_Y(y) \, dy = E[Y]$.
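A simulation sketch of this identity, under an arbitrary choice of distributions for illustration:

    import random

    # Assumed model: X ~ Uniform(0, 1) and Y | X = x ~ N(2 * x, 1).
    # Then E[Y | X] = 2 * X, so E[E[Y | X]] = 2 * E[X] = 1 = E[Y].
    n = 100_000
    xs = [random.random() for _ in range(n)]
    ys = [random.gauss(2 * x, 1.0) for x in xs]

    print(sum(ys) / n)                 # E[Y], approx 1
    print(sum(2 * x for x in xs) / n)  # E[E[Y | X]], approx 1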