Stat 5101 Notes: Algorithms (thru 2nd midterm)

Charles J. Geyer

October 18, 2012

Contents

1 Calculating an Expectation or a Probability
  1.1 From a PMF
  1.2 From a PDF
  1.3 From given Expectations using Uncorrelated
  1.4 From given Expectations using Independent
  1.5 From given Probabilities using Inclusion-Exclusion
  1.6 From given Probabilities using Complement Rule
  1.7 From given Probabilities using Independent
  1.8 From given Expectations using Linearity of Expectation
    1.8.1 Expectation of Sum and Average
    1.8.2 Variance of Sum and Average
    1.8.3 Expectation and Variance of Linear Transformation
    1.8.4 Covariance of Linear Transformations
    1.8.5 Expectation and Variance of Vector Linear Transformation
    1.8.6 Short Cut Formulas
  1.9 From Moment Generating Function
  1.10 Convolution Formula

2 Change of Variable Formulas
  2.1 Discrete Distributions
    2.1.1 One-to-one Transformations
    2.1.2 Many-to-one Transformations
  2.2 Continuous Distributions
    2.2.1 One-to-one Transformations
    2.2.2 Alternate Method (Does not Require One-to-one)

3 PMF or PDF and Independence

4 PDF or PMF to DF and Vice Versa
  4.1 DF to PDF
  4.2 PDF to DF
  4.3 PDF to DF (Alternative Method)
  4.4 DF to PMF
  4.5 PMF to DF

5 Quantiles
  5.1 Continuous Random Variables
  5.2 Discrete Random Variables

6 Joint, Marginal, Conditional
  6.1 Joint to Marginal
  6.2 Joint to Conditional
  6.3 Conditional and Marginal to Joint
  6.4 Conditional Probability and Expectation

1 Calculating an Expectation or a Probability

Probability is a special case of expectation (deck 1, slide 62).

1.1 From a PMF

If $f$ is a PMF having domain $S$ (the sample space) and $g$ is any function, then
$$E\{g(X)\} = \sum_{x \in S} g(x) f(x)$$
(deck 1, slide 56 and deck 3, slide 12), and for any event $A$ (a subset of $S$)
$$\Pr(A) = \sum_{x \in S} I_A(x) f(x) = \sum_{x \in A} f(x)$$
(deck 1, slide 62).
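
These two formulas are mechanical enough to state as code. The following is a minimal sketch (not part of the original notes); the fair-die PMF is a made-up example.

```python
# Sketch: expectation and probability from a PMF, E{g(X)} = sum g(x) f(x).
# The fair-die PMF below is a made-up example, not from the notes.

pmf = {x: 1 / 6 for x in range(1, 7)}  # PMF of a fair six-sided die

def expect(g, pmf):
    """E{g(X)} = sum over the sample space of g(x) f(x)."""
    return sum(g(x) * f for x, f in pmf.items())

def prob(event, pmf):
    """Pr(A) = sum of f(x) over x in A."""
    return sum(f for x, f in pmf.items() if event(x))

print(expect(lambda x: x, pmf))        # E(X) = 3.5
print(expect(lambda x: x ** 2, pmf))   # E(X^2) = 91/6
print(prob(lambda x: x >= 5, pmf))     # Pr(X >= 5) = 1/3
```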

1.2 From a PDF

If $f$ is a PDF having domain $S$ (the sample space) and $g$ is any function, then
$$E\{g(X)\} = \int_S g(x) f(x) \, dx$$
(deck 3, slide 66), and for any event $A$ (a subset of $S$)
$$\Pr(A) = \int_S I_A(x) f(x) \, dx = \int_A f(x) \, dx$$
(deck 1, slide 84).

And similarly for random vectors,
$$E\{g(\mathbf{X})\} = \int_S g(\mathbf{x}) f(\mathbf{x}) \, d\mathbf{x}$$
(deck 3, slide 66), and
$$\Pr(A) = \int_S I_A(\mathbf{x}) f(\mathbf{x}) \, d\mathbf{x} = \int_A f(\mathbf{x}) \, d\mathbf{x}$$
(deck 1, slide 84), where the boldface indicates vectors and the integral signs indicate multiple integrals (the same dimension as the dimension of $\mathbf{x}$).

1.3 From given Expectations using Uncorrelated

If $X$ and $Y$ are uncorrelated random variables, then
$$E(XY) = E(X) E(Y)$$
(deck 2, slide 73).
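
The integral formulas in Section 1.2 can be checked numerically. A minimal sketch (not part of the original notes) using scipy.integrate.quad, with an Exp(1) density as a made-up example:

```python
# Sketch: expectation and probability from a PDF by numerical integration.
# Uses scipy.integrate.quad; the Exp(1) density is a made-up example.
from math import exp, inf
from scipy.integrate import quad

f = lambda x: exp(-x)  # Exp(1) PDF, sample space S = (0, infinity)

E_g, _ = quad(lambda x: x ** 2 * f(x), 0, inf)  # E(X^2) = integral of x^2 f(x)
pr_A, _ = quad(f, 1, 2)                         # Pr(1 < X < 2) = integral over A

print(E_g)   # 2.0: for Exp(1), E(X^2) = 2
print(pr_A)  # e^(-1) - e^(-2), about 0.2325
```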

1.4 From given Expectations using Independent

If $X$ and $Y$ are independent random variables and $g$ and $h$ are any functions, then
$$E\{g(X) h(Y)\} = E\{g(X)\} E\{h(Y)\}$$
(deck 2, slide 76). More generally, if $X_1, X_2, \ldots, X_n$ are independent random variables and $g_1, g_2, \ldots, g_n$ are any functions, then
$$E\biggl\{\prod_{i=1}^n g_i(X_i)\biggr\} = \prod_{i=1}^n E\{g_i(X_i)\}$$
(deck 2, slide 76).

1.5 From given Probabilities using Inclusion-Exclusion

$$\begin{aligned}
\Pr(A \cup B) &= \Pr(A) + \Pr(B) - \Pr(A \cap B) \\
\Pr(A \cup B \cup C) &= \Pr(A) + \Pr(B \cup C) - \Pr(A \cap (B \cup C)) \\
&= \Pr(A) + \Pr(B \cup C) - \Pr((A \cap B) \cup (A \cap C)) \\
&= \Pr(A) + \Pr(B) + \Pr(C) - \Pr(B \cap C) - \Pr(A \cap B) - \Pr(A \cap C) + \Pr(A \cap B \cap C)
\end{aligned}$$
and so forth (deck 2, slides 139 and 140).

In the special case that $A_1, \ldots, A_n$ are mutually exclusive events
$$\Pr\biggl(\bigcup_{i=1}^n A_i\biggr) = \sum_{i=1}^n \Pr(A_i)$$
(deck 2, slide 137).

1.6 From given Probabilities using Complement Rule

$$\Pr(A^c) = 1 - \Pr(A)$$
(deck 2, slide 143).

1.7 From given Probabilities using Independent

In the special case that $A_1, \ldots, A_n$ are independent events
$$\Pr\biggl(\bigcap_{i=1}^n A_i\biggr) = \prod_{i=1}^n \Pr(A_i)$$
(deck 2, slide 147).
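
The inclusion-exclusion identity of Section 1.5 is easy to sanity-check by brute-force enumeration over a finite sample space. A sketch (not from the notes; the die and the events A, B, C are made up):

```python
# Sketch: checking inclusion-exclusion by brute force on a finite sample space.
# The sample space (one fair die) and the events A, B, C are made-up examples.
from fractions import Fraction

S = range(1, 7)
pr = lambda E: Fraction(sum(1 for x in S if x in E), 6)

A, B, C = {1, 2, 3}, {2, 4, 6}, {3, 4, 5}

lhs = pr(A | B | C)
rhs = (pr(A) + pr(B) + pr(C)
       - pr(A & B) - pr(A & C) - pr(B & C)
       + pr(A & B & C))
print(lhs, rhs, lhs == rhs)  # 1, 1, True
```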

1.8 From given Expectations using Linearity of Expectation

1.8.1 Expectation of Sum and Average

If $X_1, X_2, \ldots, X_n$ are random variables, then
$$E\biggl\{\sum_{i=1}^n X_i\biggr\} = \sum_{i=1}^n E(X_i)$$
(deck 2, slide 10). In particular, if $X_1, X_2, \ldots, X_n$ all have the same expectation $\mu$, then
$$E\biggl\{\sum_{i=1}^n X_i\biggr\} = n\mu$$
(deck 2, slide 90), and $E(\overline{X}_n) = \mu$, where
$$\overline{X}_n = \frac{1}{n} \sum_{i=1}^n X_i \qquad (1)$$
(deck 2, slide 90).
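
As a quick illustration (not from the notes), these identities are easy to confirm by simulation; the Exp(1) variables and n = 10 are made-up choices.

```python
# Sketch: Monte Carlo check that E(sum X_i) = n mu and E(X_bar_n) = mu.
# Exp(1) variables (mu = 1) and n = 10 are made-up choices.
import numpy as np

rng = np.random.default_rng(0)
n, reps = 10, 200_000
X = rng.exponential(scale=1.0, size=(reps, n))  # each row: X_1, ..., X_n

print(X.sum(axis=1).mean())   # about n * mu = 10
print(X.mean(axis=1).mean())  # about mu = 1
```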

1.8.2 Variance of Sum and Average

If $X_1, X_2, \ldots, X_n$ are random variables, then
$$\begin{aligned}
\operatorname{var}\biggl\{\sum_{i=1}^n X_i\biggr\}
&= \sum_{i=1}^n \sum_{j=1}^n \operatorname{cov}(X_i, X_j) \\
&= \sum_{i=1}^n \operatorname{var}(X_i) + \sum_{i=1}^n \sum_{\substack{j=1 \\ j \ne i}}^n \operatorname{cov}(X_i, X_j) \\
&= \sum_{i=1}^n \operatorname{var}(X_i) + 2 \sum_{i=1}^n \sum_{j=i+1}^n \operatorname{cov}(X_i, X_j)
\end{aligned}$$
(deck 2, slide 71). In particular, if $X_1, X_2, \ldots, X_n$ are uncorrelated, then
$$\operatorname{var}\biggl\{\sum_{i=1}^n X_i\biggr\} = \sum_{i=1}^n \operatorname{var}(X_i)$$
(deck 2, slide 75). More particularly, if $X_1, X_2, \ldots, X_n$ are uncorrelated and all have the same variance $\sigma^2$, then
$$\operatorname{var}\biggl\{\sum_{i=1}^n X_i\biggr\} = n\sigma^2$$
(deck 2, slide 90) and
$$\operatorname{var}(\overline{X}_n) = \frac{\sigma^2}{n}$$
(deck 2, slide 90), where $\overline{X}_n$ is given by (1).

1.8.3 Expectation and Variance of Linear Transformation

If $X$ is a random variable and $a$ and $b$ are constants, then
$$\begin{aligned}
E(a + bX) &= a + b E(X) \\
\operatorname{var}(a + bX) &= b^2 \operatorname{var}(X)
\end{aligned}$$
(deck 2, slide 8 and slide 37).

1.8.4 Covariance of Linear Transformations

If $X$ and $Y$ are random variables and $a$, $b$, $c$, and $d$ are constants, then
$$\operatorname{cov}(a + bX, c + dY) = bd \operatorname{cov}(X, Y)$$
(homework problem 3-7).

1.8.5 Expectation and Variance of Vector Linear Transformation

If $\mathbf{X}$ is a random vector, $\mathbf{a}$ is a constant vector, and $\mathbf{B}$ is a constant matrix such that $\mathbf{a} + \mathbf{B}\mathbf{X}$ makes sense (the dimension of $\mathbf{a}$ and the row dimension of $\mathbf{B}$ are the same, and the dimension of $\mathbf{X}$ and the column dimension of $\mathbf{B}$ are the same), then
$$\begin{aligned}
E(\mathbf{a} + \mathbf{B}\mathbf{X}) &= \mathbf{a} + \mathbf{B} E(\mathbf{X}) \\
\operatorname{var}(\mathbf{a} + \mathbf{B}\mathbf{X}) &= \mathbf{B} \operatorname{var}(\mathbf{X}) \mathbf{B}^T
\end{aligned}$$
(deck 2, slide 64).
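
The vector rules are straightforward to verify by simulation. A sketch (not from the notes) with made-up $\mathbf{a}$, $\mathbf{B}$, $E(\mathbf{X})$, and $\operatorname{var}(\mathbf{X})$:

```python
# Sketch: checking E(a + BX) = a + B E(X) and var(a + BX) = B var(X) B^T.
# The vectors and matrices below are made-up examples.
import numpy as np

rng = np.random.default_rng(1)
mu = np.array([1.0, -2.0, 0.5])
Sigma = np.array([[1.0, 0.3, 0.0],
                  [0.3, 2.0, 0.4],
                  [0.0, 0.4, 1.5]])   # var(X) as a matrix
a = np.array([2.0, -1.0])
B = np.array([[1.0, 0.0, 2.0],
              [0.5, 1.0, 0.0]])

X = rng.multivariate_normal(mu, Sigma, size=300_000)
Y = a + X @ B.T  # each row is a + B x

print(Y.mean(axis=0), a + B @ mu)  # empirical vs. a + B E(X)
print(np.cov(Y.T))                 # empirical var(Y)
print(B @ Sigma @ B.T)             # theoretical B var(X) B^T
```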

1.8.6 Short Cut Formulas

If $X$ is a random variable, then
$$\operatorname{var}(X) = E(X^2) - E(X)^2$$
(deck 2, slide 21). If $X$ and $Y$ are random variables, then
$$\operatorname{cov}(X, Y) = E(XY) - E(X) E(Y)$$
(homework problem 3-7).

1.9 From Moment Generating Function

$$E(X^k) = \varphi^{(k)}(0)$$
(deck 3, slide 20), where $\varphi$ is the moment generating function of the random variable $X$ and the right-hand side denotes the $k$-th derivative of $\varphi$ evaluated at zero.

1.10 Convolution Formula

If $X$ and $Y$ are independent random variables in the same probability model and $Z = X + Y$ and $f_X$, $f_Y$, and $f_Z$ denote the PMFs of these random variables, then
$$f_Z(z) = \sum_x f_X(x) f_Y(z - x)$$
(deck 3, slide 52), where the sum ranges over the possible values of $X$ and $f_Y$ is extended to be zero off of the support of $Y$.
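
In code the convolution formula is a double loop over the two supports. A minimal sketch (not from the notes); two fair dice are a made-up example:

```python
# Sketch: convolution formula f_Z(z) = sum_x f_X(x) f_Y(z - x) for
# independent X and Y; the two fair dice are a made-up example.
from collections import defaultdict

f_X = {x: 1 / 6 for x in range(1, 7)}
f_Y = {y: 1 / 6 for y in range(1, 7)}

f_Z = defaultdict(float)
for x, px in f_X.items():
    for y, py in f_Y.items():
        f_Z[x + y] += px * py  # f_Y(z - x) is zero off the support of Y

print(f_Z[7])  # 6/36, the most likely total of two dice
print(f_Z[2])  # 1/36
```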

2 Change of Variable Formulas

2.1 Discrete Distributions

2.1.1 One-to-one Transformations

If $f_X$ is the PMF of the random variable $X$, if $Y = g(X)$, and $g$ is an invertible function with inverse function $h$ (that is, $X = h(Y)$), then
$$f_Y(y) = f_X[h(y)]$$
and the domain of $f_Y$ is the range of the function $g$ (the set of possible $Y$ values) (deck 2, slide 86).

2.1.2 Many-to-one Transformations

If $f_X$ is the PMF of the random variable $X$ having sample space $S$ and $Y = g(X)$, then
$$f_Y(y) = \sum_{\substack{x \in S \\ g(x) = y}} f_X(x)$$
and the domain of $f_Y$ is the codomain of the function $g$ (deck 2, slide 81).

2.2 Continuous Distributions

2.2.1 One-to-one Transformations

Univariate. If $f_X$ is the PDF of the random variable $X$, if $Y = g(X)$, and $g$ is an invertible function with inverse function $h$ (that is, $X = h(Y)$), then
$$f_Y(y) = f_X[h(y)] \, |h'(y)|$$
and the domain of $f_Y$ is the range of the function $g$ (the set of possible $Y$ values) (deck 3, slide 123).

Linear Transformation (Special Case of Above). If $f_X$ is the PDF of the random variable $X$ defined on the whole real line and $Y = a + bX$ with $b > 0$, then the PDF of $Y$ is
$$f_Y(y) = \frac{1}{b} f_X\biggl(\frac{y - a}{b}\biggr)$$
and the domain of $f_Y$ is the whole real line (deck 3, slide 140).

Multivariate. If $f_{\mathbf{X}}$ is the PDF of the random vector $\mathbf{X}$, if $\mathbf{Y} = g(\mathbf{X})$, and $g$ is an invertible function with inverse function $h$ (that is, $\mathbf{X} = h(\mathbf{Y})$), then
$$f_{\mathbf{Y}}(\mathbf{y}) = f_{\mathbf{X}}[h(\mathbf{y})] \, |\det J(\mathbf{y})|$$
and the domain of $f_{\mathbf{Y}}$ is the range of the function $g$ (the set of possible $\mathbf{Y}$ values) (deck 3, slide 122), where $J(\mathbf{y})$ is the Jacobian matrix of the transformation $h$, the matrix whose $i,j$ component is $\partial h_i(\mathbf{y}) / \partial y_j$, where $h(\mathbf{y}) = \bigl(h_1(\mathbf{y}), \ldots, h_n(\mathbf{y})\bigr)$ (deck 3, slide 121).
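
A Monte Carlo sanity check of the univariate formula (not from the notes): taking $Y = e^X$ with $X$ standard normal, a made-up example, gives $h(y) = \log y$ and $|h'(y)| = 1/y$, so $f_Y$ is the lognormal density.

```python
# Sketch: checking f_Y(y) = f_X(h(y)) |h'(y)| against a histogram of
# simulated Y = exp(X), X ~ N(0, 1) (a made-up example).
import numpy as np

rng = np.random.default_rng(2)
y = np.exp(rng.standard_normal(500_000))  # Y = exp(X)

f_X = lambda t: np.exp(-t ** 2 / 2) / np.sqrt(2 * np.pi)  # PDF of X
f_Y = lambda t: f_X(np.log(t)) / t  # f_X(h(y)) |h'(y)| with h(y) = log y

hist, edges = np.histogram(y, bins=400, range=(0.001, 20.0), density=True)
mid = 0.5 * (edges[:-1] + edges[1:])
i = np.argmin(np.abs(mid - 1.0))
print(hist[i], f_Y(mid[i]))  # histogram estimate vs. formula near y = 1
```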

2.2.2 Alternate Method (Does not Require One-to-one)

If $f_X$ is the PDF of the random variable $X$ and $Y = g(X)$, then the DF of $Y$ is
$$F_Y(y) = \Pr(Y \le y) = \Pr\{g(X) \le y\}$$
(deck 3, slide 154), and then the PDF of $Y$ can be found by differentiation (Section 4.1 below).

3 PMF or PDF and Independence

If $X_1, \ldots, X_n$ are independent random variables having PMFs or PDFs $f_1, \ldots, f_n$, then
$$f(x_1, \ldots, x_n) = \prod_{i=1}^n f_i(x_i)$$
(deck 1, slide 98 and deck 3, slide 115). In the terminology of deck 5 (Section 6 below) this says the joint is the product of the marginals (when the components are independent).

In particular, if $X_1, \ldots, X_n$ are independent and identically distributed random variables having PMF or PDF $h$, then
$$f(x_1, \ldots, x_n) = \prod_{i=1}^n h(x_i).$$
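
For discrete components the product structure of Section 3 is just an outer product of marginal PMF vectors. A sketch (not from the notes) with two made-up marginals:

```python
# Sketch: for independent discrete components, the joint PMF is the outer
# product of the marginals; the two marginals here are made-up examples.
import numpy as np

f1 = np.array([0.2, 0.5, 0.3])  # marginal PMF of X_1
f2 = np.array([0.6, 0.4])       # marginal PMF of X_2

joint = np.outer(f1, f2)        # f(x1, x2) = f1(x1) f2(x2)
print(joint)
print(joint.sum())              # 1.0, as a joint PMF must sum to
```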

4 PDF or PMF to DF and Vice Versa

4.1 DF to PDF

If $F$ is the DF of a continuous random variable $X$, and $f$ is defined by
$$f(x) = F'(x)$$
at points $x$ where $F$ is differentiable (and may be arbitrarily defined at other points), then $f$ is a PDF of $X$ (deck 3, slide 105) defined on the whole real line.

4.2 PDF to DF

If $f$ is the PDF of a continuous random variable defined on the whole real line and $F$ is the corresponding DF, then
$$F(x) = \Pr(X \le x) = \int_{-\infty}^x f(s) \, ds$$
(deck 3, slides 92 and 104).

4.3 PDF to DF (Alternative Method)

If $f$ is the PDF of a random variable defined by case splitting, say
$$f(x) = \begin{cases}
f_1(x), & x < a_1 \\
f_2(x), & a_1 < x < a_2 \\
f_3(x), & a_2 < x < a_3 \\
f_4(x), & a_3 < x
\end{cases}$$
then the DF is
$$F(x) = \int f_i(x) \, dx$$
on each of the four intervals, the indefinite integral containing an arbitrary constant, which is different for each interval, and the constants must be adjusted to make $F$ have the properties of a DF given on deck 3, slides 112 and 113, that is, $F$ is continuous, goes to zero as $x \to -\infty$, and goes to one as $x \to +\infty$.
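
The constants-adjustment of Section 4.3 is what a numerical cumulative integral does automatically. A sketch (not from the notes) approximating Section 4.2's integral with a cumulative trapezoid rule; the standard normal density is a made-up example:

```python
# Sketch: PDF to DF numerically, F(x) = integral of f(s) ds up to x,
# approximated on a grid; the standard normal density is a made-up example.
import numpy as np

x = np.linspace(-6.0, 6.0, 2001)
f = np.exp(-x ** 2 / 2) / np.sqrt(2 * np.pi)  # standard normal PDF

# cumulative trapezoid rule: F[i] approximates the integral of f up to x[i]
F = np.concatenate([[0.0], np.cumsum((f[1:] + f[:-1]) / 2 * np.diff(x))])

print(F[np.searchsorted(x, 0.0)])  # about 0.5, by symmetry of the density
print(F[-1])                       # about 1.0, the total probability
```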

4.4 DF to PMF

If $f$ is the PMF of a discrete random variable defined on the whole real line and $F$ is the corresponding DF, then $f(x)$ is the size of the jump $F$ makes at $x$, that is,
$$f(x) = F(x) - \sup_{y < x} F(y)$$
(deck 3, slide 109). In particular, $f(x) = 0$ whenever $F$ is continuous at $x$.

4.5 PMF to DF

If $f : S \to \mathbb{R}$ is the PMF of a discrete random variable and $F$ is the corresponding DF, then
$$F(x) = \Pr(X \le x) = \sum_{\substack{s \in S \\ s \le x}} f(s)$$
(deck 3, slide 109).

5 Quantiles

5.1 Continuous Random Variables

If $F$ is the DF of a continuous random variable, then the $p$-th quantile of this random variable or this distribution is any $x$ satisfying
$$F(x) = p$$
and the solution is unique if $F$ is differentiable at $x$ with $F'(x) > 0$ (deck 4, slide 2). The solution is non-unique only if $F$ has a flat section with value $p$, that is, if there exist points $a$ and $b$ with $a < b$ such that
$$F(x) = p, \qquad a \le x \le b,$$
in which case every $x \in [a, b]$ is a $p$-th quantile.

5.2 Discrete Random Variables

If $F$ is the DF of a discrete random variable, then the $p$-th quantile of this random variable or this distribution is any $x$ satisfying
$$F(y) \le p \le F(x), \qquad \text{for all } y < x$$
(deck 4, slide 2). There are two cases. The solution is unique if we do not have $F(x) = p$ for any $x$. In that case the $p$-th quantile is the point $x$ where $F$ jumps past $p$, that is,
$$F(y) < p < F(x), \qquad y < x.$$
The solution is non-unique if we do have $F(x) = p$ for some $x$, in which case (because the DF of a discrete random variable is a step function) we must have $F(x) = p$ for all $x$ in an interval, and every such $x$ is a $p$-th quantile.
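
Computationally, the discrete case amounts to scanning the DF for the first jump past $p$. A sketch (not from the notes); the loaded-die PMF is made up, and when $F(x) = p$ exactly the code returns the left endpoint of the interval of solutions.

```python
# Sketch: p-th quantile of a discrete distribution from its DF.
# The loaded-die PMF is a made-up example.
import numpy as np

support = np.array([1, 2, 3, 4, 5, 6])
pmf = np.array([0.1, 0.1, 0.2, 0.2, 0.2, 0.2])
F = np.cumsum(pmf)  # DF evaluated at the support points

def quantile(p):
    """Smallest x in the support with F(x) >= p."""
    return support[np.searchsorted(F, p)]

print(quantile(0.5))   # 4, since F(3) = 0.4 < 0.5 <= F(4) = 0.6
print(quantile(0.25))  # 3
```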

6 Joint, Marginal, Conditional

The probability distribution of a random vector is often called the joint distribution of the components of the random vector. In this context the probability distribution of a random vector comprising some subset of these components is called a marginal distribution.

The terms joint and marginal have meaning only in context. They have no context-independent meaning. We can say $f(x, y, z)$ is the joint PDF of $X$, $Y$, and $Z$, and $f(x, y)$, $f(y, z)$, and $f(x, z)$ are three marginal PDFs. So in one context $f(x, y, z)$ is a joint distribution and $f(x, y)$ is a marginal. But in another context we can say $f(x, y)$ is a joint distribution and $f(x)$ and $f(y)$ are its marginals. (Pedantically, each $f$ here should have a different letter or distinguishing subscript because each is a different function.)

6.1 Joint to Marginal

To derive a marginal PMF or PDF from a joint PMF or PDF, sum or integrate out the variable(s) you don't want (sum if discrete, integrate if continuous) (deck 5, slides 4-9).

Given the joint of $X$ and $Y$, to obtain the marginal of $X$ sum out $y$
$$f_X(x) = \sum_y f_{X,Y}(x, y)$$
or integrate out $y$
$$f_X(x) = \int_{-\infty}^{\infty} f_{X,Y}(x, y) \, dy$$
as the case may be.

Given the joint of $X$, $Y$, and $Z$, to obtain the marginal of $X$ sum out $y$ and $z$
$$f_X(x) = \sum_y \sum_z f_{X,Y,Z}(x, y, z)$$
or integrate out $y$ and $z$
$$f_X(x) = \iint f_{X,Y,Z}(x, y, z) \, dy \, dz$$
as the case may be.

Given the joint of $X$, $Y$, and $Z$, to obtain the (bivariate) marginal of $X$ and $Y$ sum out $z$
$$f_{X,Y}(x, y) = \sum_z f_{X,Y,Z}(x, y, z)$$

or integrate out $z$
$$f_{X,Y}(x, y) = \int_{-\infty}^{\infty} f_{X,Y,Z}(x, y, z) \, dz$$
as the case may be. And so forth (many special cases of the general principle could be given).

6.2 Joint to Conditional

$$\text{conditional} = \frac{\text{joint}}{\text{marginal}}$$
and the marginal is the marginal for the variable(s) behind the bar in the conditional
$$\begin{aligned}
f(x \mid y) &= \frac{f(x, y)}{f(y)} \\
f(w, x \mid y, z) &= \frac{f(w, x, y, z)}{f(y, z)}
\end{aligned}$$
and so forth (pedantically, each $f$ here should have a different letter or distinguishing subscript because each is a different function).

6.3 Conditional and Marginal to Joint

$$\text{joint} = \text{conditional} \times \text{marginal}$$
and the marginal is the marginal for the variable(s) behind the bar in the conditional
$$\begin{aligned}
f(x, y) &= f(x \mid y) f(y) \\
f(w, x, y, z) &= f(w, x \mid y, z) f(y, z)
\end{aligned}$$
and so forth (pedantically, each $f$ here should have a different letter or distinguishing subscript because each is a different function). Note that the equations in this section are the equations in the preceding section rearranged.
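
For discrete distributions, Sections 6.1 and 6.2 reduce to row and column operations on a table of joint probabilities. A sketch (not from the notes) with a made-up 3 by 2 joint PMF:

```python
# Sketch: joint to marginal and joint to conditional for a discrete joint
# PMF stored as an array; the 3x2 table of f(x, y) is a made-up example.
import numpy as np

joint = np.array([[0.10, 0.20],   # f(x, y), rows indexed by x, columns by y
                  [0.30, 0.10],
                  [0.15, 0.15]])

f_X = joint.sum(axis=1)  # sum out y: marginal of X
f_Y = joint.sum(axis=0)  # sum out x: marginal of Y

cond_X_given_Y = joint / f_Y  # f(x | y) = f(x, y) / f(y), column by column

print(f_X)                         # [0.30 0.40 0.30]
print(cond_X_given_Y[:, 0])        # f(x | y = first value), sums to 1
print(cond_X_given_Y.sum(axis=0))  # [1. 1.]
```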

6.4 Conditional Probability and Expectation

Everything in Sections 1 to 5 above can be redone for conditional probability distributions. The variables behind the bar in the conditional distribution just go along for the ride, like parameters in unconditional distributions (deck 5, slides 11-22).

Calculating expectations (compare with Section 1 above)
$$\begin{aligned}
E\{g(X) \mid Y\} &= \int g(x) f(x \mid y) \, dx \\
E\{g(W, X) \mid Y, Z\} &= \iint g(w, x) f(w, x \mid y, z) \, dw \, dx
\end{aligned}$$
(replace integrals with sums if $W$ and $X$ are discrete).

Calculating probabilities (compare with Section 1 above)
$$\begin{aligned}
\Pr\{X \in A \mid Y\} &= \int_A f(x \mid y) \, dx \\
\Pr\{(W, X) \in A \mid Y, Z\} &= \iint I_A(w, x) f(w, x \mid y, z) \, dw \, dx
\end{aligned}$$
(replace integrals with sums if $W$ and $X$ are discrete).

PDF to DF (compare with Section 4.2 above)
$$F(x \mid y) = \int_{-\infty}^x f_{X \mid Y}(s \mid y) \, ds$$
(replace the integral with a sum if $X$ is discrete).

Quantiles (compare with Section 5 above): if $F$ is the conditional DF of $X$ given $Y$ and $X$ is continuous, then the $p$-th quantile of this distribution is the $x$ that satisfies $F(x \mid y) = p$.
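
As a final sketch (not from the notes), conditional expectation from a discrete joint PMF, one conditional distribution per column; the table is made up.

```python
# Sketch: E{g(X) | Y = y} = sum_x g(x) f(x | y) from a discrete joint PMF.
# The support of X and the table of f(x, y) are made-up examples.
import numpy as np

x_vals = np.array([0.0, 1.0, 2.0])  # support of X
joint = np.array([[0.05, 0.15],     # f(x, y), rows indexed by x, columns by y
                  [0.25, 0.20],
                  [0.20, 0.15]])

f_Y = joint.sum(axis=0)  # marginal of Y (the variable behind the bar)
cond = joint / f_Y       # f(x | y), one column per value of y

print(x_vals @ cond)                                # E(X | Y = y) for each y
print((x_vals ** 2) @ cond - (x_vals @ cond) ** 2)  # var(X | Y = y) by shortcut
```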