Expected Value and Variance

MATH 382 Expected Value and Variance
Dr. Neal, WKU

We now shall discuss how to find the average and standard deviation of a random variable \(X\).

Expected Value

Definition. The expected value (or average value, or mean \(\mu\)) of a discrete random variable \(X\) is a weighted average given by

\[ \mu = E[X] = \sum_{k} k \, P(X = k). \]

Note: If \(X\) has a finite range \(\{k_1, k_2, \ldots, k_n\}\), then its average value is

\[ E[X] = k_1 P(X = k_1) + k_2 P(X = k_2) + \cdots + k_n P(X = k_n), \]

which is always a finite sum. But if Range \(X\) is denumerable \(\{k_1, k_2, \ldots, k_n, \ldots\}\), then \(E[X]\) is an infinite sum and may or may not converge.

Example 1.
(a) Roll a fair six-sided die; record the value. What is the average value?
(b) Roll a fair six-sided die over and over until you roll a 3. You win \(2^n\) dollars if you roll a 3 for the first time on the \(n\)-th roll. What are your average earnings?
(c) Roll a single six-sided die. If you roll a 1, then you get $5. If you roll a 2 or 3, then you get $10. But if you roll a 4, 5, or 6, then you lose $20. Let \(X\) denote your change in fortune. Compute \(E[X]\).

Solution.
(a) Let \(X\) be the value of the die. Then Range \(X = \{1, 2, 3, 4, 5, 6\}\), and each value in the range occurs with probability \(1/6\). Thus,

\[ E[X] = \sum_{k=1}^{6} k \, P(X = k) = \frac{1 + 2 + 3 + 4 + 5 + 6}{6} = \frac{21}{6} = 3.5. \]

(b) Let \(X\) be your earnings. Then Range \(X = \{2, 2^2, 2^3, \ldots\}\), and for \(n \ge 1\) we have \(P(X = 2^n) = \left(\frac{5}{6}\right)^{n-1} \frac{1}{6}\) (from \(n - 1\) losses followed by the win on the \(n\)-th roll). Thus,

\[ E[X] = \sum_{k} k \, P(X = k) = \sum_{n=1}^{\infty} 2^n P(X = 2^n) = \sum_{n=1}^{\infty} 2^n \left(\frac{5}{6}\right)^{n-1} \frac{1}{6} = \frac{1}{3} \sum_{n=1}^{\infty} \left(\frac{5}{3}\right)^{n-1} = \infty. \]

Although you would expect the game to end at some point at which you receive your payoff, the average payoff is infinite.
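The two computations above can be reproduced with exact rational arithmetic. The following is a minimal sketch, not part of the original notes; the names `expected_value` and `partial_payoff` are illustrative. It evaluates the finite sum for the fair die and a few partial sums of the series from part (b), which keep growing rather than settling toward a limit.

```python
from fractions import Fraction

def expected_value(pmf):
    """Sum of k * P(X = k) over a finite pmf given as {value: probability}."""
    return sum(k * p for k, p in pmf.items())

# Example 1(a): fair six-sided die, each face with probability 1/6.
die = {k: Fraction(1, 6) for k in range(1, 7)}
die_mean = expected_value(die)

def partial_payoff(terms):
    """Partial sum of 2^n * (5/6)^(n-1) * (1/6) for n = 1..terms (Example 1(b))."""
    return sum(
        Fraction(2) ** n * Fraction(5, 6) ** (n - 1) * Fraction(1, 6)
        for n in range(1, terms + 1)
    )

# The partial sums grow without bound, consistent with an infinite average payoff.
partial_sums = [partial_payoff(t) for t in (1, 5, 10, 20)]

print(die_mean)
print([float(s) for s in partial_sums])
```

Using `Fraction` avoids floating-point rounding, so the finite sum can be compared exactly with the hand computation.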

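(c) We have Range \(X = \{-20, 5, 10\}\) with the pdf of \(X\) given by

\[ f(-20) = P(X = -20) = \frac{3}{6}, \qquad f(5) = P(X = 5) = \frac{1}{6}, \qquad f(10) = P(X = 10) = \frac{2}{6}. \]

Thus,

\[ E[X] = \sum_{k} k \, P(X = k) = -20 \cdot \frac{3}{6} + 5 \cdot \frac{1}{6} + 10 \cdot \frac{2}{6} = -\frac{35}{6} \approx -\$5.83. \]

Conditional Average

Definition. Let \(A\) be an event with \(P(A) > 0\). The conditional average of a discrete random variable \(X\) given \(A\) is defined by

\[ E[X \mid A] = \sum_{k \in \text{Range } X} k \, P(X = k \mid A) = \frac{1}{P(A)} \sum_{k} k \, P(\{X = k\} \cap A), \]

where \(P(X = k \mid A) = \dfrac{P(\{X = k\} \cap A)}{P(A)}\).

Theorem 1. (Law of Total Average) Suppose non-null events \(A_1, A_2, \ldots\) form a partition of \(\Omega\). That is, the events are disjoint and their union is all of \(\Omega\). Then for any discrete random variable \(X\), we have

\[ E[X] = E[X \mid A_1] P(A_1) + E[X \mid A_2] P(A_2) + \cdots = \sum_{i} E[X \mid A_i] P(A_i). \]

Proof. Using properties of series and the probability measure \(P\), we have

\[
\begin{aligned}
\sum_{i} E[X \mid A_i] P(A_i)
&= \sum_{i} \sum_{k} k \, P(X = k \mid A_i) P(A_i) \\
&= \sum_{i} \sum_{k} k \, \frac{P(\{X = k\} \cap A_i)}{P(A_i)} P(A_i) \\
&= \sum_{i} \sum_{k} k \, P(\{X = k\} \cap A_i) \\
&= \sum_{k} k \sum_{i} P(\{X = k\} \cap A_i) && \text{(rearrange the sum)} \\
&= \sum_{k} k \, P\Big( \bigcup_{i} \big( \{X = k\} \cap A_i \big) \Big) = \sum_{k} k \, P\Big( \{X = k\} \cap \bigcup_{i} A_i \Big) && \text{(by disjointness of partition)} \\
&= \sum_{k} k \, P(X = k) = E[X].
\end{aligned}
\]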
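Theorem 1 can be checked numerically on a small case. The sketch below is not from the original notes: it assumes a fair die with \(X\) equal to the face value, and it uses a hypothetical two-event partition (odd rolls, even rolls) chosen only for illustration. The helper names `prob` and `conditional_mean` are likewise illustrative.

```python
from fractions import Fraction

# Fair die; X is the face value, so outcomes and values of X coincide.
pmf = {k: Fraction(1, 6) for k in range(1, 7)}

def prob(event, pmf):
    """P(A) for an event A given as a set of outcomes."""
    return sum(p for k, p in pmf.items() if k in event)

def conditional_mean(event, pmf):
    """E[X | A] = (1 / P(A)) * sum of k * P({X = k} and A)."""
    p_event = prob(event, pmf)
    if p_event == 0:
        raise ValueError("conditioning event must have positive probability")
    return sum(k * p for k, p in pmf.items() if k in event) / p_event

# Hypothetical partition of the sample space: odd rolls and even rolls.
partition = [{1, 3, 5}, {2, 4, 6}]

# Right-hand side of the Law of Total Average.
total = sum(conditional_mean(A, pmf) * prob(A, pmf) for A in partition)

# Left-hand side: the ordinary expected value.
direct = sum(k * p for k, p in pmf.items())

print(total, direct)
```

Any other list of disjoint, non-null events covering {1, ..., 6} can be substituted for `partition`; the two printed values should still agree.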

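Example 2. Roll one die. Let \(A_1\) = "Roll a 1", \(A_2\) = "Roll a 2, 3 or 5", \(A_3\) = "Roll a 4 or 6". Let \(X\) = the value of the roll. What are \(E[X \mid A_i]\)? Use these values to compute \(E[X]\).

Solution. First, we have

\[ E[X \mid A_1] = \sum_{k} k \, P(X = k \mid A_1) = 1 \cdot P(X = 1 \mid A_1) = 1 \cdot \frac{P(\{X = 1\} \cap A_1)}{P(A_1)} = 1 \cdot \frac{P(A_1)}{P(A_1)} = 1. \]

In other words, the average of the rolls that are 1 is 1. Next,

\[
\begin{aligned}
E[X \mid A_2] &= 2 P(X = 2 \mid A_2) + 3 P(X = 3 \mid A_2) + 5 P(X = 5 \mid A_2) \\
&= 2 \, \frac{P(\{X = 2\} \cap A_2)}{P(A_2)} + 3 \, \frac{P(\{X = 3\} \cap A_2)}{P(A_2)} + 5 \, \frac{P(\{X = 5\} \cap A_2)}{P(A_2)} \\
&= 2 \, \frac{P(X = 2)}{P(A_2)} + 3 \, \frac{P(X = 3)}{P(A_2)} + 5 \, \frac{P(X = 5)}{P(A_2)} \\
&= 2 \cdot \frac{1/6}{1/2} + 3 \cdot \frac{1/6}{1/2} + 5 \cdot \frac{1/6}{1/2} = (2 + 3 + 5) \cdot \frac{1}{3} = \frac{10}{3}.
\end{aligned}
\]

That is, given that you've rolled a 2, 3, or 5, the average roll is \(10/3\). Lastly,

\[
\begin{aligned}
E[X \mid A_3] &= 4 P(X = 4 \mid A_3) + 6 P(X = 6 \mid A_3) \\
&= 4 \, \frac{P(\{X = 4\} \cap A_3)}{P(A_3)} + 6 \, \frac{P(\{X = 6\} \cap A_3)}{P(A_3)} = 4 \, \frac{P(X = 4)}{P(A_3)} + 6 \, \frac{P(X = 6)}{P(A_3)} \\
&= 4 \cdot \frac{1/6}{1/3} + 6 \cdot \frac{1/6}{1/3} = (4 + 6) \cdot \frac{1}{2} = 5.
\end{aligned}
\]

Then,

\[ E[X] = \sum_{i=1}^{3} E[X \mid A_i] P(A_i) = 1 \cdot P(A_1) + \frac{10}{3} P(A_2) + 5 P(A_3) = 1 \cdot \frac{1}{6} + \frac{10}{3} \cdot \frac{1}{2} + 5 \cdot \frac{1}{3} = 3.5. \]

Properties of Expected Value

We now prove some important and often used properties of the expected value for discrete random variables. The proof of property (ii) uses the preceding concept of conditional expectation.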

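Theorem 2.
(i) The expected value of a constant is that constant: \(E[c] = c\).
(ii) Expected value is a linear operator on discrete random variables; that is, constants factor out of an expected value and the expected value of a sum is the sum of the expected values: \(E[cX] = c\,E[X]\) and \(E[X + Y] = E[X] + E[Y]\).
(iii) If \(X \ge 0\), then \(E[X] \ge 0\). In particular, \(E[Z^2] \ge 0\) for any random variable \(Z\).

Proof. (i) Let \(X \equiv c\) with probability 1. Then \(E[c] = c \cdot P(X = c) = c \cdot 1 = c\).

(ii) If \(c = 0\), then \(E[cX] = E[0] = 0 = 0 \cdot E[X]\). Otherwise, if \(X\) has range with distinct values \(\{k_1, k_2, \ldots\}\), then \(cX\) has range with distinct values \(\{ck_1, ck_2, \ldots\}\) and \(P(cX = ck_i) = P(X = k_i)\). Hence,

\[ E[cX] = \sum_{i} c k_i \, P(cX = c k_i) = c \sum_{i} k_i \, P(X = k_i) = c\,E[X]. \]

Next, let \(X\) have range with distinct values \(\{k_1, k_2, \ldots\}\) and let \(Y\) have range with distinct values \(\{\ell_1, \ell_2, \ldots\}\). Then the range of \(X + Y\) is the collection of possible sums \(\{k_1 + \ell_1, k_1 + \ell_2, \ldots, k_2 + \ell_1, k_2 + \ell_2, \ldots, k_j + \ell_1, k_j + \ell_2, \ldots\}\) (with some of these values possibly repeating). Then

\[
\begin{aligned}
E[X + Y] &= \sum_{k} k \, P(X + Y = k) = \sum_{i} \sum_{j} (k_i + \ell_j) \, P(X = k_i, Y = \ell_j) \\
&= \sum_{i} \sum_{j} k_i \, P(X = k_i, Y = \ell_j) + \sum_{i} \sum_{j} \ell_j \, P(X = k_i, Y = \ell_j) \\
&= \sum_{i} k_i \sum_{j} P(X = k_i, Y = \ell_j) + \sum_{i} \sum_{j} \ell_j \, P(X = k_i) \, P(Y = \ell_j \mid X = k_i) \\
&= \sum_{i} k_i \, P\Big( \bigcup_{j} \{X = k_i\} \cap \{Y = \ell_j\} \Big) + \sum_{i} P(X = k_i) \sum_{j} \ell_j \, P(Y = \ell_j \mid X = k_i) \\
&= \sum_{i} k_i \, P(X = k_i) + \sum_{i} P(X = k_i) \, E[Y \mid X = k_i] \\
&= E[X] + E[Y],
\end{aligned}
\]

where the last step applies Theorem 1 to the partition formed by the events \(\{X = k_i\}\).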

(iii) Assume that every value \(k\) in the range of \(X\) is non-negative. Because \(P(X = k)\) also is non-negative for every \(k\), we have \(E[X] = \sum_{k} k \, P(X = k) \ge 0\).

Example 3. Find the average sum when rolling two dice.

Solution. (Hard way) Let \(X_1\) be the value of one die and let \(X_2\) be the value of the other die. The sum \(X_1 + X_2\) has range \(\{2, 3, 4, \ldots, 12\}\). Then,

\[ E[X_1 + X_2] = \sum_{k=2}^{12} k \, P(X_1 + X_2 = k) = 2 \cdot \frac{1}{36} + 3 \cdot \frac{2}{36} + \cdots + 7 \cdot \frac{6}{36} + \cdots + 12 \cdot \frac{1}{36} = \frac{252}{36} = 7. \]

(Easier) By linearity we have \(E[X_1 + X_2] = E[X_1] + E[X_2] = 3.5 + 3.5 = 7\).

Variance and Standard Deviation

Definition. Let \(X\) be a random variable with \(\mu = E[X]\) being finite. The variance of \(X\) is then defined by

\[ \sigma^2 = \operatorname{Var}(X) = E\big[(X - E[X])^2\big] = E\big[(X - \mu)^2\big] = \sum_{k} (k - \mu)^2 \, P(X = k). \]

Because the random variable \((X - E[X])^2\) is non-negative, its expected value is non-negative. That is, \(\operatorname{Var}(X) \ge 0\). Thus, we can define the standard deviation of \(X\) by \(\sigma = \sqrt{\operatorname{Var}(X)}\).

The formal definition of variance is rarely used in practice. When computing the variance of a discrete random variable, it is more common to use the following result:

Theorem 3. Let \(X\) be a discrete random variable. Then \(\operatorname{Var}(X) = E[X^2] - (E[X])^2\).
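As a numerical sanity check of the formula in Theorem 3, the sketch below (not part of the original notes) builds the distribution of the sum of two fair dice by enumeration and computes the variance both from the definition and from the shortcut \(E[X^2] - (E[X])^2\). The variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# pmf of the sum of two fair dice, built from the 36 equally likely pairs.
pmf = {}
for a, b in product(range(1, 7), repeat=2):
    pmf[a + b] = pmf.get(a + b, Fraction(0)) + Fraction(1, 36)

# Mean of the sum, computed the "hard way" of Example 3.
mean = sum(k * p for k, p in pmf.items())

# Variance from the definition: sum of (k - mu)^2 * P(X = k).
var_definition = sum((k - mean) ** 2 * p for k, p in pmf.items())

# Variance from the shortcut: E[X^2] - (E[X])^2.
var_shortcut = sum(k ** 2 * p for k, p in pmf.items()) - mean ** 2

print(mean, var_definition, var_shortcut)
```

Both variance computations use exact fractions, so they can be compared with `==` rather than with a floating-point tolerance.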

Proof. Using \(\mu = E[X]\) and the properties of expected value we have

\[
\begin{aligned}
\operatorname{Var}(X) &= E\big[(X - \mu)^2\big] = E[X^2 - 2\mu X + \mu^2] = E[X^2] - 2\mu E[X] + E[\mu^2] \\
&= E[X^2] - 2\mu^2 + \mu^2 = E[X^2] - \mu^2 = E[X^2] - (E[X])^2.
\end{aligned}
\]

In order to apply this result, we first compute \(E[X] = \sum_{k} k \, P(X = k)\), then compute \(E[X^2]\) by \(E[X^2] = \sum_{k} k^2 \, P(X = k)\). Then \(\operatorname{Var}(X) = E[X^2] - (E[X])^2\) and \(\sigma = \sqrt{\operatorname{Var}(X)}\).

Example 4. Roll a single six-sided die. If you roll a 1, then you get $5. If you roll a 2 or 3, then you get $10. But if you roll a 4, 5, or 6, then you lose $20. Let \(X\) denote your change in fortune. Compute the standard deviation of \(X\).

Solution. As in Ex. 1(c), we have \(P(X = -20) = \frac{3}{6}\), \(P(X = 5) = \frac{1}{6}\), \(P(X = 10) = \frac{2}{6}\) and

\[ E[X] = -20 \cdot \frac{3}{6} + 5 \cdot \frac{1}{6} + 10 \cdot \frac{2}{6} = -\frac{35}{6} \approx -5.833. \]

Then

\[ E[X^2] = (-20)^2 \cdot \frac{3}{6} + 5^2 \cdot \frac{1}{6} + 10^2 \cdot \frac{2}{6} = \frac{1425}{6}. \]

Thus,

\[ \operatorname{Var}(X) = E[X^2] - (E[X])^2 = \frac{1425}{6} - \left(\frac{35}{6}\right)^2 = \frac{7325}{36}, \qquad \sigma = \sqrt{\frac{7325}{36}} \approx 14.264. \]

Here we have \(-20\) occurring half the time, 5 occurring one-sixth of the time, and 10 occurring one-third of the time. The average value is then about \(-5.83\). The standard deviation of 14.264 gives a way of measuring the average spread from the mean. The actual values in the range, \(-20\), 5, 10, all differ from the single average value of \(-5.83\). And in this case, the average spread from \(-5.83\) is about 14.264.

Example 5. Let \(m\) and \(n\) be non-negative integers with \(m \le n\). Let \(X\) be a randomly chosen integer from \(m\) to \(n\). Find the mean and variance of \(X\).

Solution. The range of \(X\) is the set of integers \(\{m, \ldots, n\}\) that has \(n - m + 1\) elements. Since \(X\) is chosen randomly, each value in the range occurs with the equal probability of \(1/(n - m + 1)\). Thus, the expected value of \(X\) is

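\[
\begin{aligned}
E[X] &= \sum_{k=m}^{n} k \cdot \frac{1}{n - m + 1} = \frac{1}{n - m + 1} \left( \sum_{k=1}^{n} k - \sum_{k=1}^{m-1} k \right) = \frac{1}{n - m + 1} \left( \frac{n(n + 1)}{2} - \frac{(m - 1)m}{2} \right) \\
&= \frac{1}{n - m + 1} \cdot \frac{n^2 - m^2 + n + m}{2} = \frac{1}{n - m + 1} \cdot \frac{(n + m)(n - m) + n + m}{2} \\
&= \frac{1}{n - m + 1} \cdot \frac{(n + m)(n - m + 1)}{2} = \frac{m + n}{2}.
\end{aligned}
\]

The variance of \(X\) is given by

\[
\begin{aligned}
\operatorname{Var}(X) &= E[X^2] - (E[X])^2 = \sum_{k=m}^{n} k^2 \, P(X = k) - \left( \frac{m + n}{2} \right)^2 \\
&= \frac{1}{n - m + 1} \left( \sum_{k=1}^{n} k^2 - \sum_{k=1}^{m-1} k^2 \right) - \left( \frac{m + n}{2} \right)^2 \\
&= \frac{1}{n - m + 1} \left( \frac{n(n + 1)(2n + 1)}{6} - \frac{(m - 1)m(2m - 1)}{6} \right) - \left( \frac{m + n}{2} \right)^2 \\
&= \frac{(n - m)(n - m + 2)}{12}.
\end{aligned}
\]

Independent Random Variables

Intuitively, we think of random variables \(X\) and \(Y\) as being independent when their values are not affected by each other. That is, the fact that \(X = a\) has no bearing on whether \(Y = b\), and vice versa. We formally define independence as follows:

Definition. Random variables \(X_1, X_2, \ldots, X_n\) are independent if and only if

\[ P\big( \{X_{i_1} = a_1\} \cap \{X_{i_2} = a_2\} \cap \cdots \cap \{X_{i_k} = a_k\} \big) = \prod_{j=1}^{k} P(X_{i_j} = a_j) \]

for all possible subsequences \(X_{i_1}, \ldots, X_{i_k}\) and all values \(a_1, \ldots, a_k\).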
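For two random variables on a small finite sample space, the product condition in the definition can be checked by brute-force enumeration. The sketch below is not from the original notes: it assumes two fair dice with 36 equally likely outcomes, and the helper names `prob` and `independent` are illustrative. It confirms that the two face values are independent, while the first face value and the sum of both are not.

```python
from fractions import Fraction
from itertools import product

# Sample space: 36 equally likely ordered pairs from two fair dice.
outcomes = list(product(range(1, 7), repeat=2))
p = Fraction(1, len(outcomes))

def prob(predicate):
    """Probability of the event {w : predicate(w) is true}."""
    return sum(p for w in outcomes if predicate(w))

def independent(f, g):
    """Check P(f = a, g = b) == P(f = a) * P(g = b) for all values a, b."""
    f_values = {f(w) for w in outcomes}
    g_values = {g(w) for w in outcomes}
    return all(
        prob(lambda w: f(w) == a and g(w) == b)
        == prob(lambda w: f(w) == a) * prob(lambda w: g(w) == b)
        for a in f_values
        for b in g_values
    )

def first(w):
    return w[0]

def second(w):
    return w[1]

def total(w):
    return w[0] + w[1]

dice_independent = independent(first, second)  # the two faces
sum_independent = independent(first, total)    # first face versus the sum

print(dice_independent, sum_independent)
```

The second check fails because, for example, a first roll of 1 rules out a total of 12, so the joint probability is 0 while the product of the marginals is not.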

Example 6. Flip a coin 3 times. Let \(X_i\) count the number of Heads on the \(i\)-th flip. Then \(X_i\) is either 1 or 0 and the \(X_i\) are independent, and the probability of getting 3 Heads in a row is

\[ P(X_1 = 1 \cap X_2 = 1 \cap X_3 = 1) = P(X_1 = 1) \, P(X_2 = 1) \, P(X_3 = 1) = \frac{1}{2} \cdot \frac{1}{2} \cdot \frac{1}{2} = \frac{1}{8}. \]

Note: For convenience, we often write \(P(X_i = a \cap X_j = b)\) as \(P(X_i = a, X_j = b)\), and we denote that \(X\) and \(Y\) are independent by \(X \perp Y\).

Theorem 4. Let \(X\) and \(Y\) be independent. Then \(E[XY] = E[X]\,E[Y]\).

Proof. Let \(X\) have range with distinct values \(\{k_1, k_2, \ldots\}\) and let \(Y\) have range with distinct values \(\{\ell_1, \ell_2, \ldots\}\). Then the range of \(XY\) is the collection of possible products \(\{k_1 \ell_1, k_1 \ell_2, \ldots, k_2 \ell_1, k_2 \ell_2, \ldots, k_j \ell_1, k_j \ell_2, \ldots\}\). Then,

\[
\begin{aligned}
E[XY] &= \sum_{k} k \, P(XY = k) = \sum_{i} \sum_{j} k_i \ell_j \, P(X = k_i, Y = \ell_j) \\
&= \sum_{i} \sum_{j} k_i \ell_j \, P(X = k_i) \, P(Y = \ell_j) && \text{(by indep.)} \\
&= \sum_{i} k_i \, P(X = k_i) \sum_{j} \ell_j \, P(Y = \ell_j) = \sum_{i} \big( k_i \, P(X = k_i) \, E[Y] \big) \\
&= E[Y] \sum_{i} k_i \, P(X = k_i) = E[Y]\,E[X] = E[X]\,E[Y].
\end{aligned}
\]

Note: By induction, the previous result easily extends to a finite sequence. That is, if random variables \(X_1, X_2, \ldots, X_n\) are independent, then \(E[X_1 X_2 \cdots X_n] = E[X_1]\,E[X_2] \cdots E[X_n]\).

Example 7. (a) Roll two dice. What is the average product of the two values? (b) Roll one die. Square the result. What is the average square? How does it compare with the square of the average roll?

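Solution. (a) (Hard Way) Let \(X_1\) be the value of one die and let \(X_2\) be the value of the other die. Then both have range \(\{1, 2, \ldots, 6\}\). So

\[ E[X_1 X_2] = \sum_{k} k \, P(X_1 X_2 = k) = \sum_{i=1}^{6} \sum_{j=1}^{6} i \, j \, P(X_1 = i, X_2 = j) = \frac{1}{36} \big( 1 \cdot 1 + 1 \cdot 2 + \cdots + 1 \cdot 6 + 2 \cdot 1 + \cdots + 6 \cdot 6 \big) = \frac{441}{36} = 12.25. \]

(Easier) By independence we have

\[ E[X_1 X_2] = E[X_1]\,E[X_2] = \frac{1 + 2 + \cdots + 6}{6} \cdot \frac{1 + 2 + \cdots + 6}{6} = 3.5 \times 3.5 = 12.25. \]

(b) Now let \(X\) be the value on the rolled die. Then \(X^2\) has range 1, 4, 9, 16, 25, 36. So the average square is

\[ E[X^2] = \frac{1 + 4 + 9 + 16 + 25 + 36}{6} = \frac{91}{6} \approx 15.17. \]

But \((E[X])^2 = (3.5)^2 = 12.25\). Thus, \(E[X^2] \ne (E[X])^2\).

Notes:
(i) Because \(X\) is not independent of itself, we cannot say that \(E[X^2] = E[X \cdot X] = E[X]\,E[X]\).
(ii) Because \(E[X^2] - (E[X])^2 = \operatorname{Var}(X) = E[(X - \mu)^2] \ge 0\), we always have \(E[X^2] \ge (E[X])^2\).
(iii) If \(X\) is the value on a rolled die, then \(\operatorname{Var}(X) = E[X^2] - (E[X])^2 = \frac{91}{6} - 12.25 = \frac{35}{12}\) and \(\sigma = \sqrt{35/12} \approx 1.71\).

Other Facts About Variance

Let \(X\) and \(Y\) be random variables and let \(a\) and \(b\) be constants. Then
(i) \(\operatorname{Var}(a) = 0\)
(ii) \(\operatorname{Var}(aX) = a^2 \operatorname{Var}(X)\)
(iii) \(\operatorname{Var}(X + b) = \operatorname{Var}(X)\)
(iv) \(\operatorname{Var}(aX + b) = a^2 \operatorname{Var}(X)\)
(v) If \(X\) and \(Y\) are independent, then \(\operatorname{Var}(X + Y) = \operatorname{Var}(X) + \operatorname{Var}(Y)\).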

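Proofs. (i) \(\operatorname{Var}(a) = E[a^2] - (E[a])^2 = a^2 - (a)^2 = 0\).

(ii)
\[
\begin{aligned}
\operatorname{Var}(aX) &= E[(aX)^2] - (E[aX])^2 = E[a^2 X^2] - (a\,E[X])^2 = a^2 E[X^2] - a^2 (E[X])^2 \\
&= a^2 \big( E[X^2] - (E[X])^2 \big) = a^2 \operatorname{Var}(X).
\end{aligned}
\]

(iii)
\[ \operatorname{Var}(X + b) = E\big[ \big( (X + b) - E[X + b] \big)^2 \big] = E\big[ (X + b - E[X] - b)^2 \big] = E\big[ (X - E[X])^2 \big] = \operatorname{Var}(X). \]

(iv) Simply combine Properties (ii) and (iii).

(v) By independence,
\[ E[(X + Y)^2] = E[X^2 + 2XY + Y^2] = E[X^2] + 2E[XY] + E[Y^2] = E[X^2] + 2E[X]\,E[Y] + E[Y^2]. \]
Hence,
\[
\begin{aligned}
\operatorname{Var}(X + Y) &= E[(X + Y)^2] - (E[X + Y])^2 \\
&= E[X^2] + 2E[X]\,E[Y] + E[Y^2] - \big( E[X] + E[Y] \big)^2 \\
&= E[X^2] + 2E[X]\,E[Y] + E[Y^2] - \big( (E[X])^2 + 2E[X]\,E[Y] + (E[Y])^2 \big) \\
&= E[X^2] - (E[X])^2 + E[Y^2] - (E[Y])^2 = \operatorname{Var}(X) + \operatorname{Var}(Y).
\end{aligned}
\]

Note: If \(X\) and \(Y\) are independent, then \(\operatorname{Var}(X - Y) = \operatorname{Var}(X) + \operatorname{Var}(Y)\) by Properties (v) and (ii). By induction, Property (v) extends to a finite sequence. That is, if random variables \(X_1, X_2, \ldots, X_n\) are independent, then

\[ \operatorname{Var}(X_1 + X_2 + \cdots + X_n) = \operatorname{Var}(X_1) + \operatorname{Var}(X_2) + \cdots + \operatorname{Var}(X_n). \]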
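The variance facts proved above can be spot-checked on fair dice with exact arithmetic. This is a minimal sketch rather than part of the original notes: the constants `a = 3` and `b = -4` are arbitrary choices, and the helpers `mean`, `variance`, and `transform` are illustrative names. The joint distribution of two independent dice is taken to be the product of the marginals.

```python
from fractions import Fraction
from itertools import product

def mean(pmf):
    return sum(k * p for k, p in pmf.items())

def variance(pmf):
    """Var(X) = E[X^2] - (E[X])^2."""
    return sum(k * k * p for k, p in pmf.items()) - mean(pmf) ** 2

def transform(pmf, f):
    """pmf of f(X), merging probabilities of values that coincide."""
    out = {}
    for k, p in pmf.items():
        out[f(k)] = out.get(f(k), Fraction(0)) + p
    return out

die = {k: Fraction(1, 6) for k in range(1, 7)}
a, b = 3, -4  # arbitrary constants chosen for illustration

var_scaled = variance(transform(die, lambda k: a * k))      # property (ii)
var_shifted = variance(transform(die, lambda k: k + b))     # property (iii)
var_affine = variance(transform(die, lambda k: a * k + b))  # property (iv)

# Two independent dice: joint probability is the product of the marginals.
sum_pmf, diff_pmf = {}, {}
for (k, p), (l, q) in product(die.items(), repeat=2):
    sum_pmf[k + l] = sum_pmf.get(k + l, Fraction(0)) + p * q
    diff_pmf[k - l] = diff_pmf.get(k - l, Fraction(0)) + p * q

print(variance(die), var_scaled, var_shifted, var_affine)
print(variance(sum_pmf), variance(diff_pmf))  # property (v) and the note on X - Y
```

The variances of the sum and of the difference come out equal, matching the note that subtracting an independent variable still adds its variance.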

Exercises

1. Draw one card from a standard deck. If you draw a number card (2 through 10), then \(X\) will be 10. If you draw a face card, then \(X\) will be 20. If you draw an Ace, then \(X\) will be 30. (a) Compute the mean and standard deviation of \(X\). (b) Compute the conditional average of \(X\) given that you draw a number card or a face card.

2. You make a bet for which you win with probability \(p\) and lose with probability \(q = 1 - p\), where \(0 < p < 1\). If you win, you gain $\(a\) and if you lose then you drop $\(b\). Let \(X\) be your change in fortune so that \(X\) is either \(+a\) or \(-b\). (a) Derive \(E[X]\). (b) The casino wants \(E[X] \le 0\). Use your formula for \(E[X]\) from (a) to solve for what the payoff \(a\) should be to maintain the inequality \(E[X] \le 0\).

3. Roll two fair six-sided dice and let \((\omega_1, \omega_2)\) denote the resulting pair. Define a random variable \(X\) by \(X((\omega_1, \omega_2)) = \omega_1\) [operator illegible in source] \(\omega_2\). Compute the mean and standard deviation of \(X\).

4. The Fibonacci sequence \(\{F_n\}\) is defined by \(F_1 = 1\), \(F_2 = 1\), and \(F_n = F_{n-1} + F_{n-2}\) for \(n \ge 3\). Let \(S\) be the sum of the first six Fibonacci numbers. A biased six-sided die is such that the probability of rolling the value \(k\) is \(F_k / S\) for \(k = 1, 2, \ldots, 6\). Roll this die and let \(X\) be the value. Compute the mean and standard deviation of \(X\).
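Hand computations for exercises of this kind can be checked with a small helper. The sketch below is not part of the original notes; `summarize` is a hypothetical function name. It is demonstrated only on Example 4 and on one instance of Example 5 (with \(m = 3\), \(n = 10\) chosen arbitrarily), whose answers are already worked out in the text above.

```python
from fractions import Fraction
from math import sqrt

def summarize(pmf):
    """Return (mean, variance, standard deviation) of a finite pmf {value: probability}."""
    if sum(pmf.values()) != 1:
        raise ValueError("probabilities must sum to 1")
    mean = sum(k * p for k, p in pmf.items())
    var = sum(k * k * p for k, p in pmf.items()) - mean ** 2
    return mean, var, sqrt(var)

# Example 4: change in fortune of -20, 5, or 10 dollars.
example4 = {-20: Fraction(3, 6), 5: Fraction(1, 6), 10: Fraction(2, 6)}
mean4, var4, sd4 = summarize(example4)

# Example 5 with the arbitrary choice m = 3, n = 10: uniform on {3, ..., 10}.
m, n = 3, 10
uniform = {k: Fraction(1, n - m + 1) for k in range(m, n + 1)}
mean5, var5, sd5 = summarize(uniform)

print(mean4, var4, round(sd4, 3))
print(mean5, var5, round(sd5, 3))
```

Passing probabilities as `Fraction` values keeps the mean and variance exact; only the standard deviation is converted to a float by the square root.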