CS 361: Probability & Statistics

February 19, 2018 CS 361: Probability & Statistics Random variables

Markov's inequality: This theorem says that for any random variable X and any value a > 0, we have P(|X| >= a) <= E[|X|]/a. A random variable is unlikely to have an absolute value much larger than the mean of its absolute value. If, for instance, we took a = 10 E[|X|], we'd get P(|X| >= 10 E[|X|]) <= 1/10.

Chebyshev's inequality: For any random variable X and any value a > 0, P(|X - E[X]| >= a) <= Var(X)/a^2. Or, if we let sigma be the standard deviation of X and substitute a = k sigma, we have P(|X - E[X]| >= k sigma) <= 1/k^2. The probability that X is more than k standard deviations from its mean is small. Look familiar?
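
As a sanity check, here is a minimal Python sketch (not from the lecture) that estimates both tail probabilities by simulation and compares them with the Markov and Chebyshev bounds; the exponential distribution, sample size, and seed are arbitrary choices for illustration.

    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.exponential(scale=1.0, size=100_000)   # X >= 0 here, so |X| = X

    mean_abs = x.mean()                 # estimate of E[|X|]
    sigma = x.std()                     # estimate of the standard deviation

    a = 10 * mean_abs
    print("P(|X| >= 10 E[|X|]) ~", (x >= a).mean(), " Markov bound:", mean_abs / a)

    k = 3
    print("P(|X - E[X]| >= 3 sigma) ~", (np.abs(x - x.mean()) >= k * sigma).mean(),
          " Chebyshev bound:", 1 / k**2)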

Sampling: So far we have been working on problems where we suppose ahead of time that we know the distributions of the outcomes, and hence of the random variables, in our experiments. We eventually want to get to a place where we don't make that assumption and instead guess what the distribution is after observing some runs of an experiment. In this context and others, we will refer to our observations of experiments as trials or samples from the underlying distribution.

Sampling: IID. If we have a set of data items x_i meeting the following:
a) they are independent
b) they were generated by the same process
c) the histogram of a very large set of the items looks increasingly like the probability distribution P(X) of some random variable X
we call this set of data items independent, identically distributed samples of P(X), or IID for short.

Expectation of iid samples: Assume we have a set of N iid samples x_1, ..., x_N of a probability distribution P(X), and define their mean X_N = (1/N)(x_1 + x_2 + ... + x_N). But since each x_i is a sample drawn from X, X_N is a random variable, and if we take the expectation of both sides we get E[X_N] = E[(1/N)(x_1 + ... + x_N)]. Using linearity, we get E[X_N] = (1/N)(E[x_1] + ... + E[x_N]). Simplifying, since each E[x_i] = E[X], we get E[X_N] = E[X].

Variance of iid samples: Assume that X has a finite variance given by Var(X) = sigma^2. Let's find the variance of X_N. Since the x_i are drawn independently from X, we can use the fact that the variance of a sum of independent variables can be broken up into a sum of variances: Var(x_1 + ... + x_N) = Var(x_1) + ... + Var(x_N). Substituting this, and recalling the property of variance that Var(kY) = k^2 Var(Y), we get Var(X_N) = (1/N^2)(Var(x_1) + ... + Var(x_N)). Simplifying, since each Var(x_i) = sigma^2, we get Var(X_N) = sigma^2/N.
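
A minimal simulation sketch of this fact (not from the lecture; the Uniform(0, 10) distribution, sample counts, and seed are assumptions): it compares the empirical variance of the sample mean X_N with sigma^2/N for a few values of N.

    import numpy as np

    rng = np.random.default_rng(1)
    sigma2 = 10**2 / 12                      # Var of Uniform(0, 10) is (10 - 0)^2 / 12

    for N in (10, 100, 1000):
        # many independent sample means, each built from N iid Uniform(0, 10) samples
        means = rng.uniform(0, 10, size=(20_000, N)).mean(axis=1)
        print(N, means.var(), sigma2 / N)    # empirical Var(X_N) vs sigma^2 / N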

Weak law of large numbers: With X_N the mean of N iid samples of P(X), as above, for any epsilon > 0 we have P(|X_N - E[X]| >= epsilon) -> 0 as N -> infinity.

Proving the weak law: We want to prove that P(|X_N - E[X]| >= epsilon) -> 0 as N -> infinity. Using E[X_N] = E[X], let's put X_N into Chebyshev's inequality: P(|X_N - E[X]| >= epsilon) <= Var(X_N)/epsilon^2. Using what we showed a few slides ago, Var(X_N) = sigma^2/N, we substitute to get P(|X_N - E[X]| >= epsilon) <= sigma^2/(N epsilon^2). Taking the limit as N goes to infinity proves the weak law: the right hand side goes to 0, as desired.

Who cares? The weak law tells us that if we observe enough data, we can learn things about random variables. It means we can do simulations instead of calculations in some cases. It is useful for solving problems.

Example: A coin is flipped 15 times; the probability of heads for each flip is p. Let X be the random variable that counts the number of heads that come up in the 15 flips. Use a simulation to estimate E[X].
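
One possible simulation, sketched in Python (the value p = 0.5, the number of runs, and the seed are assumptions for illustration); the estimate should land near 15p.

    import numpy as np

    rng = np.random.default_rng(2)
    p = 0.5                                # assumed value of p for the illustration
    runs = 100_000

    heads = rng.random((runs, 15)) < p     # each row is one run of 15 flips
    x = heads.sum(axis=1)                  # X = number of heads in each run

    print("estimated E[X]:", x.mean(), " exact 15*p:", 15 * p)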

Simulating for expectation: If we want to compute the expected value of a function of random variables, the law of large numbers tells us that we can use simulation. If our x_i come from a random number generator and f(x_i) = f_i, then defining F_N = (1/N)(f_1 + f_2 + ... + f_N), we have F_N -> E[f(X)] as N grows, so for large N the simulated average F_N is a good estimate of E[f(X)].

Example: Let X be a random variable uniformly distributed in the range 0 to 1. Let Y be a random variable uniformly distributed from 0 to 10. For Z = (Y - 5X)^3 - X^2, what is Var(Z)? Estimating Var(Z) = E[Z^2] - (E[Z])^2 from simulated samples of X and Y gives Var(Z) is approximately 27,000.
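
A minimal Python sketch of this simulation (sample count and seed are assumptions):

    import numpy as np

    rng = np.random.default_rng(3)
    n = 1_000_000

    x = rng.uniform(0, 1, n)
    y = rng.uniform(0, 10, n)
    z = (y - 5 * x)**3 - x**2

    print("estimated Var(Z):", z.var())    # comes out around 27,000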

Probabilities with simulations: Recall the indicator random variable I_E of an event E, which is 1 when E occurs and 0 otherwise. We have E[I_E] = P(E), which means we can use our insight for calculating expectations in order to calculate probabilities, by counting occurrences of the event in our simulations: P(E) is approximately the fraction of simulation runs in which E occurs.

Example: Let X be a random variable uniformly distributed in the range 0 to 1. Let Y be a random variable uniformly distributed from 0 to 10. For Z = (Y - 5X)^3 - X^2, what is P(Z > 3)? Count how many times Z > 3 in the simulated samples and divide by the number of samples, giving P(Z > 3) is approximately 0.6.
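
The same sampling sketch can be reused to estimate this probability by counting; again the sample count and seed are assumptions.

    import numpy as np

    rng = np.random.default_rng(3)
    n = 1_000_000

    x = rng.uniform(0, 1, n)
    y = rng.uniform(0, 10, n)
    z = (y - 5 * x)**3 - x**2

    # the mean of the 0/1 indicator of the event is the estimated probability
    print("estimated P(Z > 3):", (z > 3).mean())   # comes out around 0.6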

Example: Roulette. A roulette wheel has 36 numbers; 18 are red and 18 are black. It also has a number of colorless zeros. If you bet 1 on red and a red number comes up, you receive a payout of 1; if a red number does not come up, you lose 1.
On a wheel with 1 zero, what is the expected payout? 18/37 - 19/37 = -1/37
On a wheel with 2 zeros, what is the expected payout? 18/38 - 20/38 = -1/19
On a wheel with 3 zeros, what is the expected payout? 18/39 - 21/39 = -1/13
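
A minimal simulation sketch for the single-zero wheel (number of spins and seed are assumptions); the average payout per bet should land near -1/37, about -0.027.

    import numpy as np

    rng = np.random.default_rng(4)
    spins = 1_000_000

    # single-zero wheel: 37 pockets, 18 of them red
    red = rng.integers(0, 37, spins) < 18
    payout = np.where(red, 1, -1)

    print("average payout per bet:", payout.mean(), " exact:", -1 / 37)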

Fair games: In some sense, a game isn't really fair unless it has an expected payout of 0. Taking on a game with negative expected value means that on average you will lose, and the more times you play, the more you will lose. The weak law says a casino will be making about 5 cents per bet on red in roulette no matter how much people bet or how lucky they're feeling.

Example: birthdays. P1 and P2 are going to play a game where P1 bets that if 3 people are stopped on the street, they will have birthdays whose days of the week are in succession. If this happens, P2 gives P1 $100; if not, P1 gives P2 $1. What is the expected payout to P1? The payout will be 100p - (1-p)(1). Recall that we calculated p before and it was 1/49. So P1's expected payout is 100/49 - 48/49 = 52/49, slightly more than a dollar. If P1 can find people willing to take this bet, they should do it all day.

Example: ending a game early. Two players each put up $25 to play the following game: they will toss a fair coin. If it comes up heads, player H wins that toss, and for tails, player T wins that toss. The first player to win 10 tosses gets all $50. One of the players has to leave when the game is scored 8-7 (H-T); how should the $50 be divided between them? Either person could win if play were allowed to continue, so we should divide up the $50 based on the expected earnings of each player at the current point in the game. Player H's expected value is 50 P({H wins from score 8-7}) + 0 P({T wins from score 8-7}). H could win 10-7, 10-8, or 10-9. These are disjoint events:
P(10-7) = 1/4 (the sequence HH)
P(10-8) = 2/8 (the sequences HTH, THH)
P(10-9) = 3/16 (the sequences HTTH, THTH, TTHH)
E[H] = 50 * 11/16 = $34.375
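
A minimal sketch that checks the 11/16 figure by simulating play from the 8-7 score (the number of simulated games and the seed are assumptions):

    import random

    random.seed(5)
    games = 200_000
    h_wins = 0

    for _ in range(games):
        h, t = 8, 7
        while h < 10 and t < 10:           # keep tossing until someone reaches 10
            if random.random() < 0.5:
                h += 1
            else:
                t += 1
        h_wins += (h == 10)

    p = h_wins / games
    print("P(H wins from 8-7):", p, " exact 11/16 =", 11 / 16)
    print("H's share of the $50:", 50 * p)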

Example: decision tree. We have to decide whether to be vaccinated or not. It costs 10 to get the vaccine. If you're vaccinated, you get the disease with probability 1e-7; if you're not, you get it with probability 0.1. The disease is unpleasant: with probability 0.95 you will experience problems that cost you 1000 to have treated, but with probability 0.05 you get a major form of the disease and have to spend 1e6. Should you get vaccinated? In a decision diagram, squares represent choices, circles random events, and triangles outcomes.

Example: decision tree. What to compute first? The expected cost of having the disease: (0.95)(1000) + (0.05)(1e6) = $50,950. Then the expected cost to vaccinate: 10 + (1e-7)(50,950) = $10.01. And the expected cost not to vaccinate: (0.1)(50,950) = $5,095. In expectation, vaccinating is far cheaper.
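
The same expected-cost arithmetic, written as a small Python sketch:

    # expected cost of having the disease
    disease_cost = 0.95 * 1_000 + 0.05 * 1_000_000      # 50,950

    cost_vaccinate = 10 + 1e-7 * disease_cost           # about 10.01
    cost_no_vaccine = 0.1 * disease_cost                # 5,095

    print(cost_vaccinate, cost_no_vaccine)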

Example: Cash isn't always the quantity of interest when we are making decisions under uncertainty and using expected values as a guide. Consider the following. Imagine you have to make a decision about how to treat a terminal disease. There are two treatment options: standard and radical. Radical treatment might kill you in and of itself with probability 0.1, it might be so damaging that doctors stop treatment with probability 0.3, or you could complete radical treatment with probability 0.6. For any treatment there is a major, minor, or no response. If you complete radical treatment, the probability of a major response is 0.1 while minor is 0.9. For standard treatment the probability of a minor response is 0.5 and no response is 0.5.

Example: Expected survival time for radical treatment: 0.1(0) + 0.3(6) + 0.6(0.1(60) + 0.9(10)) = 10.8 months. Expected survival time for standard treatment: 0.5(10) + 0.5(6) = 8 months. Reading off the numbers, survival is 0 months if radical treatment kills you, 6 months if treatment is stopped or there is no response, 60 months after a major response, and 10 months after a minor response.

Continuous probability

Continuous random variables: Some quantities we would like to model don't fit into any of the probability theory we have discussed so far: things like height, weight, and temperature are all quantities that we can think of as appearing in one experimental setup or another. It's a bit tricky to formally think about sample spaces and events with these kinds of experiments, and we won't do so in this course.

Continuous random variables: Instead, we will think of each continuous random variable as being associated with a function p(x), called a probability density function, which we can use to compute probabilities. Roughly, if p(x) is a probability density function, then for an infinitesimally small interval dx, P(X in [x, x + dx]) = p(x) dx. So in order to be a density function, p(x) must be nonnegative.

Probability density functions: Now if p(x) is the density function for some continuous random variable X and a < b, we can compute P(a <= X <= b) as the integral of p(x) dx from a to b. Observe, then, that the integral of p(x) dx from negative infinity to infinity must be 1, since that is P(-infinity < X < infinity).
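
A minimal numerical sketch of these two facts, using p(x) = 2x on [0, 1] as an assumed example density (the step size is also an arbitrary choice):

    import numpy as np

    # assumed example density: p(x) = 2x on [0, 1], zero elsewhere
    def p(x):
        return np.where((x >= 0) & (x <= 1), 2 * x, 0.0)

    dx = 1e-5
    xs = np.arange(-1, 2, dx)
    print("integral of p over all x:", (p(xs) * dx).sum())        # close to 1

    a, b = 0.25, 0.5
    xs_ab = np.arange(a, b, dx)
    print("P(0.25 <= X <= 0.5):", (p(xs_ab) * dx).sum())          # exact: 0.5^2 - 0.25^2 = 0.1875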

Density functions: summary
- Density functions are non-negative
- Density functions can take on values greater than 1
- Integrating a density function allows us to compute the probability that a continuous random variable takes on a value within a range
- Density functions integrate to 1 if we let x run from negative infinity to infinity

Density functions: Non-negative functions which integrate to 1 are the density functions of some continuous random variable. Conversely, a continuous random variable X can be associated with a non-negative function which integrates to 1 that we can use to compute probabilities for X. If we have a density for a variable, great: we can compute probabilities. Many times, though, we are trying to come up with a sensible density to describe a continuous phenomenon.

Example - constant of proportionality: Suppose we have a physical system that produces random numbers in the range 0 to some upper bound, call it b, with b > 0. No number outside this range can appear, but every number within this range is equally likely. If X is a random variable that tells us which number was generated, what is its density function? We are going to want p(x) to be 0 when x < 0 or x > b. We will also want p(x) to have the same value, c, for every other x, since we want all outcomes in our range to be equally likely. Recall our interpretation of the density function: P(X in [x, x + dx]) = p(x) dx.

Example - constant of proportionality: So we wind up with p(x) = c for 0 <= x <= b and p(x) = 0 otherwise. In order to figure out what c should be, recall that we must have the integral of p(x) dx over all x equal to 1, which gives c b = 1, so c = 1/b. Observe that we can have p(x) > 1 if b < 1. But the total area under p(x) is still 1, so this is not a problem: a density value is not a probability.
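
A small numeric check of this, assuming b = 0.5 for illustration, so that c = 1/b = 2 is a density value greater than 1:

    import numpy as np

    b = 0.5
    c = 1 / b                        # density value; here c = 2 > 1

    dx = 1e-6
    xs = np.arange(0.0, b, dx)
    print("integral of p over [0, b]:", (c * dx) * xs.size)    # close to 1
    print("P(0 <= X <= 0.1):", c * 0.1)                        # 0.2, a valid probability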

Expectation: With continuous variables, we compute expectations with an integral: E[X] is the integral of x p(x) dx over all possible values of x.

Expectation: This integral may not be defined or may not be finite. The cases we encounter will have well-behaved expectations, however.
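
A small sketch of this integral done numerically, reusing the assumed uniform density on [0, b] with b = 0.5 from the previous example; the exact answer is b/2 = 0.25.

    import numpy as np

    b = 0.5
    c = 1 / b                        # the uniform density on [0, b] from the previous example

    dx = 1e-6
    xs = np.arange(0.0, b, dx)
    print("E[X] by numerical integration:", (xs * c * dx).sum())   # close to b/2 = 0.25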