
Stat 400, section.1-.2 Random Variables & Probability Distributions notes by Tim Pilachowski For a given situation, or experiment, observations are made and data is recorded. A sample space S must contain all possible outcomes for an experiment. An event (designated with a capital letter A, B, C, etc.) is a subset of the sample space, and will incorporate one or more of the outcomes. Example A. Define X = number of heads in ten tosses of a coin. What are the values that X may assume? Note carefully: The probabilities for the values of X are not equal! Example B. You test electrical components until you find one that fails. Define Y = number of components tested before a failure comes up. What are the values that Y may assume? Note carefully: The probabilities for the values of Y are not equal! [We ll suppose that P(individual component fails) = 0.1.] A random variable is a rule/formula that assigns a number value to each outcome in a sample space S. Random reminds us that we cannot predict specific outcomes, only discuss the probabilities. Each value xi assigned to a discrete random variable X will have an associated probability, P(X = xi). In Example A we defined our random variable as X = number of heads in ten tosses of a coin. In Example B we defined our random variable as Y = number of components tested before a failure comes up. Example C. Suppose that a box contains blue blocks and 2 yellow blocks. You pick three blocks without replacement. Define X = number of blue blocks drawn. What are the values that X may assume? x P(X = x)

The above is a probability distribution table. We could also have drawn either a line graph or a probability histogram. [Line graph and probability histogram omitted: probabilities 0.3, 0.6, 0.1 plotted at x = 1, 2, 3.]

Each value xi assigned to a discrete random variable X will have an associated probability, P(X = xi). A function p(xi) = P(X = xi) is called (what a surprise!) a probability distribution function or probability mass function. (Some of you may recognize "probability mass function" as being a Physics-type concept.) While the text uses the abbreviations pdf and pmf, I'll try to stick with writing the phrases out.

For Example C, the formal way of writing the probability distribution function would be

p(x) = P(X = x) = { 3/10,  x = 1
                    6/10,  x = 2
                    1/10,  x = 3
                    0,     otherwise

A probability distribution function must meet the basic criteria for all probabilities:

0 ≤ p(xi) = P(X = xi) ≤ 1 for each value xi, and Σ P(X = xi) = 1, summing over all xi.

Why can't the table below give the probability distribution for a random variable X?

x          0   …
P(X = x)   1   …

Why can't the table below give the probability distribution for a random variable X?

x          2     3     4
P(X = x)   3/7   2/7   4/7

Most of the examples done so far illustrate finite random variables. The values X may assume are limited in scope (Examples A and C). Variables that are (at least in theory) unlimited are infinite random variables. (In Example B, Y = 0, 1, 2, 3, 4, 5, 6, ...)
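The two criteria above can be checked mechanically. Here is a small sketch (not from the original notes; the helper name `is_valid_pmf` is my own) that tests a candidate table against both conditions, using exact fractions to avoid rounding issues:

```python
from fractions import Fraction

def is_valid_pmf(table):
    """Check the two criteria: each probability lies in [0, 1],
    and the probabilities sum to exactly 1."""
    return all(0 <= p <= 1 for p in table.values()) and sum(table.values()) == 1

# Example C's distribution satisfies both criteria:
good = {1: Fraction(3, 10), 2: Fraction(6, 10), 3: Fraction(1, 10)}
# The second "why can't" table fails: 3/7 + 2/7 + 4/7 = 9/7 != 1.
bad = {2: Fraction(3, 7), 3: Fraction(2, 7), 4: Fraction(4, 7)}
print(is_valid_pmf(good), is_valid_pmf(bad))  # True False
```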

Example D. Let X = the number of days each ICU patient stays in intensive care. What are the values that X can assume? The probabilities would be developed based on relative frequencies: observations made from hospital and patient records. The histogram might look something like this. [Histogram omitted: bars over x = 1 through 7 days.] This histogram is extrapolated from research articles. In the literature, researchers have tried correlating length of stay with diagnosis. For your general knowledge, this probability distribution is approximately exponential, with a formula of the form f(x) = λe^(−λx). (We'll take a look at exponential distributions in chapter 4.) Note that, for each probability, 0 ≤ p(x) = P(X = x) ≤ 1.

All of the examples done so far illustrate discrete random variables. The values X may assume are distinct and countable. Measurements such as length or distance would be continuous random variables.

Example E. Suppose we measure the heights of 25 people and we define X = height in inches. What are the values that X may assume?

Example E revisited. Suppose we measure the heights of 25 people and we define X = height in inches. X may assume any value in {x : x > 0}. Although height is a continuous random variable, we can (and in practice, do) treat it as a discrete random variable by rounding off to a specified accuracy. Suppose we measure the heights of 25 people to the nearest inch and get the following results:

height (in.)   64   65   66   67   68   69   70
frequency       7    6    5    4    1    0    2
probability

For Example E, the formal way of writing the probability distribution function would be

p(x) = { 0.28,  x = 64
         0.24,  x = 65
         0.20,  x = 66
         0.16,  x = 67
         0.04,  x = 68
         0.08,  x = 70
         0,     otherwise

You can, perhaps, see why another method (table, line graph, histogram) is usually preferred in a case like this.
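The probability row of the table comes from dividing each frequency by the total number of people measured. A short sketch of that step (not part of the original notes):

```python
# Example E: convert the frequency table (25 people total) into a
# probability distribution by dividing each frequency by the total count.
freq = {64: 7, 65: 6, 66: 5, 67: 4, 68: 1, 69: 0, 70: 2}
n = sum(freq.values())  # 25
pmf = {height: f / n for height, f in freq.items()}
print(pmf[64], pmf[67], pmf[70])  # 0.28 0.16 0.08
```

Since every person falls into exactly one height bin, the resulting probabilities automatically sum to 1.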

The probability histogram would look like this: [Probability histogram omitted: bars at x = 64 through 70.]

We can calculate probabilities for various values of X.
a) The probability that X is not more than 66 is
b) The probability that X is at least 68 is

Now is as good a time as any to note that the areas of the rectangles in the probability histogram represent probabilities, and that the sum of the areas of the rectangles = 1. {area of each bar} = {percentage of people having that height} = {relative frequency of that height} = {probability of a person picked at random having that height}.

We can use the idea of area under the curve to introduce a cumulative probability function (text abbreviation cdf).

Example E continued. Suppose we measure the heights of 25 people and we define X = height rounded to the nearest inch. Define a cumulative distribution function F(x) = P(X ≤ x). [Cumulative histogram omitted.]

F(64) = P(X ≤ 64) =
F(65) = P(X ≤ 65) =
F(67) = P(X ≤ 67) =
F(70) = P(X ≤ 70) =

The formal expression of the cumulative distribution function is as follows.

F(x) = { 0,     x < 64
         0.28,  64 ≤ x < 65
         0.52,  65 ≤ x < 66
         0.72,  66 ≤ x < 67
         0.88,  67 ≤ x < 68
         0.92,  68 ≤ x < 70
         1,     70 ≤ x
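The cumulative distribution function is just a running total of the probabilities, taken in increasing order of x. A sketch of that accumulation for Example E (not part of the original notes):

```python
# Example E continued: build F(x) = P(X <= x) by accumulating the
# probabilities in increasing order of height.
pmf = {64: 0.28, 65: 0.24, 66: 0.20, 67: 0.16, 68: 0.04, 69: 0.0, 70: 0.08}
cdf = {}
running = 0.0
for height in sorted(pmf):
    running += pmf[height]
    cdf[height] = round(running, 2)
print(cdf)
# {64: 0.28, 65: 0.52, 66: 0.72, 67: 0.88, 68: 0.92, 69: 0.92, 70: 1.0}
```

Notice that F is flat between 68 and 70 because p(69) = 0, which is why the formal expression has a single piece for 68 ≤ x < 70.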

Now for some more challenging questions:
first method: P(65 ≤ X ≤ 68) =
second method: P(65 ≤ X ≤ 68) =

Note that when the cumulative distribution function is used to find the probability for an interval, we need to be careful in choosing the values we use. In general, P(a ≤ X ≤ b) = F(b) − F(largest possible value for X that is strictly less than a).

Example C revisited. Suppose that a box contains 3 blue blocks and 2 yellow blocks. You pick three blocks without replacement. Define X = number of blue blocks drawn. We have the following probability distribution function and line graph. [Line graph omitted: probabilities 0.3, 0.6, 0.1 at x = 1, 2, 3.]

p(x) = P(X = x) = { 3/10,  x = 1
                    6/10,  x = 2
                    1/10,  x = 3
                    0,     otherwise

Find the cumulative distribution function and draw its graph.
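Both methods for the interval probability can be sketched in a few lines (not part of the original notes). The first adds the individual probabilities directly; the second uses the general rule above, F(b) minus F evaluated at the largest possible value strictly less than a, here F(68) − F(64):

```python
# Two ways to compute P(65 <= X <= 68) for the heights example.
pmf = {64: 0.28, 65: 0.24, 66: 0.20, 67: 0.16, 68: 0.04, 69: 0.0, 70: 0.08}

# First method: add the individual probabilities on the interval.
direct = sum(p for h, p in pmf.items() if 65 <= h <= 68)

# Second method: F(b) - F(largest value strictly less than a).
def F(x):
    """Cumulative distribution function F(x) = P(X <= x)."""
    return sum(p for h, p in pmf.items() if h <= x)

via_cdf = F(68) - F(64)
print(round(direct, 2), round(via_cdf, 2))  # 0.64 0.64
```

Using F(65) instead of F(64) in the second method would wrongly discard p(65) = 0.24; that is the care in choosing values the note above warns about.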

Example B revisited. You test electrical components until you find one that fails. Define

X = { 1  if the component works (success)
      0  if the component fails

This is an example of a Bernoulli random variable, whose only possible values are 0 and 1. If we know that P(individual component fails) = 0.1, then the probability distribution function is

An equivalent way of writing the probability distribution function is

In general, for a Bernoulli random variable where we specify a parameter α = probability of success,

p(x) = { 1 − α  if x = 0
         α      if x = 1
         0      otherwise

Now define Y = number of components tested before a failure comes up.
p(0) = P(Y = 0) =
p(1) = P(Y = 1) =
p(2) = P(Y = 2) =

The probability distribution function for this Example would be p(y) =

(Your text has a general version of the probability distribution function for a Bernoulli experiment as part of example 3.12.) The cumulative distribution function for this Example would be

F(y) = P(Y ≤ y) =

Perhaps you recognize that this is a geometric series. We could sum the infinite series to show that the sum of the probabilities equals 1. (Your text has a general version of the cumulative distribution function for a Bernoulli experiment as part of example 3.14.)
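Under the setup above, each value of Y requires y working components followed by one failure, so p(y) = (0.9)^y(0.1), and the partial sums of F form a geometric series. A numerical sketch of that convergence (not part of the original notes; it previews the answers the blanks ask for):

```python
# Example B revisited: Y = number of working components seen before the
# first failure, with P(component fails) = 0.1.
alpha = 0.9  # P(component works)

def p(y):
    """p(y) = P(Y = y): y successes in a row, then one failure."""
    return alpha ** y * (1 - alpha)

def F(y):
    """F(y) = P(Y <= y), a partial sum of the geometric series."""
    return sum(p(k) for k in range(y + 1))

print(round(F(2), 3))  # 0.271, which equals 1 - 0.9 ** 3
print(F(100))          # very close to 1, as the series sum predicts
```

Summing the series in closed form gives F(y) = 1 − (0.9)^(y+1), which tends to 1 as y grows, confirming that the probabilities total 1.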