Discrete Probability Distributions

Discrete Probability Distributions
Data Science: Jordan Boyd-Graber, University of Maryland
January 18, 2018

Refresher: Random variables
Random variables take on values in a sample space. This week we will focus on discrete random variables:
  Coin flip: {H, T}
  Number of times a coin lands heads after N flips: {0, 1, 2, ..., N}
  Number of words in a document: positive integers {1, 2, ...}
Reminder: we denote a random variable with a capital letter and an outcome with a lowercase letter. E.g., X is a coin flip, and x is the value (H or T) of that coin flip.

Refresher: Discrete distributions
A discrete distribution assigns a probability to every possible outcome in the sample space. For example, if X is a coin flip, then
  P(X = H) = 0.5
  P(X = T) = 0.5
Probabilities have to be greater than or equal to 0, and the probabilities over the entire sample space must sum to one:
  $\sum_x P(X = x) = 1$
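The two requirements on a discrete distribution (non-negative probabilities that sum to one) can be checked mechanically. The sketch below is illustrative only: the function name `is_valid_distribution`, the dictionary representation, and the tolerance are assumptions, not part of the slides.

```python
import math

def is_valid_distribution(probs, tol=1e-9):
    """Return True if every probability is >= 0 and the total is 1 (within tol).

    `probs` maps each outcome to its probability (a hypothetical representation).
    """
    values = list(probs.values())
    non_negative = all(p >= 0 for p in values)
    sums_to_one = math.isclose(sum(values), 1.0, abs_tol=tol)
    return non_negative and sums_to_one

coin = {"H": 0.5, "T": 0.5}      # the fair coin from the slide
broken = {"H": 0.7, "T": 0.5}    # sums to 1.2, so it is not a distribution

print(is_valid_distribution(coin))
print(is_valid_distribution(broken))
```

The tolerance is there because floating-point sums are rarely exactly 1.0.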

Mathematical Conventions
0!: If the definition $n! = n \cdot (n - 1)!$ is to hold for all n > 0, then taking n = 1 gives $1! = 1 \cdot 0!$, so $0! = 1$.
$n^0$: Example for 3, where each step down in the exponent divides by 3:
  $3^2 = 9$ (1)
  $3^1 = 3$ (2)
  $3^0 = 1$ (3)
  $3^{-1} = \frac{1}{3}$ (4)
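Both conventions can be reproduced in a few lines. This is a minimal sketch; the helper names `factorial` and `powers_down_from` are made up for illustration. The first function uses 0! = 1 as the base case of the recursive definition, and the second generates powers by repeated division, which is why the exponent 0 lands on 1.

```python
import math

def factorial(n):
    """n! defined by n! = n * (n - 1)!, with 0! = 1 as the base case."""
    return 1 if n == 0 else n * factorial(n - 1)

def powers_down_from(base, start, stop):
    """List (exponent, base**exponent) for start, start-1, ..., stop
    by repeatedly dividing by base."""
    value = float(base) ** start
    result = []
    for exponent in range(start, stop - 1, -1):
        result.append((exponent, value))
        value /= base
    return result

print(factorial(0), factorial(4))
for exponent, value in powers_down_from(3, 2, -1):
    print(exponent, value)
```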

Today: Types of discrete distributions
There are many different types of discrete distributions, with different definitions.
Today we'll look at the most common discrete distributions, and we'll introduce the concept of parameters.
These discrete distributions (along with the continuous distributions next) are fundamental to regression, classification, and clustering.

Bernoulli distribution
A distribution over a sample space with two values: {0, 1}.
Interpretation: 1 is "success"; 0 is "failure".
Example: coin flip (we let 1 be "heads" and 0 be "tails").
A Bernoulli distribution can be defined with a table of the two probabilities.
  X denotes the outcome of a coin flip:
    P(X = 0) = 0.5
    P(X = 1) = 0.5
  X denotes whether or not a TV is defective:
    P(X = 0) = 0.995
    P(X = 1) = 0.005

Bernoulli distribution
Do we need to write out both probabilities?
  P(X = 0) = 0.995
  P(X = 1) = 0.005
What if I only told you P(X = 1)? Or P(X = 0)?
  $P(X = 0) = 1 - P(X = 1)$
  $P(X = 1) = 1 - P(X = 0)$
We only need one probability to define a Bernoulli distribution, usually the probability of success, P(X = 1).

Bernoulli distribution
Another way of writing the Bernoulli distribution: let θ denote the probability of success ($0 \le \theta \le 1$).
  $P(X = 0) = 1 - \theta$
  $P(X = 1) = \theta$
An even more compact way to write this:
  $P(X = x) = \theta^x (1 - \theta)^{1 - x}$
This is called a probability mass function.

Probability mass functions
A probability mass function (PMF) is a function that assigns a probability to every outcome of a discrete random variable X.
Notation: f(x) = P(X = x)
A PMF gives a compact definition of a distribution.
Example: PMF for a Bernoulli random variable $X \in \{0, 1\}$:
  $f(x) = \theta^x (1 - \theta)^{1 - x}$
In this example, θ is called a parameter.
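As a rough illustration, the Bernoulli PMF translates directly into a one-line function. The name `bernoulli_pmf` and the input checks are assumptions added for this sketch; the formula itself is the one on the slide, and the example uses the defective-TV probability θ = 0.005 from earlier.

```python
def bernoulli_pmf(x, theta):
    """Bernoulli PMF: f(x) = theta**x * (1 - theta)**(1 - x) for x in {0, 1}."""
    if x not in (0, 1):
        raise ValueError("x must be 0 or 1")
    if not 0.0 <= theta <= 1.0:
        raise ValueError("theta must lie in [0, 1]")
    # When x = 1 the second factor is raised to the power 0 and drops out;
    # when x = 0 the first factor drops out.
    return theta ** x * (1 - theta) ** (1 - x)

theta = 0.005  # probability that a TV is defective
print(bernoulli_pmf(1, theta))  # P(X = 1)
print(bernoulli_pmf(0, theta))  # P(X = 0)
```

Because the exponent selects one factor or the other, the single expression covers both rows of the probability table.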

Parameters
Parameters define the probability mass function.
Free parameters are those not constrained by the PMF.
For example, the Bernoulli PMF could be written with two parameters:
  $f(x) = \theta_1^x \theta_2^{1 - x}$
But $\theta_2 = 1 - \theta_1$, so there is only 1 free parameter.
Complexity is roughly the number of free parameters: simpler models have fewer parameters.
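To see why the second parameter is not free, one can apply the sum-to-one requirement to a two-parameter Bernoulli PMF. The short derivation below writes θ_1 for the probability of success and θ_2 for the probability of failure:

```latex
\begin{align*}
f(x) &= \theta_1^{x}\,\theta_2^{1-x}, \qquad x \in \{0, 1\} \\
f(1) + f(0) &= \theta_1 + \theta_2 = 1 \\
\theta_2 &= 1 - \theta_1 \\
f(x) &= \theta_1^{x}\,(1 - \theta_1)^{1-x}
\end{align*}
```

Once θ_1 is chosen, θ_2 is determined, which recovers the one-parameter form of the PMF.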

Sampling from a Bernoulli distribution
How to randomly generate a value distributed according to a Bernoulli distribution?
Algorithm:
  1. Randomly generate a number between 0 and 1: r = random(0, 1)
  2. If r < θ, return success; else, return failure.
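The two-step algorithm above can be sketched in Python as follows. The function name `sample_bernoulli`, the optional `rng` argument, and the encoding of success as 1 and failure as 0 are choices made for this example; `random.random()` from the standard library plays the role of `random(0, 1)` and returns a float in [0.0, 1.0).

```python
import random

def sample_bernoulli(theta, rng=random):
    """Return 1 (success) with probability theta, else 0 (failure)."""
    r = rng.random()              # step 1: uniform draw in [0.0, 1.0)
    return 1 if r < theta else 0  # step 2: compare against theta

# Usage: with many draws, the fraction of successes should be near theta.
rng = random.Random(0)            # seeded generator so runs are repeatable
draws = [sample_bernoulli(0.3, rng) for _ in range(10_000)]
print(sum(draws) / len(draws))
```

Because the draw is strictly less than 1, θ = 1 always yields success, and θ = 0 never does.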