The Random Variable for Probabilities Chris Piech CS109, Stanford University

Size: px
Start display at page:

Download "The Random Variable for Probabilities Chris Piech CS109, Stanford University"

Transcription

1 The Random Variable for Probabilities Chris Piech CS109, Stanford University

2 Assignment Grades Frequency Frequency Frequency Frequency We have 2055 assignment distributions from grade scope

3 Flip a Coin With Unknown Probability Demo

4 Four Prototypical Trajectories Today we are going to learn something unintuitive, beautiful and useful

5 Four Prototypical Trajectories Review

6 Conditional Events Recall that for events E and F: P( EF) P ( E F) = where P( F) > P( F) E F 0

7 Discrete Conditional Distributions

8 Continuous Conditional Distributions Let X and Y be continuous random variables Conditional PDF of X given Y (where f Y (y) > 0): P (X = x Y = y) = P (X = x, Y = y) P (Y = y) f X Y (x, y)@x = f X,Y (x, y)@x@y f X Y (y)@y f X Y (x, y)@x = f X,Y (x, y)@x f X Y (y) f X Y (x, y) = f X,Y (x, y) f X Y (y)

9 Piech, CS106A, Stanford University Conditioning with a continuous random variable feels weird at first. But then it gets good. Its like biking with a helmet

10 Continuous Conditional Distributions Let X be continuous random variable Let E be an event: P (X = x, E) P (E X = x) = P (X = x) P (X = x E)P (E) = P (X = x) = f X(x E)P (E)@x f X (x)@x = f X(x E)P (E) f X (x)

11 Piech, CS106A, Stanford University Anomaly Detection Let X be a measure of time to answer a question Let E be the event that the user is a human: P (X = x, E) P (E X = x) = P (X = x) P (X = x E)P (E) = P (X = x) = f X(x E)P (E)@x f X (x)@x = f X(x E)P (E) f X (x)

12 Piech, CS106A, Stanford University Anomaly Detection Let X be a measure of time to answer a question Let E be the event that the user is a human What if you don t know unconditional?: Normal pdf P (E X = x) = f X(x E)P (E) f X (x) P (E X = x) P (E C X = x) =

13 Biometric Keystroke Piech, CS106A, Stanford University

14 Let X be a continuous random variable Let N be a discrete random variable Conditional PDF of X given N: Conditional PMF of N given X: If X and N are independent, then: ) ( ) ( ) ( ) ( n p x f x n p n x f N X X N N X = ) ( ) ( ) ( ) ( x f n p n x f x n p X N N X X N = ) ( ) ( x f n x f X N X = ) ( ) ( n p x n p N X N = Mixing Discrete and Continuous

15 Four Prototypical Trajectories End Review

16

17 Piech, CS106A, Stanford University We are going to think of probabilities as random variables!!!

18 Flip a Coin With Unknown Probability Flip a coin (n + m) times, comes up with n heads We don t know probability X that coin comes up heads Frequentist Bayesian X = lim n+m!1 n n + m n n + m f X N (x n) = P (N = n X = x)f X (x) P (N = n) X is a single value X is a random variable

19 Flip a Coin With Unknown Probability Flip a coin (n + m) times, comes up with n heads We don t know probability X that coin comes up heads Our belief before flipping coins is that: X ~ Uni(0, 1) Let N = number of heads Given X = x, coin flips independent: (N X) ~Bin(n +m, x) f X N (x n) = P (N = n X = x)f X(x) P (N = n) Bayesian posterior probability distribution Bayesian prior probability distribution

20 Flip a Coin With Unknown Probability Flip a coin (n + m) times, comes up with n heads We don t know probability X that coin comes up heads Our belief before flipping coins is that: X ~ Uni(0, 1) Let N = number of heads Given X = x, coin flips independent: (N X) ~Bin(n +m, x) f X N (x n) = P (N = n X = x)f X(x) P (N = n) Binomial = = n+m n x n (1 x) m P (N = n) n+m n P (N = n) xn (1 x) m Z = 1 c xn (1 x) m where c = 1 Z 1 0 x n (1 x) m dx

21 Beta Random Variable f X is a Beta Random Variable: X ~ Beta(a, b) B( ( x) = 0 Probability Density Function (PDF): (where a, b > 0) 1 x, ) a x a b 1 (1 ) b 1 0 < x < 1 otherwise where B 1 a 1 b 1 ( a, b) = x (1 x) dx 0 Symmetric when a = b a E[ X ] = Var X ) a + b ( = 2 ( a + b) ab ( a + b + 1)

22 Meta Beta Used to represent a distributed belief of a probability

23 Piech, CS106A, Stanford University Beta is a distribution for probabilities

24 Back to flipping coins Flip a coin (n + m) times, comes up with n heads We don t know probability X that coin comes up heads Our belief before flipping coins is that: X ~ Uni(0, 1) Let N = number of heads Given X = x, coin flips independent: (N X) ~Bin(n +m, x) f X N (x n) = P (N = n X = x)f X(x) P (N = n) = = n+m n x n (1 x) m P (N = n) n+m n P (N = n) xn (1 x) m Z = 1 c xn (1 x) m where c = Z 1 0 x n (1 x) m dx

25 Flip a coin (n + m) times, comes up with n heads Conditional density of X given N = n f X Note: 1 n m n m N ( x n) = x (1 x) where c = x (1 x) dx c Recall Beta distribution: B( f ( x) = 0 Dude, Where s My Beta? 0 < x < 1, so f ( x n) = 1 x, ) a x a b 1 (1 ) b 1 0 < x < 1 otherwise Hey, that looks more familiar now... X (N = n, n + m trials) ~ Beta(n + 1, m + 1) X N B otherwise 1 a 1 b 1 ( a, b) = x (1 x) dx 0

26 Understanding Beta X (N = n, m + n trials) ~ Beta(n + 1, m + 1) X ~ Uni(0, 1) Check this out, boss: o Beta(1, 1) =? f 1 B( a, b) a 1 b 1 0 ( x) = x (1 x) = x (1 x 1 = 1 1= 1 where 0 < x < 1 1dx 0 1 B( a, b) ) 0 o Beta(1, 1) = Uni(0, 1) So, X ~ Beta(1, 1)

27 If the Prior was a Beta If our belief about X (that random variable for probability) was beta f X (x) = 1 B(a, b) xa 1 (1 x) b 1 What is our belief about X after observing N heads? f X N (x, n) =???

28 If the Prior was a Beta f X N (x, n) = P (N = n X = x)f X(x) P (N = n) = n+m n x n (1 x) m f X (x) P (N = n) n+m n x n (1 x) m 1 B(a,b) = xa 1 (1 x) b 1 P (N = n) n + m = K 1 x n (1 x) m 1 n B(a, b) xa 1 (1 x) b 1 = K 2 x n (1 x) m 1 B(a, b) xa 1 (1 x) b 1 = K 2 x n (1 x) m x a 1 (1 x) b 1 = K 2 x n+a 1 (1 x) m+b 1 X N Beta(n + a, m + b)

29 Understanding Beta If Prior distribution of X (before seeing flips) is Beta Then Posterior distribution of X (after flips) is Beta Beta is a conjugate distribution for Beta Prior and posterior parametric forms are the same! Practically, conjugate means easy update: o Add number of heads and tails seen to Beta parameters

30 Further Understanding Beta Can set X ~ Beta(a, b) as prior to reflect how biased you think coin is apriori This is a subjective probability! Then observe n + m trials, where n of trials are heads Update to get posterior probability X (n heads in n + m trials) ~ Beta(a + n, b + m) Sometimes call a and b the equivalent sample size Prior probability for X based on seeing (a + b 2) imaginary trials, where (a 1) of them were heads. Beta(1, 1) ~ Uni(0, 1) à we haven t seen any imaginary trials, so apriori know nothing about coin

31 Check out Demo!

32 Four Prototypical Trajectories Damn

33 Four Prototypical Trajectories Next level?

34 Assignment Grades Frequency Frequency Frequency Frequency We have 2055 assignment distributions from gradescope

35 Binomial Distributions Geometric Exponential Poisson Neg Binomial Normal Beta Beta Uniform

36 Four Prototypical Trajectories Grades must be bounded

37 Four Prototypical Trajectories Normal: No

38 Four Prototypical Trajectories Poisson: No

39 Four Prototypical Trajectories Exponential: No

40 Four Prototypical Trajectories Beta: Yes

41 Assignment Grades Demo Assignment id = 1613 Frequency

42 Assignment Grades Demo Assignment id = 1613 Frequency X Beta(a =8.28,b=3.16)

43 Assignment Grades a = 4.03, b = 3.00 a = 8.28, b = Frequency Frequency Frequency Frequency a = 5.06, b = 3.79 a = 3.46, b = 0.35 We have 2055 assignment distributions from grade scope

44 Beta is a Better Fit Unpublished results. Based on Gradescope data

45 Beta is a Better Fit For All Class Sizes Unpublished results. Based on Gradescope data

46 Binomial Interpretation Each student has the same probability of getting each point. Generate grades by flipping a coin 100 times for each student. The resulting distribution is binomial. - Binomial

47 Normal Interpretation What the Binomial said, but approximated. - Normal

48 Beta Interpretation Each student s ability is represented as a probability. The distribution of probabilities is a Beta distribution. Each student has a different probability of getting points, and that probability is sampled from a Beta distribution. - Beta This is Chris Piech s opinion. It is open for debate

49 Assignment Grades a = 4.03, b = 3.00 a = 8.28, b = Frequency Frequency Frequency Frequency a = 5.06, b = 3.79 a = 3.46, b = 0.35 These are the distribution of student point probabilitities

50 Assignment Grades Demo What is the semantics of E[X]? Frequency X Beta(a =8.28,b=3.16)

51 Assignment Grades What is the probability that a student is bellow the mean? X Beta(a =8.28,b=3.16) E[X] = a a + b = P (X <0.7238) = F X (0.7238) Wait what? Chris are you holding out on me? stats.beta.cdf(x, alpha, beta) P (X <E[X]) = 0.46

52 Four Prototypical Trajectories As far as I know, this is an unpublished result

53 Implications Will be combined with Item Response Theory which models how assignment difficulty and student ability combine to give point probabilities. Suggests a way to calculate final grades as a probabilistic most likely estimate of ability. Machine learning on education data will be more accurate. Analysis of mixture distributions can be fixed.

54 Four Prototypical Trajectories Will you use this on us?

55 Four Prototypical Trajectories Not yet J

56 Four Prototypical Trajectories Beta: The probability density for probabilities

57 Piech, CS106A, Stanford University Any parameter for a parameterized random variable can be thought of as a random variable.

58 Course Mean Ε[CS109] This is actual midpoint of course (Just wanted you to know)

The Random Variable for Probabilities Chris Piech CS109, Stanford University

The Random Variable for Probabilities Chris Piech CS109, Stanford University The Random Variable for Probabilities Chris Piech CS109, Stanford University Assignment Grades 10 20 30 40 50 60 70 80 90 100 10 20 30 40 50 60 70 80 90 100 Frequency Frequency 10 20 30 40 50 60 70 80

More information

Conditional distributions

Conditional distributions Conditional distributions Will Monroe July 6, 017 with materials by Mehran Sahami and Chris Piech Independence of discrete random variables Two random variables are independent if knowing the value of

More information

CS 361: Probability & Statistics

CS 361: Probability & Statistics March 14, 2018 CS 361: Probability & Statistics Inference The prior From Bayes rule, we know that we can express our function of interest as Likelihood Prior Posterior The right hand side contains the

More information

Debugging Intuition. How to calculate the probability of at least k successes in n trials?

Debugging Intuition. How to calculate the probability of at least k successes in n trials? How to calculate the probability of at least k successes in n trials? X is number of successes in n trials each with probability p # ways to choose slots for success Correct: Debugging Intuition P (X k)

More information

Poisson Chris Piech CS109, Stanford University. Piech, CS106A, Stanford University

Poisson Chris Piech CS109, Stanford University. Piech, CS106A, Stanford University Poisson Chris Piech CS109, Stanford University Piech, CS106A, Stanford University Probability for Extreme Weather? Piech, CS106A, Stanford University Four Prototypical Trajectories Review Binomial Random

More information

Continuous Variables Chris Piech CS109, Stanford University

Continuous Variables Chris Piech CS109, Stanford University Continuous Variables Chris Piech CS109, Stanford University 1906 Earthquak Magnitude 7.8 Learning Goals 1. Comfort using new discrete random variables 2. Integrate a density function (PDF) to get a probability

More information

CS 361: Probability & Statistics

CS 361: Probability & Statistics October 17, 2017 CS 361: Probability & Statistics Inference Maximum likelihood: drawbacks A couple of things might trip up max likelihood estimation: 1) Finding the maximum of some functions can be quite

More information

Central Theorems Chris Piech CS109, Stanford University

Central Theorems Chris Piech CS109, Stanford University Central Theorems Chris Piech CS109, Stanford University Silence!! And now a moment of silence......before we present......a beautiful result of probability theory! Four Prototypical Trajectories Central

More information

Random Variables Chris Piech CS109, Stanford University. Piech, CS106A, Stanford University

Random Variables Chris Piech CS109, Stanford University. Piech, CS106A, Stanford University Random Variables Chris Piech CS109, Stanford University Piech, CS106A, Stanford University Piech, CS106A, Stanford University Let s Play a Game Game set-up We have a fair coin (come up heads with p = 0.5)

More information

Bayesian Models in Machine Learning

Bayesian Models in Machine Learning Bayesian Models in Machine Learning Lukáš Burget Escuela de Ciencias Informáticas 2017 Buenos Aires, July 24-29 2017 Frequentist vs. Bayesian Frequentist point of view: Probability is the frequency of

More information

Lecture 2: Repetition of probability theory and statistics

Lecture 2: Repetition of probability theory and statistics Algorithms for Uncertainty Quantification SS8, IN2345 Tobias Neckel Scientific Computing in Computer Science TUM Lecture 2: Repetition of probability theory and statistics Concept of Building Block: Prerequisites:

More information

Conjugate Priors, Uninformative Priors

Conjugate Priors, Uninformative Priors Conjugate Priors, Uninformative Priors Nasim Zolaktaf UBC Machine Learning Reading Group January 2016 Outline Exponential Families Conjugacy Conjugate priors Mixture of conjugate prior Uninformative priors

More information

STAT 430/510: Lecture 15

STAT 430/510: Lecture 15 STAT 430/510: Lecture 15 James Piette June 23, 2010 Updates HW4 is up on my website. It is due next Mon. (June 28th). Starting today back at section 6.4... Conditional Distribution: Discrete Def: The conditional

More information

Compute f(x θ)f(θ) dθ

Compute f(x θ)f(θ) dθ Bayesian Updating: Continuous Priors 18.05 Spring 2014 b a Compute f(x θ)f(θ) dθ January 1, 2017 1 /26 Beta distribution Beta(a, b) has density (a + b 1)! f (θ) = θ a 1 (1 θ) b 1 (a 1)!(b 1)! http://mathlets.org/mathlets/beta-distribution/

More information

Estimation of reliability parameters from Experimental data (Parte 2) Prof. Enrico Zio

Estimation of reliability parameters from Experimental data (Parte 2) Prof. Enrico Zio Estimation of reliability parameters from Experimental data (Parte 2) This lecture Life test (t 1,t 2,...,t n ) Estimate θ of f T t θ For example: λ of f T (t)= λe - λt Classical approach (frequentist

More information

DS-GA 1002 Lecture notes 11 Fall Bayesian statistics

DS-GA 1002 Lecture notes 11 Fall Bayesian statistics DS-GA 100 Lecture notes 11 Fall 016 Bayesian statistics In the frequentist paradigm we model the data as realizations from a distribution that depends on deterministic parameters. In contrast, in Bayesian

More information

This exam is closed book and closed notes. (You will have access to a copy of the Table of Common Distributions given in the back of the text.

This exam is closed book and closed notes. (You will have access to a copy of the Table of Common Distributions given in the back of the text. TEST #3 STA 5326 December 4, 214 Name: Please read the following directions. DO NOT TURN THE PAGE UNTIL INSTRUCTED TO DO SO Directions This exam is closed book and closed notes. (You will have access to

More information

The Exciting Guide To Probability Distributions Part 2. Jamie Frost v1.1

The Exciting Guide To Probability Distributions Part 2. Jamie Frost v1.1 The Exciting Guide To Probability Distributions Part 2 Jamie Frost v. Contents Part 2 A revisit of the multinomial distribution The Dirichlet Distribution The Beta Distribution Conjugate Priors The Gamma

More information

Algorithms for Uncertainty Quantification

Algorithms for Uncertainty Quantification Algorithms for Uncertainty Quantification Tobias Neckel, Ionuț-Gabriel Farcaș Lehrstuhl Informatik V Summer Semester 2017 Lecture 2: Repetition of probability theory and statistics Example: coin flip Example

More information

Computational Perception. Bayesian Inference

Computational Perception. Bayesian Inference Computational Perception 15-485/785 January 24, 2008 Bayesian Inference The process of probabilistic inference 1. define model of problem 2. derive posterior distributions and estimators 3. estimate parameters

More information

ECE521 W17 Tutorial 6. Min Bai and Yuhuai (Tony) Wu

ECE521 W17 Tutorial 6. Min Bai and Yuhuai (Tony) Wu ECE521 W17 Tutorial 6 Min Bai and Yuhuai (Tony) Wu Agenda knn and PCA Bayesian Inference k-means Technique for clustering Unsupervised pattern and grouping discovery Class prediction Outlier detection

More information

Probability. Machine Learning and Pattern Recognition. Chris Williams. School of Informatics, University of Edinburgh. August 2014

Probability. Machine Learning and Pattern Recognition. Chris Williams. School of Informatics, University of Edinburgh. August 2014 Probability Machine Learning and Pattern Recognition Chris Williams School of Informatics, University of Edinburgh August 2014 (All of the slides in this course have been adapted from previous versions

More information

Some slides from Carlos Guestrin, Luke Zettlemoyer & K Gajos 2

Some slides from Carlos Guestrin, Luke Zettlemoyer & K Gajos 2 Logistics CSE 446: Point Estimation Winter 2012 PS2 out shortly Dan Weld Some slides from Carlos Guestrin, Luke Zettlemoyer & K Gajos 2 Last Time Random variables, distributions Marginal, joint & conditional

More information

Discrete Distributions

Discrete Distributions Discrete Distributions STA 281 Fall 2011 1 Introduction Previously we defined a random variable to be an experiment with numerical outcomes. Often different random variables are related in that they have

More information

3 Multiple Discrete Random Variables

3 Multiple Discrete Random Variables 3 Multiple Discrete Random Variables 3.1 Joint densities Suppose we have a probability space (Ω, F,P) and now we have two discrete random variables X and Y on it. They have probability mass functions f

More information

Machine Learning. Probability Basics. Marc Toussaint University of Stuttgart Summer 2014

Machine Learning. Probability Basics. Marc Toussaint University of Stuttgart Summer 2014 Machine Learning Probability Basics Basic definitions: Random variables, joint, conditional, marginal distribution, Bayes theorem & examples; Probability distributions: Binomial, Beta, Multinomial, Dirichlet,

More information

Things to remember when learning probability distributions:

Things to remember when learning probability distributions: SPECIAL DISTRIBUTIONS Some distributions are special because they are useful They include: Poisson, exponential, Normal (Gaussian), Gamma, geometric, negative binomial, Binomial and hypergeometric distributions

More information

Probability Density Functions and the Normal Distribution. Quantitative Understanding in Biology, 1.2

Probability Density Functions and the Normal Distribution. Quantitative Understanding in Biology, 1.2 Probability Density Functions and the Normal Distribution Quantitative Understanding in Biology, 1.2 1. Discrete Probability Distributions 1.1. The Binomial Distribution Question: You ve decided to flip

More information

Bayesian Inference. STA 121: Regression Analysis Artin Armagan

Bayesian Inference. STA 121: Regression Analysis Artin Armagan Bayesian Inference STA 121: Regression Analysis Artin Armagan Bayes Rule...s! Reverend Thomas Bayes Posterior Prior p(θ y) = p(y θ)p(θ)/p(y) Likelihood - Sampling Distribution Normalizing Constant: p(y

More information

Machine Learning CMPT 726 Simon Fraser University. Binomial Parameter Estimation

Machine Learning CMPT 726 Simon Fraser University. Binomial Parameter Estimation Machine Learning CMPT 726 Simon Fraser University Binomial Parameter Estimation Outline Maximum Likelihood Estimation Smoothed Frequencies, Laplace Correction. Bayesian Approach. Conjugate Prior. Uniform

More information

Bayesian RL Seminar. Chris Mansley September 9, 2008

Bayesian RL Seminar. Chris Mansley September 9, 2008 Bayesian RL Seminar Chris Mansley September 9, 2008 Bayes Basic Probability One of the basic principles of probability theory, the chain rule, will allow us to derive most of the background material in

More information

Gradient Ascent Chris Piech CS109, Stanford University

Gradient Ascent Chris Piech CS109, Stanford University Gradient Ascent Chris Piech CS109, Stanford University Our Path Deep Learning Linear Regression Naïve Bayes Logistic Regression Parameter Estimation Our Path Deep Learning Linear Regression Naïve Bayes

More information

Computational Cognitive Science

Computational Cognitive Science Computational Cognitive Science Lecture 9: Bayesian Estimation Chris Lucas (Slides adapted from Frank Keller s) School of Informatics University of Edinburgh clucas2@inf.ed.ac.uk 17 October, 2017 1 / 28

More information

ELEG 3143 Probability & Stochastic Process Ch. 2 Discrete Random Variables

ELEG 3143 Probability & Stochastic Process Ch. 2 Discrete Random Variables Department of Electrical Engineering University of Arkansas ELEG 3143 Probability & Stochastic Process Ch. 2 Discrete Random Variables Dr. Jingxian Wu wuj@uark.edu OUTLINE 2 Random Variable Discrete Random

More information

Announcements. Proposals graded

Announcements. Proposals graded Announcements Proposals graded Kevin Jamieson 2018 1 Hypothesis testing Machine Learning CSE546 Kevin Jamieson University of Washington October 30, 2018 2018 Kevin Jamieson 2 Anomaly detection You are

More information

Probability. Table of contents

Probability. Table of contents Probability Table of contents 1. Important definitions 2. Distributions 3. Discrete distributions 4. Continuous distributions 5. The Normal distribution 6. Multivariate random variables 7. Other continuous

More information

Point Estimation. Vibhav Gogate The University of Texas at Dallas

Point Estimation. Vibhav Gogate The University of Texas at Dallas Point Estimation Vibhav Gogate The University of Texas at Dallas Some slides courtesy of Carlos Guestrin, Chris Bishop, Dan Weld and Luke Zettlemoyer. Basics: Expectation and Variance Binary Variables

More information

Computational Cognitive Science

Computational Cognitive Science Computational Cognitive Science Lecture 9: A Bayesian model of concept learning Chris Lucas School of Informatics University of Edinburgh October 16, 218 Reading Rules and Similarity in Concept Learning

More information

Page Max. Possible Points Total 100

Page Max. Possible Points Total 100 Math 3215 Exam 2 Summer 2014 Instructor: Sal Barone Name: GT username: 1. No books or notes are allowed. 2. You may use ONLY NON-GRAPHING and NON-PROGRAMABLE scientific calculators. All other electronic

More information

COMP90051 Statistical Machine Learning

COMP90051 Statistical Machine Learning COMP90051 Statistical Machine Learning Semester 2, 2017 Lecturer: Trevor Cohn 2. Statistical Schools Adapted from slides by Ben Rubinstein Statistical Schools of Thought Remainder of lecture is to provide

More information

Discrete Binary Distributions

Discrete Binary Distributions Discrete Binary Distributions Carl Edward Rasmussen November th, 26 Carl Edward Rasmussen Discrete Binary Distributions November th, 26 / 5 Key concepts Bernoulli: probabilities over binary variables Binomial:

More information

Bernoulli and Binomial

Bernoulli and Binomial Bernoulli and Binomial Will Monroe July 1, 217 image: Antoine Taveneaux with materials by Mehran Sahami and Chris Piech Announcements: Problem Set 2 Due this Wednesday, 7/12, at 12:3pm (before class).

More information

STAT 430/510 Probability

STAT 430/510 Probability STAT 430/510 Probability Hui Nie Lecture 16 June 24th, 2009 Review Sum of Independent Normal Random Variables Sum of Independent Poisson Random Variables Sum of Independent Binomial Random Variables Conditional

More information

EXAM. Exam #1. Math 3342 Summer II, July 21, 2000 ANSWERS

EXAM. Exam #1. Math 3342 Summer II, July 21, 2000 ANSWERS EXAM Exam # Math 3342 Summer II, 2 July 2, 2 ANSWERS i pts. Problem. Consider the following data: 7, 8, 9, 2,, 7, 2, 3. Find the first quartile, the median, and the third quartile. Make a box and whisker

More information

Random variables. DS GA 1002 Probability and Statistics for Data Science.

Random variables. DS GA 1002 Probability and Statistics for Data Science. Random variables DS GA 1002 Probability and Statistics for Data Science http://www.cims.nyu.edu/~cfgranda/pages/dsga1002_fall17 Carlos Fernandez-Granda Motivation Random variables model numerical quantities

More information

The Normal Distribution

The Normal Distribution The Normal Distribution image: Etsy Will Monroe July 19, 017 with materials by Mehran Sahami and Chris Piech Announcements: Midterm A week from yesterday: Tuesday, July 5, 7:00-9:00pm Building 30-105 One

More information

Introduction: MLE, MAP, Bayesian reasoning (28/8/13)

Introduction: MLE, MAP, Bayesian reasoning (28/8/13) STA561: Probabilistic machine learning Introduction: MLE, MAP, Bayesian reasoning (28/8/13) Lecturer: Barbara Engelhardt Scribes: K. Ulrich, J. Subramanian, N. Raval, J. O Hollaren 1 Classifiers In this

More information

IEOR 3106: Introduction to Operations Research: Stochastic Models. Professor Whitt. SOLUTIONS to Homework Assignment 2

IEOR 3106: Introduction to Operations Research: Stochastic Models. Professor Whitt. SOLUTIONS to Homework Assignment 2 IEOR 316: Introduction to Operations Research: Stochastic Models Professor Whitt SOLUTIONS to Homework Assignment 2 More Probability Review: In the Ross textbook, Introduction to Probability Models, read

More information

18.440: Lecture 28 Lectures Review

18.440: Lecture 28 Lectures Review 18.440: Lecture 28 Lectures 18-27 Review Scott Sheffield MIT Outline Outline It s the coins, stupid Much of what we have done in this course can be motivated by the i.i.d. sequence X i where each X i is

More information

Probability and Estimation. Alan Moses

Probability and Estimation. Alan Moses Probability and Estimation Alan Moses Random variables and probability A random variable is like a variable in algebra (e.g., y=e x ), but where at least part of the variability is taken to be stochastic.

More information

Independent random variables

Independent random variables Will Monroe July 4, 017 with materials by Mehran Sahami and Chris Piech Independent random variables Announcements: Midterm Tomorrow! Tuesday, July 5, 7:00-9:00pm Building 30-105 (main quad, Geology Corner)

More information

Bayesian Inference. Chapter 2: Conjugate models

Bayesian Inference. Chapter 2: Conjugate models Bayesian Inference Chapter 2: Conjugate models Conchi Ausín and Mike Wiper Department of Statistics Universidad Carlos III de Madrid Master in Business Administration and Quantitative Methods Master in

More information

Lecture 8 : The Geometric Distribution

Lecture 8 : The Geometric Distribution 0/ 24 The geometric distribution is a special case of negative binomial, it is the case r = 1. It is so important we give it special treatment. Motivating example Suppose a couple decides to have children

More information

5. Conditional Distributions

5. Conditional Distributions 1 of 12 7/16/2009 5:36 AM Virtual Laboratories > 3. Distributions > 1 2 3 4 5 6 7 8 5. Conditional Distributions Basic Theory As usual, we start with a random experiment with probability measure P on an

More information

Expectation. DS GA 1002 Statistical and Mathematical Models. Carlos Fernandez-Granda

Expectation. DS GA 1002 Statistical and Mathematical Models.   Carlos Fernandez-Granda Expectation DS GA 1002 Statistical and Mathematical Models http://www.cims.nyu.edu/~cfgranda/pages/dsga1002_fall16 Carlos Fernandez-Granda Aim Describe random variables with a few numbers: mean, variance,

More information

Accouncements. You should turn in a PDF and a python file(s) Figure for problem 9 should be in the PDF

Accouncements. You should turn in a PDF and a python file(s) Figure for problem 9 should be in the PDF Accouncements You should turn in a PDF and a python file(s) Figure for problem 9 should be in the PDF Please do not zip these files and submit (unless there are >5 files) 1 Bayesian Methods Machine Learning

More information

Random Variables. Definition: A random variable (r.v.) X on the probability space (Ω, F, P) is a mapping

Random Variables. Definition: A random variable (r.v.) X on the probability space (Ω, F, P) is a mapping Random Variables Example: We roll a fair die 6 times. Suppose we are interested in the number of 5 s in the 6 rolls. Let X = number of 5 s. Then X could be 0, 1, 2, 3, 4, 5, 6. X = 0 corresponds to the

More information

Multivariate random variables

Multivariate random variables Multivariate random variables DS GA 1002 Statistical and Mathematical Models http://www.cims.nyu.edu/~cfgranda/pages/dsga1002_fall16 Carlos Fernandez-Granda Joint distributions Tool to characterize several

More information

Chapter 2: Random Variables

Chapter 2: Random Variables ECE54: Stochastic Signals and Systems Fall 28 Lecture 2 - September 3, 28 Dr. Salim El Rouayheb Scribe: Peiwen Tian, Lu Liu, Ghadir Ayache Chapter 2: Random Variables Example. Tossing a fair coin twice:

More information

Lecture 3. Discrete Random Variables

Lecture 3. Discrete Random Variables Math 408 - Mathematical Statistics Lecture 3. Discrete Random Variables January 23, 2013 Konstantin Zuev (USC) Math 408, Lecture 3 January 23, 2013 1 / 14 Agenda Random Variable: Motivation and Definition

More information

Introduction to Applied Bayesian Modeling. ICPSR Day 4

Introduction to Applied Bayesian Modeling. ICPSR Day 4 Introduction to Applied Bayesian Modeling ICPSR Day 4 Simple Priors Remember Bayes Law: Where P(A) is the prior probability of A Simple prior Recall the test for disease example where we specified the

More information

Chapter 5. Means and Variances

Chapter 5. Means and Variances 1 Chapter 5 Means and Variances Our discussion of probability has taken us from a simple classical view of counting successes relative to total outcomes and has brought us to the idea of a probability

More information

Review of probability

Review of probability Review of probability Computer Sciences 760 Spring 2014 http://pages.cs.wisc.edu/~dpage/cs760/ Goals for the lecture you should understand the following concepts definition of probability random variables

More information

PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 2: PROBABILITY DISTRIBUTIONS

PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 2: PROBABILITY DISTRIBUTIONS PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 2: PROBABILITY DISTRIBUTIONS Parametric Distributions Basic building blocks: Need to determine given Representation: or? Recall Curve Fitting Binary Variables

More information

Computational Cognitive Science

Computational Cognitive Science Computational Cognitive Science Lecture 8: Frank Keller School of Informatics University of Edinburgh keller@inf.ed.ac.uk Based on slides by Sharon Goldwater October 14, 2016 Frank Keller Computational

More information

Lecture 2: Conjugate priors

Lecture 2: Conjugate priors (Spring ʼ) Lecture : Conjugate priors Julia Hockenmaier juliahmr@illinois.edu Siebel Center http://www.cs.uiuc.edu/class/sp/cs98jhm The binomial distribution If p is the probability of heads, the probability

More information

DS-GA 1003: Machine Learning and Computational Statistics Homework 7: Bayesian Modeling

DS-GA 1003: Machine Learning and Computational Statistics Homework 7: Bayesian Modeling DS-GA 1003: Machine Learning and Computational Statistics Homework 7: Bayesian Modeling Due: Tuesday, May 10, 2016, at 6pm (Submit via NYU Classes) Instructions: Your answers to the questions below, including

More information

Exam 2 Practice Questions, 18.05, Spring 2014

Exam 2 Practice Questions, 18.05, Spring 2014 Exam 2 Practice Questions, 18.05, Spring 2014 Note: This is a set of practice problems for exam 2. The actual exam will be much shorter. Within each section we ve arranged the problems roughly in order

More information

Chapter 4 Multiple Random Variables

Chapter 4 Multiple Random Variables Review for the previous lecture Theorems and Examples: How to obtain the pmf (pdf) of U = g ( X Y 1 ) and V = g ( X Y) Chapter 4 Multiple Random Variables Chapter 43 Bivariate Transformations Continuous

More information

Statistics: Learning models from data

Statistics: Learning models from data DS-GA 1002 Lecture notes 5 October 19, 2015 Statistics: Learning models from data Learning models from data that are assumed to be generated probabilistically from a certain unknown distribution is a crucial

More information

Review of Probabilities and Basic Statistics

Review of Probabilities and Basic Statistics Alex Smola Barnabas Poczos TA: Ina Fiterau 4 th year PhD student MLD Review of Probabilities and Basic Statistics 10-701 Recitations 1/25/2013 Recitation 1: Statistics Intro 1 Overview Introduction to

More information

Bayesian Learning. HT2015: SC4 Statistical Data Mining and Machine Learning. Maximum Likelihood Principle. The Bayesian Learning Framework

Bayesian Learning. HT2015: SC4 Statistical Data Mining and Machine Learning. Maximum Likelihood Principle. The Bayesian Learning Framework HT5: SC4 Statistical Data Mining and Machine Learning Dino Sejdinovic Department of Statistics Oxford http://www.stats.ox.ac.uk/~sejdinov/sdmml.html Maximum Likelihood Principle A generative model for

More information

STAT 516 Midterm Exam 3 Friday, April 18, 2008

STAT 516 Midterm Exam 3 Friday, April 18, 2008 STAT 56 Midterm Exam 3 Friday, April 8, 2008 Name Purdue student ID (0 digits). The testing booklet contains 8 questions. 2. Permitted Texas Instruments calculators: BA-35 BA II Plus BA II Plus Professional

More information

Bivariate distributions

Bivariate distributions Bivariate distributions 3 th October 017 lecture based on Hogg Tanis Zimmerman: Probability and Statistical Inference (9th ed.) Bivariate Distributions of the Discrete Type The Correlation Coefficient

More information

Expectation. DS GA 1002 Probability and Statistics for Data Science. Carlos Fernandez-Granda

Expectation. DS GA 1002 Probability and Statistics for Data Science.   Carlos Fernandez-Granda Expectation DS GA 1002 Probability and Statistics for Data Science http://www.cims.nyu.edu/~cfgranda/pages/dsga1002_fall17 Carlos Fernandez-Granda Aim Describe random variables with a few numbers: mean,

More information

STAT/MATH 395 A - PROBABILITY II UW Winter Quarter Moment functions. x r p X (x) (1) E[X r ] = x r f X (x) dx (2) (x E[X]) r p X (x) (3)

STAT/MATH 395 A - PROBABILITY II UW Winter Quarter Moment functions. x r p X (x) (1) E[X r ] = x r f X (x) dx (2) (x E[X]) r p X (x) (3) STAT/MATH 395 A - PROBABILITY II UW Winter Quarter 07 Néhémy Lim Moment functions Moments of a random variable Definition.. Let X be a rrv on probability space (Ω, A, P). For a given r N, E[X r ], if it

More information

Lecture 10: Normal RV. Lisa Yan July 18, 2018

Lecture 10: Normal RV. Lisa Yan July 18, 2018 Lecture 10: Normal RV Lisa Yan July 18, 2018 Announcements Midterm next Tuesday Practice midterm, solutions out on website SCPD students: fill out Google form by today Covers up to and including Friday

More information

Chapter 4 Multiple Random Variables

Chapter 4 Multiple Random Variables Review for the previous lecture Definition: n-dimensional random vector, joint pmf (pdf), marginal pmf (pdf) Theorem: How to calculate marginal pmf (pdf) given joint pmf (pdf) Example: How to calculate

More information

Parameter Learning With Binary Variables

Parameter Learning With Binary Variables With Binary Variables University of Nebraska Lincoln CSCE 970 Pattern Recognition Outline Outline 1 Learning a Single Parameter 2 More on the Beta Density Function 3 Computing a Probability Interval Outline

More information

SAMPLE CHAPTER. Avi Pfeffer. FOREWORD BY Stuart Russell MANNING

SAMPLE CHAPTER. Avi Pfeffer. FOREWORD BY Stuart Russell MANNING SAMPLE CHAPTER Avi Pfeffer FOREWORD BY Stuart Russell MANNING Practical Probabilistic Programming by Avi Pfeffer Chapter 9 Copyright 2016 Manning Publications brief contents PART 1 INTRODUCING PROBABILISTIC

More information

Problem Set #5. Due: 1pm on Friday, Nov 16th

Problem Set #5. Due: 1pm on Friday, Nov 16th 1 Chris Piech CS 109 Problem Set #5 Due: 1pm on Friday, Nov 16th Problem Set #5 Nov 7 th, 2018 With problems by Mehran Sahami and Chris Piech For each problem, briefly explain/justify how you obtained

More information

Northwestern University Department of Electrical Engineering and Computer Science

Northwestern University Department of Electrical Engineering and Computer Science Northwestern University Department of Electrical Engineering and Computer Science EECS 454: Modeling and Analysis of Communication Networks Spring 2008 Probability Review As discussed in Lecture 1, probability

More information

CS 630 Basic Probability and Information Theory. Tim Campbell

CS 630 Basic Probability and Information Theory. Tim Campbell CS 630 Basic Probability and Information Theory Tim Campbell 21 January 2003 Probability Theory Probability Theory is the study of how best to predict outcomes of events. An experiment (or trial or event)

More information

CDA6530: Performance Models of Computers and Networks. Chapter 2: Review of Practical Random Variables

CDA6530: Performance Models of Computers and Networks. Chapter 2: Review of Practical Random Variables CDA6530: Performance Models of Computers and Networks Chapter 2: Review of Practical Random Variables Two Classes of R.V. Discrete R.V. Bernoulli Binomial Geometric Poisson Continuous R.V. Uniform Exponential,

More information

Intro to Bayesian Methods

Intro to Bayesian Methods Intro to Bayesian Methods Rebecca C. Steorts Bayesian Methods and Modern Statistics: STA 360/601 Lecture 1 1 Course Webpage Syllabus LaTeX reference manual R markdown reference manual Please come to office

More information

Common Discrete Distributions

Common Discrete Distributions Common Discrete Distributions Statistics 104 Autumn 2004 Taken from Statistics 110 Lecture Notes Copyright c 2004 by Mark E. Irwin Common Discrete Distributions There are a wide range of popular discrete

More information

Probability Background

Probability Background CS76 Spring 0 Advanced Machine Learning robability Background Lecturer: Xiaojin Zhu jerryzhu@cs.wisc.edu robability Meure A sample space Ω is the set of all possible outcomes. Elements ω Ω are called sample

More information

Introduction to Machine Learning

Introduction to Machine Learning Introduction to Machine Learning Generative Models Varun Chandola Computer Science & Engineering State University of New York at Buffalo Buffalo, NY, USA chandola@buffalo.edu Chandola@UB CSE 474/574 1

More information

Lecture 1: Probability Fundamentals

Lecture 1: Probability Fundamentals Lecture 1: Probability Fundamentals IB Paper 7: Probability and Statistics Carl Edward Rasmussen Department of Engineering, University of Cambridge January 22nd, 2008 Rasmussen (CUED) Lecture 1: Probability

More information

Introduction to Machine Learning

Introduction to Machine Learning What does this mean? Outline Contents Introduction to Machine Learning Introduction to Probabilistic Methods Varun Chandola December 26, 2017 1 Introduction to Probability 1 2 Random Variables 3 3 Bayes

More information

Learning from Sensor Data: Set II. Behnaam Aazhang J.S. Abercombie Professor Electrical and Computer Engineering Rice University

Learning from Sensor Data: Set II. Behnaam Aazhang J.S. Abercombie Professor Electrical and Computer Engineering Rice University Learning from Sensor Data: Set II Behnaam Aazhang J.S. Abercombie Professor Electrical and Computer Engineering Rice University 1 6. Data Representation The approach for learning from data Probabilistic

More information

Fundamentals. CS 281A: Statistical Learning Theory. Yangqing Jia. August, Based on tutorial slides by Lester Mackey and Ariel Kleiner

Fundamentals. CS 281A: Statistical Learning Theory. Yangqing Jia. August, Based on tutorial slides by Lester Mackey and Ariel Kleiner Fundamentals CS 281A: Statistical Learning Theory Yangqing Jia Based on tutorial slides by Lester Mackey and Ariel Kleiner August, 2011 Outline 1 Probability 2 Statistics 3 Linear Algebra 4 Optimization

More information

Modeling Environment

Modeling Environment Topic Model Modeling Environment What does it mean to understand/ your environment? Ability to predict Two approaches to ing environment of words and text Latent Semantic Analysis (LSA) Topic Model LSA

More information

Applied Bayesian Statistics STAT 388/488

Applied Bayesian Statistics STAT 388/488 STAT 388/488 Dr. Earvin Balderama Department of Mathematics & Statistics Loyola University Chicago August 29, 207 Course Info STAT 388/488 http://math.luc.edu/~ebalderama/bayes 2 A motivating example (See

More information

Sample Spaces, Random Variables

Sample Spaces, Random Variables Sample Spaces, Random Variables Moulinath Banerjee University of Michigan August 3, 22 Probabilities In talking about probabilities, the fundamental object is Ω, the sample space. (elements) in Ω are denoted

More information

Polynomials; Add/Subtract

Polynomials; Add/Subtract Chapter 7 Polynomials Polynomials; Add/Subtract Polynomials sounds tough enough. But, if you look at it close enough you ll notice that students have worked with polynomial expressions such as 6x 2 + 5x

More information

Continuous random variables

Continuous random variables Continuous random variables CE 311S What was the difference between discrete and continuous random variables? The possible outcomes of a discrete random variable (finite or infinite) can be listed out;

More information

(1) Introduction to Bayesian statistics

(1) Introduction to Bayesian statistics Spring, 2018 A motivating example Student 1 will write down a number and then flip a coin If the flip is heads, they will honestly tell student 2 if the number is even or odd If the flip is tails, they

More information

Basics on Probability. Jingrui He 09/11/2007

Basics on Probability. Jingrui He 09/11/2007 Basics on Probability Jingrui He 09/11/2007 Coin Flips You flip a coin Head with probability 0.5 You flip 100 coins How many heads would you expect Coin Flips cont. You flip a coin Head with probability

More information

Machine Learning 4771

Machine Learning 4771 Machine Learning 4771 Instructor: Tony Jebara Topic 11 Maximum Likelihood as Bayesian Inference Maximum A Posteriori Bayesian Gaussian Estimation Why Maximum Likelihood? So far, assumed max (log) likelihood

More information