The Random Variable for Probabilities Chris Piech CS109, Stanford University
|
|
- Diana Hood
- 5 years ago
- Views:
Transcription
1 The Random Variable for Probabilities Chris Piech CS109, Stanford University
2 Assignment Grades Frequency Frequency Frequency Frequency We have 2055 assignment distributions from grade scope
3 Flip a Coin With Unknown Probability Demo
4 Four Prototypical Trajectories Today we are going to learn something unintuitive, beautiful and useful
5 Four Prototypical Trajectories Review
6 Conditional Events Recall that for events E and F: P( EF) P ( E F) = where P( F) > P( F) E F 0
7 Discrete Conditional Distributions
8 Continuous Conditional Distributions Let X and Y be continuous random variables Conditional PDF of X given Y (where f Y (y) > 0): P (X = x Y = y) = P (X = x, Y = y) P (Y = y) f X Y (x, y)@x = f X,Y (x, y)@x@y f X Y (y)@y f X Y (x, y)@x = f X,Y (x, y)@x f X Y (y) f X Y (x, y) = f X,Y (x, y) f X Y (y)
9 Piech, CS106A, Stanford University Conditioning with a continuous random variable feels weird at first. But then it gets good. Its like biking with a helmet
10 Continuous Conditional Distributions Let X be continuous random variable Let E be an event: P (X = x, E) P (E X = x) = P (X = x) P (X = x E)P (E) = P (X = x) = f X(x E)P (E)@x f X (x)@x = f X(x E)P (E) f X (x)
11 Piech, CS106A, Stanford University Anomaly Detection Let X be a measure of time to answer a question Let E be the event that the user is a human: P (X = x, E) P (E X = x) = P (X = x) P (X = x E)P (E) = P (X = x) = f X(x E)P (E)@x f X (x)@x = f X(x E)P (E) f X (x)
12 Piech, CS106A, Stanford University Anomaly Detection Let X be a measure of time to answer a question Let E be the event that the user is a human What if you don t know unconditional?: Normal pdf P (E X = x) = f X(x E)P (E) f X (x) P (E X = x) P (E C X = x) =
13 Biometric Keystroke Piech, CS106A, Stanford University
14 Let X be a continuous random variable Let N be a discrete random variable Conditional PDF of X given N: Conditional PMF of N given X: If X and N are independent, then: ) ( ) ( ) ( ) ( n p x f x n p n x f N X X N N X = ) ( ) ( ) ( ) ( x f n p n x f x n p X N N X X N = ) ( ) ( x f n x f X N X = ) ( ) ( n p x n p N X N = Mixing Discrete and Continuous
15 Four Prototypical Trajectories End Review
16
17 Piech, CS106A, Stanford University We are going to think of probabilities as random variables!!!
18 Flip a Coin With Unknown Probability Flip a coin (n + m) times, comes up with n heads We don t know probability X that coin comes up heads Frequentist Bayesian X = lim n+m!1 n n + m n n + m f X N (x n) = P (N = n X = x)f X (x) P (N = n) X is a single value X is a random variable
19 Flip a Coin With Unknown Probability Flip a coin (n + m) times, comes up with n heads We don t know probability X that coin comes up heads Our belief before flipping coins is that: X ~ Uni(0, 1) Let N = number of heads Given X = x, coin flips independent: (N X) ~Bin(n +m, x) f X N (x n) = P (N = n X = x)f X(x) P (N = n) Bayesian posterior probability distribution Bayesian prior probability distribution
20 Flip a Coin With Unknown Probability Flip a coin (n + m) times, comes up with n heads We don t know probability X that coin comes up heads Our belief before flipping coins is that: X ~ Uni(0, 1) Let N = number of heads Given X = x, coin flips independent: (N X) ~Bin(n +m, x) f X N (x n) = P (N = n X = x)f X(x) P (N = n) Binomial = = n+m n x n (1 x) m P (N = n) n+m n P (N = n) xn (1 x) m Z = 1 c xn (1 x) m where c = 1 Z 1 0 x n (1 x) m dx
21 Beta Random Variable f X is a Beta Random Variable: X ~ Beta(a, b) B( ( x) = 0 Probability Density Function (PDF): (where a, b > 0) 1 x, ) a x a b 1 (1 ) b 1 0 < x < 1 otherwise where B 1 a 1 b 1 ( a, b) = x (1 x) dx 0 Symmetric when a = b a E[ X ] = Var X ) a + b ( = 2 ( a + b) ab ( a + b + 1)
22 Meta Beta Used to represent a distributed belief of a probability
23 Piech, CS106A, Stanford University Beta is a distribution for probabilities
24 Back to flipping coins Flip a coin (n + m) times, comes up with n heads We don t know probability X that coin comes up heads Our belief before flipping coins is that: X ~ Uni(0, 1) Let N = number of heads Given X = x, coin flips independent: (N X) ~Bin(n +m, x) f X N (x n) = P (N = n X = x)f X(x) P (N = n) = = n+m n x n (1 x) m P (N = n) n+m n P (N = n) xn (1 x) m Z = 1 c xn (1 x) m where c = Z 1 0 x n (1 x) m dx
25 Flip a coin (n + m) times, comes up with n heads Conditional density of X given N = n f X Note: 1 n m n m N ( x n) = x (1 x) where c = x (1 x) dx c Recall Beta distribution: B( f ( x) = 0 Dude, Where s My Beta? 0 < x < 1, so f ( x n) = 1 x, ) a x a b 1 (1 ) b 1 0 < x < 1 otherwise Hey, that looks more familiar now... X (N = n, n + m trials) ~ Beta(n + 1, m + 1) X N B otherwise 1 a 1 b 1 ( a, b) = x (1 x) dx 0
26 Understanding Beta X (N = n, m + n trials) ~ Beta(n + 1, m + 1) X ~ Uni(0, 1) Check this out, boss: o Beta(1, 1) =? f 1 B( a, b) a 1 b 1 0 ( x) = x (1 x) = x (1 x 1 = 1 1= 1 where 0 < x < 1 1dx 0 1 B( a, b) ) 0 o Beta(1, 1) = Uni(0, 1) So, X ~ Beta(1, 1)
27 If the Prior was a Beta If our belief about X (that random variable for probability) was beta f X (x) = 1 B(a, b) xa 1 (1 x) b 1 What is our belief about X after observing N heads? f X N (x, n) =???
28 If the Prior was a Beta f X N (x, n) = P (N = n X = x)f X(x) P (N = n) = n+m n x n (1 x) m f X (x) P (N = n) n+m n x n (1 x) m 1 B(a,b) = xa 1 (1 x) b 1 P (N = n) n + m = K 1 x n (1 x) m 1 n B(a, b) xa 1 (1 x) b 1 = K 2 x n (1 x) m 1 B(a, b) xa 1 (1 x) b 1 = K 2 x n (1 x) m x a 1 (1 x) b 1 = K 2 x n+a 1 (1 x) m+b 1 X N Beta(n + a, m + b)
29 Understanding Beta If Prior distribution of X (before seeing flips) is Beta Then Posterior distribution of X (after flips) is Beta Beta is a conjugate distribution for Beta Prior and posterior parametric forms are the same! Practically, conjugate means easy update: o Add number of heads and tails seen to Beta parameters
30 Further Understanding Beta Can set X ~ Beta(a, b) as prior to reflect how biased you think coin is apriori This is a subjective probability! Then observe n + m trials, where n of trials are heads Update to get posterior probability X (n heads in n + m trials) ~ Beta(a + n, b + m) Sometimes call a and b the equivalent sample size Prior probability for X based on seeing (a + b 2) imaginary trials, where (a 1) of them were heads. Beta(1, 1) ~ Uni(0, 1) à we haven t seen any imaginary trials, so apriori know nothing about coin
31 Check out Demo!
32 Four Prototypical Trajectories Damn
33 Four Prototypical Trajectories Next level?
34 Assignment Grades Frequency Frequency Frequency Frequency We have 2055 assignment distributions from gradescope
35 Binomial Distributions Geometric Exponential Poisson Neg Binomial Normal Beta Beta Uniform
36 Four Prototypical Trajectories Grades must be bounded
37 Four Prototypical Trajectories Normal: No
38 Four Prototypical Trajectories Poisson: No
39 Four Prototypical Trajectories Exponential: No
40 Four Prototypical Trajectories Beta: Yes
41 Assignment Grades Demo Assignment id = 1613 Frequency
42 Assignment Grades Demo Assignment id = 1613 Frequency X Beta(a =8.28,b=3.16)
43 Assignment Grades a = 4.03, b = 3.00 a = 8.28, b = Frequency Frequency Frequency Frequency a = 5.06, b = 3.79 a = 3.46, b = 0.35 We have 2055 assignment distributions from grade scope
44 Beta is a Better Fit Unpublished results. Based on Gradescope data
45 Beta is a Better Fit For All Class Sizes Unpublished results. Based on Gradescope data
46 Binomial Interpretation Each student has the same probability of getting each point. Generate grades by flipping a coin 100 times for each student. The resulting distribution is binomial. - Binomial
47 Normal Interpretation What the Binomial said, but approximated. - Normal
48 Beta Interpretation Each student s ability is represented as a probability. The distribution of probabilities is a Beta distribution. Each student has a different probability of getting points, and that probability is sampled from a Beta distribution. - Beta This is Chris Piech s opinion. It is open for debate
49 Assignment Grades a = 4.03, b = 3.00 a = 8.28, b = Frequency Frequency Frequency Frequency a = 5.06, b = 3.79 a = 3.46, b = 0.35 These are the distribution of student point probabilitities
50 Assignment Grades Demo What is the semantics of E[X]? Frequency X Beta(a =8.28,b=3.16)
51 Assignment Grades What is the probability that a student is bellow the mean? X Beta(a =8.28,b=3.16) E[X] = a a + b = P (X <0.7238) = F X (0.7238) Wait what? Chris are you holding out on me? stats.beta.cdf(x, alpha, beta) P (X <E[X]) = 0.46
52 Four Prototypical Trajectories As far as I know, this is an unpublished result
53 Implications Will be combined with Item Response Theory which models how assignment difficulty and student ability combine to give point probabilities. Suggests a way to calculate final grades as a probabilistic most likely estimate of ability. Machine learning on education data will be more accurate. Analysis of mixture distributions can be fixed.
54 Four Prototypical Trajectories Will you use this on us?
55 Four Prototypical Trajectories Not yet J
56 Four Prototypical Trajectories Beta: The probability density for probabilities
57 Piech, CS106A, Stanford University Any parameter for a parameterized random variable can be thought of as a random variable.
58 Course Mean Ε[CS109] This is actual midpoint of course (Just wanted you to know)
The Random Variable for Probabilities Chris Piech CS109, Stanford University
The Random Variable for Probabilities Chris Piech CS109, Stanford University Assignment Grades 10 20 30 40 50 60 70 80 90 100 10 20 30 40 50 60 70 80 90 100 Frequency Frequency 10 20 30 40 50 60 70 80
More informationConditional distributions
Conditional distributions Will Monroe July 6, 017 with materials by Mehran Sahami and Chris Piech Independence of discrete random variables Two random variables are independent if knowing the value of
More informationCS 361: Probability & Statistics
March 14, 2018 CS 361: Probability & Statistics Inference The prior From Bayes rule, we know that we can express our function of interest as Likelihood Prior Posterior The right hand side contains the
More informationDebugging Intuition. How to calculate the probability of at least k successes in n trials?
How to calculate the probability of at least k successes in n trials? X is number of successes in n trials each with probability p # ways to choose slots for success Correct: Debugging Intuition P (X k)
More informationPoisson Chris Piech CS109, Stanford University. Piech, CS106A, Stanford University
Poisson Chris Piech CS109, Stanford University Piech, CS106A, Stanford University Probability for Extreme Weather? Piech, CS106A, Stanford University Four Prototypical Trajectories Review Binomial Random
More informationContinuous Variables Chris Piech CS109, Stanford University
Continuous Variables Chris Piech CS109, Stanford University 1906 Earthquak Magnitude 7.8 Learning Goals 1. Comfort using new discrete random variables 2. Integrate a density function (PDF) to get a probability
More informationCS 361: Probability & Statistics
October 17, 2017 CS 361: Probability & Statistics Inference Maximum likelihood: drawbacks A couple of things might trip up max likelihood estimation: 1) Finding the maximum of some functions can be quite
More informationCentral Theorems Chris Piech CS109, Stanford University
Central Theorems Chris Piech CS109, Stanford University Silence!! And now a moment of silence......before we present......a beautiful result of probability theory! Four Prototypical Trajectories Central
More informationRandom Variables Chris Piech CS109, Stanford University. Piech, CS106A, Stanford University
Random Variables Chris Piech CS109, Stanford University Piech, CS106A, Stanford University Piech, CS106A, Stanford University Let s Play a Game Game set-up We have a fair coin (come up heads with p = 0.5)
More informationBayesian Models in Machine Learning
Bayesian Models in Machine Learning Lukáš Burget Escuela de Ciencias Informáticas 2017 Buenos Aires, July 24-29 2017 Frequentist vs. Bayesian Frequentist point of view: Probability is the frequency of
More informationLecture 2: Repetition of probability theory and statistics
Algorithms for Uncertainty Quantification SS8, IN2345 Tobias Neckel Scientific Computing in Computer Science TUM Lecture 2: Repetition of probability theory and statistics Concept of Building Block: Prerequisites:
More informationConjugate Priors, Uninformative Priors
Conjugate Priors, Uninformative Priors Nasim Zolaktaf UBC Machine Learning Reading Group January 2016 Outline Exponential Families Conjugacy Conjugate priors Mixture of conjugate prior Uninformative priors
More informationSTAT 430/510: Lecture 15
STAT 430/510: Lecture 15 James Piette June 23, 2010 Updates HW4 is up on my website. It is due next Mon. (June 28th). Starting today back at section 6.4... Conditional Distribution: Discrete Def: The conditional
More informationCompute f(x θ)f(θ) dθ
Bayesian Updating: Continuous Priors 18.05 Spring 2014 b a Compute f(x θ)f(θ) dθ January 1, 2017 1 /26 Beta distribution Beta(a, b) has density (a + b 1)! f (θ) = θ a 1 (1 θ) b 1 (a 1)!(b 1)! http://mathlets.org/mathlets/beta-distribution/
More informationEstimation of reliability parameters from Experimental data (Parte 2) Prof. Enrico Zio
Estimation of reliability parameters from Experimental data (Parte 2) This lecture Life test (t 1,t 2,...,t n ) Estimate θ of f T t θ For example: λ of f T (t)= λe - λt Classical approach (frequentist
More informationDS-GA 1002 Lecture notes 11 Fall Bayesian statistics
DS-GA 100 Lecture notes 11 Fall 016 Bayesian statistics In the frequentist paradigm we model the data as realizations from a distribution that depends on deterministic parameters. In contrast, in Bayesian
More informationThis exam is closed book and closed notes. (You will have access to a copy of the Table of Common Distributions given in the back of the text.
TEST #3 STA 5326 December 4, 214 Name: Please read the following directions. DO NOT TURN THE PAGE UNTIL INSTRUCTED TO DO SO Directions This exam is closed book and closed notes. (You will have access to
More informationThe Exciting Guide To Probability Distributions Part 2. Jamie Frost v1.1
The Exciting Guide To Probability Distributions Part 2 Jamie Frost v. Contents Part 2 A revisit of the multinomial distribution The Dirichlet Distribution The Beta Distribution Conjugate Priors The Gamma
More informationAlgorithms for Uncertainty Quantification
Algorithms for Uncertainty Quantification Tobias Neckel, Ionuț-Gabriel Farcaș Lehrstuhl Informatik V Summer Semester 2017 Lecture 2: Repetition of probability theory and statistics Example: coin flip Example
More informationComputational Perception. Bayesian Inference
Computational Perception 15-485/785 January 24, 2008 Bayesian Inference The process of probabilistic inference 1. define model of problem 2. derive posterior distributions and estimators 3. estimate parameters
More informationECE521 W17 Tutorial 6. Min Bai and Yuhuai (Tony) Wu
ECE521 W17 Tutorial 6 Min Bai and Yuhuai (Tony) Wu Agenda knn and PCA Bayesian Inference k-means Technique for clustering Unsupervised pattern and grouping discovery Class prediction Outlier detection
More informationProbability. Machine Learning and Pattern Recognition. Chris Williams. School of Informatics, University of Edinburgh. August 2014
Probability Machine Learning and Pattern Recognition Chris Williams School of Informatics, University of Edinburgh August 2014 (All of the slides in this course have been adapted from previous versions
More informationSome slides from Carlos Guestrin, Luke Zettlemoyer & K Gajos 2
Logistics CSE 446: Point Estimation Winter 2012 PS2 out shortly Dan Weld Some slides from Carlos Guestrin, Luke Zettlemoyer & K Gajos 2 Last Time Random variables, distributions Marginal, joint & conditional
More informationDiscrete Distributions
Discrete Distributions STA 281 Fall 2011 1 Introduction Previously we defined a random variable to be an experiment with numerical outcomes. Often different random variables are related in that they have
More information3 Multiple Discrete Random Variables
3 Multiple Discrete Random Variables 3.1 Joint densities Suppose we have a probability space (Ω, F,P) and now we have two discrete random variables X and Y on it. They have probability mass functions f
More informationMachine Learning. Probability Basics. Marc Toussaint University of Stuttgart Summer 2014
Machine Learning Probability Basics Basic definitions: Random variables, joint, conditional, marginal distribution, Bayes theorem & examples; Probability distributions: Binomial, Beta, Multinomial, Dirichlet,
More informationThings to remember when learning probability distributions:
SPECIAL DISTRIBUTIONS Some distributions are special because they are useful They include: Poisson, exponential, Normal (Gaussian), Gamma, geometric, negative binomial, Binomial and hypergeometric distributions
More informationProbability Density Functions and the Normal Distribution. Quantitative Understanding in Biology, 1.2
Probability Density Functions and the Normal Distribution Quantitative Understanding in Biology, 1.2 1. Discrete Probability Distributions 1.1. The Binomial Distribution Question: You ve decided to flip
More informationBayesian Inference. STA 121: Regression Analysis Artin Armagan
Bayesian Inference STA 121: Regression Analysis Artin Armagan Bayes Rule...s! Reverend Thomas Bayes Posterior Prior p(θ y) = p(y θ)p(θ)/p(y) Likelihood - Sampling Distribution Normalizing Constant: p(y
More informationMachine Learning CMPT 726 Simon Fraser University. Binomial Parameter Estimation
Machine Learning CMPT 726 Simon Fraser University Binomial Parameter Estimation Outline Maximum Likelihood Estimation Smoothed Frequencies, Laplace Correction. Bayesian Approach. Conjugate Prior. Uniform
More informationBayesian RL Seminar. Chris Mansley September 9, 2008
Bayesian RL Seminar Chris Mansley September 9, 2008 Bayes Basic Probability One of the basic principles of probability theory, the chain rule, will allow us to derive most of the background material in
More informationGradient Ascent Chris Piech CS109, Stanford University
Gradient Ascent Chris Piech CS109, Stanford University Our Path Deep Learning Linear Regression Naïve Bayes Logistic Regression Parameter Estimation Our Path Deep Learning Linear Regression Naïve Bayes
More informationComputational Cognitive Science
Computational Cognitive Science Lecture 9: Bayesian Estimation Chris Lucas (Slides adapted from Frank Keller s) School of Informatics University of Edinburgh clucas2@inf.ed.ac.uk 17 October, 2017 1 / 28
More informationELEG 3143 Probability & Stochastic Process Ch. 2 Discrete Random Variables
Department of Electrical Engineering University of Arkansas ELEG 3143 Probability & Stochastic Process Ch. 2 Discrete Random Variables Dr. Jingxian Wu wuj@uark.edu OUTLINE 2 Random Variable Discrete Random
More informationAnnouncements. Proposals graded
Announcements Proposals graded Kevin Jamieson 2018 1 Hypothesis testing Machine Learning CSE546 Kevin Jamieson University of Washington October 30, 2018 2018 Kevin Jamieson 2 Anomaly detection You are
More informationProbability. Table of contents
Probability Table of contents 1. Important definitions 2. Distributions 3. Discrete distributions 4. Continuous distributions 5. The Normal distribution 6. Multivariate random variables 7. Other continuous
More informationPoint Estimation. Vibhav Gogate The University of Texas at Dallas
Point Estimation Vibhav Gogate The University of Texas at Dallas Some slides courtesy of Carlos Guestrin, Chris Bishop, Dan Weld and Luke Zettlemoyer. Basics: Expectation and Variance Binary Variables
More informationComputational Cognitive Science
Computational Cognitive Science Lecture 9: A Bayesian model of concept learning Chris Lucas School of Informatics University of Edinburgh October 16, 218 Reading Rules and Similarity in Concept Learning
More informationPage Max. Possible Points Total 100
Math 3215 Exam 2 Summer 2014 Instructor: Sal Barone Name: GT username: 1. No books or notes are allowed. 2. You may use ONLY NON-GRAPHING and NON-PROGRAMABLE scientific calculators. All other electronic
More informationCOMP90051 Statistical Machine Learning
COMP90051 Statistical Machine Learning Semester 2, 2017 Lecturer: Trevor Cohn 2. Statistical Schools Adapted from slides by Ben Rubinstein Statistical Schools of Thought Remainder of lecture is to provide
More informationDiscrete Binary Distributions
Discrete Binary Distributions Carl Edward Rasmussen November th, 26 Carl Edward Rasmussen Discrete Binary Distributions November th, 26 / 5 Key concepts Bernoulli: probabilities over binary variables Binomial:
More informationBernoulli and Binomial
Bernoulli and Binomial Will Monroe July 1, 217 image: Antoine Taveneaux with materials by Mehran Sahami and Chris Piech Announcements: Problem Set 2 Due this Wednesday, 7/12, at 12:3pm (before class).
More informationSTAT 430/510 Probability
STAT 430/510 Probability Hui Nie Lecture 16 June 24th, 2009 Review Sum of Independent Normal Random Variables Sum of Independent Poisson Random Variables Sum of Independent Binomial Random Variables Conditional
More informationEXAM. Exam #1. Math 3342 Summer II, July 21, 2000 ANSWERS
EXAM Exam # Math 3342 Summer II, 2 July 2, 2 ANSWERS i pts. Problem. Consider the following data: 7, 8, 9, 2,, 7, 2, 3. Find the first quartile, the median, and the third quartile. Make a box and whisker
More informationRandom variables. DS GA 1002 Probability and Statistics for Data Science.
Random variables DS GA 1002 Probability and Statistics for Data Science http://www.cims.nyu.edu/~cfgranda/pages/dsga1002_fall17 Carlos Fernandez-Granda Motivation Random variables model numerical quantities
More informationThe Normal Distribution
The Normal Distribution image: Etsy Will Monroe July 19, 017 with materials by Mehran Sahami and Chris Piech Announcements: Midterm A week from yesterday: Tuesday, July 5, 7:00-9:00pm Building 30-105 One
More informationIntroduction: MLE, MAP, Bayesian reasoning (28/8/13)
STA561: Probabilistic machine learning Introduction: MLE, MAP, Bayesian reasoning (28/8/13) Lecturer: Barbara Engelhardt Scribes: K. Ulrich, J. Subramanian, N. Raval, J. O Hollaren 1 Classifiers In this
More informationIEOR 3106: Introduction to Operations Research: Stochastic Models. Professor Whitt. SOLUTIONS to Homework Assignment 2
IEOR 316: Introduction to Operations Research: Stochastic Models Professor Whitt SOLUTIONS to Homework Assignment 2 More Probability Review: In the Ross textbook, Introduction to Probability Models, read
More information18.440: Lecture 28 Lectures Review
18.440: Lecture 28 Lectures 18-27 Review Scott Sheffield MIT Outline Outline It s the coins, stupid Much of what we have done in this course can be motivated by the i.i.d. sequence X i where each X i is
More informationProbability and Estimation. Alan Moses
Probability and Estimation Alan Moses Random variables and probability A random variable is like a variable in algebra (e.g., y=e x ), but where at least part of the variability is taken to be stochastic.
More informationIndependent random variables
Will Monroe July 4, 017 with materials by Mehran Sahami and Chris Piech Independent random variables Announcements: Midterm Tomorrow! Tuesday, July 5, 7:00-9:00pm Building 30-105 (main quad, Geology Corner)
More informationBayesian Inference. Chapter 2: Conjugate models
Bayesian Inference Chapter 2: Conjugate models Conchi Ausín and Mike Wiper Department of Statistics Universidad Carlos III de Madrid Master in Business Administration and Quantitative Methods Master in
More informationLecture 8 : The Geometric Distribution
0/ 24 The geometric distribution is a special case of negative binomial, it is the case r = 1. It is so important we give it special treatment. Motivating example Suppose a couple decides to have children
More information5. Conditional Distributions
1 of 12 7/16/2009 5:36 AM Virtual Laboratories > 3. Distributions > 1 2 3 4 5 6 7 8 5. Conditional Distributions Basic Theory As usual, we start with a random experiment with probability measure P on an
More informationExpectation. DS GA 1002 Statistical and Mathematical Models. Carlos Fernandez-Granda
Expectation DS GA 1002 Statistical and Mathematical Models http://www.cims.nyu.edu/~cfgranda/pages/dsga1002_fall16 Carlos Fernandez-Granda Aim Describe random variables with a few numbers: mean, variance,
More informationAccouncements. You should turn in a PDF and a python file(s) Figure for problem 9 should be in the PDF
Accouncements You should turn in a PDF and a python file(s) Figure for problem 9 should be in the PDF Please do not zip these files and submit (unless there are >5 files) 1 Bayesian Methods Machine Learning
More informationRandom Variables. Definition: A random variable (r.v.) X on the probability space (Ω, F, P) is a mapping
Random Variables Example: We roll a fair die 6 times. Suppose we are interested in the number of 5 s in the 6 rolls. Let X = number of 5 s. Then X could be 0, 1, 2, 3, 4, 5, 6. X = 0 corresponds to the
More informationMultivariate random variables
Multivariate random variables DS GA 1002 Statistical and Mathematical Models http://www.cims.nyu.edu/~cfgranda/pages/dsga1002_fall16 Carlos Fernandez-Granda Joint distributions Tool to characterize several
More informationChapter 2: Random Variables
ECE54: Stochastic Signals and Systems Fall 28 Lecture 2 - September 3, 28 Dr. Salim El Rouayheb Scribe: Peiwen Tian, Lu Liu, Ghadir Ayache Chapter 2: Random Variables Example. Tossing a fair coin twice:
More informationLecture 3. Discrete Random Variables
Math 408 - Mathematical Statistics Lecture 3. Discrete Random Variables January 23, 2013 Konstantin Zuev (USC) Math 408, Lecture 3 January 23, 2013 1 / 14 Agenda Random Variable: Motivation and Definition
More informationIntroduction to Applied Bayesian Modeling. ICPSR Day 4
Introduction to Applied Bayesian Modeling ICPSR Day 4 Simple Priors Remember Bayes Law: Where P(A) is the prior probability of A Simple prior Recall the test for disease example where we specified the
More informationChapter 5. Means and Variances
1 Chapter 5 Means and Variances Our discussion of probability has taken us from a simple classical view of counting successes relative to total outcomes and has brought us to the idea of a probability
More informationReview of probability
Review of probability Computer Sciences 760 Spring 2014 http://pages.cs.wisc.edu/~dpage/cs760/ Goals for the lecture you should understand the following concepts definition of probability random variables
More informationPATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 2: PROBABILITY DISTRIBUTIONS
PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 2: PROBABILITY DISTRIBUTIONS Parametric Distributions Basic building blocks: Need to determine given Representation: or? Recall Curve Fitting Binary Variables
More informationComputational Cognitive Science
Computational Cognitive Science Lecture 8: Frank Keller School of Informatics University of Edinburgh keller@inf.ed.ac.uk Based on slides by Sharon Goldwater October 14, 2016 Frank Keller Computational
More informationLecture 2: Conjugate priors
(Spring ʼ) Lecture : Conjugate priors Julia Hockenmaier juliahmr@illinois.edu Siebel Center http://www.cs.uiuc.edu/class/sp/cs98jhm The binomial distribution If p is the probability of heads, the probability
More informationDS-GA 1003: Machine Learning and Computational Statistics Homework 7: Bayesian Modeling
DS-GA 1003: Machine Learning and Computational Statistics Homework 7: Bayesian Modeling Due: Tuesday, May 10, 2016, at 6pm (Submit via NYU Classes) Instructions: Your answers to the questions below, including
More informationExam 2 Practice Questions, 18.05, Spring 2014
Exam 2 Practice Questions, 18.05, Spring 2014 Note: This is a set of practice problems for exam 2. The actual exam will be much shorter. Within each section we ve arranged the problems roughly in order
More informationChapter 4 Multiple Random Variables
Review for the previous lecture Theorems and Examples: How to obtain the pmf (pdf) of U = g ( X Y 1 ) and V = g ( X Y) Chapter 4 Multiple Random Variables Chapter 43 Bivariate Transformations Continuous
More informationStatistics: Learning models from data
DS-GA 1002 Lecture notes 5 October 19, 2015 Statistics: Learning models from data Learning models from data that are assumed to be generated probabilistically from a certain unknown distribution is a crucial
More informationReview of Probabilities and Basic Statistics
Alex Smola Barnabas Poczos TA: Ina Fiterau 4 th year PhD student MLD Review of Probabilities and Basic Statistics 10-701 Recitations 1/25/2013 Recitation 1: Statistics Intro 1 Overview Introduction to
More informationBayesian Learning. HT2015: SC4 Statistical Data Mining and Machine Learning. Maximum Likelihood Principle. The Bayesian Learning Framework
HT5: SC4 Statistical Data Mining and Machine Learning Dino Sejdinovic Department of Statistics Oxford http://www.stats.ox.ac.uk/~sejdinov/sdmml.html Maximum Likelihood Principle A generative model for
More informationSTAT 516 Midterm Exam 3 Friday, April 18, 2008
STAT 56 Midterm Exam 3 Friday, April 8, 2008 Name Purdue student ID (0 digits). The testing booklet contains 8 questions. 2. Permitted Texas Instruments calculators: BA-35 BA II Plus BA II Plus Professional
More informationBivariate distributions
Bivariate distributions 3 th October 017 lecture based on Hogg Tanis Zimmerman: Probability and Statistical Inference (9th ed.) Bivariate Distributions of the Discrete Type The Correlation Coefficient
More informationExpectation. DS GA 1002 Probability and Statistics for Data Science. Carlos Fernandez-Granda
Expectation DS GA 1002 Probability and Statistics for Data Science http://www.cims.nyu.edu/~cfgranda/pages/dsga1002_fall17 Carlos Fernandez-Granda Aim Describe random variables with a few numbers: mean,
More informationSTAT/MATH 395 A - PROBABILITY II UW Winter Quarter Moment functions. x r p X (x) (1) E[X r ] = x r f X (x) dx (2) (x E[X]) r p X (x) (3)
STAT/MATH 395 A - PROBABILITY II UW Winter Quarter 07 Néhémy Lim Moment functions Moments of a random variable Definition.. Let X be a rrv on probability space (Ω, A, P). For a given r N, E[X r ], if it
More informationLecture 10: Normal RV. Lisa Yan July 18, 2018
Lecture 10: Normal RV Lisa Yan July 18, 2018 Announcements Midterm next Tuesday Practice midterm, solutions out on website SCPD students: fill out Google form by today Covers up to and including Friday
More informationChapter 4 Multiple Random Variables
Review for the previous lecture Definition: n-dimensional random vector, joint pmf (pdf), marginal pmf (pdf) Theorem: How to calculate marginal pmf (pdf) given joint pmf (pdf) Example: How to calculate
More informationParameter Learning With Binary Variables
With Binary Variables University of Nebraska Lincoln CSCE 970 Pattern Recognition Outline Outline 1 Learning a Single Parameter 2 More on the Beta Density Function 3 Computing a Probability Interval Outline
More informationSAMPLE CHAPTER. Avi Pfeffer. FOREWORD BY Stuart Russell MANNING
SAMPLE CHAPTER Avi Pfeffer FOREWORD BY Stuart Russell MANNING Practical Probabilistic Programming by Avi Pfeffer Chapter 9 Copyright 2016 Manning Publications brief contents PART 1 INTRODUCING PROBABILISTIC
More informationProblem Set #5. Due: 1pm on Friday, Nov 16th
1 Chris Piech CS 109 Problem Set #5 Due: 1pm on Friday, Nov 16th Problem Set #5 Nov 7 th, 2018 With problems by Mehran Sahami and Chris Piech For each problem, briefly explain/justify how you obtained
More informationNorthwestern University Department of Electrical Engineering and Computer Science
Northwestern University Department of Electrical Engineering and Computer Science EECS 454: Modeling and Analysis of Communication Networks Spring 2008 Probability Review As discussed in Lecture 1, probability
More informationCS 630 Basic Probability and Information Theory. Tim Campbell
CS 630 Basic Probability and Information Theory Tim Campbell 21 January 2003 Probability Theory Probability Theory is the study of how best to predict outcomes of events. An experiment (or trial or event)
More informationCDA6530: Performance Models of Computers and Networks. Chapter 2: Review of Practical Random Variables
CDA6530: Performance Models of Computers and Networks Chapter 2: Review of Practical Random Variables Two Classes of R.V. Discrete R.V. Bernoulli Binomial Geometric Poisson Continuous R.V. Uniform Exponential,
More informationIntro to Bayesian Methods
Intro to Bayesian Methods Rebecca C. Steorts Bayesian Methods and Modern Statistics: STA 360/601 Lecture 1 1 Course Webpage Syllabus LaTeX reference manual R markdown reference manual Please come to office
More informationCommon Discrete Distributions
Common Discrete Distributions Statistics 104 Autumn 2004 Taken from Statistics 110 Lecture Notes Copyright c 2004 by Mark E. Irwin Common Discrete Distributions There are a wide range of popular discrete
More informationProbability Background
CS76 Spring 0 Advanced Machine Learning robability Background Lecturer: Xiaojin Zhu jerryzhu@cs.wisc.edu robability Meure A sample space Ω is the set of all possible outcomes. Elements ω Ω are called sample
More informationIntroduction to Machine Learning
Introduction to Machine Learning Generative Models Varun Chandola Computer Science & Engineering State University of New York at Buffalo Buffalo, NY, USA chandola@buffalo.edu Chandola@UB CSE 474/574 1
More informationLecture 1: Probability Fundamentals
Lecture 1: Probability Fundamentals IB Paper 7: Probability and Statistics Carl Edward Rasmussen Department of Engineering, University of Cambridge January 22nd, 2008 Rasmussen (CUED) Lecture 1: Probability
More informationIntroduction to Machine Learning
What does this mean? Outline Contents Introduction to Machine Learning Introduction to Probabilistic Methods Varun Chandola December 26, 2017 1 Introduction to Probability 1 2 Random Variables 3 3 Bayes
More informationLearning from Sensor Data: Set II. Behnaam Aazhang J.S. Abercombie Professor Electrical and Computer Engineering Rice University
Learning from Sensor Data: Set II Behnaam Aazhang J.S. Abercombie Professor Electrical and Computer Engineering Rice University 1 6. Data Representation The approach for learning from data Probabilistic
More informationFundamentals. CS 281A: Statistical Learning Theory. Yangqing Jia. August, Based on tutorial slides by Lester Mackey and Ariel Kleiner
Fundamentals CS 281A: Statistical Learning Theory Yangqing Jia Based on tutorial slides by Lester Mackey and Ariel Kleiner August, 2011 Outline 1 Probability 2 Statistics 3 Linear Algebra 4 Optimization
More informationModeling Environment
Topic Model Modeling Environment What does it mean to understand/ your environment? Ability to predict Two approaches to ing environment of words and text Latent Semantic Analysis (LSA) Topic Model LSA
More informationApplied Bayesian Statistics STAT 388/488
STAT 388/488 Dr. Earvin Balderama Department of Mathematics & Statistics Loyola University Chicago August 29, 207 Course Info STAT 388/488 http://math.luc.edu/~ebalderama/bayes 2 A motivating example (See
More informationSample Spaces, Random Variables
Sample Spaces, Random Variables Moulinath Banerjee University of Michigan August 3, 22 Probabilities In talking about probabilities, the fundamental object is Ω, the sample space. (elements) in Ω are denoted
More informationPolynomials; Add/Subtract
Chapter 7 Polynomials Polynomials; Add/Subtract Polynomials sounds tough enough. But, if you look at it close enough you ll notice that students have worked with polynomial expressions such as 6x 2 + 5x
More informationContinuous random variables
Continuous random variables CE 311S What was the difference between discrete and continuous random variables? The possible outcomes of a discrete random variable (finite or infinite) can be listed out;
More information(1) Introduction to Bayesian statistics
Spring, 2018 A motivating example Student 1 will write down a number and then flip a coin If the flip is heads, they will honestly tell student 2 if the number is even or odd If the flip is tails, they
More informationBasics on Probability. Jingrui He 09/11/2007
Basics on Probability Jingrui He 09/11/2007 Coin Flips You flip a coin Head with probability 0.5 You flip 100 coins How many heads would you expect Coin Flips cont. You flip a coin Head with probability
More informationMachine Learning 4771
Machine Learning 4771 Instructor: Tony Jebara Topic 11 Maximum Likelihood as Bayesian Inference Maximum A Posteriori Bayesian Gaussian Estimation Why Maximum Likelihood? So far, assumed max (log) likelihood
More information