Introduction to Probability and Statistics (Continued)
1 Introduction to Probability and Statistics (Continued)
Prof. Nicholas Zabaras, Center for Informatics and Computational Science, University of Notre Dame, Notre Dame, Indiana, USA. URL: August 7, 8
2 Contents
- The binomial and Bernoulli distributions
- The multinomial and multinoulli distributions
- The Poisson distribution
- Student's T
- Laplace distribution
- Gamma distribution
- Beta distribution
- Pareto distribution
- Introduction to Covariance and Correlation
3 References
- Following closely Chris Bishop's PRML book, Chapter 2.
- Kevin Murphy, Machine Learning: A Probabilistic Perspective, Chapter 2.
- Jaynes, E. T. (2003). Probability Theory: The Logic of Science. Cambridge University Press.
- Bertsekas, D. and J. Tsitsiklis (2008). Introduction to Probability, 2nd Edition. Athena Scientific.
- Wasserman, L. (2004). All of Statistics: A Concise Course in Statistical Inference. Springer.
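4 Binary Variables
Consider a coin flipping experiment with heads = 1 and tails = 0. With $\mu \in [0,1]$:
$$p(x=1 \mid \mu) = \mu, \qquad p(x=0 \mid \mu) = 1-\mu$$
This defines the Bernoulli distribution as follows:
$$\mathrm{Bern}(x \mid \mu) = \mu^{x}(1-\mu)^{1-x}$$
Using the indicator function, we can also write this as:
$$\mathrm{Bern}(x \mid \mu) = \mu^{\mathbb{I}(x=1)}(1-\mu)^{\mathbb{I}(x=0)}$$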
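5 Bernoulli Distribution
Recall that in general
$$\mathbb{E}[f] = \sum_x p(x) f(x), \qquad \mathbb{E}[f] = \int p(x) f(x)\,dx, \qquad \mathrm{var}[f] = \mathbb{E}[f(x)^2] - \mathbb{E}[f(x)]^2$$
For the Bernoulli distribution $\mathrm{Bern}(x \mid \mu) = \mu^{x}(1-\mu)^{1-x}$, we can easily show from the definitions:
$$\mathbb{E}[x] = \mu, \qquad \mathrm{var}[x] = \mu(1-\mu)$$
$$\mathbb{H}[x] = -\sum_{x \in \{0,1\}} p(x \mid \mu) \ln p(x \mid \mu) = -\mu \ln \mu - (1-\mu)\ln(1-\mu)$$
Here $\mathbb{H}[x]$ is the entropy of the distribution.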
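The three Bernoulli results can be checked directly from the definitions. The following is a minimal Python sketch; the function names `bernoulli_pmf` and `bernoulli_summary` are illustrative choices, not part of the slides or of PMTK.

```python
import math

def bernoulli_pmf(x, mu):
    """Bern(x | mu) = mu^x (1 - mu)^(1 - x) for x in {0, 1}."""
    return mu**x * (1.0 - mu)**(1 - x)

def bernoulli_summary(mu):
    """Mean, variance and entropy computed by summing over x in {0, 1}."""
    probs = {x: bernoulli_pmf(x, mu) for x in (0, 1)}
    mean = sum(x * p for x, p in probs.items())
    second_moment = sum(x**2 * p for x, p in probs.items())
    variance = second_moment - mean**2
    # Terms with p = 0 contribute nothing to the entropy (0 ln 0 is taken as 0).
    entropy = -sum(p * math.log(p) for p in probs.values() if p > 0)
    return mean, variance, entropy

mean, variance, entropy = bernoulli_summary(0.3)
print(mean, variance, entropy)
```

The printed values should agree with $\mu$, $\mu(1-\mu)$ and $-\mu\ln\mu-(1-\mu)\ln(1-\mu)$ evaluated at $\mu = 0.3$.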
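6 Likelihood Function for Bernoulli Distribution
Consider the data set $\mathcal{D} = \{x_1, x_2, \ldots, x_N\}$ in which we have $m$ heads ($x = 1$) and $N - m$ tails ($x = 0$). The likelihood function takes the form:
$$p(\mathcal{D} \mid \mu) = \prod_{n=1}^{N} p(x_n \mid \mu) = \prod_{n=1}^{N} \mu^{x_n}(1-\mu)^{1-x_n} = \mu^{m}(1-\mu)^{N-m}$$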
7 Binomial Distribution
Consider the discrete random variable $X \in \{0, 1, 2, \ldots, N\}$. We define the Binomial distribution as follows:
$$\mathrm{Bin}(X = m \mid N, \mu) = \binom{N}{m} \mu^{m}(1-\mu)^{N-m}$$
In our coin flipping experiment, it gives the probability of getting $m$ heads in $N$ flips, with $\mu$ being the probability of getting heads in one flip.
It can be shown (see S. Ross, Introduction to Probability Models) that the limit of the binomial distribution as $N \to \infty$, $N\mu \to \lambda$, is the Poisson($\lambda$) distribution.
[Figure: plot of a binomial distribution; MatLab code linked on the slide.]
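The Poisson limit can be observed numerically. The sketch below, written under the assumption that we hold $N\mu = \lambda$ fixed while $N$ grows, compares the two pmfs on the first few counts; `max_gap` is a hypothetical helper name.

```python
import math

def binom_pmf(m, N, mu):
    """Bin(m | N, mu) = C(N, m) mu^m (1 - mu)^(N - m)."""
    return math.comb(N, m) * mu**m * (1.0 - mu)**(N - m)

def poisson_pmf(x, lam):
    """Poi(x | lam) = exp(-lam) lam^x / x!."""
    return math.exp(-lam) * lam**x / math.factorial(x)

def max_gap(N, lam, upto=10):
    """Largest pointwise gap between Bin(N, lam/N) and Poi(lam) on 0..upto."""
    mu = lam / N
    return max(abs(binom_pmf(m, N, mu) - poisson_pmf(m, lam))
               for m in range(upto + 1))

for N in (10, 100, 10_000):
    print(N, max_gap(N, lam=2.0))
```

The gap should shrink as $N$ increases, which is the behaviour the limit statement predicts.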
8 Binomial Distribution
The Binomial distribution $\mathrm{Bin}(N, \mu)$ is plotted for two values of $\mu$ using the MatLab function binomdistplot from Kevin Murphy's PMTK.
[Figure: two bar plots of the binomial pmf, one per value of $\mu$.]
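9 Mean, Variance of the Binomial Distribution
For independent events, the mean of the sum is the sum of the means, and the variance of the sum is the sum of the variances. Because $m = x_1 + \ldots + x_N$, and for each observation the mean and variance are known from the Bernoulli distribution:
$$\mathbb{E}[m] \equiv \sum_{m=0}^{N} m\,\mathrm{Bin}(m \mid N, \mu) = \mathbb{E}[x_1 + \ldots + x_N] = N\mu$$
$$\mathrm{var}[m] \equiv \sum_{m=0}^{N} (m - \mathbb{E}[m])^2\,\mathrm{Bin}(m \mid N, \mu) = \mathrm{var}[x_1 + \ldots + x_N] = N\mu(1-\mu)$$
One can also compute $\mathbb{E}[m]$ and $\mathbb{E}[m^2]$ by differentiating (twice) the identity
$$\sum_{m=0}^{N} \binom{N}{m} \mu^{m}(1-\mu)^{N-m} = 1$$
with respect to $\mu$. Try it!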
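As a sanity check of $\mathbb{E}[m] = N\mu$ and $\mathrm{var}[m] = N\mu(1-\mu)$, the sums can be evaluated term by term. This is a small sketch with illustrative values $N = 20$, $\mu = 0.25$ (chosen here, not taken from the slides).

```python
import math

def binom_pmf(m, N, mu):
    """Bin(m | N, mu) = C(N, m) mu^m (1 - mu)^(N - m)."""
    return math.comb(N, m) * mu**m * (1.0 - mu)**(N - m)

def binomial_moments(N, mu):
    """Normalization, mean and variance by direct summation over m = 0..N."""
    pmf = [binom_pmf(m, N, mu) for m in range(N + 1)]
    total = sum(pmf)
    mean = sum(m * p for m, p in enumerate(pmf))
    var = sum((m - mean)**2 * p for m, p in enumerate(pmf))
    return total, mean, var

N, mu = 20, 0.25
total, mean, var = binomial_moments(N, mu)
print(total, mean, var)          # compare with 1, N*mu, N*mu*(1-mu)
```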
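10 Binomial Distribution: Normalization
To show that the Binomial is correctly normalized, we use the following identities.
Can be shown with direct substitution:
$$\binom{N}{n} + \binom{N}{n-1} = \binom{N+1}{n} \qquad (*)$$
Binomial theorem:
$$(1+x)^N = \sum_{m=0}^{N} \binom{N}{m} x^m \qquad (**)$$
This theorem is proved by induction using (*) and noticing:
$$(1+x)^{N+1} = (1+x)\sum_{m=0}^{N}\binom{N}{m}x^m = \sum_{m=0}^{N}\binom{N}{m}x^m + \sum_{m=1}^{N+1}\binom{N}{m-1}x^m$$
$$= 1 + \sum_{m=1}^{N}\left[\binom{N}{m} + \binom{N}{m-1}\right]x^m + x^{N+1} = \sum_{m=0}^{N+1}\binom{N+1}{m}x^m$$
To finally show normalization using (**):
$$\sum_{m=0}^{N}\binom{N}{m}\mu^m(1-\mu)^{N-m} = (1-\mu)^N \sum_{m=0}^{N}\binom{N}{m}\left(\frac{\mu}{1-\mu}\right)^m = (1-\mu)^N\left(1 + \frac{\mu}{1-\mu}\right)^N = 1$$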
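11 Generalization of the Bernoulli Distribution
We are now looking at discrete variables that can take on one of $K$ possible mutually exclusive states. The variable is represented by a $K$-dimensional vector $\mathbf{x}$ in which one of the elements $x_k$ equals 1, and all remaining elements equal 0, for example (with $K = 6$ and $x_3 = 1$):
$$\mathbf{x} = (0, 0, 1, 0, 0, 0)^T$$
These vectors satisfy:
$$\sum_{k=1}^{K} x_k = 1$$
Let the probability of $x_k = 1$ be denoted as $\mu_k$. Then
$$p(\mathbf{x} \mid \boldsymbol{\mu}) = \prod_{k=1}^{K} \mu_k^{x_k}, \qquad \sum_{k=1}^{K}\mu_k = 1, \qquad \mu_k \ge 0$$
where $\boldsymbol{\mu} = (\mu_1, \ldots, \mu_K)^T$.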
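12 Multinoulli/Categorical Distribution
The distribution is already normalized:
$$\sum_{\mathbf{x}} p(\mathbf{x} \mid \boldsymbol{\mu}) = \sum_{k=1}^{K} \mu_k = 1$$
The mean of the distribution is computed as:
$$\mathbb{E}[\mathbf{x} \mid \boldsymbol{\mu}] = \sum_{\mathbf{x}} p(\mathbf{x} \mid \boldsymbol{\mu})\,\mathbf{x} = (\mu_1, \ldots, \mu_K)^T = \boldsymbol{\mu}$$
similar to the result for the Bernoulli distribution.
The Multinoulli, also known as the Categorical distribution, is often denoted as (Mu here is the multinomial distribution):
$$\mathrm{Cat}(\mathbf{x} \mid \boldsymbol{\mu}) = \mathrm{Multinoulli}(\mathbf{x} \mid \boldsymbol{\mu}) = \mathrm{Mu}(\mathbf{x} \mid 1, \boldsymbol{\mu})$$
The parameter 1 emphasizes that we roll a $K$-sided die once ($N = 1$); see the next slides for the multinomial distribution.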
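13 Likelihood: Multinoulli Distribution
Let us consider a data set $\mathcal{D} = \{\mathbf{x}_1, \ldots, \mathbf{x}_N\}$. The likelihood becomes:
$$p(\mathcal{D} \mid \boldsymbol{\mu}) = \prod_{n=1}^{N}\prod_{k=1}^{K} \mu_k^{x_{nk}} = \prod_{k=1}^{K} \mu_k^{\sum_n x_{nk}} = \prod_{k=1}^{K} \mu_k^{m_k}, \qquad \text{where } m_k = \sum_{n=1}^{N} x_{nk}$$
is the number of observations of $x_k = 1$. $m_k$ is the sufficient statistic of the distribution.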
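14 MLE Estimate: Multinoulli Distribution
To compute the maximum likelihood (MLE) estimate of $\boldsymbol{\mu}$, we maximize an augmented log-likelihood
$$\ln p(\mathcal{D} \mid \boldsymbol{\mu}) + \lambda\left(\sum_{k=1}^{K}\mu_k - 1\right) = \sum_{k=1}^{K} m_k \ln \mu_k + \lambda\left(\sum_{k=1}^{K}\mu_k - 1\right)$$
Setting the derivative with respect to $\mu_k$ equal to zero:
$$\mu_k = -\frac{m_k}{\lambda}$$
Substitution into the constraint $\sum_{k=1}^{K}\mu_k = 1$ gives
$$-\frac{1}{\lambda}\sum_{k=1}^{K} m_k = 1 \;\Rightarrow\; \lambda = -N, \qquad \mu_k^{ML} = \frac{m_k}{N}$$
As expected, this is the fraction of the $N$ observations for which $x_k = 1$.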
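The estimate $\mu_k^{ML} = m_k / N$ is just a normalized count. A minimal sketch follows, assuming observations are stored as state indices $0, \ldots, K-1$ rather than one-hot vectors (an encoding chosen here for brevity); `multinoulli_mle` and the sample rolls are illustrative.

```python
from collections import Counter

def multinoulli_mle(observations, K):
    """Return [m_k / N for k in 0..K-1], where m_k counts observations of state k."""
    N = len(observations)
    counts = Counter(observations)            # sufficient statistics m_k
    return [counts.get(k, 0) / N for k in range(K)]

# Ten rolls of a hypothetical 3-sided die.
rolls = [0, 2, 2, 1, 2, 0, 2, 1, 2, 2]
mu_ml = multinoulli_mle(rolls, K=3)
print(mu_ml)
```

States that never occur get probability zero under this estimate, which is one practical weakness of the plain MLE.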
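15 Multinomial Distribution
We can also consider the joint distribution of $m_1, \ldots, m_K$ in $N$ observations conditioned on the parameters $\boldsymbol{\mu} = (\mu_1, \ldots, \mu_K)$. From the expression for the likelihood given earlier,
$$p(\mathcal{D} \mid \boldsymbol{\mu}) = \prod_{k=1}^{K}\mu_k^{m_k},$$
the multinomial distribution $\mathrm{Mu}(m_1, \ldots, m_K \mid N, \boldsymbol{\mu})$ with parameters $N$ and $\boldsymbol{\mu}$ takes the form:
$$p(m_1, m_2, \ldots, m_K \mid N, \mu_1, \mu_2, \ldots, \mu_K) = \frac{N!}{m_1!\,m_2! \cdots m_K!}\,\mu_1^{m_1}\mu_2^{m_2}\cdots\mu_K^{m_K}, \qquad \text{where } \sum_{k=1}^{K} m_k = N$$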
16 Example: Biosequence Analysis
Consider a set of DNA sequences where there are 10 rows (sequences) and 15 columns (locations along the genome). Several locations are conserved by evolution (e.g., because they are part of a gene coding region), since the corresponding columns tend to be pure, e.g., column 7 is all g's.
To visualize the data (sequence logo), we plot the letters A, C, G and T with a font size proportional to their empirical probability, and with the most probable letter on the top.
Sequences (rows) by location along the genome (columns):
c g a t a c g g g g t c g a a
c a a t c c g a g a t c g c a
c a a t c c g t g t t g g g a
c a a t c g g c a t g c g g g
c g a g c c g c g t a c g a a
c a t a c g g a g c a c g a a
t a a t c c g g g c a t g t a
c g a g c c g a g t a c a g a
c c a t c c g c g t a a g c a
g g a t a c g a g a t g a c a
17 Example: Biosequence Analysis
The empirical probability distribution at location $t$ is obtained by normalizing the vector of counts (see MLE estimate):
$$\hat{\boldsymbol{\theta}}_t = \frac{1}{N}\left(\sum_{i=1}^{N}\mathbb{I}(X_{it} = 1),\; \sum_{i=1}^{N}\mathbb{I}(X_{it} = 2),\; \sum_{i=1}^{N}\mathbb{I}(X_{it} = 3),\; \sum_{i=1}^{N}\mathbb{I}(X_{it} = 4)\right)$$
This distribution is known as a motif.
Can also compute the most probable letter in each location; this is the consensus sequence.
Use MatLab function seqlogodemo from Kevin Murphy's PMTK.
[Figure: sequence logo; horizontal axis: Sequence Position, vertical axis: Bits.]
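The motif and consensus sequence can be computed in a few lines. The sketch below is not the PMTK seqlogodemo; it is an illustrative Python version that uses the ten sequences listed on the preceding slide, with hypothetical helper names `motif` and `consensus`.

```python
from collections import Counter

# The ten sequences from the slide, one string per row (15 locations each).
SEQUENCES = [
    "cgatacggggtcgaa",
    "caatccgagatcgca",
    "caatccgtgttggga",
    "caatcggcatgcggg",
    "cgagccgcgtacgaa",
    "catacggagcacgaa",
    "taatccgggcatgta",
    "cgagccgagtacaga",
    "ccatccgcgtaagca",
    "ggatacgagatgaca",
]
ALPHABET = "acgt"

def motif(sequences):
    """Per-location empirical distribution: letter counts divided by the number of rows."""
    n_rows = len(sequences)
    theta = []
    for column in zip(*sequences):
        counts = Counter(column)
        theta.append({letter: counts[letter] / n_rows for letter in ALPHABET})
    return theta

def consensus(theta):
    """Most probable letter per location (ties go to the earlier letter in ALPHABET)."""
    return "".join(max(ALPHABET, key=theta_t.get) for theta_t in theta)

theta = motif(SEQUENCES)
print(theta[6])          # location 7 is conserved: all probability on 'g'
print(consensus(theta))
```

Note that some locations have tied counts, so the consensus letter there depends on the tie-breaking rule assumed in `consensus`.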
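18 Summary of Discrete Distributions
The multinomial and related discrete distributions are summarized in a table from Kevin Murphy's textbook, organized by the number of rolls $n$ and the number of outcomes $K$: $n = 1$ (one roll of the dice) versus $n = N$ ($N$ rolls of the dice), and binary variables versus $K$ states (1-of-$K$ encoding). In terms of the distributions defined on the previous slides:
- Bernoulli: one roll, binary variable
- Binomial: $N$ rolls, binary variable
- Multinoulli: one roll, $K$ states (1-of-$K$ encoding)
- Multinomial: $N$ rolls, $K$ states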
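19 The Poisson Distribution
We say that $X \in \{0, 1, 2, 3, \ldots\}$ has a Poisson distribution with parameter $\lambda > 0$, written $X \sim \mathrm{Poi}(\lambda)$, if its pmf is
$$\mathrm{Poi}(x \mid \lambda) = e^{-\lambda}\frac{\lambda^{x}}{x!}$$
This is a model for counts of rare events.
Use MatLab function poissonplotdemo from Kevin Murphy's PMTK.
[Figure: Poisson pmfs for two values of $\lambda$.]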
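20 The Empirical Distribution
Given data $\mathcal{D} = \{x_1, \ldots, x_N\}$, we define the empirical distribution as:
$$p_{emp}(A) = \frac{1}{N}\sum_{i=1}^{N}\delta_{x_i}(A), \qquad \text{Dirac measure: } \delta_{x_i}(A) = \begin{cases} 0 & \text{if } x_i \notin A \\ 1 & \text{if } x_i \in A \end{cases}$$
We can also associate weights with each sample, generalizing
$$p_{emp}(x) = \frac{1}{N}\sum_{i=1}^{N}\delta_{x_i}(x) \quad \text{to} \quad p(x) = \sum_{i=1}^{N} w_i\,\delta_{x_i}(x), \qquad 0 \le w_i \le 1, \quad \sum_{i=1}^{N} w_i = 1$$
This corresponds to a histogram with spikes at each sample point, with height equal to the corresponding weight. This distribution assigns zero weight to any point not in the dataset.
Note that the sample mean of $f(x)$ is the expectation of $f(x)$ under the empirical distribution:
$$\mathbb{E}[f(x)] = \int f(x)\,\frac{1}{N}\sum_{i=1}^{N}\delta_{x_i}(x)\,dx = \frac{1}{N}\sum_{i=1}^{N} f(x_i)$$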
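Because the empirical distribution is a weighted sum of spikes, expectations under it reduce to weighted sums over the samples. A minimal sketch, with illustrative function names and a toy data set chosen here:

```python
def empirical_expectation(f, samples, weights=None):
    """E[f(x)] under p(x) = sum_i w_i delta_{x_i}(x); uniform weights 1/N by default."""
    N = len(samples)
    if weights is None:
        weights = [1.0 / N] * N
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(w * f(x) for w, x in zip(weights, samples))

def empirical_measure(in_A, samples, weights=None):
    """p_emp(A), with the set A given by its indicator function in_A."""
    return empirical_expectation(lambda x: 1.0 if in_A(x) else 0.0, samples, weights)

data = [1.0, 2.0, 2.0, 5.0]
sample_mean = empirical_expectation(lambda x: x, data)       # (1/N) sum_i x_i
prob_le_2 = empirical_measure(lambda x: x <= 2.0, data)      # fraction of samples in A
print(sample_mean, prob_le_2)
```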
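21 Student's T Distribution
$$p(x \mid \mu, \lambda, \nu) = \frac{\Gamma(\nu/2 + 1/2)}{\Gamma(\nu/2)}\left(\frac{\lambda}{\pi\nu}\right)^{1/2}\left[1 + \frac{\lambda(x-\mu)^2}{\nu}\right]^{-\nu/2 - 1/2}$$
The parameter $\lambda$ is called the precision of the T-distribution, even though it is not in general equal to the inverse of the variance (see below on the behavior as $\nu \to \infty$).
The parameter $\nu$ is called the degrees of freedom.
For the particular case of $\nu = 1$, the T-distribution reduces to the Cauchy distribution.
In the limit $\nu \to \infty$, the T-distribution $\mathcal{T}(x \mid \mu, \lambda, \nu)$ becomes a Gaussian $\mathcal{N}(x \mid \mu, \lambda^{-1})$ with mean $\mu$ and precision $\lambda$.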
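22 For $\nu \to \infty$, $\mathcal{T}(x \mid \mu, \lambda, \nu)$ Becomes a Gaussian
$$p(x \mid \mu, \lambda, \nu) = \frac{\Gamma(\nu/2 + 1/2)}{\Gamma(\nu/2)}\left(\frac{\lambda}{\pi\nu}\right)^{1/2}\left[1 + \frac{\lambda(x-\mu)^2}{\nu}\right]^{-\nu/2 - 1/2}$$
We first write the distribution as follows:
$$\mathcal{T}(x \mid \mu, \lambda, \nu) \propto \left[1 + \frac{\lambda(x-\mu)^2}{\nu}\right]^{-\nu/2 - 1/2} = \exp\left(-\frac{\nu + 1}{2}\ln\left[1 + \frac{\lambda(x-\mu)^2}{\nu}\right]\right)$$
For large $\nu$, we can approximate the log using $\ln(1+\epsilon) = \epsilon + O(\epsilon^2)$:
$$\mathcal{T}(x \mid \mu, \lambda, \nu) \propto \exp\left(-\frac{\nu + 1}{2}\left[\frac{\lambda(x-\mu)^2}{\nu} + O(\nu^{-2})\right]\right) = \exp\left(-\frac{\lambda(x-\mu)^2}{2} + O(\nu^{-1})\right)$$
In the limit $\nu \to \infty$, the T-distribution $\mathcal{T}(x \mid \mu, \lambda, \nu)$ is indeed a Gaussian $\mathcal{N}(x \mid \mu, \lambda^{-1})$ with mean $\mu$ and precision $\lambda$.
The normalization of the T is valid in this limit as well (so the Gaussian obtained is normalized).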
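The convergence to the Gaussian can also be seen numerically. The following sketch evaluates both densities on a grid and reports the largest gap for increasing $\nu$; the grid, the helper `max_abs_diff` and the choice $\mu = 0$, $\lambda = 1$ are assumptions made for illustration.

```python
import math

def student_t_pdf(x, mu, lam, nu):
    """T(x | mu, lam, nu), evaluated in log space for numerical stability."""
    log_norm = (math.lgamma(nu / 2 + 0.5) - math.lgamma(nu / 2)
                + 0.5 * math.log(lam / (math.pi * nu)))
    return math.exp(log_norm - (nu / 2 + 0.5) * math.log1p(lam * (x - mu)**2 / nu))

def gaussian_pdf(x, mu, lam):
    """N(x | mu, 1/lam), parameterized by the precision lam."""
    return math.sqrt(lam / (2 * math.pi)) * math.exp(-0.5 * lam * (x - mu)**2)

def max_abs_diff(nu, mu=0.0, lam=1.0):
    """Largest gap between the two densities on a grid over [mu - 5, mu + 5]."""
    grid = [mu + 0.1 * i for i in range(-50, 51)]
    return max(abs(student_t_pdf(x, mu, lam, nu) - gaussian_pdf(x, mu, lam))
               for x in grid)

for nu in (1, 10, 1000, 1e6):
    print(nu, max_abs_diff(nu))
```

For $\nu = 1$ the same function returns the Cauchy density, e.g. $1/\pi$ at $x = \mu$ when $\lambda = 1$.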
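23 Student's T Distribution
$$p(x \mid \mu, \lambda, \nu) = \frac{\Gamma(\nu/2 + 1/2)}{\Gamma(\nu/2)}\left(\frac{\lambda}{\pi\nu}\right)^{1/2}\left[1 + \frac{\lambda(x-\mu)^2}{\nu}\right]^{-\nu/2 - 1/2}$$
$$\text{Mean: } \mu \;(\nu > 1), \qquad \text{Mode: } \mu, \qquad \text{Var: } \frac{\nu}{\lambda(\nu - 2)} \;(\nu > 2)$$
For $\nu \to \infty$, we obtain $\mathcal{N}(x \mid \mu, \lambda^{-1})$.
[Figure: Student distribution for several values of $\nu$; MatLab code linked on the slide.]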
24 Student's T vs. the Gaussian
We plot $\mathcal{N}(x \mid 0, 1)$, $\mathcal{T}(x \mid 0, 1, 1)$ and $\mathrm{Lap}(x \mid 0, 1/\sqrt{2})$. The mean and variance of the Student's T are undefined for $\nu = 1$.
The Student's T is NOT log concave.
When $\nu = 1$, the distribution is known as Cauchy or Lorentz. Due to its heavy tails, the mean does not converge. Recommended to use $\nu =$ [value missing in the source].
Run MatLab function studentlaplacepdfplot from Kevin Murphy's PMTK.
[Figure: probability density functions of the Gauss, Student and Laplace distributions, and the logs of the same PDFs.]
25 Student's T Distribution
$$p(x \mid \mu, a, b) = \int_0^\infty \mathcal{N}(x \mid \mu, \tau^{-1})\,\mathrm{Gamma}(\tau \mid a, b)\,d\tau = \int_0^\infty \frac{b^{a} e^{-b\tau}\tau^{a-1}}{\Gamma(a)}\left(\frac{\tau}{2\pi}\right)^{1/2}\exp\left(-\frac{\tau}{2}(x-\mu)^2\right)d\tau$$
The Student's T distribution can be seen from the equation above (see the following two slides for proof) as an infinite mixture of Gaussians, each of them with a different precision (governed by a Gamma distribution).
The result is a distribution that in general has longer tails than a Gaussian. This gives the T-distribution robustness, i.e. the T-distribution is much less sensitive than the Gaussian to the presence of outliers.
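26 Appendix: Student's T as a Mixture of Gaussians
If we have a univariate Gaussian $\mathcal{N}(x \mid \mu, \tau^{-1})$ together with a prior $\mathrm{Gamma}(\tau \mid a, b)$ and we integrate out the precision, we obtain the marginal distribution of $x$:
$$p(x \mid \mu, a, b) = \int_0^\infty \mathcal{N}(x \mid \mu, \tau^{-1})\,\mathrm{Gamma}(\tau \mid a, b)\,d\tau = \int_0^\infty \frac{b^{a} e^{-b\tau}\tau^{a-1}}{\Gamma(a)}\left(\frac{\tau}{2\pi}\right)^{1/2}\exp\left(-\frac{\tau}{2}(x-\mu)^2\right)d\tau$$
Introduce the transformation
$$z = \tau\left[b + \frac{(x-\mu)^2}{2}\right] \equiv \tau A$$
to simplify as:
$$p(x \mid \mu, a, b) = \frac{b^{a}}{\Gamma(a)}\left(\frac{1}{2\pi}\right)^{1/2} A^{-a-1/2}\int_0^\infty z^{a-1/2}\exp(-z)\,dz$$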
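27 Appendix: Student's T as a Mixture of Gaussians
$$p(x \mid \mu, a, b) = \frac{b^{a}}{\Gamma(a)}\left(\frac{1}{2\pi}\right)^{1/2} A^{-a-1/2}\int_0^\infty z^{a-1/2}\exp(-z)\,dz, \qquad A = b + \frac{(x-\mu)^2}{2}$$
Recalling the definition of the Gamma function,
$$\Gamma(a + 1/2) = \int_0^\infty z^{a-1/2}\exp(-z)\,dz,$$
we obtain
$$p(x \mid \mu, a, b) = \frac{b^{a}}{\Gamma(a)}\left(\frac{1}{2\pi}\right)^{1/2}\left[b + \frac{(x-\mu)^2}{2}\right]^{-a-1/2}\Gamma(a + 1/2)$$
It is common to redefine the parameters in this distribution as $\nu = 2a$, $\lambda = a/b$:
$$p(x \mid \mu, \lambda, \nu) = \frac{\Gamma(\nu/2 + 1/2)}{\Gamma(\nu/2)}\left(\frac{\lambda}{\pi\nu}\right)^{1/2}\left[1 + \frac{\lambda(x-\mu)^2}{\nu}\right]^{-\nu/2 - 1/2}$$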
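The mixture identity can be verified numerically by integrating the Gaussian times the Gamma density over the precision $\tau$ and comparing with the closed form under $\nu = 2a$, $\lambda = a/b$. The sketch below uses a plain midpoint rule; the truncation point `tau_max`, the step count and the test values are assumptions chosen for illustration.

```python
import math

def student_t_pdf(x, mu, lam, nu):
    """Closed-form T(x | mu, lam, nu)."""
    log_norm = (math.lgamma(nu / 2 + 0.5) - math.lgamma(nu / 2)
                + 0.5 * math.log(lam / (math.pi * nu)))
    return math.exp(log_norm - (nu / 2 + 0.5) * math.log1p(lam * (x - mu)**2 / nu))

def gaussian_gamma_mixture(x, mu, a, b, tau_max=80.0, steps=100_000):
    """Midpoint-rule approximation of int_0^inf N(x | mu, 1/tau) Gamma(tau | a, b) dtau.

    The integral is truncated at tau_max, where the integrand is negligible
    for the moderate values of b used here.
    """
    h = tau_max / steps
    log_gamma_norm = a * math.log(b) - math.lgamma(a)
    total = 0.0
    for i in range(steps):
        tau = (i + 0.5) * h
        log_gamma = log_gamma_norm + (a - 1) * math.log(tau) - b * tau
        log_gauss = 0.5 * math.log(tau / (2 * math.pi)) - 0.5 * tau * (x - mu)**2
        total += math.exp(log_gamma + log_gauss)
    return total * h

x, mu, a, b = 0.7, 0.0, 2.0, 1.0
mixture = gaussian_gamma_mixture(x, mu, a, b)
closed_form = student_t_pdf(x, mu, lam=a / b, nu=2 * a)
print(mixture, closed_form)
```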
28 Robustness of Student's T Distribution
The robustness of the T-distribution is illustrated here by comparing the maximum likelihood solutions for a Gaussian and a T-distribution (3 data points from the Gaussian are used).
The effect of a small number of outliers (figure on the right) is less significant for the T-distribution than for the Gaussian.
[Figure: left, the T-distribution and Gaussian fits nearly overlap; right, Gaussian and T-distribution fits in the presence of outliers. MatLab code linked on the slide.]
29 Robustness of Student's T Distribution
The earlier simulation is repeated here with the PMTK toolbox.
Run MatLab function robustdemo from Kevin Murphy's PMTK.
[Figure: two panels comparing the fitted gaussian, student T and laplace densities.]
30 The Laplace Distribution
Another distribution with heavy tails is the Laplace distribution, also known as the double-sided exponential distribution. It has the following pdf:
$$\mathrm{Lap}(x \mid \mu, b) = \frac{1}{2b}\exp\left(-\frac{|x - \mu|}{b}\right)$$
$\mu$ is a location parameter and $b > 0$ is a scale parameter.
$$\text{Mean} = \mu, \qquad \text{Mode} = \mu, \qquad \text{Var} = 2b^2$$
It is robust to outliers (see the earlier demonstration).
It puts more probability density at 0 than the Gaussian. This property is a useful way to encourage sparsity in a model.
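Both properties, the sharper peak and the heavier tails, show up when the Laplace is compared with a Gaussian of the same variance. In this sketch the scale $b = 1/\sqrt{2}$ is chosen so that $2b^2 = 1$ matches a unit-variance Gaussian; the evaluation points are arbitrary.

```python
import math

def laplace_pdf(x, mu, b):
    """Lap(x | mu, b) = exp(-|x - mu| / b) / (2b)."""
    return math.exp(-abs(x - mu) / b) / (2 * b)

def gaussian_pdf(x, mu, sigma2):
    """N(x | mu, sigma2), parameterized by the variance."""
    return math.exp(-0.5 * (x - mu)**2 / sigma2) / math.sqrt(2 * math.pi * sigma2)

b = 1 / math.sqrt(2)      # Laplace variance 2 b^2 = 1
for x in (0.0, 1.0, 4.0):
    print(x, laplace_pdf(x, 0.0, b), gaussian_pdf(x, 0.0, 1.0))
```

At $x = 0$ and far in the tail the Laplace density is the larger of the two, while at intermediate distances the Gaussian is larger.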
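31 Beta Distribution
The Beta($\alpha$, $\beta$) distribution with $x \in [0, 1]$, $\alpha, \beta > 0$ is defined as follows:
$$\mathrm{Beta}(x \mid \alpha, \beta) = \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\Gamma(\beta)}\,x^{\alpha-1}(1-x)^{\beta-1} = \frac{x^{\alpha-1}(1-x)^{\beta-1}}{\mathrm{beta}(\alpha, \beta)}, \qquad \mathrm{beta}(\alpha, \beta) = \frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha + \beta)} \;\text{(normalizing factor)}$$
The expected value, mode and variance of a Beta random variable $x$ with (hyper-)parameters $\alpha$ and $\beta$:
$$\mathbb{E}[x] = \frac{\alpha}{\alpha + \beta}, \qquad \mathrm{mode}[x] = \frac{\alpha - 1}{\alpha + \beta - 2}, \qquad \mathrm{var}[x] = \frac{\alpha\beta}{(\alpha + \beta)^2(\alpha + \beta + 1)}$$
[Figure: beta distributions for four settings of (a, b).]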
32 Beta Distribution
If $\alpha = \beta = 1$, we obtain a uniform distribution.
If $\alpha$ and $\beta$ are both less than 1, we get a bimodal distribution with spikes at 0 and 1.
If $\alpha$ and $\beta$ are both greater than 1, the distribution is unimodal.
Run betaplotdemo from PMTK.
[Figure: beta distributions for four settings of (a, b), annotated with the cases $\alpha = \beta = 1$; $\alpha, \beta > 1$; and $\alpha, \beta < 1$.]
33 Beta Distribution
[Figure: four panels showing the pdf against x for different Beta distributions, including Beta(8,4). See Matlab implementation.]
34 Gamma Function
The gamma function extends the factorial to real numbers:
$$\Gamma(x) = \int_0^\infty u^{x-1} e^{-u}\,du$$
With integration by parts:
$$\Gamma(x + 1) = x\,\Gamma(x)$$
For integer $n$, since $\Gamma(1) = 1$:
$$\Gamma(n) = (n-1)!$$
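These properties are easy to confirm with Python's built-in `math.gamma`. The sketch also evaluates the defining integral with a midpoint rule; `gamma_numeric`, its truncation point and its step count are illustrative assumptions, and the rule is only intended for $x \ge 1$ where the integrand is bounded.

```python
import math

def gamma_numeric(x, upper=60.0, steps=100_000):
    """Midpoint-rule approximation of int_0^upper u^(x-1) e^(-u) du, for x >= 1."""
    h = upper / steps
    return h * sum(((i + 0.5) * h)**(x - 1) * math.exp(-(i + 0.5) * h)
                   for i in range(steps))

# Gamma(n) = (n - 1)! for integer n.
for n in range(1, 8):
    print(n, math.gamma(n), math.factorial(n - 1))

# Recurrence Gamma(x + 1) = x Gamma(x) at a non-integer point.
x = 2.5
print(math.gamma(x + 1), x * math.gamma(x))

# Defining integral versus the library value.
print(gamma_numeric(3.5), math.gamma(3.5))
```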
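35 Beta Distribution: Normalization
Showing that the Beta($\alpha$, $\beta$) distribution is normalized correctly is a bit tricky. We need to prove that:
$$\Gamma(\alpha + \beta)\int_0^1 \mu^{\alpha-1}(1-\mu)^{\beta-1}\,d\mu = \Gamma(\alpha)\Gamma(\beta)$$
Follow the steps: (a) change the variable $y$ below to $t$ via $y = t - x$; (b) change the order of integration in the triangular region $0 \le x \le t$; and (c) change $x$ to $\mu$ via $x = t\mu$:
$$\Gamma(\alpha)\Gamma(\beta) = \int_0^\infty x^{\alpha-1}e^{-x}\,dx\int_0^\infty y^{\beta-1}e^{-y}\,dy = \int_0^\infty x^{\alpha-1}\left\{\int_x^\infty (t-x)^{\beta-1}e^{-t}\,dt\right\}dx$$
$$= \int_0^\infty e^{-t}\left\{\int_0^t x^{\alpha-1}(t-x)^{\beta-1}\,dx\right\}dt = \int_0^\infty e^{-t}\,t^{\alpha+\beta-1}\,dt\int_0^1 \mu^{\alpha-1}(1-\mu)^{\beta-1}\,d\mu = \Gamma(\alpha+\beta)\int_0^1 \mu^{\alpha-1}(1-\mu)^{\beta-1}\,d\mu$$
[Figure: the region of integration in the (x, t) plane, bounded by the line t = x.]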
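The identity can be spot-checked numerically for particular parameter values. This sketch compares a midpoint-rule estimate of the integral with the Gamma-function ratio; it assumes $\alpha, \beta \ge 1$ so that the integrand stays bounded at the endpoints, and the helper names are illustrative.

```python
import math

def beta_integral(alpha, beta, steps=100_000):
    """Midpoint-rule estimate of int_0^1 mu^(alpha-1) (1-mu)^(beta-1) dmu (alpha, beta >= 1)."""
    h = 1.0 / steps
    return h * sum(((i + 0.5) * h)**(alpha - 1) * (1 - (i + 0.5) * h)**(beta - 1)
                   for i in range(steps))

def beta_function(alpha, beta):
    """Gamma(alpha) Gamma(beta) / Gamma(alpha + beta)."""
    return math.gamma(alpha) * math.gamma(beta) / math.gamma(alpha + beta)

for alpha, beta in ((2, 3), (8, 4)):
    print(alpha, beta, beta_integral(alpha, beta), beta_function(alpha, beta))
```

For $\alpha = 2$, $\beta = 3$ both values equal $\Gamma(2)\Gamma(3)/\Gamma(5) = 2/24 = 1/12$.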
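36 Gamma Distribution: Rate Parametrization
It is frequently a model for waiting times.
It is more often parameterized in terms of a shape parameter $a$ and an inverse scale parameter $b = 1/\theta$, called a rate parameter:
$$p(x \mid a, b) = \frac{b^{a}}{\Gamma(a)}\,x^{a-1}e^{-bx}, \qquad x \in [0, \infty), \quad a, b > 0, \qquad \Gamma(a) = \int_0^\infty u^{a-1}e^{-u}\,du$$
The mean, mode and variance with this parametrization are:
$$\mathbb{E}[x] = \frac{a}{b}, \qquad \mathrm{mode}[x] = \begin{cases}\dfrac{a-1}{b} & \text{for } a \ge 1 \\ 0 & \text{otherwise}\end{cases}, \qquad \mathrm{var}[x] = \frac{a}{b^2}$$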
37 Plots of Gamma Distribution
$$\mathrm{Gamma}(X \mid a, b) = \frac{b^{a}}{\Gamma(a)}\,x^{a-1}\exp(-xb)$$
As we increase the rate $b$, the distribution squeezes leftwards and upwards.
For $a < 1$, the mode is at zero.
Run gammaplotdemo from PMTK.
[Figure: Gamma distributions for three settings of (a, b).]
38 Gamma Distribution
An empirical PDF of rainfall data fitted with a Gamma distribution.
Run MatLab function gammarainfalldemo from PMTK.
[Figure: rainfall histogram with Gamma fits obtained by MoM and MLE.]
39 Exponential Distribution
This is defined as
$$\mathrm{Expon}(X \mid \lambda) = \mathrm{Gamma}(X \mid 1, \lambda) = \lambda\exp(-x\lambda), \qquad x \ge 0$$
Here $\lambda$ is the rate parameter.
This distribution describes the times between events in a Poisson process, i.e. a process in which events occur continuously and independently at a constant average rate $\lambda$.
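40 Chi-Squared Distribution
This is defined as
$$\chi^2_\nu(X) = \mathrm{Gamma}\left(X \,\Big|\, \frac{\nu}{2}, \frac{1}{2}\right) = \frac{1}{2^{\nu/2}\,\Gamma(\nu/2)}\,x^{\nu/2 - 1}\exp\left(-\frac{x}{2}\right), \qquad x \ge 0$$
This is the distribution of the sum of squared standard Gaussian random variables. More precisely,
$$\text{let } Z_i \sim \mathcal{N}(0, 1) \text{ and } S = \sum_{i=1}^{\nu} Z_i^2; \text{ then } S \sim \chi^2_\nu$$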
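A quick simulation illustrates the sum-of-squares construction. Since $\chi^2_\nu = \mathrm{Gamma}(\nu/2, 1/2)$, the Gamma mean $a/b$ and variance $a/b^2$ give mean $\nu$ and variance $2\nu$; the sketch below checks these against sample moments. The seed, sample size and $\nu = 3$ are arbitrary choices.

```python
import random

def sample_chi_squared(nu, rng):
    """Draw S = sum of nu squared standard Gaussian variables."""
    return sum(rng.gauss(0.0, 1.0)**2 for _ in range(nu))

rng = random.Random(0)            # fixed seed so the run is reproducible
nu = 3
samples = [sample_chi_squared(nu, rng) for _ in range(50_000)]

sample_mean = sum(samples) / len(samples)
sample_var = sum((s - sample_mean)**2 for s in samples) / len(samples)
print(sample_mean, sample_var)    # compare with nu and 2 * nu
```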
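41 Inverse Gamma Distribution
This is defined as follows: if $X \sim \mathrm{Gamma}(X \mid a, b)$, then $1/X \sim \mathrm{InvGamma}(X \mid a, b)$, where:
$$\mathrm{InvGamma}(X \mid a, b) = \frac{b^{a}}{\Gamma(a)}\,x^{-(a+1)}\exp(-b/x), \qquad x \ge 0$$
$a$ is the shape and $b$ the scale parameter. Note that $b$ is a scale parameter since:
$$\mathrm{InvGamma}(X \mid a, b) = \frac{1}{b}\,\mathrm{InvGamma}\left(\frac{X}{b} \,\Big|\, a, 1\right)$$
It can be shown that:
$$\text{Mean} = \frac{b}{a - 1} \;(\text{exists for } a > 1), \qquad \text{Mode} = \frac{b}{a + 1}, \qquad \text{var} = \frac{b^2}{(a-1)^2(a-2)} \;(\text{exists for } a > 2)$$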
42 The Pareto Distribution
Used to model the distribution of quantities that exhibit long tails (heavy tails):
$$\mathrm{Pareto}(X \mid k, m) = k\,m^{k}\,x^{-(k+1)}\,\mathbb{I}(x \ge m)$$
This density asserts that $x$ must be greater than some constant $m$, but not too much greater; $k$ controls what is "too much".
Examples: modeling the frequency of words vs. their rank (e.g. "the", "of", etc.) or the wealth of people.*
As $k \to \infty$, the distribution approaches $\delta(x - m)$.
On a log-log scale, the pdf forms a straight line of the form $\log p(x) = a \log x + c$ for some constants $a$ and $c$ (power law, Zipf's law).
* Basis of the distribution: a high proportion of a population has low income and only a few have very high incomes.
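43 The Pareto Distribution
Applications: modeling the frequency of words vs. their rank, distribution of wealth ($k$ = Pareto index), etc.
$$\mathrm{Pareto}(X \mid k, m) = k\,m^{k}\,x^{-(k+1)}\,\mathbb{I}(x \ge m)$$
$$\text{Mean} = \frac{km}{k - 1} \;(\text{if } k > 1), \qquad \text{Mode} = m, \qquad \text{var} = \frac{m^2 k}{(k-1)^2(k-2)} \;(\text{if } k > 2)$$
Run ParetoPlot from PMTK.
[Figure: Pareto distribution for three settings of (m, k).]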
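The power-law behaviour can be read off the density directly: taking logs of $k\,m^{k}x^{-(k+1)}$ gives a line with slope $-(k+1)$ and intercept $\log(k\,m^k)$ on a log-log scale. A minimal sketch with illustrative parameter values $k = 2$, $m = 1$:

```python
import math

def pareto_pdf(x, k, m):
    """Pareto(x | k, m) = k m^k x^-(k+1) for x >= m, and 0 otherwise."""
    return k * m**k * x**(-(k + 1)) if x >= m else 0.0

def loglog_slope(k, m, x1, x2):
    """Slope of log p(x) against log x between two points x1, x2 >= m."""
    return ((math.log(pareto_pdf(x2, k, m)) - math.log(pareto_pdf(x1, k, m)))
            / (math.log(x2) - math.log(x1)))

k, m = 2.0, 1.0
slope = loglog_slope(k, m, 2.0, 50.0)
intercept = math.log(pareto_pdf(2.0, k, m)) - slope * math.log(2.0)
print(slope, -(k + 1))                 # slope a of the straight line
print(intercept, math.log(k * m**k))   # constant c of the straight line
```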
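44 Covariance
Consider two random variables $X, Y: \Omega \to \mathbb{R}$. The joint probability distribution is defined as:
$$P(X \in A, Y \in B) = P\left(X^{-1}(A) \cap Y^{-1}(B)\right) = \int_A\int_B p(x, y)\,dx\,dy$$
Two random variables are independent if
$$p(x, y) = p(x)\,p(y)$$
The covariance of $X$ and $Y$ is defined as:
$$\mathrm{cov}(X, Y) = \mathbb{E}\left[(X - \mathbb{E}(X))(Y - \mathbb{E}(Y))\right]$$
It is straightforward to verify that:
$$\mathrm{cov}(X, Y) = \mathbb{E}[XY] - \mathbb{E}[X]\,\mathbb{E}[Y]$$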
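45 Correlation, Center Normalized Random Variables
Consider two random variables $X, Y: \Omega \to \mathbb{R}$. The correlation coefficient of $X$ and $Y$ is defined as:
$$\mathrm{corr}(X, Y) = \frac{\mathrm{cov}(X, Y)}{\sigma_X\,\sigma_Y}$$
where the standard deviations of $X$ and $Y$ are
$$\sigma_X = \sqrt{\mathrm{cov}(X, X)}, \qquad \sigma_Y = \sqrt{\mathrm{cov}(Y, Y)}$$
The center normalized random variables are defined as:
$$\tilde{X} = \frac{X - \mathbb{E}[X]}{\sigma_X}, \qquad \tilde{Y} = \frac{Y - \mathbb{E}[Y]}{\sigma_Y}$$
It is straightforward to verify that:
$$\mathbb{E}[\tilde{X}] = \mathbb{E}[\tilde{Y}] = 0, \qquad \mathrm{var}[\tilde{X}] = \mathrm{var}[\tilde{Y}] = 1$$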
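These definitions translate directly into code when expectations are replaced by averages over paired samples. The sketch below uses the population form (division by $N$) and a small made-up data set; all function names are illustrative.

```python
import math

def mean(v):
    return sum(v) / len(v)

def cov(x, y):
    """Empirical cov(X, Y) = E[(X - E X)(Y - E Y)], using 1/N averages."""
    mx, my = mean(x), mean(y)
    return mean([(a - mx) * (b - my) for a, b in zip(x, y)])

def corr(x, y):
    """Correlation coefficient cov(X, Y) / (sigma_X sigma_Y)."""
    return cov(x, y) / (math.sqrt(cov(x, x)) * math.sqrt(cov(y, y)))

def center_normalize(x):
    """Subtract the mean and divide by the standard deviation."""
    mx, sx = mean(x), math.sqrt(cov(x, x))
    return [(a - mx) / sx for a in x]

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]
xt, yt = center_normalize(x), center_normalize(y)

print(cov(x, y), mean([a * b for a, b in zip(x, y)]) - mean(x) * mean(y))
print(mean(xt), cov(xt, xt))      # approximately 0 and 1
print(corr(x, y), cov(xt, yt))    # correlation equals covariance of the normalized variables
```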
Introduction to Probability and Statistics (Continued)
Introduction to Probability and Statistics (Continued) Prof. icholas Zabaras Center for Informatics and Computational Science https://cics.nd.edu/ University of otre Dame otre Dame, Indiana, USA Email:
More informationIntroduction to Probability and Statistics (Continued)
Introduction to Probability and Statistics (Continued) Prof. Nicholas Zabaras Center for Informatics and Computational Science https://cics.nd.edu/ University of Notre Dame Notre Dame, Indiana, USA Email:
More informationBayesian Models in Machine Learning
Bayesian Models in Machine Learning Lukáš Burget Escuela de Ciencias Informáticas 2017 Buenos Aires, July 24-29 2017 Frequentist vs. Bayesian Frequentist point of view: Probability is the frequency of
More informationIntroduction to Machine Learning
What does this mean? Outline Contents Introduction to Machine Learning Introduction to Probabilistic Methods Varun Chandola December 26, 2017 1 Introduction to Probability 1 2 Random Variables 3 3 Bayes
More informationIntroduction to Machine Learning
Introduction to Machine Learning Introduction to Probabilistic Methods Varun Chandola Computer Science & Engineering State University of New York at Buffalo Buffalo, NY, USA chandola@buffalo.edu Chandola@UB
More informationPattern Recognition and Machine Learning. Bishop Chapter 2: Probability Distributions
Pattern Recognition and Machine Learning Chapter 2: Probability Distributions Cécile Amblard Alex Kläser Jakob Verbeek October 11, 27 Probability Distributions: General Density Estimation: given a finite
More informationStatistics for scientists and engineers
Statistics for scientists and engineers February 0, 006 Contents Introduction. Motivation - why study statistics?................................... Examples..................................................3
More informationLecture 3. Probability - Part 2. Luigi Freda. ALCOR Lab DIAG University of Rome La Sapienza. October 19, 2016
Lecture 3 Probability - Part 2 Luigi Freda ALCOR Lab DIAG University of Rome La Sapienza October 19, 2016 Luigi Freda ( La Sapienza University) Lecture 3 October 19, 2016 1 / 46 Outline 1 Common Continuous
More information15-388/688 - Practical Data Science: Basic probability. J. Zico Kolter Carnegie Mellon University Spring 2018
15-388/688 - Practical Data Science: Basic probability J. Zico Kolter Carnegie Mellon University Spring 2018 1 Announcements Logistics of next few lectures Final project released, proposals/groups due
More informationProbability Theory for Machine Learning. Chris Cremer September 2015
Probability Theory for Machine Learning Chris Cremer September 2015 Outline Motivation Probability Definitions and Rules Probability Distributions MLE for Gaussian Parameter Estimation MLE and Least Squares
More informationProbability. Machine Learning and Pattern Recognition. Chris Williams. School of Informatics, University of Edinburgh. August 2014
Probability Machine Learning and Pattern Recognition Chris Williams School of Informatics, University of Edinburgh August 2014 (All of the slides in this course have been adapted from previous versions
More informationLecture 2: Conjugate priors
(Spring ʼ) Lecture : Conjugate priors Julia Hockenmaier juliahmr@illinois.edu Siebel Center http://www.cs.uiuc.edu/class/sp/cs98jhm The binomial distribution If p is the probability of heads, the probability
More informationPATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 2: PROBABILITY DISTRIBUTIONS
PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 2: PROBABILITY DISTRIBUTIONS Parametric Distributions Basic building blocks: Need to determine given Representation: or? Recall Curve Fitting Binary Variables
More informationPROBABILITY DISTRIBUTIONS. J. Elder CSE 6390/PSYC 6225 Computational Modeling of Visual Perception
PROBABILITY DISTRIBUTIONS Credits 2 These slides were sourced and/or modified from: Christopher Bishop, Microsoft UK Parametric Distributions 3 Basic building blocks: Need to determine given Representation:
More informationMultiple Random Variables
Multiple Random Variables This Version: July 30, 2015 Multiple Random Variables 2 Now we consider models with more than one r.v. These are called multivariate models For instance: height and weight An
More informationLecture 2: Repetition of probability theory and statistics
Algorithms for Uncertainty Quantification SS8, IN2345 Tobias Neckel Scientific Computing in Computer Science TUM Lecture 2: Repetition of probability theory and statistics Concept of Building Block: Prerequisites:
More informationMAS223 Statistical Inference and Modelling Exercises
MAS223 Statistical Inference and Modelling Exercises The exercises are grouped into sections, corresponding to chapters of the lecture notes Within each section exercises are divided into warm-up questions,
More informationMachine Learning. Probability Basics. Marc Toussaint University of Stuttgart Summer 2014
Machine Learning Probability Basics Basic definitions: Random variables, joint, conditional, marginal distribution, Bayes theorem & examples; Probability distributions: Binomial, Beta, Multinomial, Dirichlet,
More informationPROBABILITY AND INFORMATION THEORY. Dr. Gjergji Kasneci Introduction to Information Retrieval WS
PROBABILITY AND INFORMATION THEORY Dr. Gjergji Kasneci Introduction to Information Retrieval WS 2012-13 1 Outline Intro Basics of probability and information theory Probability space Rules of probability
More informationFundamentals. CS 281A: Statistical Learning Theory. Yangqing Jia. August, Based on tutorial slides by Lester Mackey and Ariel Kleiner
Fundamentals CS 281A: Statistical Learning Theory Yangqing Jia Based on tutorial slides by Lester Mackey and Ariel Kleiner August, 2011 Outline 1 Probability 2 Statistics 3 Linear Algebra 4 Optimization
More informationChapter 5 continued. Chapter 5 sections
Chapter 5 sections Discrete univariate distributions: 5.2 Bernoulli and Binomial distributions Just skim 5.3 Hypergeometric distributions 5.4 Poisson distributions Just skim 5.5 Negative Binomial distributions
More information02 Background Minimum background on probability. Random process
0 Background 0.03 Minimum background on probability Random processes Probability Conditional probability Bayes theorem Random variables Sampling and estimation Variance, covariance and correlation Probability
More informationStatistical Methods in Particle Physics
Statistical Methods in Particle Physics Lecture 3 October 29, 2012 Silvia Masciocchi, GSI Darmstadt s.masciocchi@gsi.de Winter Semester 2012 / 13 Outline Reminder: Probability density function Cumulative
More informationChapter 5. Chapter 5 sections
1 / 43 sections Discrete univariate distributions: 5.2 Bernoulli and Binomial distributions Just skim 5.3 Hypergeometric distributions 5.4 Poisson distributions Just skim 5.5 Negative Binomial distributions
More informationBayesian Learning. HT2015: SC4 Statistical Data Mining and Machine Learning. Maximum Likelihood Principle. The Bayesian Learning Framework
HT5: SC4 Statistical Data Mining and Machine Learning Dino Sejdinovic Department of Statistics Oxford http://www.stats.ox.ac.uk/~sejdinov/sdmml.html Maximum Likelihood Principle A generative model for
More informationBasics on Probability. Jingrui He 09/11/2007
Basics on Probability Jingrui He 09/11/2007 Coin Flips You flip a coin Head with probability 0.5 You flip 100 coins How many heads would you expect Coin Flips cont. You flip a coin Head with probability
More informationS n = x + X 1 + X X n.
0 Lecture 0 0. Gambler Ruin Problem Let X be a payoff if a coin toss game such that P(X = ) = P(X = ) = /2. Suppose you start with x dollars and play the game n times. Let X,X 2,...,X n be payoffs in each
More informationLearning Bayesian network : Given structure and completely observed data
Learning Bayesian network : Given structure and completely observed data Probabilistic Graphical Models Sharif University of Technology Spring 2017 Soleymani Learning problem Target: true distribution
More informationProbability and Information Theory. Sargur N. Srihari
Probability and Information Theory Sargur N. srihari@cedar.buffalo.edu 1 Topics in Probability and Information Theory Overview 1. Why Probability? 2. Random Variables 3. Probability Distributions 4. Marginal
More informationME 597: AUTONOMOUS MOBILE ROBOTICS SECTION 2 PROBABILITY. Prof. Steven Waslander
ME 597: AUTONOMOUS MOBILE ROBOTICS SECTION 2 Prof. Steven Waslander p(a): Probability that A is true 0 pa ( ) 1 p( True) 1, p( False) 0 p( A B) p( A) p( B) p( A B) A A B B 2 Discrete Random Variable X
More informationMultiple Random Variables
Multiple Random Variables Joint Probability Density Let X and Y be two random variables. Their joint distribution function is F ( XY x, y) P X x Y y. F XY ( ) 1, < x
More informationMultivariate Random Variable
Multivariate Random Variable Author: Author: Andrés Hincapié and Linyi Cao This Version: August 7, 2016 Multivariate Random Variable 3 Now we consider models with more than one r.v. These are called multivariate
More informationQuantitative Biology II Lecture 4: Variational Methods
10 th March 2015 Quantitative Biology II Lecture 4: Variational Methods Gurinder Singh Mickey Atwal Center for Quantitative Biology Cold Spring Harbor Laboratory Image credit: Mike West Summary Approximate
More informationA Few Special Distributions and Their Properties
A Few Special Distributions and Their Properties Econ 690 Purdue University Justin L. Tobias (Purdue) Distributional Catalog 1 / 20 Special Distributions and Their Associated Properties 1 Uniform Distribution
More informationIEOR 3106: Introduction to Operations Research: Stochastic Models. Professor Whitt. SOLUTIONS to Homework Assignment 2
IEOR 316: Introduction to Operations Research: Stochastic Models Professor Whitt SOLUTIONS to Homework Assignment 2 More Probability Review: In the Ross textbook, Introduction to Probability Models, read
More informationRecitation 2: Probability
Recitation 2: Probability Colin White, Kenny Marino January 23, 2018 Outline Facts about sets Definitions and facts about probability Random Variables and Joint Distributions Characteristics of distributions
More informationIntroduction to Machine Learning
Introduction to Machine Learning Brown University CSCI 1950-F, Spring 2012 Prof. Erik Sudderth Lecture 20: Expectation Maximization Algorithm EM for Mixture Models Many figures courtesy Kevin Murphy s
More informationExercises with solutions (Set D)
Exercises with solutions Set D. A fair die is rolled at the same time as a fair coin is tossed. Let A be the number on the upper surface of the die and let B describe the outcome of the coin toss, where
More informationDistributions of Functions of Random Variables. 5.1 Functions of One Random Variable
Distributions of Functions of Random Variables 5.1 Functions of One Random Variable 5.2 Transformations of Two Random Variables 5.3 Several Random Variables 5.4 The Moment-Generating Function Technique
More informationNaïve Bayes Introduction to Machine Learning. Matt Gormley Lecture 3 September 14, Readings: Mitchell Ch Murphy Ch.
School of Computer Science 10-701 Introduction to Machine Learning aïve Bayes Readings: Mitchell Ch. 6.1 6.10 Murphy Ch. 3 Matt Gormley Lecture 3 September 14, 2016 1 Homewor 1: due 9/26/16 Project Proposal:
More informationStat 5101 Lecture Notes
Stat 5101 Lecture Notes Charles J. Geyer Copyright 1998, 1999, 2000, 2001 by Charles J. Geyer May 7, 2001 ii Stat 5101 (Geyer) Course Notes Contents 1 Random Variables and Change of Variables 1 1.1 Random
More informationBivariate distributions
Bivariate distributions 3 th October 017 lecture based on Hogg Tanis Zimmerman: Probability and Statistical Inference (9th ed.) Bivariate Distributions of the Discrete Type The Correlation Coefficient
More informationAarti Singh. Lecture 2, January 13, Reading: Bishop: Chap 1,2. Slides courtesy: Eric Xing, Andrew Moore, Tom Mitchell
Machine Learning 0-70/5 70/5-78, 78, Spring 00 Probability 0 Aarti Singh Lecture, January 3, 00 f(x) µ x Reading: Bishop: Chap, Slides courtesy: Eric Xing, Andrew Moore, Tom Mitchell Announcements Homework
More informationRandom Variables and Their Distributions
Chapter 3 Random Variables and Their Distributions A random variable (r.v.) is a function that assigns one and only one numerical value to each simple event in an experiment. We will denote r.vs by capital
More informationWeek 1 Quantitative Analysis of Financial Markets Distributions A
Week 1 Quantitative Analysis of Financial Markets Distributions A Christopher Ting http://www.mysmu.edu/faculty/christophert/ Christopher Ting : christopherting@smu.edu.sg : 6828 0364 : LKCSB 5036 October
More informationLecture 1: August 28
36-705: Intermediate Statistics Fall 2017 Lecturer: Siva Balakrishnan Lecture 1: August 28 Our broad goal for the first few lectures is to try to understand the behaviour of sums of independent random
More informationRandom variables. DS GA 1002 Probability and Statistics for Data Science.
Random variables DS GA 1002 Probability and Statistics for Data Science http://www.cims.nyu.edu/~cfgranda/pages/dsga1002_fall17 Carlos Fernandez-Granda Motivation Random variables model numerical quantities
More informationMath 3215 Intro. Probability & Statistics Summer 14. Homework 5: Due 7/3/14
Math 325 Intro. Probability & Statistics Summer Homework 5: Due 7/3/. Let X and Y be continuous random variables with joint/marginal p.d.f. s f(x, y) 2, x y, f (x) 2( x), x, f 2 (y) 2y, y. Find the conditional
More informationIntroduction: exponential family, conjugacy, and sufficiency (9/2/13)