Introduction to Bayesian Methods
- Evan Black
1 Introduction to Bayesian Methods
2 Bayes Rule: $p(y|x)p(x) = p(x,y) = p(x|y)p(y)$, so $p(y|x) = \frac{p(x|y)p(y)}{p(x)}$, where $p(x) = \sum_y p(x|y)p(y)$ in the discrete case, giving $p(y|x) = \frac{p(x|y)p(y)}{\sum_y p(x|y)p(y)}$, or $p(y|x) = \frac{p(x|y)p(y)}{\int p(x|y)p(y)\,dy}$ in the continuous case.
3 Bayes Rule: $p(y|x) = \frac{p(x|y)p(y)}{p(x)}$. With parameters $\theta$ and data $D$: $p(\theta|D) = \frac{p(D|\theta)p(\theta)}{p(D)}$, where $p(D) = \int p(D|\theta)p(\theta)\,d\theta$. In words: $p(\text{parameters}|\text{data}) = \frac{p(\text{data}|\text{parameters})\,p(\text{parameters})}{p(\text{data})}$.
4 $p(\theta|D) = \frac{p(D|\theta)p(\theta)}{p(D)}$: posterior = likelihood × prior / evidence.
5 $p(\theta|D) = \frac{p(D|\theta)p(\theta)}{p(D)}$: $p(D|\theta)$ is the probability of the data given that parameter value (or vector of parameter values); $p(\theta)$ is the prior probability of that parameter value (or vector of parameter values); $p(\theta|D)$ is the probability of a particular parameter value (or vector of parameter values) given the data; $p(D)$ is the probability of the data averaged across all possible parameter values (or vectors).
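The term-by-term reading on this slide can be sanity-checked with a tiny discrete example. This Python sketch (the course's own code is MATLAB, e.g. week13.m) uses a made-up three-value parameter space for a coin's heads probability and one observed head:

```python
# Bayes rule on a small discrete parameter space (illustrative numbers).
# theta is the coin's heads probability; the observed data D is one head.
thetas = [0.25, 0.5, 0.75]          # possible parameter values
prior = [1/3, 1/3, 1/3]             # p(theta): uniform prior
likelihood = [t for t in thetas]    # p(D|theta): chance of heads under each theta

# p(D): probability of the data averaged across all parameter values
evidence = sum(l * p for l, p in zip(likelihood, prior))

# p(theta|D): Bayes rule, term by term
posterior = [l * p / evidence for l, p in zip(likelihood, prior)]
```

Here the evidence comes out to 0.5, and the posterior shifts weight toward the larger values of $\theta$ because a head was observed.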
6 $p(\theta|D) = \frac{p(D|\theta)p(\theta)}{p(D)}$ becomes $p(\theta^{(1)}|D, M^{(1)}) = \frac{p(D|\theta^{(1)}, M^{(1)})\,p(\theta^{(1)}|M^{(1)})}{p(D|M^{(1)})}$, just making the model $M^{(1)}$ explicit in Bayes rule.
7 $p(\theta|D) = \frac{p(D|\theta)p(\theta)}{p(D)}$; $p(\theta^{(1)}|D, M^{(1)}) = \frac{p(D|\theta^{(1)}, M^{(1)})\,p(\theta^{(1)}|M^{(1)})}{p(D|M^{(1)})}$, just to make clear that the set of parameters might be different across different models: these ($\theta^{(1)}$) are the parameters associated with Model 1.
8 $p(\theta|D) = \frac{p(D|\theta)p(\theta)}{p(D)}$; $p(\theta^{(1)}|D, M^{(1)}) = \frac{p(D|\theta^{(1)}, M^{(1)})\,p(\theta^{(1)}|M^{(1)})}{p(D|M^{(1)})}$; $p(A|B) = \frac{p(B|A)p(A)}{p(B)}$: this is just Bayes rule.
9 $p(\theta|D) = \frac{p(D|\theta)p(\theta)}{p(D)}$; $p(\theta^{(1)}|D, M^{(1)}) = \frac{p(D|\theta^{(1)}, M^{(1)})\,p(\theta^{(1)}|M^{(1)})}{p(D|M^{(1)})}$; $p(x|y) = \frac{p(y|x)p(x)}{p(y)}$: this is just Bayes rule again.
10 $p(\theta|D) = \frac{p(D|\theta)p(\theta)}{p(D)}$; $p(\theta^{(1)}|D, M^{(1)}) = \frac{p(D|\theta^{(1)}, M^{(1)})\,p(\theta^{(1)}|M^{(1)})}{p(D|M^{(1)})}$; $p(M^{(1)}|D) = \frac{p(D|M^{(1)})\,p(M^{(1)})}{p(D)}$: this is just Bayes rule for Model $M^{(1)}$ given the Data (just applying the formula). I haven't included $\theta^{(1)}$ here because I don't care about $\theta^{(1)}$ yet; this is just about $M^{(1)}$ and $D$.
11 $p(\theta|D) = \frac{p(D|\theta)p(\theta)}{p(D)}$; $p(\theta^{(1)}|D, M^{(1)}) = \frac{p(D|\theta^{(1)}, M^{(1)})\,p(\theta^{(1)}|M^{(1)})}{p(D|M^{(1)})}$; $p(M^{(1)}|D) = \frac{p(D|M^{(1)})\,p(M^{(1)})}{p(D)}$, where $p(D|M^{(1)}) = \int p(D|\theta^{(1)}, M^{(1)})\,p(\theta^{(1)}|M^{(1)})\,d\theta^{(1)}$.
12 $p(\theta|D) = \frac{p(D|\theta)p(\theta)}{p(D)}$; $p(\theta^{(1)}|D, M^{(1)}) = \frac{p(D|\theta^{(1)}, M^{(1)})\,p(\theta^{(1)}|M^{(1)})}{p(D|M^{(1)})}$; $p(M^{(1)}|D) = \frac{p(D|M^{(1)})\,p(M^{(1)})}{p(D)}$, where $p(M^{(1)})$ is the prior probability of Model $M^{(1)}$.
13 $p(\theta|D) = \frac{p(D|\theta)p(\theta)}{p(D)}$; $p(\theta^{(1)}|D, M^{(1)}) = \frac{p(D|\theta^{(1)}, M^{(1)})\,p(\theta^{(1)}|M^{(1)})}{p(D|M^{(1)})}$; $p(M^{(1)}|D) = \frac{p(D|M^{(1)})\,p(M^{(1)})}{p(D)}$, where $p(D)$ is the probability of the Data given any possible Model under consideration; it will be the same for every possible Model tested: $p(D) = \sum_j p(D|M_j)\,p(M_j)$. Why did I make this a sum and not an integral?
14 $p(\theta^{(1)}|D, M^{(1)}) = \frac{p(D|\theta^{(1)}, M^{(1)})\,p(\theta^{(1)}|M^{(1)})}{p(D|M^{(1)})}$; $p(M^{(1)}|D) = \frac{p(D|M^{(1)})\,p(M^{(1)})}{p(D)}$, so $\frac{p(M^{(1)}|D)}{p(M^{(2)}|D)} = \frac{p(D|M^{(1)})\,p(M^{(1)})/p(D)}{p(D|M^{(2)})\,p(M^{(2)})/p(D)} = \frac{p(D|M^{(1)})}{p(D|M^{(2)})} \cdot \frac{p(M^{(1)})}{p(M^{(2)})}$.
15 Bayesian model selection: $\frac{p(M^{(1)}|D)}{p(M^{(2)}|D)} = \frac{p(D|M^{(1)})}{p(D|M^{(2)})} \cdot \frac{p(M^{(1)})}{p(M^{(2)})}$, where the first ratio is the Bayes factor and the second ratio is the model priors, and $p(D|M^{(j)}) = \int p(D|\theta^{(j)}, M^{(j)})\,p(\theta^{(j)}|M^{(j)})\,d\theta^{(j)}$ (likelihood times prior, integrated over the parameters).
16 Bayesian model selection: $\frac{p(M^{(1)}|D)}{p(M^{(2)}|D)} = \frac{p(D|M^{(1)})}{p(D|M^{(2)})} \cdot \frac{p(M^{(1)})}{p(M^{(2)})}$ (Bayes factor × model priors). In Bayesian model selection, models can be nested or nonnested. One advantage of Bayesian model selection is that it is not as sensitive to data sample size as classical significance procedures (e.g., the G² test).
17 Bayesian model selection: $\frac{p(M^{(1)}|D)}{p(M^{(2)}|D)} = \frac{p(D|M^{(1)})}{p(D|M^{(2)})} \cdot \frac{p(M^{(1)})}{p(M^{(2)})}$ (Bayes factor × model priors). You often just see the Bayes factor in terms of $P(D|M)$, or even $P(D)$ where the model $M$ is implicit. Since this is intended to be used for model selection, it's odd that few authors note that this is really $p(M|D)$ after applications of Bayes rule (which is what model selection is all about), but with no prior $p(M)$ on the Models.
18
19 $p(\theta|D) = \frac{p(D|\theta)p(\theta)}{p(D)}$ (posterior = likelihood × prior / evidence). Where does the prior probability $p(\theta)$ come from? Noninformative priors (akin to a uniform distribution), or based on theory, or based on past data.
20
21 Imagine we flip a coin 10 times and get 4 heads ($N=10$, $x=4$). What is the maximum likelihood estimate of $p$? $L(x|p) = \mathrm{Prob}(x|p) = \binom{N}{x} p^x (1-p)^{N-x} = \frac{N!}{x!(N-x)!}\, p^x (1-p)^{N-x}$.
22 Imagine we flip a coin 10 times and get 4 heads ($N=10$, $x=4$). What is the Bayesian estimate of $P(p|x)$? $L(x|p) = \mathrm{Prob}(x|p) = \binom{N}{x} p^x (1-p)^{N-x} = \frac{N!}{x!(N-x)!}\, p^x (1-p)^{N-x}$; $P(\theta|D) = \frac{P(D|\theta)P(\theta)}{P(D)}$.
23 Imagine we flip a coin 10 times and get 4 heads ($N=10$, $x=4$). What is the Bayesian estimate of $P(p|x)$? $L(x|p) = \binom{N}{x} p^x (1-p)^{N-x}$; $P(p|x) = \frac{P(x|p)P(p)}{P(x)}$. In this example, please don't confuse the big $P$ with the little $p$: one is the probability in Bayes rule ($P$), the other is the parameter $p$ in the binomial distribution.
24 Imagine we flip a coin 10 times and get 4 heads ($N=10$, $x=4$). What is the Bayesian estimate of $P(p|x)$? $L(x|p) = \binom{N}{x} p^x (1-p)^{N-x}$: this is the likelihood. $P(p|x) = \frac{P(x|p)P(p)}{P(x)}$.
25 Imagine we flip a coin 10 times and get 4 heads ($N=10$, $x=4$). What is the Bayesian estimate of $P(p|x)$? $L(x|p) = \binom{N}{x} p^x (1-p)^{N-x}$; $P(p|x) = \frac{P(x|p)P(p)}{P(x)}$. What functional form of the prior should be used?
26 Imagine we flip a coin 10 times and get 4 heads ($N=10$, $x=4$). What is the Bayesian estimate of $P(p|x)$? $L(x|p) = \binom{N}{x} p^x (1-p)^{N-x}$; $P(p|x) = \frac{P(x|p)P(p)}{P(x)}$. What functional form of the prior should be used? What are some possibilities?
27
28 Imagine we flip a coin 10 times and get 4 heads ($N=10$, $x=4$). What is the Bayesian estimate of $P(p|x)$? $L(x|p) = \binom{N}{x} p^x (1-p)^{N-x}$; $P(p|x) = \frac{P(x|p)P(p)}{P(x)}$; $P(p) \sim \mathrm{Beta}(a,b) = \frac{p^{a-1}(1-p)^{b-1}}{\mathrm{Be}(a,b)} = \frac{p^{a-1}(1-p)^{b-1}\,\Gamma(a+b)}{\Gamma(a)\Gamma(b)}$, where $\Gamma(x) = (x-1)!$ for integer $x$.
29 Imagine we flip a coin 10 times and get 4 heads ($N=10$, $x=4$). What is the Bayesian estimate of $P(p|x)$? $P(p|x) = \frac{\binom{N}{x} p^x (1-p)^{N-x} \cdot \frac{p^{a-1}(1-p)^{b-1}}{\mathrm{Be}(a,b)}}{P(x)}$.
30 Imagine we flip a coin 10 times and get 4 heads ($N=10$, $x=4$). What is the Bayesian estimate of $P(p|x)$? $P(p|x) = \frac{\binom{N}{x} p^x (1-p)^{N-x} \cdot \frac{p^{a-1}(1-p)^{b-1}}{\mathrm{Be}(a,b)}}{P(x)}$: collect the terms that involve $p$.
31 Imagine we flip a coin 10 times and get 4 heads ($N=10$, $x=4$). What is the Bayesian estimate of $P(p|x)$? $P(p|x) = p^x (1-p)^{N-x}\, p^{a-1}(1-p)^{b-1} \cdot \frac{\binom{N}{x}}{P(x)\,\mathrm{Be}(a,b)}$, where $P(x) = \int P(x|p)\,P(p)\,dp$; the last factor is constant with respect to $p$.
32 Imagine we flip a coin 10 times and get 4 heads ($N=10$, $x=4$). What is the Bayesian estimate of $P(p|x)$? To be a true probability distribution, this needs to integrate to 1: $P(p|x) \propto p^x (1-p)^{N-x}\, p^{a-1}(1-p)^{b-1}$. You often see things like this, where $\propto$ is a proportionality without the denominator.
33 Imagine we flip a coin 10 times and get 4 heads ($N=10$, $x=4$). What is the Bayesian estimate of $P(p|x)$? To be a true probability distribution, this needs to integrate to 1: $P(p|x) \propto p^x (1-p)^{N-x}\, p^{a-1}(1-p)^{b-1}$. You often see things like this, where $\propto$ is a proportionality without the denominator; with techniques like MCMC, it's often sufficient.
34 Imagine we flip a coin 10 times and get 4 heads ($N=10$, $x=4$). What is the Bayesian estimate of $P(p|x)$? This can be shown to be true, but for most problems an analytic solution isn't possible: $P(p|x) = \frac{p^x (1-p)^{N-x}\, p^{a-1}(1-p)^{b-1}}{\mathrm{Be}(x+a,\,N-x+b)}$.
35 Imagine we flip a coin 10 times and get 4 heads ($N=10$, $x=4$). What is the Bayesian estimate of $P(p|x)$? Rearranging some terms: $P(p|x) = \frac{p^{x+a-1}(1-p)^{N-x+b-1}}{\mathrm{Be}(x+a,\,N-x+b)}$.
36 Imagine we flip a coin 10 times and get 4 heads ($N=10$, $x=4$). What is the Bayesian estimate of $P(p|x)$? Rearranging some terms: $P(p|x) = \frac{p^{x+a-1}(1-p)^{N-x+b-1}}{\mathrm{Be}(x+a,\,N-x+b)}$. What does this look like?
37 Imagine we flip a coin 10 times and get 4 heads ($N=10$, $x=4$). What is the Bayesian estimate of $P(p|x)$? The posterior is also a Beta distribution: $P(p|x) = \frac{p^{x+a-1}(1-p)^{N-x+b-1}}{\mathrm{Be}(x+a,\,N-x+b)}$. Recall: $P(p) \sim \mathrm{Beta}(a,b) = \frac{p^{a-1}(1-p)^{b-1}}{\mathrm{Be}(a,b)}$.
38 Imagine we flip a coin 10 times and get 4 heads ($N=10$, $x=4$). What is the Bayesian estimate of $P(p|x)$? Why might it be nice to have a posterior that has the same distribution as the prior? $P(p|x) = \frac{p^{x+a-1}(1-p)^{N-x+b-1}}{\mathrm{Be}(x+a,\,N-x+b)}$. Recall: $P(p) \sim \mathrm{Beta}(a,b) = \frac{p^{a-1}(1-p)^{b-1}}{\mathrm{Be}(a,b)}$.
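One answer to the slide's question: a Beta posterior can be plugged back in as the prior for the next batch of data. A Python sketch (the course code is MATLAB; the 2-head/2-head split across batches is made up, but totals $x=4$ of $N=10$):

```python
# Conjugate Beta-Binomial updating for the coin example (N=10, x=4),
# assuming a uniform Beta(1, 1) prior. The posterior is Beta(x+a, N-x+b),
# so only the two Beta parameters need to be tracked.
a, b = 1, 1                     # Beta(1,1) prior

# update on all 10 flips at once (4 heads, 6 tails)
a_all, b_all = 4 + a, 6 + b     # posterior: Beta(5, 7)

# or update sequentially: first 5 flips (say 2 heads), then the next 5 (2 heads)
a1, b1 = 2 + a, 3 + b           # posterior after batch 1 ...
a2, b2 = 2 + a1, 3 + b1         # ... reused as the prior for batch 2

post_mean = a_all / (a_all + b_all)   # E[p|x] = (x+a)/(N+a+b) = 5/12
```

The sequential result matches the one-shot update exactly, which is the practical payoff of conjugacy.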
39 $p(\theta|D) = \frac{p(D|\theta)p(\theta)}{p(D)}$ (posterior = likelihood × prior / evidence). So-called conjugate priors have a functional form (and are the only functional form) that gives you posteriors with the same functional form, and hence lets you turn around and plug the posterior back in as the new prior.
40 $p(\theta|D) = \frac{p(D|\theta)p(\theta)}{p(D)}$ (posterior = likelihood × prior / evidence). Use of a conjugate prior is good Bayesian style; it was also necessary before computer simulation techniques like MCMC allowed arbitrary functional forms with no need for analytic solutions.
41 $p(\theta|D) = \frac{p(D|\theta)p(\theta)}{p(D)}$ (posterior = likelihood × prior / evidence). Beta distribution = conjugate prior for the Binomial/Bernoulli; Dirichlet distribution = conjugate prior for the Multinomial; Normal distribution = conjugate prior for the mean of a Normal; Inverse-Gamma = conjugate prior for the variance of a Normal.
42 see week13.m
43 Bayesian parameter estimation: 1) Maximum a posteriori (MAP): the maximum of the posterior distribution $p(\theta|D)$ (see week12.m). 2) Expected value of the parameters: $E[\theta] = \int \theta\, p(\theta|D)\,d\theta$.
44 Bayesian parameter estimation: 3) Highest Density Interval (HDI), aka Highest Density Region (HDR).
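All three estimators can be read off the Beta posterior from the coin example. A Python sketch (week12.m is the course's MATLAB version; the grid search and the sample-based HDI here are simple stand-ins for whatever that file does):

```python
import random
random.seed(2)

# Estimates from the Beta(x+a, N-x+b) coin posterior (N=10, x=4, Beta(1,1) prior).
N, x, a, b = 10, 4, 1, 1
a_p, b_p = x + a, N - x + b                 # posterior: Beta(5, 7)

# 1) MAP: maximize p^(a'-1) (1-p)^(b'-1) over a grid of p values
grid = [i / 1000 for i in range(1, 1000)]
map_est = max(grid, key=lambda p: p**(a_p - 1) * (1 - p)**(b_p - 1))

# 2) Expected value: E[p|x] = a' / (a' + b') for a Beta distribution
post_mean = a_p / (a_p + b_p)

# 3) 95% HDI: shortest interval containing 95% of posterior samples
samples = sorted(random.betavariate(a_p, b_p) for _ in range(20000))
k = int(0.95 * len(samples))
lo, hi = min(((samples[i], samples[i + k]) for i in range(len(samples) - k)),
             key=lambda iv: iv[1] - iv[0])
```

For Beta(5, 7) the MAP is $(a'-1)/(a'+b'-2) = 0.4$ and the mean is $5/12$; note the two point estimates differ because the posterior is skewed.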
45
46 Simple example of Bayesian Model Evaluation, from Shiffrin, Lee, Kim, & Wagenmakers (2008). Is a coin fair? The fair model assumes $p = .5$; the unfair model assumes $0 \le p \le 1$.
47 Simple example of Bayesian Model Evaluation, from Shiffrin, Lee, Kim, & Wagenmakers (2008). Is a coin fair? The fair model assumes $p = .5$; the unfair model assumes $0 \le p \le 1$. $P(x|p) = \binom{N}{x} p^x (1-p)^{N-x}$; $P(D|M^{(j)}) = \int P(D|\theta^{(j)}, M^{(j)})\,P(\theta^{(j)}|M^{(j)})\,d\theta^{(j)}$.
48 Simple example of Bayesian Model Evaluation, from Shiffrin, Lee, Kim, & Wagenmakers (2008). Is a coin fair? The fair model assumes $p = .5$; the unfair model assumes $0 \le p \le 1$. $P(x|p) = \binom{N}{x} p^x (1-p)^{N-x}$; $P(x|M^{(j)}) = \int P(x|p^{(j)}, M^{(j)})\,P(p^{(j)}|M^{(j)})\,dp^{(j)}$; assume a uniform prior on $P(p)$.
49 Simple example of Bayesian Model Evaluation, from Shiffrin, Lee, Kim, & Wagenmakers (2008). For the fair model: $P(x|p) = \binom{N}{x} .5^x (1-.5)^{N-x} = \binom{N}{x} .5^N$, so $P(x|M_{\text{fair}}) = \binom{N}{x} .5^N \cdot 1 = \binom{N}{x} .5^N$. Why is there no integral? And what is the 1? (Compare: $P(x|M^{(j)}) = \int P(x|p^{(j)}, M^{(j)})\,P(p^{(j)}|M^{(j)})\,dp^{(j)}$.)
50 Simple example of Bayesian Model Evaluation, from Shiffrin, Lee, Kim, & Wagenmakers (2008). For the unfair model: $P(x|p) = \binom{N}{x} p^x (1-p)^{N-x}$, so $P(x|M_{\text{unfair}}) = \int_0^1 \binom{N}{x} p^x (1-p)^{N-x} \cdot 1\,dp = \frac{1}{N+1}$. Need to trust me on this.
51 Simple example of Bayesian Model Evaluation, from Shiffrin, Lee, Kim, & Wagenmakers (2008). Is a coin fair? $M_1$: the fair model assumes $p = .5$; $M_2$: the unfair model assumes $0 \le p \le 1$. $\frac{p(M^{(1)}|D)}{p(M^{(2)}|D)} = \frac{p(D|M^{(1)})}{p(D|M^{(2)})} \cdot \frac{p(M^{(1)})}{p(M^{(2)})}$.
52 Simple example of Bayesian Model Evaluation, from Shiffrin, Lee, Kim, & Wagenmakers (2008). Is a coin fair? $\frac{p(M_{\text{fair}}|x)}{p(M_{\text{unfair}}|x)} = \frac{p(x|M_{\text{fair}})}{p(x|M_{\text{unfair}})} \cdot \frac{p(M_{\text{fair}})}{p(M_{\text{unfair}})}$.
53 Simple example of Bayesian Model Evaluation, from Shiffrin, Lee, Kim, & Wagenmakers (2008). Is a coin fair? With equal model priors: $\frac{p(M_{\text{fair}}|x)}{p(M_{\text{unfair}}|x)} = \frac{\binom{N}{x} .5^N}{1/(N+1)}$.
54 Simple example of Bayesian Model Evaluation, from Shiffrin, Lee, Kim, & Wagenmakers (2008). Imagine $N=20$ tosses and $x=12$ heads: $\frac{p(M_{\text{fair}}|x)}{p(M_{\text{unfair}}|x)} = \frac{\binom{20}{12} .5^{20}}{1/21} = 2.52$; it is 2.52 times more likely that the coin is fair than unfair.
55 Simple example of Bayesian Model Evaluation, from Shiffrin, Lee, Kim, & Wagenmakers (2008). $\frac{p(M_{\text{fair}}|x)}{p(M_{\text{unfair}}|x)} = \frac{\binom{20}{12} .5^{20}}{1/21} = 2.52$. This automatically penalizes the unfair model for its number of parameters and complexity, since it has to fit the data better than the fair model.
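The 2.52 figure, and the "trust me" integral $\int_0^1 \binom{N}{x} p^x(1-p)^{N-x}\,dp = \frac{1}{N+1}$, are both easy to check numerically; a Python sketch:

```python
from math import comb

# Bayes factor for the fair vs. unfair coin (Shiffrin et al. example):
# N = 20 tosses, x = 12 heads, equal model priors.
N, x = 20, 12
p_fair = comb(N, x) * 0.5**N        # P(x | M_fair) = C(N,x) * .5^N
p_unfair = 1 / (N + 1)              # P(x | M_unfair), uniform prior on p

# check the 1/(N+1) claim by midpoint-rule integration over p in (0, 1)
grid = [(i + 0.5) / 1000 for i in range(1000)]
p_unfair_numeric = sum(comb(N, x) * p**x * (1 - p)**(N - x) for p in grid) / 1000

bayes_factor = p_fair / p_unfair    # ~2.52 in favor of the fair model
```

The numeric integral agrees with $1/21$ to several decimal places, and the Bayes factor rounds to 2.52 as on the slide.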
56
57 Another example: Bayesian estimation of Normally distributed data
58 Imagine normally distributed data $X = (x_1, x_2, x_3, \ldots, x_N)$. What is the Bayesian estimate of $\mu$ and $\sigma^2$? $P(X|\mu,\sigma^2) = \frac{1}{(2\pi\sigma^2)^{N/2}} \exp\!\left(\frac{-\sum_j (x_j - \mu)^2}{2\sigma^2}\right)$; $P(\mu,\sigma^2|X) = \frac{P(X|\mu,\sigma^2)\,P(\mu,\sigma^2)}{P(X)}$. Let's start by looking at just a single parameter at a time; that's straightforward (the joint distribution is trickier).
59 Imagine normally distributed data $X = (x_1, x_2, x_3, \ldots, x_N)$. What is the Bayesian estimate of $\mu$ and $\sigma^2$? $P(X|\mu,\sigma^2) = \frac{1}{(2\pi\sigma^2)^{N/2}} \exp\!\left(\frac{-\sum_j (x_j - \mu)^2}{2\sigma^2}\right)$; $P(\mu|X,\sigma^2) = \frac{P(X|\mu,\sigma^2)\,P(\mu|\sigma^2)}{P(X|\sigma^2)}$. This is a conditional posterior: it assumes we know $\sigma^2$ (i.e., it's a constant, not a parameter). What might be a reasonable prior on $\mu$?
60-63 Imagine normally distributed data $X = (x_1, x_2, x_3, \ldots, x_N)$. What is the Bayesian estimate of $\mu$ and $\sigma^2$? $P(\mu|X,\sigma^2) = \frac{P(X|\mu,\sigma^2)\,P(\mu|\sigma^2)}{P(X|\sigma^2)}$. With a Normal prior $N(\mu_0, \sigma_0^2)$ on $\mu$: $P(\mu|X,\sigma^2) = \frac{1}{P(X|\sigma^2)} \cdot \frac{1}{(2\pi\sigma^2)^{N/2}} \exp\!\left(\frac{-\sum_j (x_j - \mu)^2}{2\sigma^2}\right) \cdot \frac{1}{(2\pi\sigma_0^2)^{1/2}} \exp\!\left(\frac{-(\mu - \mu_0)^2}{2\sigma_0^2}\right)$.
64 Imagine normally distributed data $X = (x_1, x_2, x_3, \ldots, x_N)$. What is the Bayesian estimate of $\mu$ and $\sigma^2$? $P(\mu|X,\sigma^2) = \frac{1}{P(X|\sigma^2)} \cdot \frac{1}{(2\pi\sigma^2)^{N/2}} \exp\!\left(\frac{-\sum_j (x_j - \mu)^2}{2\sigma^2}\right) \cdot \frac{1}{(2\pi\sigma_0^2)^{1/2}} \exp\!\left(\frac{-(\mu - \mu_0)^2}{2\sigma_0^2}\right)$; $\mu_0$ and $\sigma_0^2$ are called hyperparameters.
65 Imagine normally distributed data $X = (x_1, x_2, x_3, \ldots, x_N)$. What is the Bayesian estimate of $\mu$ and $\sigma^2$? $P(\mu|X,\sigma^2) \sim N(\mu', \sigma^{2\prime})$, where $\mu' = \left(\frac{1}{\sigma^2}\sum_j x_j + \frac{1}{\sigma_0^2}\mu_0\right)\sigma^{2\prime}$ and $\sigma^{2\prime} = \left(\frac{N}{\sigma^2} + \frac{1}{\sigma_0^2}\right)^{-1}$.
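The formulas for $\mu'$ and $\sigma^{2\prime}$ translate directly into code; a Python sketch with made-up data and a vague prior (the data values and hyperparameters are assumptions for illustration):

```python
# Conditional posterior of the mean, P(mu | X, sigma^2) = N(mu', sigma2_post),
# for illustrative data with an assumed-known variance.
X = [4.8, 5.1, 5.3, 4.9, 5.4]
sigma2 = 0.25                    # known variance (a constant here, not a parameter)
mu0, sigma2_0 = 0.0, 100.0       # vague Normal(mu0, sigma2_0) prior on mu
N = len(X)

sigma2_post = 1 / (N / sigma2 + 1 / sigma2_0)               # sigma^2'
mu_post = sigma2_post * (sum(X) / sigma2 + mu0 / sigma2_0)  # mu'
```

With a prior this vague, $\mu'$ lands essentially on the sample mean (5.1), and $\sigma^{2\prime}$ is slightly below $\sigma^2/N$; with a tighter prior, $\mu'$ would be pulled toward $\mu_0$.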
66 Imagine normally distributed data $X = (x_1, x_2, x_3, \ldots, x_N)$. What is the Bayesian estimate of $\mu$ and $\sigma^2$? $P(\sigma^2|X,\mu) = \frac{P(X|\mu,\sigma^2)\,P(\sigma^2|\mu)}{P(X|\mu)}$. This is a conditional posterior. What might be a reasonable prior on $\sigma^2$?
67-70 Imagine normally distributed data $X = (x_1, x_2, x_3, \ldots, x_N)$. What is the Bayesian estimate of $\mu$ and $\sigma^2$? $P(\sigma^2|X,\mu) = \frac{P(X|\mu,\sigma^2)\,P(\sigma^2|\mu)}{P(X|\mu)}$. With an Inverse-Gamma$(a, b)$ prior on $\sigma^2$: $P(\sigma^2|X,\mu) = \frac{1}{P(X|\mu)} \cdot \frac{1}{(2\pi\sigma^2)^{N/2}} \exp\!\left(\frac{-\sum_j (x_j - \mu)^2}{2\sigma^2}\right) \cdot \frac{b^a}{\Gamma(a)} (\sigma^2)^{-(a+1)} \exp\!\left(\frac{-b}{\sigma^2}\right)$, the last factor being the Inverse-Gamma distribution.
71 Imagine normally distributed data $X = (x_1, x_2, x_3, \ldots, x_N)$. What is the Bayesian estimate of $\mu$ and $\sigma^2$? $P(\sigma^2|X,\mu) \sim \mathrm{InverseGamma}(a', b')$, where $a' = N/2 + a$ and $b' = \left(\sum_j (x_j - \mu)^2\right)/2 + b$.
72 Imagine normally distributed data $X = (x_1, x_2, x_3, \ldots, x_N)$. What is the Bayesian estimate of $\mu$ and $\sigma^2$? $P(\mu|X,\sigma^2) \sim N(\mu', \sigma^{2\prime})$ with $\mu' = \left(\frac{1}{\sigma^2}\sum_j x_j + \frac{1}{\sigma_0^2}\mu_0\right)\sigma^{2\prime}$, $\sigma^{2\prime} = \left(\frac{N}{\sigma^2} + \frac{1}{\sigma_0^2}\right)^{-1}$; $P(\sigma^2|X,\mu) \sim \mathrm{InverseGamma}(a', b')$ with $a' = N/2 + a$, $b' = \left(\sum_j (x_j - \mu)^2\right)/2 + b$. In most circumstances, you want the unconditionalized posterior distributions $P(\mu|X)$ and $P(\sigma^2|X)$; unconditionalized posterior distributions are called marginal distributions.
73 Imagine normally distributed data $X = (x_1, x_2, x_3, \ldots, x_N)$. What is the Bayesian estimate of $\mu$ and $\sigma^2$? $P(\mu|X,\sigma^2) \sim N(\mu', \sigma^{2\prime})$; $P(\sigma^2|X,\mu) \sim \mathrm{InverseGamma}(a', b')$; the marginals are $p(\mu|X) = \int p(\mu|X,\sigma^2)\,p(\sigma^2|X)\,d\sigma^2$ and $p(\sigma^2|X) = \int p(\sigma^2|X,\mu)\,p(\mu|X)\,d\mu$. Integrals like this are usually hard to solve, which motivates Markov chain Monte Carlo (MCMC) methods.
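Alternating draws from the two conditional posteriors is exactly a Gibbs sampler, one of the MCMC methods the slides are building toward. A Python sketch with made-up data and hyperparameters (not the course's week13.m); it exploits the fact that if $g \sim \mathrm{Gamma}(a', 1/b')$ then $1/g \sim \mathrm{InverseGamma}(a', b')$:

```python
import random
random.seed(0)

# Gibbs sampler alternating the two conditional posteriors for (mu, sigma^2);
# data and prior hyperparameters are illustrative.
X = [4.8, 5.1, 5.3, 4.9, 5.4]
N = len(X)
mu0, sigma2_0 = 0.0, 100.0      # Normal prior on mu
a, b = 2.0, 1.0                 # Inverse-Gamma prior on sigma^2

mu, sigma2 = 0.0, 1.0           # arbitrary starting values
mus = []
for t in range(5000):
    # draw mu from P(mu | X, sigma^2) = N(mu', sigma2')
    s2p = 1 / (N / sigma2 + 1 / sigma2_0)
    mup = s2p * (sum(X) / sigma2 + mu0 / sigma2_0)
    mu = random.gauss(mup, s2p ** 0.5)
    # draw sigma^2 from P(sigma^2 | X, mu) = InverseGamma(a', b')
    a_p = N / 2 + a
    b_p = sum((xj - mu) ** 2 for xj in X) / 2 + b
    sigma2 = 1 / random.gammavariate(a_p, 1 / b_p)
    if t >= 1000:               # discard burn-in draws
        mus.append(mu)

mu_hat = sum(mus) / len(mus)    # Monte Carlo estimate of the marginal E[mu | X]
```

The retained `mus` are (dependent) draws from the marginal $p(\mu|X)$, so their average approximates the marginal posterior mean without ever solving the integral analytically.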
74 A challenge of doing Bayesian statistical analysis is that the solutions require solving complex integrals: $p(\theta|D) = \frac{p(D|\theta)p(\theta)}{\int p(D|\theta)p(\theta)\,d\theta}$; $p(\theta|D) = \int p(\theta, \varphi|D)\,d\varphi$; $p(\theta|D) = \int p(\theta|\varphi, D)\,p(\varphi)\,d\varphi$; $E[f(\theta)] = \int f(\theta)\,p(\theta|D)\,d\theta$.
75 A challenge of doing Bayesian statistical analysis is that the solutions require solving complex integrals. Only in limited cases can these be solved analytically (e.g., univariate models that are binomial, normal, etc.), and outside of models with only a few parameters, these integrals cannot be solved using standard numeric integration techniques. Hence: Monte Carlo methods (including MCMC).
76 Consider this integral: $\int \theta\, p(\theta|D)\,d\theta$. What is this?
77 Consider this integral: $\int \theta\, p(\theta|D)\,d\theta$. $p(\theta|D)$ is the posterior distribution of parameter $\theta$ given data $D$.
78 Consider this integral: $\int \theta\, p(\theta|D)\,d\theta$. What is this?
79 Consider this integral: $\int \theta\, p(\theta|D)\,d\theta$. It is the posterior expectation: $E[\theta|D] = \int \theta\, p(\theta|D)\,d\theta$.
80 Consider this integral: $E[\theta|D] = \int \theta\, p(\theta|D)\,d\theta$. Recall that this is a function.
81 Consider this integral: $E[\theta|D] = \int \theta\, p(\theta|D)\,d\theta$. How can we evaluate this?
82 Consider this integral: $E[\theta|D] = \int \theta\, p(\theta|D)\,d\theta$. How can we evaluate this? Analytically: hard and often impossible. Numerical integration techniques: inefficient. Monte Carlo methods: often preferred, or the only method.
83 Simple Monte Carlo Integration: $E[\theta|D] = \int \theta\, p(\theta|D)\,d\theta$. Imagine we have an engine that spits out $\theta$'s with probability $p(\theta|D)$.
84-88 Simple Monte Carlo Integration: $E[\theta|D] = \int \theta\, p(\theta|D)\,d\theta$. The engine produces samples $\theta^{(1)}, \theta^{(2)}, \theta^{(3)}, \theta^{(4)}, \theta^{(5)}, \ldots$ from $p(\theta|D)$.
89 Simple Monte Carlo Integration: given samples $\theta^{(1)}, \ldots, \theta^{(5)}$ from $p(\theta|D)$, what is $E[\theta|D]$?
90 Simple Monte Carlo Integration: $E[\theta|D] = \int \theta\, p(\theta|D)\,d\theta \approx \frac{1}{N}\sum_j \theta^{(j)}$, a Monte Carlo simulation of an integral.
91 Simple Monte Carlo Integration: more generally, $E[g(\theta)|D] = \int g(\theta)\, p(\theta|D)\,d\theta \approx \frac{1}{N}\sum_j g(\theta^{(j)})$, a Monte Carlo simulation of an integral.
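Both approximations can be tried with the Beta posterior from the coin example as the "engine". A Python sketch (the course uses MATLAB's random number routines; `random.betavariate` plays that role here):

```python
import random
random.seed(3)

# Monte Carlo integration against the Beta(5, 7) coin posterior
# (N=10, x=4, uniform prior) used as the sampling engine p(theta|D).
a_p, b_p = 5, 7
draws = [random.betavariate(a_p, b_p) for _ in range(100000)]

# E[theta|D] ~ sample average; the exact Beta mean is 5/12
e_theta = sum(draws) / len(draws)

# E[g(theta)|D] for g = indicator(theta < 0.5), i.e. P(theta < 0.5 | D)
p_below = sum(1 for d in draws if d < 0.5) / len(draws)
```

With 100,000 independent draws the estimate of $E[\theta|D]$ lands within a fraction of a percent of the exact value $5/12$.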
92 Simple Monte Carlo Integration: $E[\theta|D] = \int \theta\, p(\theta|D)\,d\theta$. Of course, this assumes you can create an engine that spits out independent samples from a distribution. We've talked about some such engines, like rand() or randn() or other MATLAB random number routines.
93 But independent sampling from a posterior density $p(\theta|D)$ is usually not feasible (or simply impossible). WHY? Keep in mind that in the most general case $p(\theta|D)$ can be arbitrarily complex and have many, many parameters.
94 But independent sampling from a posterior density $p(\theta|D)$ is usually not feasible (or simply impossible); however, we can do dependent (autocorrelated) sampling: Markov chain Monte Carlo.
95 Independent Sampling: $P(\theta^{(t)}|\theta^{(1)}, \ldots, \theta^{(t-1)}) = P(\theta^{(t)})$; because the samples are independent, smaller sample sizes are needed to approximate distributions or integrals. Sampling from a Markov Chain Process: $P(\theta^{(t)}|\theta^{(1)}, \ldots, \theta^{(t-1)}) = P(\theta^{(t)}|\theta^{(t-1)})$; because the samples are dependent, far larger sample sizes are needed to approximate distributions or integrals.
96 (Same comparison of independent sampling, $P(\theta^{(t)}|\theta^{(1)}, \ldots, \theta^{(t-1)}) = P(\theta^{(t)})$, versus Markov chain sampling, $P(\theta^{(t)}|\theta^{(1)}, \ldots, \theta^{(t-1)}) = P(\theta^{(t)}|\theta^{(t-1)})$.) In reality, many random number generators are actually Markov processes.
97 Independent Sampling: $P(\theta^{(t)}|\theta^{(1)}, \ldots, \theta^{(t-1)}) = P(\theta^{(t)})$. First, let's look at independent sampling (see week13.m).
98 Independent Sampling
99 Sampling from a Markov Chain Process: $P(\theta^{(t)}|\theta^{(1)}, \ldots, \theta^{(t-1)}) = P(\theta^{(t)}|\theta^{(t-1)})$. What is Markov chain Monte Carlo? First, let's see it in action (see week13.m).
100 Markov Chain Monte Carlo Sampling
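A minimal MCMC engine of the kind week13.m demonstrates is a random-walk Metropolis sampler; this Python sketch (an illustration, not the course's actual code) targets the unnormalized coin posterior $p^x(1-p)^{N-x}$ from slide 32, using only the proportional form and never the denominator $P(x)$:

```python
import random
random.seed(1)

# Random-walk Metropolis targeting the *unnormalized* coin posterior
# p^x (1-p)^(N-x) for N=10, x=4 (uniform prior); MCMC needs only ratios
# of the target density, so the normalizing constant P(x) never appears.
N, x = 10, 4

def unnorm_post(p):
    return p**x * (1 - p)**(N - x) if 0 < p < 1 else 0.0

p = 0.5                                  # starting value of the chain
samples = []
for t in range(50000):
    prop = p + random.gauss(0, 0.1)      # symmetric random-walk proposal
    ratio = unnorm_post(prop) / unnorm_post(p)
    if random.random() < ratio:          # accept with prob min(1, ratio)
        p = prop
    if t >= 5000:                        # discard burn-in
        samples.append(p)

p_hat = sum(samples) / len(samples)      # ~ E[p|x]; exact Beta(5,7) mean is 5/12
```

The chain's draws are autocorrelated, matching slide 95's point: the estimate converges to the same $E[p|x]$ as independent sampling, but needs more samples to reach the same accuracy.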
101 Independent vs. MCMC. Remember: this is what we are trying to derive; this is what we're deriving it from.
102 Independent vs. MCMC. Remember: this is what we are trying to derive. Where does this come from?
Naïve Bayes classification 1 Probability theory Random variable: a variable whose possible values are numerical outcomes of a random phenomenon. Examples: A person s height, the outcome of a coin toss
More informationBayesian Statistics. Debdeep Pati Florida State University. February 11, 2016
Bayesian Statistics Debdeep Pati Florida State University February 11, 2016 Historical Background Historical Background Historical Background Brief History of Bayesian Statistics 1764-1838: called probability
More informationBayesian Inference. Chapter 2: Conjugate models
Bayesian Inference Chapter 2: Conjugate models Conchi Ausín and Mike Wiper Department of Statistics Universidad Carlos III de Madrid Master in Business Administration and Quantitative Methods Master in
More informationBayesian Inference: Concept and Practice
Inference: Concept and Practice fundamentals Johan A. Elkink School of Politics & International Relations University College Dublin 5 June 2017 1 2 3 Bayes theorem In order to estimate the parameters of
More informationLearning Sequence Motif Models Using Expectation Maximization (EM) and Gibbs Sampling
Learning Sequence Motif Models Using Expectation Maximization (EM) and Gibbs Sampling BMI/CS 776 www.biostat.wisc.edu/bmi776/ Spring 009 Mark Craven craven@biostat.wisc.edu Sequence Motifs what is a sequence
More informationParameter Estimation. William H. Jefferys University of Texas at Austin Parameter Estimation 7/26/05 1
Parameter Estimation William H. Jefferys University of Texas at Austin bill@bayesrules.net Parameter Estimation 7/26/05 1 Elements of Inference Inference problems contain two indispensable elements: Data
More informationProbabilistic Graphical Models
Parameter Estimation December 14, 2015 Overview 1 Motivation 2 3 4 What did we have so far? 1 Representations: how do we model the problem? (directed/undirected). 2 Inference: given a model and partially
More informationCLASS NOTES Models, Algorithms and Data: Introduction to computing 2018
CLASS NOTES Models, Algorithms and Data: Introduction to computing 208 Petros Koumoutsakos, Jens Honore Walther (Last update: June, 208) IMPORTANT DISCLAIMERS. REFERENCES: Much of the material (ideas,
More informationNested Sampling. Brendon J. Brewer. brewer/ Department of Statistics The University of Auckland
Department of Statistics The University of Auckland https://www.stat.auckland.ac.nz/ brewer/ is a Monte Carlo method (not necessarily MCMC) that was introduced by John Skilling in 2004. It is very popular
More informationModeling Environment
Topic Model Modeling Environment What does it mean to understand/ your environment? Ability to predict Two approaches to ing environment of words and text Latent Semantic Analysis (LSA) Topic Model LSA
More informationClassical and Bayesian inference
Classical and Bayesian inference AMS 132 Claudia Wehrhahn (UCSC) Classical and Bayesian inference January 8 1 / 11 The Prior Distribution Definition Suppose that one has a statistical model with parameter
More informationCPSC 340: Machine Learning and Data Mining
CPSC 340: Machine Learning and Data Mining MLE and MAP Original version of these slides by Mark Schmidt, with modifications by Mike Gelbart. 1 Admin Assignment 4: Due tonight. Assignment 5: Will be released
More informationECE521 W17 Tutorial 6. Min Bai and Yuhuai (Tony) Wu
ECE521 W17 Tutorial 6 Min Bai and Yuhuai (Tony) Wu Agenda knn and PCA Bayesian Inference k-means Technique for clustering Unsupervised pattern and grouping discovery Class prediction Outlier detection
More informationA Bayesian Approach to Phylogenetics
A Bayesian Approach to Phylogenetics Niklas Wahlberg Based largely on slides by Paul Lewis (www.eeb.uconn.edu) An Introduction to Bayesian Phylogenetics Bayesian inference in general Markov chain Monte
More information(5) Multi-parameter models - Gibbs sampling. ST440/540: Applied Bayesian Analysis
Summarizing a posterior Given the data and prior the posterior is determined Summarizing the posterior gives parameter estimates, intervals, and hypothesis tests Most of these computations are integrals
More informationChapter 3: Maximum-Likelihood & Bayesian Parameter Estimation (part 1)
HW 1 due today Parameter Estimation Biometrics CSE 190 Lecture 7 Today s lecture was on the blackboard. These slides are an alternative presentation of the material. CSE190, Winter10 CSE190, Winter10 Chapter
More informationMCMC notes by Mark Holder
MCMC notes by Mark Holder Bayesian inference Ultimately, we want to make probability statements about true values of parameters, given our data. For example P(α 0 < α 1 X). According to Bayes theorem:
More informationSTA414/2104 Statistical Methods for Machine Learning II
STA414/2104 Statistical Methods for Machine Learning II Murat A. Erdogdu & David Duvenaud Department of Computer Science Department of Statistical Sciences Lecture 3 Slide credits: Russ Salakhutdinov Announcements
More informationHPD Intervals / Regions
HPD Intervals / Regions The HPD region will be an interval when the posterior is unimodal. If the posterior is multimodal, the HPD region might be a discontiguous set. Picture: The set {θ : θ (1.5, 3.9)
More informationReadings: K&F: 16.3, 16.4, Graphical Models Carlos Guestrin Carnegie Mellon University October 6 th, 2008
Readings: K&F: 16.3, 16.4, 17.3 Bayesian Param. Learning Bayesian Structure Learning Graphical Models 10708 Carlos Guestrin Carnegie Mellon University October 6 th, 2008 10-708 Carlos Guestrin 2006-2008
More informationNaïve Bayes classification. p ij 11/15/16. Probability theory. Probability theory. Probability theory. X P (X = x i )=1 i. Marginal Probability
Probability theory Naïve Bayes classification Random variable: a variable whose possible values are numerical outcomes of a random phenomenon. s: A person s height, the outcome of a coin toss Distinguish
More informationBayesian Estimation An Informal Introduction
Mary Parker, Bayesian Estimation An Informal Introduction page 1 of 8 Bayesian Estimation An Informal Introduction Example: I take a coin out of my pocket and I want to estimate the probability of heads
More informationMachine Learning
Machine Learning 10-601 Tom M. Mitchell Machine Learning Department Carnegie Mellon University August 30, 2017 Today: Decision trees Overfitting The Big Picture Coming soon Probabilistic learning MLE,
More informationIntroduction to Bayesian Statistics and Markov Chain Monte Carlo Estimation. EPSY 905: Multivariate Analysis Spring 2016 Lecture #10: April 6, 2016
Introduction to Bayesian Statistics and Markov Chain Monte Carlo Estimation EPSY 905: Multivariate Analysis Spring 2016 Lecture #10: April 6, 2016 EPSY 905: Intro to Bayesian and MCMC Today s Class An
More informationMachine Learning using Bayesian Approaches
Machine Learning using Bayesian Approaches Sargur N. Srihari University at Buffalo, State University of New York 1 Outline 1. Progress in ML and PR 2. Fully Bayesian Approach 1. Probability theory Bayes
More informationCompute f(x θ)f(θ) dθ
Bayesian Updating: Continuous Priors 18.05 Spring 2014 b a Compute f(x θ)f(θ) dθ January 1, 2017 1 /26 Beta distribution Beta(a, b) has density (a + b 1)! f (θ) = θ a 1 (1 θ) b 1 (a 1)!(b 1)! http://mathlets.org/mathlets/beta-distribution/
More informationSAMPLE CHAPTER. Avi Pfeffer. FOREWORD BY Stuart Russell MANNING
SAMPLE CHAPTER Avi Pfeffer FOREWORD BY Stuart Russell MANNING Practical Probabilistic Programming by Avi Pfeffer Chapter 9 Copyright 2016 Manning Publications brief contents PART 1 INTRODUCING PROBABILISTIC
More informationComputational Cognitive Science
Computational Cognitive Science Lecture 9: Bayesian Estimation Chris Lucas (Slides adapted from Frank Keller s) School of Informatics University of Edinburgh clucas2@inf.ed.ac.uk 17 October, 2017 1 / 28
More informationBayesian Models in Machine Learning
Bayesian Models in Machine Learning Lukáš Burget Escuela de Ciencias Informáticas 2017 Buenos Aires, July 24-29 2017 Frequentist vs. Bayesian Frequentist point of view: Probability is the frequency of
More informationComputational Perception. Bayesian Inference
Computational Perception 15-485/785 January 24, 2008 Bayesian Inference The process of probabilistic inference 1. define model of problem 2. derive posterior distributions and estimators 3. estimate parameters
More informationPrinciples of Bayesian Inference
Principles of Bayesian Inference Sudipto Banerjee 1 and Andrew O. Finley 2 1 Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, U.S.A. 2 Department of Forestry & Department
More informationLecture 6: Markov Chain Monte Carlo
Lecture 6: Markov Chain Monte Carlo D. Jason Koskinen koskinen@nbi.ku.dk Photo by Howard Jackman University of Copenhagen Advanced Methods in Applied Statistics Feb - Apr 2016 Niels Bohr Institute 2 Outline
More informationLecture 13 : Variational Inference: Mean Field Approximation
10-708: Probabilistic Graphical Models 10-708, Spring 2017 Lecture 13 : Variational Inference: Mean Field Approximation Lecturer: Willie Neiswanger Scribes: Xupeng Tong, Minxing Liu 1 Problem Setup 1.1
More informationIntroduction to Probability and Statistics (Continued)
Introduction to Probability and Statistics (Continued) Prof. icholas Zabaras Center for Informatics and Computational Science https://cics.nd.edu/ University of otre Dame otre Dame, Indiana, USA Email:
More informationPattern Recognition and Machine Learning. Bishop Chapter 2: Probability Distributions
Pattern Recognition and Machine Learning Chapter 2: Probability Distributions Cécile Amblard Alex Kläser Jakob Verbeek October 11, 27 Probability Distributions: General Density Estimation: given a finite
More informationLecture 13 Fundamentals of Bayesian Inference
Lecture 13 Fundamentals of Bayesian Inference Dennis Sun Stats 253 August 11, 2014 Outline of Lecture 1 Bayesian Models 2 Modeling Correlations Using Bayes 3 The Universal Algorithm 4 BUGS 5 Wrapping Up
More informationσ(a) = a N (x; 0, 1 2 ) dx. σ(a) = Φ(a) =
Until now we have always worked with likelihoods and prior distributions that were conjugate to each other, allowing the computation of the posterior distribution to be done in closed form. Unfortunately,
More informationBayesian Methods: Naïve Bayes
Bayesian Methods: aïve Bayes icholas Ruozzi University of Texas at Dallas based on the slides of Vibhav Gogate Last Time Parameter learning Learning the parameter of a simple coin flipping model Prior
More informationNon-Parametric Bayes
Non-Parametric Bayes Mark Schmidt UBC Machine Learning Reading Group January 2016 Current Hot Topics in Machine Learning Bayesian learning includes: Gaussian processes. Approximate inference. Bayesian
More informationLearning Bayesian network : Given structure and completely observed data
Learning Bayesian network : Given structure and completely observed data Probabilistic Graphical Models Sharif University of Technology Spring 2017 Soleymani Learning problem Target: true distribution
More informationBayesian Analysis (Optional)
Bayesian Analysis (Optional) 1 2 Big Picture There are two ways to conduct statistical inference 1. Classical method (frequentist), which postulates (a) Probability refers to limiting relative frequencies
More informationLEARNING WITH BAYESIAN NETWORKS
LEARNING WITH BAYESIAN NETWORKS Author: David Heckerman Presented by: Dilan Kiley Adapted from slides by: Yan Zhang - 2006, Jeremy Gould 2013, Chip Galusha -2014 Jeremy Gould 2013Chip Galus May 6th, 2016
More informationCPSC 540: Machine Learning
CPSC 540: Machine Learning MCMC and Non-Parametric Bayes Mark Schmidt University of British Columbia Winter 2016 Admin I went through project proposals: Some of you got a message on Piazza. No news is
More informationLecture 6: Model Checking and Selection
Lecture 6: Model Checking and Selection Melih Kandemir melih.kandemir@iwr.uni-heidelberg.de May 27, 2014 Model selection We often have multiple modeling choices that are equally sensible: M 1,, M T. Which
More informationIntroduction to Machine Learning
Introduction to Machine Learning Generative Models Varun Chandola Computer Science & Engineering State University of New York at Buffalo Buffalo, NY, USA chandola@buffalo.edu Chandola@UB CSE 474/574 1
More informationIntroduction to Machine Learning CMU-10701
Introduction to Machine Learning CMU-10701 2. MLE, MAP, Bayes classification Barnabás Póczos & Aarti Singh 2014 Spring Administration http://www.cs.cmu.edu/~aarti/class/10701_spring14/index.html Blackboard
More informationParticle Filtering a brief introductory tutorial. Frank Wood Gatsby, August 2007
Particle Filtering a brief introductory tutorial Frank Wood Gatsby, August 2007 Problem: Target Tracking A ballistic projectile has been launched in our direction and may or may not land near enough to
More informationBayesian Inference: Posterior Intervals
Bayesian Inference: Posterior Intervals Simple values like the posterior mean E[θ X] and posterior variance var[θ X] can be useful in learning about θ. Quantiles of π(θ X) (especially the posterior median)
More informationPrinciples of Bayesian Inference
Principles of Bayesian Inference Sudipto Banerjee University of Minnesota July 20th, 2008 1 Bayesian Principles Classical statistics: model parameters are fixed and unknown. A Bayesian thinks of parameters
More informationAALBORG UNIVERSITY. Learning conditional Gaussian networks. Susanne G. Bøttcher. R June Department of Mathematical Sciences
AALBORG UNIVERSITY Learning conditional Gaussian networks by Susanne G. Bøttcher R-2005-22 June 2005 Department of Mathematical Sciences Aalborg University Fredrik Bajers Vej 7 G DK - 9220 Aalborg Øst
More informationLecture 18: Learning probabilistic models
Lecture 8: Learning probabilistic models Roger Grosse Overview In the first half of the course, we introduced backpropagation, a technique we used to train neural nets to minimize a variety of cost functions.
More informationBayesian analysis in nuclear physics
Bayesian analysis in nuclear physics Ken Hanson T-16, Nuclear Physics; Theoretical Division Los Alamos National Laboratory Tutorials presented at LANSCE Los Alamos Neutron Scattering Center July 25 August
More informationLecture 3. Univariate Bayesian inference: conjugate analysis
Summary Lecture 3. Univariate Bayesian inference: conjugate analysis 1. Posterior predictive distributions 2. Conjugate analysis for proportions 3. Posterior predictions for proportions 4. Conjugate analysis
More information