A Very Brief Summary of Statistical Inference, and Examples


Trinity Term 2008, Prof. Gesine Reinert

Data $x = (x_1, x_2, \ldots, x_n)$: realisations of random variables $X_1, X_2, \ldots, X_n$ with distribution (model) $f(x_1, x_2, \ldots, x_n; \theta)$. Frequentist: $\theta \in \Theta$ is an unknown constant.

1. Likelihood and Sufficiency (Fisherian)

Likelihood approach: $L(\theta) = L(\theta; x) = f(x_1, x_2, \ldots, x_n; \theta)$. Often $X_1, X_2, \ldots, X_n$ are independent, identically distributed (i.i.d.); then $L(\theta; x) = \prod_{i=1}^n f(x_i; \theta)$.

Summarize the information about $\theta$ by finding a minimal sufficient statistic $t(x)$. From the Factorization Theorem, $T = t(X)$ is sufficient for $\theta$ if and only if there exist functions $g(t, \theta)$ and $h(x)$ such that for all $x$ and $\theta$
$$f(x; \theta) = g(t(x), \theta)\, h(x).$$
Moreover, $T = t(X)$ is minimal sufficient when
$$\frac{f(x; \theta)}{f(y; \theta)} \text{ is constant in } \theta \iff t(x) = t(y).$$
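For concreteness, a minimal Python sketch (not part of the original notes; the exponential density $f(x; \theta) = \theta e^{-\theta x}$ is a stand-in model) showing that for i.i.d. data the log-likelihood is a sum of per-observation terms:

```python
import numpy as np

def log_likelihood(theta, x):
    # i.i.d. data: log L(theta; x) = sum_i log f(x_i; theta),
    # here with f(x; theta) = theta * exp(-theta * x) as a stand-in model
    return np.sum(np.log(theta) - theta * x)

x = np.array([0.8, 1.3, 0.2, 2.4, 0.9])
print(log_likelihood(1.0, x), log_likelihood(2.0, x))
```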

Example. Let $X_1, \ldots, X_n$ be a random sample from the truncated exponential distribution with density $f_{X_i}(x_i) = e^{\theta - x_i}$, $x_i > \theta$, or, in indicator-function notation, $f_{X_i}(x_i) = e^{\theta - x_i} I_{(\theta, \infty)}(x_i)$. Show that $Y_1 = \min_i X_i$ is sufficient for $\theta$.

Let $T = T(X_1, \ldots, X_n) = Y_1$. We need the pdf $f_T(t)$ of the smallest order statistic. First calculate the cdf of $X_i$:
$$F(x) = \int_\theta^x e^{\theta - z}\, dz = e^\theta \left[ e^{-\theta} - e^{-x} \right] = 1 - e^{\theta - x}.$$

Now
$$P(T > t) = \prod_{i=1}^n (1 - F(t)) = (1 - F(t))^n.$$
Differentiating gives that $f_T(t)$ equals
$$n [1 - F(t)]^{n-1} f(t) = n e^{(\theta - t)(n - 1)} e^{\theta - t} = n e^{n(\theta - t)}, \qquad t > \theta.$$
So the conditional density of $X_1, \ldots, X_n$ given $T = t$ is
$$\frac{e^{\theta - x_1} e^{\theta - x_2} \cdots e^{\theta - x_n}}{n e^{n(\theta - t)}} = \frac{e^{-\sum x_i}}{n e^{-nt}}, \qquad x_i \geq t,\ i = 1, \ldots, n,$$
which does not depend on $\theta$ for each fixed $t = \min_i x_i$. Note that since $x_i \geq t$, $i = 1, \ldots, n$, neither the expression nor the range space depends on $\theta$, so the first order statistic, $X_{(1)}$, is a sufficient statistic for $\theta$.

Alternatively, use the Factorization Theorem:
$$f_X(x) = \prod_{i=1}^n e^{\theta - x_i} I_{(\theta, \infty)}(x_i) = I_{(\theta, \infty)}(x_{(1)}) \prod_{i=1}^n e^{\theta - x_i} = e^{n\theta} I_{(\theta, \infty)}(x_{(1)})\, e^{-\sum_{i=1}^n x_i},$$
with $g(t, \theta) = e^{n\theta} I_{(\theta, \infty)}(t)$ and $h(x) = e^{-\sum_{i=1}^n x_i}$.
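A quick numerical sanity check of this factorization (a sketch, assuming one simulates from the model via $X = \theta + E$ with $E$ standard exponential):

```python
import numpy as np

rng = np.random.default_rng(0)
n, theta_true = 5, 2.0
x = theta_true + rng.exponential(size=n)       # X = theta + Exp(1) has density e^{theta - x}

def f_joint(x, theta):
    # f(x; theta) = exp(sum(theta - x_i)) on {min(x) > theta}, else 0
    return np.exp(np.sum(theta - x)) if x.min() > theta else 0.0

def g(t, theta, n):
    # g(t, theta) = exp(n * theta) * I(t > theta)
    return np.exp(n * theta) if t > theta else 0.0

h = np.exp(-x.sum())                           # h(x) = exp(-sum x_i)
for theta in [1.0, 1.5, 1.9]:
    assert np.isclose(f_joint(x, theta), g(x.min(), theta, n) * h)
```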

2. Point Estimation

Estimate $\theta$ by a function $t(x_1, \ldots, x_n)$ of the data, often by maximum likelihood, $\hat\theta = \arg\max_\theta L(\theta)$, or by the method of moments. Neither is unbiased in general, but the m.l.e. is asymptotically unbiased and asymptotically efficient; under some regularity assumptions,
$$\hat\theta \approx \mathcal{N}\left(\theta, I_n^{-1}(\theta)\right),$$
where
$$I_n(\theta) = E_\theta \left[ \left( \frac{\partial \ell(\theta; X)}{\partial \theta} \right)^2 \right]$$
is the Fisher information (matrix).

Under more regularity,
$$I_n(\theta) = -E_\theta \left( \frac{\partial^2 \ell(\theta; X)}{\partial \theta^2} \right).$$
If $x$ is a random sample, then $I_n(\theta) = n I_1(\theta)$; often abbreviate $I_1(\theta) = I(\theta)$.

The m.l.e. is a function of a sufficient statistic. Recall that, for scalar $\theta$, there is a nice theory using the Cramér-Rao lower bound and the Rao-Blackwell theorem on how to obtain minimum-variance unbiased estimators based on a sufficient statistic and an unbiased estimator.

Nice invariance property: the m.l.e. of a function $\phi(\theta)$ is $\phi(\hat\theta)$.

Example. Suppose $X_1, X_2, \ldots, X_n$ is a random sample from the Gamma distribution with density
$$f(x; c, \beta) = \frac{x^{c-1}}{\Gamma(c)\, \beta^c}\, e^{-x/\beta}, \qquad x > 0,$$
where $c > 0$, $\beta > 0$; $\theta = (c, \beta)$. Then
$$\ell(\theta) = -n \log \Gamma(c) - nc \log \beta + (c - 1) \sum \log x_i - \frac{1}{\beta} \sum x_i.$$
Put $D_1(c) = \frac{\partial}{\partial c} \log \Gamma(c)$; then
$$\frac{\partial \ell}{\partial \beta} = -\frac{nc}{\beta} + \frac{1}{\beta^2} \sum x_i, \qquad \frac{\partial \ell}{\partial c} = -n D_1(c) - n \log \beta + \sum \log x_i.$$

Setting these equal to zero yields $\hat\beta = \bar{x}/\hat{c}$, where $\hat{c}$ solves
$$D_1(\hat{c}) - \log \hat{c} = \log\left( \left[ \textstyle\prod x_i \right]^{1/n} \big/ \bar{x} \right).$$
(One could check that the sufficient statistic is indeed $\left( \sum x_i, \prod x_i \right)$, and that it is minimal sufficient.) Check the second derivatives for a maximum.

Calculate the Fisher information: put $D_2(c) = \frac{\partial^2}{\partial c^2} \log \Gamma(c)$; then
$$\frac{\partial^2 \ell}{\partial \beta^2} = \frac{nc}{\beta^2} - \frac{2}{\beta^3} \sum x_i, \qquad \frac{\partial^2 \ell}{\partial \beta\, \partial c} = -\frac{n}{\beta}, \qquad \frac{\partial^2 \ell}{\partial c^2} = -n D_2(c).$$

Use that $E X_i = c\beta$ to obtain
$$I_n(\theta) = n \begin{pmatrix} D_2(c) & \frac{1}{\beta} \\ \frac{1}{\beta} & \frac{c}{\beta^2} \end{pmatrix}$$
(rows and columns ordered as $\theta = (c, \beta)$). Recall also that there is an iterative method to compute m.l.e.s, related to the Newton-Raphson method.
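As a numerical aside (an illustrative sketch, not from the notes; the true parameters and sample size are made up), the equation for $\hat{c}$ can be solved by standard root-finding, with `scipy.special.digamma` playing the role of $D_1$:

```python
import numpy as np
from scipy.special import digamma
from scipy.optimize import brentq

rng = np.random.default_rng(1)
x = rng.gamma(shape=3.0, scale=2.0, size=500)   # true c = 3, beta = 2

# c-hat solves D1(c) - log(c) = log(geometric mean) - log(arithmetic mean)
rhs = np.mean(np.log(x)) - np.log(np.mean(x))
c_hat = brentq(lambda c: digamma(c) - np.log(c) - rhs, 1e-6, 100.0)
beta_hat = np.mean(x) / c_hat
print(c_hat, beta_hat)   # close to (3, 2) for this sample size
```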

3. Hypothesis Testing

For a simple null hypothesis and a simple alternative, the Neyman-Pearson Lemma says that the most powerful tests are likelihood-ratio tests. These can sometimes be generalized to one-sided alternatives in such a way as to yield uniformly most powerful (UMP) tests.

Example. Suppose, as above, that $X_1, \ldots, X_n$ is a random sample from the truncated exponential distribution with density $f_{X_i}(x_i; \theta) = e^{\theta - x_i}$, $x_i > \theta$. Find a UMP test of size $\alpha$ for testing $H_0: \theta = \theta_0$ against $H_1: \theta > \theta_0$.

Let $\theta_1 > \theta_0$; then the likelihood ratio is
$$\frac{f(x; \theta_1)}{f(x; \theta_0)} = e^{n(\theta_1 - \theta_0)} I_{(\theta_1, \infty)}(x_{(1)}),$$
which increases with $x_{(1)}$, so we reject $H_0$ if the smallest observation, $T = X_{(1)}$, is large. Under $H_0$, we have calculated that the pdf of $T$ is $f_T(y_1) = n e^{n(\theta_0 - y_1)}$, $y_1 > \theta_0$.

So for $t > \theta_0$, under $H_0$,
$$P(T > t) = \int_t^\infty n e^{n(\theta_0 - s)}\, ds = e^{n(\theta_0 - t)}.$$
For a test of size $\alpha$, choose $t$ such that $e^{n(\theta_0 - t)} = \alpha$, that is,
$$t = \theta_0 - \frac{\ln \alpha}{n}.$$
The LR test rejects $H_0$ if $X_{(1)} > \theta_0 - \frac{\ln \alpha}{n}$. The test is the same for every $\theta_1 > \theta_0$, so it is UMP.
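A small Monte Carlo check of the size of this test (an illustrative sketch; $n$, $\theta_0$ and $\alpha$ are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(2)
n, theta0, alpha = 10, 1.0, 0.05
t_crit = theta0 - np.log(alpha) / n                  # rejection threshold for X_(1)

reps = 100_000
samples = theta0 + rng.exponential(size=(reps, n))   # data under H0
size = np.mean(samples.min(axis=1) > t_crit)
print(size)   # close to alpha = 0.05
```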

For a general null hypothesis and a general alternative,
$$H_0: \theta \in \Theta_0 \qquad H_1: \theta \in \Theta_1 = \Theta \setminus \Theta_0,$$
the (generalized) LR test uses the likelihood ratio statistic
$$T = \frac{\max_{\theta \in \Theta} L(\theta; X)}{\max_{\theta \in \Theta_0} L(\theta; X)}$$
and rejects $H_0$ for large values of $T$; use the chi-square asymptotics
$$2 \log T \approx \chi^2_p, \qquad p = \dim \Theta - \dim \Theta_0$$
(nested models only). Alternatively, use score tests, which are based on the score function $\partial \ell / \partial \theta$, with asymptotics in terms of the normal distribution $\mathcal{N}(0, I_n(\theta))$. An important example is Pearson's chi-square test, which we derived as a score test and saw to be asymptotically equivalent to the generalized likelihood ratio test.

Example. Let $X_1, \ldots, X_n$ be a random sample from a geometric distribution with parameter $p$, $P(X_i = k) = p(1 - p)^k$, $k = 0, 1, 2, \ldots$. Then $L(p) = p^n (1 - p)^{\sum x_i}$ and $\ell(p) = n \ln p + n \bar{x} \ln(1 - p)$, so that
$$\frac{\partial \ell}{\partial p} = n \left( \frac{1}{p} - \frac{\bar{x}}{1 - p} \right)$$
and
$$\frac{\partial^2 \ell}{\partial p^2} = -n \left( \frac{1}{p^2} + \frac{\bar{x}}{(1 - p)^2} \right).$$

We know that $E X_1 = (1 - p)/p$. Calculate the information:
$$I(p) = -E\, \frac{\partial^2 \ell}{\partial p^2} = \frac{n}{p^2 (1 - p)}.$$
Suppose $n = 20$, $\bar{x} = 3$, $H_0: p_0 = 0.15$ and $H_1: p \neq 0.15$. Then $\frac{\partial \ell}{\partial p}(0.15) = 62.7$ and $I(0.15) = 1045.8$. The score test statistic is $Z = 62.7 / \sqrt{1045.8} = 1.94$. Compare to 1.96 for a test at level $\alpha = 0.05$: do not reject $H_0$.
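The score-test arithmetic above can be reproduced in a few lines (an illustrative sketch):

```python
import numpy as np

n, xbar, p0 = 20, 3.0, 0.15
score = n * (1 / p0 - xbar / (1 - p0))   # dl/dp evaluated at p0
info = n / (p0**2 * (1 - p0))            # Fisher information I(p0)
Z = score / np.sqrt(info)
print(score, info, Z)                    # 62.75, 1045.75, 1.94: |Z| < 1.96, do not reject
```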

Example continued. For a generalized LRT the test statistic is based on
$$2 \log \left[ L(\hat{p}) / L(p_0) \right] \approx \chi^2_1,$$
and the test statistic is thus
$$2 \left( \ell(\hat{p}) - \ell(p_0) \right) = 2n \left( \ln \hat{p} - \ln p_0 + \bar{x} \left( \ln(1 - \hat{p}) - \ln(1 - p_0) \right) \right).$$
We calculate that the m.l.e. is $\hat{p} = 1/(1 + \bar{x})$. Suppose again $n = 20$, $\bar{x} = 3$, $H_0: p_0 = 0.15$ and $H_1: p \neq 0.15$. The m.l.e. is $\hat{p} = 0.25$, with $\ell(0.25) = -44.98$ and $\ell(0.15) = -47.69$. Calculate $\chi^2 = 2(-44.98 + 47.69) = 5.4$ and compare to the chi-square distribution with 1 degree of freedom, whose 5 percent critical value is 3.84: reject $H_0$.
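And likewise for the generalized LRT (the same numbers, as a sketch):

```python
import numpy as np

def loglik(p, n=20, xbar=3.0):
    # l(p) = n ln p + n xbar ln(1 - p)
    return n * np.log(p) + n * xbar * np.log(1 - p)

p_hat = 1 / (1 + 3.0)                    # m.l.e. 1/(1 + xbar) = 0.25
lrt = 2 * (loglik(p_hat) - loglik(0.15))
print(loglik(p_hat), loglik(0.15), lrt)  # -44.98, -47.69, 5.42 > 3.84: reject
```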

4. Confidence Regions

If we can find a pivot, i.e. a function $t(X, \theta)$ of a sufficient statistic whose distribution does not depend on $\theta$, then we can find confidence regions in a straightforward manner. Otherwise we may have to resort to approximate confidence regions, for example using the approximate normality of the m.l.e. Recall that confidence intervals are equivalent to hypothesis tests with a simple null hypothesis and one- or two-sided alternatives.

Not to forget about: Profile likelihood

Often $\theta = (\psi, \lambda)$, where $\psi$ contains the parameters of interest; then base inference on the profile likelihood for $\psi$,
$$L_P(\psi) = L(\psi, \hat\lambda_\psi).$$
Again we can use the (generalized) LRT or the score test. If $\psi$ is scalar, with $H_0: \psi = \psi_0$, $H_1^+: \psi > \psi_0$, $H_1^-: \psi < \psi_0$, use the test statistic
$$T = \frac{\partial \ell}{\partial \psi}(\psi_0, \hat\lambda_0; X),$$
where $\hat\lambda_0$ is the m.l.e. for $\lambda$ when $H_0$ is true.

Large positive values of $T$ indicate $H_1^+$, large negative values indicate $H_1^-$, and
$$T \approx \ell_\psi - I_{\psi,\lambda} I_{\lambda,\lambda}^{-1} \ell_\lambda \approx \mathcal{N}\left(0, 1/I^{\psi,\psi}\right),$$
where $I^{\psi,\psi} = \left( I_{\psi,\psi} - I_{\psi,\lambda}^2 I_{\lambda,\lambda}^{-1} \right)^{-1}$ is the top left element of $I^{-1}$. Estimate by substituting the null hypothesis values, and calculate the practical standardized form of $T$ as
$$Z = \frac{T}{\sqrt{\operatorname{Var}(T)}} \approx \frac{\partial \ell}{\partial \psi}(\psi, \hat\lambda_\psi) \left[ I^{\psi,\psi}(\psi, \hat\lambda_\psi) \right]^{1/2},$$
which is approximately standard normal.

Bias and variance approximations: the delta method

If we cannot calculate mean and variance directly: suppose $T = g(S)$, where $E S = \beta$ and $\operatorname{Var} S = V$. Taylor expansion gives $T = g(S) \approx g(\beta) + (S - \beta) g'(\beta)$. Taking the mean and variance of the r.h.s.:
$$E T \approx g(\beta), \qquad \operatorname{Var} T \approx [g'(\beta)]^2 V.$$
This also works for vectors $S$, $\beta$, with $T$ still a scalar: with $(g'(\beta))_i = \partial g / \partial \beta_i$ and $g''(\beta)$ the matrix of second derivatives,
$$\operatorname{Var} T \approx [g'(\beta)]^T V g'(\beta) \qquad \text{and} \qquad E T \approx g(\beta) + \frac{1}{2} \operatorname{trace}\left[ g''(\beta) V \right].$$
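A quick simulation illustrating the delta method (a sketch; the choice $g(s) = \log s$ and the values of $\beta$ and $V$ are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)

# T = g(S) = log(S): the delta method gives
# E T ~ log(beta) - V / (2 beta^2)  and  Var T ~ V / beta^2
beta, V = 5.0, 0.25
S = rng.normal(beta, np.sqrt(V), size=1_000_000)
T = np.log(S)
print(T.mean(), np.log(beta) - V / (2 * beta**2))   # means agree closely
print(T.var(), V / beta**2)                          # variances agree closely
```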

Exponential family

For distributions in the exponential family, many calculations have been standardized; see the Notes.

A Very Brief Summary of Bayesian Inference, and Examples

Data $x = (x_1, x_2, \ldots, x_n)$: realisations of random variables $X_1, X_2, \ldots, X_n$ with distribution (model) $f(x_1, x_2, \ldots, x_n \mid \theta)$. Bayesian: $\theta \in \Theta$ is random, with prior $\pi(\theta)$.

1. Priors, Posteriors, Likelihood, and Sufficiency

The posterior distribution of $\theta$ given $x$ is
$$\pi(\theta \mid x) = \frac{f(x \mid \theta)\, \pi(\theta)}{\int f(x \mid \theta)\, \pi(\theta)\, d\theta}.$$
Shorter: posterior $\propto$ prior $\times$ likelihood.

The (prior) predictive distribution of the data $x$ on the basis of the prior $\pi$ is
$$p(x) = \int f(x \mid \theta)\, \pi(\theta)\, d\theta.$$
If data $x_1$ are available and we want to predict additional data,
$$p(x_2 \mid x_1) = \int f(x_2 \mid \theta)\, \pi(\theta \mid x_1)\, d\theta.$$
Note that $x_2$ and $x_1$ are assumed conditionally independent given $\theta$; they are not, in general, unconditionally independent.

Example. Suppose $y_1, y_2, \ldots, y_n$ are independent normally distributed random variables, each with variance 1 and with means $\beta x_1, \ldots, \beta x_n$, where $\beta$ is an unknown real-valued parameter and $x_1, x_2, \ldots, x_n$ are known constants. Suppose the prior is $\pi(\beta) = \mathcal{N}(\mu, \alpha^2)$. Then the likelihood is
$$L(\beta) = (2\pi)^{-\frac{n}{2}} \prod_{i=1}^n e^{-\frac{(y_i - \beta x_i)^2}{2}}$$
and
$$\pi(\beta) = \frac{1}{\sqrt{2\pi\alpha^2}}\, e^{-\frac{(\beta - \mu)^2}{2\alpha^2}} \propto \exp\left\{ -\frac{\beta^2}{2\alpha^2} + \frac{\beta\mu}{\alpha^2} \right\}.$$

The posterior of $\beta$ given $y_1, y_2, \ldots, y_n$ can then be calculated as
$$\pi(\beta \mid y) \propto \exp\left\{ -\frac{1}{2} \sum (y_i - \beta x_i)^2 - \frac{1}{2\alpha^2}(\beta - \mu)^2 \right\} \propto \exp\left\{ \beta \sum x_i y_i - \frac{1}{2} \beta^2 \sum x_i^2 - \frac{\beta^2}{2\alpha^2} + \frac{\beta\mu}{\alpha^2} \right\}.$$
Abbreviate $s_{xx} = \sum x_i^2$ and $s_{xy} = \sum x_i y_i$; then
$$\pi(\beta \mid y) \propto \exp\left\{ \beta s_{xy} - \frac{1}{2} \beta^2 s_{xx} - \frac{\beta^2}{2\alpha^2} + \frac{\beta\mu}{\alpha^2} \right\} = \exp\left\{ -\frac{\beta^2}{2}\left( s_{xx} + \frac{1}{\alpha^2} \right) + \beta\left( s_{xy} + \frac{\mu}{\alpha^2} \right) \right\},$$
which we recognize as
$$\mathcal{N}\left( \frac{s_{xy} + \frac{\mu}{\alpha^2}}{s_{xx} + \frac{1}{\alpha^2}},\ \left( s_{xx} + \frac{1}{\alpha^2} \right)^{-1} \right).$$
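In code, the posterior update amounts to these two formulas (a sketch with simulated data; the true $\beta$, prior mean $\mu$ and prior variance $\alpha^2$ below are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(4)
beta_true, mu, alpha2 = 2.0, 0.0, 10.0
x = rng.uniform(0, 1, size=50)
y = beta_true * x + rng.normal(size=50)   # y_i ~ N(beta * x_i, 1)

sxx, sxy = np.sum(x**2), np.sum(x * y)
post_var = 1 / (sxx + 1 / alpha2)
post_mean = (sxy + mu / alpha2) * post_var
print(post_mean, post_var)                # posterior N(post_mean, post_var)
```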

Summarize information about $\theta$: find a minimal sufficient statistic $t(x)$. Here $T = t(X)$ is sufficient for $\theta$ if and only if for all $x$ and $\theta$
$$\pi(\theta \mid x) = \pi(\theta \mid t(x)).$$
This agrees with the frequentist notion; the posterior, and hence Bayesian inference, is based on a sufficient statistic.

Choice of prior

There are many ways of choosing a prior distribution, for example using a coherent belief system. Often it is convenient to restrict the class of priors to a particular family of distributions. When the posterior is in the same family of models as the prior, i.e. when one has a conjugate prior, updating the distribution under new data is particularly convenient. For the regular $k$-parameter exponential family,
$$f(x \mid \theta) = f(x)\, g(\theta) \exp\left\{ \sum_{i=1}^k c_i \phi_i(\theta) h_i(x) \right\}, \qquad x \in X,$$
conjugate priors can be derived in a straightforward manner, using the sufficient statistics.
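As a concrete instance of conjugacy (an illustration, not from the notes), the Beta family is conjugate for Bernoulli sampling, so updating under new data is a simple parameter shift:

```python
# Beta(a, b) prior for a Bernoulli success probability is conjugate:
# after s successes in n trials, the posterior is Beta(a + s, b + n - s).
def beta_update(a, b, successes, n):
    return a + successes, b + (n - successes)

a, b = 1.0, 1.0                   # uniform prior
a, b = beta_update(a, b, 7, 10)   # observe 7 successes in 10 trials
print(a, b)                       # Beta(8, 4), posterior mean 8/12
```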

Non-informative priors favour no particular values of the parameter over others. If $\Theta$ is finite, choose the uniform prior; if $\Theta$ is infinite, there are several approaches (some of which give improper priors).

For a location density $f(x; \theta) = f(x - \theta)$, the non-informative location-invariant prior is $\pi(\theta) = \pi(0)$, constant; usually one chooses $\pi(\theta) = 1$ for all $\theta$ (an improper prior).

For a scale density $f(x; \sigma) = \frac{1}{\sigma} f\left(\frac{x}{\sigma}\right)$, $\sigma > 0$, the scale-invariant non-informative prior is $\pi(\sigma) \propto \frac{1}{\sigma}$; usually one chooses $\pi(\sigma) = \frac{1}{\sigma}$ (improper).

The Jeffreys prior is $\pi(\theta) \propto I(\theta)^{1/2}$, provided the Fisher information $I(\theta)$ exists; it is invariant under reparametrization, and may or may not be improper.

Example continued. Suppose $y_1, y_2, \ldots, y_n$ are independent, $y_i \sim \mathcal{N}(\beta x_i, 1)$, where $\beta$ is an unknown parameter and $x_1, x_2, \ldots, x_n$ are known. Then
$$I(\beta) = -E\left( \frac{\partial^2}{\partial \beta^2} \log L(\beta; y) \right) = -E\left( \frac{\partial}{\partial \beta} \sum (y_i - \beta x_i) x_i \right) = s_{xx},$$
and so the Jeffreys prior is $\propto \sqrt{s_{xx}}$: constant, and improper. With this Jeffreys prior, the calculation of the posterior distribution is equivalent to putting $\alpha^2 = \infty$ in the previous calculation, yielding
$$\pi(\beta \mid y) = \mathcal{N}\left( \frac{s_{xy}}{s_{xx}},\ (s_{xx})^{-1} \right).$$

If $\Theta$ is discrete, with constraints $E_\pi g_k(\theta) = \mu_k$, $k = 1, \ldots, m$, choose the distribution with the maximum entropy under these constraints:
$$\pi(\theta_i) = \frac{\exp\left( \sum_{k=1}^m \lambda_k g_k(\theta_i) \right)}{\sum_j \exp\left( \sum_{k=1}^m \lambda_k g_k(\theta_j) \right)},$$
where the $\lambda_k$ are determined by the constraints. There is a similar formula for the continuous case, maximizing the entropy relative to a particular reference distribution $\pi_0$ under constraints; for $\pi_0$ one would choose the natural invariant non-informative prior. For inference, check the influence of the choice of prior, for example via an $\epsilon$-contamination class of priors.
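Numerically, the $\lambda_k$ can be found by root-finding on the constraints; here is a sketch with a single moment constraint on a made-up four-point parameter space:

```python
import numpy as np
from scipy.optimize import brentq

theta = np.arange(4)             # discrete parameter space {0, 1, 2, 3}
target_mean = 1.2                # constraint: E_pi[theta] = 1.2

def maxent_pi(lam):
    # pi(theta_i) proportional to exp(lambda * g(theta_i)) with g(theta) = theta
    w = np.exp(lam * theta)
    return w / w.sum()

lam = brentq(lambda l: maxent_pi(l) @ theta - target_mean, -10.0, 10.0)
print(maxent_pi(lam), maxent_pi(lam) @ theta)   # mean equals 1.2
```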

2. Point Estimation

Under suitable regularity conditions, with random sampling and $n$ large, the posterior is approximately $\mathcal{N}(\hat\theta, (n I_1(\hat\theta))^{-1})$, where $\hat\theta$ is the m.l.e., provided that the prior is non-zero in a region surrounding $\hat\theta$. In particular, if $\theta_0$ is the true parameter, then the posterior becomes more and more concentrated around $\theta_0$. Often reporting the posterior distribution is preferred to point estimation.

Bayes estimators are constructed to minimize the posterior expected loss. Recall from Decision Theory: $\Theta$ is the set of all possible states of nature (values of the parameter), $D$ is the set of all possible decisions (actions); a loss function is any function $L: \Theta \times D \to [0, \infty)$. For point estimation, $D = \Theta$, and $L(\theta, d)$ is the loss in reporting $d$ when $\theta$ is true.

Bayesian approach: for a prior $\pi$ and data $x \in X$, the posterior expected loss of a decision $d$ is
$$\rho(\pi, d \mid x) = \int_\Theta L(\theta, d)\, \pi(\theta \mid x)\, d\theta.$$
For a prior $\pi$, the integrated risk of a decision rule $\delta$ is
$$r(\pi, \delta) = \int_\Theta \int_X L(\theta, \delta(x))\, f(x \mid \theta)\, dx\, \pi(\theta)\, d\theta.$$
An estimator minimizing $r(\pi, \delta)$ can be obtained by selecting, for every $x \in X$, the value $\delta(x)$ that minimizes the posterior expected loss $\rho(\pi, d \mid x)$ over $d$. A Bayes estimator associated with the prior $\pi$ and loss $L$ is any estimator $\delta^\pi$ which minimizes $r(\pi, \delta)$. Then $r(\pi) = r(\pi, \delta^\pi)$ is the Bayes risk.

This is valid for proper priors, and for improper priors if $r(\pi) < \infty$. If $r(\pi) = \infty$, define a generalized Bayes estimator as the minimizer, for every $x$, of $\rho(\pi, d \mid x)$.

For strictly convex loss functions, Bayes estimators are unique. For squared error loss $L(\theta, d) = (\theta - d)^2$, the Bayes estimator $\delta^\pi$ associated with the prior $\pi$ is the posterior mean. For absolute error loss $L(\theta, d) = |\theta - d|$, the posterior median is a Bayes estimator.
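Given a sample from the posterior, the two Bayes estimators are one-liners (a sketch; the Gamma "posterior" here is just a skewed stand-in):

```python
import numpy as np

rng = np.random.default_rng(5)
post = rng.gamma(shape=2.0, scale=1.5, size=100_000)   # stand-in posterior draws

bayes_sq = post.mean()          # Bayes estimate under squared error loss
bayes_abs = np.median(post)     # Bayes estimate under absolute error loss
print(bayes_sq, bayes_abs)      # mean 3.0 vs median about 2.5: they differ when skewed
```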

Example. Let $x$ be a single observation from $\mathcal{N}(\mu, 1)$, and let $\mu$ have prior distribution $\pi(\mu) \propto e^\mu$ on the whole real line. What is the generalized Bayes estimator for $\mu$ under squared error loss?

First we find the posterior distribution of $\mu$ given $x$:
$$\pi(\mu \mid x) \propto \exp\left( \mu - \frac{1}{2} (\mu - x)^2 \right) \propto \exp\left( -\frac{1}{2} \left( \mu - (x + 1) \right)^2 \right),$$
and so $\pi(\mu \mid x) = \mathcal{N}(x + 1, 1)$. The generalized Bayes estimator under squared error loss is the posterior mean, so $\mu^B = x + 1$.

Recall that we have seen more in Decision Theory: least favourable distributions.

3. Hypothesis Testing

$H_0: \theta \in \Theta_0$; $D = \{\text{accept } H_0, \text{reject } H_0\} = \{1, 0\}$, where 1 stands for acceptance. Loss function:
$$L(\theta, \phi) = \begin{cases} 0 & \text{if } \theta \in \Theta_0,\ \phi = 1, \\ a_0 & \text{if } \theta \in \Theta_0,\ \phi = 0, \\ 0 & \text{if } \theta \notin \Theta_0,\ \phi = 0, \\ a_1 & \text{if } \theta \notin \Theta_0,\ \phi = 1. \end{cases}$$
Under this loss function, the Bayes decision rule associated with a prior distribution $\pi$ is
$$\phi^\pi(x) = \begin{cases} 1 & \text{if } P^\pi(\theta \in \Theta_0 \mid x) > \frac{a_1}{a_0 + a_1}, \\ 0 & \text{otherwise.} \end{cases}$$

The Bayes factor for testing $H_0: \theta \in \Theta_0$ against $H_1: \theta \in \Theta_1$ is
$$B^\pi(x) = \frac{P^\pi(\theta \in \Theta_0 \mid x) / P^\pi(\theta \in \Theta_1 \mid x)}{P^\pi(\theta \in \Theta_0) / P^\pi(\theta \in \Theta_1)} = \frac{p(x \mid \theta \in \Theta_0)}{p(x \mid \theta \in \Theta_1)}.$$
It is the ratio of how likely the data are under $H_0$ to how likely the data are under $H_1$, and measures the extent to which the data $x$ change the odds of $\Theta_0$ relative to $\Theta_1$. Note that we have to make sure that our prior distribution puts mass on $H_0$ (and on $H_1$). If $H_0$ is simple, this is usually achieved by choosing a prior that has some point mass on $H_0$ and otherwise lives on $H_1$.

Example. Suppose that $x_1, \ldots, x_n$ is a random sample from a Poisson distribution with unknown mean $\theta$; possible models for the prior distribution are $\pi_1(\theta) = e^{-\theta}$, $\theta > 0$, and $\pi_2(\theta) = \theta e^{-\theta}$, $\theta > 0$. The prior probabilities of model 1 and model 2 are assessed at 1/2 each. Then the Bayes factor is
$$B(x) = \frac{\int_0^\infty e^{-n\theta} \theta^{\sum x_i} e^{-\theta}\, d\theta}{\int_0^\infty e^{-n\theta} \theta^{\sum x_i} e^{-\theta}\, \theta\, d\theta} = \frac{\Gamma\left(\sum x_i + 1\right) \big/ (n+1)^{\sum x_i + 1}}{\Gamma\left(\sum x_i + 2\right) \big/ (n+1)^{\sum x_i + 2}} = \frac{n + 1}{\sum x_i + 1}.$$

Note that in this setting
$$B(x) = \frac{P(\text{model 1} \mid x)}{P(\text{model 2} \mid x)} = \frac{P(\text{model 1} \mid x)}{1 - P(\text{model 1} \mid x)},$$
so that $P(\text{model 1} \mid x) = \left( 1 + B(x)^{-1} \right)^{-1}$. Hence
$$P(\text{model 1} \mid x) = \left( 1 + \frac{\sum x_i + 1}{n + 1} \right)^{-1} = \left( 1 + \frac{\bar{x} + \frac{1}{n}}{1 + \frac{1}{n}} \right)^{-1},$$
which is decreasing as $\bar{x}$ increases.
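The Bayes factor and posterior model probability for this example, in code (a sketch; the data vector is made up):

```python
import numpy as np

def bayes_factor(x):
    # B(x) = (n + 1) / (sum(x) + 1) for priors pi1 = e^{-theta}, pi2 = theta e^{-theta}
    n = len(x)
    return (n + 1) / (np.sum(x) + 1)

x = np.array([2, 0, 1, 3, 1])
B = bayes_factor(x)
p_model1 = 1 / (1 + 1 / B)   # posterior probability of model 1
print(B, p_model1)           # B = 6/8 = 0.75, P(model 1 | x) about 0.43
```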

For robustness: least favourable Bayesian answers. Suppose $H_0: \theta = \theta_0$, $H_1: \theta \neq \theta_0$, and the prior probability on $H_0$ is $\rho_0 = 1/2$. For a family $G$ of priors on $H_1$, put
$$\underline{B}(x, G) = \inf_{g \in G} \frac{f(x \mid \theta_0)}{\int_\Theta f(x \mid \theta)\, g(\theta)\, d\theta}.$$
Any prior $g \in G$ on $H_1$ will then give posterior probability at least
$$\underline{P}(x, G) = \left( 1 + \underline{B}(x, G)^{-1} \right)^{-1}$$
on $H_0$ (for $\rho_0 = 1/2$). If $\hat\theta$ is the m.l.e. of $\theta$ and $G_A$ is the set of all prior distributions, then
$$\underline{B}(x, G_A) = \frac{f(x \mid \theta_0)}{f(x \mid \hat\theta(x))} \qquad \text{and} \qquad \underline{P}(x, G_A) = \left( 1 + \frac{f(x \mid \hat\theta(x))}{f(x \mid \theta_0)} \right)^{-1}.$$

4. Credible intervals

A $(1 - \alpha)$ (posterior) credible interval (region) is an interval (region) of $\theta$ values within which $1 - \alpha$ of the posterior probability lies. Often we would like to find the HPD (highest posterior density) region: a $(1 - \alpha)$ credible region that has minimal volume. When the posterior density is unimodal, this is often straightforward.

Example continued. Suppose $y_1, y_2, \ldots, y_n$ are independent normally distributed random variables, each with variance 1 and with means $\beta x_1, \ldots, \beta x_n$, where $\beta$ is an unknown real-valued parameter and $x_1, x_2, \ldots, x_n$ are known constants. Under the Jeffreys prior the posterior is $\mathcal{N}\left( \frac{s_{xy}}{s_{xx}}, (s_{xx})^{-1} \right)$, so a 95% credible interval for $\beta$ is
$$\frac{s_{xy}}{s_{xx}} \pm 1.96\, \frac{1}{\sqrt{s_{xx}}}.$$
Bayesian interpretation: conditional on the observed $x$, the randomness relates to the distribution of $\theta$. In contrast, a frequentist confidence interval applies before $x$ is observed; the randomness relates to the distribution of $x$.
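Continuing in code (a sketch with simulated data, mirroring the earlier posterior snippet):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
x = rng.uniform(0, 1, size=50)
y = 2.0 * x + rng.normal(size=50)          # y_i ~ N(beta * x_i, 1) with beta = 2

sxx, sxy = np.sum(x**2), np.sum(x * y)
z = stats.norm.ppf(0.975)                  # 1.96
half = z / np.sqrt(sxx)
print(sxy / sxx - half, sxy / sxx + half)  # 95% credible interval for beta
```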

5. Not to forget about: Nuisance parameters

Write $\theta = (\psi, \lambda)$, where $\lambda$ is a nuisance parameter, and $\pi(\theta \mid x) = \pi((\psi, \lambda) \mid x)$. The marginal posterior of $\psi$ is
$$\pi(\psi \mid x) = \int \pi(\psi, \lambda \mid x)\, d\lambda.$$
Just integrate out the nuisance parameter.
