Sufficiency
January 18, 2016
Debdeep Pati

1 Probability Model

Model: A family of distributions $\{P_\theta : \theta \in \Theta\}$. $P_\theta(B)$ is the probability of the event $B$ when the parameter takes the value $\theta$. $P_\theta$ is described by giving a joint pdf or pmf $f(x \mid \theta)$.

Experiment: Observe $X$ (data) $\sim P_\theta$, $\theta$ unknown.

Goal: Make inference about $\theta$.

Joint distribution of independent rv's: If $X = (X_1, \ldots, X_n)$ and $X_1, \ldots, X_n$ are independent with $X_i \sim g_i(x_i \mid \theta)$, then the joint pdf is $f(x \mid \theta) = \prod_{i=1}^n g_i(x_i \mid \theta)$, where $x = (x_1, \ldots, x_n)$. For iid random variables, $g_1 = \cdots = g_n = g$.

1.1 Types of models to be discussed in the course

Let $X = (X_1, \ldots, X_n)$.

1. Random Sample: $X_1, \ldots, X_n$ are iid.
2. Regression Model: $X_1, \ldots, X_n$ are independent (but not necessarily identically distributed; the distribution of $X_i$ may depend on covariates $z_i$).

1.2 Random Sample Models

Example: Let $X_1, X_2, \ldots, X_n$ be iid Poisson($\lambda$), $\lambda$ unknown. Here we have: $X = (X_1, X_2, \ldots, X_n)$, $\theta = \lambda$, $\Theta = \{\lambda : \lambda > 0\}$, and $P_\theta$ is described by the joint pmf
$$f(x \mid \lambda) = f(x_1, \ldots, x_n \mid \lambda) = \prod_{i=1}^n g(x_i \mid \lambda)$$
where $g$ is the Poisson($\lambda$) pmf $g(x \mid \lambda) = \lambda^x e^{-\lambda}/x!$ for $x = 0, 1, 2, \ldots$. Hence
$$f(x \mid \lambda) = \frac{\lambda^{\sum_i x_i} e^{-n\lambda}}{\prod_i x_i!} \quad \text{for } x \in \{0, 1, 2, \ldots\}^n.$$

Example: Let $X_1, X_2, \ldots, X_n$ be iid $N(\mu, \sigma^2)$, with $\mu$ and $\sigma^2$ unknown. Here we have: $X =$
$(X_1, X_2, \ldots, X_n)$, $\theta = (\mu, \sigma^2)$, $\Theta = \{(\mu, \sigma^2) : -\infty < \mu < \infty, \sigma^2 > 0\}$, and $P_\theta$ is described by the joint pdf
$$f(x \mid \mu, \sigma^2) = \prod_{i=1}^n g(x_i \mid \mu, \sigma^2)$$
where $g$ is the $N(\mu, \sigma^2)$ pdf $g(x \mid \mu, \sigma^2) = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-(x-\mu)^2/(2\sigma^2)}$. Hence
$$f(x \mid \mu, \sigma^2) = \prod_{i=1}^n \frac{1}{\sqrt{2\pi}\,\sigma} e^{-(x_i-\mu)^2/(2\sigma^2)}.$$

2 Sufficient Statistic

Let $X \sim P_\theta$, $\theta$ unknown. What part (or function) of the data $X$ is essential for inference about $\theta$?

Example: Suppose $X_1, \ldots, X_n$ are iid Bernoulli($p$) (independent tosses of a coin). Intuitively, $T = \sum_i X_i$ = # of heads contains all the information about $p$ in the data. We need to formalize this.

Let $X \sim P_\theta$, $\theta$ unknown.

Definition 1. The statistic $T = T(X)$ is a sufficient statistic for $\theta$ if the conditional distribution of $X$ given $T$ does not depend on the unknown parameter $\theta$.

Abbreviation: $T$ is SS if $\mathcal{L}(X \mid T)$ is the same for all $\theta$, where $\mathcal{L}$ stands for "law" (or distribution).

2.1 Motivation for the definition

Suppose $X \sim P_\theta$, $\theta \in \Theta$, $\theta$ unknown. Let $T = T(X)$ be any statistic. We can imagine that the data $X$ is generated hierarchically as follows:

1. First generate $T \sim \mathcal{L}(T)$.
2. Then generate $X \sim \mathcal{L}(X \mid T)$.

If $T$ is a sufficient statistic for $\theta$, then $\mathcal{L}(X \mid T)$ does not depend on $\theta$ and Step 2 can be carried out without knowing $\theta$. Since, given $T$, the data $X$ can be generated without knowing $\theta$, the data $X$ supplies no further information about $\theta$ beyond what is already contained in $T$.
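The two-step hierarchical description can be illustrated with the coin-toss example just mentioned. The Python sketch below (standard library only; the sample size, success probability, and replication count are arbitrary choices) takes for granted a fact derived later in these notes: given $T = t$, every 0-1 sequence with $t$ ones is equally likely, so Step 2 needs no knowledge of $p$.

```python
import random
from collections import Counter

random.seed(0)
n, p, reps = 3, 0.3, 60_000  # illustrative choices

def direct(n, p):
    """Generate X = (X_1, ..., X_n) directly as iid Bernoulli(p)."""
    return tuple(int(random.random() < p) for _ in range(n))

def hierarchical(n, p):
    """Two-step generation: first T ~ L(T) = Binomial(n, p) (uses p),
    then X | T = t uniform over 0-1 sequences with t ones (p-free)."""
    t = sum(int(random.random() < p) for _ in range(n))  # Step 1
    seq = [1] * t + [0] * (n - t)
    random.shuffle(seq)                                  # Step 2: no p needed
    return tuple(seq)

a = Counter(direct(n, p) for _ in range(reps))
b = Counter(hierarchical(n, p) for _ in range(reps))
# Both schemes give (up to Monte Carlo error) the same distribution
# over datasets in {0,1}^n.
for seq in a.keys() | b.keys():
    assert abs(a[seq] - b[seq]) / reps < 0.015
```

Only Step 1 of the hierarchical scheme touches $p$; Step 2 is exactly the "fake data" mechanism described below.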
Notation: $X \sim P_\theta$, $\theta \in \Theta$, $\theta$ unknown. If $T = T(X)$ is a sufficient statistic for $\theta$, then $T$ contains all the information about $\theta$ in $X$ in the following sense: if $X$ is discarded but we keep $T = T(X)$, we can "fake" the data (without knowing $\theta$) by generating $X^* \sim \mathcal{L}(X \mid T)$. Then $X^*$ has the same distribution as $X$ ($X^* \sim P_\theta$) and the same value of the sufficient statistic ($T(X^*) = T(X)$), and $X^*$ can be used for any purpose we would use the real data for.

Example: If $U(X)$ is an estimator of $\theta$, then $U(X^*)$ is another estimator of $\theta$ which performs just as well, since $U(X) \stackrel{d}{=} U(X^*)$ for all $\theta$.

Cautionary Note: If the model is correct ($X \sim P_\theta$) and $T(X)$ is sufficient for $\theta$, then we can ignore the data $X$ and just use $T(X)$ for inference about $\theta$. BUT if we are not sure that the model is correct, $X$ may contain valuable information about model correctness not contained in $T(X)$.

Example: $X_1, X_2, \ldots, X_n$ iid Bernoulli($p$). $T = \sum_{i=1}^n X_i$ is a sufficient statistic for $p$. Possible model violations:

- The trials might be correlated (not independent).
- The success probability $p$ might not be constant from trial to trial.

These model violations cannot be investigated using the sufficient statistic alone; this can only be done by further investigation with the full data.

2.2 Examples of Sufficient Statistics

1. $X = (X_1, X_2)$ iid Poisson($\lambda$). $T = X_1 + X_2$ is a sufficient statistic for $\lambda$ because
$$P_\lambda(X_1 = x_1, X_2 = x_2 \mid T = t) = \frac{P_\lambda(X_1 = x_1, X_2 = x_2, \overbrace{T = t}^{\text{redundant if } t = x_1 + x_2})}{P_\lambda(T = t)} = \begin{cases} \frac{P_\lambda(X_1 = x_1, X_2 = x_2)}{P_\lambda(T = t)} & \text{if } t = x_1 + x_2 \\ 0 & \text{if } t \neq x_1 + x_2. \end{cases}$$
This follows from the fact that for discrete distributions $P_\theta$,
$$P_\theta(X = x \mid T(X) = t) = \begin{cases} \frac{P_\theta(X = x)}{P_\theta(T(X) = t)} & \text{if } T(x) = t \\ 0 & \text{otherwise.} \end{cases}$$
Assuming $t = x_1 + x_2$,
$$P_\lambda(X_1 = x_1, X_2 = x_2 \mid T = t) = \frac{\frac{\lambda^{x_1} e^{-\lambda}}{x_1!} \cdot \frac{\lambda^{x_2} e^{-\lambda}}{x_2!}}{\frac{(2\lambda)^t e^{-2\lambda}}{t!}} \quad (\text{since } T \sim \text{Poisson}(2\lambda)) \quad = \binom{t}{x_1} \Big(\frac{1}{2}\Big)^t,$$
which does not involve $\lambda$. Thus, $T$ is a sufficient statistic for $\lambda$. Note that
$$P(X_1 = x_1 \mid T = t) = \binom{t}{x_1} \Big(\frac{1}{2}\Big)^{x_1} \Big(\frac{1}{2}\Big)^{t - x_1}, \quad x_1 = 0, 1, \ldots, t.$$
Thus $\mathcal{L}(X_1 \mid T = t)$ is Binomial($t, 1/2$). Given $T = t$, we may generate fake data $X_1^*, X_2^*$ without knowing $\lambda$ which has the same distribution as the real data:

(a) Generate $X_1^* \sim$ Binomial($t, 1/2$). (Toss a fair coin $t$ times and count the number of heads.)
(b) Set $X_2^* = t - X_1^*$.

The real and fake data have the same value of the sufficient statistic: $X_1 + X_2 = t = X_1^* + X_2^*$.

2. Extension of the previous example: If $X = (X_1, X_2, \ldots, X_n)$ are iid Poisson($\lambda$), then $T = X_1 + X_2 + \cdots + X_n$ is a sufficient statistic for $\lambda$. Moreover,
$$P(X_1 = x_1, \ldots, X_n = x_n \mid T = t) = \frac{t!}{x_1! x_2! \cdots x_n!} \Big(\frac{1}{n}\Big)^t = \binom{t}{x_1, \ldots, x_n} \Big(\frac{1}{n}\Big)^{x_1} \cdots \Big(\frac{1}{n}\Big)^{x_n},$$
so that $\mathcal{L}(X \mid T = t)$ is Multinomial with $t$ trials and $n$ categories of equal probability $1/n$ (see Section 4.6).

3. $X = (X_1, X_2)$ iid Expo($\beta$). Then $T = X_1 + X_2$ is a sufficient statistic for $\beta$. To derive this, we need to calculate $\mathcal{L}(X_1, X_2 \mid T = t)$. It suffices to get $\mathcal{L}(X_1 \mid T = t)$ since $X_2 = t - X_1$. How to do this?

(a) Find the joint density $f_{X_1, T}(x_1, t)$.
(b) Then get the conditional density
$$f_{X_1 \mid T}(x_1 \mid t) = \frac{f_{X_1, T}(x_1, t)}{f_T(t)}.$$

Continuing with the steps,

(a) Use the transformation
$$U = X_1, \quad T = X_1 + X_2; \qquad X_1 = U, \quad X_2 = T - U$$
with Jacobian $|J| = 1$. Then
$$f_{U,T}(u, t) = f_{X_1, X_2}(u, t - u)\,|J| = \frac{1}{\beta} e^{-u/\beta} \cdot \frac{1}{\beta} e^{-(t-u)/\beta} \cdot 1 = \frac{1}{\beta^2} e^{-t/\beta}, \quad \text{for } 0 \leq u \leq t < \infty.$$

(b) $T = X_1 + X_2 \sim$ Gamma($2, \beta$), so that
$$f_T(t) = \frac{t e^{-t/\beta}}{\beta^2}, \quad t \geq 0.$$
Alternatively, integrate over $x_1$ in the joint density $f_{X_1, T}(x_1, t)$ to get $f_T(t)$. Now
$$f_{X_1 \mid T}(x_1 \mid t) = \frac{\frac{1}{\beta^2} e^{-t/\beta} I(0 \leq x_1 \leq t)}{\frac{t e^{-t/\beta}}{\beta^2}} = \frac{1}{t} I(0 \leq x_1 \leq t),$$
which does not involve $\beta$. Thus $T = X_1 + X_2$ is a sufficient statistic for $\beta$. Moreover, $\mathcal{L}(X_1 \mid T = t)$ is Unif($0, t$). This can also be seen intuitively by noting that
$$f_{X_1, X_2}(x_1, x_2) = \frac{1}{\beta^2} e^{-(x_1 + x_2)/\beta}$$
is constant on the line segment
$$\{(x_1, x_2) : x_1 \geq 0, x_2 \geq 0, x_1 + x_2 = t\}.$$
Thus, given $T = t$, we may generate fake data $X_1^*, X_2^*$ without knowing $\beta$ which has the same distribution as the real data:

(a) Generate $X_1^* \sim$ Unif($0, t$).
(b) Set $X_2^* = t - X_1^*$.

The real and fake data have the same value of the sufficient statistic: $X_1 + X_2 = t = X_1^* + X_2^*$.

4. Extension of the previous example: If $X = (X_1, X_2, \ldots, X_n)$ are iid Expo($\beta$), then $T = X_1 + X_2 + \cdots + X_n$ is a sufficient statistic for $\beta$, and $\mathcal{L}(X \mid T = t)$ is the uniform distribution on the simplex $\{(x_1, \ldots, x_n) : x_1 + \cdots + x_n = t, \; x_i \geq 0 \;\forall i\}$.
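The exponential derivation can be spot-checked numerically. The Python sketch below (standard library only; the particular values of $t$, $\beta$, and $x_1$ are arbitrary choices) evaluates the joint and marginal densities derived above and confirms that their ratio is the Unif($0, t$) density $1/t$, whatever $\beta$ is.

```python
import math

def joint_x1_t(x1, t, beta):
    """f_{X1,T}(x1, t) = (1/beta^2) e^{-t/beta} on 0 <= x1 <= t, else 0."""
    if not 0.0 <= x1 <= t:
        return 0.0
    return math.exp(-t / beta) / beta ** 2

def marginal_t(t, beta):
    """T = X1 + X2 ~ Gamma(2, beta): f_T(t) = t e^{-t/beta} / beta^2."""
    return t * math.exp(-t / beta) / beta ** 2

t = 3.0
for beta in (0.5, 1.0, 4.0):          # the conditional must not depend on beta
    for x1 in (0.1, 1.5, 2.9):
        cond = joint_x1_t(x1, t, beta) / marginal_t(t, beta)
        assert abs(cond - 1.0 / t) < 1e-12   # Unif(0, t) density, free of beta
```

The $\beta$-dependent factor $e^{-t/\beta}/\beta^2$ cancels exactly in the ratio, which is the algebraic content of sufficiency here.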
5. $X = (X_1, X_2)$ iid Unif($0, \theta$). Then $T = X_1 + X_2$ is not a sufficient statistic for $\theta$.

Proof. We must show that $\mathcal{L}(X_1, X_2 \mid T)$ depends on $\theta$. The support of $(X_1, X_2)$ is $[0, \theta]^2$. Given $T = t$, we know $(X_1, X_2)$ lies on the line $L = \{(x_1, x_2) : x_1 + x_2 = t\}$. Thus, the support of $\mathcal{L}(X_1, X_2 \mid T)$ is $L \cap [0, \theta]^2$, which differs for different values of $\theta$. Since the support of $\mathcal{L}(X_1, X_2 \mid T = t)$ varies with $\theta$, $\mathcal{L}(X_1, X_2 \mid T)$ depends on $\theta$.

6. If $X_1, \ldots, X_n$ are iid Bernoulli($p$), then $T = \sum_{i=1}^n X_i$ is a sufficient statistic for $p$.

First: What is the joint pmf of $X_1, \ldots, X_n$? Note that
$$P(X_1 = 1, X_2 = 0, X_3 = 1, X_4 = 1, X_5 = 0) = p \cdot q \cdot p \cdot p \cdot q = p^3 q^2,$$
where $q = 1 - p$. In general,
$$P(X = x) = P(X_1 = x_1, \ldots, X_n = x_n) = \prod_i p^{x_i} q^{1 - x_i} = p^{\sum_i x_i} q^{n - \sum_i x_i} = p^t q^{n-t} = p^{T(x)} q^{n - T(x)},$$
where $T(x) = t = \sum_{i=1}^n x_i$.

Next, we derive $\mathcal{L}(X \mid T)$. We will use the notation $T(X) = \sum_{i=1}^n X_i = T$ and $T(x) = \sum_{i=1}^n x_i$. Recall that for discrete distributions $P_\theta$,
$$P_\theta(X = x \mid T(X) = t) = \begin{cases} \frac{P_\theta(X = x)}{P_\theta(T(X) = t)} & \text{if } T(x) = t \\ 0 & \text{otherwise.} \end{cases}$$
Assume $T(x) = \sum_i x_i = t$ and $\theta = p$. Then
$$P_\theta(X = x \mid T(X) = t) = \frac{P_\theta(X = x)}{P_\theta(T(X) = t)} = \frac{p^t q^{n-t}}{\binom{n}{t} p^t q^{n-t}} = \frac{1}{\binom{n}{t}},$$
since $T \sim$ Binomial($n, p$). This does not involve $p$, which proves that $T$ is a sufficient statistic for $p$.

Note: The conditional probability is the same for any sequence $x = (x_1, \ldots, x_n)$ with $t$ 1s and $n - t$ 0s. There are $\binom{n}{t}$ such sequences.

Summary: Given $T = X_1 + \cdots + X_n = t$, all possible sequences of $t$ 1s and $n - t$ 0s are equally likely.

Algorithm for generating from $\mathcal{L}(X_1, \ldots, X_n \mid T = t)$:

(a) Put $t$ 1s and $n - t$ 0s in an urn.
(b) Draw them out one by one (without replacement) until the urn is empty.

This makes all possible sequences equally likely. (Think about it!) The resulting sequence $(X_1^*, \ldots, X_n^*)$ (the fake data) has the same value of the sufficient statistic as $(X_1, \ldots, X_n)$: $\sum_i X_i^* = t = \sum_i X_i$.

2.3 Sufficient conditions for sufficiency

Sometimes finding a sufficient statistic can be time-consuming and cumbersome if one proceeds directly from the definition. We need an easily verifiable sufficient condition for finding a sufficient statistic. Suppose $X \sim P_\theta$, $\theta \in \Theta$.

Theorem 6.2.2. $T(X)$ is a sufficient statistic for $\theta$ iff for all $x$,
$$\frac{f_X(x \mid \theta)}{f_T(T(x) \mid \theta)}$$
is constant as a function of $\theta$.

Notation: $f_X(x \mid \theta)$ is the pdf (or pmf) of $X$; $f_T(t \mid \theta)$ is the pdf (or pmf) of $T = T(X)$.

Factorization Criterion (FC): There exist functions $h(x)$ and $g(t \mid \theta)$ such that
$$f(x \mid \theta) = g(T(x) \mid \theta)\, h(x)$$
for all $x$ and $\theta$.

Theorem 1. $T(X)$ is a sufficient statistic for $\theta$ iff the factorization criterion is satisfied.

Proof. (When $X$ is discrete.) Notation: $T = T(X)$, $t = T(x)$.
First, assume $T$ is a sufficient statistic for $\theta$. Then the pmf $f(x \mid \theta)$ can be written as
$$f(x \mid \theta) = \underbrace{P_\theta(T = t)}_{\substack{\text{a function of } t \text{ and } \theta; \\ \text{call it } g(t \mid \theta)}} \; \underbrace{P_\theta(X = x \mid T = t)}_{\substack{\text{depends on } x \text{ but not } \theta \\ \text{(by defn. of suff. stat.); call it } h(x)}} = g(t \mid \theta)\, h(x).$$
Hence FC is true.

Next, assume FC is true. Then
$$P_\theta(X = x \mid T = t) = \frac{P_\theta(X = x)}{P_\theta(T = t)} \; (\text{since } \{X = x\} \subset \{T = t\}) = \frac{f(x \mid \theta)}{\sum_{z : T(z) = t} f(z \mid \theta)} = \frac{g(t \mid \theta) h(x)}{\sum_{z : T(z) = t} g(t \mid \theta) h(z)} = \frac{h(x)}{\sum_{z : T(z) = t} h(z)},$$
which does not involve $\theta$.

2.4 Applications of FC

1. Let $X = (X_1, \ldots, X_n)$ iid Poisson($\lambda$). The joint pmf is
$$f(x \mid \lambda) = f(x_1, \ldots, x_n \mid \lambda) = \prod_i \frac{\lambda^{x_i} e^{-\lambda}}{x_i!} = \frac{\lambda^{\sum_i x_i} e^{-n\lambda}}{\prod_i x_i!} = \Big(\lambda^{\sum_i x_i} e^{-n\lambda}\Big) \Big(\frac{1}{\prod_i x_i!}\Big) = g(T(x) \mid \lambda)\, h(x),$$
where
$$T(x) = \sum_i x_i, \qquad g(t \mid \lambda) = \lambda^t e^{-n\lambda}, \qquad h(x) = \frac{1}{\prod_i x_i!}.$$
Thus, by FC, $T(X) = \sum_i X_i$ is a sufficient statistic for $\lambda$.

2. Simple Linear Regression: Let
$$X_i = \beta_0 + \beta_1 z_i + \epsilon_i, \quad \epsilon_i \text{ iid } N(0, \sigma_0^2), \quad i = 1, \ldots, n,$$
where $z_i$, $i = 1, \ldots, n$ are known constants. Alternative statement of the model: $X_1, X_2, \ldots, X_n$ independent with $X_i \sim N(\beta_0 + \beta_1 z_i, \sigma_0^2)$.
The data is $X = (X_1, X_2, \ldots, X_n)$; $(z_1, z_2, \ldots, z_n)$ are known constants. The unknown parameter is $\theta = (\beta_0, \beta_1) \in \mathbb{R}^2$. What are the sufficient statistics for this model? Use FC.
$$f(x \mid \theta) = \prod_{i=1}^n \underbrace{\frac{1}{\sqrt{2\pi}\,\sigma_0} e^{-(x_i - \beta_0 - \beta_1 z_i)^2 / 2\sigma_0^2}}_{N(\beta_0 + \beta_1 z_i,\, \sigma_0^2) \text{ density}} = \Big(\frac{1}{\sqrt{2\pi}\,\sigma_0}\Big)^n \exp\Big\{-\frac{1}{2\sigma_0^2} \underbrace{\sum_i (x_i - \beta_0 - \beta_1 z_i)^2}_{S}\Big\}.$$
Here
$$S = \sum_i x_i^2 - 2\sum_i x_i(\beta_0 + \beta_1 z_i) + \sum_i (\beta_0 + \beta_1 z_i)^2 = \sum_i x_i^2 - 2\beta_0 \sum_i x_i - 2\beta_1 \sum_i x_i z_i + \sum_i (\beta_0 + \beta_1 z_i)^2.$$
Plug this back into the exponential and rearrange to get
$$f(x \mid \theta) = \exp\Big\{-\frac{1}{2\sigma_0^2} \sum_i x_i^2\Big\} \Big(\frac{1}{\sqrt{2\pi}\,\sigma_0}\Big)^n \exp\Big\{-\frac{1}{2\sigma_0^2}\Big(-2\beta_0 \sum_i x_i - 2\beta_1 \sum_i x_i z_i + \sum_i (\beta_0 + \beta_1 z_i)^2\Big)\Big\} = g\Big(\sum_i x_i, \sum_i x_i z_i, \beta_0, \beta_1\Big) h(x) = g(T(x), \theta)\, h(x),$$
where $T(x) = (\sum_{i=1}^n x_i, \sum_{i=1}^n x_i z_i)$ and
$$g(t, \theta) = \Big(\frac{1}{\sqrt{2\pi}\,\sigma_0}\Big)^n \exp\Big\{-\frac{1}{2\sigma_0^2}\Big(-2\beta_0 t_1 - 2\beta_1 t_2 + \sum_i (\beta_0 + \beta_1 z_i)^2\Big)\Big\}$$
with $t = (t_1, t_2)$, and $h(x) = \exp\{-\frac{1}{2\sigma_0^2} \sum_{i=1}^n x_i^2\}$.

3. Continuation of the Simple Linear Regression Example: What if the variance $\sigma^2$ is unknown? Now $\theta = (\beta_0, \beta_1, \sigma^2)$ and $\Theta = \mathbb{R}^2 \times (0, \infty)$. (Change $\sigma_0^2$ to $\sigma^2$ in the earlier
formulas to indicate this.) Now $\exp\{-\frac{1}{2\sigma^2}\sum_i x_i^2\}$ is not a function of $x$ alone, but depends also on $\theta$. So we now factor the joint density as
$$f(x \mid \theta) = \Big(\frac{1}{\sqrt{2\pi}\,\sigma}\Big)^n \exp\Big\{-\frac{1}{2\sigma^2}\Big(\sum_i x_i^2 - 2\beta_0 \sum_i x_i - 2\beta_1 \sum_i x_i z_i + \sum_i (\beta_0 + \beta_1 z_i)^2\Big)\Big\} = g\Big(\sum_i x_i^2, \sum_i x_i, \sum_i x_i z_i, \beta_0, \beta_1, \sigma^2\Big) h(x) = g(T(x), \theta)\, h(x),$$
where
$$T(x) = \Big(\sum_i x_i^2, \sum_i x_i, \sum_i x_i z_i\Big) = (t_1, t_2, t_3), \qquad g(t, \theta) = (2\pi\sigma^2)^{-n/2} \exp\Big\{-\frac{1}{2\sigma^2}\Big(t_1 - 2\beta_0 t_2 - 2\beta_1 t_3 + \sum_i (\beta_0 + \beta_1 z_i)^2\Big)\Big\},$$
and $h(x) = 1$. According to FC, $T(X) = (\sum_i X_i^2, \sum_i X_i, \sum_i z_i X_i)$ is a sufficient statistic for $\theta = (\beta_0, \beta_1, \sigma^2)$.

4. Discussion of the preceding examples: We have described two models. The model with $\sigma^2$ known (i.e., $\sigma^2 = \sigma_0^2$) can be regarded as a subset of the model where $\sigma^2$ is unknown:
$$\Theta_1 = \{(\beta_0, \beta_1, \sigma^2) : \sigma^2 = \sigma_0^2\} = \mathbb{R}^2 \times \{\sigma_0^2\}, \qquad \Theta_2 = \{(\beta_0, \beta_1, \sigma^2) : \sigma^2 > 0\} = \mathbb{R}^2 \times (0, \infty), \qquad \Theta_1 \subset \Theta_2.$$
The sufficient statistics we found for these two models were different:

1. $T_1 = (\sum_i X_i, \sum_i z_i X_i)$ is a SS for $\Theta_1$.
2. $T_2 = (\sum_i X_i^2, \sum_i X_i, \sum_i z_i X_i)$ is a SS for $\Theta_2$.

Note: $T_2$ is also a SS for $\Theta_1$, but it is not minimal.

5. Sufficient statistics for random samples from various families of normal distributions: Let $X = (X_1, \ldots, X_n)$ where $X_1, \ldots, X_n$ are iid $N(\mu, \sigma^2)$. Consider different families of normal distributions:
$$\Theta_1 = \{(\mu, \sigma^2) : \sigma^2 > 0\} \quad \text{(all normal distributions)}$$
$$\Theta_2 = \{(\mu, \sigma^2) : \sigma^2 = \sigma_0^2\} \quad \text{(known variance)}$$
$$\Theta_3 = \{(\mu, \sigma^2) : \mu = \mu_0, \sigma^2 > 0\} \quad \text{(known mean)}$$
For each parameter space, the obvious sufficient statistic is different. In all cases, the joint pdf of $X$ is given by
$$f(x \mid \mu, \sigma^2) = \prod_i (2\pi\sigma^2)^{-1/2} \exp\Big\{-\frac{(x_i - \mu)^2}{2\sigma^2}\Big\} = (2\pi\sigma^2)^{-n/2} \exp\Big\{-\frac{1}{2\sigma^2} \sum_i (x_i - \mu)^2\Big\}. \tag{1}$$

$\Theta_3$: Here $\mu = \mu_0$ (a known value), so the unknown parameter is $\theta = \sigma^2$. The joint pdf may be factored as
$$f(x \mid \sigma^2) = (2\pi\sigma^2)^{-n/2} \exp\Big\{-\frac{1}{2\sigma^2} \sum_i (x_i - \mu_0)^2\Big\} = g\Big(\sum_i (x_i - \mu_0)^2, \sigma^2\Big) h(x) = g(T_3(x), \sigma^2)\, h(x),$$
where $T_3(x) = \sum_{i=1}^n (x_i - \mu_0)^2$, so that $T_3 = T_3(X) = \sum_i (X_i - \mu_0)^2$ is a SS for $\Theta_3$. Note: $T_3$ is not even a statistic if $\mu$ is unknown (i.e., not fixed).

For the rest ($\Theta_1$ and $\Theta_2$), we modify (1) by substituting
$$\sum_i (x_i - \mu)^2 = \sum_i (x_i - \bar{x})^2 + n(\bar{x} - \mu)^2, \quad \text{where } \bar{x} = n^{-1} \sum_{i=1}^n x_i.$$
(This is an identity valid for all $x_1, x_2, \ldots, x_n$ and $\mu$.) Substituting into (1) and breaking up the exponential yields
$$f(x \mid \mu, \sigma^2) = (2\pi\sigma^2)^{-n/2} \exp\Big\{-\frac{\sum_i (x_i - \bar{x})^2}{2\sigma^2}\Big\} \exp\Big\{-\frac{n(\bar{x} - \mu)^2}{2\sigma^2}\Big\}. \tag{2}$$

$\Theta_2$: Here $\sigma^2 = \sigma_0^2$ (a known value), so the unknown parameter is $\theta = \mu$. Factor the joint pdf (2) as
$$f(x \mid \mu) = \Big[(2\pi\sigma_0^2)^{-n/2} \exp\Big\{-\frac{1}{2\sigma_0^2} \sum_i (x_i - \bar{x})^2\Big\}\Big] \Big[\exp\Big\{-\frac{n(\bar{x} - \mu)^2}{2\sigma_0^2}\Big\}\Big] = h(x)\, g(\bar{x}, \mu) = h(x)\, g(T_2(x), \mu),$$
where $T_2(x) = \bar{x}$. This shows that $T_2 = T_2(X) = \bar{X}$ is a SS for $\Theta_2$.

$\Theta_1$: Here both $\mu$ and $\sigma^2$ are unknown, so $\theta = (\mu, \sigma^2)$. It is clear that (2) may be written as
$$f(x \mid \mu, \sigma^2) = g\Big(\bar{x}, \sum_i (x_i - \bar{x})^2, \mu, \sigma^2\Big) \cdot 1 = g(T_1(x), \theta)\, h(x),$$
where $T_1(x) = (\bar{x}, \sum_i (x_i - \bar{x})^2)$, so that $T_1 = T_1(X) = (\bar{X}, \sum_i (X_i - \bar{X})^2)$ is a SS for $\Theta_1$.

Note: $T_1$ is also a SS for $\Theta_2$ and $\Theta_3$; neither $T_2$ nor $T_3$ is a SS for $\Theta_1$.

2.5 General Facts about SS

1. If $T = T(X)$ is a SS for $\theta \in \Theta_A$, and $\Theta_B \subset \Theta_A$, then $T$ is a SS for $\theta \in \Theta_B$.
Proof. If $\mathcal{L}(X \mid T)$ is the same for all $\theta \in \Theta_A$, then it is the same for all $\theta \in \Theta_B$.

2. If $T$ is a SS (for $\theta \in \Theta$) and $T = \phi(U)$ where $U = U(X)$, then $U$ is also a SS (for $\theta \in \Theta$).
Proof. (Using FC.) Since $T$ is a SS,
$$f(x \mid \theta) = g(T(x) \mid \theta)\, h(x) = g(\phi(U(x)) \mid \theta)\, h(x) = g^*(U(x) \mid \theta)\, h(x),$$
where $g^*(u \mid \theta) = g(\phi(u) \mid \theta)$. Hence $U(X)$ is a SS.

3. If $T = T(X)$ is a sufficient statistic (for $\theta \in \Theta$), then $U = (S, T)$ is also a sufficient statistic for any $S = S(X)$.
Proof. Immediate consequence of 2) by taking $\phi(s, t) = t$. With this choice of $\phi$, we have $T = \phi(U)$, so $U$ is a SS.

4. If $T = T(X)$ and $U = U(X)$ are related by $T = \phi(U)$ where $\phi$ is a one-one function, then $T$ is a SS iff $U$ is a SS.

2.6 Application to random samples from various families of normal distributions

Recall:

1. $T_1 = (\bar{X}, \sum_i (X_i - \bar{X})^2)$ is a SS for $\Theta_1 = \{(\mu, \sigma^2) : \sigma^2 > 0\}$.
2. $T_2 = \bar{X}$ is a SS for $\Theta_2 = \{(\mu, \sigma^2) : \sigma^2 = \sigma_0^2\}$.
3. $T_3 = \sum_i (X_i - \mu_0)^2$ is a SS for $\Theta_3 = \{(\mu, \sigma^2) : \mu = \mu_0, \sigma^2 > 0\}$.
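All three of these statistics rest on the decomposition $\sum_i (x_i - \mu)^2 = \sum_i (x_i - \bar{x})^2 + n(\bar{x} - \mu)^2$ used in the previous section. A numeric spot-check of that identity (Python sketch, standard library only; the sample and the value of $\mu$ are arbitrary choices):

```python
import random

random.seed(2)
xs = [random.gauss(0.0, 1.0) for _ in range(8)]  # arbitrary sample
mu = 0.7                                         # arbitrary mu
n = len(xs)
xbar = sum(xs) / n

# LHS: sum of squared deviations about mu.
lhs = sum((x - mu) ** 2 for x in xs)
# RHS: within-sample sum of squares plus the mean-shift term.
rhs = sum((x - xbar) ** 2 for x in xs) + n * (xbar - mu) ** 2
assert abs(lhs - rhs) < 1e-9  # identity holds for any data and any mu
```

Because the identity is algebraic (the cross term $2(\bar{x} - \mu)\sum_i (x_i - \bar{x})$ vanishes), it holds exactly for every dataset, not just on average.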
Some facts:

1. $T_1$ is a SS for $\Theta_1$ $\Rightarrow$ $T_1$ is a SS for $\Theta_2$ and for $\Theta_3$. (Follows from Fact 1, since $\Theta_2 \subset \Theta_1$ and $\Theta_3 \subset \Theta_1$.)
2. $T_2$ is a SS for $\Theta_2$ $\Rightarrow$ $T_1$ is a SS for $\Theta_2$. (Follows from Fact 3.)
3. $T_3$ is a SS for $\Theta_3$, and $T_3 = \sum_i (X_i - \mu_0)^2 = \sum_i (X_i - \bar{X})^2 + n(\bar{X} - \mu_0)^2 = \phi(T_1)$ $\Rightarrow$ $T_1$ is a SS for $\Theta_3$. (Follows from Fact 2.)
4. $T_1$ is a SS for $\Theta_1$ $\Rightarrow$ $(\bar{X}, \frac{1}{n-1}\sum_i (X_i - \bar{X})^2)$ is a SS for $\Theta_1$, and $(\sum_i X_i, \sum_i X_i^2)$ is a SS for $\Theta_1$. (Both of these are one-one functions of $T_1$; follows from Fact 4.)

3 Minimal sufficient statistic

Definition 2. A minimal sufficient statistic is a sufficient statistic that is a function of any other sufficient statistic: $T = T(X)$ is minimal sufficient if for every sufficient statistic $S = S(X)$ there exists a function $\psi$ such that $T = \psi(S)$, that is, $T(X) = \psi(S(X))$.

Theorem 2 (Lehmann-Scheffe Theorem). $X \sim P_\theta$, $\theta \in \Theta$. $T(X)$ is a minimal sufficient statistic iff for all $x, y$:
$$T(x) = T(y) \iff \frac{f(x \mid \theta)}{f(y \mid \theta)} \text{ is constant as a function of } \theta.$$

Remark 1. It is difficult to show a statistic is a MSS directly from the definition. For proving minimality, we usually use the Lehmann-Scheffe Theorem. However, it is often very easy to prove a statistic is not a MSS using the definition: if $S$ and $T$ are two different sufficient statistics, and $T$ cannot be written as a function of $S$, then $T$ is not minimal.

Example: Consider the three families of normal distributions used earlier. $T_1$ and $T_2$ are both SS for $\Theta_2$, but $T_1$ clearly cannot be written as a function of $T_2$. Thus $T_1$ is not a MSS for $\Theta_2$. Similarly, $T_1$ and $T_3$ are both SS for $\Theta_3$, but $T_1$ clearly cannot be written as a function of $T_3$. Thus $T_1$ is not a MSS for $\Theta_3$.

Comments on the Lehmann-Scheffe Theorem:

1. In situations where the support of $f(x \mid \theta)$ depends on $\theta$, a better statement (which avoids awkward $\frac{0}{0}$'s) is: for all $x, y$, $T(x) = T(y)$ iff $f(x \mid \theta) = c(x, y) f(y \mid \theta)$ for all $\theta$.

2. The "iff" can be broken down into two results:
(a) If $T(X)$ is sufficient, then for all $x, y$: $T(x) = T(y)$ implies $f(x \mid \theta)/f(y \mid \theta)$ is constant in $\theta$.
(b) A sufficient statistic $T(X)$ is minimal if for all $x, y$: $f(x \mid \theta)/f(y \mid \theta)$ constant in $\theta$ implies $T(x) = T(y)$.
3.1 Examples for the Lehmann-Scheffe Theorem

1. $X = (X_1, \ldots, X_n)$ iid $N(\mu, \sigma^2)$. $T(X) = (\bar{X}, S^2)$, where $S^2 = \frac{1}{n-1} \sum_{i=1}^n (X_i - \bar{X})^2$, is a MSS for $(\mu, \sigma^2)$.

2. $X = (X_1, \ldots, X_n)$ iid Uniform($\alpha, \beta$), $\Theta = \{(\alpha, \beta) : -\infty < \alpha < \beta < \infty\}$. $T(X) = (X_{(1)}, X_{(n)})$ is a MSS for $(\alpha, \beta)$, where $X_{(1)} = \min X_i$ and $X_{(n)} = \max X_i$.

We must verify: for all $x, y$, $T(x) = T(y)$ iff there exists $c > 0$ such that $f(x \mid \theta) = c f(y \mid \theta)$ for all $\theta$. ($c$ does not involve $\theta$, but can depend on $x, y$.) In this case,
$$f(x \mid \theta) = \prod_i \frac{1}{\beta - \alpha} I(\alpha \leq x_i \leq \beta) = \frac{1}{(\beta - \alpha)^n} I(x_{(1)} \geq \alpha)\, I(x_{(n)} \leq \beta).$$
Similarly, $f(y \mid \theta) = \frac{1}{(\beta - \alpha)^n} I(y_{(1)} \geq \alpha)\, I(y_{(n)} \leq \beta)$. Clearly, $(x_{(1)}, x_{(n)}) = (y_{(1)}, y_{(n)})$ implies $f(x \mid \theta) = f(y \mid \theta)$ for all $\theta \in \Theta$ (take $c = 1$). This gives one direction. What about the other? Define $A(x) = \{\theta : f(x \mid \theta) > 0\}$. Here $\theta = (\alpha, \beta)$ with $\alpha < \beta$. Assume that there exists $c > 0$ such that $f(x \mid \theta) = c f(y \mid \theta)$ for all $\theta$. Then we must have $A(x) = A(y)$. But
$$A(x) = \{(\alpha, \beta) : \alpha \leq x_{(1)}, \; \beta \geq x_{(n)}\}$$
for any $x$. Thus $A(x) = A(y)$ implies $(x_{(1)}, x_{(n)}) = (y_{(1)}, y_{(n)})$, proving that $(X_{(1)}, X_{(n)})$ is a MSS.

Note: This style of argument can only work for examples similar to the uniform distribution, where the support depends upon the parameter value.

3. $X = (X_1, \ldots, X_n)$ iid Uniform($\theta, \theta + 1$). Then $T(X) = (X_{(1)}, X_{(n)})$ is a MSS for $\theta$.

Comments:

(a) The dimension of the MSS does not have to be the same as the dimension of the parameter.
(b) Shrinking the parameter space does not always change the MSS. When $X = (X_1, \ldots, X_n)$ iid Uniform($\alpha, \beta$), $\Theta_1 = \{(\alpha, \beta) : \alpha < \beta\}$ and $\Theta_2 = \{(\alpha, \beta) : \beta = \alpha + 1\}$ have the same MSS.

4. Random Sample Model: Suppose $X = (X_1, X_2, \ldots, X_n)$ iid $\psi(x \mid \theta)$ (pdf or pmf), where $\psi(x \mid \theta)$ is an arbitrary family of pdf's (pmf's). Then $T(X) = (X_{(1)}, X_{(2)}, \ldots, X_{(n)})$, the order statistics (the data arranged in increasing order), is a sufficient statistic for $\theta$, but may not be minimal.

Proof. (Use FC.)
$$f(x \mid \theta) = \prod_i \psi(x_i \mid \theta) = \prod_i \psi(x_{(i)} \mid \theta) \cdot 1 = g(T(x) \mid \theta)\, h(x).$$

Note: (Assume $x_{(1)} < x_{(2)} < \cdots < x_{(n)}$.) Then $P(X = x \mid T(X) = t) = \frac{1}{n!}$ if $x$ is any rearrangement of $x_{(1)}, x_{(2)}, \ldots, x_{(n)}$, and $0$ otherwise. All possible orderings are equally likely. To generate from $\mathcal{L}(X \mid T)$, place the values $x_{(1)}, x_{(2)}, \ldots, x_{(n)}$ in a hat and draw them out one by one.

Comment: For random sample models, the order statistics are often the SS.

5. $X = (X_1, \ldots, X_n)$ iid $\psi(x \mid \theta)$ with
$$\psi(x \mid \theta) = \frac{1}{\pi} \cdot \frac{1}{1 + (x - \theta)^2},$$
the Cauchy location family. Look at
$$\frac{f(x \mid \theta)}{f(y \mid \theta)} = \frac{\prod_{i=1}^n \frac{1}{\pi} \cdot \frac{1}{1 + (x_i - \theta)^2}}{\prod_{i=1}^n \frac{1}{\pi} \cdot \frac{1}{1 + (y_i - \theta)^2}}.$$
16 If x (i) y (i) for all i, then the ratio is a constant function of θ. Now suppose f(x θ)/f(ỹ θ) is a constant function of θ. Then (1 + (x i θ) ) c(x, y) (1 + (y i θ) ) for some function c(x, y) independent of θ. This is equivalent to (θ x i θ + x i + 1) c(x, y) (θ y i θ + yi + 1). Clearly, both n (θ x i θ + x i + 1) and n (θ y i θ + yi + 1) are polynomials of degree n in θ with the same set of zeros O L and O R. We can spell out O L x i ± i, i 1,..., n}, O R y i ± i, i 1,..., n}, where i 1, the imaginary root of 1/ Then O L and O R are permutations of each other. Hence x (i) y (i) for all i 1,..., n. 6. Suppose X P θ, θ Θ and P θ has a joint pdf or pmf f(x θ). Fact: X is a SS for θ. Proof. (Using FC) Define T T (X) X. (T is the identity function.) Then f(x θ) f(x θ) 1 g(t (x) θ) h(x) where g f and h(x) 1. Thus T is SS. Proof. (From definition of SS) L(X T (X) t) L(X X t) δ t where δ t is the probability measure which places all its mass at the point (dataset) t. 7. Further suppose X (X 1,..., X n ) where X 1,..., X n are iid from the pdf (pmf) f(x θ). Fact: T (X) X (X 1,..., X n ) is not a MSS. Proof. (from definition of MSS) Let S S(X) (X (1), X (),..., X (n) ) (the order statistics). Since we have a random sample model, S is a SS. But clearly T is not a function of S. (You cannot recover the original ordering of the data given only the order statistics.) Thus T is not a MSS. 16
Unbiased Estimation Binomial problem shows general phenomenon. An estimator can be good for some values of θ and bad for others. To compare ˆθ and θ, two estimators of θ: Say ˆθ is better than θ if it
More informationProbability Background
Probability Background Namrata Vaswani, Iowa State University August 24, 2015 Probability recap 1: EE 322 notes Quick test of concepts: Given random variables X 1, X 2,... X n. Compute the PDF of the second
More informationExercises with solutions (Set D)
Exercises with solutions Set D. A fair die is rolled at the same time as a fair coin is tossed. Let A be the number on the upper surface of the die and let B describe the outcome of the coin toss, where
More informationNon-parametric Inference and Resampling
Non-parametric Inference and Resampling Exercises by David Wozabal (Last update. Juni 010) 1 Basic Facts about Rank and Order Statistics 1.1 10 students were asked about the amount of time they spend surfing
More informationLecture 11. Multivariate Normal theory
10. Lecture 11. Multivariate Normal theory Lecture 11. Multivariate Normal theory 1 (1 1) 11. Multivariate Normal theory 11.1. Properties of means and covariances of vectors Properties of means and covariances
More informationPCMI Introduction to Random Matrix Theory Handout # REVIEW OF PROBABILITY THEORY. Chapter 1 - Events and Their Probabilities
PCMI 207 - Introduction to Random Matrix Theory Handout #2 06.27.207 REVIEW OF PROBABILITY THEORY Chapter - Events and Their Probabilities.. Events as Sets Definition (σ-field). A collection F of subsets
More informationCS281A/Stat241A Lecture 17
CS281A/Stat241A Lecture 17 p. 1/4 CS281A/Stat241A Lecture 17 Factor Analysis and State Space Models Peter Bartlett CS281A/Stat241A Lecture 17 p. 2/4 Key ideas of this lecture Factor Analysis. Recall: Gaussian
More informationThis exam is closed book and closed notes. (You will have access to a copy of the Table of Common Distributions given in the back of the text.
TEST #3 STA 5326 December 4, 214 Name: Please read the following directions. DO NOT TURN THE PAGE UNTIL INSTRUCTED TO DO SO Directions This exam is closed book and closed notes. (You will have access to
More informationQuick Tour of Basic Probability Theory and Linear Algebra
Quick Tour of and Linear Algebra Quick Tour of and Linear Algebra CS224w: Social and Information Network Analysis Fall 2011 Quick Tour of and Linear Algebra Quick Tour of and Linear Algebra Outline Definitions
More informationFirst Year Examination Department of Statistics, University of Florida
First Year Examination Department of Statistics, University of Florida August 20, 2009, 8:00 am - 2:00 noon Instructions:. You have four hours to answer questions in this examination. 2. You must show
More informationCompleteness. On the other hand, the distribution of an ancillary statistic doesn t depend on θ at all.
Completeness A minimal sufficient statistic achieves the maximum amount of data reduction while retaining all the information the sample has concerning θ. On the other hand, the distribution of an ancillary
More informationUniversity of Regina. Lecture Notes. Michael Kozdron
University of Regina Statistics 252 Mathematical Statistics Lecture Notes Winter 2005 Michael Kozdron kozdron@math.uregina.ca www.math.uregina.ca/ kozdron Contents 1 The Basic Idea of Statistics: Estimating
More informationChapter 1. Statistical Spaces
Chapter 1 Statistical Spaces Mathematical statistics is a science that studies the statistical regularity of random phenomena, essentially by some observation values of random variable (r.v.) X. Sometimes
More informationECE534, Spring 2018: Solutions for Problem Set #3
ECE534, Spring 08: Solutions for Problem Set #3 Jointly Gaussian Random Variables and MMSE Estimation Suppose that X, Y are jointly Gaussian random variables with µ X = µ Y = 0 and σ X = σ Y = Let their
More information1.1 Review of Probability Theory
1.1 Review of Probability Theory Angela Peace Biomathemtics II MATH 5355 Spring 2017 Lecture notes follow: Allen, Linda JS. An introduction to stochastic processes with applications to biology. CRC Press,
More informationMAS223 Statistical Inference and Modelling Exercises
MAS223 Statistical Inference and Modelling Exercises The exercises are grouped into sections, corresponding to chapters of the lecture notes Within each section exercises are divided into warm-up questions,
More informationST5215: Advanced Statistical Theory
Department of Statistics & Applied Probability Monday, September 26, 2011 Lecture 10: Exponential families and Sufficient statistics Exponential Families Exponential families are important parametric families
More informationSOLUTION FOR HOMEWORK 6, STAT 6331
SOLUTION FOR HOMEWORK 6, STAT 633. Exerc.7.. It is given that X,...,X n is a sample from N(θ, σ ), and the Bayesian approach is used with Θ N(µ, τ ). The parameters σ, µ and τ are given. (a) Find the joinf
More informationSTAT 418: Probability and Stochastic Processes
STAT 418: Probability and Stochastic Processes Spring 2016; Homework Assignments Latest updated on April 29, 2016 HW1 (Due on Jan. 21) Chapter 1 Problems 1, 8, 9, 10, 11, 18, 19, 26, 28, 30 Theoretical
More informationPart IB Statistics. Theorems with proof. Based on lectures by D. Spiegelhalter Notes taken by Dexter Chua. Lent 2015
Part IB Statistics Theorems with proof Based on lectures by D. Spiegelhalter Notes taken by Dexter Chua Lent 2015 These notes are not endorsed by the lecturers, and I have modified them (often significantly)
More informationOrder Statistics and Distributions
Order Statistics and Distributions 1 Some Preliminary Comments and Ideas In this section we consider a random sample X 1, X 2,..., X n common continuous distribution function F and probability density
More information1 Review of Probability and Distributions
Random variables. A numerically valued function X of an outcome ω from a sample space Ω X : Ω R : ω X(ω) is called a random variable (r.v.), and usually determined by an experiment. We conventionally denote
More informationGenerating Random Variates 2 (Chapter 8, Law)
B. Maddah ENMG 6 Simulation /5/08 Generating Random Variates (Chapter 8, Law) Generating random variates from U(a, b) Recall that a random X which is uniformly distributed on interval [a, b], X ~ U(a,
More informationSTAT 414: Introduction to Probability Theory
STAT 414: Introduction to Probability Theory Spring 2016; Homework Assignments Latest updated on April 29, 2016 HW1 (Due on Jan. 21) Chapter 1 Problems 1, 8, 9, 10, 11, 18, 19, 26, 28, 30 Theoretical Exercises
More informationStat 5101 Lecture Slides: Deck 7 Asymptotics, also called Large Sample Theory. Charles J. Geyer School of Statistics University of Minnesota
Stat 5101 Lecture Slides: Deck 7 Asymptotics, also called Large Sample Theory Charles J. Geyer School of Statistics University of Minnesota 1 Asymptotic Approximation The last big subject in probability
More informationMaximum Likelihood Estimation
Maximum Likelihood Estimation Assume X P θ, θ Θ, with joint pdf (or pmf) f(x θ). Suppose we observe X = x. The Likelihood function is L(θ x) = f(x θ) as a function of θ (with the data x held fixed). The
More informationMcGill University. Faculty of Science. Department of Mathematics and Statistics. Part A Examination. Statistics: Theory Paper
McGill University Faculty of Science Department of Mathematics and Statistics Part A Examination Statistics: Theory Paper Date: 10th May 2015 Instructions Time: 1pm-5pm Answer only two questions from Section
More informationWe introduce methods that are useful in:
Instructor: Shengyu Zhang Content Derived Distributions Covariance and Correlation Conditional Expectation and Variance Revisited Transforms Sum of a Random Number of Independent Random Variables more
More informationReview 1: STAT Mark Carpenter, Ph.D. Professor of Statistics Department of Mathematics and Statistics. August 25, 2015
Review : STAT 36 Mark Carpenter, Ph.D. Professor of Statistics Department of Mathematics and Statistics August 25, 25 Support of a Random Variable The support of a random variable, which is usually denoted
More informationDirection: This test is worth 250 points and each problem worth points. DO ANY SIX
Term Test 3 December 5, 2003 Name Math 52 Student Number Direction: This test is worth 250 points and each problem worth 4 points DO ANY SIX PROBLEMS You are required to complete this test within 50 minutes
More informationRecitation 2: Probability
Recitation 2: Probability Colin White, Kenny Marino January 23, 2018 Outline Facts about sets Definitions and facts about probability Random Variables and Joint Distributions Characteristics of distributions
More informationAn Introduction to Bayesian Linear Regression
An Introduction to Bayesian Linear Regression APPM 5720: Bayesian Computation Fall 2018 A SIMPLE LINEAR MODEL Suppose that we observe explanatory variables x 1, x 2,..., x n and dependent variables y 1,
More informationMIT Spring 2016
Dr. Kempthorne Spring 2016 1 Outline Building 1 Building 2 Definition Building Let X be a random variable/vector with sample space X R q and probability model P θ. The class of probability models P = {P
More informationAppendix A : Introduction to Probability and stochastic processes
A-1 Mathematical methods in communication July 5th, 2009 Appendix A : Introduction to Probability and stochastic processes Lecturer: Haim Permuter Scribe: Shai Shapira and Uri Livnat The probability of
More informationTest Code: STA/STB (Short Answer Type) 2013 Junior Research Fellowship for Research Course in Statistics
Test Code: STA/STB (Short Answer Type) 2013 Junior Research Fellowship for Research Course in Statistics The candidates for the research course in Statistics will have to take two shortanswer type tests
More information1 Random variables and distributions
Random variables and distributions In this chapter we consider real valued functions, called random variables, defined on the sample space. X : S R X The set of possible values of X is denoted by the set
More informationContents 1. Contents
Contents 1 Contents 6 Distributions of Functions of Random Variables 2 6.1 Transformation of Discrete r.v.s............. 3 6.2 Method of Distribution Functions............. 6 6.3 Method of Transformations................
More informationPOISSON PROCESSES 1. THE LAW OF SMALL NUMBERS
POISSON PROCESSES 1. THE LAW OF SMALL NUMBERS 1.1. The Rutherford-Chadwick-Ellis Experiment. About 90 years ago Ernest Rutherford and his collaborators at the Cavendish Laboratory in Cambridge conducted
More information{ p if x = 1 1 p if x = 0
Discrete random variables Probability mass function Given a discrete random variable X taking values in X = {v 1,..., v m }, its probability mass function P : X [0, 1] is defined as: P (v i ) = Pr[X =
More informationMATH 4211/6211 Optimization Constrained Optimization
MATH 4211/6211 Optimization Constrained Optimization Xiaojing Ye Department of Mathematics & Statistics Georgia State University Xiaojing Ye, Math & Stat, Georgia State University 0 Constrained optimization
More informationFebruary 26, 2017 COMPLETENESS AND THE LEHMANN-SCHEFFE THEOREM
February 26, 2017 COMPLETENESS AND THE LEHMANN-SCHEFFE THEOREM Abstract. The Rao-Blacwell theorem told us how to improve an estimator. We will discuss conditions on when the Rao-Blacwellization of an estimator
More informationBEST TESTS. Abstract. We will discuss the Neymann-Pearson theorem and certain best test where the power function is optimized.
BEST TESTS Abstract. We will discuss the Neymann-Pearson theorem and certain best test where the power function is optimized. 1. Most powerful test Let {f θ } θ Θ be a family of pdfs. We will consider
More informationThings to remember when learning probability distributions:
SPECIAL DISTRIBUTIONS Some distributions are special because they are useful They include: Poisson, exponential, Normal (Gaussian), Gamma, geometric, negative binomial, Binomial and hypergeometric distributions
More informationTransformations from R m to R n.
Transformations from R m to R n 1 Differentiablity First of all because of an unfortunate combination of traditions (the fact that we read from left to right and the way we define matrix multiplication
More informationProblems ( ) 1 exp. 2. n! e λ and
Problems The expressions for the probability mass function of the Poisson(λ) distribution, and the density function of the Normal distribution with mean µ and variance σ 2, may be useful: ( ) 1 exp. 2πσ
More information1 Presessional Probability
1 Presessional Probability Probability theory is essential for the development of mathematical models in finance, because of the randomness nature of price fluctuations in the markets. This presessional
More informationProbability Theory and Statistics. Peter Jochumzen
Probability Theory and Statistics Peter Jochumzen April 18, 2016 Contents 1 Probability Theory And Statistics 3 1.1 Experiment, Outcome and Event................................ 3 1.2 Probability............................................
More informationProbabilistic Graphical Models
Parameter Estimation December 14, 2015 Overview 1 Motivation 2 3 4 What did we have so far? 1 Representations: how do we model the problem? (directed/undirected). 2 Inference: given a model and partially
More informationToday s Outline. Biostatistics Statistical Inference Lecture 01 Introduction to BIOSTAT602 Principles of Data Reduction
Today s Outline Biostatistics 602 - Statistical Inference Lecture 01 Introduction to Principles of Hyun Min Kang Course Overview of January 10th, 2013 Hyun Min Kang Biostatistics 602 - Lecture 01 January
More informationProbability Theory for Machine Learning. Chris Cremer September 2015
Probability Theory for Machine Learning Chris Cremer September 2015 Outline Motivation Probability Definitions and Rules Probability Distributions MLE for Gaussian Parameter Estimation MLE and Least Squares
More informationTheory of Statistical Tests
Ch 9. Theory of Statistical Tests 9.1 Certain Best Tests How to construct good testing. For simple hypothesis H 0 : θ = θ, H 1 : θ = θ, Page 1 of 100 where Θ = {θ, θ } 1. Define the best test for H 0 H
More information