2017 Financial Mathematics Orientation - Statistics


2017 Financial Mathematics Orientation – Statistics
Written by Long Wang, Edited by Joshua Agterberg
August 21, 2018

Contents

1 Preliminaries
  1.1 Samples and Population
  1.2 Distributions
  1.3 Central limit theorem (CLT)
  1.4 Main Ideas
2 One-sample
  2.1 Large-sample population mean: Estimation; Testing; Duality between confidence interval and testing; Choosing sample size
  2.2 Small-sample population mean: Estimation; Testing
  2.3 Population variance: Estimation; Testing
  2.4 Large-sample population proportion: Estimation; Testing
3 Two-sample
  3.1 Large-sample two population means: Estimation; Testing
  3.2 Small-sample two population means (identical variance): Practical rule; Estimation; Testing
  3.3 Small-sample two population means (non-identical variance): Practical rule; Estimation; Testing
  3.4 Small-sample two population means (matched-pair): Estimation; Testing
  3.5 Two population variances: Estimation; Testing
  3.6 Large-sample two population proportions: Estimation; Testing (identical proportion hypothesis); Testing (non-identical proportion hypothesis)
4 General distribution
  4.1 Estimation: Mean-variance trade-off; Method of moments (MOM); Maximum likelihood estimator (MLE); Asymptotic normality for MLE; Bootstrap; Bayesian
  4.2 Testing: Likelihood ratio test; Generalized likelihood ratio test
5 ANOVA
  5.1 One-way ANOVA: Model; Testing; Estimation
  5.2 Randomized block design: Model; Testing; Estimation
  5.3 Two-way ANOVA: Model; Testing; Estimation
6 Linear regression
  6.1 Simple linear regression: Model; Testing; Estimation; T-test; Coefficient of determination
  6.2 Variable Selection: Measures of Fit; Stepwise Regression; Ridge Regression; Lasso Regression
7 Other important concepts to know: Cross Validation; Dimensionality Reduction; Classification; Clustering; Non-parametric methods; Model selection

1 Preliminaries

1.1 Samples and Population

A sample is a randomly selected (i.i.d.) subset of the population/distribution. From a sample, we want to learn as much as we can about the properties/parameters of the population/distribution.

1.2 Distributions

Bernoulli: X ~ Ber(p), P(X = 1) = p
Binomial: X ~ Bin(n, p), P(X = k) = (n choose k) p^k (1 − p)^(n−k); Bin(n, p) = Σᵢ₌₁ⁿ Ber(p)
Normal: X ~ N(µ, σ²), f(x) = (1/(√(2π)σ)) exp(−(x − µ)²/(2σ²)); Bin(n, p) ≈ N(np, npq); (N(µ, σ²) − µ)/σ = N(0, 1)
Chi-square: X ~ χ²_ν, χ²_ν = Σᵢ₌₁^ν N(0, 1)²
t-distribution: X ~ t_ν, t_ν = N(0, 1)/√(χ²_ν/ν); t_∞ = N(0, 1); t₁ = Cauchy (undefined mean and variance)
F-distribution: X ~ F_{n,m}, F_{n,m} = (χ²_n/n)/(χ²_m/m); t²_ν = F_{1,ν}
Others: Exponential, Poisson, Gamma, Beta, Negative Binomial

1.3 Central limit theorem (CLT)

If X₁, ..., X_n are i.i.d. random variables with mean µ and variance σ² < ∞, then (X̄ − µ)/(σ/√n) → N(0, 1).
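A quick simulation (not part of the original notes; pure standard-library Python, variable names my own) illustrates the CLT: standardized means of Exp(1) samples behave approximately like N(0, 1) even though the underlying distribution is skewed.

```python
import random
import statistics

random.seed(42)
n, reps = 200, 2000
mu, sigma = 1.0, 1.0  # Exp(1) has mean 1 and variance 1

# Standardize each sample mean: (X-bar - mu)/(sigma/sqrt(n)) should be roughly N(0, 1)
zs = []
for _ in range(reps):
    xbar = statistics.fmean(random.expovariate(1.0) for _ in range(n))
    zs.append((xbar - mu) / (sigma / n ** 0.5))

print(round(statistics.fmean(zs), 3), round(statistics.stdev(zs), 3))
```

The empirical mean of the standardized values should be near 0 and their standard deviation near 1.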

1.4 Main Ideas

Assuming there is a true parameter θ (e.g. the mean, the variance), can we use our data to discover the true distribution? (This is the frequentist approach.) We can estimate θ, perform a hypothesis test, or find a confidence interval for the true parameter.

Always pay attention to assumptions! In many cases the assumptions do not hold, but they make our lives easier. We often assume a distribution and then perform statistical inference based on it. We typically write X₁, ..., X_n i.i.d. ~ F_θ, where F is completely determined by θ (e.g. N(µ, σ²) with θ = (µ, σ)); this is the parametric setting. If F is not completely determined by θ, we are typically in a non-parametric setting. (There is also a semi-parametric setting, which we ignore for now.)

2 One-sample

Here we assume we have one sample of i.i.d. data. We often assume the data are normal, or approximately normal by the CLT.

2.1 Large-sample population mean

2.1.1 Estimation

Data: X_i ~ N(µ, σ²)
Estimator: µ̂ = X̄
Distribution: (X̄ − µ)/(σ/√n) ~ N(0, 1) (why?)
C.I.: X̄ ± z(α/2)·σ/√n, or X̄ + z(α)·σ/√n (one-sided), or X̄ − z(α)·σ/√n (one-sided)
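The large-sample C.I. above needs only the standard library; a minimal sketch (the function name `z_confidence_interval` is mine, not from the notes):

```python
from statistics import NormalDist, fmean

def z_confidence_interval(data, sigma, alpha=0.05):
    """Large-sample CI for the mean with known sigma: X-bar +/- z(alpha/2)*sigma/sqrt(n)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # z(alpha/2), about 1.96 for alpha = 0.05
    half = z * sigma / len(data) ** 0.5
    xbar = fmean(data)
    return xbar - half, xbar + half
```

For n = 100 observations and σ = 1 the half-width is z(α/2)/10 ≈ 0.196.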

2.1.2 Testing

Hypothesis: H₀: µ = µ₀ vs. H_A: µ ≠ µ₀ (or H_A: µ < µ₀ or H_A: µ > µ₀)
Test statistic: Z = (X̄ − µ₀)/(σ/√n) ~ N(0, 1)
Rejection region: |Z| > z(α/2), or Z < −z(α), or Z > z(α)
Power against H_A: µ = µ_A: Power = P(Reject H₀ | µ = µ_A)
Type-I error (α) and Type-II error (β)
p-value: 2P(Z > |z|), or P(Z < z), or P(Z > z)

2.1.3 Duality between confidence interval and testing

µ₀ ∈ C.I. ⇔ Accept H₀: µ = µ₀

2.1.4 Choosing sample size

Let B be an upper bound on the error of estimate; the width of the confidence interval equals 2B:
B ≥ z(α/2)·σ/√n ⇔ n ≥ z(α/2)²·σ²/B²

2.2 Small-sample population mean

2.2.1 Estimation

Data: X_i ~ N(µ, σ²)
Estimator: µ̂ = X̄
Distribution: (X̄ − µ)/(s/√n) ~ t_{n−1} (proof: (X̄ − µ)/(s/√n) = [(X̄ − µ)/(σ/√n)] / √(s²/σ²))
C.I.: X̄ ± t_{n−1}(α/2)·s/√n

2.2.2 Testing

Hypothesis: H₀: µ = µ₀ vs. H_A: µ ≠ µ₀
Test statistic: T = (X̄ − µ₀)/(s/√n) ~ t_{n−1}
Rejection region: |T| > t_{n−1}(α/2)
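The sample-size bound n ≥ z(α/2)²σ²/B² is easy to evaluate; a small helper (name `required_sample_size` is mine):

```python
from math import ceil
from statistics import NormalDist

def required_sample_size(sigma, B, alpha=0.05):
    """Smallest n with z(alpha/2)*sigma/sqrt(n) <= B, i.e. n >= z(alpha/2)^2*sigma^2/B^2."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return ceil((z * sigma / B) ** 2)
```

For example, σ = 1 and B = 0.1 at α = 0.05 gives n = 385.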

2.3 Population variance

2.3.1 Estimation

Data: X_i ~ N(µ, σ²)
Estimator: σ̂² = s², with E[s²] = σ²
Distribution: (n − 1)s²/σ² ~ χ²_{n−1} (proof: (n − 1)s²/σ² + (X̄ − µ)²/(σ²/n) = Σᵢ₌₁ⁿ (X_i − µ)²/σ²)
C.I.: ((n − 1)s²/χ²_{n−1}(α/2), (n − 1)s²/χ²_{n−1}(1 − α/2))

2.3.2 Testing

Hypothesis: H₀: σ² = σ₀² vs. H_A: σ² ≠ σ₀²
Test statistic: X² = (n − 1)s²/σ₀² ~ χ²_{n−1}
Rejection region: X² > χ²_{n−1}(α/2) or X² < χ²_{n−1}(1 − α/2)

2.4 Large-sample population proportion

2.4.1 Estimation

Data: X_i ~ Ber(p)
Estimator: p̂ = X̄
Distribution: (p̂ − p)/√(p(1 − p)/n) ~ N(0, 1)
C.I.: p̂ ± z(α/2)·√(p̂(1 − p̂)/n)

2.4.2 Testing

Hypothesis: H₀: p = p₀ vs. H_A: p ≠ p₀
Test statistic: Z = (p̂ − p₀)/√(p₀(1 − p₀)/n) ~ N(0, 1)
Rejection region: |Z| > z(α/2)
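The proportion test is a one-liner once the statistic is written out; a sketch (function name `proportion_z_test` is mine):

```python
from statistics import NormalDist

def proportion_z_test(successes, n, p0):
    """Large-sample two-sided test of H0: p = p0 via Z = (phat - p0)/sqrt(p0(1-p0)/n)."""
    phat = successes / n
    z = (phat - p0) / (p0 * (1 - p0) / n) ** 0.5
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value
```

With 60 successes in 100 trials against p₀ = 0.5, Z = 2 exactly and the p-value is about 0.046, so H₀ is rejected at α = 0.05.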

3 Two-sample

Note: we can come back to this if there is more interest in other sections. The setup is very similar to Section 2, only now we have two samples and want to perform inference on whether the means or other population parameters are similar. For example, suppose we have a sample of undergraduate GPAs from Hopkins graduate students and a sample of undergraduate GPAs from people who did not go to graduate school. Are the mean GPAs different? Are the distributions different?

3.1 Large-sample two population means

3.1.1 Estimation

Data: X_i ~ N(µ_X, σ²_X), Y_j ~ N(µ_Y, σ²_Y)
Estimator: µ̂_X − µ̂_Y = X̄ − Ȳ
Distribution: (X̄ − Ȳ − (µ_X − µ_Y))/√(σ²_X/n + σ²_Y/m) ~ N(0, 1)
C.I.: X̄ − Ȳ ± z(α/2)·√(σ²_X/n + σ²_Y/m)

3.1.2 Testing

Hypothesis: H₀: µ_X − µ_Y = D₀ vs. H_A: µ_X − µ_Y ≠ D₀
Test statistic: Z = (X̄ − Ȳ − D₀)/√(σ²_X/n + σ²_Y/m) ~ N(0, 1)
Rejection region: |Z| > z(α/2)

3.2 Small-sample two population means (identical variance)

3.2.1 Practical rule

max(s²_X, s²_Y)/min(s²_X, s²_Y) < 3

3.2.2 Estimation

Data: X_i ~ N(µ_X, σ²), Y_j ~ N(µ_Y, σ²)
Estimator: µ̂_X − µ̂_Y = X̄ − Ȳ
Distribution: (X̄ − Ȳ − (µ_X − µ_Y))/√(s²_p/n + s²_p/m) ~ t_{n+m−2}, where s²_p = [(n − 1)s²_X + (m − 1)s²_Y]/(n + m − 2)
C.I.: X̄ − Ȳ ± t_{n+m−2}(α/2)·√(s²_p/n + s²_p/m)

3.2.3 Testing

Hypothesis: H₀: µ_X − µ_Y = D₀ vs. H_A: µ_X − µ_Y ≠ D₀
Test statistic: T = (X̄ − Ȳ − D₀)/√(s²_p/n + s²_p/m) ~ t_{n+m−2}
Rejection region: |T| > t_{n+m−2}(α/2)

3.3 Small-sample two population means (non-identical variance)

3.3.1 Practical rule

max(s²_X, s²_Y)/min(s²_X, s²_Y) > 3

3.3.2 Estimation

Data: X_i ~ N(µ_X, σ²_X), Y_j ~ N(µ_Y, σ²_Y)
Estimator: µ̂_X − µ̂_Y = X̄ − Ȳ
Distribution: (X̄ − Ȳ − (µ_X − µ_Y))/√(s²_X/n + s²_Y/m) ~ t_ν, where ν = (s²_X/n + s²_Y/m)² / [(s²_X/n)²/(n − 1) + (s²_Y/m)²/(m − 1)]
C.I.: X̄ − Ȳ ± t_ν(α/2)·√(s²_X/n + s²_Y/m)

3.3.3 Testing

Hypothesis: H₀: µ_X − µ_Y = D₀ vs. H_A: µ_X − µ_Y ≠ D₀
Test statistic: T = (X̄ − Ȳ − D₀)/√(s²_X/n + s²_Y/m) ~ t_ν
Rejection region: |T| > t_ν(α/2)
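The approximate degrees of freedom ν (the Welch–Satterthwaite formula) is the only fiddly piece; a direct transcription (function name `welch_df` is mine):

```python
def welch_df(s2x, n, s2y, m):
    """Welch-Satterthwaite nu = (s2x/n + s2y/m)^2 / [(s2x/n)^2/(n-1) + (s2y/m)^2/(m-1)]."""
    num = (s2x / n + s2y / m) ** 2
    den = (s2x / n) ** 2 / (n - 1) + (s2y / m) ** 2 / (m - 1)
    return num / den
```

A sanity check on the formula: equal variances and equal sample sizes give ν = n + m − 2 (here 18 for n = m = 10), matching the pooled test's degrees of freedom; unequal variances pull ν below that.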

3.4 Small-sample two population means (matched-pair)

3.4.1 Estimation

Data: X_i ~ N(µ_X, σ²_X), Y_i ~ N(µ_Y, σ²_Y), D_i = X_i − Y_i
Estimator: µ̂_D = D̄
Distribution: (D̄ − µ_D)/(s_D/√n) ~ t_{n−1}
C.I.: D̄ ± t_{n−1}(α/2)·s_D/√n

3.4.2 Testing

Hypothesis: H₀: µ_D = µ₀ vs. H_A: µ_D ≠ µ₀
Test statistic: T = (D̄ − µ₀)/(s_D/√n) ~ t_{n−1}
Rejection region: |T| > t_{n−1}(α/2)

3.5 Two population variances

3.5.1 Estimation

Data: X_i ~ N(µ_X, σ²_X), Y_j ~ N(µ_Y, σ²_Y)
Estimator: σ̂²_X/σ̂²_Y = s²_X/s²_Y
Distribution: (s²_X/σ²_X)/(s²_Y/σ²_Y) ~ F_{n−1,m−1}
C.I.: ((s²_X/s²_Y)/F_{n−1,m−1}(α/2), (s²_X/s²_Y)/F_{n−1,m−1}(1 − α/2))

3.5.2 Testing

Hypothesis: H₀: σ²_X/σ²_Y = r₀ vs. H_A: σ²_X/σ²_Y ≠ r₀
Test statistic: F = s²_X/(s²_Y·r₀) ~ F_{n−1,m−1}
Rejection region: F > F_{n−1,m−1}(α/2) or F < F_{n−1,m−1}(1 − α/2)
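The matched-pair procedure is just the one-sample t-test applied to the differences; a sketch (function name `paired_t_statistic` is mine):

```python
from statistics import fmean, stdev

def paired_t_statistic(x, y, mu0=0.0):
    """Matched-pair T = (D-bar - mu0)/(s_D/sqrt(n)) on differences D_i = X_i - Y_i."""
    d = [xi - yi for xi, yi in zip(x, y)]
    return (fmean(d) - mu0) / (stdev(d) / len(d) ** 0.5)
```

Note that only the variability of the differences matters, which is why pairing helps when X_i and Y_i are positively correlated.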

3.6 Large-sample two population proportions

3.6.1 Estimation

Data: X_i ~ Ber(p_X), Y_j ~ Ber(p_Y)
Estimator: p̂_X − p̂_Y = X̄ − Ȳ
Distribution: (p̂_X − p̂_Y − (p_X − p_Y))/√(p_X(1 − p_X)/n + p_Y(1 − p_Y)/m) ~ N(0, 1)
C.I.: p̂_X − p̂_Y ± z(α/2)·√(p̂_X(1 − p̂_X)/n + p̂_Y(1 − p̂_Y)/m)

3.6.2 Testing (identical proportion hypothesis)

Hypothesis: H₀: p_X = p_Y vs. H_A: p_X ≠ p_Y
Test statistic: Z = (p̂_X − p̂_Y)/√(p̂(1 − p̂)/n + p̂(1 − p̂)/m) ~ N(0, 1), where p̂ = (nX̄ + mȲ)/(n + m)
Rejection region: |Z| > z(α/2)

3.6.3 Testing (non-identical proportion hypothesis)

Hypothesis: H₀: p_X − p_Y = D₀ vs. H_A: p_X − p_Y ≠ D₀
Test statistic: Z = (p̂_X − p̂_Y − D₀)/√(p̂_X(1 − p̂_X)/n + p̂_Y(1 − p̂_Y)/m) ~ N(0, 1)
Rejection region: |Z| > z(α/2)

4 General distribution

4.1 Estimation

4.1.1 Mean-variance trade-off

Unbiased: E(θ̂) = θ
Mean square error: E[(θ̂ − θ)²] = E[(θ̂ − E(θ̂) + E(θ̂) − θ)²] = (E(θ̂) − θ)² + Var(θ̂) = Bias² + Variance

4.1.2 Method of moments (MOM)

µ_k = E[X^k] and µ̂_k = (1/n)Σᵢ₌₁ⁿ X_i^k; set µ_k(θ) = µ̂_k and solve for θ.

Examples:
X ~ Poi(λ): µ₁ = λ, so µ̂₁ = X̄ and λ̂ = X̄
X ~ N(µ, σ²): µ₁ = µ, µ₂ = µ² + σ², so µ̂ = X̄ and σ̂² = (1/n)Σᵢ₌₁ⁿ (X_i − X̄)² (biased)
X ~ Γ(α, β): µ₁ = α/β, µ₂ = α(α + 1)/β², so β̂ = µ̂₁/(µ̂₂ − µ̂₁²) and α̂ = β̂·µ̂₁
X ~ U(0, θ): µ₁ = θ/2, so θ̂ = 2X̄ (could make no sense, e.g. if some X_i > 2X̄)
Pros: easy, consistent, asymptotically unbiased
Cons: could make no sense, not efficient

4.1.3 Maximum likelihood estimator (MLE)

θ̂ = arg max lik(θ) = arg max Πᵢ₌₁ⁿ f(X_i | θ)
θ̂ = arg max l(θ) = arg max Σᵢ₌₁ⁿ log f(X_i | θ)

Examples:
X ~ Poi(λ): P(X = x) = λ^x e^{−λ}/x!
l(λ) = Σᵢ₌₁ⁿ (X_i log λ − λ − log X_i!) = log λ·Σᵢ₌₁ⁿ X_i − nλ − Σᵢ₌₁ⁿ log X_i!
l′(λ) = (1/λ)Σᵢ₌₁ⁿ X_i − n = 0 ⇒ λ̂ = X̄

X ~ N(µ, σ²): lik(µ, σ) = Πᵢ₌₁ⁿ (1/(√(2π)σ)) exp(−(X_i − µ)²/(2σ²))
l(µ, σ) = −n log σ − (n/2) log 2π − (1/(2σ²))Σᵢ₌₁ⁿ (X_i − µ)²
∂l/∂µ = (1/σ²)Σᵢ₌₁ⁿ (X_i − µ) = 0 ⇒ µ̂ = X̄
∂l/∂σ = −n/σ + (1/σ³)Σᵢ₌₁ⁿ (X_i − µ)² = 0 ⇒ σ̂² = (1/n)Σᵢ₌₁ⁿ (X_i − X̄)²

X ~ Γ(α, β): lik(α, β) = Πᵢ₌₁ⁿ (β^α/Γ(α)) X_i^{α−1} e^{−βX_i}
l(α, β) = nα log β + (α − 1)Σᵢ₌₁ⁿ log X_i − βΣᵢ₌₁ⁿ X_i − n log Γ(α)
∂l/∂α = n log β + Σᵢ₌₁ⁿ log X_i − nΓ′(α)/Γ(α) = 0 (no closed-form solution)
∂l/∂β = nα/β − Σᵢ₌₁ⁿ X_i = 0 ⇒ β̂ = α̂/X̄
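As a numerical check of the Poisson example, one can maximize l(λ) = log λ·Σx_i − nλ (dropping the constant −Σ log x_i!) over a grid and see that the maximizer lands on the closed-form MLE λ̂ = X̄ (this snippet and its names are my own illustration, not from the notes):

```python
from math import log
from statistics import fmean

def poisson_loglik(lam, data):
    """Poisson log-likelihood up to an additive constant: l(lam) = log(lam)*sum(x) - n*lam."""
    return log(lam) * sum(data) - len(data) * lam

data = [2, 3, 1, 4, 0, 2]
grid = [k / 100 for k in range(1, 1001)]
lam_hat = max(grid, key=lambda lam: poisson_loglik(lam, data))
# the grid maximizer coincides with the closed-form MLE: lam_hat = X-bar
```

Here X̄ = 2.0 lies exactly on the grid, so the grid search recovers it.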

X ~ U(0, θ): lik(θ) = Πᵢ₌₁ⁿ (1/θ)·1{X_i < θ} = (1/θⁿ)·1{X₍ₙ₎ < θ} ⇒ θ̂ = X₍ₙ₎ (the sample maximum)
Pros: consistent, asymptotically unbiased, asymptotically normal, asymptotically efficient (unbiased + achieves the Cramér–Rao lower bound (CRLB))
Cons: not robust, could have no closed-form solution (could be sensitive to the optimization algorithm)

4.1.4 Asymptotic Normality for MLE

Asymptotic normality: √(nI(θ₀))·(θ̂ − θ₀) → N(0, 1)
C.I.: θ̂ ± z(α/2)/√(nI(θ̂)), where I(θ) = E[(∂/∂θ log f(X | θ))²] (Fisher information)
Under appropriate smoothness conditions: I(θ) = −E[∂²/∂θ² log f(X | θ)]

Examples:
X ~ Poi(λ): f(x | λ) = λ^x e^{−λ}/x!, E[X] = λ, Var(X) = λ, λ̂_MLE = X̄
I(λ) = E[(∂/∂λ log f(X | λ))²] = E[(X/λ − 1)²] = (E(X²) − 2λE(X) + λ²)/λ² = λ/λ² = 1/λ
or I(λ) = −E[∂²/∂λ² log f(X | λ)] = E[X/λ²] = 1/λ
C.I.: X̄ ± z(α/2)·√(X̄/n)

X ~ N(µ, σ²): log f(x | µ, σ²) = −(1/2) log(2πσ²) − (x − µ)²/(2σ²)
∂/∂µ log f = (x − µ)/σ², ∂²/∂µ² log f = −1/σ², so I(µ) = 1/σ²
C.I.: X̄ ± z(α/2)·σ/√n

Cramér–Rao lower bound: Var(θ̂) ≥ 1/(nI(θ)) for any unbiased estimator θ̂

4.1.5 Bootstrap

Parametric vs. non-parametric
Bootstrap replicates: θ*₁, ..., θ*_B, the statistic recomputed on resampled data

Percentile method: C.I. = (θ*_L, θ*_U), the empirical α/2 and 1 − α/2 quantiles of the bootstrap replicates
Normal approximation: se(θ̂) ≈ s_{θ*}; C.I.: θ̂ ± z(α/2)·s_{θ*}
Distribution approximation: the C.I. for θ₀ is (2θ̂ − θ*_U, 2θ̂ − θ*_L). Reason: if P(L ≤ θ̂ − θ₀ ≤ U) = 1 − α, then the C.I. is (θ̂ − U, θ̂ − L); approximating the distribution of θ̂ − θ₀ by that of θ* − θ̂ gives L ≈ θ*_L − θ̂ and U ≈ θ*_U − θ̂, hence the C.I. (2θ̂ − θ*_U, 2θ̂ − θ*_L)
If the bootstrap distribution is symmetric about θ̂, i.e. θ̂ − θ*_L = θ*_U − θ̂, then 2θ̂ − θ*_U = θ*_L and 2θ̂ − θ*_L = θ*_U, and this agrees with the percentile method
See also: cross-validation (in the literature these are often considered simultaneously)

4.1.6 Bayesian

Prior + likelihood → posterior (matters in small samples; small effect in large samples)

Examples:
Prior: U(0, 1): π(p) ∝ 1, or Beta(α, β): π(p) ∝ p^{α−1}(1 − p)^{β−1}
Likelihood: Bin(n, p): lik(p) ∝ p^N(1 − p)^{n−N}
Posterior: π*(p) ∝ p^{α+N−1}(1 − p)^{β+n−N−1}, i.e. Beta(α + N, β + n − N)
Prior mean µ = α/(α + β), prior variance σ² = αβ/[(α + β)²(α + β + 1)]
Posterior mean µ* = (α + N)/(α + β + n) = [(α + β)/(α + β + n)]·[α/(α + β)] + [n/(α + β + n)]·(N/n), a weighted average of the prior mean and the sample proportion
Posterior variance σ*² = (α + N)(β + n − N)/[(α + β + n)²(α + β + n + 1)] → 0 as n → ∞

Prior: Γ(a, b): π(θ) ∝ θ^{a−1} e^{−bθ}, E(θ) = a/b, Var(θ) = a/b²
Likelihood: Poi(θ): lik(θ) = Πᵢ₌₁ⁿ θ^{X_i} e^{−θ}/X_i! ∝ θ^S e^{−nθ}, with S = Σᵢ₌₁ⁿ X_i
Posterior: π*(θ) ∝ θ^{S+a−1} e^{−(n+b)θ}, i.e. Γ(a + S, b + n)
Posterior mean µ* = (a + S)/(b + n) = [b/(b + n)]·(a/b) + [n/(b + n)]·(S/n); posterior variance σ*² = (a + S)/(b + n)² → 0

Prior: Γ(a, b): π(λ) ∝ λ^{a−1} e^{−bλ}, E(λ) = a/b, Var(λ) = a/b²
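The percentile method can be sketched in a few lines of standard-library Python (function name `bootstrap_percentile_ci` and defaults are my own choices, not from the notes):

```python
import random
from statistics import fmean

def bootstrap_percentile_ci(data, stat=fmean, B=2000, alpha=0.05, seed=0):
    """Percentile-method bootstrap CI: resample with replacement B times,
    recompute the statistic, and take the empirical alpha/2 and 1-alpha/2 quantiles."""
    rng = random.Random(seed)
    n = len(data)
    reps = sorted(stat([rng.choice(data) for _ in range(n)]) for _ in range(B))
    return reps[int(B * alpha / 2)], reps[int(B * (1 - alpha / 2)) - 1]
```

The interval brackets the observed statistic; for the mean of 1, ..., 20 it straddles X̄ = 10.5.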

Likelihood: Exp(λ): lik(λ) = Πᵢ₌₁ⁿ λe^{−λx_i} = λⁿ e^{−sλ}, with s = Σᵢ₌₁ⁿ x_i
Posterior: π*(λ) ∝ λ^{n+a−1} e^{−(s+b)λ}, i.e. Γ(a + n, b + s)
Posterior mean µ* = (a + n)/(b + s) = [b/(b + s)]·(a/b) + [s/(b + s)]·(n/s); posterior variance σ*² = (a + n)/(b + s)²

Prior: N(a, b²): π(θ) ∝ exp(−(θ − a)²/(2b²))
Likelihood: N(θ, σ²): lik(θ) ∝ Πᵢ₌₁ⁿ exp(−(x_i − θ)²/(2σ²)) ∝ exp(−(SS + n(θ − x̄)²)/(2σ²))
Posterior: π*(θ) ∝ exp(−(θ − θ*)²/(2b*²)), where
θ* = (aσ² + nX̄b²)/(σ² + nb²) = [σ²/(σ² + nb²)]·a + [nb²/(σ² + nb²)]·X̄
b*² = b²σ²/(σ² + nb²), i.e. 1/b*² = 1/b² + 1/(σ²/n)

4.2 Testing

4.2.1 Likelihood ratio test

Hypothesis: H₀: µ = µ₀ vs. H_A: µ = µ_A (both simple)
Test statistic: Λ = f(X | H₀)/f(X | H_A)
Rejection region: small values of Λ(X)
By the Neyman–Pearson lemma, this test is most powerful for a simple null vs. a simple alternative

Example: H₀: µ = µ₀; H_A: µ = µ_A
Λ = exp[−(1/(2σ²))Σᵢ₌₁ⁿ (X_i − µ₀)²] / exp[−(1/(2σ²))Σᵢ₌₁ⁿ (X_i − µ_A)²]
Reject for small values of Σᵢ₌₁ⁿ (X_i − µ_A)² − Σᵢ₌₁ⁿ (X_i − µ₀)² = 2nX̄(µ₀ − µ_A) + nµ_A² − nµ₀².
If µ₀ > µ_A, reject for small values of X̄; if µ₀ < µ_A, reject for large values of X̄.
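The Beta–Binomial update from the notes is purely arithmetic, which makes it easy to sketch (function name `beta_binomial_posterior` is mine):

```python
def beta_binomial_posterior(alpha, beta, N, n):
    """Conjugate update from the notes: Beta(alpha, beta) prior + Bin(n, p) likelihood
    with N successes gives posterior Beta(alpha + N, beta + n - N)."""
    a, b = alpha + N, beta + (n - N)
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1))
    return a, b, mean, var
```

With a flat U(0, 1) = Beta(1, 1) prior and 7 successes in 10 trials, the posterior is Beta(8, 4) with mean 2/3, pulled slightly toward the prior mean 1/2 relative to the sample proportion 0.7.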

4.2.2 Generalized likelihood ratio test

Hypothesis: composite null vs. composite alternative
Test statistic: Λ = max_{θ ∈ ω₀} f(X | θ) / max_{θ ∈ Ω} f(X | θ), where ω₀ is the null parameter set and Ω = ω₀ ∪ ω_A
−2 log Λ → χ²_{dim Ω − dim ω₀} as n → ∞
Rejection region: small values of Λ(X), i.e. large values of −2 log Λ

Example: H₀: µ = µ₀ vs. H_A: µ ≠ µ₀, with σ² known
Λ(X) = exp[−(1/(2σ²))Σᵢ₌₁ⁿ (X_i − µ₀)²] / exp[−(1/(2σ²))Σᵢ₌₁ⁿ (X_i − X̄)²] = exp(−(1/(2σ²))[Σᵢ₌₁ⁿ (X_i − µ₀)² − Σᵢ₌₁ⁿ (X_i − X̄)²])
−2 log Λ(X) = (1/σ²)(Σᵢ₌₁ⁿ (X_i − µ₀)² − Σᵢ₌₁ⁿ (X_i − X̄)²) = (n/σ²)(X̄ − µ₀)² = ((X̄ − µ₀)/(σ/√n))² ~ χ²₁
[using Σᵢ₌₁ⁿ (X_i − µ₀)² = Σᵢ₌₁ⁿ (X_i − X̄)² + n(X̄ − µ₀)²]
Reject when −2 log Λ > χ²₁(α) ⇔ ((X̄ − µ₀)/(σ/√n))² > χ²₁(α) ⇔ |X̄ − µ₀|/(σ/√n) > z(α/2)

Poisson dispersion test: X_i ~ Poi(λ_i), H₀: λ₁ = ⋯ = λ_n = λ
Λ = Πᵢ₌₁ⁿ (λ̂^{x_i} e^{−λ̂}/x_i!) / Πᵢ₌₁ⁿ (λ̂_i^{x_i} e^{−λ̂_i}/x_i!) = Πᵢ₌₁ⁿ (x̄/x_i)^{x_i} e^{x_i − x̄}, where λ̂ = x̄ and λ̂_i = x_i
−2 log Λ = 2Σᵢ₌₁ⁿ [x_i log(x_i/x̄) + (x̄ − x_i)] = 2Σᵢ₌₁ⁿ x_i log(x_i/x̄) ~ χ²_{n−1}
Reject when −2 log Λ > χ²_{n−1}(α)

5 ANOVA

5.1 One-way ANOVA

5.1.1 Model

Y_ij ~ N(µ + α_i, σ²) for i = 1, ..., I and j = 1, ..., n_i
Constraint: Σᵢ₌₁ᴵ α_i = 0

5.1.2 Testing

SST = SSB + SSE
SST = Σ_{i,j} (Y_ij − Ȳ··)², SSB = Σᵢ n_i(Ȳ_i· − Ȳ··)², SSE = Σ_{i,j} (Y_ij − Ȳ_i·)²
d.f.: (n − 1) = (I − 1) + (n − I), where n = Σᵢ n_i (in the balanced case n_i = J, n − I = I(J − 1))
Hypothesis: H₀: α₁ = ⋯ = α_I = 0
Test statistic: F = [SSB/(I − 1)]/[SSE/(n − I)] = MSB/MSE ~ F_{I−1,n−I}
Rejection region: F > F_{I−1,n−I}(α)

5.1.3 Estimation

µ̂ = Ȳ··, α̂_i = Ȳ_i· − Ȳ··, σ̂² = s²_p = MSE = SSE/(n − I)
C.I. for µ + α_i: Ȳ_i· ± t_{n−I}(α/2)·√(MSE/n_i)
C.I. for α_i − α_j:
Two-sample: (Ȳ_i· − Ȳ_j·) ± t_{n−I}(α/2)·√(MSE/n_i + MSE/n_j)
Bonferroni: (Ȳ_i· − Ȳ_j·) ± t_{n−I}((α/2)/(I choose 2))·√(MSE/n_i + MSE/n_j)
Tukey (n₁ = ⋯ = n_I = J): (Ȳ_i· − Ȳ_j·) ± q_{I,n−I}(α)·√(MSE/J)

5.2 Randomized block design

5.2.1 Model

Y_ij ~ N(µ + α_i + β_j, σ²) for i = 1, ..., I and j = 1, ..., J
Constraints: Σᵢ α_i = 0, Σⱼ β_j = 0

5.2.2 Testing

SST = SSA + SSB + SSE
SST = Σ_{i,j} (Y_ij − Ȳ··)², SSA = Σᵢ J(Ȳ_i· − Ȳ··)², SSB = Σⱼ I(Ȳ_·j − Ȳ··)²
SSE = Σ_{i,j} (Y_ij − Ȳ_i· − Ȳ_·j + Ȳ··)²
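The one-way F statistic follows directly from the sums of squares above; a compact sketch (function name `one_way_anova_F` is mine):

```python
from statistics import fmean

def one_way_anova_F(groups):
    """F = MSB/MSE = [SSB/(I-1)] / [SSE/(n-I)] for I groups with n total observations."""
    all_obs = [y for g in groups for y in g]
    grand = fmean(all_obs)
    I, n = len(groups), len(all_obs)
    ssb = sum(len(g) * (fmean(g) - grand) ** 2 for g in groups)
    sse = sum((y - fmean(g)) ** 2 for g in groups for y in g)
    return (ssb / (I - 1)) / (sse / (n - I))
```

For three groups with means 2, 3, 4 and identical within-group spread, SSB = SSE = 6, so F = 3/1 = 3.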

d.f.: (IJ − 1) = (I − 1) + (J − 1) + (I − 1)(J − 1)
Hypothesis: H₀: α₁ = ⋯ = α_I = 0
Test statistic: F = [SSA/(I − 1)]/[SSE/((I − 1)(J − 1))] = MSA/MSE ~ F_{I−1,(I−1)(J−1)}
Rejection region: F > F_{I−1,(I−1)(J−1)}(α)
Hypothesis: H₀: β₁ = ⋯ = β_J = 0
Test statistic: F = [SSB/(J − 1)]/[SSE/((I − 1)(J − 1))] = MSB/MSE ~ F_{J−1,(I−1)(J−1)}
Rejection region: F > F_{J−1,(I−1)(J−1)}(α)

5.2.3 Estimation

µ̂ = Ȳ··, α̂_i = Ȳ_i· − Ȳ··, β̂_j = Ȳ_·j − Ȳ··, σ̂² = s²_p = MSE = SSE/((I − 1)(J − 1))
C.I. for µ + α_i: Ȳ_i· ± t_{(I−1)(J−1)}(α/2)·√(MSE/J)
C.I. for µ + β_j: Ȳ_·j ± t_{(I−1)(J−1)}(α/2)·√(MSE/I)
C.I. for α_i − α_j:
Two-sample: (Ȳ_i· − Ȳ_j·) ± t_{(I−1)(J−1)}(α/2)·√(MSE/J + MSE/J)
Tukey: (Ȳ_i· − Ȳ_j·) ± q_{I,(I−1)(J−1)}(α)·√(MSE/J)
C.I. for β_i − β_j:
Two-sample: (Ȳ_·i − Ȳ_·j) ± t_{(I−1)(J−1)}(α/2)·√(MSE/I + MSE/I)
Tukey: (Ȳ_·i − Ȳ_·j) ± q_{J,(I−1)(J−1)}(α)·√(MSE/I)

5.3 Two-way ANOVA

5.3.1 Model

Y_ijk ~ N(µ + α_i + β_j + δ_ij, σ²) for i = 1, ..., I, j = 1, ..., J, k = 1, ..., K
Constraints: Σᵢ α_i = 0, Σⱼ β_j = 0, Σᵢ δ_ij = 0 for each j, Σⱼ δ_ij = 0 for each i

5.3.2 Testing

SST = SSA + SSB + SSAB + SSE
SST = Σ_{i,j,k} (Y_ijk − Ȳ···)², SSA = Σᵢ JK(Ȳ_i·· − Ȳ···)², SSB = Σⱼ IK(Ȳ_·j· − Ȳ···)²
SSAB = Σ_{i,j} K(Ȳ_ij· − Ȳ_i·· − Ȳ_·j· + Ȳ···)², SSE = Σ_{i,j,k} (Y_ijk − Ȳ_ij·)²
d.f.: (IJK − 1) = (I − 1) + (J − 1) + (I − 1)(J − 1) + IJ(K − 1)
Hypothesis: H₀: α₁ = ⋯ = α_I = 0
Test statistic: F = [SSA/(I − 1)]/[SSE/(IJ(K − 1))] = MSA/MSE ~ F_{I−1,IJ(K−1)}
Rejection region: F > F_{I−1,IJ(K−1)}(α)
Hypothesis: H₀: β₁ = ⋯ = β_J = 0
Test statistic: F = [SSB/(J − 1)]/[SSE/(IJ(K − 1))] = MSB/MSE ~ F_{J−1,IJ(K−1)}
Rejection region: F > F_{J−1,IJ(K−1)}(α)
Hypothesis: H₀: δ_ij = 0 for all i, j
Test statistic: F = [SSAB/((I − 1)(J − 1))]/[SSE/(IJ(K − 1))] = MSAB/MSE ~ F_{(I−1)(J−1),IJ(K−1)}
Rejection region: F > F_{(I−1)(J−1),IJ(K−1)}(α)

5.3.3 Estimation

σ̂² = s²_p = MSE = SSE/(IJ(K − 1))
Tukey (when the interaction is significant): (Ȳ_ij· − Ȳ_i′j′·) ± q_{IJ,IJ(K−1)}(α)·√(MSE/K)

6 Linear regression

6.1 Simple linear regression

6.1.1 Model

y ~ N(β₀ + β₁x, σ²)

6.1.2 Testing

SST = SSR + SSE
SST = Σ(y_i − ȳ)², SSR = Σ(ŷ_i − ȳ)², SSE = Σ(y_i − ŷ_i)²
d.f.: (n − 1) = (2 − 1) + (n − 2)
Hypothesis: H₀: β₁ = 0
Test statistic: F = [SSR/(2 − 1)]/[SSE/(n − 2)] = MSR/MSE ~ F_{1,n−2}
Rejection region: F > F_{1,n−2}(α)

6.1.3 Estimation

Least-squares estimates: β̂₁ = S_xy/S_xx, β̂₀ = ȳ − β̂₁x̄, where S_xy = Σ(x_i − x̄)(y_i − ȳ) and S_xx = Σ(x_i − x̄)²
Distributions (with x²̄ = (1/n)Σx_i²):
(β̂₁ − β₁)/√(σ²/S_xx) ~ N(0, 1), (β̂₁ − β₁)/√(MSE/S_xx) ~ t_{n−2}
(β̂₀ − β₀)/√(σ²·x²̄/S_xx) ~ N(0, 1), (β̂₀ − β₀)/√(MSE·x²̄/S_xx) ~ t_{n−2}
(ŷ* − (β₀ + β₁x*))/√(MSE(1/n + (x* − x̄)²/S_xx)) ~ t_{n−2}
C.I. for β₁: β̂₁ ± t_{n−2}(α/2)·√(MSE/S_xx)
C.I. for β₀: β̂₀ ± t_{n−2}(α/2)·√(MSE·x²̄/S_xx)
C.I. for β₀ + β₁x*: ŷ* ± t_{n−2}(α/2)·√(MSE(1/n + (x* − x̄)²/S_xx))
Prediction interval for y*: ŷ* ± t_{n−2}(α/2)·√(MSE(1 + 1/n + (x* − x̄)²/S_xx))

6.1.4 T-test

H₀: β₁ = 0, T = β̂₁/√(MSE/S_xx) ~ t_{n−2}
H₀: β₀ = 0, T = β̂₀/√(MSE·x²̄/S_xx) ~ t_{n−2}
The t-test for β₁ is equivalent to the analysis of variance (T² = F).
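The least-squares formulas translate directly to code; a minimal sketch (function name `simple_linear_regression` is mine):

```python
from statistics import fmean

def simple_linear_regression(x, y):
    """Least squares: beta1-hat = S_xy/S_xx, beta0-hat = y-bar - beta1-hat * x-bar."""
    xbar, ybar = fmean(x), fmean(y)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b1 = sxy / sxx
    return ybar - b1 * xbar, b1
```

On exactly linear data y = 1 + 2x the estimates recover the intercept 1 and slope 2.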

6.1.5 Coefficient of determination

r² = SSR/SST = S²_xy/(S_xx S_yy)
The percent reduction in the total variation obtained by using the regression line.
Correlation analysis: H₀: ρ = 0, T = r√(n − 2)/√(1 − r²) ~ t_{n−2}; equivalent to the t-test for β₁.

6.2 Variable Selection

Sometimes it is inefficient to use all the variables, so we seek to choose a subset. There are several ways to do this. First, we need to discuss measures of fit. R² is a good measure, but it only works for linear regression.

6.2.1 Measures of Fit

We can minimize: AIC, BIC, or MSE/classification error.
AIC: 2k − 2 log(L̂), where k is the number of variables and L̂ is the maximized likelihood.
BIC: log(n)·k − 2 log(L̂), where n is the number of observations.
MSE: same as before.

6.2.2 Stepwise Regression

Forward stepwise regression:
1. Initialize V := ∅
2. For each variable v ∉ V:
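The two information criteria are one-line formulas; for concreteness (function names are mine), given a model's maximized log-likelihood:

```python
from math import log

def aic(k, loglik):
    """AIC = 2k - 2*log(L-hat); smaller is better."""
    return 2 * k - 2 * loglik

def bic(k, n, loglik):
    """BIC = k*log(n) - 2*log(L-hat); penalizes extra variables more than AIC once n > e^2."""
    return k * log(n) - 2 * loglik
```

The only difference is the per-variable penalty, 2 for AIC versus log n for BIC, which is why BIC selects sparser models for large n.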

(a) Run a model with all the variables in V plus the variable v
(b) Keep track of the AIC/BIC
3. Find the variable v that most improves (minimizes) the AIC, and set V := V ∪ {v}.
4. Go back to step 2.

6.2.3 Ridge Regression

Solve: arg min_β Σᵢ (x_iᵀβ − y_i)² + λΣⱼ β_j².

6.2.4 Lasso Regression

Solve: arg min_β Σᵢ (x_iᵀβ − y_i)² + λΣⱼ |β_j|.

7 Other important concepts to know

7.1 Cross Validation

7.2 Dimensionality Reduction

PCA
Variable Selection / Stepwise Regression

7.3 Classification

Logistic Regression
K-NN
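The shrinkage effect of the ridge penalty is easiest to see in the single-predictor, no-intercept case, where the objective has a closed form (this reduction and the name `ridge_1d` are my own illustration, not from the notes):

```python
def ridge_1d(x, y, lam):
    """Single-predictor, no-intercept ridge: beta-hat = sum(x_i*y_i) / (sum(x_i^2) + lam).
    lam = 0 recovers ordinary least squares; larger lam shrinks beta-hat toward 0."""
    return sum(xi * yi for xi, yi in zip(x, y)) / (sum(xi * xi for xi in x) + lam)
```

With y_i = 2x_i exactly, λ = 0 returns the OLS slope 2, while λ = Σx_i² halves it, showing the bias introduced in exchange for lower variance (the mean-variance trade-off of Section 4.1.1).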

Random Forests

7.4 Clustering

K-means
E-M algorithm
Other clustering methods (subspace clustering, K-medoids, etc.)

7.5 Non-parametric methods

Random Forest
Support Vector Machines
Neural Networks (technically semi-parametric)

7.6 Model selection

Elbow method
See also: cross-validation


More information

Simple and Multiple Linear Regression

Simple and Multiple Linear Regression Sta. 113 Chapter 12 and 13 of Devore March 12, 2010 Table of contents 1 Simple Linear Regression 2 Model Simple Linear Regression A simple linear regression model is given by Y = β 0 + β 1 x + ɛ where

More information

Final Examination Statistics 200C. T. Ferguson June 11, 2009

Final Examination Statistics 200C. T. Ferguson June 11, 2009 Final Examination Statistics 00C T. Ferguson June, 009. (a) Define: X n converges in probability to X. (b) Define: X m converges in quadratic mean to X. (c) Show that if X n converges in quadratic mean

More information

Mathematical statistics

Mathematical statistics October 4 th, 2018 Lecture 12: Information Where are we? Week 1 Week 2 Week 4 Week 7 Week 10 Week 14 Probability reviews Chapter 6: Statistics and Sampling Distributions Chapter 7: Point Estimation Chapter

More information

Chapter 8.8.1: A factorization theorem

Chapter 8.8.1: A factorization theorem LECTURE 14 Chapter 8.8.1: A factorization theorem The characterization of a sufficient statistic in terms of the conditional distribution of the data given the statistic can be difficult to work with.

More information

Two-Way Analysis of Variance - no interaction

Two-Way Analysis of Variance - no interaction 1 Two-Way Analysis of Variance - no interaction Example: Tests were conducted to assess the effects of two factors, engine type, and propellant type, on propellant burn rate in fired missiles. Three engine

More information

Statistics Ph.D. Qualifying Exam: Part II November 3, 2001

Statistics Ph.D. Qualifying Exam: Part II November 3, 2001 Statistics Ph.D. Qualifying Exam: Part II November 3, 2001 Student Name: 1. Answer 8 out of 12 problems. Mark the problems you selected in the following table. 1 2 3 4 5 6 7 8 9 10 11 12 2. Write your

More information

Multiple Linear Regression

Multiple Linear Regression Multiple Linear Regression Simple linear regression tries to fit a simple line between two variables Y and X. If X is linearly related to Y this explains some of the variability in Y. In most cases, there

More information

Information in Data. Sufficiency, Ancillarity, Minimality, and Completeness

Information in Data. Sufficiency, Ancillarity, Minimality, and Completeness Information in Data Sufficiency, Ancillarity, Minimality, and Completeness Important properties of statistics that determine the usefulness of those statistics in statistical inference. These general properties

More information

Probability Theory and Statistics. Peter Jochumzen

Probability Theory and Statistics. Peter Jochumzen Probability Theory and Statistics Peter Jochumzen April 18, 2016 Contents 1 Probability Theory And Statistics 3 1.1 Experiment, Outcome and Event................................ 3 1.2 Probability............................................

More information

Part IB. Statistics. Year

Part IB. Statistics. Year Part IB Year 2018 2017 2016 2015 2014 2013 2012 2011 2010 2009 2008 2007 2006 2005 2004 2003 2002 2001 2018 42 Paper 1, Section I 7H X 1, X 2,..., X n form a random sample from a distribution whose probability

More information

Analysis of Variance and Design of Experiments-I

Analysis of Variance and Design of Experiments-I Analysis of Variance and Design of Experiments-I MODULE VIII LECTURE - 35 ANALYSIS OF VARIANCE IN RANDOM-EFFECTS MODEL AND MIXED-EFFECTS MODEL Dr. Shalabh Department of Mathematics and Statistics Indian

More information

Final Examination a. STA 532: Statistical Inference. Wednesday, 2015 Apr 29, 7:00 10:00pm. Thisisaclosed bookexam books&phonesonthefloor.

Final Examination a. STA 532: Statistical Inference. Wednesday, 2015 Apr 29, 7:00 10:00pm. Thisisaclosed bookexam books&phonesonthefloor. Final Examination a STA 532: Statistical Inference Wednesday, 2015 Apr 29, 7:00 10:00pm Thisisaclosed bookexam books&phonesonthefloor Youmayuseacalculatorandtwo pagesofyourownnotes Do not share calculators

More information

UNIVERSITY OF TORONTO SCARBOROUGH Department of Computer and Mathematical Sciences FINAL EXAMINATION, APRIL 2013

UNIVERSITY OF TORONTO SCARBOROUGH Department of Computer and Mathematical Sciences FINAL EXAMINATION, APRIL 2013 UNIVERSITY OF TORONTO SCARBOROUGH Department of Computer and Mathematical Sciences FINAL EXAMINATION, APRIL 2013 STAB57H3 Introduction to Statistics Duration: 3 hours Last Name: First Name: Student number:

More information

Ch. 1: Data and Distributions

Ch. 1: Data and Distributions Ch. 1: Data and Distributions Populations vs. Samples How to graphically display data Histograms, dot plots, stem plots, etc Helps to show how samples are distributed Distributions of both continuous and

More information

LIST OF FORMULAS FOR STK1100 AND STK1110

LIST OF FORMULAS FOR STK1100 AND STK1110 LIST OF FORMULAS FOR STK1100 AND STK1110 (Version of 11. November 2015) 1. Probability Let A, B, A 1, A 2,..., B 1, B 2,... be events, that is, subsets of a sample space Ω. a) Axioms: A probability function

More information

Master s Written Examination

Master s Written Examination Master s Written Examination Option: Statistics and Probability Spring 05 Full points may be obtained for correct answers to eight questions Each numbered question (which may have several parts) is worth

More information

Statistics Ph.D. Qualifying Exam

Statistics Ph.D. Qualifying Exam Department of Statistics Carnegie Mellon University May 7 2008 Statistics Ph.D. Qualifying Exam You are not expected to solve all five problems. Complete solutions to few problems will be preferred to

More information

Statistical Inference

Statistical Inference Statistical Inference Serik Sagitov, Chalmers University of Technology and Gothenburg University Abstract This text is a compendium for the undergraduate course on course MVE155 Statistical Inference worth

More information

Spring 2012 Math 541A Exam 1. X i, S 2 = 1 n. n 1. X i I(X i < c), T n =

Spring 2012 Math 541A Exam 1. X i, S 2 = 1 n. n 1. X i I(X i < c), T n = Spring 2012 Math 541A Exam 1 1. (a) Let Z i be independent N(0, 1), i = 1, 2,, n. Are Z = 1 n n Z i and S 2 Z = 1 n 1 n (Z i Z) 2 independent? Prove your claim. (b) Let X 1, X 2,, X n be independent identically

More information

A Very Brief Summary of Statistical Inference, and Examples

A Very Brief Summary of Statistical Inference, and Examples A Very Brief Summary of Statistical Inference, and Examples Trinity Term 2009 Prof. Gesine Reinert Our standard situation is that we have data x = x 1, x 2,..., x n, which we view as realisations of random

More information

First Year Examination Department of Statistics, University of Florida

First Year Examination Department of Statistics, University of Florida First Year Examination Department of Statistics, University of Florida August 19, 010, 8:00 am - 1:00 noon Instructions: 1. You have four hours to answer questions in this examination.. You must show your

More information

Math 423/533: The Main Theoretical Topics

Math 423/533: The Main Theoretical Topics Math 423/533: The Main Theoretical Topics Notation sample size n, data index i number of predictors, p (p = 2 for simple linear regression) y i : response for individual i x i = (x i1,..., x ip ) (1 p)

More information

iron retention (log) high Fe2+ medium Fe2+ high Fe3+ medium Fe3+ low Fe2+ low Fe3+ 2 Two-way ANOVA

iron retention (log) high Fe2+ medium Fe2+ high Fe3+ medium Fe3+ low Fe2+ low Fe3+ 2 Two-way ANOVA iron retention (log) 0 1 2 3 high Fe2+ high Fe3+ low Fe2+ low Fe3+ medium Fe2+ medium Fe3+ 2 Two-way ANOVA In the one-way design there is only one factor. What if there are several factors? Often, we are

More information

Summary of Chapter 7 (Sections ) and Chapter 8 (Section 8.1)

Summary of Chapter 7 (Sections ) and Chapter 8 (Section 8.1) Summary of Chapter 7 (Sections 7.2-7.5) and Chapter 8 (Section 8.1) Chapter 7. Tests of Statistical Hypotheses 7.2. Tests about One Mean (1) Test about One Mean Case 1: σ is known. Assume that X N(µ, σ

More information

Statistics 203: Introduction to Regression and Analysis of Variance Course review

Statistics 203: Introduction to Regression and Analysis of Variance Course review Statistics 203: Introduction to Regression and Analysis of Variance Course review Jonathan Taylor - p. 1/?? Today Review / overview of what we learned. - p. 2/?? General themes in regression models Specifying

More information

Some Assorted Formulae. Some confidence intervals: σ n. x ± z α/2. x ± t n 1;α/2 n. ˆp(1 ˆp) ˆp ± z α/2 n. χ 2 n 1;1 α/2. n 1;α/2

Some Assorted Formulae. Some confidence intervals: σ n. x ± z α/2. x ± t n 1;α/2 n. ˆp(1 ˆp) ˆp ± z α/2 n. χ 2 n 1;1 α/2. n 1;α/2 STA 248 H1S MIDTERM TEST February 26, 2008 SURNAME: SOLUTIONS GIVEN NAME: STUDENT NUMBER: INSTRUCTIONS: Time: 1 hour and 50 minutes Aids allowed: calculator Tables of the standard normal, t and chi-square

More information

Probability and Statistics qualifying exam, May 2015

Probability and Statistics qualifying exam, May 2015 Probability and Statistics qualifying exam, May 2015 Name: Instructions: 1. The exam is divided into 3 sections: Linear Models, Mathematical Statistics and Probability. You must pass each section to pass

More information

Data Mining Stat 588

Data Mining Stat 588 Data Mining Stat 588 Lecture 02: Linear Methods for Regression Department of Statistics & Biostatistics Rutgers University September 13 2011 Regression Problem Quantitative generic output variable Y. Generic

More information

9 Asymptotic Approximations and Practical Asymptotic Tools

9 Asymptotic Approximations and Practical Asymptotic Tools 9 Asymptotic Approximations and Practical Asymptotic Tools A point estimator is merely an educated guess about the true value of an unknown parameter. The utility of a point estimate without some idea

More information

Elements of statistics (MATH0487-1)

Elements of statistics (MATH0487-1) Elements of statistics (MATH0487-1) Prof. Dr. Dr. K. Van Steen University of Liège, Belgium November 12, 2012 Introduction to Statistics Basic Probability Revisited Sampling Exploratory Data Analysis -

More information

Statistics GIDP Ph.D. Qualifying Exam Theory Jan 11, 2016, 9:00am-1:00pm

Statistics GIDP Ph.D. Qualifying Exam Theory Jan 11, 2016, 9:00am-1:00pm Statistics GIDP Ph.D. Qualifying Exam Theory Jan, 06, 9:00am-:00pm Instructions: Provide answers on the supplied pads of paper; write on only one side of each sheet. Complete exactly 5 of the 6 problems.

More information

For iid Y i the stronger conclusion holds; for our heuristics ignore differences between these notions.

For iid Y i the stronger conclusion holds; for our heuristics ignore differences between these notions. Large Sample Theory Study approximate behaviour of ˆθ by studying the function U. Notice U is sum of independent random variables. Theorem: If Y 1, Y 2,... are iid with mean µ then Yi n µ Called law of

More information

Statement: With my signature I confirm that the solutions are the product of my own work. Name: Signature:.

Statement: With my signature I confirm that the solutions are the product of my own work. Name: Signature:. MATHEMATICAL STATISTICS Homework assignment Instructions Please turn in the homework with this cover page. You do not need to edit the solutions. Just make sure the handwriting is legible. You may discuss

More information

Factorial and Unbalanced Analysis of Variance

Factorial and Unbalanced Analysis of Variance Factorial and Unbalanced Analysis of Variance Nathaniel E. Helwig Assistant Professor of Psychology and Statistics University of Minnesota (Twin Cities) Updated 04-Jan-2017 Nathaniel E. Helwig (U of Minnesota)

More information

Maximum Likelihood Large Sample Theory

Maximum Likelihood Large Sample Theory Maximum Likelihood Large Sample Theory MIT 18.443 Dr. Kempthorne Spring 2015 1 Outline 1 Large Sample Theory of Maximum Likelihood Estimates 2 Asymptotic Results: Overview Asymptotic Framework Data Model

More information

Linear regression. We have that the estimated mean in linear regression is. ˆµ Y X=x = ˆβ 0 + ˆβ 1 x. The standard error of ˆµ Y X=x is.

Linear regression. We have that the estimated mean in linear regression is. ˆµ Y X=x = ˆβ 0 + ˆβ 1 x. The standard error of ˆµ Y X=x is. Linear regression We have that the estimated mean in linear regression is The standard error of ˆµ Y X=x is where x = 1 n s.e.(ˆµ Y X=x ) = σ ˆµ Y X=x = ˆβ 0 + ˆβ 1 x. 1 n + (x x)2 i (x i x) 2 i x i. The

More information

MISCELLANEOUS TOPICS RELATED TO LIKELIHOOD. Copyright c 2012 (Iowa State University) Statistics / 30

MISCELLANEOUS TOPICS RELATED TO LIKELIHOOD. Copyright c 2012 (Iowa State University) Statistics / 30 MISCELLANEOUS TOPICS RELATED TO LIKELIHOOD Copyright c 2012 (Iowa State University) Statistics 511 1 / 30 INFORMATION CRITERIA Akaike s Information criterion is given by AIC = 2l(ˆθ) + 2k, where l(ˆθ)

More information

Notes on the Multivariate Normal and Related Topics

Notes on the Multivariate Normal and Related Topics Version: July 10, 2013 Notes on the Multivariate Normal and Related Topics Let me refresh your memory about the distinctions between population and sample; parameters and statistics; population distributions

More information

Hypothesis testing: theory and methods

Hypothesis testing: theory and methods Statistical Methods Warsaw School of Economics November 3, 2017 Statistical hypothesis is the name of any conjecture about unknown parameters of a population distribution. The hypothesis should be verifiable

More information

STAT 135 Lab 5 Bootstrapping and Hypothesis Testing

STAT 135 Lab 5 Bootstrapping and Hypothesis Testing STAT 135 Lab 5 Bootstrapping and Hypothesis Testing Rebecca Barter March 2, 2015 The Bootstrap Bootstrap Suppose that we are interested in estimating a parameter θ from some population with members x 1,...,

More information

Lectures on Simple Linear Regression Stat 431, Summer 2012

Lectures on Simple Linear Regression Stat 431, Summer 2012 Lectures on Simple Linear Regression Stat 43, Summer 0 Hyunseung Kang July 6-8, 0 Last Updated: July 8, 0 :59PM Introduction Previously, we have been investigating various properties of the population

More information

Formal Statement of Simple Linear Regression Model

Formal Statement of Simple Linear Regression Model Formal Statement of Simple Linear Regression Model Y i = β 0 + β 1 X i + ɛ i Y i value of the response variable in the i th trial β 0 and β 1 are parameters X i is a known constant, the value of the predictor

More information

Part IB Statistics. Theorems with proof. Based on lectures by D. Spiegelhalter Notes taken by Dexter Chua. Lent 2015

Part IB Statistics. Theorems with proof. Based on lectures by D. Spiegelhalter Notes taken by Dexter Chua. Lent 2015 Part IB Statistics Theorems with proof Based on lectures by D. Spiegelhalter Notes taken by Dexter Chua Lent 2015 These notes are not endorsed by the lecturers, and I have modified them (often significantly)

More information

Estimation, Inference, and Hypothesis Testing

Estimation, Inference, and Hypothesis Testing Chapter 2 Estimation, Inference, and Hypothesis Testing Note: The primary reference for these notes is Ch. 7 and 8 of Casella & Berger 2. This text may be challenging if new to this topic and Ch. 7 of

More information

Mathematics Ph.D. Qualifying Examination Stat Probability, January 2018

Mathematics Ph.D. Qualifying Examination Stat Probability, January 2018 Mathematics Ph.D. Qualifying Examination Stat 52800 Probability, January 2018 NOTE: Answers all questions completely. Justify every step. Time allowed: 3 hours. 1. Let X 1,..., X n be a random sample from

More information

Foundations of Statistical Inference

Foundations of Statistical Inference Foundations of Statistical Inference Julien Berestycki Department of Statistics University of Oxford MT 2016 Julien Berestycki (University of Oxford) SB2a MT 2016 1 / 20 Lecture 6 : Bayesian Inference

More information

1 General problem. 2 Terminalogy. Estimation. Estimate θ. (Pick a plausible distribution from family. ) Or estimate τ = τ(θ).

1 General problem. 2 Terminalogy. Estimation. Estimate θ. (Pick a plausible distribution from family. ) Or estimate τ = τ(θ). Estimation February 3, 206 Debdeep Pati General problem Model: {P θ : θ Θ}. Observe X P θ, θ Θ unknown. Estimate θ. (Pick a plausible distribution from family. ) Or estimate τ = τ(θ). Examples: θ = (µ,

More information

Master s Written Examination - Solution

Master s Written Examination - Solution Master s Written Examination - Solution Spring 204 Problem Stat 40 Suppose X and X 2 have the joint pdf f X,X 2 (x, x 2 ) = 2e (x +x 2 ), 0 < x < x 2

More information

Statistics 135 Fall 2008 Final Exam

Statistics 135 Fall 2008 Final Exam Name: SID: Statistics 135 Fall 2008 Final Exam Show your work. The number of points each question is worth is shown at the beginning of the question. There are 10 problems. 1. [2] The normal equations

More information

Statistical Hypothesis Testing

Statistical Hypothesis Testing Statistical Hypothesis Testing Dr. Phillip YAM 2012/2013 Spring Semester Reference: Chapter 7 of Tests of Statistical Hypotheses by Hogg and Tanis. Section 7.1 Tests about Proportions A statistical hypothesis

More information

Stat 5101 Lecture Notes

Stat 5101 Lecture Notes Stat 5101 Lecture Notes Charles J. Geyer Copyright 1998, 1999, 2000, 2001 by Charles J. Geyer May 7, 2001 ii Stat 5101 (Geyer) Course Notes Contents 1 Random Variables and Change of Variables 1 1.1 Random

More information

SOLUTION FOR HOMEWORK 6, STAT 6331

SOLUTION FOR HOMEWORK 6, STAT 6331 SOLUTION FOR HOMEWORK 6, STAT 633. Exerc.7.. It is given that X,...,X n is a sample from N(θ, σ ), and the Bayesian approach is used with Θ N(µ, τ ). The parameters σ, µ and τ are given. (a) Find the joinf

More information

Lecture 2: Statistical Decision Theory (Part I)

Lecture 2: Statistical Decision Theory (Part I) Lecture 2: Statistical Decision Theory (Part I) Hao Helen Zhang Hao Helen Zhang Lecture 2: Statistical Decision Theory (Part I) 1 / 35 Outline of This Note Part I: Statistics Decision Theory (from Statistical

More information

Lecture 23 Maximum Likelihood Estimation and Bayesian Inference

Lecture 23 Maximum Likelihood Estimation and Bayesian Inference Lecture 23 Maximum Likelihood Estimation and Bayesian Inference Thais Paiva STA 111 - Summer 2013 Term II August 7, 2013 1 / 31 Thais Paiva STA 111 - Summer 2013 Term II Lecture 23, 08/07/2013 Lecture

More information

Introduction to Machine Learning. Lecture 2

Introduction to Machine Learning. Lecture 2 Introduction to Machine Learning Lecturer: Eran Halperin Lecture 2 Fall Semester Scribe: Yishay Mansour Some of the material was not presented in class (and is marked with a side line) and is given for

More information