UNIVERSITY OF LONDON
BSc and MSci EXAMINATIONS (MATHEMATICS)
May-June 2006

This paper is also taken for the relevant examination for the Associateship.

M2S2 Statistical Modelling

Date: Wednesday, 17th May 2006
Time: 10 am - 12 noon

Credit will be given for all questions attempted but extra credit will be given for complete or nearly complete answers.

Calculators may not be used.

A Formula Sheet is provided on pages 7-8.

© 2006 University of London    M2S2    Page 1 of 8
1. (i) Write down a description of the method of moments estimation procedure for an arbitrary set of parameters collected in a vector $\theta$ of length $k$, based on the random sample $Z_1, \ldots, Z_n$. Each $Z_i$, $i = 1, \ldots, n$, is independently drawn from the distribution $f_Z(z; \theta)$.

(ii) The Pearson type III distribution has, for fixed values of $\alpha \in \mathbb{R}$, $\beta > 0$ and $p > 0$, the form

$$f_X(x) = \begin{cases} \dfrac{1}{\beta\,\Gamma(p)} \left( \dfrac{x - \alpha}{\beta} \right)^{p-1} e^{-\left( \frac{x - \alpha}{\beta} \right)} & \text{if } x > \alpha, \\ 0 & \text{if } x \le \alpha. \end{cases}$$

Assume that a random sample of size $n$, denoted $X_1, \ldots, X_n$, is collected from this distribution.

(a) Determine the mean of any of the $X_i$ drawn from the Pearson type III distribution.

(b) It can be shown that the variance of a realisation $X_i$ from the Pearson type III distribution is given by

$$\mathrm{Var}[X_i] = p\beta^2. \qquad (1)$$

Determine $E[X_i^2]$ by using part (a) and equation (1).

(c) Determine the method of moments estimators of $\alpha$ and $\beta$ for a fixed known value of $p$, denoting them by $\hat{\alpha}_M$ and $\hat{\beta}_M$, respectively.

(d) Form an estimator of the mean of $X$ based on your solution to part (a) and your estimators of $\alpha$ and $\beta$. Show that this estimator is unbiased for the mean.
2. The half-normal distribution for a random variable $Y$ has probability density function

$$f_Y(y) = \frac{2\sqrt{a}}{\pi}\, e^{-a y^2 / \pi}, \qquad y > 0, \; a > 0.$$

A random sample of size $m$, $Y_1, \ldots, Y_m$, is collected from the half-normal distribution.

(i) Write down the likelihood of the data.

(ii) Write down the log-likelihood of the data.

(iii) Determine the maximum likelihood estimator (MLE) of $a$, denoting this estimator $\hat{a}_{MLE}$.

(iv) It can be shown that $Y_i^2 \sim \mathrm{Gamma}\left( \tfrac{1}{2}, \tfrac{a}{\pi} \right)$. Using moment generating functions, or otherwise, determine the distribution of $V = \sum_{i=1}^{m} Y_i^2$. You may use the form of the MGF of a Gamma random variable from the formula sheet, and any known result about the properties of moment generating functions, as long as you state the result used carefully.

(v) Assume that $0 < p < m/2$ and $m > 4$. In this case calculate $E[V^{-p}]$. Thus deduce the mean and variance of $\hat{a}_{MLE}$.

(vi) Comment on the form of the bias and variance of $\hat{a}_{MLE}$ for moderate and large values of $m$.
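The distributional fact quoted in Question 2(iv) can be checked by simulation. The sketch below is illustrative only and rests on one assumption that is not stated in the paper: that a draw from the density above can be generated as $|N(0, \sigma^2)|$ with $\sigma^2 = \pi/(2a)$. The function names, the seed and the value of `a` are hypothetical. It compares the sample mean of $Y^2$ with the Gamma mean $\alpha/\beta$ from the formula sheet, here $\tfrac{1}{2} / (a/\pi)$.

```python
import math
import random


def sample_half_normal(a, m, rng):
    """Draw m values of Y = |N(0, sigma^2)|, assuming sigma^2 = pi / (2a)."""
    sigma = math.sqrt(math.pi / (2.0 * a))
    return [abs(rng.gauss(0.0, sigma)) for _ in range(m)]


def mean_of_squares(ys):
    """Sample mean of Y_i^2."""
    return sum(y * y for y in ys) / len(ys)


rng = random.Random(2006)   # fixed seed so the run is reproducible
a = 1.5                     # illustrative parameter value (assumption)
ys = sample_half_normal(a, 200_000, rng)

empirical = mean_of_squares(ys)
theoretical = 0.5 / (a / math.pi)   # Gamma(alpha, beta) mean is alpha / beta

print(f"sample mean of Y^2: {empirical:.4f}")
print(f"Gamma(1/2, a/pi) mean: {theoretical:.4f}")
```

The two printed values should agree to within Monte Carlo error; this checks only the first moment of $Y_i^2$, not the full Gamma distribution.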
3. The absolute returns on $m$ different assets of approximately equivalent magnitude are modelled as exponentially distributed random variables, from the same distribution with population mean $1/\lambda$. The assets are assumed to be independent, and the $i$th asset is denoted by $A_i$, for $i = 1, \ldots, m$.

(i) (a) Without proof, write down the distribution of $C = \sum_{i=1}^{m} A_i$.

(b) Determine the distribution of $D = \gamma C$, where $\gamma$ is a positive constant.

(c) Write down the definition of a pivotal quantity for a parameter $\theta$ based on a sample $X_1, \ldots, X_m$.

(d) By choosing a suitable value of $\gamma$ (or otherwise) write down a pivotal quantity for $\lambda$, and say why this is a pivotal quantity.

(e) A random sample, $B_1, \ldots, B_n$, is collected of returns on a different day, again modelled as exponentially distributed, but with population mean $1/\tilde{\lambda}$. Write down a pivotal quantity for $\tilde{\lambda}$.

(ii) The two samples, $A_1, \ldots, A_m$ and $B_1, \ldots, B_n$, are assumed to be independent. We wish to decide between two different hypotheses:

$$H_0 : \lambda = \tilde{\lambda}, \qquad H_1 : \lambda = 2\tilde{\lambda}.$$

Describe how to test this at level $\alpha$, starting from a pivotal quantity for $\lambda/\tilde{\lambda}$, based on the ratio of the two previous pivotal quantities. You may assume that standard distributional results are valid, as long as you state them carefully.
4. The survival times of rats in a sample of $n$ rats are measured. The survival time of the $i$th rat is denoted by $T_i$. The $i$th rat has been exposed to a substance that is a potential toxin at dosage $x_i$, and $T_i$ has an expectation that is modelled by:

$$E(T_i) = \beta_0 + \beta_1 x_i^{-1}, \qquad \mathrm{var}(T_i) = \sigma_i^2, \qquad i = 1, \ldots, n. \qquad (2)$$

(i) Write down the model in vector and matrix form, defining the vector of dependent observations by $T$, the design matrix by $X$ and the vector of parameters by $\beta = [\beta_0 \; \beta_1]$.

(ii) Write down the formal definition of Second Order Assumptions (SOA). You may, for this part and for the rest of the question, make SOA on $\{T_i\}_{i=1}^{n}$. Form an estimate of $\beta$; give an explicit expression for the estimate in terms of $\overline{x^{-1}}$, $\overline{x^{-2}}$ and $S_{X^{-1},X^{-1}}$, defining such terms appropriately. Justify your choice of estimator, and state any theorems you use.

(iii) Calculate the variance matrix of $\hat{\beta}$, and denote this $A$.

(iv) The hypothesis that the substance is not a toxin, i.e. that an increased level of $x$ does not tend to reduce the mean survival time, is to be tested based on the random sample of survival times.

(a) Write down the null and alternative hypotheses for this test.

(b) Under normal theory assumptions we may note that $\hat{\beta} \sim N_2(\beta, A)$. Assume that $\sigma_i^2$ is known from another study of the rats. Describe how to test the hypothesis that the substance is not a toxin.
5. The temperature at a given weather station in Sussex over time is measured. The readings correspond to $Y_i$, where $i = 1, \ldots, 3n$ is the time point of the measurements, all made between the start of April and early June. The mean temperature is modelled as increasing linearly over time by

$$E(Y_i) = \alpha_1 + \alpha_2 i = \mu_i, \qquad i = 1, \ldots, 3n.$$

Three different weather-station attendants make the measurements, where attendant 1 took the measurements $i = 1, \ldots, n$, attendant 2 took the measurements $i = n+1, \ldots, 2n$, and attendant 3 took the measurements $i = 2n+1, \ldots, 3n$. It is known that the variance of attendant 2's measurements is four times that of attendants 1 and 3, as he has a bit of a problem with his alcohol consumption; attendants 1 and 3 do not drink and their variances are known to be equal, given by some fixed unknown constant $\sigma^2$. We assume that the $Y_i$ are not correlated.

(i) Write down the model in vector and matrix form, defining the vector of observations by $Y = [Y_1, \ldots, Y_{3n}]^T$, the design matrix by $X$ and the vector of parameters by $\alpha = [\alpha_1 \; \alpha_2]^T$.

(ii) Write down the covariance matrix $\Sigma$ of $Y$.

(iii) Define a suitable matrix $D$, with entries $d_i$ to be specified, so that $D = \mathrm{diag}(d_1, d_2, \ldots, d_{3n})$, $Z = DY$ and $\mathrm{cov}(Z) = c I_{3n}$.

(iv) Calculate the minimum variance, unbiased linear estimator of $\alpha$, expressing this estimator in terms of $X$, $D$ and $Y$.

(v) The mean value of the temperature is to be estimated at time point $i = t$. The estimator is formed by $\hat{\mu}_t = \hat{\alpha}_1 + \hat{\alpha}_2 t$. Calculate the mean and variance of $\hat{\mu}_t$. You should denote

$$s_1(n) = \sum_{i=1}^{n} i + \sum_{i=2n+1}^{3n} i + \frac{1}{4} \sum_{i=n+1}^{2n} i,$$

$$s_2(n) = \sum_{i=1}^{n} i^2 + \sum_{i=2n+1}^{3n} i^2 + \frac{1}{4} \sum_{i=n+1}^{2n} i^2,$$

$$s_3(n) = \frac{9n}{4}\, s_2(n) - s_1^2(n).$$
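To see where quantities of the form $s_1(n)$, $s_2(n)$ and $s_3(n)$ come from, the following sketch fits the straight-line model of Question 5 by weighted least squares in plain Python. It is a minimal illustration, not a model answer: it assumes weights $d_i^2 = 1$ for attendants 1 and 3 and $d_i^2 = 1/4$ for attendant 2 (one natural choice given the stated variances), and the names `weights`, `weighted_sums` and `wls_line` are hypothetical.

```python
def weights(n):
    """Assumed weights d_i^2: 1/4 for attendant 2 (i = n+1..2n), 1 otherwise."""
    return [0.25 if n < i <= 2 * n else 1.0 for i in range(1, 3 * n + 1)]


def weighted_sums(n):
    """Return (sum of weights, s1, s2) for time points i = 1..3n."""
    w = weights(n)
    s0 = sum(w)                                               # equals 9n/4
    s1 = sum(wi * i for i, wi in enumerate(w, start=1))       # weighted sum of i
    s2 = sum(wi * i * i for i, wi in enumerate(w, start=1))   # weighted sum of i^2
    return s0, s1, s2


def wls_line(y, n):
    """Weighted least squares fit of E(Y_i) = alpha_1 + alpha_2 * i."""
    w = weights(n)
    s0, s1, s2 = weighted_sums(n)
    r0 = sum(wi * yi for wi, yi in zip(w, y))
    r1 = sum(wi * i * yi for i, (wi, yi) in enumerate(zip(w, y), start=1))
    det = s0 * s2 - s1 * s1   # determinant of the 2x2 normal matrix
    alpha1 = (s2 * r0 - s1 * r1) / det
    alpha2 = (s0 * r1 - s1 * r0) / det
    return alpha1, alpha2


n = 4
# Noise-free illustrative data on the line 8 + 0.1 * i (made-up values).
y = [8.0 + 0.1 * i for i in range(1, 3 * n + 1)]
s0, s1, s2 = weighted_sums(n)
alpha1_hat, alpha2_hat = wls_line(y, n)
print(s0, s0 * s2 - s1 * s1)
print(alpha1_hat, alpha2_hat)
```

With these weights the sum of the weights is $9n/4$, so the determinant of the $2 \times 2$ normal matrix takes the form $\tfrac{9n}{4} s_2(n) - s_1^2(n)$. On noise-free data the fit recovers the intercept and slope used to generate it, up to floating-point rounding.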
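DISCRETE DISTRIBUTIONS

| Distribution | Range $\mathbb{X}$ | Parameters | Mass function $f_X$ | CDF $F_X$ | $E_{f_X}[X]$ | $\mathrm{Var}_{f_X}[X]$ | MGF $M_X$ |
|---|---|---|---|---|---|---|---|
| Bernoulli($\theta$) | $\{0, 1\}$ | $\theta \in (0,1)$ | $\theta^x (1-\theta)^{1-x}$ | | $\theta$ | $\theta(1-\theta)$ | $1 - \theta + \theta e^t$ |
| Binomial($n, \theta$) | $\{0, 1, \ldots, n\}$ | $n \in \mathbb{Z}^+$, $\theta \in (0,1)$ | $\binom{n}{x} \theta^x (1-\theta)^{n-x}$ | | $n\theta$ | $n\theta(1-\theta)$ | $\left(1 - \theta + \theta e^t\right)^n$ |
| Poisson($\lambda$) | $\{0, 1, 2, \ldots\}$ | $\lambda \in \mathbb{R}^+$ | $\dfrac{e^{-\lambda} \lambda^x}{x!}$ | | $\lambda$ | $\lambda$ | $\exp\left\{\lambda\left(e^t - 1\right)\right\}$ |
| Geometric($\theta$) | $\{1, 2, \ldots\}$ | $\theta \in (0,1)$ | $(1-\theta)^{x-1}\theta$ | $1 - (1-\theta)^x$ | $\dfrac{1}{\theta}$ | $\dfrac{1-\theta}{\theta^2}$ | $\dfrac{\theta e^t}{1 - e^t(1-\theta)}$ |
| NegBinomial($n, \theta$) | $\{n, n+1, \ldots\}$ | $n \in \mathbb{Z}^+$, $\theta \in (0,1)$ | $\binom{x-1}{n-1} \theta^n (1-\theta)^{x-n}$ | | $\dfrac{n}{\theta}$ | $\dfrac{n(1-\theta)}{\theta^2}$ | $\left(\dfrac{\theta e^t}{1 - e^t(1-\theta)}\right)^n$ |
| or | $\{0, 1, 2, \ldots\}$ | $n \in \mathbb{Z}^+$, $\theta \in (0,1)$ | $\binom{n+x-1}{x} \theta^n (1-\theta)^x$ | | $\dfrac{n(1-\theta)}{\theta}$ | $\dfrac{n(1-\theta)}{\theta^2}$ | $\left(\dfrac{\theta}{1 - e^t(1-\theta)}\right)^n$ |

For CONTINUOUS distributions (see over), define the GAMMA FUNCTION

$$\Gamma(\alpha) = \int_0^\infty x^{\alpha-1} e^{-x}\, dx$$

and the LOCATION/SCALE transformation $Y = \mu + \sigma X$ gives

$$f_Y(y) = f_X\left(\frac{y-\mu}{\sigma}\right) \frac{1}{\sigma}, \qquad F_Y(y) = F_X\left(\frac{y-\mu}{\sigma}\right), \qquad M_Y(t) = e^{\mu t} M_X(\sigma t),$$

$$E_{f_Y}[Y] = \mu + \sigma E_{f_X}[X], \qquad \mathrm{Var}_{f_Y}[Y] = \sigma^2\, \mathrm{Var}_{f_X}[X].$$

M2S2 Statistical Modelling (2006) Page 7 of 8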
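CONTINUOUS DISTRIBUTIONS

| Distribution | Range $\mathbb{X}$ | Parameters | PDF $f_X$ | CDF $F_X$ | $E_{f_X}[X]$ | $\mathrm{Var}_{f_X}[X]$ | MGF $M_X$ |
|---|---|---|---|---|---|---|---|
| Uniform($\alpha, \beta$) (standard model $\alpha = 0$, $\beta = 1$) | $(\alpha, \beta)$ | $\alpha < \beta \in \mathbb{R}$ | $\dfrac{1}{\beta - \alpha}$ | $\dfrac{x - \alpha}{\beta - \alpha}$ | $\dfrac{\alpha + \beta}{2}$ | $\dfrac{(\beta - \alpha)^2}{12}$ | $\dfrac{e^{\beta t} - e^{\alpha t}}{t(\beta - \alpha)}$ |
| Exponential($\lambda$) (standard model $\lambda = 1$) | $\mathbb{R}^+$ | $\lambda \in \mathbb{R}^+$ | $\lambda e^{-\lambda x}$ | $1 - e^{-\lambda x}$ | $\dfrac{1}{\lambda}$ | $\dfrac{1}{\lambda^2}$ | $\dfrac{\lambda}{\lambda - t}$ |
| Gamma($\alpha, \beta$) (standard model $\beta = 1$) | $\mathbb{R}^+$ | $\alpha, \beta \in \mathbb{R}^+$ | $\dfrac{\beta^\alpha}{\Gamma(\alpha)} x^{\alpha-1} e^{-\beta x}$ | | $\dfrac{\alpha}{\beta}$ | $\dfrac{\alpha}{\beta^2}$ | $\left(\dfrac{\beta}{\beta - t}\right)^\alpha$ |
| Weibull($\alpha, \beta$) (standard model $\beta = 1$) | $\mathbb{R}^+$ | $\alpha, \beta \in \mathbb{R}^+$ | $\alpha\beta x^{\alpha-1} e^{-\beta x^\alpha}$ | $1 - e^{-\beta x^\alpha}$ | $\dfrac{\Gamma(1 + 1/\alpha)}{\beta^{1/\alpha}}$ | $\dfrac{\Gamma(1 + 2/\alpha) - \Gamma(1 + 1/\alpha)^2}{\beta^{2/\alpha}}$ | |
| Normal($\mu, \sigma^2$) (standard model $\mu = 0$, $\sigma = 1$) | $\mathbb{R}$ | $\mu \in \mathbb{R}$, $\sigma \in \mathbb{R}^+$ | $\dfrac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{-\dfrac{(x-\mu)^2}{2\sigma^2}\right\}$ | | $\mu$ | $\sigma^2$ | $e^{\{\mu t + \sigma^2 t^2/2\}}$ |
| Student($\nu$) | $\mathbb{R}$ | $\nu \in \mathbb{R}^+$ | $\dfrac{(\pi\nu)^{-1/2}\, \Gamma\left(\frac{\nu+1}{2}\right)}{\Gamma\left(\frac{\nu}{2}\right) \left\{1 + \frac{x^2}{\nu}\right\}^{(\nu+1)/2}}$ | | $0$ (if $\nu > 1$) | $\dfrac{\nu}{\nu - 2}$ (if $\nu > 2$) | |
| Pareto($\theta, \alpha$) | $\mathbb{R}^+$ | $\theta, \alpha \in \mathbb{R}^+$ | $\dfrac{\alpha\theta^\alpha}{(\theta + x)^{\alpha+1}}$ | $1 - \left(\dfrac{\theta}{\theta + x}\right)^\alpha$ | $\dfrac{\theta}{\alpha - 1}$ (if $\alpha > 1$) | $\dfrac{\alpha\theta^2}{(\alpha - 1)^2(\alpha - 2)}$ (if $\alpha > 2$) | |
| Beta($\alpha, \beta$) | $(0, 1)$ | $\alpha, \beta \in \mathbb{R}^+$ | $\dfrac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\Gamma(\beta)} x^{\alpha-1} (1-x)^{\beta-1}$ | | $\dfrac{\alpha}{\alpha + \beta}$ | $\dfrac{\alpha\beta}{(\alpha + \beta)^2(\alpha + \beta + 1)}$ | |

M2S2 Statistical Modelling (2006) Page 8 of 8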
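An entry in the MGF column can be sanity-checked numerically. The sketch below is an illustration only: it approximates $E[e^{tX}]$ for the Exponential($\lambda$) row with a simple midpoint rule and compares the result with $\lambda/(\lambda - t)$. The helper name `mgf_numeric`, the truncation point `upper` and the values of `lam` and `t` are assumptions chosen for the example; the check is only meaningful for $t < \lambda$.

```python
import math


def mgf_numeric(pdf, t, upper, steps=200_000):
    """Midpoint-rule approximation of the integral of e^{t x} * pdf(x) over (0, upper)."""
    h = upper / steps
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * h          # midpoint of the k-th subinterval
        total += math.exp(t * x) * pdf(x)
    return h * total


lam, t = 2.0, 0.5                  # illustrative values with t < lam


def exp_pdf(x):
    """Exponential(lam) density from the formula sheet."""
    return lam * math.exp(-lam * x)


approx = mgf_numeric(exp_pdf, t, upper=40.0)
exact = lam / (lam - t)            # formula-sheet entry for the Exponential MGF

print(approx, exact)
```

The truncation at `upper=40.0` is harmless here because the integrand decays like $e^{-(\lambda - t)x}$; for other rows of the table a different range or quadrature rule may be needed.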