THE ROYAL STATISTICAL SOCIETY 2006 EXAMINATIONS SOLUTIONS HIGHER CERTIFICATE


THE ROYAL STATISTICAL SOCIETY 2006 EXAMINATIONS SOLUTIONS HIGHER CERTIFICATE PAPER I STATISTICAL THEORY

The Society provides these solutions to assist candidates preparing for the examinations in future years and for the information of any other persons using the examinations. The solutions should NOT be seen as "model answers". Rather, they have been written out in considerable detail and are intended as learning aids. Users of the solutions should always be aware that in many cases there are valid alternative methods. Also, in the many cases where discussion is called for, there may be other valid points that could be made. While every care has been taken with the preparation of these solutions, the Society will not be responsible for any errors or omissions. The Society will not enter into any correspondence in respect of these solutions.

Note. In accordance with the convention used in the Society's examination papers, the notation log denotes logarithm to base e. Logarithms to any other base are explicitly identified, e.g. log10.

RSS 2006

Higher Certificate, Paper I, 2006. Question 1

(i) The first place can be occupied by 9 different digits, 1 to 9. Each of the other three places can be occupied by 10 digits, 0 to 9. Hence there are $9 \times 10^3 = 9000$ possible PINs.

(ii) All of the combinations in (i) are allowed except 1111, 2222, ..., 9999, so there are $9000 - 9 = 8991$ possibilities.

(iii) Only the 9 digits 1 to 9 can be used. The first place can be filled in 9 ways, the second in 8, the third in 7 and the last in 6. So there are $9 \times 8 \times 7 \times 6 = 3024$ possibilities.

(iv) With all digits possible in any position, there would be $10^4$ PINs. There are 7 increasing sequences (0123, 1234, ..., 6789) and 7 decreasing sequences (9876, 8765, ..., 3210), which are not allowed. The number of possible PINs is therefore $10^4 - 14 = 9986$.

(v) All of the $10^4$ combinations are allowed except:
(a) the 10 where all 4 digits are the same: 0000, 1111, ..., 9999;
(b) those where one digit occurs three times and another just once. There are $10 \times 9 = 90$ ways of choosing the two digits. But note that, for example, 3331, 3313, 3133 and 1333 are four different PINs; whichever two digits occur, the odd one out can be in any of the 4 places in the PIN. Therefore there are $4 \times 90 = 360$ PINs of this sort.
The number of possible PINs is therefore $10^4 - 10 - 360 = 9630$.
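The five counts in Question 1 can be cross-checked by brute force over all 10 000 four-digit strings. The sketch below is only an illustration: the rule attached to each part is an assumption read off from the solution text (the question paper itself is not reproduced here), and the helper names are hypothetical.

```python
from itertools import product

DIGITS = "0123456789"


def count_pins(allowed):
    """Count 4-character digit strings (leading zeros kept) accepted by `allowed`."""
    return sum(1 for p in product(DIGITS, repeat=4) if allowed("".join(p)))


def all_same(pin):
    # e.g. 0000, 1111, ..., 9999
    return len(set(pin)) == 1


def consecutive_run(pin):
    # True for 0123, 1234, ..., 6789 and for 9876, 8765, ..., 3210
    steps = [int(b) - int(a) for a, b in zip(pin, pin[1:])]
    return steps == [1, 1, 1] or steps == [-1, -1, -1]


def three_and_one(pin):
    # One digit occurs three times and a different digit once
    return sorted(pin.count(d) for d in set(pin)) == [1, 3]


# Assumed rules for parts (i) to (v), inferred from the solution
counts = {
    "i": count_pins(lambda p: p[0] != "0"),
    "ii": count_pins(lambda p: p[0] != "0" and not all_same(p)),
    "iii": count_pins(lambda p: "0" not in p and len(set(p)) == 4),
    "iv": count_pins(lambda p: not consecutive_run(p)),
    "v": count_pins(lambda p: not all_same(p) and not three_and_one(p)),
}
print(counts)
```

Enumeration is practical here only because the sample space is small; the counting arguments in the solution are what generalise to longer PINs.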

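Higher Certificate, Paper I, 2006. Question 2

(i)
A: (a) $P(0 \text{ entries}) = \left(\frac{1}{2}\right)^2 = \frac{1}{4} = 0.25$.
(b) $P(1 \text{ entry}) = 2 \times \frac{1}{2} \times \frac{1}{2} = \frac{1}{2} = 0.5$.

B: (a) $P(0 \text{ entries}) = \left(\frac{3}{4}\right)^3 = \frac{27}{64} = 0.4219$.
(b) $P(1 \text{ entry}) = 3 \times \frac{1}{4} \times \left(\frac{3}{4}\right)^2 = \frac{27}{64} = 0.4219$.

C: (a) $P(0 \text{ entries}) = \left(\frac{4}{5}\right)^5 = \frac{1024}{3125} = 0.3277$.
(b) $P(1 \text{ entry}) = 5 \times \frac{1}{5} \times \left(\frac{4}{5}\right)^4 = \frac{256}{625} = 0.4096$.

(ii) P(1 entry in total) = P(1 from A, 0 from B and C) + P(1 from B, 0 from A and C) + P(1 from C, 0 from A and B)
\[ = \frac{1}{2} \cdot \frac{27}{64} \cdot \frac{1024}{3125} + \frac{27}{64} \cdot \frac{1}{4} \cdot \frac{1024}{3125} + \frac{256}{625} \cdot \frac{1}{4} \cdot \frac{27}{64} = \frac{459}{3125}. \]
[If worked in decimals, this is 0.1469.]

P(1 from A | 1 in total) = P(1 from A and 1 in total) / P(1 in total) = P(1 from A, 0 from B and C) / P(1 in total)
\[ = \frac{\frac{1}{2} \cdot \frac{27}{64} \cdot \frac{1024}{3125}}{\frac{459}{3125}} = \frac{8}{17}. \]

(iii) Denote the numbers of entries from A, B, C as a triple $(a, b, c)$. Then we need the sum of $P(a, b, c)$ over the six triples that give the required outcome. Since entries from each group are independent, each term factorises as $P(a, b, c) = P(a \text{ from A}) \cdot P(b \text{ from B}) \cdot P(c \text{ from C})$.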
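The Question 2 arithmetic can be reproduced exactly with rational numbers. This is a minimal sketch under an assumption inferred from the working rather than stated in it: group A has 2 members each entering with probability 1/2, group B has 3 members with probability 1/4, and group C has 5 members with probability 1/5, all independently. The function and variable names are illustrative.

```python
from fractions import Fraction
from math import comb


def binom_pmf(k, n, p):
    """Binomial probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Assumed group sizes and individual entry probabilities (inferred from the working)
groups = {
    "A": (2, Fraction(1, 2)),
    "B": (3, Fraction(1, 4)),
    "C": (5, Fraction(1, 5)),
}

p0 = {g: binom_pmf(0, n, p) for g, (n, p) in groups.items()}
p1 = {g: binom_pmf(1, n, p) for g, (n, p) in groups.items()}


def exactly_one_from(g):
    """P(1 entry from group g and 0 entries from each other group)."""
    result = p1[g]
    for h in groups:
        if h != g:
            result *= p0[h]
    return result


p_one_total = sum(exactly_one_from(g) for g in groups)
p_A_given_one = exactly_one_from("A") / p_one_total

print(p_one_total, float(p_one_total))  # total probability of exactly one entry
print(p_A_given_one)                    # conditional probability it came from A
```

Using `Fraction` keeps every intermediate value exact, so the result can be compared directly with the fractions in the solution rather than with rounded decimals.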

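Higher Certificate, Paper I, 2006. Question 3

(i) We have $\int_0^1 k x^2 (1 - x)^2 \, dx = 1$, so $k \int_0^1 (x^2 - 2x^3 + x^4) \, dx = 1$. This gives
\[ 1 = k \left[ \frac{x^3}{3} - \frac{2x^4}{4} + \frac{x^5}{5} \right]_0^1 = k \left( \frac{1}{3} - \frac{1}{2} + \frac{1}{5} \right) = \frac{k}{30}, \]
so $k = 30$.

$f(x) = 0$ at $x = 0$ and at $x = 1$. $f(x)$ is symmetrical about $x = \frac{1}{2}$. The sketch is as follows.

[Sketch of f(x) against x for 0 ≤ x ≤ 1.]

(ii) $E(X) = \frac{1}{2}$ by symmetry [or by direct integration: $\int_0^1 x f(x) \, dx$].
\[ E(X^2) = 30 \int_0^1 x^4 (1 - x)^2 \, dx = 30 \int_0^1 (x^4 - 2x^5 + x^6) \, dx = 30 \left[ \frac{x^5}{5} - \frac{2x^6}{6} + \frac{x^7}{7} \right]_0^1 = \frac{30}{5} - \frac{30}{3} + \frac{30}{7} = \frac{2}{7}. \]
\[ \mathrm{Var}(X) = E(X^2) - \{E(X)\}^2 = \frac{2}{7} - \frac{1}{4} = \frac{1}{28}. \]
\[ P\left(X \le \tfrac{1}{3}\right) = 30 \int_0^{1/3} (x^2 - 2x^3 + x^4) \, dx = 30 \left[ \frac{x^3}{3} - \frac{x^4}{2} + \frac{x^5}{5} \right]_0^{1/3} = 30 \left( \frac{1}{81} - \frac{1}{162} + \frac{1}{1215} \right) = \frac{10}{27} - \frac{5}{27} + \frac{2}{81} = \frac{17}{81} \quad (= 0.2099). \]

(iii) The required probability is $\left(1 - \frac{17}{81}\right)^5 = \left(\frac{64}{81}\right)^5 = 0.3079$.

(iv) The variance of $\bar{X}$ for a sample of size 5 is
\[ \frac{\mathrm{Var}(X)}{5} = \frac{1/28}{5} = \frac{1}{140} = 0.00714. \]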
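The integrals in Question 3 are all polynomial, so they can be checked with exact rational arithmetic. The sketch assumes the density is $f(x) = k x^2 (1 - x)^2$ on $[0, 1]$, which is the form consistent with the working; `integrate_poly` and the other names are hypothetical helpers, and the last probability is interpreted as "none of 5 independent observations is at most 1/3".

```python
from fractions import Fraction


def integrate_poly(coeffs, a, b):
    """Integrate sum(c * x**power) over [a, b]; `coeffs` maps power -> coefficient."""
    def antiderivative(x):
        return sum(Fraction(c, p + 1) * x ** (p + 1) for p, c in coeffs.items())
    return antiderivative(Fraction(b)) - antiderivative(Fraction(a))


# x^2 (1 - x)^2 = x^2 - 2x^3 + x^4  (assumed shape of the density)
shape = {2: 1, 3: -2, 4: 1}

k = 1 / integrate_poly(shape, 0, 1)  # normalising constant


def moment(r):
    """E(X^r) = k * integral of x^r * shape(x) over [0, 1]."""
    shifted = {p + r: c for p, c in shape.items()}
    return k * integrate_poly(shifted, 0, 1)


mean = moment(1)
variance = moment(2) - mean**2
p_third = k * integrate_poly(shape, 0, Fraction(1, 3))  # P(X <= 1/3)
p_none_below = (1 - p_third) ** 5                        # assumed event for part (iii)
var_mean_of_5 = variance / 5

print(k, mean, variance, p_third, float(p_none_below), var_mean_of_5)
```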

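Higher Certificate, Paper I, 2006. Question 4

Let X represent cycling time without delays: $X \sim N(15, 1)$.

(i) $P(X < 17) = \Phi\left(\frac{17 - 15}{1}\right) = \Phi(2) = 0.9772$.
[$\Phi$ denotes the cdf of the standard Normal distribution as usual.]

(ii) Adding in the delay times, also Normally distributed [$N(0.7, 0.09)$], and letting T denote the total time:
(a) $T \sim N(15.7, 1.09)$, so $P(T < 17) = \Phi\left(\frac{17 - 15.7}{\sqrt{1.09}}\right) = \Phi(1.245) = 0.8934$;
(b) $T \sim N(16.4, 1.18)$, so $P(T < 17) = \Phi\left(\frac{17 - 16.4}{\sqrt{1.18}}\right) = \Phi(0.552) = 0.7096$;
(c) $T \sim N(17.1, 1.27)$, so $P(T < 17) = \Phi\left(\frac{17 - 17.1}{\sqrt{1.27}}\right) = \Phi(-0.0887) = 0.4646$.

(iii) The number of delays is distributed as $B(3, \frac{1}{2})$. Hence the situations in (i), (ii)(a), (ii)(b) and (ii)(c) arise with probabilities 1/8, 3/8, 3/8 and 1/8 respectively, so the (unconditional) mean of the total journey time is
\[ E(T) = \frac{1}{8} \times 15 + \frac{3}{8} \times 15.7 + \frac{3}{8} \times 16.4 + \frac{1}{8} \times 17.1 = \frac{128.4}{8} = 16.05 \text{ minutes}. \]

(iv) Mean time $\bar{T} \sim N\left(16.05, \frac{1.5025}{10}\right)$.
\[ P(\bar{T} < 17) = \Phi\left(\frac{17 - 16.05}{\sqrt{1.5025/10}}\right) = \Phi(2.45) = 0.9929. \]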
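The Normal probabilities and the mixture mean and variance in Question 4 can be recomputed with the standard library alone. The sketch assumes an undelayed time of N(15, 1), up to three independent delays each N(0.7, 0.09) and each occurring with probability 1/2, a 17-minute limit, and a mean taken over 10 journeys; these figures are read from the working and the names are illustrative. Small differences in the fourth decimal place arise because the solution rounds z before looking up $\Phi$.

```python
from math import erf, sqrt


def phi(z):
    """Standard Normal cdf, written in terms of the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


BASE_MEAN, BASE_VAR = 15.0, 1.0      # cycling time without delays
DELAY_MEAN, DELAY_VAR = 0.7, 0.09    # each delay
LIMIT = 17.0                         # minutes
N_JOURNEYS = 10                      # assumed sample size for part (iv)
weights = [1 / 8, 3 / 8, 3 / 8, 1 / 8]  # B(3, 1/2) probabilities of 0..3 delays


def cond_mean(k):
    return BASE_MEAN + k * DELAY_MEAN


def cond_var(k):
    return BASE_VAR + k * DELAY_VAR


# P(T < 17 | k delays) for k = 0, 1, 2, 3
probs = [phi((LIMIT - cond_mean(k)) / sqrt(cond_var(k))) for k in range(4)]

# Unconditional mean and variance of T via E(T^2) - E(T)^2 over the mixture
mean_T = sum(w * cond_mean(k) for k, w in enumerate(weights))
second_moment = sum(w * (cond_var(k) + cond_mean(k) ** 2) for k, w in enumerate(weights))
var_T = second_moment - mean_T**2

# Normal approximation for the mean of N_JOURNEYS journeys
p_mean = phi((LIMIT - mean_T) / sqrt(var_T / N_JOURNEYS))

print([round(p, 4) for p in probs], round(mean_T, 2), round(var_T, 4), round(p_mean, 4))
```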

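Higher Certificate, Paper I, 2006. Question 5

(i)
\[ E(X) = \sum_{x=0}^{\infty} x \frac{e^{-\lambda} \lambda^x}{x!} = \lambda e^{-\lambda} \sum_{x=1}^{\infty} \frac{\lambda^{x-1}}{(x-1)!} = \lambda e^{-\lambda} e^{\lambda} = \lambda. \]
\[ E(X^2) = E[X(X-1) + X] = E[X(X-1)] + E[X]. \]
\[ E[X(X-1)] = \sum_{x=0}^{\infty} x(x-1) \frac{e^{-\lambda} \lambda^x}{x!} = \lambda^2 e^{-\lambda} \sum_{x=2}^{\infty} \frac{\lambda^{x-2}}{(x-2)!} = \lambda^2 e^{-\lambda} e^{\lambda} = \lambda^2. \]
Hence $E(X^2) = \lambda^2 + \lambda$, and $\mathrm{Var}(X) = E(X^2) - \{E(X)\}^2 = \lambda$.

(ii)
\[ L = \prod_{i=1}^{n} \frac{e^{-\lambda} \lambda^{x_i}}{x_i!}, \]
and hence $\log L = -n\lambda + \sum x_i \log \lambda + \text{constant}$.
\[ \frac{d \log L}{d\lambda} = -n + \frac{\sum x_i}{\lambda}, \]
which on setting equal to zero gives that the maximum likelihood estimate is $\hat{\lambda} = \frac{\sum x_i}{n} = \bar{x}$. [Consideration of $\frac{d^2 \log L}{d\lambda^2}$ confirms that this is a maximum: $\frac{d^2 \log L}{d\lambda^2} = -\frac{\sum x_i}{\lambda^2} < 0$.]

(iii)
\[ \mathrm{Var}(\hat{\lambda}) = \mathrm{Var}(\bar{X}) = \frac{\mathrm{Var}(X)}{n} = \frac{\lambda}{n}. \]
Thus the maximum likelihood estimator of $\mathrm{Var}(\hat{\lambda})$ is $\hat{\lambda}/n$.

By the central limit theorem, $\hat{\lambda}$ $(= \bar{X})$ is approximately Normally distributed with mean $\lambda$ and variance $\lambda/n$. We estimate the variance by $\hat{\lambda}/n$, so that we have $\hat{\lambda} \sim N\left(\lambda, \frac{\hat{\lambda}}{n}\right)$, approximately. Hence an approximate 95% confidence interval is given by
\[ 0.95 \approx P\left(-1.96 < \frac{\hat{\lambda} - \lambda}{\sqrt{\hat{\lambda}/n}} < 1.96\right), \]
leading to the interval
\[ \left(\hat{\lambda} - 1.96\sqrt{\frac{\hat{\lambda}}{n}}, \; \hat{\lambda} + 1.96\sqrt{\frac{\hat{\lambda}}{n}}\right). \]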

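(iv) For the given sample, we have $n = 12$ and $\sum x_i = 48$, leading to $\hat{\lambda} = \frac{48}{12} = 4$. The approximate confidence interval is therefore
\[ 4 - 1.96\sqrt{\frac{4}{12}} \quad \text{to} \quad 4 + 1.96\sqrt{\frac{4}{12}}, \]
i.e. 2.87 to 5.13.

The sample also gives $\sum x_i^2 = 238$; so the sample variance is
\[ s^2 = \frac{1}{11}\left(238 - \frac{48^2}{12}\right) = \frac{46}{11} = 4.18. \]
This is close to the sample mean (4), supporting a Poisson hypothesis for the underlying model.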
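The interval and the sample variance in Question 5 follow from three summary values. The sketch below takes them as n = 12, Σx = 48 and Σx² = 238, which are the figures consistent with the quoted results (an assumption, since the raw sample is not shown); the variable names are illustrative.

```python
from math import sqrt

# Summary statistics assumed from the worked solution
n, sum_x, sum_x2 = 12, 48, 238

lam_hat = sum_x / n                        # maximum likelihood estimate (sample mean)
half_width = 1.96 * sqrt(lam_hat / n)      # 1.96 * estimated standard error
ci = (lam_hat - half_width, lam_hat + half_width)

# Unbiased sample variance from the sums
sample_var = (sum_x2 - sum_x**2 / n) / (n - 1)

print(lam_hat, tuple(round(v, 2) for v in ci), round(sample_var, 2))
```

For a Poisson model the mean and variance coincide, so comparing `lam_hat` with `sample_var` gives the informal check described in the solution.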

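Higher Certificate, Paper I, 2006. Question 6

$f(x) = \frac{\lambda}{2} e^{-\lambda |x|}$, $-\infty < x < \infty$.

[Sketch of f(x) against x: symmetric about x = 0, with maximum value λ/2 at x = 0.]

By symmetry, $E(X) = 0$. Hence
\[ \mathrm{Var}(X) = E(X^2) = \int_{-\infty}^{\infty} x^2 \frac{\lambda}{2} e^{-\lambda |x|} \, dx = \frac{\lambda}{2} \left\{ \int_{-\infty}^{0} x^2 e^{\lambda x} \, dx + \int_{0}^{\infty} x^2 e^{-\lambda x} \, dx \right\}. \]
Substituting $u = -x$ in the first integral gives $\int_0^{\infty} u^2 e^{-\lambda u} \, du$, which is the same as the second. Hence we get, integrating by parts,
\[
\begin{aligned}
E(X^2) &= \lambda \int_0^{\infty} x^2 e^{-\lambda x} \, dx \\
&= \lambda \left\{ \left[ -\frac{x^2 e^{-\lambda x}}{\lambda} \right]_0^{\infty} + \frac{2}{\lambda} \int_0^{\infty} x e^{-\lambda x} \, dx \right\} \\
&= 0 + 2 \left\{ \left[ -\frac{x e^{-\lambda x}}{\lambda} \right]_0^{\infty} + \frac{1}{\lambda} \int_0^{\infty} e^{-\lambda x} \, dx \right\} \\
&= 0 + \frac{2}{\lambda} \left[ -\frac{e^{-\lambda x}}{\lambda} \right]_0^{\infty} = \frac{2}{\lambda^2}.
\end{aligned}
\]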

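If Q, q are the upper and lower quartiles, we have $\int_0^Q \frac{\lambda}{2} e^{-\lambda x} \, dx = \frac{1}{4}$, and q will be the same distance below 0 by symmetry.
\[ \frac{1}{4} = \left[ -\frac{1}{2} e^{-\lambda x} \right]_0^Q = \frac{1}{2}\left(1 - e^{-\lambda Q}\right), \]
giving $\frac{1}{2} = e^{-\lambda Q}$. Therefore $\lambda Q = \log 2$. Hence the semi-interquartile range is $(\log 2)/\lambda$.

\[ L = \prod_{i=1}^{n} \frac{\lambda}{2} e^{-\lambda |x_i|} = \left(\frac{\lambda}{2}\right)^n e^{-\lambda \sum |x_i|}, \]
and hence $\log L = \text{constant} + n \log \lambda - \lambda \sum |x_i|$.
\[ \frac{d \log L}{d\lambda} = \frac{n}{\lambda} - \sum |x_i|, \]
which on setting equal to zero gives that the maximum likelihood estimate is $\hat{\lambda} = \frac{n}{\sum |x_i|}$. [Consideration of $\frac{d^2 \log L}{d\lambda^2}$ confirms that this is a maximum: $\frac{d^2 \log L}{d\lambda^2} = -\frac{n}{\lambda^2} < 0$.]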
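The three results for the double exponential density $f(x) = \frac{\lambda}{2} e^{-\lambda|x|}$ (variance $2/\lambda^2$, upper quartile $(\log 2)/\lambda$, and $\hat{\lambda} = n / \sum |x_i|$) can be checked numerically. This is a rough sketch: the value λ = 2, the integration cut-off and the three-point data set are all arbitrary, hypothetical choices made only for illustration.

```python
from math import exp, log


def laplace_cdf(x, lam):
    """cdf of the density (lam / 2) * exp(-lam * |x|)."""
    if x < 0:
        return 0.5 * exp(lam * x)
    return 1.0 - 0.5 * exp(-lam * x)


def second_moment_numeric(lam, upper=60.0, steps=200_000):
    """Midpoint-rule value of lam * integral_0^upper x^2 exp(-lam x) dx, i.e. E(X^2)."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += x * x * lam * exp(-lam * x)
    return total * h


def laplace_mle(xs):
    """Maximum likelihood estimate n / sum(|x_i|)."""
    return len(xs) / sum(abs(x) for x in xs)


lam = 2.0                                   # arbitrary illustrative rate
variance = second_moment_numeric(lam)       # should be close to 2 / lam**2
upper_quartile = log(2) / lam
cdf_at_quartile = laplace_cdf(upper_quartile, lam)
lam_hat = laplace_mle([1.0, -2.0, 3.0])     # hypothetical data

print(variance, upper_quartile, cdf_at_quartile, lam_hat)
```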

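Higher Certificate, Paper I, 2006. Question 7

(i) The sum of all table entries is 30c. These probabilities must add up to 1, so c = 1/30.

(ii) The marginal distributions are given by the row and column totals. Hence:
P(X = 1) = 15c = 1/2; P(X = 2) = 10c = 1/3; P(X = 3) = 5c = 1/6.
Similarly:
P(Y = 1) = 12c = 2/5; P(Y = 2) = 6c = 1/5; P(Y = 3) = 6c = 1/5; P(Y = 4) = 6c = 1/5.

(iii)
\[ E(X) = 1 \times \frac{1}{2} + 2 \times \frac{1}{3} + 3 \times \frac{1}{6} = \frac{1}{2} + \frac{2}{3} + \frac{1}{2} = \frac{5}{3}. \]
\[ E(X^2) = 1 \times \frac{1}{2} + 4 \times \frac{1}{3} + 9 \times \frac{1}{6} = \frac{1}{2} + \frac{4}{3} + \frac{3}{2} = \frac{10}{3}. \]
\[ \mathrm{Var}(X) = \frac{10}{3} - \frac{25}{9} = \frac{5}{9}. \]
We also need E(Y) later:
\[ E(Y) = \frac{2}{5} + \frac{2}{5} + \frac{3}{5} + \frac{4}{5} = \frac{11}{5}. \]

Distribution of XY:

Values of xy    1    2    3    4    6    12
Probability     6c   7c   4c   6c   5c   2c      [c = 1/30, see above]

\[ E(XY) = 1 \times \frac{6}{30} + 2 \times \frac{7}{30} + 3 \times \frac{4}{30} + 4 \times \frac{6}{30} + 6 \times \frac{5}{30} + 12 \times \frac{2}{30} = \frac{110}{30} = \frac{11}{3}. \]
Also we have
\[ E(X) E(Y) = \frac{5}{3} \times \frac{11}{5} = \frac{11}{3}. \]
\[ \mathrm{Cov}(X, Y) = E(XY) - E(X) E(Y) = 0. \]

(iv) X and Y are not independent [even though Cov(X, Y) = 0 and even though some cells have P(X = x, Y = y) = P(X = x).P(Y = y)]. For example, we have P(X = 1, Y = 4) = 2/15, but P(X = 1).P(Y = 4) = 1/10.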

(v)
U = 1 if X = 1 or 3
U = 0 if X = 2
V = 1 if Y = 1 or 3
V = 0 if Y = 2 or 4

Table of joint distribution of U and V, with margins.

                Values of V
Values of U     0              1              Total
0               2c = 1/15      8c = 4/15      10c = 1/3
1               10c = 1/3      10c = 1/3      20c = 2/3
Total           12c = 2/5      18c = 3/5

Consider the cell with (U, V) = (0, 0). The cell probability is 1/15 but the product of the marginal probabilities is 2/15. So U and V are not independent.
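The moment calculations and the independence check in Question 7 can be verified from the marginal distributions, the distribution of XY and the 2 by 2 table alone, using exact fractions. In this sketch the multiples of c (with c = 1/30) and the 0/1 coding of U and V are assumptions taken from the working; the full joint table of X and Y is not needed and is not reconstructed.

```python
from fractions import Fraction

c = Fraction(1, 30)

# Marginal distributions and the distribution of the product XY (as multiples of c)
px = {1: 15 * c, 2: 10 * c, 3: 5 * c}
py = {1: 12 * c, 2: 6 * c, 3: 6 * c, 4: 6 * c}
pxy = {1: 6 * c, 2: 7 * c, 3: 4 * c, 4: 6 * c, 6: 5 * c, 12: 2 * c}


def expect(dist, g=lambda v: v):
    """Expected value of g(V) when V has probability mass function `dist`."""
    return sum(g(v) * p for v, p in dist.items())


ex, ey, exy = expect(px), expect(py), expect(pxy)
var_x = expect(px, lambda v: v * v) - ex**2
cov_xy = exy - ex * ey

# Joint distribution of (U, V); the 0/1 coding is an assumed labelling
joint_uv = {(0, 0): 2 * c, (0, 1): 8 * c, (1, 0): 10 * c, (1, 1): 10 * c}
pu = {u: sum(p for (a, _), p in joint_uv.items() if a == u) for u in (0, 1)}
pv = {v: sum(p for (_, b), p in joint_uv.items() if b == v) for v in (0, 1)}
uv_independent = all(joint_uv[u, v] == pu[u] * pv[v] for u in (0, 1) for v in (0, 1))

print(ex, var_x, ey, exy, cov_xy, uv_independent)
```

A zero covariance with `uv_independent` evaluating to `False` is the point of the question: uncorrelated variables need not be independent.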

Higher Certificate, Paper I, 2006. Question 8

(i) $Y_i = a + b x_i + e_i$, $i = 1, 2, \ldots, n$.

The $\{e_i\}$ are uncorrelated random variables with mean 0 and constant variance $\sigma^2$.

The method of least squares is equivalent to the method of maximum likelihood for estimating the regression coefficients (a and b) if the $\{e_i\}$ are Normally distributed. [If the analysis is to proceed to inference for the regression coefficients, Normality of the $\{e_i\}$ is required.]

(ii)(a) For $Y_i = \beta x_i + e_i$, we minimise $S = \sum e_i^2 = \sum (y_i - \beta x_i)^2$.

We have $\frac{dS}{d\beta} = -2 \sum x_i (y_i - \beta x_i)$, which on setting equal to zero gives $\sum x_i y_i = \beta \sum x_i^2$, so the least squares estimate is $\hat{\beta} = \frac{\sum x_i y_i}{\sum x_i^2}$. [Consideration of $\frac{d^2 S}{d\beta^2}$ confirms that this is a minimum: $\frac{d^2 S}{d\beta^2} = 2 \sum x_i^2 > 0$.]

(b) See scatter plot at foot of page. It shows an increasing trend, roughly linear; but there seems to be some increase in variability as x increases. There are not enough data points to be sure.

The usual summary statistics (not all required for the zero intercept model) are $n$, $\sum x$, $\sum y$, $\sum x^2$, $\sum y^2$ and $\sum xy$.

$\hat{\beta} = \sum xy / \sum x^2 = 0.205$. So the fitted line is $y = 0.205x$.

Hence the estimated expected number of violations for $x = 20$ is $20 \times 0.205 = 4.10$.

Logically, zero traffic flow should imply zero speed violations, so that y should be 0 when x is 0, i.e. the zero intercept model seems reasonable. The scatter plot does not contradict this.

[Scatter plot of Violations against Traffic Flow per minute.]
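The zero intercept estimate from part (ii)(a), the sum of products divided by the sum of squared x values, is short enough to sketch directly. The data below are hypothetical and chosen only to exercise the formula; they are not the traffic data from the examination, and the function names are illustrative.

```python
def slope_through_origin(xs, ys):
    """Least squares slope for the model y = beta * x (no intercept)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    if sxx == 0:
        raise ValueError("all x values are zero, so the slope is undefined")
    return sxy / sxx


def residual_sum_of_squares(beta, xs, ys):
    """S(beta) = sum of (y - beta * x)^2, the quantity being minimised."""
    return sum((y - beta * x) ** 2 for x, y in zip(xs, ys))


# Hypothetical illustrative data (NOT the examination data)
flows = [10, 20, 30, 40]
violations = [2, 5, 6, 9]

beta_hat = slope_through_origin(flows, violations)
predicted_at_20 = beta_hat * 20  # fitted value at a flow of 20 per minute

print(beta_hat, predicted_at_20)
```

Evaluating `residual_sum_of_squares` at values slightly above and below `beta_hat` gives a quick numerical confirmation that the stationary point is a minimum, matching the second-derivative argument in the solution.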