SCHOOL OF MATHEMATICS AND STATISTICS

Similar documents
SCHOOL OF MATHEMATICS AND STATISTICS. Linear and Generalised Linear Models

SCHOOL OF MATHEMATICS AND STATISTICS

Techniques and Applications of Multivariate Analysis

Multivariate Analysis

Semester , Example Exam 1

WISE MA/PhD Programs Econometrics Instructor: Brett Graham Spring Semester, Academic Year Exam Version: A

WISE MA/PhD Programs Econometrics Instructor: Brett Graham Spring Semester, Academic Year Exam Version: A

STT 843 Key to Homework 1 Spring 2018

Econometrics I KS. Module 1: Bivariate Linear Regression. Alexander Ahammer. This version: March 12, 2018

SCHOOL OF MATHEMATICS AND STATISTICS

Examples of vines and vine copulas Harry Joe July 24/31, 2015; Graphical models reading group

This paper is not to be removed from the Examination Halls

6.867 Machine Learning

CBA4 is live in practice mode this week exam mode from Saturday!

Correlation analysis. Contents

SCHOOL OF MATHEMATICS AND STATISTICS Autumn Semester

Inferences about a Mean Vector

EXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY

Chapter 12: An introduction to Time Series Analysis. Chapter 12: An introduction to Time Series Analysis

Inferences for Correlation

Hypothesis Testing for Var-Cov Components

Statistical Inference

Multivariate Statistical Analysis

THE UNIVERSITY OF CHICAGO Graduate School of Business Business 41912, Spring Quarter 2008, Mr. Ruey S. Tsay. Solutions to Final Exam

Introduction to Normal Distribution

2. A music library has 200 songs. How many 5 song playlists can be constructed in which the order of the songs matters?

Fall 2017 STAT 532 Homework Peter Hoff. 1. Let P be a probability measure on a collection of sets A.

DEPARTMENT OF COMPUTER SCIENCE Autumn Semester MACHINE LEARNING AND ADAPTIVE INTELLIGENCE

WISE International Masters

STAT 501 Assignment 2 NAME Spring Chapter 5, and Sections in Johnson & Wichern.

MATH5745 Multivariate Methods Lecture 07

STA442/2101: Assignment 5

Problem Set 1 Solution Sketches Time Series Analysis Spring 2010

Principal component analysis

Stat 135 Fall 2013 FINAL EXAM December 18, 2013

ON THE CALCULATION OF A ROBUST S-ESTIMATOR OF A COVARIANCE MATRIX

PRINCIPAL COMPONENTS ANALYSIS (PCA)

Practical Statistics

Applied Multivariate Statistical Analysis Richard Johnson Dean Wichern Sixth Edition

You can compute the maximum likelihood estimate for the correlation

Psychology 282 Lecture #4 Outline Inferences in SLR

In most cases, a plot of d (j) = {d (j) 2 } against {χ p 2 (1-q j )} is preferable since there is less piling up

a. When a data set is not normally distributed, what should you try in order to appropriately make statistical tests on that data?

If we want to analyze experimental or simulated data we might encounter the following tasks:

2.1 Linear regression with matrices

WISE MA/PhD Programs Econometrics Instructor: Brett Graham Spring Semester, Academic Year Exam Version: A

Comment about AR spectral estimation Usually an estimate is produced by computing the AR theoretical spectrum at (ˆφ, ˆσ 2 ). With our Monte Carlo

FINANCIAL ECONOMETRICS AND EMPIRICAL FINANCE -MODULE2 Midterm Exam Solutions - March 2015

CS534 Machine Learning - Spring Final Exam

Hierarchical models for multiway data

Linear Regression. In this lecture we will study a particular type of regression model: the linear regression model

White Noise Processes (Section 6.2)

INTRODUCTION TO ANALYSIS OF VARIANCE

Introduction to Factor Analysis

LECTURE 4 PRINCIPAL COMPONENTS ANALYSIS / EXPLORATORY FACTOR ANALYSIS

3. (a) (8 points) There is more than one way to correctly express the null hypothesis in matrix form. One way to state the null hypothesis is

MATH c UNIVERSITY OF LEEDS Examination for the Module MATH1725 (May-June 2009) INTRODUCTION TO STATISTICS. Time allowed: 2 hours

MAT 3379 (Winter 2016) FINAL EXAM (PRACTICE)

TA: Sheng Zhgang (Th 1:20) / 342 (W 1:20) / 343 (W 2:25) / 344 (W 12:05) Haoyang Fan (W 1:20) / 346 (Th 12:05) FINAL EXAM

Final Exam. Question 1 (20 points) 2 (25 points) 3 (30 points) 4 (25 points) 5 (10 points) 6 (40 points) Total (150 points) Bonus question (10)

MAS361. MAS361 1 Turn Over SCHOOL OF MATHEMATICS AND STATISTICS. Medical Statistics

Final Exam - Solutions

Statistics 135 Fall 2008 Final Exam

Lecture 3. Inference about multivariate normal distribution

UNIVERSITY OF TORONTO Faculty of Arts and Science

Stat 135, Fall 2006 A. Adhikari HOMEWORK 6 SOLUTIONS

18.S096 Problem Set 7 Fall 2013 Factor Models Due Date: 11/14/2013. [ ] variance: E[X] =, and Cov[X] = Σ = =

Multinomial Logistic Regression Models

Short Answer Questions: Answer on your separate blank paper. Points are given in parentheses.

Chapter 12 - Lecture 2 Inferences about regression coefficient

Simple Linear Regression

Review. DS GA 1002 Statistical and Mathematical Models. Carlos Fernandez-Granda

Problems. Suppose both models are fitted to the same data. Show that SS Res, A SS Res, B

Wald s theorem and the Asimov data set

# of 6s # of times Test the null hypthesis that the dice are fair at α =.01 significance

MAS223 Statistical Inference and Modelling Exercises

[y i α βx i ] 2 (2) Q = i=1

Statement: With my signature I confirm that the solutions are the product of my own work. Name: Signature:.

Final Examination. STA 215: Statistical Inference. Saturday, 2001 May 5, 9:00am 12:00 noon

4 Hypothesis testing. 4.1 Types of hypothesis and types of error 4 HYPOTHESIS TESTING 49

Problem #1 #2 #3 #4 #5 #6 Total Points /6 /8 /14 /10 /8 /10 /56

Review. December 4 th, Review

STP 226 EXAMPLE EXAM #3 INSTRUCTOR:

Principal Component Analysis-I Geog 210C Introduction to Spatial Data Analysis. Chris Funk. Lecture 17

Simple Linear Regression

Answer all questions from part I. Answer two question from part II.a, and one question from part II.b.

UNIVERSITY OF TORONTO. Faculty of Arts and Science APRIL 2010 EXAMINATIONS STA 303 H1S / STA 1002 HS. Duration - 3 hours. Aids Allowed: Calculator

Multiple Linear Regression

WISE International Masters

Week 5 Quantitative Analysis of Financial Markets Characterizing Cycles

Empirical Market Microstructure Analysis (EMMA)

Normal (Gaussian) distribution The normal distribution is often relevant because of the Central Limit Theorem (CLT):

EXAMINATIONS OF THE HONG KONG STATISTICAL SOCIETY

CHAPTER 10. Regression and Correlation

STAT 501 EXAM I NAME Spring 1999

OXFORD CAMBRIDGE AND RSA EXAMINATIONS A2 GCE 4733/01. MATHEMATICS Probability & Statistics 2 QUESTION PAPER

Classification 2: Linear discriminant analysis (continued); logistic regression

Nominal Data. Parametric Statistics. Nonparametric Statistics. Parametric vs Nonparametric Tests. Greg C Elvers

Bayesian Inference for Normal Mean

STA 2201/442 Assignment 2

Transcription:

Data provided: Graph Paper MAS6011 SCHOOL OF MATHEMATICS AND STATISTICS Dependent Data Spring Semester 2016 2017 3 hours Marks will be awarded for your best five answers. RESTRICTED OPEN BOOK EXAMINATION Candidates may bring to the examination lecture notes and associated lecture material (but no textbooks) plus a calculator that conforms to University regulations. There are 100 marks available on the paper. Please leave this exam paper on your desk Do not remove it from the hall Registration number from U-Card (9 digits) to be completed by student MAS6011 1 Turn Over

Blank MAS6011 2 Continued

1 (i) In a report from 1973, Weber reports on protein consumption in 25 European countries for nine food groups: Red meat, White meat, Eggs, Milk, Fish, Cereals, Starchy foods, Pulses and nuts, Fruits and vegetables. The R analysis of this data set follows on the next two pages. (a) What is the correlation between red meat consumption and white meat consumption? (b) Why is the option cor=true included included in the princomp command? What tends to happen if it is omitted? (c) The session has been edited so that only 6 components are listed. But how many would R have computed? (1 mark) (d) With the aid of an informal graphical technique, how many principal components would you retain in your study? Justify your answer. Suggest characteristics of countries with high scores on the first prin- (e) cipal component. (f) Albania has a very low consumption of fish. How might one expect to see that reflected in the PCA plots? Which country would you expect to have the highest consumption of fish? (g) Give an interpretation of the third and fourth principal components, and comment on the protein consumption of Finland with particular reference to these components. (3 marks) (ii) Find the principal components and the proportion of total variance explained by each when the covariance matrix is 1 ρ 0 S = ρ 1 ρ. 0 ρ 1 (4 marks) (iii) Explain briefly why multidimensional scaling might be regarded as a generalisation of principal components analysis. MAS6011 3 Question 1 continued on next page

1 (continued) > attach(protein) > protein[1:2,] RedMeat WhiteMeat Eggs Milk Fish Cereals Starch Nuts Fr.Veg Alb 10.1 1.4 0.5 8.9 0.2 42 0.6 5.5 1.7 Aus 8.9 14.0 4.3 19.9 2.1 28 3.6 1.3 4.3 > apply(protein,2,mean) RedMeat WhiteMeat Eggs Milk Fish Cereals Starch Nuts Fr.Veg 9.8 7.9 2.9 17.1 4.3 32.2 4.3 3.1 4.1 > var(protein) RedMeat WhiteMeat Eggs Milk Fish Cereals Starch Nuts Fr.Veg RedMeat 11.20 1.89 2.191 12.0 0.69-18.36 0.74-2.32-0.448 WhiteMeat 1.89 13.65 2.561 7.4-2.94-16.78 1.89-4.66-0.409 Eggs 2.19 2.56 1.249 4.6 0.25-8.74 0.83-1.24-0.092 Milk 11.96 7.39 4.570 50.5 3.33-46.22 2.58-8.76-5.234 Fish 0.69-2.94 0.249 3.3 11.58-19.58 2.25-0.99 1.634 Cereals -18.36-16.78-8.738-46.2-19.58 120.45-9.56 14.19 0.922 Starch 0.74 1.89 0.826 2.6 2.25-9.56 2.67-1.54 0.249 Nuts -2.32-4.66-1.242-8.8-0.99 14.19-1.54 3.94 1.343 Fr.Veg -0.45-0.41-0.092-5.2 1.63 0.92 0.25 1.34 3.254 > pr.pca<-princomp(protein,cor=true) > loadings(pr.pca) Loadings: Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 RedMeat -0.303 0.298 0.646 0.322 0.460 WhiteMeat -0.311 0.237-0.624-0.300 0.121 Eggs -0.427-0.182 0.313-0.361 Milk -0.378 0.185 0.386-0.200-0.618 Fish -0.136-0.647 0.321-0.216-0.290 0.130 Cereals 0.438 0.233 0.238 Starch -0.297-0.353-0.243-0.337 0.736-0.148 Nuts 0.420-0.143 0.330 0.151-0.447 Fr.Veg 0.110-0.536-0.408 0.462-0.234-0.119 > summary(pr.pca) Importance of components: Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 Standard deviation 2.00 1.28 1.06 0.98 0.681 0.570 Proportion of Variance 0.45 0.18 0.13 0.11 0.052 0.036 Cumulative Proportion 0.45 0.63 0.75 0.86 0.910 0.946 MAS6011 4 Question 1 continued on next page

1 (continued) Scree plot of variances PCA plot PC1 and PC2 cumulative percentage of total variance 0 20 40 60 80 100 0 1 2 3 4 5 6 7 8 9 PC2 4 3 2 1 0 1 Aus Nth Ire Swi Fin WGer UK Swe Bel Den EGer Fra Nor Cze Pol Hun USSR Ita Spa Por Alb Bul Rom Yug Gre 0 1 2 3 4 5 6 7 8 9 number of dimensions 2 1 0 1 2 3 PC1 PCA plot PC3 and PC4 PCA plot PC5 and PC6 PC4 1.5 1.0 0.5 0.0 0.5 1.0 1.5 2.0 Fra Gre UK Ita Swi Bel Ire Spa Aus WGer Bul Alb Hun Nth Pol Cze Rom Swe Yug Por USSR Den EGer Nor Fin 2 1 0 1 2 PC3 PC6 1.0 0.5 0.0 0.5 1.0 Alb Fra Cze Bul Por EGer Bel Den WGer Swi Nor UK Swe AusIta Rom USSR Nth Yug Ire Hun Pol Spa Fin Gre 1.0 0.5 0.0 0.5 1.0 1.5 PC5 MAS6011 5 Turn Over

2 Johnson and Wichern (2007) report on a study of 45 female hook-billed kites. The tail length and wing length are measured in centimetres for each bird. The mean tail length is 19.36cm, while the mean wing length is 27.98cm. The variance matrix is given by S = ( 1.207 1.223 1.223 2.085 is 1.207), with inverse S 1 = ) (with the variable tail length appearing first, so that its variance ( ) 2.044 1.199. 1.199 1.183 You may use the R output qf(0.95,2,43)=3.214 and qt(0.975,44)=2.015. (i) A more extended study of male kites has shown that the mean tail length for male birds is 19.75cm, and the mean wing length is 28.45cm. (a) Compute a 95% confidence interval for the mean tail length of female kites. Can we reject the hypothesis that the mean tail length of female kites is 19.75cm at the 5% level of significance? (b) Similarly, can we reject the hypothesis that the mean wing length for female kites is 28.45cm at the 5% level of significance? (c) Test the multivariate hypothesis that the mean of the pairs of measurement x = (19.36,27.98) for female kites is equal to µ 0 = (19.75,28.45) at the 95% level. (6 marks) (d) Discuss briefly the results of parts (a) (c). Illustrate your answer with a sketch of the multivariate confidence region. (3 marks) (ii) Suppose that x 1,...,x n are independent observations of a bivariate normal distribution. We work with the principal components of the ( generated ) data, so that we a 0 suppose that the sample variance matrix is of the form S =, with uncorrelated 0 b variables. Using these variables, we write x for the sample mean, and µ and Σ for the parameters of the distribution, so that our data comes from an N 2 (µ,σ) distribution. Recall that the log-likelihood is given by l(µ,σ) = 1 2 (n 1)tr(Σ 1 S) 1 2 ntr(σ 1 (x µ)(x µ) ) nlog(2π) 1 2 nlog Σ, as p = 2, and you may assume that the unrestricted MLEs are given by ˆµ = x and ˆΣ = n 1 n S. We wish to test the hypothesis that detσ = δ, some fixed positive number. You may assume that, under H 0 : detσ = δ, we have ˆµ = x, and that ˆΣ = δ ab ( ) a 0. 0 b Hence give the likelihood ratio test for the null hypothesis that detσ = δ, using Wilks s Theorem. (7 marks) MAS6011 6 Continued

3 Campbell and Mahon (1974) took various measurements of certain species of rock crab of genus Leptograpsus. Observations 1 50 were male, and observations 51 100 were female. Of interest in this question is whether we can predict the sex of the crab from the two dimension measurements, given in millimetres, as: FL frontal lobe RW rear width Each line of the data set consists of 3 fields: firstly, the sex of the crab, and then the remaining 2 dimension measurements in the order above. It seems that the dimensions of the two groups appear to be distributed as bivariate normal distributions with a similar variance matrix, which we take to be the same as the variance matrix in the R output which follows on the next page. (i) Compute Fisher s linear discriminant function for this data set. (5 marks) (ii) Using the output from the previous part, classify the sex of a crab whose measurements are FL = 13 and RW = 10. (iii) By working out the probability of misclassification under the assumption that the two groups are distributed as bivariate normal distributions with the variance matrix in the R output, does Fisher s linear discriminant function classify the data set better or worse than expected? You are given that pnorm(-1)=0.159. (8 marks) ( ) 1 (iv) Suppose that two bivariate normal distributions have means µ 1 = 0 ( ) ( ) 1 1 ρ and µ 2 =, and that both have variance matrix. Find the decision boundary 0 ρ 1 (i.e., the set of points for which the probability densities agree) as a function of ρ. (5 marks) MAS6011 7 Question 3 continued on next page

3 (continued) > crab[1,] sex FL RW 1 M 8.1 6.7 > print(s<-var(crab[,-1])) FL RW FL 9.12 6.18 RW 6.18 5.20 > print(sinv<-solve(s)) FL RW FL 0.566-0.673 RW -0.673 0.993 > print(crab.lda<-lda(sex~fl+rw)) Group means: FL RW F 13.3 12.1 M 14.8 11.7 Coefficients of linear discriminants: LD1 FL 1.21 RW -1.52 > crab.pred<-predict(crab.lda,data.frame(cbind(fl,rw))) > crab.pred$class [1] M M M M M M F M M F F F M M M M M M M M M M M M M [26] M M M M M M M M M M M M M M M M M M F M M M M M M [51] F F F F M F F F F F F F F F F F F F F F F F F F F [76] F F F F F F F F F F F F F F F F F F F F F F F F F Levels: F M > table(sex,crab.pred$class) sex F M F 49 1 M 5 45 MAS6011 8 Continued

4 (i) Explain briefly why the definition of the sample autocorrelation function is not appropriate for non-stationary time series. (ii) (a) For the following time series data time (t) response (y t ) 1 1 2 3 3 4 4 10 5 2 6 2 7 5 8 11 calculate 4-span moving averages for time points t = 3,4,5,6. (4 marks) (b) Consider the time series y t = α+βt, for some known α and β and for t = 1,2,3,... Show that the 4-span moving average of {y t } at time t is exactly equal to the value of y t. (4 marks) (c) Now consider the time series x t = y t +ǫ t, where y t is the time series in part (b) above and ǫ t is white noise with variance 2. Show that the de-trended time series defined by d t = x t ŷ t, where ŷ t is the 4-span moving average of y t, is a weakly stationary time series. (iii) Let {Z t } be a sequence of independent identically distributed random variables, so that Z t follows a normal distribution with zero mean and variance 1. Define the process Z t, if t is even y t = Zt 2 1, if t is odd. 2 (a) Show that {y t } is a white noise process WN(0,1). (6 marks) (b) Show {y t } is not an identically distributed sequence. HINT: If a random variable X follows the chi-square distribution with ν degrees of freedom X χ 2 ν, then E(X) = ν and Var(X) = 2ν. MAS6011 9 Turn Over

5 Suppose that observations y 1,y 2,...,y n are generated from the autoregressive (AR) model of order one: y t = αy t 1 +ǫ t, where α is the AR parameter and ǫ t is a Gaussian white noise with variance σ 2. (i) Write down the likelihood functionl(α,σ 2 ;y 1:n ) and the log-likelihood function l(α,σ 2 ;y 1:n ) of the parameters α and σ 2, based on observation y 1:n = {y 1,y 2,...,y n }. (3 marks) (ii) Using the approximation log(1 + x) x, for some x, with x < 1, show that the log-likelihood of part (i) can be approximated as l(α,σ 2 ;y 1:n ) n 2 log(2πσ2 ) y2 1 2σ 2 + ( y 2 1 σ 2 2σ 2 ) α 2 1 2σ 2 n (y t αy t 1 ) 2. t=2 (7 marks) (iii) Using (ii) and adopting unconditional least squares, show that the approximate likelihood estimates of α and σ 2 satisfy ˆα = n t=2 y ty t 1 n and ˆσ 2 = (1 ˆα2 )y1 2 + n t=2 (y t ˆαy t 1 ) 2. t=3 y2 t 1 + ˆσ 2 n (8 marks) (iv) If σ 2 = y1, 2 show that the maximum likelihood of α based on unconditional least squares is approximately equal to the maximum likelihood of α using conditional least squares. MAS6011 10 Continued

6 An airliner on a long-haul journey is set to fly at a pre-specified steady speed of v km/hr, but atmospheric conditions sometimes speed it up and sometimes slow it down. A simple model for its true position x t along the flight path at time t hours supposes that x t+1 = x t +v +η t, where {η t } is a zero mean normal white noise sequence with variance σ 2 η. The position x t can be observed only with error, the observed position y t at time t being y t = x t +ǫ t, where ǫ t is a zero mean normal white noise with variance σ 2 ǫ, uncorrelated with the other variables. (i) Show that if β t = (x t,v), the system can be described by the equations y t = g β t +ǫ t, (1) β t = Fβ t 1 +ζ t, (2) where g is a constant vector, F is a constant matrix and ζ t is a random vector. Give the values of g and F and write down the mean and covariance matrix of ζ t. What are equations (2) and (3) called in the context of state space linear modelling? (4 marks) (ii) Suppose that hourly observations of the plane s position are available up to time t 1 and it is believed that the true position x t 1 at time t 1 has posterior normal distribution with mean m t 1 and variance v t 1. Show that, given this information, the prior mean of the position x t at time t is m t 1 +v and that the prior variance is v t 1 +ση. 2 (4 marks) (iii) Find the posterior distribution of the true position x t at time t when an observation y t of the plane s position at time t becomes available. (8 marks) (iv) Suppose that v = 720, σ ǫ = 10 and σ η = 30. If, one hour into the flight, an observation gives the position as y 1 = 750 km and it was known with certainty that at time t = 0 the true position had been x 0 = 0, find the posterior mean of the plane s position at t = 1, and the associated posterior variance. Hence give an approximate 95% credible interval for the true position x 1 after one hour. (4 marks) End of Question Paper MAS6011 11