Statistics & Data Sciences: First Year Prelim Exam May 2018

Instructions:

1. Do not turn this page until instructed to do so.
2. Start each new question on a new sheet of paper.
3. This is a closed-book exam.
4. Show your working and explain your reasoning as fully as possible.
5. Switch off all electronic devices.
6. The time limit is 3 hours.
7. The exam has 6 questions; do as much as you can, but credit will be given for only 5 answers.

Question 1. Consider the exponential family model given by
$$f_Y(y; \theta) = c(y) \exp\{y\theta - b(\theta)\},$$
where $c$ is a known function and $b$ determines the normalizing constant.

(i) Show that $E\,Y = b'(\theta)$, where $b'$ denotes the derivative of $b$, assumed to exist.

(ii) Prove that $b$ is a convex function; i.e. $b''(\theta) \ge 0$, assumed to exist.

(iii) Show that $\mathrm{Var}\,Y = b''(\theta)$.

The MLE $\hat\theta$ is given by $b'(\hat\theta) = \bar Y$, where $\bar Y$ is the sample mean.

(iv) If the true model is $f_0(y)$, i.e. the data arise from $f_0$, show that $\hat\theta$ converges to the $\theta$ which maximizes $\int f_0(y) \log f_Y(y; \theta)\, dy$.
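For concreteness, a standard special case (my own illustration, not part of the exam) against which the identities in (i)-(iii) can be checked is the Poisson distribution written in canonical form, with $c(y) = 1/y!$ and $b(\theta) = e^\theta$:
$$f_Y(y;\theta) = \frac{1}{y!}\exp\{y\theta - e^\theta\}, \qquad E\,Y = b'(\theta) = e^\theta, \qquad \mathrm{Var}\,Y = b''(\theta) = e^\theta,$$
matching the usual Poisson mean and variance (both equal to the rate $e^\theta$).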

Question 2. Suppose $X_1, \dots, X_n$ are independent and identically distributed from some distribution function $F(x)$ and the aim is to test
$$H_0: F(0) = \tfrac{1}{2} \quad \text{vs} \quad H_1: F(0) \ne \tfrac{1}{2}.$$
That is, the test is to see whether the median of $F$ is at 0 or not.

(i) If $H_0$ is true, what is the distribution of
$$T = \sum_{i=1}^n \mathbf{1}(X_i \le 0)?$$
Here $\mathbf{1}(X \le 0) = 1$ if $X \le 0$ and otherwise is 0.

(ii) Hence, using a normal approximation for $T$, which will involve finding the mean and variance of $T$, describe how you would test the hypothesis; i.e. find the test statistic and how to obtain the critical value.

(iii) Find the Laplace transform of a normal density function with mean $\mu$ and variance $\sigma^2$; i.e. find $E\, e^{\theta X}$ where $X \sim N(\mu, \sigma^2)$.

(iv) Using Laplace transforms, show that
$$\frac{T/n - F(0)}{\sqrt{F(0)\,(1 - F(0))/n}}$$
converges in distribution to a standard normal random variable.
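As a quick numerical illustration (not required by the question), the sketch below simulates the sign-test statistic $T$ and the normal approximation from (ii) under $H_0$. The sample size, significance level, and choice of standard-normal data (one distribution with median 0) are assumptions made purely for the illustration.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
n, alpha = 200, 0.05

# Data with median 0, so H0: F(0) = 1/2 holds.
x = rng.normal(size=n)

T = np.sum(x <= 0)                          # number of observations at or below 0
z = (T / n - 0.5) / np.sqrt(0.25 / n)       # standardized statistic under H0
reject = abs(z) > norm.ppf(1 - alpha / 2)   # two-sided test at level alpha
print(T, round(z, 3), reject)
```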

Question 3. Let $P = (p_{ij})$ be an $M \times M$ transition matrix for a discrete Markov chain $(X_n)$ on $\{1, \dots, M\}$, for $M > 1$. Hence, for all $n \ge 1$,
$$P(X_n = j \mid X_{n-1} = i) = p_{ij}.$$

(i) Show that all the eigenvalues of $P$ lie between $-1$ and $+1$ and that $1$ is actually the largest eigenvalue. Hint: for any vector $v = (v_1, \dots, v_M)$ of dimension $M$ we have
$$v_{(1)} \le \sum_{j=1}^M p_{ij} v_j \le v_{(M)},$$
where $v_{(1)}$ is the minimum of $(v_1, \dots, v_M)$ and $v_{(M)}$ is the maximum.

(ii) If $P$ is reducible, explain why the second largest eigenvalue of $P$ is 1.

Now suppose $\pi$ is the unique stationary probability for $P$, i.e. $\pi P = \pi$, and that $P$ is irreducible and that the smallest eigenvalue of $P$ is greater than $-1$. Let $(\pi, w_2, \dots, w_M)$ be the left eigenvectors of $P$, with $\lambda_j$ the eigenvalue corresponding to $w_j$.

(iii) Prove that $w_j \mathbf{1} = 0$, where $\mathbf{1}$ is the $M$-vector of 1's.

(iv) The chain is started at $q_0$; i.e. $P(X_0 = j) = q_{0j}$. Show that we can write
$$q_0 = \alpha\, \pi + \sum_{j=2}^M \alpha_j w_j$$
and, using (iii), show that $\alpha = 1$. Show that the probability mass function for $X_n$ is
$$q_n = \pi + \sum_{j=2}^M \alpha_j \lambda_j^n w_j$$
and hence explain why $q_n \to \pi$.
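A small numerical illustration of these facts (my own example, not part of the exam): the spectrum of a $3 \times 3$ transition matrix lies in $(-1, 1]$ with largest eigenvalue 1, the corresponding left eigenvector is the stationary distribution $\pi$, and $q_n = q_0 P^n$ approaches $\pi$.

```python
import numpy as np

# A small irreducible, aperiodic transition matrix (illustrative choice).
P = np.array([[0.5, 0.3, 0.2],
              [0.2, 0.6, 0.2],
              [0.3, 0.3, 0.4]])

# Left eigenvectors of P are right eigenvectors of P^T.
eigvals, left_vecs = np.linalg.eig(P.T)
print(np.round(np.sort(eigvals.real), 4))   # all in (-1, 1], largest equals 1

# Stationary distribution pi: left eigenvector for eigenvalue 1, normalized to sum to 1.
pi = left_vecs[:, np.argmax(eigvals.real)].real
pi = pi / pi.sum()

# q_n = q_0 P^n approaches pi; the rate is governed by the second-largest |eigenvalue|.
q = np.array([1.0, 0.0, 0.0])
for _ in range(50):
    q = q @ P
print(np.round(pi, 4), np.round(q, 4))
```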

Question 4. Assume independent observations $(y_i)_{i=1}^n$ arise from the model
$$y_i = \theta_i + \varepsilon_i,$$
where $\theta = (\theta_1, \dots, \theta_n)$ are unknown parameters, $E\,\varepsilon_i = 0$ and $\mathrm{Var}\,\varepsilon_i = \sigma^2$, and $\sigma > 0$ is assumed known. The true value of the parameter is denoted by $\theta_0$.

(i) Find the least squares estimator $\hat\theta$ of $\theta$ subject to the constraint
$$\sum_{i=1}^n \theta_i = \tau$$
for some fixed $\tau$.

(ii) Find $E\,\hat\theta$ and $\mathrm{Cov}\,\hat\theta$.

Now consider the model
$$y_i = x_i^\top \theta + \varepsilon_i,$$
where the $\varepsilon_i$ are the same as before, but now $\theta$ is an unknown parameter of dimension $d$ and $x_i = (x_{i1}, \dots, x_{id})$ is a vector of covariates.

(iii) Find the least squares estimator of $\theta$ subject to the constraint
$$\theta^\top \theta = \tau$$
for some fixed $\tau > 0$.

(iv) Show that the estimator given in (iii) has a Bayesian interpretation.
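For orientation only (synthetic data and the penalty value are assumptions of the illustration, not the exam's answer), the sketch below computes the ridge-type estimator that arises from the Lagrangian form of the constrained problem in (iii), which is also where the Bayesian reading in (iv) comes from.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic design and response (illustrative only).
n, d = 100, 5
X = rng.normal(size=(n, d))
theta_true = rng.normal(size=d)
y = X @ theta_true + rng.normal(size=n)

# Lagrangian form of the constraint theta' theta = tau: for some multiplier
# lam >= 0 (chosen so the constraint binds), the minimizer is the ridge estimator.
lam = 1.0
theta_ridge = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

# Bayesian flavor of (iv): with Gaussian errors and a zero-mean Gaussian prior on
# theta whose precision is proportional to lam, the posterior mean is the same expression.
print(np.round(theta_ridge, 3))
```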

Question 5. Consider the model for which
$$P(y_i = k \mid x_i, \theta, \beta) = \frac{(\theta\, e^{\beta x_i})^k}{k!} \exp\{-\theta\, e^{\beta x_i}\}$$
for $k \in \{0, 1, 2, \dots\}$. So this is a Poisson regression model. The prior for $\theta$ is gamma$(a, b)$ and the prior for $\beta$ is $N(0, \sigma^2)$, where $(\sigma, a, b)$ is specified.

(i) Write down the posterior distribution for $(\theta, \beta)$.

(ii) Describe in detail a Gibbs sampler for approximately sampling from the posterior. In particular, show that the transition density for your chain has the posterior as the stationary density.

(iii) If the above model is referred to as model $M_1$, then $M_2$ is the same model except now $\beta = 0$ is fixed. If the prior for each model is $1/2$, i.e. $P(M_1) = P(M_2) = 1/2$, give a method by which you could compute $P(M_1 \mid \text{data})$.

(iv) Give a full interpretation of the probabilities for the models, given that if model $M_2$ is correct then so is model $M_1$. This interpretation should include the cases when one of the models is correct and when neither is correct.
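For orientation only (the exam asks for a written description and a stationarity argument), here is a minimal Metropolis-within-Gibbs sketch of the kind of sampler part (ii) invites. It assumes a rate parameterization of the gamma$(a,b)$ prior, synthetic data, and particular hyperparameter and proposal values; the $\theta$ update uses the conjugate gamma full conditional, while $\beta$ has no standard full conditional and is updated with a random-walk Metropolis step.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data under assumed "true" values (illustrative only).
n = 200
x = rng.normal(size=n)
theta_true, beta_true = 2.0, 0.5
y = rng.poisson(theta_true * np.exp(beta_true * x))

a, b, sigma = 1.0, 1.0, 1.0   # hyperparameters (assumed values)

def log_cond_beta(beta, theta):
    # log full conditional of beta, up to an additive constant:
    # sum_i [y_i * beta * x_i - theta * exp(beta * x_i)] - beta^2 / (2 sigma^2)
    return beta * np.sum(x * y) - theta * np.sum(np.exp(beta * x)) - beta**2 / (2 * sigma**2)

theta, beta = 1.0, 0.0
draws = []
for it in range(5000):
    # theta | beta, y is conjugate: gamma(a + sum(y), rate = b + sum(exp(beta * x))).
    theta = rng.gamma(a + y.sum(), 1.0 / (b + np.exp(beta * x).sum()))
    # beta | theta, y via a random-walk Metropolis step.
    prop = beta + 0.1 * rng.normal()
    if np.log(rng.uniform()) < log_cond_beta(prop, theta) - log_cond_beta(beta, theta):
        beta = prop
    draws.append((theta, beta))

print(np.mean(draws[1000:], axis=0))  # posterior means after burn-in
```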

Question 6. Suppose $p_\theta(x)$ is a density function for each $\theta \in \Theta$, $p_0(x)$ is the member of the family for a particular $\theta_0$, and the observations $(x_1, \dots, x_n)$ are i.i.d. from $p_0$. The aim is to estimate $\theta$ by minimizing the Fisher information distance given by
$$F(p_0, p_\theta) = \int p_0(x) \left( \frac{p_0'(x)}{p_0(x)} - \frac{p_\theta'(x)}{p_\theta(x)} \right)^2 dx,$$
where $p_0'(x)$ denotes the derivative with respect to $x$ (and similarly $p_\theta'$).

(i) Show that, under regularity conditions, we can write
$$F(p_0, p_\theta) = \kappa + \int p_0(x) \left[ \left( \frac{p_\theta'(x)}{p_\theta(x)} \right)^2 + 2\, \frac{d}{dx}\!\left( \frac{p_\theta'(x)}{p_\theta(x)} \right) \right] dx,$$
where $\kappa$ is a constant not depending on $\theta$.

(ii) If
$$p_\theta(x) = \frac{\exp(w(x)/\theta)}{\int \exp(w(x)/\theta)\, dx},$$
find
$$\ell(\theta, x) = \left( \frac{p_\theta'(x)}{p_\theta(x)} \right)^2 + 2\, \frac{d}{dx}\!\left( \frac{p_\theta'(x)}{p_\theta(x)} \right).$$

(iii) Hence, explain how one could estimate $\theta$ from the observations using an approximation to $F(p_0, p_\theta)$.

(iv) Find the form of the estimator using your answer to (iii) with the model in (ii).
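As an illustrative sanity check beyond the exam (a single special case, not the general answer to (iv)): take $w(x) = -x^2/2$ in (ii), so that $p_\theta$ is the $N(0, \theta)$ density. Then
$$\frac{p_\theta'(x)}{p_\theta(x)} = \frac{w'(x)}{\theta} = -\frac{x}{\theta}, \qquad \ell(\theta, x) = \frac{x^2}{\theta^2} - \frac{2}{\theta},$$
and minimizing the empirical criterion $\frac{1}{n}\sum_{i=1}^n \ell(\theta, x_i)$ over $\theta$ gives $\hat\theta = \frac{1}{n}\sum_{i=1}^n x_i^2$, the familiar variance estimate, which is a reassuring check on the approach in (iii).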