Inferences about Parameters of Trivariate Normal Distribution with Missing Data


Florida International University, FIU Digital Commons. FIU Electronic Theses and Dissertations, University Graduate School. 7-5-2013. Inferences about Parameters of Trivariate Normal Distribution with Missing Data. Xing Wang, Florida International University, xwang9@fiu.edu. DOI:.548/etd.FI3899. Follow this and additional works at: http://digitalcommons.fiu.edu/etd. Part of the Statistical Models Commons, and the Statistical Theory Commons. Recommended Citation: Wang, Xing, "Inferences about Parameters of Trivariate Normal Distribution with Missing Data" (2013). FIU Electronic Theses and Dissertations. 933. http://digitalcommons.fiu.edu/etd/933. This work is brought to you for free and open access by the University Graduate School at FIU Digital Commons. It has been accepted for inclusion in FIU Electronic Theses and Dissertations by an authorized administrator of FIU Digital Commons. For more information, please contact dcc@fiu.edu.

FLORIDA INTERNATIONAL UNIVERSITY Miami, Florida. INFERENCES ABOUT PARAMETERS OF TRIVARIATE NORMAL DISTRIBUTION WITH MISSING DATA. A thesis submitted in partial fulfillment of the requirements for the degree of MASTER OF SCIENCE in STATISTICS by Xing Wang, 2013

To: Dean Kenneth Furton, College of Arts and Sciences. This thesis, written by Xing Wang, and entitled Inferences about Parameters of Trivariate Normal Distribution with Missing Data, having been approved in respect to style and intellectual content, is referred to you for judgment. We have read this thesis and recommend that it be approved. Florence George; Kai Huang, Co-Major Professor; Jie Mi, Co-Major Professor. Date of Defense: July 5, 2013. The thesis of Xing Wang is approved. Dean Kenneth Furton, College of Arts and Sciences. Dean Lakshmi N. Reddi, University Graduate School. Florida International University, 2013

ACKNOWLEDGMENTS First of all, I would like to express my sincere thanks to my major professor, Dr. Jie Mi, for his patient guidance, enthusiasm, encouragement and friendship throughout this whole study. I couldn't have finished my research without his great support. I would like to express my deepest appreciation to my co-major professor, Dr. Kai Huang, for his technical support and patient guidance in LaTeX and MATLAB. I would also like to thank the member of my committee, Dr. Florence George, for her time, valuable advice and great encouragement. In addition, I would like to thank the Department of Mathematics and Statistics and all the professors who supported and encouraged me throughout my life at FIU. I wish to thank auntie Miao, who gave me substantial help in my life. I also thank my classmate Maria, my technical support in MATLAB. I am really grateful to have them with me, and may God bless them all.

ABSTRACT OF THE THESIS. INFERENCES ABOUT PARAMETERS OF TRIVARIATE NORMAL DISTRIBUTION WITH MISSING DATA by Xing Wang, Florida International University, 2013, Miami, Florida. Professor Jie Mi, Co-Major Professor; Professor Kai Huang, Co-Major Professor. The multivariate normal distribution is commonly encountered in many fields, and missing values are a frequent issue in practice. The purpose of this research was to estimate the parameters of the three-dimensional covariance permutation-symmetric normal distribution with complete data and with all possible patterns of incomplete data. In this study, the MLEs with missing data were derived, and the properties of the MLEs as well as their sampling distributions were obtained. A Monte Carlo simulation study was used to evaluate the performance of the considered estimators for both cases, when ρ was known and when it was unknown. All results indicated that, compared to estimators obtained by omitting observations with missing data, the estimators derived here led to better performance. Furthermore, when ρ was unknown, using the estimate of ρ led to the same conclusion. Keywords: Trivariate Normal Distribution, Permutation-Symmetric Covariance, Missing Data, MLE.

TABLE OF CONTENTS

I. Introduction
II. Review of MLE with complete data
III. The Maximum Likelihood Estimates with Missing Data
  1. Notation and Assumption
  2. Likelihood Function
  3. Derivation of ˆµ and ˆτ
IV. Properties of MLEs
  1. Mean of ˆµ
  2. Variance of ˆµ
  3. Variance of ˆµ1 − ˆµ2
V. Sampling distributions
  1. The distribution of ˆµ − µ
  2. The distribution of ˆµ1 − ˆµ2
VI. Simulation Study with Known ρ
VII. Simulation Study with Unknown ρ
VIII. Conclusion
REFERENCES

LIST OF TABLES

1. Coverage Probability of confidence regions of µ
2. The Probability of Type I error
3. The Power of testing H0: µ = 0 vs. Ha: µ ≠ 0
4. The Coverage probability of confidence intervals of µ1
5. The average width of confidence intervals of µ1
6. The average variance of ˆµ1
7. The coverage probability of confidence intervals of µ1 − µ2
8. The average width of confidence intervals of µ1 − µ2
9. The average variance of ˆµ1 − ˆµ2
10. Coverage Probability of confidence regions of µ
11. The Probability of Type I error
12. The Power of testing H0: µ = 0 vs. Ha: µ ≠ 0
13. The Coverage probability of confidence intervals of µ1
14. The average width of confidence intervals of µ1
15. The average variance of ˆµ1
16. The coverage probability of confidence intervals of µ1 − µ2
17. The average width of confidence intervals of µ1 − µ2
18. The average variance of ˆµ1 − ˆµ2

LIST OF FIGURES

1. Comparison of Coverage Probabilities of Confidence Region of µ
2. Absolute Value of the Difference Between the Coverage Probabilities and 0.95
3. Comparison of Type I Error
4. Absolute Value of the Difference Between the Type I Error and α = 0.05
5. Compare the Power of Testing
6. Compare the Coverage Probabilities of the confidence intervals of µ1
7. Absolute Value of the Difference Between the Coverage Probabilities and 0.95
8. Compare the Average Widths of the Confidence Intervals of µ1
9. Compare the average variance of ˆµ1
10. Compare the Coverage Probabilities of the confidence intervals of µ1 − µ2
11. Absolute value of the difference between Coverage Probabilities and 0.95
12. Compare Average Widths of Confidence Intervals of µ1 − µ2
13. Compare the average variance of ˆµ1 − ˆµ2
14. Compare the Coverage Probabilities of the Confidence Regions of µ
15. Absolute value of the difference between Coverage Probabilities and 0.95
16. Compare the Type I Error
17. Absolute value of difference between Type I Error and α = 0.05
18. Compare the Power of Testing
19. Compare the Coverage Probabilities of the Confidence Intervals of µ1
20. Absolute value of the difference between Coverage Probabilities and 0.95
21. Compare the Average Widths of the Confidence Intervals of µ1
22. Compare the average variance of ˆµ1
23. Compare the Coverage Probabilities of the Confidence Intervals of µ1 − µ2
24. Absolute value of the difference between Coverage Probabilities and 0.95
25. Compare the Average Widths of the Confidence Intervals of µ1 − µ2
26. Compare the average variance of ˆµ1 − ˆµ2

I. Introduction

In practice, the normal distribution is commonly encountered, since the sampling distributions of many multivariate statistics are approximately normal because of a central limit effect. Early applications of the multivariate normal distribution were in biological studies. For instance, we often think that height and weight, when observed on the same individual, approximately follow a bivariate normal distribution. We could extend this to foot size, or any other related physical characteristic, and these measurements together could follow a multivariate normal distribution. Moreover, in practice we often have to analyze data that contain missing values. Incomplete normal data can arise in any number of scientific investigations: early detection of diseases, wildlife survey research, mental health research, and so on. Missing data are one of the most pervasive problems in the analysis of data, and can occur for a variety of reasons. For example, a participant may refuse to answer some of the questions; the use of new instruments results in incomplete historical data; some information may be purposely excised to protect confidentiality. Estimation of the parameters of a multivariate normal distribution when data are incomplete has been discussed by many authors. A systematic approach to the missing-values problem was derived using likelihoods of the observed values. Wilks (1932) considered MLEs for a bivariate normal population with missing data in both variables. Srivastava and Zaatar (1973) used Monte Carlo simulation to compare four estimators of the covariance matrix of a bivariate normal distribution. Edgett (1956) gave maximum likelihood estimates of the parameters of a trivariate normal distribution when observations on only one variable are missing. Lord (1955) and Matthai (1951) also found estimates of parameters

of a trivariate normal distribution in other special cases. Anderson (1957) indicated how one could obtain the Maximum Likelihood Estimates when the sample was monotone. Hocking and Smith (1968) developed a method of estimating the parameters of a p-variate normal distribution with zero mean vector in which the missing observations are not required to follow particular patterns. Their estimation technique can be summarized as follows: (a) the data were divided into groups according to which variates were missing; (b) initial estimates of the parameters were obtained from the group of observations with no missing variates; (c) these initial estimators were modified by optimally adjoining the information in the remaining groups in a sequential manner until all the data were used. Hocking and Marx (1979) used the same method as Hocking and Smith (1968) to derive estimates, but their use of matrices simplified the notation and gave the estimates in a form that was easily implemented on a computer. They also gave exact small-sample moments of the estimators for the case of two data groups. The purpose of my research is to estimate the parameters of the three-dimensional covariance permutation-symmetric normal distribution with complete data and all possible patterns of incomplete data, and then to study the properties of the estimators. It is assumed that all correlation coefficients and all variances are equal, which means that we focus on the covariance permutation-symmetric trivariate normal distribution. A special case of the covariance permutation-symmetric model is that of exchangeable normal variables, which make their appearance in many statistical applications. For instance, in the Bayes theory concerning normal observations, it is generally assumed that the prior distribution

of θ, the population mean, has an N(µ, ω²) distribution for some µ and ω² > 0. Thus, given θ, if the random variables are (conditionally) i.i.d. normal variables, then the marginal distribution of the data vector X = (X1, ..., Xn) is a mixture. In this case X1, ..., Xn are exchangeable normal variables. The thesis is organized as follows. The Maximum Likelihood Estimates of the parameters of the trivariate normal distribution with complete data are reviewed in Section 2. Maximum Likelihood Estimates with missing data are derived in Section 3. In Section 4, the properties of the MLEs are obtained, followed by the sampling distributions in Section 5. The numerical studies based on Monte Carlo simulations are considered in Section 6, which include (a) comparison of confidence regions of µ; (b) the probability of Type I error, and the power of testing H0: µ = 0 vs Ha: µ ≠ 0; (c) coverage probability and average width of confidence intervals for µ1; (d) coverage probability and average width of confidence intervals for µ1 − µ2, using i) only three-dimensional observations and ii) both three-dimensional observations and all possible incomplete observations. Finally, in Section 7, the simulation follows the same procedure as in Section 6 but with ρ unknown.
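The grouping of incomplete trivariate observations used throughout the thesis can be sketched in code (a minimal illustration, assuming NumPy is available; the function and variable names are hypothetical, and NaN marks a missing entry):

```python
import numpy as np

def group_by_pattern(data):
    """Partition rows of an (m, 3) array into missingness-pattern groups;
    NaN marks a missing entry, and only the observed entries are kept."""
    groups = {}
    for row in data:
        obs = [j for j in range(3) if not np.isnan(row[j])]
        if obs:  # a row with every entry missing carries no information
            groups.setdefault(tuple(j + 1 for j in obs), []).append(row[obs])
    return {k: np.array(v) for k, v in groups.items()}

data = np.array([[1.0, 2.0, 3.0],
                 [1.5, np.nan, 2.5],
                 [np.nan, 0.5, np.nan],
                 [2.0, 1.0, 0.0]])
groups = group_by_pattern(data)
print(sorted(groups))   # → [(1, 2, 3), (1, 3), (2,)]
```

With three variables this yields at most the seven usable groups of the thesis: one complete group, three bivariate groups, and three univariate groups.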

II. Review of MLE with complete data

In order to obtain the trivariate normal likelihood function, let us assume that the 3 × 1 vectors X1, X2, ..., Xn represent a random sample from a three-dimensional multivariate normal population with mean vector µ and covariance matrix Σ, where Xi = (x_i1, x_i2, x_i3)′, i = 1, 2, ..., n, µ = (µ1, µ2, µ3)′ and

Σ = [ σ11  σ12  σ13
      σ12  σ22  σ23
      σ13  σ23  σ33 ]

Since X1, X2, ..., Xn are mutually independent and each has a distribution N3(µ, Σ), the joint density function of all the observations is the product of the marginal normal densities:

f(X1, X2, ..., Xn) = Π_{i=1}^{n} { (2π)^{−3/2} |Σ|^{−1/2} exp[ −(Xi − µ)′ Σ^{−1} (Xi − µ)/2 ] }
= (2π)^{−3n/2} |Σ|^{−n/2} exp[ −Σ_{i=1}^{n} (Xi − µ)′ Σ^{−1} (Xi − µ)/2 ]

It has been derived that

ˆµ = X̄   (1)

and

ˆΣ = (1/n) Σ_{i=1}^{n} (Xi − X̄)(Xi − X̄)′ = ((n − 1)/n) S   (2)

are the maximum likelihood estimates of µ and Σ respectively, where

X̄ = (1/n) Σ_{i=1}^{n} Xi

and

S = [1/(n − 1)] Σ_{i=1}^{n} (Xi − X̄)(Xi − X̄)′.
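The complete-data estimates reviewed above can be checked numerically (a minimal sketch, assuming NumPy is available; the simulated parameter values are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
mu = np.array([1.0, -1.0, 2.0])
Sigma = np.array([[2.0, 0.6, 0.6],
                  [0.6, 2.0, 0.6],
                  [0.6, 0.6, 2.0]])
X = rng.multivariate_normal(mu, Sigma, size=n)

mu_hat = X.mean(axis=0)            # MLE of mu: the sample mean vector
S = np.cov(X, rowvar=False)        # unbiased S, divisor n - 1
Sigma_hat = (n - 1) / n * S        # MLE of Sigma rescales S by (n - 1)/n

centered = X - mu_hat
assert np.allclose(Sigma_hat, centered.T @ centered / n)
```

The final assertion verifies that the rescaled sample covariance equals the average of the outer products of the centered observations, which is exactly the MLE form.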

III. The Maximum Likelihood Estimates with Missing Data

1. Notation and Assumption

In the present paper, we consider the covariance permutation-symmetric trivariate normal distribution, which means that σ11 = σ22 = σ33 = σ² and all correlation coefficients are equal: ρ12 = ρ13 = ρ23 = ρ. By denoting τ = σ², the density function can be expressed as:

f(x1, x2, x3) = exp{ −w / [2τ(1 − ρ)(1 + 2ρ)] } / [ (2π)^{3/2} τ^{3/2} √(2ρ³ − 3ρ² + 1) ]   (3)

where

w = (1 + ρ)[(x1 − µ1)² + (x2 − µ2)² + (x3 − µ3)²] − 2ρ[(x1 − µ1)(x2 − µ2) + (x1 − µ1)(x3 − µ3) + (x2 − µ2)(x3 − µ3)]

and 2ρ³ − 3ρ² + 1 = (1 − ρ)²(1 + 2ρ) > 0 for ρ ∈ (−1/2, 1).

The data in our study are divided into groups according to which variables are missing, and initial estimates of the parameters are obtained from the group of observations with no missing variables. Specifically, it is assumed that a sample of size n + n12 + n13 + n23 + n1 + n2 + n3 is taken from a three-dimensional normal distribution, but some of the observations have randomly occurring missing entries. In this case, the observations can be placed in 7 groups. Specifically, it means that the data consist of n complete observations {(x_i1, x_i2, x_i3), i = 1, ..., n} on X = (X1, X2, X3), n12 observations {(x^(12)_i1, x^(12)_i2), i = 1, ..., n12} on (X1, X2), n13 observations {(x^(13)_i1, x^(13)_i3), i = 1, ..., n13} on (X1, X3), n23 observations {(x^(23)_i2, x^(23)_i3), i = 1, ..., n23} on (X2, X3), n1 observations {x^(1)_i, i = 1, ..., n1} on X1, n2 observations {x^(2)_i, i = 1, ..., n2} on X2, and n3 observations {x^(3)_i, i = 1, ..., n3} on X3.
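The determinant and quadratic form behind the density above follow from the structure Σ = τ[(1 − ρ)I + ρJ]; both identities can be verified numerically (a sketch assuming NumPy; the chosen τ and ρ values are arbitrary):

```python
import numpy as np

tau, rho = 2.5, 0.4          # any tau > 0 and rho in (-1/2, 1)
I, J = np.eye(3), np.ones((3, 3))
Sigma = tau * ((1 - rho) * I + rho * J)   # equal variances, equal correlations

# |Sigma| = tau^3 (1 - rho)^2 (1 + 2 rho) = tau^3 (2 rho^3 - 3 rho^2 + 1)
assert np.isclose(np.linalg.det(Sigma), tau**3 * (1 - rho)**2 * (1 + 2*rho))

x = np.array([0.3, -1.2, 0.7])
mu = np.array([0.1, 0.2, 0.3])
d = x - mu
w = (1 + rho) * np.sum(d**2) - 2*rho * (d[0]*d[1] + d[0]*d[2] + d[1]*d[2])

# (x - mu)' Sigma^{-1} (x - mu) = w / [tau (1 - rho)(1 + 2 rho)]
quad = d @ np.linalg.solve(Sigma, d)
assert np.isclose(quad, w / (tau * (1 - rho) * (1 + 2*rho)))
```

The second assertion is the quadratic-form identity that produces the exponent of the density (3).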

2. Likelihood Function

Following Hocking and Smith (1968), the likelihood function can be written as:

L(µ, τ) = L0(µ, τ) · L12(µ1, µ2, τ) · L13(µ1, µ3, τ) · L23(µ2, µ3, τ) · L1(µ1, τ) · L2(µ2, τ) · L3(µ3, τ)
= Π_{i=1}^{n} f(x_i; µ, τ) · Π_{i=1}^{n12} f(x^(12)_i; µ1, µ2, τ) · Π_{i=1}^{n13} f(x^(13)_i; µ1, µ3, τ) · Π_{i=1}^{n23} f(x^(23)_i; µ2, µ3, τ) · Π_{i=1}^{n1} f(x^(1)_i; µ1, τ) · Π_{i=1}^{n2} f(x^(2)_i; µ2, τ) · Π_{i=1}^{n3} f(x^(3)_i; µ3, τ)   (4)

where x_i = (x_i1, x_i2, x_i3)′, i = 1, ..., n; x^(12)_i = (x^(12)_i1, x^(12)_i2)′, i = 1, ..., n12; x^(13)_i = (x^(13)_i1, x^(13)_i3)′, i = 1, ..., n13; x^(23)_i = (x^(23)_i2, x^(23)_i3)′, i = 1, ..., n23; x^(1)_i, i = 1, ..., n1; x^(2)_i, i = 1, ..., n2; x^(3)_i, i = 1, ..., n3.

L0 is used to denote the likelihood function for the three-dimensional complete observations:

L0(µ, τ) = exp{ −Σ_{i=1}^{n} w_i / [2τ(1 − ρ)(1 + 2ρ)] } / [ (2π)^{3/2} τ^{3/2} √(2ρ³ − 3ρ² + 1) ]^{n}

with

w_i = (1 + ρ)[(x_i1 − µ1)² + (x_i2 − µ2)² + (x_i3 − µ3)²] − 2ρ[(x_i1 − µ1)(x_i2 − µ2) + (x_i1 − µ1)(x_i3 − µ3) + (x_i2 − µ2)(x_i3 − µ3)]

Similarly, the other factors are defined as follows:

L12(µ1, µ2, τ) = exp{ −[ Σ_{i=1}^{n12} (x^(12)_i1 − µ1)² − 2ρ Σ_{i=1}^{n12} (x^(12)_i1 − µ1)(x^(12)_i2 − µ2) + Σ_{i=1}^{n12} (x^(12)_i2 − µ2)² ] / [2τ(1 − ρ²)] } / [ (2π)^{n12} τ^{n12} (1 − ρ²)^{n12/2} ]

L13(µ1, µ3, τ) = exp{ −[ Σ_{i=1}^{n13} (x^(13)_i1 − µ1)² − 2ρ Σ_{i=1}^{n13} (x^(13)_i1 − µ1)(x^(13)_i3 − µ3) + Σ_{i=1}^{n13} (x^(13)_i3 − µ3)² ] / [2τ(1 − ρ²)] } / [ (2π)^{n13} τ^{n13} (1 − ρ²)^{n13/2} ]

L23(µ2, µ3, τ) = exp{ −[ Σ_{i=1}^{n23} (x^(23)_i2 − µ2)² − 2ρ Σ_{i=1}^{n23} (x^(23)_i2 − µ2)(x^(23)_i3 − µ3) + Σ_{i=1}^{n23} (x^(23)_i3 − µ3)² ] / [2τ(1 − ρ²)] } / [ (2π)^{n23} τ^{n23} (1 − ρ²)^{n23/2} ]

L1(µ1, τ) = exp{ −Σ_{i=1}^{n1} (x^(1)_i − µ1)² / (2τ) } / [ (2π)^{n1/2} τ^{n1/2} ]

L2(µ2, τ) = exp{ −Σ_{i=1}^{n2} (x^(2)_i − µ2)² / (2τ) } / [ (2π)^{n2/2} τ^{n2/2} ]

L3(µ3, τ) = exp{ −Σ_{i=1}^{n3} (x^(3)_i − µ3)² / (2τ) } / [ (2π)^{n3/2} τ^{n3/2} ]

3. Derivation of ˆµ and ˆτ

In the rest of this study it is assumed that ρ ∈ (−1/2, 1) is known. In order to apply the likelihood-based method to obtain the MLEs ˆµ and ˆτ of µ and τ, the natural logarithms of the above likelihood functions are taken:

ln L0(µ, τ) = −(3n/2) ln(2π) − (3n/2) ln τ − (n/2) ln(2ρ³ − 3ρ² + 1)
− [(1 + ρ) / (2τ(1 − ρ)(1 + 2ρ))] Σ_{i=1}^{n} [(x_i1 − µ1)² + (x_i2 − µ2)² + (x_i3 − µ3)²]
+ [ρ / (τ(1 − ρ)(1 + 2ρ))] Σ_{i=1}^{n} [(x_i1 − µ1)(x_i2 − µ2) + (x_i1 − µ1)(x_i3 − µ3) + (x_i2 − µ2)(x_i3 − µ3)]

ln L12(µ1, µ2, τ) = −n12 ln(2π) − n12 ln τ − (n12/2) ln(1 − ρ²)
− [1 / (2τ(1 − ρ²))] [ Σ_{i=1}^{n12} (x^(12)_i1 − µ1)² − 2ρ Σ_{i=1}^{n12} (x^(12)_i1 − µ1)(x^(12)_i2 − µ2) + Σ_{i=1}^{n12} (x^(12)_i2 − µ2)² ]

ln L13(µ1, µ3, τ) = −n13 ln(2π) − n13 ln τ − (n13/2) ln(1 − ρ²)
− [1 / (2τ(1 − ρ²))] [ Σ_{i=1}^{n13} (x^(13)_i1 − µ1)² − 2ρ Σ_{i=1}^{n13} (x^(13)_i1 − µ1)(x^(13)_i3 − µ3) + Σ_{i=1}^{n13} (x^(13)_i3 − µ3)² ]

ln L23(µ2, µ3, τ) = −n23 ln(2π) − n23 ln τ − (n23/2) ln(1 − ρ²)
− [1 / (2τ(1 − ρ²))] [ Σ_{i=1}^{n23} (x^(23)_i2 − µ2)² − 2ρ Σ_{i=1}^{n23} (x^(23)_i2 − µ2)(x^(23)_i3 − µ3) + Σ_{i=1}^{n23} (x^(23)_i3 − µ3)² ]

ln L1(µ1, τ) = −(n1/2) ln(2π) − (n1/2) ln τ − [1/(2τ)] Σ_{i=1}^{n1} (x^(1)_i − µ1)²

ln L2(µ2, τ) = −(n2/2) ln(2π) − (n2/2) ln τ − [1/(2τ)] Σ_{i=1}^{n2} (x^(2)_i − µ2)²

ln L3(µ3, τ) = −(n3/2) ln(2π) − (n3/2) ln τ − [1/(2τ)] Σ_{i=1}^{n3} (x^(3)_i − µ3)²

Combining all the above expressions together and collecting the constant terms into c0 yields the log-likelihood function (5):

ln L(µ, τ) = c0 − (1/2)(3n + 2n12 + 2n13 + 2n23 + n1 + n2 + n3) ln τ

− [(1 + ρ) / (2τ(1 − ρ)(1 + 2ρ))] Σ_{i=1}^{n} [(x_i1 − µ1)² + (x_i2 − µ2)² + (x_i3 − µ3)²]
+ [ρ / (τ(1 − ρ)(1 + 2ρ))] Σ_{i=1}^{n} [(x_i1 − µ1)(x_i2 − µ2) + (x_i1 − µ1)(x_i3 − µ3) + (x_i2 − µ2)(x_i3 − µ3)]
− [1 / (2τ(1 − ρ²))] [ Σ_{i=1}^{n12} (x^(12)_i1 − µ1)² + Σ_{i=1}^{n12} (x^(12)_i2 − µ2)² + Σ_{i=1}^{n13} (x^(13)_i1 − µ1)² + Σ_{i=1}^{n13} (x^(13)_i3 − µ3)² + Σ_{i=1}^{n23} (x^(23)_i2 − µ2)² + Σ_{i=1}^{n23} (x^(23)_i3 − µ3)² ]
+ [ρ / (τ(1 − ρ²))] [ Σ_{i=1}^{n12} (x^(12)_i1 − µ1)(x^(12)_i2 − µ2) + Σ_{i=1}^{n13} (x^(13)_i1 − µ1)(x^(13)_i3 − µ3) + Σ_{i=1}^{n23} (x^(23)_i2 − µ2)(x^(23)_i3 − µ3) ]
− [1/(2τ)] [ Σ_{i=1}^{n1} (x^(1)_i − µ1)² + Σ_{i=1}^{n2} (x^(2)_i − µ2)² + Σ_{i=1}^{n3} (x^(3)_i − µ3)² ]

(5)

where

c0 = −(3n/2) ln(2π) − (n/2) ln(2ρ³ − 3ρ² + 1) − (n12 + n13 + n23) ln(2π) − [(n12 + n13 + n23)/2] ln(1 − ρ²) − [(n1 + n2 + n3)/2] ln(2π)

In order to find the Maximum Likelihood Estimates of µ = (µ1, µ2, µ3)′ and τ, the partial derivatives of ln L with respect to µ and τ are taken and set to zero:

∂ ln L/∂µ = 0,  ∂ ln L/∂τ = 0

which are equivalent to the following expressions:

∂ ln L/∂µ1 = [(1 + ρ) / (τ(1 − ρ)(1 + 2ρ))] Σ_{i=1}^{n} (x_i1 − µ1) − [ρ / (τ(1 − ρ)(1 + 2ρ))] Σ_{i=1}^{n} [(x_i2 − µ2) + (x_i3 − µ3)] + [1 / (τ(1 − ρ²))] [ Σ_{i=1}^{n12} (x^(12)_i1 − µ1) + Σ_{i=1}^{n13} (x^(13)_i1 − µ1) ] − [ρ / (τ(1 − ρ²))] [ Σ_{i=1}^{n12} (x^(12)_i2 − µ2) + Σ_{i=1}^{n13} (x^(13)_i3 − µ3) ] + (1/τ) Σ_{i=1}^{n1} (x^(1)_i − µ1) = 0   (6)

∂ ln L/∂µ2 = [(1 + ρ) / (τ(1 − ρ)(1 + 2ρ))] Σ_{i=1}^{n} (x_i2 − µ2) − [ρ / (τ(1 − ρ)(1 + 2ρ))] Σ_{i=1}^{n} [(x_i1 − µ1) + (x_i3 − µ3)] + [1 / (τ(1 − ρ²))] [ Σ_{i=1}^{n12} (x^(12)_i2 − µ2) + Σ_{i=1}^{n23} (x^(23)_i2 − µ2) ] − [ρ / (τ(1 − ρ²))] [ Σ_{i=1}^{n12} (x^(12)_i1 − µ1) + Σ_{i=1}^{n23} (x^(23)_i3 − µ3) ] + (1/τ) Σ_{i=1}^{n2} (x^(2)_i − µ2) = 0   (7)

∂ ln L/∂µ3 = [(1 + ρ) / (τ(1 − ρ)(1 + 2ρ))] Σ_{i=1}^{n} (x_i3 − µ3) − [ρ / (τ(1 − ρ)(1 + 2ρ))] Σ_{i=1}^{n} [(x_i1 − µ1) + (x_i2 − µ2)] + [1 / (τ(1 − ρ²))] [ Σ_{i=1}^{n13} (x^(13)_i3 − µ3) + Σ_{i=1}^{n23} (x^(23)_i3 − µ3) ] − [ρ / (τ(1 − ρ²))] [ Σ_{i=1}^{n13} (x^(13)_i1 − µ1) + Σ_{i=1}^{n23} (x^(23)_i2 − µ2) ] + (1/τ) Σ_{i=1}^{n3} (x^(3)_i − µ3) = 0   (8)

∂ ln L/∂τ = −[1/(2τ)] (3n + 2n12 + 2n13 + 2n23 + n1 + n2 + n3)
+ [(1 + ρ) / (2τ²(1 − ρ)(1 + 2ρ))] Σ_{i=1}^{n} [(x_i1 − µ1)² + (x_i2 − µ2)² + (x_i3 − µ3)²]
− [ρ / (τ²(1 − ρ)(1 + 2ρ))] Σ_{i=1}^{n} [(x_i1 − µ1)(x_i2 − µ2) + (x_i1 − µ1)(x_i3 − µ3) + (x_i2 − µ2)(x_i3 − µ3)]
+ [1 / (2τ²(1 − ρ²))] [ Σ_{i=1}^{n12} (x^(12)_i1 − µ1)² + Σ_{i=1}^{n12} (x^(12)_i2 − µ2)² + Σ_{i=1}^{n13} (x^(13)_i1 − µ1)² + Σ_{i=1}^{n13} (x^(13)_i3 − µ3)² + Σ_{i=1}^{n23} (x^(23)_i2 − µ2)² + Σ_{i=1}^{n23} (x^(23)_i3 − µ3)² ]
− [ρ / (τ²(1 − ρ²))] [ Σ_{i=1}^{n12} (x^(12)_i1 − µ1)(x^(12)_i2 − µ2) + Σ_{i=1}^{n13} (x^(13)_i1 − µ1)(x^(13)_i3 − µ3) + Σ_{i=1}^{n23} (x^(23)_i2 − µ2)(x^(23)_i3 − µ3) ]
+ [1/(2τ²)] [ Σ_{i=1}^{n1} (x^(1)_i − µ1)² + Σ_{i=1}^{n2} (x^(2)_i − µ2)² + Σ_{i=1}^{n3} (x^(3)_i − µ3)² ] = 0   (9)
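The combined log-likelihood above can be validated numerically by comparing it with the sum of exact normal log-densities over the seven groups (a sketch assuming NumPy; the simulated group sizes and parameter values are arbitrary):

```python
import numpy as np

def logpdf(x, mean, cov):
    """Exact (multivariate) normal log-density; also handles the scalar case."""
    x, mean = np.atleast_1d(x), np.atleast_1d(mean)
    cov = np.atleast_2d(cov)
    d = x - mean
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * (len(d)*np.log(2*np.pi) + logdet + d @ np.linalg.solve(cov, d))

rng = np.random.default_rng(1)
mu = np.array([0.5, -0.2, 1.0]); tau, rho = 1.5, 0.3
g1, g2, g3 = 1 + rho, 1 - rho, 1 + 2*rho
S3 = tau * ((1 - rho)*np.eye(3) + rho*np.ones((3, 3)))
S2 = tau * np.array([[1, rho], [rho, 1]])

X = rng.multivariate_normal(mu, S3, size=6)            # complete observations
X12 = rng.multivariate_normal(mu[[0, 1]], S2, size=4)  # (X1, X2) observed
X13 = rng.multivariate_normal(mu[[0, 2]], S2, size=3)
X23 = rng.multivariate_normal(mu[[1, 2]], S2, size=5)
x1 = rng.normal(mu[0], np.sqrt(tau), size=2)
x2 = rng.normal(mu[1], np.sqrt(tau), size=3)
x3 = rng.normal(mu[2], np.sqrt(tau), size=4)

# direct evaluation: sum of exact log-densities over all seven groups
direct = (sum(logpdf(x, mu, S3) for x in X)
          + sum(logpdf(x, mu[[0, 1]], S2) for x in X12)
          + sum(logpdf(x, mu[[0, 2]], S2) for x in X13)
          + sum(logpdf(x, mu[[1, 2]], S2) for x in X23)
          + sum(logpdf(x, mu[0], tau) for x in x1)
          + sum(logpdf(x, mu[1], tau) for x in x2)
          + sum(logpdf(x, mu[2], tau) for x in x3))

# closed form: constants, the ln(tau) term, and the quadratic terms
n, n12, n13, n23, n1, n2, n3 = 6, 4, 3, 5, 2, 3, 4
c0 = (-(3*n/2)*np.log(2*np.pi) - (n/2)*np.log(2*rho**3 - 3*rho**2 + 1)
      - (n12 + n13 + n23)*np.log(2*np.pi)
      - ((n12 + n13 + n23)/2)*np.log(1 - rho**2)
      - ((n1 + n2 + n3)/2)*np.log(2*np.pi))
D3 = X - mu
sq3 = np.sum(D3**2)
cr3 = np.sum(D3[:, 0]*D3[:, 1] + D3[:, 0]*D3[:, 2] + D3[:, 1]*D3[:, 2])
def biv(Z, a, b):
    dA, dB = Z[:, 0] - mu[a], Z[:, 1] - mu[b]
    return np.sum(dA**2) + np.sum(dB**2), np.sum(dA*dB)
sq12, cr12 = biv(X12, 0, 1); sq13, cr13 = biv(X13, 0, 2); sq23, cr23 = biv(X23, 1, 2)
sqU = np.sum((x1 - mu[0])**2) + np.sum((x2 - mu[1])**2) + np.sum((x3 - mu[2])**2)
closed = (c0 - 0.5*(3*n + 2*(n12 + n13 + n23) + n1 + n2 + n3)*np.log(tau)
          - g1/(2*tau*g2*g3)*sq3 + rho/(tau*g2*g3)*cr3
          - (sq12 + sq13 + sq23)/(2*tau*(1 - rho**2))
          + rho/(tau*(1 - rho**2))*(cr12 + cr13 + cr23)
          - sqU/(2*tau))
assert np.isclose(direct, closed)
```

The agreement of the two evaluations checks every coefficient of the closed form at once.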

n n ( + )( ) (x i µ ) + ( )[ n + ( )( ) n ( )[ (x () (x () n n 3 i µ ) + (x () n (x (3) i µ )] i µ ) + ( )[ (x i µ ) + (x i3 µ 3 )] n 3 i µ ) + (x (3) i3 µ 3 )] = () n n 3 ( + )( ) (x i3 µ 3 ) + ( )[ n 3 + ( )( ) n 3 ( )[ (x (3) (x (3) n n 3 i3 µ 3 ) + (x (3) n (x (3) i3 µ 3 )] i3 µ 3) + ( )[ (x i µ ) + (x i µ )] n 3 i µ ) + (x (3) They could be further expressed as follows: i µ )] = () [n ( + ) + (n + n 3 )( + ) + n ( + )( + )( )]µ [n ( + ) + n ( + )]µ [n 3 ( + ) + n ( + )]µ 3 n ( + ) n x i ( + )( x () i + n 3 x (3) i ) n n ( + )( + )( ) x () i + ( + ) (x i + x i3 ) n + ( + )( x () i + n 3 x (3) i3 ) = (3) [n ( + ) + n ( + )]µ + [n ( + ) + (n + n 3 )( + ) + n ( + )( + )( )]µ [n 3 ( + ) + n ( + )]µ 3 n n ( + ) x i ( + )( x () i + 3 n 3 x (3) i )

−[nρ(1 + ρ) + n13 ρ(1 + 2ρ)] µ1 − [nρ(1 + ρ) + n23 ρ(1 + 2ρ)] µ2 + [n(1 + ρ)² + (n13 + n23)(1 + 2ρ) + n3(1 + ρ)(1 + 2ρ)(1 − ρ)] µ3 − (1 + ρ)² Σ_{i=1}^{n} x_i3 − (1 + 2ρ)( Σ_{i=1}^{n13} x^(13)_i3 + Σ_{i=1}^{n23} x^(23)_i3 ) − (1 + ρ)(1 + 2ρ)(1 − ρ) Σ_{i=1}^{n3} x^(3)_i + ρ(1 + ρ) Σ_{i=1}^{n} (x_i1 + x_i2) + ρ(1 + 2ρ)( Σ_{i=1}^{n13} x^(13)_i1 + Σ_{i=1}^{n23} x^(23)_i2 ) = 0   (15)

For the sake of convenience, the above system of equations (13), (14), (15) can be written with a matrix and vectors:

Aµ = b   (16)

where

A = [ a11                  −ρ(n12 γ3 + n γ1)    −ρ(n13 γ3 + n γ1)
      −ρ(n12 γ3 + n γ1)   a22                   −ρ(n23 γ3 + n γ1)
      −ρ(n13 γ3 + n γ1)   −ρ(n23 γ3 + n γ1)    a33 ]   (17)

with

a11 = n γ1² + (n12 + n13) γ3 + n1 γ1 γ2 γ3,
a22 = n γ1² + (n12 + n23) γ3 + n2 γ1 γ2 γ3,
a33 = n γ1² + (n13 + n23) γ3 + n3 γ1 γ2 γ3,
γ1 = 1 + ρ,  γ2 = 1 − ρ,  γ3 = 1 + 2ρ
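The symmetry and positive definiteness of the coefficient matrix A above (proved below) can be spot-checked numerically across the admissible range of ρ (a sketch assuming NumPy; the group sizes are arbitrary):

```python
import numpy as np

def build_A(n, n12, n13, n23, n1, n2, n3, rho):
    """Coefficient matrix A of the linear system A mu = b."""
    g1, g2, g3 = 1 + rho, 1 - rho, 1 + 2*rho
    a11 = n*g1**2 + (n12 + n13)*g3 + n1*g1*g2*g3
    a22 = n*g1**2 + (n12 + n23)*g3 + n2*g1*g2*g3
    a33 = n*g1**2 + (n13 + n23)*g3 + n3*g1*g2*g3
    return np.array([
        [a11, -rho*(n12*g3 + n*g1), -rho*(n13*g3 + n*g1)],
        [-rho*(n12*g3 + n*g1), a22, -rho*(n23*g3 + n*g1)],
        [-rho*(n13*g3 + n*g1), -rho*(n23*g3 + n*g1), a33]])

for rho in (-0.45, -0.1, 0.0, 0.3, 0.9):
    A = build_A(20, 5, 4, 3, 2, 6, 1, rho)
    np.linalg.cholesky(A)   # raises LinAlgError unless A is positive definite
```

A successful Cholesky factorization at each ρ confirms positive definiteness for these group sizes; the proof below establishes it in general for ρ ∈ (−1/2, 1).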

µ = (µ1, µ2, µ3)′,  b = (b1, b2, b3)′   (18)

b1 = γ1² n x̄.1 + γ3 (n12 x̄^(12).1 + n13 x̄^(13).1) + γ1 γ2 γ3 n1 x̄^(1).1 − ρ γ1 n (x̄.2 + x̄.3) − ρ γ3 (n12 x̄^(12).2 + n13 x̄^(13).3)

b2 = γ1² n x̄.2 + γ3 (n12 x̄^(12).2 + n23 x̄^(23).2) + γ1 γ2 γ3 n2 x̄^(2).2 − ρ γ1 n (x̄.1 + x̄.3) − ρ γ3 (n12 x̄^(12).1 + n23 x̄^(23).3)

b3 = γ1² n x̄.3 + γ3 (n13 x̄^(13).3 + n23 x̄^(23).3) + γ1 γ2 γ3 n3 x̄^(3).3 − ρ γ1 n (x̄.1 + x̄.2) − ρ γ3 (n13 x̄^(13).1 + n23 x̄^(23).2)

where x̄.j = Σ_{i=1}^{n} x_ij / n, x̄^(12).j = Σ_{i=1}^{n12} x^(12)_ij / n12, x̄^(13).j = Σ_{i=1}^{n13} x^(13)_ij / n13, x̄^(23).j = Σ_{i=1}^{n23} x^(23)_ij / n23 for j = 1, 2, 3, and x̄^(1).1 = Σ_{i=1}^{n1} x^(1)_i / n1, x̄^(2).2 = Σ_{i=1}^{n2} x^(2)_i / n2, x̄^(3).3 = Σ_{i=1}^{n3} x^(3)_i / n3.

If A is a positive definite matrix, then equation (16) can be solved, which allows us to obtain the matrix expression of the Maximum Likelihood Estimate of µ. To prove the positive definiteness of A, matrix A is first written as the sum of two matrices B1 and B2. The next step is to show that B1 is positive semi-definite and B2 is positive definite, where

B1 = γ3 [ n12 + n13   −ρ n12       −ρ n13
          −ρ n12       n12 + n23   −ρ n23
          −ρ n13       −ρ n23       n13 + n23 ]   (19)

B2 = γ1 [ n γ1 + n1 γ2 γ3   −ρ n                −ρ n
          −ρ n               n γ1 + n2 γ2 γ3    −ρ n
          −ρ n               −ρ n                n γ1 + n3 γ2 γ3 ]   (20)

It can be proved that matrix B1 is positive semi-definite by rewriting B1 as the sum of C1, C2 and C3, where

C1 = γ3 [ n12      −ρ n12   0
          −ρ n12   n12      0
          0        0        0 ]   (21)

C2 = γ3 [ n13      0   −ρ n13
          0        0   0
          −ρ n13   0   n13 ]   (22)

C3 = γ3 [ 0   0        0
          0   n23      −ρ n23
          0   −ρ n23   n23 ]   (23)

Proof. A matrix is positive semi-definite if and only if every principal minor of the matrix is nonnegative. Since n12 ≥ 0, ρ ∈ (−1/2, 1) and γ3 = 1 + 2ρ > 0, the first-order principal minors of matrix C1 are nonnegative; the second-order principal minors of matrix C1 are γ3² n12² (1 − ρ²) ≥ 0 and 0; the third-order principal minor of matrix C1 is 0. Therefore, C1 is a positive semi-definite matrix. Similarly, since n13 ≥ 0 and n23 ≥ 0, all the principal minors of C2 and C3 are nonnegative, so that C2 and C3 are also positive semi-definite matrices. Based on the results above, matrix B1 = C1 + C2 + C3 is a positive semi-definite matrix.

Applying a similar argument to matrix B2, we can write it as the sum of C4 and C5 given below. By showing that C4 is positive definite and C5 is positive semi-definite, B2 can be proved to be a positive definite matrix.

C4 = γ1 [ n γ1   −ρ n    −ρ n
          −ρ n    n γ1   −ρ n
          −ρ n    −ρ n    n γ1 ]   (24)

C5 = γ1 [ n1 γ2 γ3   0           0
          0           n2 γ2 γ3   0
          0           0           n3 γ2 γ3 ]   (25)

Proof. Since n > 0, ρ ∈ (−1/2, 1), γ1 = 1 + ρ > 0, γ2 = 1 − ρ > 0 and γ3 = 1 + 2ρ > 0, the first-order principal minors of matrix C4 are n γ1² > 0; the second-order principal minors are n² γ1² (γ1² − ρ²) > 0; the third-order principal minor is n³ (1 + ρ)³ (1 − ρ)(1 + 2ρ)² > 0. Therefore, C4 is a positive definite matrix. Since n1 ≥ 0, n2 ≥ 0 and n3 ≥ 0, the first-order principal minors of matrix C5 are n1 γ1 γ2 γ3, n2 γ1 γ2 γ3 and n3 γ1 γ2 γ3, all nonnegative. Similarly, it can be checked that all the second- and third-order principal minors are nonnegative. Therefore C5 is a positive semi-definite matrix. As a result, matrix B2 = C4 + C5 is a positive definite matrix.

B1 is shown to be positive semi-definite and B2 is positive definite, therefore A = B1 + B2 is a positive definite matrix. Note that since A is positive definite, it is a nonsingular matrix and its inverse exists. It can be derived from expression (16) that

ˆµ = A⁻¹ b,  where ˆµ = (ˆµ1, ˆµ2, ˆµ3)′   (26)

Moreover, the Maximum Likelihood Estimator of τ can be obtained by plugging ˆµ into equation (9):
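Using the notation above, ˆµ = A⁻¹b can be computed directly by assembling A and b from the seven data groups and solving the linear system (a sketch assuming NumPy; group sizes and parameter values are arbitrary, and any group may be empty). As a sanity check, with only complete observations the estimator reduces exactly to the sample mean vector:

```python
import numpy as np

def mu_hat(rho, X=None, X12=None, X13=None, X23=None, x1=None, x2=None, x3=None):
    """Solve A mu = b for the permutation-symmetric model with known rho."""
    g1, g2, g3 = 1 + rho, 1 - rho, 1 + 2*rho
    X = np.empty((0, 3)) if X is None else np.asarray(X)
    X12 = np.empty((0, 2)) if X12 is None else np.asarray(X12)
    X13 = np.empty((0, 2)) if X13 is None else np.asarray(X13)
    X23 = np.empty((0, 2)) if X23 is None else np.asarray(X23)
    x1 = np.empty(0) if x1 is None else np.asarray(x1)
    x2 = np.empty(0) if x2 is None else np.asarray(x2)
    x3 = np.empty(0) if x3 is None else np.asarray(x3)
    n, n12, n13, n23 = len(X), len(X12), len(X13), len(X23)
    n1, n2, n3 = len(x1), len(x2), len(x3)
    a11 = n*g1**2 + (n12 + n13)*g3 + n1*g1*g2*g3
    a22 = n*g1**2 + (n12 + n23)*g3 + n2*g1*g2*g3
    a33 = n*g1**2 + (n13 + n23)*g3 + n3*g1*g2*g3
    A = np.array([
        [a11, -rho*(n12*g3 + n*g1), -rho*(n13*g3 + n*g1)],
        [-rho*(n12*g3 + n*g1), a22, -rho*(n23*g3 + n*g1)],
        [-rho*(n13*g3 + n*g1), -rho*(n23*g3 + n*g1), a33]])
    s = X.sum(axis=0)   # column sums; sums over empty groups are 0
    b1 = (g1**2*s[0] + g3*(X12[:, 0].sum() + X13[:, 0].sum()) + g1*g2*g3*x1.sum()
          - rho*g1*(s[1] + s[2]) - rho*g3*(X12[:, 1].sum() + X13[:, 1].sum()))
    b2 = (g1**2*s[1] + g3*(X12[:, 1].sum() + X23[:, 0].sum()) + g1*g2*g3*x2.sum()
          - rho*g1*(s[0] + s[2]) - rho*g3*(X12[:, 0].sum() + X23[:, 1].sum()))
    b3 = (g1**2*s[2] + g3*(X13[:, 1].sum() + X23[:, 1].sum()) + g1*g2*g3*x3.sum()
          - rho*g1*(s[0] + s[1]) - rho*g3*(X13[:, 0].sum() + X23[:, 0].sum()))
    return np.linalg.solve(A, np.array([b1, b2, b3]))

rng = np.random.default_rng(2)
rho, tau = 0.3, 1.0
S3 = tau*((1 - rho)*np.eye(3) + rho*np.ones((3, 3)))
X = rng.multivariate_normal([0.0, 1.0, -1.0], S3, size=50)
assert np.allclose(mu_hat(rho, X=X), X.mean(axis=0))   # complete data only
```

The bivariate arrays hold the observed pairs in the order of their superscripts, e.g. columns (X1, X2) for the (12) group.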

The estimator τ̂ in (7) combines the squared deviations (x_ij − μ̂_j)² and the pairwise cross-product deviations (x_ij − μ̂_j)(x_ik − μ̂_k) of the complete observations with the corresponding squared and cross-product deviations of each type of incomplete observation, each group of terms weighted by a function of ρ, and divides the total by a constant counting the observed data. Denoting this constant by c = 3n + n₁₂ + n₁₃ + n₂₃ + n₁ + n₂ + n₃ and writing γ₁, γ₂ and γ₃ for the correlation-dependent weights, (7) can be further simplified into the compact form given in (8).

IV. Properties of MLEs

1. Mean of μ̂

The estimator of µ in our model with missing data is unbiased. This property can be proved easily as follows.

Proof. The sample mean of each coordinate in every observation pattern is unbiased for the corresponding component of µ; that is, E(x̄.j) = µ_j for the complete-data means and likewise for the means of every incomplete group, j = 1, 2, 3. Taking expectations componentwise in b and collecting the coefficients of µ₁, µ₂ and µ₃ therefore gives E(b) = Aµ, since the coefficient of µ_j in E(b_j) is exactly the (j, j) entry of A and the coefficient of µ_k (k ≠ j) is the (j, k) entry. Consequently,

E(μ̂) = E(A⁻¹ b) = A⁻¹ E(b) = A⁻¹ A µ = µ.
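The chain E(μ̂) = A⁻¹E(b) = A⁻¹Aµ = µ can be illustrated by Monte Carlo. The A below and the distribution of the group means are toy assumptions, not the thesis's quantities:

```python
import numpy as np

rng = np.random.default_rng(0)
mu = np.array([1.0, 2.0, 3.0])
A = np.array([[4.0, 1.0, 0.5],
              [1.0, 3.0, 1.0],
              [0.5, 1.0, 5.0]])       # toy invertible coefficient matrix

reps = 5000
total = np.zeros(3)
for _ in range(reps):
    xbar = mu + rng.normal(scale=0.5, size=3)  # group means with E = mu
    b = A @ xbar                                # hence E(b) = A mu
    total += np.linalg.solve(A, b)              # mu_hat = A^{-1} b
bias = total / reps - mu
assert np.all(np.abs(bias) < 0.05)              # bias vanishes up to MC error
```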

2. Variance of μ̂

With matrix notation, the variance of μ̂ can be written as

Var(μ̂) = Var(A⁻¹ b) = A⁻¹ Var(b) (A⁻¹)′, (9)

where b is the same vector as in (8). For the sake of convenience, b can be expressed as b = D X̄, where D is the 3 × 12 matrix of coefficients (products of the group sample sizes with the weights γ₁, γ₂ and γ₃) and X̄ is the 12 × 1 vector collecting the three complete-data sample means x̄.1, x̄.2, x̄.3, the two observed-coordinate means from each bivariate incomplete group, and the single observed-coordinate mean from each univariate incomplete group. Since

Var(b) = Var(D X̄) = D Var(X̄) D′,

combining this with the expression above gives

Var(μ̂) = (A⁻¹ D) Var(X̄) (A⁻¹ D)′,

where Var(X̄) is block diagonal: sample means from different observation patterns are independent, and each diagonal block is the covariance matrix of the observed coordinates in that pattern divided by the corresponding group size n, n₁₂, n₁₃, n₂₃, n₁, n₂ or n₃.

3. Variance of μ̂₁ − μ̂₂

The variance of μ̂ is the 3 × 3 matrix

Var(μ̂) = | Var(μ̂₁)       Cov(μ̂₁, μ̂₂)   Cov(μ̂₁, μ̂₃) |
          | Cov(μ̂₁, μ̂₂)   Var(μ̂₂)       Cov(μ̂₂, μ̂₃) |
          | Cov(μ̂₁, μ̂₃)   Cov(μ̂₂, μ̂₃)   Var(μ̂₃)     |.

Consequently, the variance of μ̂₁ − μ̂₂ can be obtained as

Var(μ̂₁ − μ̂₂) = Var(μ̂₁) + Var(μ̂₂) − 2 Cov(μ̂₁, μ̂₂).

V. Sampling distributions

1. The distribution of μ̂ − μ

Since the vector X = (X₁, X₂, X₃)′ has a joint normal distribution, μ̂ = A⁻¹ b is a linear function of b, which is in turn a linear combination of the components of X̄. As a

result, μ̂ − μ follows the trivariate normal distribution N₃(E(μ̂ − μ), Var(μ̂ − μ)), where E(μ̂ − μ) and Var(μ̂ − μ) can be calculated simply as

E(μ̂ − μ) = E(μ̂) − E(μ) = μ − μ = 0,

Var(μ̂ − μ) = Var(μ̂) = (A⁻¹ D) Var(X̄) (A⁻¹ D)′.

2. The distribution of μ̂₁ − μ̂₂

Since μ̂ = (μ̂₁, μ̂₂, μ̂₃)′ has a joint normal distribution, μ̂₁ − μ̂₂ follows a normal distribution with mean

E(μ̂₁ − μ̂₂) = E(μ̂₁) − E(μ̂₂) = μ₁ − μ₂

and variance

Var(μ̂₁ − μ̂₂) = Var(μ̂₁) + Var(μ̂₂) − 2 Cov(μ̂₁, μ̂₂).

VI. Simulation Study with Known ρ

In this section, a simulation study is conducted to compare the performance of the estimates in case 1, where only the three-dimensional (complete) observations are used, and case 2, where both the three-dimensional observations and all possible incomplete observations are used. The efficiency of the estimators is evaluated by means of the coverage probabilities and average widths of the confidence intervals, and the type I error and power of testing H₀: µ = 0 vs. Hₐ: µ ≠ 0. The sample sizes n, n₁₂ = n₁₃ = n₂₃, and n₁ = n₂ = n₃ are held fixed, and several levels of the standard deviation σ, ranging from .5 to 4, are considered. It is assumed that ρ is known in this section, so the true value of ρ is used, ranging from .1 to .9. The population mean is chosen to be µ = (,, 3) when calculating the power of the tests and the coverage probabilities and average widths of the confidence intervals. Each parameter setting is replicated r times.
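The two variance formulas from the preceding sections — the sandwich form Var(μ̂) = (A⁻¹D)Var(X̄)(A⁻¹D)′ and the contrast form Var(μ̂₁ − μ̂₂) = Var(μ̂₁) + Var(μ̂₂) − 2Cov(μ̂₁, μ̂₂) — can be sketched as follows; A, D and Var(X̄) are illustrative stand-ins (the thesis's D is 3 × 12):

```python
import numpy as np

def var_mu_hat(A, D, var_xbar):
    # Var(mu_hat) for mu_hat = A^{-1} D Xbar: (A^{-1}D) Var(Xbar) (A^{-1}D)'
    AinvD = np.linalg.solve(A, D)
    return AinvD @ var_xbar @ AinvD.T

rng = np.random.default_rng(1)
A = np.diag([3.0, 4.0, 5.0])          # toy invertible 3x3 stand-in
D = rng.normal(size=(3, 5))           # toy coefficient matrix
var_xbar = np.diag(1.0 / np.array([10., 12., 8., 15., 9.]))  # toy Var of group means

V = var_mu_hat(A, D, var_xbar)        # 3x3 covariance of mu_hat
assert np.allclose(V, V.T)            # symmetric ...
assert np.all(np.linalg.eigvalsh(V) >= -1e-12)   # ... and PSD

# Variance of the contrast mu1_hat - mu2_hat, with c = (1, -1, 0):
c = np.array([1.0, -1.0, 0.0])
var_diff = c @ V @ c
assert np.isclose(var_diff, V[0, 0] + V[1, 1] - 2 * V[0, 1])
```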

1. Compare the confidence regions of µ

Table 1: Coverage Probabilities of 95% confidence regions of µ = (,, 3)

Case 1 (complete observations only):
.5.5 4..958.95.9447.95.949..957.949.9488.9495.9497.3.953.956.957.955.9486.4.958.9488.9495.957.9463.5.95.949.9498.956.9469.6.9495.9497.959.95.9447.7.955.9486.957.949.9488.8.957.9463.953.956.957.9.956.9469.958.9488.9495

Case 2 (complete and incomplete observations):
.5.5 4..95.95.9496.953.945..9468.953.95.9499.9469.3.955.955.95.95.9478.4.95.955.9539.9495.9495.5.953.9439.9496.957.956.6.953.946.955.95.959.7.95.9489.953.9494.956.8.956.9495.9538.9535.957.9.954.953.954.954.95
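Coverage probabilities of the kind reported in Table 1 can be reproduced in outline. A hedged sketch of the complete-data arm (case 1) only, assuming the equicorrelated covariance σ²[(1 − ρ)I + ρJ] with known σ and an illustrative parameter setting:

```python
import numpy as np

rng = np.random.default_rng(2)
mu = np.array([1.0, 2.0, 3.0])
sigma, rho, n, reps = 1.0, 0.5, 100, 4000     # illustrative values
Sigma = sigma**2 * ((1 - rho) * np.eye(3) + rho * np.ones((3, 3)))
Sigma_inv = np.linalg.inv(Sigma)
crit = 7.814727903251179                      # 95% quantile of chi-square(3)

x = rng.multivariate_normal(mu, Sigma, size=(reps, n))   # reps data sets
xbar = x.mean(axis=1)                                    # complete-data means
d = xbar - mu
# n (xbar - mu)' Sigma^{-1} (xbar - mu) is exactly chi-square(3):
stat = n * np.einsum('ij,jk,ik->i', d, Sigma_inv, d)
coverage = np.mean(stat <= crit)              # should be close to .95
assert abs(coverage - 0.95) < 0.02
```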

[Figure 1: Comparison of Coverage Probabilities of Confidence Region of µ; one panel per level of σ, with a reference line at the confidence level .95]

[Figure 2: Absolute Value of the Difference Between the Coverage Probabilities and .95]

It is evident from Figure 1 that, in both case 1 and case 2, the coverage probabilities of the 95% confidence regions are close to .95. The absolute values of the differences between the coverage probabilities and the confidence level .95, as shown in Figure 2, are also small in both cases.

2. Compare the probability of type I error and the power of testing H₀: µ = 0 vs. Hₐ: µ ≠ 0

2.1 Compare the probability of type I error

Table 2: The Probability of Type I Error

Case 1 (complete observations only):
.5.5 4..54.449.484.49.46..5.58.54.54.479.3.494.497.55.54.458.4.47.5.5.5.534.5.47.484.497.55.46.6.59.59.55.487.484.7.47.464.487.48.47.8.54.487.488.59.497.9.55.47.489.53.533

Case 2 (complete and incomplete observations):
.5.5 4..5.53.53.55.53..489.54.537.54.5.3.53.548.5.477.48.4.53.53.479.56.54.5.485.5.54.484.47.6.483.497.463.58.449.7.439.55.55.49.48.8.49.54.498.47.57.9.5.59.478.479.55

[Figure 3: Comparison of Type I Error; one panel per level of σ, with a reference line at α = .05]

[Figure 4: Absolute Value of the Difference Between the Type I Error and α = .05]

In the case of α = .05, the type I errors in both case 1 and case 2 are close to .05 at each level of σ. It can be noted more clearly from Figure 4 that the absolute values of the differences between the type I errors and α = .05 are small in both cases.

2.2 Compare the power of testing H₀: µ = 0 vs. Hₐ: µ ≠ 0

Table 3: The Power of testing H₀: µ = (,, ) vs. Hₐ: µ = (,, 3)

Case 1 (complete observations only):
6 8 4..364.68.473.94.9..3334.3.44.65.9.3.334.837.49.76.95.4.99.86.3.48.89.5.38.85.89.9.874.6.3.863.363.39.936.7.3578.96.48.4.93.8.4394.56.74.335.45.9.6456.3976.68.958.577

Case 2 (complete and incomplete observations):
6 8 4..9888.8653.675.496.374..9835.8483.6398.4685.366.3.988.8377.686.464.353.4.979.838.65.449.3434.5.9758.865.65.454.3375.6.979.837.66.459.35.7.986.85.6535.47.3559.8.994.898.6977.5347.48.9.9994.9696.8453.69.5399
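The power entries can likewise be approximated by simulating under the alternative and applying the known-σ Wald test, whose statistic is chi-square with 3 degrees of freedom under H₀. Again a sketch of the complete-data arm only, with assumed parameter values (the alternative's first two components are assumptions; only the third, 3, survives in the table caption):

```python
import numpy as np

rng = np.random.default_rng(3)
mu_a = np.array([1.0, 2.0, 3.0])     # assumed alternative mean
sigma, rho, n, reps = 8.0, 0.5, 100, 2000
Sigma = sigma**2 * ((1 - rho) * np.eye(3) + rho * np.ones((3, 3)))
Sigma_inv = np.linalg.inv(Sigma)
crit = 7.814727903251179             # 95% quantile of chi-square(3)

x = rng.multivariate_normal(mu_a, Sigma, size=(reps, n))
xbar = x.mean(axis=1)                # complete-data mean vectors
# Wald statistic n * xbar' Sigma^{-1} xbar; chi-square(3) under H0
stat = n * np.einsum('ij,jk,ik->i', xbar, Sigma_inv, xbar)
power = np.mean(stat > crit)         # rejection rate under the alternative
assert 0.0 < power <= 1.0
```

Larger σ lowers the noncentrality and hence the power, which is why the power tables use larger σ levels than the coverage tables.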

[Figure 5: Compare the Power of Testing; one panel per level of σ]

In order to compare the power of the tests, the levels of the standard deviation have been chosen larger here (σ = 6, 8 and above) to make the results more observable. It can be observed from Figure 5 that, at each level of σ, the power of the test is much higher in case 2 than in case 1.

3. Compare the confidence intervals of µ₁

3.1 Compare the coverage probabilities of confidence intervals of µ₁

It can be noted from Figure 6 and Figure 7 that, in both case 1 and case 2, the coverage probabilities of the 95% confidence intervals of µ₁ are close to .95. The absolute values of the differences between the coverage probabilities and the confidence level .95 are small in both cases.

Table 4: The Coverage probabilities of confidence intervals of µ =.5.5 4..9538.9489.95.9484.95..953.949.949.95.95.3.947.947.956.954.955.4.9538.953.9497.9534.949.5.9484.95.9535.949.9488.6.95.95.9533.9489.95.7.954.955.953.949.949.8.9534.949.947.947.956.9.949.9488.9538.953.9497.5.5 4..9484.949.954.949.95..959.9486.9484.9483.9475.3.959.953.954.95.9466.4.949.959.9486.948.949.5.9479.949.948.9535.9484.6.955.9473.9485.955.957.7.958.9475.955.949.9495.8.9494.9468.954.9495.953.9.95.9498.953.9478.9499 9

[Figure 6: Compare the Coverage Probabilities of the confidence intervals of µ₁; reference line at confidence level .95]

[Figure 7: Absolute Value of the Difference Between the Coverage Probabilities and .95]

3. Compare the average width of confidence intervals of µ Table 5: The average width of confidence intervals of µ =.5.5 4..549.399.698.396.479..549.399.698.396.479.3.549.399.698.396.479.4.549.399.698.396.479.5.549.399.698.396.479.6.549.399.698.396.479.7.549.399.698.396.479.8.549.399.698.396.479.9.549.399.698.396.479.5.5 4..63.6.54.549.97..67.54.57.54.8.3.6.39.479.4957.994.4.6.9.438.4877.9754.5.596.93.386.477.954.6.58.59.38.4637.974.7.559.7.34.4468.8937.8.53.65.9.458.856.9.499.998.997.3994.7987 3

[Figure 8: Compare the Average Widths of the Confidence Intervals of µ₁; one panel per level of σ]

As presented in Figure 8, at each level of σ, the average width of the confidence intervals is smaller in case 2 than in case 1.

3.3 Compare the average variance of ˆµ₁

As presented in Figure 9, at each level of σ, the average variance of ˆµ₁ is smaller in case 2 than in case 1. This result explains the finding in section 3.2: the average width of the confidence intervals is smaller in case 2 than in case 1.

Table 6: The average variance of ˆµ.5.5 4..63.5..4.6..63.5..4.6.3.63.5..4.6.4.63.5..4.6.5.63.5..4.6.6.63.5..4.6.7.63.5..4.6.8.63.5..4.6.9.63.5..4.6.5.5 4...4.66.664.654...4.64.654.68.3 9.99e-4.4.6.64.559.4 9.67e-4.39.55.69.476.5 9.6e-4.37.48.593.37.6 8.75e-4.35.4.56.39.7 8.e-4.3.3.5.79.8 7.37e-4.9.8.47.888.9 6.49e-4.6.4.45.66 33

[Figure 9: Compare the average variance of ˆµ₁]

4. Compare the confidence intervals of µ₁ − µ₂

4.1 The coverage probabilities of confidence intervals of µ₁ − µ₂

It can be observed from Figure 10 and Figure 11 that, in both case 1 and case 2, the coverage probabilities of the 95% confidence intervals of µ₁ − µ₂ are close to .95. The absolute values of the differences between the coverage probabilities and the confidence level .95 are small in both cases.

Table 7: The coverage probability of confidence intervals of µ µ (µ =, µ = ).5.5 4..954.9498.947.95.959..9494.958.9469.9486.9459.3.949.95.954.948.9486.4.953.9458.9535.9497.9499.5.959.95.954.953.9444.6.955.9479.9494.957.9469.7.95.949.9496.9497.9495.8.956.947.9534.9494.953.9.9496.9455.954.9443.95.5.5 4..9497.9489.953.956.9495..95.959.954.9496.9483.3.959.958.958.945.95.4.953.9465.959.956.9479.5.959.949.9476.9499.947.6.95.945.948.9494.958.7.946.9493.949.956.953.8.957.949.95.9497.955.9.9497.955.954.946.954 35

[Figure 10: Compare the Coverage Probabilities of the confidence intervals of µ₁ − µ₂; reference line at confidence level .95]

[Figure 11: Absolute value of the difference between Coverage Probabilities and .95]

4. Average widths of confidence intervals of µ µ (µ =, µ = ) Table 8: Average widths of confidence intervals of µ µ (µ =, µ = ).5.5 4..79.458.835.663 3.36..96.39.784.568 3.359.3.833.3667.7334.4667.9334.4.697.3395.679.3579.758.5.549.399.698.396.479.6.386.77.5544.87.74.7..4.48.96.94.8.98.96.39.784.568.9.693.386.77.5544.87.5.5 4..874.747.3494.6988.3976..847.695.3389.6779.3558.3.86.63.36.654.349.4.777.554.38.66.433.5.73.46.9.5843.687.6.673.347.693.5386.77.7.6.3.46.48.963.8.57.4.9.458.85.9.37.74.484.968.5935 37

[Figure 12: Compare Average Widths of Confidence Intervals of µ₁ − µ₂]

Figure 12 shows that the average width of the confidence intervals of µ₁ − µ₂ decreases as ρ increases. The average widths in case 2 are always smaller than those in case 1 at each level of σ.

4.3 Compare the average variance of ˆµ₁ − ˆµ₂

As presented in Figure 13, at each level of σ, the average variance of ˆµ₁ − ˆµ₂ is smaller in case 2 than in case 1. This result explains the finding in section 4.2: the average width of the confidence intervals is smaller in case 2 than in case 1.

Table 9: The average variance of ˆµ ˆµ.5.5 4..3.45.8.7.88...4.6.64.56.3.88.35.4.56.4.4.75.3..48.9.5.63.5..4.6.6.5..8.3.8.7.37.5.6.4.96.8.5..4.6.64.9.3.5..8.3.5.5 4...79.38.7.585..9.75.99.96.4785.3.7.69.77.8.443.4.6.63.5.6.44.5.4.56..889.3556.6..47.89.755.3.7 9.4e-4.38.5.63.4.8 6.7e-4.7.7.49.74.9 3.58e-4.4.57.9.97 39

[Figure 13: Compare the average variance of ˆµ₁ − ˆµ₂]

VII. Simulation Study with Unknown ρ

It is assumed that ρ is unknown in this section, so an estimate of ρ is used. The formula for estimating ρ, given by Orjuela (3) using the MLE method with three-dimensional complete data, is

ρ̂ = [ Σᵢ (x_i1 − x̄.1)(x_i2 − x̄.2) + (x_i1 − x̄.1)(x_i3 − x̄.3) + (x_i2 − x̄.2)(x_i3 − x̄.3) ] / [ Σᵢ (x_i1 − x̄.1)² + (x_i2 − x̄.2)² + (x_i3 − x̄.3)² ],

where the sums run over the n complete observations (the 1/n factors in the numerator and denominator cancel). Everything else is the same as in Section VI; specifically, the simulation study in this part compares the performance of the estimates in case 1, using only the three-dimensional observations, and case 2, using both the three-dimensional observations and all possible incomplete observations.
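A direct transcription of this pooled estimator of ρ (assuming the equicorrelated model with a common variance, as the formula presumes):

```python
import numpy as np

def rho_hat(x):
    """Pooled cross-product / sum-of-squares ratio from complete
    trivariate data, as in the complete-data formula above."""
    d = x - x.mean(axis=0)            # deviations x_ij - xbar_.j
    cross = (d[:, 0] * d[:, 1] + d[:, 0] * d[:, 2] + d[:, 1] * d[:, 2]).sum()
    return cross / (d ** 2).sum()

# Sanity check on simulated equicorrelated data (illustrative rho):
rng = np.random.default_rng(4)
rho = 0.6
Sigma = (1 - rho) * np.eye(3) + rho * np.ones((3, 3))
x = rng.multivariate_normal(np.zeros(3), Sigma, size=5000)
r = rho_hat(x)
assert abs(r - rho) < 0.05            # consistent up to sampling error
```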

. Compare the confidence regions of µ Table : Coverage Probabilities of confidence regions of µ = (,, 3).5.5 4..986.973.99.93.96..996.973.97.93.939.3.93.966.9336.938.955.4.933.957.935.93.967.5.996.969.994.988.939.6.95.97.983.93.957.7.935.963.958.98.956.8.977.9.987.99.989.9.94.999.959.995.955.5.5 4..937.937.9377.9378.938..934.9364.939.9339.933.3.937.9398.936.936.93.4.9374.938.9387.934.989.5.93.987.978.936.939.6.988.95.933.935.937.7.996.97.954.967.934.8.964.994.99.933.975.9.946.97.989.959.963 4

[Figure 14: Compare the Coverage Probabilities of the Confidence Regions of µ; reference line at confidence level .95]

[Figure 15: Absolute value of the difference between Coverage Probabilities and .95]

Using the estimate of ρ makes the coverage probabilities a little lower than the confidence level .95 for both case 1 and case 2. It can be noted from Figure 15 that the absolute values of the difference between the coverage probabilities and .95 are small for each level of σ.

. Compare the probability of type I error and power of testing H : µ = vs.h a : µ. Compare the probability of type I error Table : The Probability of Type I error.5.5 4..68.67.648.69.75..665.667.68.743.695.3.647.66.693.74.69.4.666.67.68.7.75.5.647.679.658.747.678.6.7.7.687.76.7.7.683.654.73.746.7.8.733.697.78.743.768.9.76.698.775.784.8.5.5 4..67.644.69.657.646..58.67.656.676.639.3.68.68.6.653.636.4.64.637.6.7.688.5.648.637.678.659.646.6.65.667.657.73.66.7.68.79.688.73.7.8.679.7.774.7.759.9.76.73.753.736.73 43

[Figure 16: Compare the Type I Error; reference line at α = .05]

[Figure 17: Absolute value of difference between Type I Error and α = .05]

When using the estimate of ρ, the type I errors in both case 1 and case 2 are a little higher than α = .05 at each level of σ. It can be observed more clearly from Figure 17 that the absolute values of the difference between the type I errors and α = .05 are small in both cases.

. Compare the power of testing H : µ = vs.h a : µ Table : The Power of testing H : µ = vs.h a : µ = (,, 3) 6 8 4..399.495.766.37.74..368.37.69.36.5.3.338.4.57.37.55.4.34.6.584.69.6.5.34.6.5.47.69.6.338.4.67.5.57.7.3665.3.676.357.7.8.436.778.964.579.56.9.6.3983.768.96.84 6 8 4..9884.875.6868.539.395..9858.8588.6586.4867.387.3.986.849.648.4854.374.4.984.848.633.469.3638.5.9773.8363.63.4748.3558.6.9796.8387.6373.473.378.7.984.8536.66.4884.3776.8.99.897.6945.543.448.9.9985.9568.84.676.5358 45

[Figure 18: Compare the Power of Testing H₀: µ = 0 vs. Hₐ: µ = (,, 3)]

With regard to the power of the tests, it can be observed that, when the level of σ is fixed, case 2 has higher power than case 1.

3. Compare the confidence intervals of µ₁

3.1 Compare the coverage probabilities of confidence intervals of µ₁

The coverage probabilities of the 95% confidence intervals are close to .95 for all cases, and the differences between the coverage probabilities and the confidence level .95 are small.

Table 3: The Coverage probabilities of confidence intervals of µ =.5.5 4..9538.9489.95.9484.95..953.949.949.95.95.3.947.947.956.954.955.4.9538.953.9497.9534.949.5.9484.95.9535.949.9488.6.95.95.9533.9489.95.7.954.955.953.949.949.8.9534.949.947.947.956.9.949.9488.9538.953.9497.5.5 4..944.9443.9485.944.9446..9453.943.948.9439.944.3.9476.946.95.9485.945.4.9443.9456.9436.9446.943.5.944.945.9446.947.9445.6.9463.946.9459.949.948.7.9488.947.957.945.9478.8.948.9468.955.9479.9539.9.956.95.959.9497.953 47

[Figure 19: Compare the Coverage Probabilities of the Confidence Intervals of µ₁; reference line at confidence level .95]

[Figure 20: Absolute value of the difference between Coverage Probabilities and .95]

3. Compare the average width of confidence intervals of µ Table 4: The average width of confidence intervals of µ =.5.5 4..549.399.698.396.479..549.399.698.396.479.3.549.399.698.396.479.4.549.399.698.396.479.5.549.399.698.396.479.6.549.399.698.396.479.7.549.399.698.396.479.8.549.399.698.396.479.9.549.399.698.396.479.5.5 4..65.5.5.5.3..6.43.485.497.994.3.65.3.46.499.9835.4.66.3.46.485.97.5.595.89.378.4757.956.6.58.6.3.4644.979.7.56.3.45.4493.899.8.537.74.49.497.86.9.55.9..445.884 49

[Figure 21: Compare the Average Widths of the Confidence Intervals of µ₁]

With regard to the average width of the confidence intervals of µ₁, the confidence intervals in case 2 are always narrower than those in case 1.

3.3 Compare the average variance of ˆµ₁

As presented in Figure 22, at each level of σ, the average variance of ˆµ₁ is smaller in case 2 than in case 1. This result explains the finding in section 3.2: the average width of the confidence intervals is smaller in case 2 than in case 1.

Table 5: The average variance of ˆµ.5.5 4..63.5..4.6..63.5..4.6.3.63.5..4.6.4.63.5..4.6.5.63.5..4.6.6.63.5..4.6.7.63.5..4.6.8.63.5..4.6.9.63.5..4.6.5.5 4...4.63.65.66...4.6.644.573.3 9.86e-4.39.58.63.5.4 9.57e-4.38.53.63.45.5 9.3e-4.37.48.59.36.6 8.78e-4.35.4.56.48.7 8.3e-4.33.3.58.7.8 7.53e-4.3..483.98.9 6.64e-4.7.6.45.74 5

[Figure 22: Compare the average variance of ˆµ₁]

4. Compare the confidence intervals of µ₁ − µ₂

4.1 The coverage probabilities of confidence intervals of µ₁ − µ₂

Using the estimate of ρ makes the coverage probabilities a little lower than the confidence level .95. It can be noted from Figure 24 that the absolute values of the difference are small for each level of σ.

Table 6: The coverage probabilities of confidence intervals of µ µ (µ =, µ = ).5.5 4..9464.9433.944.9444.944..946.949.9378.946.9385.3.944.9398.943.9388.9389.4.944.9364.948.946.9384.5.944.9394.9433.945.9358.6.9394.9357.9374.9358.9345.7.934.933.9363.9333.9344.8.9346.9346.938.9344.936.9.9359.93.9368.947.938.5.5 4..943.947.9465.9457.948..946.94.9446.94.94.3.943.948.944.9363.939.4.94.9374.94.94.9385.5.94.9343.9354.9394.9339.6.9343.936.9347.9385.938.7.994.9346.934.9354.9397.8.9337.996.935.9353.936.9.933.936.938.9335.9376 53

[Figure 23: Compare the Coverage Probabilities of the Confidence Intervals of µ₁ − µ₂; reference line at confidence level .95]

[Figure 24: Absolute value of the difference between Coverage Probabilities and .95]

4. The average width of confidence intervals of µ µ (µ =, µ = ) Table 7: The average width of confidence intervals of µ µ (µ =, µ = ).5.5 4..75.456.89.6584 3.335..96.394.785.569 3.37.3.84.3679.743.479.9458.4.77.344.687.37.7433.5.576.349.69.585.586.6.4.84.5678.398.663.7.4.486.4955.9938.994.8.4.46.49.875.656.9.73.464.947.593.8.5.5 4..865.73.3458.695.38..839.678.3357.67.345.3.88.66.344.6467.98.4.773.548.394.68.36.5.79.458.93.588.664.6.676.353.74.54.85.7.69..434.4879.9779.8.5.4.8.455.8383.9.387.774.556.38.63 55

[Figure 25: Compare the Average Widths of the Confidence Intervals of µ₁ − µ₂]

With regard to the average width of the confidence intervals of µ₁ − µ₂, the confidence intervals in case 2 are always narrower than those in case 1.

4.3 Compare the average variance of ˆµ₁ − ˆµ₂

As presented in Figure 26, at each level of σ, the average variance of ˆµ₁ − ˆµ₂ is smaller in case 2 than in case 1. This result explains the finding in section 4.2: the average width of the confidence intervals is smaller in case 2 than in case 1.

Table 8: The average variance of ˆµ.5.5 4..3.453.85.756.97...48.63.65.649.3.9.36.438.578.38.4.79.34.6.59.45.5.67.7.7.475.7.6.54..874.3498.4.7.43.68.678.74.8.8.9.6.466.867.74.9.5.6.43.969.395.5.5 4...78.3.5.5..8.74.95.79.474.3.7.69.74.99.4399.4.6.63.5.5.45.5.4.57.6.9.364.6..49.95.78.37.7..4.6.64.555.8 7.39e-4.9.8.473.878.9 4.8e-4.7.67.67.75 57

[Figure 26: Compare the average variance of ˆµ₁ − ˆµ₂]

VIII. Conclusion

It can be concluded from the numerical study that, compared to estimators computed after omitting the observations with missing data, the estimators considered in this thesis generally lead to better performance in the following respects: higher power of the tests and shorter confidence intervals, while keeping the coverage probabilities close to the nominal confidence level and the type I errors close to the desired α level. Furthermore, when ρ is unknown, using the estimate of ρ leads to the same conclusion.