Efficient estimation in missing data and survey sampling problems


Graduate Theses and Dissertations, Iowa State University Capstones, Theses and Dissertations, 2012

Efficient estimation in missing data and survey sampling problems
Sixia Chen, Iowa State University

Part of the Statistics and Probability Commons

Recommended Citation: Chen, Sixia, "Efficient estimation in missing data and survey sampling problems" (2012). Graduate Theses and Dissertations.

This Dissertation is brought to you for free and open access by the Iowa State University Capstones, Theses and Dissertations at Iowa State University Digital Repository. It has been accepted for inclusion in Graduate Theses and Dissertations by an authorized administrator of Iowa State University Digital Repository. For more information, please contact digirep@iastate.edu.

Efficient estimation in missing data and survey sampling problems

by Sixia Chen

A dissertation submitted to the graduate faculty in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY

Major: Statistics

Program of Study Committee: Jae Kwang Kim, Co-major Professor; Wayne A. Fuller, Co-major Professor; Zhengyuan Zhu; Nordman Dan; Cindy L. Yu

Iowa State University, Ames, Iowa, 2012

Copyright © Sixia Chen. All rights reserved.

DEDICATION

I would like to dedicate this thesis to my parents and my wife, without whose support I would not have been able to complete this work. I would also like to thank my friends and family for their loving guidance and financial assistance during the writing of this work.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
ACKNOWLEDGEMENTS
ABSTRACT
CHAPTER 1. GENERAL INTRODUCTION
CHAPTER 2. SEMI-PARAMETRIC INFERENCE WITH A FUNCTIONAL-FORM EMPIRICAL LIKELIHOOD
  2.1 Introduction
  2.2 Main Results
  2.3 Extension
  2.4 Computational Aspects
  2.5 Simulation Study
  2.6 Conclusions
CHAPTER 3. A UNIFIED THEORY ON EMPIRICAL LIKELIHOOD METHODS WITH MISSING DATA AND SURVEY SAMPLING
  3.1 Introduction
  3.2 Basic setup
  3.3 Estimation with known response probability
  3.4 Estimation with unknown response probability
  3.5 Nonparametric estimation of the response mechanism
  3.6 Extension to two-phase sampling
  3.7 Simulation Study
    3.7.1 Simulation One
    3.7.2 Simulation Two
CHAPTER 4. POPULATION EMPIRICAL LIKELIHOOD FOR NONPARAMETRIC INFERENCE IN SURVEY SAMPLING
  4.1 Introduction
  4.2 Population empirical likelihood
  4.3 Main results
  4.4 Extension to rejective Poisson sampling
  4.5 Combining information from two independent surveys
  4.6 Simulation Study
    4.6.1 Simulation One
    4.6.2 Simulation Two
  4.7 Concluding remarks
CHAPTER 5. TWO-PHASE SAMPLING FOR PROPENSITY SCORE ESTIMATION IN VOLUNTARY SAMPLES
  5.1 Introduction
  5.2 Basic Setup
  5.3 Main Results
  5.4 Extension to non-nested two-phase sampling
  5.5 Simulation Study
    5.5.1 Simulation One
    5.5.2 Simulation Two
  5.6 Empirical Study
  5.7 Concluding Remarks
CHAPTER 6. FUTURE RESEARCH TOPICS
  6.1 Jackknife empirical likelihood for inference with imputed data
  6.2 Nonparametric propensity score estimation
  6.3 Inference with parametric fractional imputation
APPENDIX A. PROOFS FOR CHAPTER 2
APPENDIX B. PROOFS FOR CHAPTER 3
APPENDIX C. PROOFS FOR CHAPTER 4
APPENDIX D. PROOFS FOR CHAPTER 5
BIBLIOGRAPHY

LIST OF TABLES

Table 2.1 Monte Carlo relative efficiency of the point estimators
Table 2.2 Power comparisons for testing H_0: ρ = 0
Data structure for two-phase sampling
Biases, variances and mean squared errors (MSE) of the estimators under four different scenarios in Simulation One
The Monte Carlo biases, variances, and the mean squared errors (MSE) of the point estimators in Simulation Two
Monte Carlo biases, variances, and mean squared errors of the point estimators
Coverage rate and average length comparison for Wald's and Wilks' type 95% confidence intervals of the proposed POEL2 method
The Monte Carlo biases, variances, and the mean squared errors (MSE) of the point estimators in Simulation Two
Simulation results of the point estimators for θ_1 and θ_2 in Simulation One
Simulation results of the point estimators for θ in Simulation Two
Estimated coefficients in the propensity model
Estimated parameters (s.e.) for 2012 Iowa Caucus Survey Results

LIST OF FIGURES

Figure 2.1 Parameter estimations versus penalty parameter
Sample structure of 2012 Iowa Caucus Survey

ACKNOWLEDGEMENTS

I would like to take this opportunity to express my thanks to those who helped me with various aspects of conducting research and the writing of this thesis. First and foremost, I thank Dr. Jae Kwang Kim for his guidance, patience and support throughout this research and the writing of this thesis. His insights and words of encouragement have often inspired me and renewed my hopes for completing my graduate education. I am particularly grateful to Dr. Wayne A. Fuller for his helpful comments and suggestions. I would also like to thank my committee members for their efforts and contributions to this work.

ABSTRACT

The thesis consists of four research papers. The first paper deals with general theory for empirical likelihood under the standard setup. Instead of maximizing the empirical likelihood function, a functional-form approach is proposed to generalize the theory of empirical likelihood and to achieve computational efficiency. The second paper deals with an empirical likelihood approach for missing data. The proposed method uses a partial likelihood for the respondents, and theories are developed for both a parametric response model and a nonparametric response model. Also, the proposed method is extended to two-phase sampling where the first-phase sample is obtained by complex survey sampling. The third paper deals with empirical likelihood in the survey sampling setup. In the proposed method, called the population empirical likelihood method, the empirical likelihood function is defined for the finite population and the sampling design is incorporated into one of the constraints in the optimization problem. The proposed method is quite useful when combining information from several independent surveys. The fourth paper proposes a novel application of the capture-recapture experiment to estimate the propensity score for nonignorable nonresponse. The proposed method can be used to reduce the selection bias associated with voluntary sampling.

CHAPTER 1. GENERAL INTRODUCTION

Hartley and Rao (1968) introduced the empirical likelihood (EL) approach under the name of scale load. Owen (1988, 1990) brought the EL method to standard statistical problems. For a comprehensive overview of the EL method, see Owen (2001). Chen and Hall (1993) extended the EL method to inference for quantiles. Qin and Lawless (1994) extended the EL method to inference for parameters defined by general estimating equations. DiCiccio et al. (1991) and Chen and Cui (2006) used Bartlett correction techniques to improve the convergence rate of the empirical likelihood ratio. The application of the EL method in time series has been considered by Kitamura (1997), Nordman et al. (2007) and others. Recently, Hjort, McKeague and Van Keilegom (2009), Chen, Peng and Qin (2009) and Tang and Leng (2010) showed that the EL method continues to work when the data dimensionality is growing. Newey and Smith (2004) proposed the generalized empirical likelihood (GEL), which extended the scope of the traditional EL method. In Chapter 2, we propose a different extension by using the functional-form empirical likelihood (FEL) method. The basic idea is to generalize the form of the EL weight or of the objective function. We prove the first-order equivalence between our proposed estimator and the traditional EL estimator. The proposed estimator has certain advantages in terms of computation and choice of weights.

Missing data happens frequently in observational studies. If the missing mechanism is completely missing at random (CMAR) in the sense of Rubin (1976), we can safely remove the missing part of the data. However, if the response mechanism is missing at random (MAR) or not missing at random (NMAR), we cannot ignore the missing data if we want to produce efficient and consistent estimates. There are two main approaches for inference with missing data: imputation and propensity score weighting. Wang and Rao (2002) and Wang and Chen (2009) considered combining EL and imputation methods for inference with data missing at random. Alternatively, Qin, Leung and Shao (2002) proposed the EL method to deal with nonignorable missing data by using the propensity score method. Qin and Zhang (2007) applied the EL method in missing response problems. Chen, Leung and Qin (2008) proposed constructing two different empirical likelihood methods with data MAR. Most recently, Qin et al. (2009) provided the complete EL method for the missing covariate problem.

The literature is somewhat sparse for modeling the response mechanism nonparametrically. Cheng (1994) discussed some asymptotic properties of the mean estimator based on the kernel regression method under ignorable missing data. Recently, Kim and Yu (2011) extended the approach of Cheng (1994) to handle nonignorable nonresponse. Xue (2009) discussed an empirical likelihood method for linear models using the weights computed from a nonparametric model where the kernel regression method is used to estimate the response model. Da Silva and Opsomer (2009) considered another type of nonparametric response probability estimator using local polynomial regression. Hirano et al. (2003) and Cattaneo (2010) discussed semiparametric efficiency of the nonparametric response propensity estimators in the context of estimating the average treatment effect in econometrics. In Chapter 3, we propose a response EL method which can be used to handle both survey sampling and missing data problems. Specifically, we propose estimating the propensity score nonparametrically in the EL method. By doing this, the semi-parametric lower bound can be achieved automatically.

The use of the EL method for a finite population parameter was first considered by Chen and Qin (1993), but their method is only applicable under simple random sampling (SRS). Chen and Sitter (1999) proposed the pseudo empirical likelihood (PEL), which can be used to deal with complex survey data. Wu and Rao (2006) constructed a likelihood ratio-based confidence interval for the population mean by using PEL. For the most recent development of PEL, see Rao and Wu (2009).

The likelihood ratio property is the most attractive property of the EL method. The corresponding confidence region has several advantages compared to the normal approximation (NA) confidence region. These include better coverage rate, shape respecting, and transformation invariance. However, the PEL ratio converges to a scaled chi-squared distribution instead of the standard chi-squared distribution. The scale factor needs to be estimated and it often depends on the complex sampling design. In addition, the PEL estimator is not equivalent to the design optimal estimator. To avoid those drawbacks, we propose using the population empirical likelihood (POEL) estimator in Chapter 4. The POEL likelihood ratio converges to the standard chi-squared distribution; the proposed estimator is equivalent to the design optimal estimator; and the POEL method can combine several sources of auxiliary information.

A voluntary sample is a self-selected sample whose first-order inclusion probabilities are unknown. The most popular method of inference for a voluntary sample is propensity score weighting. Rosenbaum and Rubin (1983) and Rosenbaum (1987) proposed using propensity scores to estimate treatment effects in observational studies. Duncan and Stasny (2001) used the propensity score method to control coverage bias in telephone surveys. Lee (2006) applied the propensity score method to a volunteer panel web survey. Lee and Valliant (2009) and Valliant and Dever (2011) considered the propensity score method for a web-based voluntary sample. All of these studies assumed an ignorable selection mechanism. However, we often confront the case where the selection mechanism does depend on the study variable itself. In Chapter 5, we propose a novel two-phase approach for estimators with a voluntary sample. The proposed method can be extended to handle a non-nested two-phase voluntary sample. The auxiliary information can be incorporated via the generalized method of moments (GMM).

The thesis is organized as follows. In Chapter 2, we present the new functional-form empirical likelihood method. In Chapter 3, we propose a unified theory of using the EL method in missing data problems. In Chapter 4, we propose using the population empirical likelihood (POEL) method for inference with survey data. In Chapter 5, a novel approach is proposed for inference in the voluntary sample problem. Future work is presented in Chapter 6. Technical details are presented in the appendices.

CHAPTER 2. SEMI-PARAMETRIC INFERENCE WITH A FUNCTIONAL-FORM EMPIRICAL LIKELIHOOD

A paper submitted to the Journal of the Korean Statistical Society

Sixia Chen and Jae Kwang Kim

Abstract

A functional-form empirical likelihood method is proposed as an alternative to the empirical likelihood method. The proposed method has the same asymptotic properties as the empirical likelihood method but offers more flexibility in constructing the weights. Some computational efficiency can also be gained. Because it enjoys a likelihood-based interpretation, the profile likelihood ratio test has a chi-square limiting distribution. Some computational details are also discussed, and results from limited simulation studies are presented.

Key Words: Exponential tilting, Generalized method of moments, Nonparametric maximum likelihood method, Profile likelihood ratio test.

2.1 Introduction

The empirical likelihood method, proposed by Owen (1988, 1990), provides a useful tool for obtaining nonparametric confidence regions for statistical functionals. Even though the empirical likelihood method is a nonparametric approach in the sense that it does not require a parametric model for the underlying distribution of the sample observation, it enjoys some of the desirable properties of the likelihood-based method. Using a nonparametric likelihood function, the empirical likelihood method can easily incorporate known constraints on parameters and also incorporate prior information on parameters. For example, Chen and Qin (1993) and Qin (2000) discuss combining information using the empirical likelihood. A comprehensive overview of the empirical likelihood method is provided by Owen (2001).

We consider an extension of the empirical likelihood method by providing a class of nonparametric estimators that have the same asymptotic properties as the empirical likelihood method. In particular, instead of assuming a nonparametric likelihood, we consider a generalization of the empirical likelihood that uses a functional-form likelihood function in the likelihood maximization. The class of functional-form likelihood functions contains the empirical likelihood function as a special case. The functional-form likelihood approach provides several useful alternatives to the classical empirical likelihood method in the sense that some of the computational difficulty of the empirical likelihood method can be avoided, and clearer insights into the empirical likelihood method can be obtained.

Let $z_1, \ldots, z_n$ be $n$ independent realizations of a vector-valued random variable $Z$ with a distribution function $F(z)$ that is completely unspecified. In the empirical likelihood approach, we consider a class of distribution functions, $\mathcal{F}_1 \subset \mathcal{F}$, that have support on $z_1, \ldots, z_n$. Thus, the elements in $\mathcal{F}_1$ can be written as $F_w(x) = \sum_{i=1}^n w_i I(z_i \le x)$ with $\sum_{i=1}^n w_i = 1$ and $w_i > 0$, where $I(z_i \le x)$ takes the value one if $z_i \le x$ and takes the value zero otherwise. The parameter $w_i$ is the amount of point mass that unit $z_i$ represents in the population. We are interested in making an inference about $\theta_0$, which is defined as the unique solution to $E\{U(Z;\theta)\} = 0$, where $U(Z;\theta)$ is an $r$-dimensional vector of functions known up to $\theta$ and the dimension of $\theta$ equals $p \le r$. Hansen (1982) and Imbens (1997) considered this over-identified situation in the context of a generalized method of moments in econometrics. In this setup, Qin and Lawless (1994) considered the empirical likelihood estimator of $\theta_0$ that can be obtained by maximizing

$$\sum_{i=1}^n \ln(w_i) \tag{2.1}$$

subject to

$$\sum_{i=1}^n w_i \{1, U(z_i;\theta)\} = (1, 0). \tag{2.2}$$

Note that (2.2) is equivalent to the condition $E\{U(Z;\theta)\} = 0$ for $F \in \mathcal{F}_1$. Using the Lagrange multiplier method, the empirical likelihood estimator can be obtained by maximizing

$$l_e(\theta) = \sum_{i=1}^n \ln\{w_i(\theta)\}, \tag{2.3}$$

where $w_i(\theta)$ is of the form

$$w_i(\theta) = \frac{1}{n}\,\frac{1}{1 + \hat\lambda_\theta^T U(z_i;\theta)} \tag{2.4}$$

and $\hat\lambda_\theta$ satisfies the second equation of (2.2). Qin and Lawless (1994) showed that the empirical likelihood estimator satisfies

$$2\{l_e(\hat\theta) - l_e(\theta_0)\} \xrightarrow{d} \chi^2_p, \tag{2.5}$$

where $\hat\theta$ is the empirical likelihood estimator. The result (2.5) is often called Wilks' theorem for empirical likelihood and is quite useful in obtaining confidence regions for $\theta_0$.

The weight (2.4) used to compute the empirical likelihood estimator can be expressed as

$$w_i(\theta, \hat\lambda_\theta) = \frac{m\{\hat\lambda_\theta^T U(z_i;\theta)\}}{\sum_{j=1}^n m\{\hat\lambda_\theta^T U(z_j;\theta)\}}, \tag{2.6}$$

where $m(x) = 1/(1 - x)$ and $\hat\lambda_\theta = \hat\lambda(\theta; z_1, \ldots, z_n)$ satisfies

$$\sum_{i=1}^n w_i(\theta, \hat\lambda_\theta)\, U(z_i;\theta) = 0. \tag{2.7}$$

The Lagrange multiplier $\hat\lambda_\theta = \hat\lambda(\theta; z_1, \ldots, z_n)$ is completely determined by (2.7). We assume that, for given $\theta$, the solution $\hat\lambda_\theta$ to (2.7) is unique. The unique solution exists for any given $\theta$ if 0 is inside the convex hull of the points $U(z_1;\theta), \ldots, U(z_n;\theta)$.

We consider an extension of the empirical likelihood estimator by allowing $m(x)$ in (2.6) to be some smooth function other than $m(x) = 1/(1 - x)$. The proposed estimator can be called the functional-form empirical likelihood (FEL) estimator because it uses a known function $m(x)$ in computing the weights. For example, the exponential tilting (ET) estimator considered in Kitamura and Stutzer (1997) and Schennach (2007) has the form (2.6) with $m(x) = \exp(x)$. Imbens, Spady, and Johnson (1998) advocated using the ET estimator over the empirical likelihood (EL) estimator based on Monte Carlo investigation and analytic comparison using higher-order asymptotic expansion.

In this paper, we discuss some asymptotic properties of the FEL estimator. In particular, asymptotic normality and a version of Wilks' theorem for the FEL estimator are established. We find that the asymptotic results in Qin and Lawless (1994) are special cases of the general results in this paper. The results in this paper can also be used to make inferences for other types of FEL estimators, including the ET estimator.

The main results are presented in Section 2. Some extensions are introduced in Section 3 to illustrate possible theoretical results of the proposed FEL estimator. In Section 4, the underlying algorithm is discussed. Results from a limited simulation study are presented in Section 5 and concluding remarks are made in Section 6.

2.2 Main Results

Based on the functional form of the FEL weights in (2.6), we can define a functional-form empirical log-likelihood function

$$l(\theta) = l(\theta, \hat\lambda_\theta) = \sum_{i=1}^n \ln \omega_i(\theta, \hat\lambda_\theta) = \sum_{i=1}^n \ln\left\{\frac{m_i(\theta, \hat\lambda_\theta)}{\sum_{j=1}^n m_j(\theta, \hat\lambda_\theta)}\right\}, \tag{2.8}$$

where $m_i(\theta, \hat\lambda_\theta) = m\{\hat\lambda_\theta^T U(z_i;\theta)\}$ for some function $m(\cdot)$ and $\hat\lambda_\theta$ satisfies (2.7). The log-likelihood function in (2.8) has a parametric form in the sense that the likelihood function is known except for the unknown parameter $(\theta, \lambda)$. The optimization using (2.8) is generally simpler than the optimization using the nonparametric likelihood (2.1) since the dimension of the parameter space is reduced from $n$ to $p + r$. The parameter $\lambda$ is used to facilitate the computation for constrained optimization. Furthermore, the log-likelihood function (2.8) does not directly use any distributional assumptions. Thus, the maximum likelihood estimator using (2.8) is still nonparametric in the sense that it is valid without any distributional assumptions. The only assumption we use is $E\{U(Z;\theta_0)\} = 0$.

Let $\hat\theta$ be the solution that maximizes $l(\theta, \hat\lambda_\theta)$ in (2.8). Let $\hat Q_1(\theta, \lambda) = \sum_{i=1}^n \omega_i(\theta, \lambda)\, U(z_i;\theta)$ and $\hat Q_2(\theta, \lambda) = n^{-1}\, dl(\theta, \hat\lambda_\theta)/d\theta$. The solution $\hat\theta$ and its corresponding $\lambda$-value, denoted by $\hat\lambda = \hat\lambda(\hat\theta)$, satisfy $\hat Q_1(\hat\theta, \hat\lambda) = 0$ and $\hat Q_2(\hat\theta, \hat\lambda) = 0$. The solution $\hat\theta$ is called the FEL estimator of $\theta_0$. For simplicity of notation, let $\gamma = (\theta, \lambda)$ and $\hat\gamma = (\hat\theta, \hat\lambda)$. Also, let $\hat Q(\gamma) = (\hat Q_1(\gamma), \hat Q_2(\gamma))$. To discuss the asymptotic properties of the FEL estimator, we assume the following conditions:

(C1) The solution $\theta_0$ to $E\{U(Z;\theta)\} = 0$ is unique.

(C2) In the weight function (2.6), the function $m(x)$ is always positive and has continuous second-order derivatives at $x = 0$ with $m(0) = m'(0) = 1$.

(C3) The partial derivative $\dot U(\theta) = \partial U(\theta)/\partial\theta$ is a continuous function of $\theta$ in the compact set $A$ and $\theta_0 \in A$ almost surely.

(C4) The random functions $\hat Q(\gamma)$ converge uniformly in probability to $Q(\gamma) = E\{\hat Q(\gamma)\}$ in the compact set $B$ and $\gamma_0 \in B$, where $\gamma_0 = (\theta_0, 0)$.

The following theorem provides the consistency of the FEL estimator.

Theorem 2.2.1 Assume that conditions (C1)-(C4) hold. Assume that the solution $(\hat\theta, \hat\lambda)$ to $\hat Q_1(\theta, \lambda) = 0$ and $\hat Q_2(\theta, \lambda) = 0$ is uniquely determined. Then, the solution $(\hat\theta, \hat\lambda)$ satisfies

$$\operatorname*{p\,lim}_{n \to \infty}\, (\hat\theta, \hat\lambda) = (\theta_0, 0), \tag{2.9}$$

where $\theta_0$ is the unique solution to $E\{U(Z;\theta)\} = 0$.

In the special case of the empirical likelihood method, Qin and Lawless (1994) also proved (2.9). The proof of Theorem 2.2.1, which is different from that of Qin and Lawless (1994), is presented in Section A of Appendix A.

Theorem 2.2.2 In addition to the conditions of Theorem 2.2.1, assume that

(C5) $\partial^2 U(z, \theta)/(\partial\theta\, \partial\theta^T)$ is continuous at $\theta$ in the compact set $A$ almost surely.

(C6) $\|U(Z;\theta)\|^3$, $\|\partial U(Z;\theta)/\partial\theta\|$, and $\|\partial^2 U(Z, \theta)/(\partial\theta\, \partial\theta^T)\|$ are bounded by some integrable function $G(Z)$.

(C7) The $r \times p$ matrix $E\{\partial U(Z;\theta_0)/\partial\theta\}$ has full column rank $p$. Also, $Var\{U(Z;\theta)\}$ is positive definite in the compact set $A$.

Then, we have

$$\sqrt{n} \begin{pmatrix} \hat\theta - \theta_0 \\ \hat\lambda - 0 \end{pmatrix} \xrightarrow{d} N(0, V), \tag{2.10}$$

where

$$V = \begin{pmatrix} V_1 & 0 \\ 0 & V_2 \end{pmatrix},$$

$$V_1 = \left\{ E\left(\frac{\partial U}{\partial\theta}\right)^T \left(E\, U U^T\right)^{-1} E\left(\frac{\partial U}{\partial\theta}\right) \right\}^{-1}$$

and

$$V_2 = \left\{E(U U^T)\right\}^{-1} \left[ I - E\left(\frac{\partial U}{\partial\theta}\right) V_1\, E\left(\frac{\partial U}{\partial\theta}\right)^T \left\{E(U U^T)\right\}^{-1} \right].$$

The proof of Theorem 2.2.2 is presented in Section B of Appendix A. Using Theorem 2.2.2, we can construct a Wald-type confidence interval for $\theta_0$. The asymptotic variance $V_1$ of $\sqrt{n}(\hat\theta - \theta_0)$ can be consistently estimated by

$$\left[ \left\{\sum_{i=1}^n w_i \dot U(z_i;\hat\theta)\right\}^T \left\{\sum_{i=1}^n w_i U(z_i;\hat\theta)\, U(z_i;\hat\theta)^T\right\}^{-1} \left\{\sum_{i=1}^n w_i \dot U(z_i;\hat\theta)\right\} \right]^{-1},$$

where $w_i = w_i(\hat\theta, \hat\lambda)$ is the final FEL weight in (2.6) evaluated at $\hat\theta$ and $\hat\lambda$.

By Theorem 2.2.2, the asymptotic variance of the FEL estimator can be derived. For example, if $z_i = (x_i, y_i)^T$ and $\mu_x = E(x)$ is known, the FEL estimator of $\theta = E(y)$ can be obtained using $\hat\theta = \sum_{i=1}^n \hat m_i y_i / \sum_{i=1}^n \hat m_i$ with $\hat m_i = m\{\hat\lambda (x_i - \mu_x)\}$, where $\hat\lambda$ satisfies $\sum_{i=1}^n \hat m_i (x_i - \mu_x) = 0$. The asymptotic variance of $\hat\theta$ is equal to $n^{-1} V(y)(1 - \rho^2)$, where $\rho$ is the correlation coefficient of $x$ and $y$ in the population. Note that this asymptotic variance is equal to the asymptotic variance of the regression estimator

$$\hat\theta_{reg} = \bar y + S_{yx} S_{xx}^{-1} (\mu_x - \bar x), \tag{2.11}$$

and so the FEL estimator in this setup is asymptotically equivalent to the regression estimator (2.11). The regression estimator (2.11) is the maximum likelihood estimator under the bivariate normality assumption (Anderson, 1957). The asymptotic variance $V_1$ is equal to the semiparametric lower bound discussed in Chamberlain (1987), and so the FEL estimator achieves semiparametric efficiency.

Theorem 2.2.3 The functional-form empirical likelihood ratio statistic for testing $H_0: \theta = \theta_0$ is

$$W(\theta_0) = l(\hat\theta) - l(\theta_0), \tag{2.12}$$

where $l(\theta)$ is given by (2.8). Under the assumption of Theorem 2.2.1, we have that

$$2 W(\theta_0) \xrightarrow{d} \chi^2_p \tag{2.13}$$

as $n \to \infty$, when $H_0$ is true.

Theorem 2.2.3, which can be called Wilks' theorem for the FEL method, shows that the FEL log-likelihood in (2.8) can be used to construct a confidence interval based on the likelihood ratio statistic (2.12), as in the parametric likelihood method. In the following corollary, we show that the FEL method can be used to construct profile likelihood ratio confidence intervals. The proofs of Theorem 2.2.3 and the corollary are presented in Sections C and D of Appendix A, respectively. Results similar to the corollary are also presented in Qin and Lawless (1994) in the context of the empirical likelihood method, but we present a different proof.

Corollary Let $\theta = (\theta_1^T, \theta_2^T)^T$, where $\theta_1$ and $\theta_2$ are $q \times 1$ and $(p - q) \times 1$ vectors, respectively. For $H_0: \theta_1 = \theta_1^0$, the profile generalized empirical likelihood ratio test statistic is defined by

$$W_2 = l(\hat\theta_1, \hat\theta_2) - l(\theta_1^0, \hat\theta_2^0), \tag{2.14}$$

where $\hat\theta_2^0$ maximizes $l(\theta_1^0, \theta_2)$ with respect to $\theta_2$. Then, under $H_0$, we have that $2 W_2 \xrightarrow{d} \chi^2_q$ as $n \to \infty$.
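The known-mean example in the main results (an FEL estimator of θ = E(y) when μ_x is known) can be illustrated numerically. The following is a minimal sketch, not the authors' implementation: it assumes the exponential tilting choice m(x) = exp(x), a plain Newton iteration for the scalar multiplier, and simulated bivariate normal data with unit means, unit variances and correlation 0.5. The function names `et_lambda` and `fel_mean`, the seed and the sample size are all hypothetical choices made for illustration.

```python
import numpy as np

def et_lambda(u, tol=1e-10, max_iter=50):
    """Newton's method for the scalar lam solving sum_i exp(lam*u_i)*u_i = 0."""
    lam = 0.0
    for _ in range(max_iter):
        m = np.exp(lam * u)            # m(x) = exp(x), so m'(x) = exp(x) too
        step = np.sum(m * u) / np.sum(m * u * u)
        lam -= step
        if abs(step) < tol:
            break
    return lam

def fel_mean(x, y, mu_x):
    """ET-weighted estimate of E(y) using the constraint E(x) = mu_x."""
    u = x - mu_x
    m = np.exp(et_lambda(u) * u)
    w = m / m.sum()                    # normalized weights of the form (2.6)
    return float(np.sum(w * y)), w

# Simulated data: the mean vector and covariance are assumptions of this sketch.
rng = np.random.default_rng(0)
cov = [[1.0, 0.5], [0.5, 1.0]]
xy = rng.multivariate_normal([1.0, 1.0], cov, size=200)
x, y = xy[:, 0], xy[:, 1]

theta_fel, w = fel_mean(x, y, mu_x=1.0)

# Regression estimator (2.11) for comparison.
s_yx = np.cov(x, y)[0, 1]
s_xx = np.var(x, ddof=1)
theta_reg = y.mean() + s_yx / s_xx * (1.0 - x.mean())
print(theta_fel, theta_reg)
```

The weights sum to one and reproduce the known mean of x exactly (up to numerical tolerance), and the two printed estimates should be close, in line with the stated asymptotic equivalence between the FEL estimator and the regression estimator.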

Remark The FEL method could be called a generalized empirical likelihood method because it is essentially a generalization of the empirical likelihood method using a functional-form weight function. The term generalized empirical likelihood, however, was already used by Smith (1997) and Newey and Smith (2004) to denote another type of extension of the empirical likelihood method in econometrics using a saddle point optimization problem. Our method is different from the GEL method because we do not have to specify the objective function for saddle point computation; we only have to specify directly the functional form of the weights in the FEL estimators.

2.3 Extension

The log-likelihood function in (2.8) can be viewed as a negative divergence function between $1/n$ and $\omega_i$. Instead of using a divergence function based on the log-likelihood (2.8), one can also consider a more general class of divergence functions. Specifically, we consider a class of divergence functions based on power-divergence statistics, proposed by Cressie and Read (1984),

$$CR(\alpha) = \frac{2}{\alpha(\alpha + 1)} \sum_{i=1}^n \left\{ \left(\frac{1/n}{\omega_i}\right)^{\alpha} - 1 \right\}. \tag{2.15}$$

Note that $CR(0) = -2 \sum_{i=1}^n \log(n\omega_i)$, which corresponds to the log-likelihood function in (2.8) up to a sign and an additive constant, and $CR(-1) = 2 \sum_{i=1}^n n\omega_i \log(n\omega_i)$, which is often called the Kullback-Leibler divergence measure. The results in Section 2 show that the choice of weight function is not critical because the resulting estimators are all asymptotically equivalent. Surprisingly, we show in this section that the choice of the objective function is not critical either. The results presented here are an extension of Baggerly (1998) to the case when $\theta$ is defined through the solution to an estimating equation.

Theorem Let $\hat Q_1(\theta, \lambda) = \sum_{i=1}^n \omega_i U(z_i;\theta)$ and $\hat Q_2(\theta, \lambda) = n^{-1}\, dl_3(\theta, \lambda)/d\theta$, where $\omega_i$ is defined in (2.6) and

$$l_3(\theta, \lambda) = -\frac{1}{\alpha(\alpha + 1)} \sum_{i=1}^n \left[ \{\omega_i(\theta, \lambda)\, n\}^{-\alpha} - 1 \right]. \tag{2.16}$$

Suppose that $(\hat\theta, \hat\lambda)$ is the solution of $\hat Q_1(\theta, \lambda) = 0$ and $\hat Q_2(\theta, \lambda) = 0$. Then, under the conditions stated in Theorem 2.2.1 and Theorem 2.2.2, we have

$$\sqrt{n} \begin{pmatrix} \hat\theta - \theta_0 \\ \hat\lambda \end{pmatrix} \xrightarrow{d} N(0, V), \tag{2.17}$$

where $V$ is defined in (2.10). Also, the generalized empirical likelihood ratio statistic for testing $H_0: \theta = \theta_0$ satisfies

$$2\{l_3(\hat\theta) - l_3(\theta_0)\} \xrightarrow{d} \chi^2_p, \tag{2.18}$$

where $l_3(\theta)$ is given by (2.16).

This theorem is a general result in the sense that, for the special case of $\alpha = 0$ in (2.15), it leads to Theorem 2.2.2 and Theorem 2.2.3. Also, for the special case of $\alpha = -1$, we have the following result. Its proof is very similar to that of the theorem and is not presented here.

Corollary Let $l_2(\theta) = -\sum_{i=1}^n n\omega_i \log(n\omega_i)$ and assume that $\hat\theta$ maximizes $l_2(\theta)$. Then we have

$$2\{l_2(\hat\theta) - l_2(\theta_0)\} \xrightarrow{d} \chi^2_p,$$

where $\theta_0$ is the true value of $\theta$.

2.4 Computational Aspects

The FEL estimator that maximizes the objective function (2.8) subject to the constraint (2.7) could be viewed as a standard optimization problem in the $(\theta, \lambda)$ space of dimension $p + r$. However, as shown in Section A of Appendix A, the probability limit $Q_2(\theta, \lambda)$ of $\hat Q_2(\theta, \lambda)$ satisfies $Q_2(\theta, 0) = 0$ for all $\theta$. Thus, standard approaches to solving the system of equations $\hat Q_1(\theta, \lambda) = 0$ and $\hat Q_2(\theta, \lambda) = 0$ can behave erratically in the neighborhood of $\lambda = 0$. To avoid this numerical problem, we consider an approach using a penalty term of the kind used in the ridge regression method, as was also considered by Imbens, Spady, and Johnson (1998). The objective function with a penalty term can be expressed as

$$l^*(\theta, \lambda) = l(\theta, \lambda) - 0.5 K\, \hat Q_1(\theta, \lambda)^T W \hat Q_1(\theta, \lambda), \tag{2.19}$$

where $l(\theta, \lambda)$ is the original objective function, such as (2.8) or (2.16), $K$ is a scalar penalty term that makes the optimization problem locally convex, and $W$ is some $r \times r$ positive definite matrix. Note that $\hat Q_2^*(\theta, \lambda) = n^{-1}\, \partial l^*(\theta, \lambda)/\partial\theta$ can be written as

$$\hat Q_2^*(\theta, \lambda) = \hat Q_2(\theta, \lambda) - K n^{-1}\, \hat Q_{1\theta}(\theta, \lambda)^T W \hat Q_1(\theta, \lambda),$$

where $\hat Q_{1\theta}(\theta, \lambda) = \partial \hat Q_1(\theta, \lambda)/\partial\theta$. Thus, for sufficiently large $K = O(n)$, we have

$$Q_2^*(\theta, 0) \ne 0 \ \text{for}\ \theta \ne \theta_0 \quad\text{and}\quad Q_2^*(\theta_0, 0) = 0, \tag{2.20}$$

where $Q_2^*(\theta, \lambda)$ is the probability limit of $\hat Q_2^*(\theta, \lambda)$. Property (2.20) follows because $Q_2^*(\theta, \lambda) = Q_2(\theta, \lambda) + C(\theta, \lambda)\, Q_1(\theta, \lambda)$ for some matrix $C(\theta, \lambda)$, and $Q_1(\theta, \lambda)$ satisfies (2.20). Once the solution $(\hat\theta^*, \hat\lambda^*)$ that maximizes $l^*(\theta, \lambda)$ in (2.19) is obtained, we solve $\hat Q_1(\hat\theta^*, \lambda) = 0$, or equivalently

$$\sum_{i=1}^n m\{\lambda^T U(z_i;\hat\theta^*)\}\, U(z_i;\hat\theta^*) = 0, \tag{2.21}$$

for $\lambda$ to get the final solution. The Newton-type solution to (2.21) can be computed by

$$\hat\lambda_{(t+1)} = \hat\lambda_{(t)} - \left\{ \sum_{i=1}^n \dot m\big(\hat\lambda_{(t)}^T U_i\big)\, U_i U_i^T \right\}^{-1} \sum_{i=1}^n m\big(\hat\lambda_{(t)}^T U_i\big)\, U_i,$$

where $U_i = U(z_i;\hat\theta^*)$, with an initial value $\hat\lambda_{(0)} = 0$.

To demonstrate the computation, we use a sample of size $n = 50$ generated from a bivariate normal distribution

$$\begin{pmatrix} X \\ Y \end{pmatrix} \sim N\left[ \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \begin{pmatrix} 1 & 0.5 \\ 0.5 & 1 \end{pmatrix} \right]. \tag{2.22}$$

In the computation, we set $W = I$ and let $K$ vary over a range of values starting at 10. We assume that $\mu_x = 1$ is known and we are interested in estimating $\mu_y$. We used the exponential tilting weight of the form

$$\omega_i = \frac{\exp(\lambda_1 x_i + \lambda_2 y_i)}{\sum_{j=1}^n \exp(\lambda_1 x_j + \lambda_2 y_j)}.$$

From the realized sample, the estimates of $(\mu_y, \lambda_1, \lambda_2)$ that maximize the penalized likelihood (2.19) are computed for each $K$ using

$$\hat Q_1(\theta, \lambda) = \left( \sum_{i=1}^n \omega_i (x_i - 1),\ \sum_{i=1}^n \omega_i (y_i - \theta) \right).$$
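The Newton-type iteration for (2.21) can be sketched in a few lines. This is an illustrative sketch under stated assumptions rather than the authors' code: the helper name `newton_lambda` is hypothetical, the weight function is the exponential tilting choice m(x) = exp(x) (so its derivative is also exp(x)), the value of θ is simply fixed at 1.0 instead of being taken from the penalized maximization of (2.19), and the data are simulated with an assumed covariance matrix.

```python
import numpy as np

def newton_lambda(U, m=np.exp, m_dot=np.exp, tol=1e-10, max_iter=100):
    """Solve sum_i m(lam' U_i) U_i = 0 for lam, starting from lam = 0.

    U is an (n, r) array whose i-th row is U(z_i; theta) at a fixed theta.
    """
    n, r = U.shape
    lam = np.zeros(r)
    for _ in range(max_iter):
        s = U @ lam                              # lam' U_i for every i
        g = U.T @ m(s)                           # sum_i m(lam'U_i) U_i
        H = (U * m_dot(s)[:, None]).T @ U        # sum_i m'(lam'U_i) U_i U_i'
        step = np.linalg.solve(H, g)
        lam = lam - step
        if np.linalg.norm(step) < tol:
            break
    return lam

# Simulated sample of size 50; mean and covariance are assumptions of this sketch.
rng = np.random.default_rng(1)
xy = rng.multivariate_normal([1.0, 1.0], [[1.0, 0.5], [0.5, 1.0]], size=50)
theta = 1.0                                      # hypothetical plug-in value for mu_y
U = np.column_stack([xy[:, 0] - 1.0, xy[:, 1] - theta])

lam_hat = newton_lambda(U)
w = np.exp(U @ lam_hat)
w /= w.sum()                                     # exponential tilting weights
residual = U.T @ w                               # should be numerically zero
print(lam_hat, residual)
```

Starting from λ = 0 mirrors the initial value stated in the text. A pure Newton step is used here for brevity; a damped step or line search would be a natural safeguard when 0 lies close to the boundary of the convex hull of the U_i.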

24 14 < Fgure 2.1 around here. > Fgure 2.1 presents the plot of the soluton (ˆµ y, ˆλ 1, ˆλ 2 ) aganst the value of the penalty parameter K. The estmates of µ y and λ 1 converge as K gets larger, but the estmate of λ 2 does not converge even for large K. Because the computaton n Fgure 1 s based on a sngle realzaton of the sample, the resultng ˆµ y s not necessarly equal to µ y = 1. The estmate for µ y can be used for fnal computaton but ˆλ = (ˆλ 1, ˆλ 2 ) need to be updated usng (2.21). 2.5 Smulaton Study To check the fnte sample performance of the FEL estmators, we performed two lmted smulaton studes. In the frst smulaton study, we generated two sets of bvarate data (x, y ) from two dfferent samplng dstrbutons: the bvarate normal dstrbuton (2.22) and a bvarate non-normal dstrbuton defned by x χ 2 (1) y = M(x 1) + e, (2.23) where M = 0.5, e exp(1), and e s ndependent of x for = 1, 2,..., n. Note that, n both dstrbutons, E(X) = E(Y ), V (X) = V (Y ), and Corr(X, Y ) = 0.5. For each dstrbuton, we generated B = 2, 000 ndependent Monte Carlo samples of sze n, where we used the three dfferent sample szes: n = 20, 50, and 100. For each sample generated above, we computed three FEL estmators of µ y = E(Y ) under the followng scenaros: (Scenaro 1) We have no extra nformaton. (Scenaro 2) We use µ x = 1 as the constrant. (Scenaro 3) We use µ x = µ y as the constrant. (Scenaro 4) We use µ x = µ y and σ x = σ y as the constrants. In Scenaro 1, we used the sample mean to estmate θ. In Scenaros 2-4, the FEL methods are used to ncorporate the addtonal nformaton. In Scenaro 3, for example, the addtonal

25 15 nformaton can be ncorporated by usng the FEL weghts ω = m λ 1 (x y ) + λ 2 (y θ) n j=1 m λ 1(x j y j ) + λ 2 (y θ) where λ 1 and λ 2 are computed by (2.21) wth U(x, y ; θ) = (x y, y θ) and θ s determned by maxmzng the gven objectve functon. For the choce of m( ) functon n ω, we consdered three dfferent FEL estmators as below: 1. Emprcal lkelhood estmator (EL) usng m(x) = 1/(1 x) wth the objectve functon (2.8). 2. Exponental tltng estmator (ET1) usng m(x) = exp(x) wth the objectve functon l(θ) = n nω log(nω ). 3. Exponental tltng estmator (ET2) wth the objectve functon (2.8). Monte Carlo mean and Monte Carlo varance of the FEL estmators are computed for each scenaro based on the Monte Carlo sample of sze B = 2, 000. All of the FEL estmators are essentally unbased, and the Monte Carlo means are not presented here. Table 2.1 presents the Monte Carlo estmates of the relatve effcency of the FEL estmators. The effcency s computed by the rato of the varance of the sample mean (under Scenaro 1) to the varance of the correspondng FEL estmator. Under the normal dstrbuton, the theoretcal values of the standardzed varance of the FEL estmators are all approxmately equal to 1/(1 ρ 2 ) = for the three scenaros, whch s consstent wth the smulaton results n Table 2.1. The smulaton results n Table 2.1 show that all of the FEL estmators show smlar effcency for large sample sze (n = 100) but the ET estmators are slghtly more effcent than the EL estmator for small sample sze (n = 20, 50). In the second smulaton study, we compared the statstcal power of test statstcs derved from the FEL methods. In ths smulaton study, we frst generated 6 dfferent samples from (X, Y ) d N 1, 1 ρ. 1 ρ 1

26 16 wth 6 dfferent values of ρ, varyng from 0 to 0.5. In addton to the normal model, we also generated samples from the non-normal model (2.23) where M s chosen to make ρ = (0, 0.1, 0.2, 0.3, 0.4, 0.5). In the second study we consdered the same three FEl estmators. We used θ = (µ x, µ y, σx, 2 σy, 2 ρ) and U(x, y; θ) s a 5-dmensonal vector of unbased estmatng functon for θ. For each FEL method, the profle lkelhood test s constructed by computng the full maxmum lkelhood estmator ˆθ and the profle maxmum lkelhood estmator ˆθ 0 that s computed under the null hypothess H 0 : ρ = 0. H 0 : ρ = 0 f The profle lkelhood test wth level α rejects the null hypothess 2 l (ˆθ1, ˆθ ) ( 2 l 0, ˆθ ) 2 0 χ 2 1(1 α) where θ 1 = ρ, θ 2 = (µ x, µ y, σx, 2 σy) 2 and χ 2 1 (1 α) s the 1 α quantle of the ch-square dstrbuton wth 1 degrees of freedom. In addtonal to the FEL method, we also computed the normal-based Pearson test for comparson. The Monte Carlo power of the level α = 0.05 test statstc was computed by the relatve frequency of rejectng the null hypothess H 0 : ρ = 0. Table 2.2 presents the Monte Carlo power of the test statstcs obtaned from three FEL methods for each sample. For ρ = 0, the power s the sze of the test and t converges to α = 0.05 as n gets larger. In the normal sample, the power of the test based on ET method s hgher than that for EL method when n = 100. The ET1 method shows smaller type-1 error than the ET2 method when the sample sze s small. In the non-normal sample, the EL method seems to have better statstcal powers than the ET methods. Overall, the three FEL methods show smlar performances n most cases, whch s consstent wth our theory. 2.6 Conclusons Emprcal lkelhood method s useful n ncorporatng the known constrants of parameters and also n combnng nformaton from dfferent sources. The functonal-form emprcal lkelhood method proposed n ths paper provdes a unfed approach of handlng such constrants wthout usng dstrbutonal assumptons on the sample observaton. 
FEL methods allow us

to set a more flexible objective function as well as a flexible weight function. Thus, computational efficiency can be achieved by finding a simple weight function in the FEL method. For example, in the simulation study, the computing time for the ET method is much shorter than the computing time for the EL method. The FEL method can be used to provide a likelihood ratio test with a chi-square limiting distribution. Also, a profile likelihood ratio test can be derived using the orthogonality of the log-likelihood functions. To improve the coverage properties of the FEL for small sample sizes, techniques such as bootstrap calibration (Hall and Horowitz, 1996) or the Bartlett correction (Chen and Cui, 2006) can be used. Further investigation in this direction, including the higher-order expansion as in Liu and Chen (2010), is not discussed here and will be a topic of future research.

Figure 2.1 Parameter estimations versus penalty parameter. (Horizontal axis: K. Plot labels: muy, lambda1, lambda2, emuy, elambda, elambda2.)

Table 2.1 Monte Carlo relative efficiency of the point estimators. Columns: Model (Normal, Non-normal), Situation (S1 to S4), Sample size (three values of n per situation), EL, ET1, ET2.

Table 2.2 Power comparisons for testing $H_0: \rho = 0$. Columns: Model (Normal, Non-normal), Method (Pearson, EL, ET1, ET2), Sample size (three values of n per method), and one column of Monte Carlo power for each value of ρ.
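The Monte Carlo power reported in Table 2.2 is a relative frequency of rejections across simulated samples. The following is a minimal sketch of that computation for a normal-based test of $H_0: \rho = 0$. It assumes bivariate normal data with unit variances and uses the Fisher z approximation, which is swapped in here for whichever exact form of the Pearson test the study used; the function names, the number of replications, and the seed are hypothetical.

```python
import math
import random
from statistics import NormalDist


def sample_correlation(xs, ys):
    """Pearson sample correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def mc_power(rho, n, reps=2000, alpha=0.05, seed=1):
    """Relative frequency of rejecting H0: rho = 0 over `reps` simulated samples.

    Assumption: the test statistic is Fisher's z, atanh(r) * sqrt(n - 3),
    compared with a two-sided standard normal critical value.
    """
    rng = random.Random(seed)
    crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    rejections = 0
    for _ in range(reps):
        xs, ys = [], []
        for _ in range(n):
            x = rng.gauss(0.0, 1.0)
            e = rng.gauss(0.0, 1.0)
            xs.append(x)
            # y has unit variance and correlation rho with x
            ys.append(rho * x + math.sqrt(1.0 - rho ** 2) * e)
        z = math.atanh(sample_correlation(xs, ys)) * math.sqrt(n - 3)
        if abs(z) > crit:
            rejections += 1
    return rejections / reps


if __name__ == "__main__":
    for rho in (0.0, 0.1, 0.2, 0.3, 0.4, 0.5):
        print(rho, mc_power(rho, n=100))
```

At ρ = 0 the returned value estimates the size of the test, so it should be close to α; for ρ > 0 it estimates the power. Replacing the Fisher z statistic with a profile likelihood ratio statistic would give the analogous computation for the FEL tests.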

CHAPTER 3. A UNIFIED THEORY ON EMPIRICAL LIKELIHOOD METHODS WITH MISSING DATA AND SURVEY SAMPLING

A paper submitted to the Australian and New Zealand Journal of Statistics (revision invited)

Sixia Chen and Jae Kwang Kim

Abstract

Efficient estimation with missing data is an important practical problem with many application areas. Survey sampling can be treated as a missing data problem where the sample is treated as a realization of a known response mechanism. Parameter estimation under nonresponse is considered when the parameter is defined as a solution to an estimating equation. Using a response probability model, a complete-response empirical likelihood method can be constructed and the nonparametric maximum likelihood estimator can be obtained by solving the weighted estimating equation where the weights are computed by maximizing the complete-response empirical likelihood subject to the constraints that incorporate the auxiliary information obtained from the full sample. Often the constraints are constructed from the working outcome regression model for the conditional distribution of the estimating function given the observation. The proposed method achieves the semi-parametric lower bound when we correctly specify the conditional expectation of the estimating function, regardless of whether the response probability is known or estimated. When the response probability is estimated nonparametrically, the resulting empirical likelihood method automatically achieves the semi-parametric lower bound without specifying the conditional distribution of the estimating function. The proposed method is also applicable to two-phase sampling. Asymptotic theories

are derived and simulation studies are also presented.

Key Words: Missing at random; Nonparametric estimation; Response mechanism; Propensity score.

3.1 Introduction

The empirical likelihood (EL) method, proposed by Owen (1988, 1990), has become a very powerful tool for nonparametric inference in statistics. It uses a likelihood-based approach without having to make a parametric distributional assumption about the data. Thus, the EL method often leads to efficient estimation and enables likelihood-ratio type inference. Qin and Lawless (1994) considered the situation where the parameter of interest is the solution to a system of estimating equations. Owen (2001) provides a comprehensive overview of the EL method.

In the presence of missing data or survey data, however, the EL method is not directly applicable and some adjustment needs to be made. Qin (1993) addressed this problem using the biased sampling argument of Vardi (1985). Wang and Rao (2002) used regression-type imputation approaches to empirical likelihood inference. Wang and Chen (2009) used a nonparametric regression imputation approach to handle missing data in empirical likelihood inference. The imputation approach uses some assumptions about the missing data given the observed data and usually assumes that the response mechanism is ignorable in the sense of Rubin (1976). Under an ignorable missing mechanism, explicit modeling of the response mechanism is avoided.

In the case of survey sampling, Chen and Sitter (1999) considered the pseudo empirical likelihood estimator that uses the sampling weight in the empirical log-likelihood function. Kim (2009) considered an alternative empirical likelihood function based on the biased sampling likelihood of Vardi (1985) and Qin (1993). Wu and Rao (2006) discussed interval estimation using the pseudo empirical likelihood. Note that survey sampling can be treated as a special case of the missing data problem, where the sample is obtained by a planned missing mechanism and the first-order sample inclusion probability corresponds to the response probability in the usual missing data problem.
The main difference is that the sample inclusion probabilities are

known in survey sampling, as the missing mechanism is planned by the sampling design.

In this paper, we consider an alternative approach to handling missing data using a model for the response probability. The use of a parametric response probability model in empirical likelihood inference has been considered in Qin and Zhang (2007) and in Chen et al. (2008). Qin et al. (2009) and Tan (2011) considered using EL to model the complete likelihood, where the nonparametric likelihood function is computed for the whole sample, including the units with missing data. The use of the complete likelihood attains full efficiency and also provides a nice theory of the limiting chi-square distribution of the likelihood ratio test statistic. However, in some practical cases, the unit-level information for the complete likelihood is not available and the complete likelihood cannot be computed. For example, in survey sampling, the individual values of the auxiliary variable in the non-sampled part are not usually available. In this case, the approach of using the complete likelihood for the finite population may not be applicable.

If the response mechanism is nonparametrically modeled, the literature is somewhat sparse. Cheng (1994) discussed some asymptotic properties of the mean estimator using the kernel regression method to estimate the conditional outcome regression model under an ignorable missing case. Recently, Kim and Yu (2011) extended the approach of Cheng (1994) to handle nonignorable nonresponse. Xue (2009) discussed an empirical likelihood method for linear models using the weights computed from a nonparametric model where the kernel regression method is used to estimate the response model. Da Silva and Opsomer (2009) considered another type of nonparametric response probability estimation using local polynomial regression. Hirano et al. (2003) and Cattaneo (2010) discussed the semiparametric efficiency of nonparametric response propensity estimators in the context of estimating the average treatment effect in econometrics.
In this paper, we propose a unified approach to the EL method with missing data that avoids using the complete likelihood. Under the estimating function setup of Qin and Lawless (1994), the proposed method can handle the situation regardless of whether the response probabilities are known or estimated, parametrically or even nonparametrically. When the response probabilities are known, the proposed method can be applied to survey weighting

problems when the first-order inclusion probabilities are known. Incorporating the population-level auxiliary information into the weights in the sample is an important problem in survey sampling and is often called calibration weighting. Calibration weighting is considered in Deville and Särndal (1992), Fuller (2002), and Kim and Park (2010), among others. The proposed method is directly applicable to the calibration weighting problem.

When the response probabilities are estimated from a parametric model, the proposed method under an ignorable response mechanism is similar to the method of Qin and Zhang (2007). The proposed method is directly applicable to the problem of propensity score weighting. The propensity score weighting method can be found, for example, in Durrant and Skinner (2006), Kim and Kim (2007), and Chang and Kott (2008). We show that employing the EL method with a suitable choice of control variable leads to efficient estimation in the sense that it achieves the lower bound of the asymptotic variance. The optimal choice of the control variable requires correct specification of the conditional distribution of the missing data given the observation. Under the nonparametric propensity score method, which will be discussed in Section 5, the lower bound of the asymptotic variance can be achieved without correctly specifying the conditional distribution.

In Section 2, we first review the existing methods of empirical likelihood under missing data and discuss a unified approach to the EL method. Asymptotic properties of the proposed estimator under known response probabilities are discussed in Section 3. The proposed EL estimator under estimated response probability is discussed in Section 4. Use of the nonparametric response model for the EL approach is discussed in Section 5. The proposed method is extended to two-phase sampling in Section 6. Results from two simulation studies are reported in Section 7.

3.2 Basic setup

Consider a multivariate random variable $(X, Y)$ with distribution function $F(x, y)$ which is completely unspecified except that $E\{U(X, Y; \theta_0)\} = 0$ for some $\theta_0$.
We are interested in estimating the parameter $\theta_0$ from a random sample from the distribution. To avoid unnecessary details, we assume that the solution to $E\{U(X, Y; \theta)\} = 0$ is unique. For simplicity, we assume

that the dimension of $U$ is equal to the dimension of $\theta$. If $(x_i, y_i)$, $i = 1, 2, \ldots, n$, are $n$ independent realizations of the random variable $(X, Y)$, a consistent estimator of $\theta_0$ can be obtained by solving

$$\sum_{i=1}^{n} U(x_i, y_i; \theta) = 0. \tag{3.1}$$

In this paper, we consider the problem of estimating $\theta_0$ when $x_i$ is always observed and $y_i$ is subject to missingness. Let $r_i = 1$ if $y_i$ is observed and $r_i = 0$ otherwise. We consider an approach based on the empirical likelihood (EL) method. To explain the idea, first note that the joint density of the observed data can be written as

$$p^{n_r} (1 - p)^{n - n_r} \prod_{r_i = 1} f(x_i, y_i \mid r_i = 1) \prod_{r_i = 0} f(x_i \mid r_i = 0), \tag{3.2}$$

where $n_r$ is the response sample size, $p = \Pr(r = 1)$, $f(x, y \mid r)$ is the conditional density of $(X, Y)$ given $r$, and $f(x \mid r = 0) = \int f(x, y \mid r = 0)\, dy$ is the marginal density of $X$ among $r = 0$. In the empirical likelihood approach, the distribution is assumed to have its support on the sample observations. Let $F_1(x, y) = \Pr(X \le x, Y \le y \mid r = 1)$ and $F_0(x, y) = \Pr(X \le x, Y \le y \mid r = 0)$. Under the empirical likelihood approach, we can express

$$F_1(x, y) = \sum_{r_i = 1} \omega_i I(x_i \le x, y_i \le y), \tag{3.3}$$

where $\sum_{r_i = 1} \omega_i = 1$, $\omega_i$ is the point mass assigned to $(x_i, y_i)$ in the nonparametric distribution of $F_1(x, y)$, and $I(B)$ is an indicator function for event $B$. To express $F_0(x, y)$ using $\omega_i$, note that we can write

$$f(x, y \mid r = 0) = f(x, y \mid r = 1)\, \frac{\mathrm{Odd}(x, y)}{E\{\mathrm{Odd}(X, Y) \mid r = 1\}},$$

where $\mathrm{Odd}(x, y) = \Pr(r = 0 \mid x, y) / \Pr(r = 1 \mid x, y)$. Thus, we can express $F_0(x, y) = \Pr(X \le x, Y \le y \mid r = 0)$ by

$$F_0(x, y) = \frac{\sum_{r_i = 1} \omega_i O_i I(x_i \le x, y_i \le y)}{\sum_{r_i = 1} \omega_i O_i}, \tag{3.4}$$

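where $O_i = \mathrm{Odd}(x_i, y_i)$. Note that $F_0(x, y)$ is completely determined by two factors: $\omega_i$ and $O_i$. The factor $\omega_i$ is determined by the distribution $F_1(x, y)$ and the factor $O_i$ is determined by the response mechanism. If $\mathrm{Odd}(x, y)$ is a known function of $(x, y)$, then we have only to determine $\omega_i$. From (3.4), the joint distribution of $(x, y)$ can be written as

$$F_w(x, y) = p \sum_{r_i = 1} \omega_i I(x_i \le x, y_i \le y) + (1 - p)\, \frac{\sum_{r_i = 1} \omega_i O_i I(x_i \le x, y_i \le y)}{\sum_{r_i = 1} \omega_i O_i}$$

$$= p \left\{ \sum_{r_i = 1} \omega_i I(x_i \le x, y_i \le y) + (1/p - 1)\, \frac{\sum_{r_i = 1} \omega_i O_i I(x_i \le x, y_i \le y)}{\sum_{r_i = 1} \omega_i O_i} \right\}.$$

Note that (3.3) implies $\sum_{r_i = 1} \omega_i O_i = 1/p - 1$ and

$$\sum_{r_i = 1} \omega_i (O_i + 1) = E\left\{ \frac{1}{\pi(X, Y)} \,\Big|\, r = 1 \right\} = \int \frac{1}{\pi(x, y)}\, f(x, y \mid r = 1)\, dx\, dy = \int \frac{1}{\pi(x, y)}\, \frac{\pi(x, y) f(x, y)}{p}\, dx\, dy = 1/p.$$

Thus, we have

$$F_w(x, y) = \frac{\sum_{r_i = 1} \omega_i (1 + O_i) I(x_i \le x, y_i \le y)}{\sum_{r_i = 1} \omega_i (O_i + 1)}.$$

We propose maximizing the partial likelihood $\prod_{r_i = 1} f(x_i, y_i \mid r_i = 1)$ in (3.2) in constructing the empirical likelihood. Thus, the proposed empirical likelihood approach can be formulated as maximizing

$$l_e(\theta) = \sum_{r_i = 1} \log(\omega_i), \tag{3.5}$$

subject to

$$\sum_{r_i = 1} \omega_i = 1, \qquad \sum_{r_i = 1} \omega_i (1 + O_i) U(x_i, y_i; \theta) = 0. \tag{3.6}$$

Note that, in constraint (3.6), the observed values of $x_i$ with $r_i = 0$ are not used. To incorporate the partial information, we can impose

$$\frac{\sum_{r_i = 1} \omega_i (1 + O_i) h(x_i; \theta)}{\sum_{r_i = 1} \omega_i (1 + O_i)} = n^{-1} \sum_{i=1}^{n} h(x_i; \theta) \tag{3.7}$$

as an additional constraint for some $h(x_i; \theta)$. The choice of $h(x; \theta)$ will be discussed later.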
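For readers who want the explicit form of the weights, the maximization in (3.5) subject to (3.6) at a fixed $\theta$ follows the usual Lagrange multiplier argument for empirical likelihood, as in Qin and Lawless (1994). The sketch below is not taken from this chapter: the shorthand $g_i(\theta)$ and the multipliers $\lambda$ and $\gamma$ are notation introduced here for illustration.

```latex
% Shorthand introduced here: g_i(\theta) = (1 + O_i)\, U(x_i, y_i; \theta)
\begin{align*}
\mathcal{L} &= \sum_{r_i = 1} \log \omega_i
  - n_r \lambda^{\top} \sum_{r_i = 1} \omega_i g_i(\theta)
  - \gamma \Big( \sum_{r_i = 1} \omega_i - 1 \Big), \\
\frac{\partial \mathcal{L}}{\partial \omega_i} = 0
  &\;\Longrightarrow\; \frac{1}{\omega_i} = \gamma + n_r \lambda^{\top} g_i(\theta), \\
\sum_{r_i = 1} \omega_i \frac{\partial \mathcal{L}}{\partial \omega_i} = 0
  &\;\Longrightarrow\; \gamma = n_r, \\
\omega_i(\theta) &= \frac{1}{n_r \{ 1 + \lambda^{\top} g_i(\theta) \}},
  \qquad \text{where } \lambda \text{ solves }
  \sum_{r_i = 1} \frac{g_i(\theta)}{1 + \lambda^{\top} g_i(\theta)} = 0 .
\end{align*}
```

When constraint (3.7) is added, the same form holds with $g_i(\theta)$ extended to stack the extra component $(1 + O_i)\{h(x_i; \theta) - n^{-1} \sum_{j=1}^{n} h(x_j; \theta)\}$, which is (3.7) rewritten as a moment condition equal to zero.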

There are several other approaches using the empirical likelihood with missing data. Qin et al. (2002) considered using empirical likelihood for nonignorable nonresponse. Wang and Rao (2002) proposed empirical likelihood-based inference under imputation for missing response data. Qin and Zhang (2007) proposed an empirical likelihood method for estimating the mean response under ignorable missing data where the response probability $\pi_i = \Pr(r_i = 1 \mid X_i)$ is parametrically modeled by $\pi_i = \pi_i(\phi_0)$ for some $\phi_0$. Specifically, they proposed maximizing $l = \sum_{r_i = 1} \log\{\pi_i(\hat{\phi})\, p_i / \hat{\nu}\}$, subject to

$$\sum_{r_i = 1} p_i = 1, \qquad \sum_{r_i = 1} p_i \pi_i(\hat{\phi}) = \hat{\nu}, \qquad \sum_{r_i = 1} p_i h(x_i) = n^{-1} \sum_{i=1}^{n} h(x_i), \tag{3.8}$$

where $\hat{\phi}$ is the maximum likelihood estimator of $\phi_0$ in the response probability, $h(x_i)$ is an arbitrary variable, and $\hat{\nu} = n^{-1} \sum_{i=1}^{n} \pi_i(\hat{\phi})$. Once the estimated probability $\hat{p}_i$ is computed by the above maximization procedure, the population mean can be estimated by $\hat{\theta} = \sum_{r_i = 1} \hat{p}_i y_i$.

Chen et al. (2008) built two empirical likelihoods for the response and non-response variables separately and formulated two estimating equations based on these two empirical likelihoods. In the context of the current setup, their proposed method can be described as maximizing $l = \sum_{r_i = 1} \log(p_i) + \sum_{r_j = 0} \log(q_j)$, subject to $\sum_{r_i = 1} p_i = 1$, $p_i \ge 0$, $\sum_{r_j = 0} q_j = 1$, $q_j \ge 0$, and

$$\sum_{r_i = 1} p_i\, \frac{h(x_i; \theta) - \mu}{\pi_i(\hat{\phi})} = 0, \qquad \sum_{r_j = 0} q_j\, \frac{h(x_j; \theta) - \mu}{1 - \pi_j(\hat{\phi})} = 0, \tag{3.9}$$

where $\hat{\phi}$ is the maximum likelihood estimator.

Qin et al. (2009) considered maximizing the complete likelihood $l_c = \sum_{i=1}^{n} \log(\omega_i)$ subject to

$$\sum_{i=1}^{n} \omega_i = 1, \qquad \sum_{i=1}^{n} \omega_i \left( \frac{r_i}{\pi_i} - 1 \right) h_i(\theta) = 0, \qquad \sum_{i=1}^{n} \omega_i\, \frac{r_i}{\pi_i}\, U_i(\theta) = 0, \tag{3.10}$$

and

$$\sum_{i=1}^{n} \omega_i \{ r_i - \pi_i(\phi) \}\, \frac{\partial \pi_i(\phi) / \partial \phi}{\pi_i(\phi) \{ 1 - \pi_i(\phi) \}} = 0. \tag{3.11}$$

The computation requires that the individual values of $x_i$ for $r_i = 0$ be available, which is not always possible, as discussed in Section 1. For example, in a survey sampling problem, we only observe $(x_i, y_i)$ for $r_i = 1$, and only the aggregate information $\bar{x}_n = n^{-1} \sum_{i=1}^{n} x_i$ is available. In this case, the method of Qin et al. (2009) is not applicable.
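The setting in which only the respondents' $(x_i, y_i)$ and the aggregate $\bar{x}_n$ are available is the one that constraints (3.5) to (3.7) are designed for. The following is a minimal sketch under stated assumptions: the target is the mean of $Y$, so $U = y - \theta$; the control variable is the scalar $h(x) = x$; and the response probability $\pi_i$ is known and depends on $x$ only, so that $1 + O_i = 1/\pi_i$. Because $U$ has the same dimension as $\theta$, the sketch takes the weights from (3.7) alone and then solves (3.6) for $\theta$, which is the standard just-identified argument rather than something stated in this excerpt. The function names, the simulated model, and the bisection solver are illustrative choices, not the authors' implementation.

```python
import math
import random


def el_weights(g, iters=200):
    """Maximize sum(log w_i) subject to sum(w_i) = 1 and sum(w_i * g_i) = 0.

    The solution has the form w_i = 1 / (n_r * (1 + lam * g_i)). The scalar
    multiplier lam is found by bisection, since the score is decreasing in lam.
    """
    g_min, g_max = min(g), max(g)
    if not (g_min < 0.0 < g_max):
        raise ValueError("zero must lie strictly inside the range of g")
    # Stay inside the interval where every 1 + lam * g_i is positive.
    lo = (-1.0 / g_max) * (1.0 - 1e-9)
    hi = (-1.0 / g_min) * (1.0 - 1e-9)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        score = sum(gi / (1.0 + mid * gi) for gi in g)
        if score > 0.0:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    n_r = len(g)
    return [1.0 / (n_r * (1.0 + lam * gi)) for gi in g]


def el_mean(x_resp, y_resp, pi_resp, x_bar_full):
    """Estimate E(Y) from respondents, using only the full-sample mean of x."""
    inv_pi = [1.0 / p for p in pi_resp]  # 1 + O_i when pi depends on x only
    # Constraint (3.7) with h(x) = x, written as a zero-mean moment condition.
    g = [a * (x - x_bar_full) for a, x in zip(inv_pi, x_resp)]
    w = el_weights(g)
    # Constraint (3.6) with U = y - theta, solved for theta.
    denom = sum(wi * a for wi, a in zip(w, inv_pi))
    theta_hat = sum(wi * a * y for wi, a, y in zip(w, inv_pi, y_resp)) / denom
    return theta_hat, w


def demo(n=1000, seed=7):
    """Simulated example (hypothetical model): y = 1 + 2x + e, logistic response."""
    rng = random.Random(seed)
    x_all, x_resp, y_resp, pi_resp = [], [], [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        y = 1.0 + 2.0 * x + rng.gauss(0.0, 1.0)
        pi = 1.0 / (1.0 + math.exp(-(0.5 + 0.5 * x)))
        x_all.append(x)
        if rng.random() < pi:  # r_i = 1: the unit responds
            x_resp.append(x)
            y_resp.append(y)
            pi_resp.append(pi)
    x_bar = sum(x_all) / n
    theta_hat, w = el_mean(x_resp, y_resp, pi_resp, x_bar)
    denom = sum(wi / p for wi, p in zip(w, pi_resp))
    x_cal = sum(wi * x / p for wi, x, p in zip(w, x_resp, pi_resp)) / denom
    return {"theta_hat": theta_hat, "weights": w, "x_cal": x_cal, "x_bar": x_bar}


if __name__ == "__main__":
    out = demo()
    print("theta_hat:", out["theta_hat"])
    print("calibrated mean of x:", out["x_cal"], "full-sample mean:", out["x_bar"])
```

In this sketch the weighted mean of $x$ among respondents reproduces $\bar{x}_n$, which is the calibration property imposed by (3.7); no individual $x_i$ with $r_i = 0$ enters the estimator. A vector-valued $h$ would need a multivariate solver for the multiplier in place of the bisection step.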


More information

LOGIT ANALYSIS. A.K. VASISHT Indian Agricultural Statistics Research Institute, Library Avenue, New Delhi

LOGIT ANALYSIS. A.K. VASISHT Indian Agricultural Statistics Research Institute, Library Avenue, New Delhi LOGIT ANALYSIS A.K. VASISHT Indan Agrcultural Statstcs Research Insttute, Lbrary Avenue, New Delh-0 02 amtvassht@asr.res.n. Introducton In dummy regresson varable models, t s assumed mplctly that the dependent

More information

Basically, if you have a dummy dependent variable you will be estimating a probability.

Basically, if you have a dummy dependent variable you will be estimating a probability. ECON 497: Lecture Notes 13 Page 1 of 1 Metropoltan State Unversty ECON 497: Research and Forecastng Lecture Notes 13 Dummy Dependent Varable Technques Studenmund Chapter 13 Bascally, f you have a dummy

More information

Statistics for Economics & Business

Statistics for Economics & Business Statstcs for Economcs & Busness Smple Lnear Regresson Learnng Objectves In ths chapter, you learn: How to use regresson analyss to predct the value of a dependent varable based on an ndependent varable

More information

Systems of Equations (SUR, GMM, and 3SLS)

Systems of Equations (SUR, GMM, and 3SLS) Lecture otes on Advanced Econometrcs Takash Yamano Fall Semester 4 Lecture 4: Sstems of Equatons (SUR, MM, and 3SLS) Seemngl Unrelated Regresson (SUR) Model Consder a set of lnear equatons: $ + ɛ $ + ɛ

More information

Module 3 LOSSY IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur

Module 3 LOSSY IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur Module 3 LOSSY IMAGE COMPRESSION SYSTEMS Verson ECE IIT, Kharagpur Lesson 6 Theory of Quantzaton Verson ECE IIT, Kharagpur Instructonal Objectves At the end of ths lesson, the students should be able to:

More information

Difference Equations

Difference Equations Dfference Equatons c Jan Vrbk 1 Bascs Suppose a sequence of numbers, say a 0,a 1,a,a 3,... s defned by a certan general relatonshp between, say, three consecutve values of the sequence, e.g. a + +3a +1

More information

Department of Quantitative Methods & Information Systems. Time Series and Their Components QMIS 320. Chapter 6

Department of Quantitative Methods & Information Systems. Time Series and Their Components QMIS 320. Chapter 6 Department of Quanttatve Methods & Informaton Systems Tme Seres and Ther Components QMIS 30 Chapter 6 Fall 00 Dr. Mohammad Zanal These sldes were modfed from ther orgnal source for educatonal purpose only.

More information

Maximum Likelihood Estimation of Binary Dependent Variables Models: Probit and Logit. 1. General Formulation of Binary Dependent Variables Models

Maximum Likelihood Estimation of Binary Dependent Variables Models: Probit and Logit. 1. General Formulation of Binary Dependent Variables Models ECO 452 -- OE 4: Probt and Logt Models ECO 452 -- OE 4 Mamum Lkelhood Estmaton of Bnary Dependent Varables Models: Probt and Logt hs note demonstrates how to formulate bnary dependent varables models for

More information

Factor models with many assets: strong factors, weak factors, and the two-pass procedure

Factor models with many assets: strong factors, weak factors, and the two-pass procedure Factor models wth many assets: strong factors, weak factors, and the two-pass procedure Stanslav Anatolyev 1 Anna Mkusheva 2 1 CERGE-EI and NES 2 MIT December 2017 Stanslav Anatolyev and Anna Mkusheva

More information

Using the estimated penetrances to determine the range of the underlying genetic model in casecontrol

Using the estimated penetrances to determine the range of the underlying genetic model in casecontrol Georgetown Unversty From the SelectedWorks of Mark J Meyer 8 Usng the estmated penetrances to determne the range of the underlyng genetc model n casecontrol desgn Mark J Meyer Neal Jeffres Gang Zheng Avalable

More information

Chapter 13: Multiple Regression

Chapter 13: Multiple Regression Chapter 13: Multple Regresson 13.1 Developng the multple-regresson Model The general model can be descrbed as: It smplfes for two ndependent varables: The sample ft parameter b 0, b 1, and b are used to

More information

The exam is closed book, closed notes except your one-page cheat sheet.

The exam is closed book, closed notes except your one-page cheat sheet. CS 89 Fall 206 Introducton to Machne Learnng Fnal Do not open the exam before you are nstructed to do so The exam s closed book, closed notes except your one-page cheat sheet Usage of electronc devces

More information

Joint Statistical Meetings - Biopharmaceutical Section

Joint Statistical Meetings - Biopharmaceutical Section Iteratve Ch-Square Test for Equvalence of Multple Treatment Groups Te-Hua Ng*, U.S. Food and Drug Admnstraton 1401 Rockvlle Pke, #200S, HFM-217, Rockvlle, MD 20852-1448 Key Words: Equvalence Testng; Actve

More information

First Year Examination Department of Statistics, University of Florida

First Year Examination Department of Statistics, University of Florida Frst Year Examnaton Department of Statstcs, Unversty of Florda May 7, 010, 8:00 am - 1:00 noon Instructons: 1. You have four hours to answer questons n ths examnaton.. You must show your work to receve

More information

CSci 6974 and ECSE 6966 Math. Tech. for Vision, Graphics and Robotics Lecture 21, April 17, 2006 Estimating A Plane Homography

CSci 6974 and ECSE 6966 Math. Tech. for Vision, Graphics and Robotics Lecture 21, April 17, 2006 Estimating A Plane Homography CSc 6974 and ECSE 6966 Math. Tech. for Vson, Graphcs and Robotcs Lecture 21, Aprl 17, 2006 Estmatng A Plane Homography Overvew We contnue wth a dscusson of the major ssues, usng estmaton of plane projectve

More information

/ n ) are compared. The logic is: if the two

/ n ) are compared. The logic is: if the two STAT C141, Sprng 2005 Lecture 13 Two sample tests One sample tests: examples of goodness of ft tests, where we are testng whether our data supports predctons. Two sample tests: called as tests of ndependence

More information

Multivariate Ratio Estimator of the Population Total under Stratified Random Sampling

Multivariate Ratio Estimator of the Population Total under Stratified Random Sampling Open Journal of Statstcs, 0,, 300-304 ttp://dx.do.org/0.436/ojs.0.3036 Publsed Onlne July 0 (ttp://www.scrp.org/journal/ojs) Multvarate Rato Estmator of te Populaton Total under Stratfed Random Samplng

More information

Lecture 12: Discrete Laplacian

Lecture 12: Discrete Laplacian Lecture 12: Dscrete Laplacan Scrbe: Tanye Lu Our goal s to come up wth a dscrete verson of Laplacan operator for trangulated surfaces, so that we can use t n practce to solve related problems We are mostly

More information

Lecture 3: Probability Distributions

Lecture 3: Probability Distributions Lecture 3: Probablty Dstrbutons Random Varables Let us begn by defnng a sample space as a set of outcomes from an experment. We denote ths by S. A random varable s a functon whch maps outcomes nto the

More information

Lecture 4 Hypothesis Testing

Lecture 4 Hypothesis Testing Lecture 4 Hypothess Testng We may wsh to test pror hypotheses about the coeffcents we estmate. We can use the estmates to test whether the data rejects our hypothess. An example mght be that we wsh to

More information

NUMERICAL DIFFERENTIATION

NUMERICAL DIFFERENTIATION NUMERICAL DIFFERENTIATION 1 Introducton Dfferentaton s a method to compute the rate at whch a dependent output y changes wth respect to the change n the ndependent nput x. Ths rate of change s called the

More information

Gaussian Mixture Models

Gaussian Mixture Models Lab Gaussan Mxture Models Lab Objectve: Understand the formulaton of Gaussan Mxture Models (GMMs) and how to estmate GMM parameters. You ve already seen GMMs as the observaton dstrbuton n certan contnuous

More information

An (almost) unbiased estimator for the S-Gini index

An (almost) unbiased estimator for the S-Gini index An (almost unbased estmator for the S-Gn ndex Thomas Demuynck February 25, 2009 Abstract Ths note provdes an unbased estmator for the absolute S-Gn and an almost unbased estmator for the relatve S-Gn for

More information

9. Binary Dependent Variables

9. Binary Dependent Variables 9. Bnar Dependent Varables 9. Homogeneous models Log, prob models Inference Tax preparers 9.2 Random effects models 9.3 Fxed effects models 9.4 Margnal models and GEE Appendx 9A - Lkelhood calculatons

More information

Hidden Markov Models & The Multivariate Gaussian (10/26/04)

Hidden Markov Models & The Multivariate Gaussian (10/26/04) CS281A/Stat241A: Statstcal Learnng Theory Hdden Markov Models & The Multvarate Gaussan (10/26/04) Lecturer: Mchael I. Jordan Scrbes: Jonathan W. Hu 1 Hdden Markov Models As a bref revew, hdden Markov models

More information

MATH 829: Introduction to Data Mining and Analysis The EM algorithm (part 2)

MATH 829: Introduction to Data Mining and Analysis The EM algorithm (part 2) 1/16 MATH 829: Introducton to Data Mnng and Analyss The EM algorthm (part 2) Domnque Gullot Departments of Mathematcal Scences Unversty of Delaware Aprl 20, 2016 Recall 2/16 We are gven ndependent observatons

More information

x = , so that calculated

x = , so that calculated Stat 4, secton Sngle Factor ANOVA notes by Tm Plachowsk n chapter 8 we conducted hypothess tests n whch we compared a sngle sample s mean or proporton to some hypotheszed value Chapter 9 expanded ths to

More information

Estimation of the Mean of Truncated Exponential Distribution

Estimation of the Mean of Truncated Exponential Distribution Journal of Mathematcs and Statstcs 4 (4): 84-88, 008 ISSN 549-644 008 Scence Publcatons Estmaton of the Mean of Truncated Exponental Dstrbuton Fars Muslm Al-Athar Department of Mathematcs, Faculty of Scence,

More information

STAT 511 FINAL EXAM NAME Spring 2001

STAT 511 FINAL EXAM NAME Spring 2001 STAT 5 FINAL EXAM NAME Sprng Instructons: Ths s a closed book exam. No notes or books are allowed. ou may use a calculator but you are not allowed to store notes or formulas n the calculator. Please wrte

More information

Introduction to Generalized Linear Models

Introduction to Generalized Linear Models INTRODUCTION TO STATISTICAL MODELLING TRINITY 00 Introducton to Generalzed Lnear Models I. Motvaton In ths lecture we extend the deas of lnear regresson to the more general dea of a generalzed lnear model

More information

Comparison of the Population Variance Estimators. of 2-Parameter Exponential Distribution Based on. Multiple Criteria Decision Making Method

Comparison of the Population Variance Estimators. of 2-Parameter Exponential Distribution Based on. Multiple Criteria Decision Making Method Appled Mathematcal Scences, Vol. 7, 0, no. 47, 07-0 HIARI Ltd, www.m-hkar.com Comparson of the Populaton Varance Estmators of -Parameter Exponental Dstrbuton Based on Multple Crtera Decson Makng Method

More information

A Hybrid Variational Iteration Method for Blasius Equation

A Hybrid Variational Iteration Method for Blasius Equation Avalable at http://pvamu.edu/aam Appl. Appl. Math. ISSN: 1932-9466 Vol. 10, Issue 1 (June 2015), pp. 223-229 Applcatons and Appled Mathematcs: An Internatonal Journal (AAM) A Hybrd Varatonal Iteraton Method

More information

Conjugacy and the Exponential Family

Conjugacy and the Exponential Family CS281B/Stat241B: Advanced Topcs n Learnng & Decson Makng Conjugacy and the Exponental Famly Lecturer: Mchael I. Jordan Scrbes: Bran Mlch 1 Conjugacy In the prevous lecture, we saw conjugate prors for the

More information

Markov Chain Monte Carlo Lecture 6

Markov Chain Monte Carlo Lecture 6 where (x 1,..., x N ) X N, N s called the populaton sze, f(x) f (x) for at least one {1, 2,..., N}, and those dfferent from f(x) are called the tral dstrbutons n terms of mportance samplng. Dfferent ways

More information

On Outlier Robust Small Area Mean Estimate Based on Prediction of Empirical Distribution Function

On Outlier Robust Small Area Mean Estimate Based on Prediction of Empirical Distribution Function On Outler Robust Small Area Mean Estmate Based on Predcton of Emprcal Dstrbuton Functon Payam Mokhtaran Natonal Insttute of Appled Statstcs Research Australa Unversty of Wollongong Small Area Estmaton

More information

Department of Statistics University of Toronto STA305H1S / 1004 HS Design and Analysis of Experiments Term Test - Winter Solution

Department of Statistics University of Toronto STA305H1S / 1004 HS Design and Analysis of Experiments Term Test - Winter Solution Department of Statstcs Unversty of Toronto STA35HS / HS Desgn and Analyss of Experments Term Test - Wnter - Soluton February, Last Name: Frst Name: Student Number: Instructons: Tme: hours. Ads: a non-programmable

More information

Chapter 5. Solution of System of Linear Equations. Module No. 6. Solution of Inconsistent and Ill Conditioned Systems

Chapter 5. Solution of System of Linear Equations. Module No. 6. Solution of Inconsistent and Ill Conditioned Systems Numercal Analyss by Dr. Anta Pal Assstant Professor Department of Mathematcs Natonal Insttute of Technology Durgapur Durgapur-713209 emal: anta.bue@gmal.com 1 . Chapter 5 Soluton of System of Lnear Equatons

More information

Goodness of fit and Wilks theorem

Goodness of fit and Wilks theorem DRAFT 0.0 Glen Cowan 3 June, 2013 Goodness of ft and Wlks theorem Suppose we model data y wth a lkelhood L(µ) that depends on a set of N parameters µ = (µ 1,...,µ N ). Defne the statstc t µ ln L(µ) L(ˆµ),

More information

Chapter 8 Indicator Variables

Chapter 8 Indicator Variables Chapter 8 Indcator Varables In general, e explanatory varables n any regresson analyss are assumed to be quanttatve n nature. For example, e varables lke temperature, dstance, age etc. are quanttatve n

More information

Econ Statistical Properties of the OLS estimator. Sanjaya DeSilva

Econ Statistical Properties of the OLS estimator. Sanjaya DeSilva Econ 39 - Statstcal Propertes of the OLS estmator Sanjaya DeSlva September, 008 1 Overvew Recall that the true regresson model s Y = β 0 + β 1 X + u (1) Applyng the OLS method to a sample of data, we estmate

More information

MLE and Bayesian Estimation. Jie Tang Department of Computer Science & Technology Tsinghua University 2012

MLE and Bayesian Estimation. Jie Tang Department of Computer Science & Technology Tsinghua University 2012 MLE and Bayesan Estmaton Je Tang Department of Computer Scence & Technology Tsnghua Unversty 01 1 Lnear Regresson? As the frst step, we need to decde how we re gong to represent the functon f. One example:

More information

Lecture 6: Introduction to Linear Regression

Lecture 6: Introduction to Linear Regression Lecture 6: Introducton to Lnear Regresson An Manchakul amancha@jhsph.edu 24 Aprl 27 Lnear regresson: man dea Lnear regresson can be used to study an outcome as a lnear functon of a predctor Example: 6

More information

4.3 Poisson Regression

4.3 Poisson Regression of teratvely reweghted least squares regressons (the IRLS algorthm). We do wthout gvng further detals, but nstead focus on the practcal applcaton. > glm(survval~log(weght)+age, famly="bnomal", data=baby)

More information

LINEAR REGRESSION ANALYSIS. MODULE IX Lecture Multicollinearity

LINEAR REGRESSION ANALYSIS. MODULE IX Lecture Multicollinearity LINEAR REGRESSION ANALYSIS MODULE IX Lecture - 30 Multcollnearty Dr. Shalabh Department of Mathematcs and Statstcs Indan Insttute of Technology Kanpur 2 Remedes for multcollnearty Varous technques have

More information

Online Appendix to: Axiomatization and measurement of Quasi-hyperbolic Discounting

Online Appendix to: Axiomatization and measurement of Quasi-hyperbolic Discounting Onlne Appendx to: Axomatzaton and measurement of Quas-hyperbolc Dscountng José Lus Montel Olea Tomasz Strzaleck 1 Sample Selecton As dscussed before our ntal sample conssts of two groups of subjects. Group

More information

On mutual information estimation for mixed-pair random variables

On mutual information estimation for mixed-pair random variables On mutual nformaton estmaton for mxed-par random varables November 3, 218 Aleksandr Beknazaryan, Xn Dang and Haln Sang 1 Department of Mathematcs, The Unversty of Msssspp, Unversty, MS 38677, USA. E-mal:

More information