Smooth Test for Testing Equality of Two Densities 1

Size: px
Start display at page:

Download "Smooth Test for Testing Equality of Two Densities 1"

Transcription

1 Sooth Test for Testing Equality of Two Densities Anil K. Bera University of Illinois. Phone: Aurobindo Ghosh Singapore Manageent University. Phone: Zhijie Xiao University of Illinois. Phone: October, 003. This is a preliinary draft of the paper and should not be quoted without perission. Earlier version of this paper was presented at the Midwest Econoetric Group Meetings at Colubus, Ohio, 00 and at the Tinbergen Institute, University of Asterda, 003.

2 Abstract The two-saple version of the celebrated Pearson (900) goodness-of- t proble has been a topic of extensive research and several tests like the Kologorov-Sirnov, Craér-von Mises and Anderson-Darling tests have been suggested. Although these tests perfor fairly well as onibus tests for coparing two saples with di erent probability density functions (PDF), they fail to have good power particularly against very speci c alternatives like departures in location, scale, skewness and kurtosis ters. We show that a odi ed version of Neyan sooth test based on the epirical distribution functions (EDF) obtained fro the two saples rearkably iproves the detection of directions of departure. We can identify deviations in the ean, variance, skewness or kurtosis ters using what we call the ratio density function, which is the PDF of the probability integral transfor of one saple based on the other saple EDF. We derived Neyan s sooth test using Rao s score principle. We also establish a bound on the relative saple sizes of the two saples that akes our test consistent. Furtherore, we suggest an optial choice range of the saple size of one saple copared to the other to ensure inial size distortion in nite saples. As an application of our procedure we copare the age distributions of eployees with sall eployers in New York and Pennsylvania with group insurance before and after the enactent of the Counity Rating legislation in New York. It has been a conventional wisdo that if counity rating is enforced (where group health insurance preiu does not depend on age or any other physical characteristics of the insured), then the insurance arket will collapse since only older or less healthy patients would prefer group insurance. We nd that there are signi cant changes in the age distribution in the population in New York owing ainly to a shift in the location and scale. Our extensive Monte Carlo studies also indicate that the suggested test has attractive nite saple properties both in ters of size and power.

3 Introduction One of the old, celebrated probles in statistics is the two-saple version of Pearson (900) goodness-of- t proble [Lehann (953) and Darling (957)]. Suppose we have two saples X ; X ; :::; X n and Y ; Y ; :::; Y fro two unspeci ed absolutely continuous distributions with cuulative distribution functions (CDF) F (x) and G (x), respectively. The proble is to test the hypothesis H 0 : F = G: Most of the tests in the literature are based on soe distance easures between the two epirical distribution functions (EDF), F n (x) and G (x) : The Kologorov-Sirnov criterion uses [see, for instance, Darling (957, p. 88)] D n = r n n + sup jf n (x) G (x)j : () <x< Craér- von Mises statistic is based on the easure [see, for exaple, Anderson (96, p. 48)] W n = n Z [F n (x) G (x)] dh n+ (x) ; () n + where H n+ (x) is the EDF of the two saples together, i.e., H n+ (x) = [nf n (x) + G (x)]= ( + n) : Anderson and Darling (95) odi cation of () is given by [see Darling (957, p. 87)] A n = n Z [F n (x) G (x)] (H n+ (x)) dh n+ : (3) n + Here (:) is soe non-negative weight function chosen to accentuate the distance between F n (x) and G (x) where the test is desired to have sensitivity. A statistically appealing weight function (u) = [u ( u)] has the e ect of weighing the tails heavily since the function is large near u = 0 and u = [Anderson and Darling (954, p. 767)]. As Darling (957, p. 84) stated the asyptotic distribution of these statistic are based on certain distribution analogues of the Glivenko-Cantelli type lea sup jf n (x) F (x)j! 0 with probability <x< in the sae way that the central liit theore is a distribution analogue of the law of large nubers. However, the coputation and ipleentation of these tests are not easy. Moreover, being very general, these tests do not have good nite saple power properties. In this paper, we propose a uch sipler test for H 0 using Neyan (937) sooth test principle. Let us rst consider a sipler proble, the one-saple Pearson goodness-of-

4 t test of H 0 0 : F (x) = F 0 (x) where F 0 (x) is a speci ed CDF with f 0 (x) as the corresponding probability density function (PDF). De ne the probability integral transfor (PIT) z i = F 0 (x i ) = Z xi f 0 (!) d!: (4) If H 0 0 is true, then Z ; Z ; :::; Z n are independently and identically distributed (IID) as U (0; ) irrespective of F 0. And we can test H 0 by testing unifority of Z in (0; ) : Therefore, in soe sense, all testing probles can be converted into testing only one kind of hypothesis [see Neyan (937, pp. 60-6) and Bera and Ghosh (00, p. )]. Neyan (937) considered the following sooth alternative to the unifor density " # h (z) = c () exp j j (z) ; 0 < z < ; (5) where c () is the constant of integration depending on ; ; :::; k ; j (z) are orthogonal polynoials of order j satisfying Z 0 i (z) j (z) dz = ij where ij = if i = j = 0 if i 6= j: (6) We can test H 0 0 : F (x) = F 0 (x) by testing H 00 0 : = = ::: = k = 0 in (5): Using the generalized Neyan-Pearson (N-P) lea, Neyan (937) derived the locally ost powerful syetric test for H 00 0 against the alternative H :At least one i 6= 0; for sall values of i : Under H 0 ; asyptotically the test statistic k = u j k where u j = p n nx j (z i ) ; j = ; :::; k: (7) Neyan suggested this test to rectify soe of the drawbacks of Pearson (900) goodness-of- t statistic [see Bera and Ghosh (00) for ore on this issue and for a Neyan (937) used j (y) 0 s as the orthogonal polynoials which can be obtained by using the following conditions, j (y) = a j0 + a j y + ::: + a jj y j ; a jj 6= 0; given the restrictions of orthogonality given in (6). Solving these the rst ve j (y) are (Neyan 937, pp ) 0 (y) = ; (y) = p y ; (y) = p 5 6 y ; 3 3 y 3 (y) = p 7 0 y 4 (y) = 0 y 4 45 y ; + 9 8

5 historical perspective], and called it a sooth test since the alternative density (5) is close to the null density U (0; ) for sall s and has few intersections with the null density. Now turning to the proble of testing H 0 : F = G in the two saple case, let us start assuing that F (:) is known. We construct a new rando variable Z de ning Z j = F (Y j ) ; j = ; ; :::; : The CDF of Z is given by H (z) = Pr (Z z) = Pr (F (Y ) z) = Pr Y F (z) = G F (z) = G (Q (z)) (8) where Q (z) = F (z) is the quantile function of Z: Therefore, the PDF of Z can be written as [see Neyan (937, p. 6), Pearson (938, p. 38) and Bera and Ghosh (00, p. 85)] h (z) = d dz g (Q (z)) H (z) = ; 0 < z < : (9) f (Q (z)) Although this is the ratio of two PDFs, h (z) is a proper density function in that sense that h (z) 0; z (0; ) and R h (z) dz = ; if we assue that F and G 0 are also strictly increasing functions. In the literature the PDF h (:) is known under di erent naes. Cwik and Mielniczuk (989) tered it the relative density, while Parzen (99, p. 7) and Handcock and Morris (999, p. ) called it the coparison density function and the density ratio, respectively. We will call it ratio density function (RDF) since it is both a ratio of two densities and a proper density function itself. Under H 0 : F = G; h (z) = ; i.e., Z U (0; ) : And, under the alternative hypothesis H : F 6= G; h (z) will di er fro and that provides a basis for the Neyan sooth test. Under the alternative, we take h (z) as given in (5) and test = = ::: = k = 0. Therefore, the test utilizes (9) which looks ore like a likelihood ratio. To see the exact for of h (z), let us consider soe particular cases. When the two distributions di er only in location; for exaple, f (:) N (0; ) and g (:) N (; ) ; ln(h (z)) = z which is linear in z: Siilarly, if the distributions di er in scale paraeter, such as, f (:) N (0; ) and g (:) N (0; ) ; 6= ; ln (h (z)) = z ln ; a quadratic function of z: As plotted in Figure, the rst and second-order noralized polynoials (z) and (z) can capture this type of di erences in location and scale aspects of the distributions. In Figure, we provide plots of h (z) when f (:) and g (:) di er respectively in skewness and kurtosis ters. These plots very closely reseble the plots of the third and the fourth noralized Legendre polynoials 3 (z) and 4 (z) plotted in Figure 3. Therefore, we believe that the test will not only be powerful but also will be inforative on by 3

6 identifying particular source(s) of departure(s) fro H 0 : Figure : Plots of (z) and (z) : Figure. Plots of RDF with di erent skewness and kurtosis ters 4

7 Figure 3: Plots of 3 (z) and 4 (z) : This approach to testing H 0 : F = G is related to those based on the distance functions between F and G entioned earlier. Under Neyan s sooth test forulation, we consider the distance between h (z) and, i:e: d dz [G (F (z)) z] ; 0 < z < : G (F (z)) z is a failiar quantity in the literature of tests based on di erences of EDFs, such as those entioned in ()-(3). Due to the equality [see Ser ing (980, pp. 0-)] sup jf (x) <x< G (x)j = sup G F (z) z ; 0<z< we can transfor the doain fro ( ; ) to (0; ) : Moreover, p [G (F (z)) z] converges to the well-known Brownian bridge process in distribution [see, for instance, Billingsley (968, p. 04)]. Neyan s sooth test sipli es the proble even further by looking at the distance of h (z) fro towards a particular direction speci ed by (5) [see Figure B for the directions using Legendre polynoials]. In the next section we derive Neyan s sooth test using Rao s (948) score test principle for testing H 0 : = = ::: = k = 0 in (5), assuing F (:) is known. In Section 3, we allow F (:) to be unknown and use ^z j = F n (y j ) ; i.e., we get the EDF of one population and then consider the PIT of the saple fro the other population. Replacing F (:) by F n (:) requires careful theoretical considerations and these are also addressed in Section 3. 5

8 Neyan Sooth Test For siplicity we rst consider the test when F (x) is known and we use of the PIT as de ned by Z j = F (Y j ) : Neyan (937) derived a locally ost powerful syetric (regular) unbiased test (or critical region) for H 0 : = = ::: = k = 0 in (5) against H : at least i 6= 0, and he called it an unbiased critical region of type-c. This type-c critical region is an extension of the locally ost powerful unbiased (LMPU) test (type-a region) of Neyan and Pearson (936) fro a single paraeter case to a ultiparaeter situation. Let us denote the power function as ( ; ; :::; k ) = () : Assuing, that the power power function () is twice di erentiable in the neighborhood of H 0 : = 0; Neyan (937, pp ) forally required that an unbiased critical region of type-c of size should satisfy the following conditions : : (0; 0; :::0) = : (0) : j = 0; j = ; ; :::; k: () =0 3: l = 0; j; l = ; ; :::; k; j 6= l: () =0 4: j =0 5: And nally, the coon satisfying the conditions (0)-(3). ; j = ; 3; :::k: (3) =0 is the axiu over all such regions It is clear that conditions (0) and () are respectively for the size and unbiasedness. Conditions () and (3) ensures that equal departures fro = 0; in all directions should lead to the sae power. Therefore, for this type-c critical region the approxiate power function is () = + P k j : Neyan (937) used (5) as the PDF under the alternative hypothesis. After considerable algebra [see Neyan (937, pp )] and using ultiparaeter version of the generalized Neyan-Pearson les, Neyan obtained his optial test statistic. His resulting statistic, however, takes a very siple for as given below. Proposition (Neyan, 937) The type-c critical region is given by k = u j C (4) 6

9 where u j = p P j (z i ) ; and for large the critical point C is deterined fro Pr [ k C ] =. We now show that the test statistic k can siply be obtained using Rao(948) score (RS) test principle. Taking (5) as the PDF under the alternative hypothesis, the log-likelihood function l () can be written as l () = ln c () + j X j (z i ) : (5) The RS test for testing the null H 0 : = 0 is given by RS = s ( 0 ) 0 I ( 0 ) s ( 0 ) (6) h where s () is the score () =@ and I () is the inforation atrix E In our case 0 = 0: It is easy to see that s ( j ) ln c () = j X j (z i ) l() ln c () = + p u j ; j = ; ; :::; k: j Fro (5), R 0 h (z) dz = : Di erentiating this identity with respect to j Z 0 " # exp j j (z) dz + c () Z 0 " # exp j j (z) j (z) dz = 0: (8) Evaluating this under = 0; we ln c() j = 0 and under the null s ( j ) = p u j : (9) To get the inforation atrix, let us rst note fro (7) l () ln c (), l which is a constant. atrix I () is siply Therefore, under H 0 the (j; l)th eleent of the c () =@ l evaluated at = 0: Di erentiating (8) with 7

10 respect to l and evaluating it at = 0; after soe sipli cation we c l + =0 Z 0 j (z) l (z) dz = 0: () Using (6) C l = jl ; () =0 I ( 0 ) = I k ; (3) where I k is a k k identity atrix. Cobining (6), (0) and (3) the RS test statistic has the siple for RS = u j: (4) The test is transfored into the proble of testing unifor distribution of Z i, i = ; ; :::; n against the sooth alternative with density function " # h (y) = C () exp j j (y) ; where C () is a noralizing constant. If we denote log C () = () ; we ay re-write the density in the for " # h (y) = exp () + j j (y) : (5) This choice of the alternative can be recast into the well-known Maxiu Entropy characterization. Suppose, we have k restrictions on the rando variable Z i given by [see for exaple, Kagan, Linnik, Rao and Raachandran (973), p. 409] E [ j (z)] = c j ; j = ; ; :::; k: (6) Then, aong all possible distributions, the density function that axiizes the entropy E [ln f (x)] has the for (5). In this sense, we are testing for the ost plausible density function that satis es the required conditions in (6) and axiizes the entropy easure. The coponents that Neyan considered in his test is connected to an ANOVA We can interpret it as k oent conditions in a ore general fraework without the orthogonal polynoials j ; this proble can be extended to a GMM fraework, which is a topic of out future research. 8

11 like decoposition of the distribution. If we consider the classical Craér- von Mises test for H 0 : F X (:) is uniforly distributed against H : F X (:) is not uniforly distributed, the test statistic is CvM = n Z 0 h ^FX (y) i y dy; (or equivalently, if we test that H 0 : F X (:) = F 0 (:) against H : F X (:) 6= F 0 (:) ; we consider n R h i ^FX (:) F 0 0 (:) df0 (:)) where ^F X (:) is the epirical distribution function of X i : Now, if we consider the following Fourier transfor j = j = Z 0 Z 0 cos (jx) df (x) sin (jx) df (x) ; j = ; ; :::; then the testing proble is equivalent to testing H 0 : j = 0; j = ; ; :::: against H :at least one of the j 6= 0: If we let ^ j be the epirical Fourier coe cients, then the CvM test can be expressed as CvM = n X j ^ j + ^ j : (7) Fro the representation in (7), we can see that the classical CvM test (and also K-S) put increasingly saller weights (j ) to high frequency coponents (i.e. put a weight j to e ijx with large j). hese weights (j ) ake the CvM type test e ectively only use the rst few coponents of j : In Neyan s sooth test forulation, instead of putting a decreasing weight to the coponents, the selected coponents are equally weighted. This allows us to focus on the selected directions of departure fro the null distribution, this akes the sooth test superior when we are trying to identify speci c directions of departure and not just an overall departure [see, for exaple, Durbin and Knott (97), Eubank and LaRiccia (99) for studies on this issue]. Durbin and Knott (97) also suggested that when using a slightly di erent representation for the ter Z nj in their decoposition (see Durbin 97) given in (7) for testing goodness-of- t, usually the rst few j s should be exained, this is essentially the sae spirit as the Neyan s sooth test with k coponents. The basic idea of the decoposition of the Craér- von Mises statistic is to consider a Fourier transfor on F (y) ; y [0; ] to the frequency doain j ; j = 9

12 ; ; ::: and thus the population version of the statistic n Z 0 [F X (y) y] dy can be expressed as in (7). Given a nite saple fx ; X ; :::; X n g ; the epirical Fourier coe cients ^ ; ^ ; :::; ^ n are a su cient statistic, and consequently the origibal testing proble is equivalent to the test H0 0 : j = 0; j = ; ; :::; n: If there is a priori evidence that the two saples di er ostly at low frequencies, then Neyan Sooth test with a choice of sall k can indeed be a powerful test. In soe applications however, it is desirable to choose the paraeter k in the construction of Neyan s test. In other applications, we ay want k to be even an increasing paraeter with the saple size; say, n and choose k using soe odel selection criterion, such as the Schwarz s criterion (see for exaple, Ledwina 994). When choosing the ideal value of k; we have to consider two aspects. One one side, a large value of k can potentially di erentiate between two distribution in several directions (particularly, in high frequency data this ight be useful). On the other hand, if k is too high there are two possible probles. First the e ectiveness of te test in each direction would be diluted as pointed out by Neyan hiself. Second, the accuulated stochastic errors discussed in Section 3 in each direction would be large and deteriorate the perforance of the resulting test. 3 Testing Equality of Distributions Using the Sooth In practice, we have to relax the assuption of known distribution function in the two saple case. We consider again two-saples of n and observations fx i g n and fy i g fro unknown distributions with CDFs F (x) and G (x) ; and test the hypothesis H 0 : F = G; the underlying distribution being otherwise unspeci ed. Without the knowledge about the distribution function F; Z i = F (Y i ) is unknown and thus k is infeasible. Instead, we consider using the epirical distribution function (EDF) F n (x) = n in place of F and construct nx I (X i x) ; (8) ^Z i = F n (Y i ) = n nx I (X l Y i ) ; i = ; :::; : (9) l= 0

13 where I (:) is the indicator function. Substituting Z i by ^Z i in the forula of the statistic k ; we obtain the following generalized version of the sooth test using the EDF F n ^ k = " X # p j ^Zi : (30) As noted earlier, the infeasible test k converges to a -distribution with degree of freedo k and can be interpreted using Rao s score test principles. Now, we want to show that under certain regularity conditions, ^ k has the sae liiting distribution. To obtain the asyptotic -distribution for ^ k ; we need to show that the errors coing fro estiating F by the EDF F n is asyptotically zero. will be clear in our later analysis and in the proofs given in the Appendix, this asyptotic result does not hold true autoatically even with both and n! : In fact, di erent expansion rates for and n are required to preserve the liiting distribution. We suarize the result in the following Theore, whose proof is given in the Appendix. As Theore Under the null hypothesis that F=G, if ^ k ) k : (log log n) n! 0 as ; n!, Fro the above Theore we can see that the size of the saple used in estiating the distribution function should be larger in agnitude than the other saple size. The reason for this requireent is the following. If we estiate the CDF of X based on observations fx i g n ; F n converges to F at the rate p n (pointwise), i.e. the estiation error in ^Z i is of the order n : Thus, notice that j (:) is rst-order P Lipschitz continuous, the accuulated estiation error in p j ^Zi has the order n ; which goes to zero if n increases to at a rate faster than : Let us try to give a heuristic justi cation of the results given here. To obtain consistent tests, we require that the size of the saple that is used to estiate the epirical distribution (n) should be larger than the size of the saple in the approxiation (). Intuitively, the errors of the preliinary estiation of z is of order n : These preliinary estiation errors would be accuulated in then second stage when we ake su of z 0 s; which is necessary to obtain the assyptotic -approxiation. Roughly speaking, the aggregated errors fro the preliinary distributional estiation is of order p = p n: To obtain consistent tests, this error ter should go to zero, as the saple size increases. Consequently, the distribution function F (:) should be estiated with higher accuracy than the second stage approxiation. The sae proble exists in siilar statistical applications. For exaple, the situation in our case is siilar, but not exactly the, to the ethods of

14 siulation based inference (e.g. Gourieroux and Monfort 996), where, for exaple, the conditional oent are estiated based on siulation. If the nuber of siulation increases fast enough relative to the saple size, the siulation ethod estiator has the sae asyptotic distribution as the corresponding estiator as if the conditional oent is known. Theore provides an upper bound for given n so that we still have a consistent test using the EDF F n. Under this condition, a wide range of saple sizes can be chosen and all provie asyptotically equivalent tests, although the nite saple perforance of these tests ay di er substantially. A natural question to ask is: what is the optioal rate of relative to n? Theore Under the null hypothesis, r ^ k = + O p p + O p ; n and thus the optial relative agnitude of and n that iniizes the size distortion is = O n : into Heuristically, we can decopose X p j ^Zi X p j (Z i ) + X p h j ^Zi i j (Z i ) : (3) The rst part of (3), i.e. p P j (Z i ) ; converges to a standard noral variate which contributes to the liiting distribution.. The larger the, the faster it goes P i to the liiting noral distribution. The second part of (3), i.e. p h j ^Zi j (Z i ) ; which is the error ter coing fro the estiating distribution function, converges to zero under teh conditions in Theore. The larger the n relative to, the saller the ter. To optiize the sapling properties of the test, a trade-o has to be ade to balance these two coponents, giving an optial relative agnitude between and n. Although an exact forula of the optial saple sizes will be dependent on the optality criterion and the exact forulation of the higher order ters are potentially not available. Theore gives the optial relative agnitude between and n, which substantially narrows down the range of choices for the saple sizes. Monte Carlo experient results indicate that a siple rule of thub choices such as = p n

15 based on the guidance of Theore provides pretty good sapling perforance to the propsed test. 4 Application of Neyan Sooth Test with EDF Let us consider an insurance arket where there are several insurance copanies providing health insurance who are copeting for clients. These clients can be grouped into di erent risk categories depending on their proneness or propensity of having bad health. However, the ain proble that the insurance copanies face is one of adverse selection since they cannot see beforehand what type of client they are insuring, high risk or low risk (Akerlof, 970). Rothschild and Stiglitz (976) claied that in a insurance arket setup where there are two types of clients, a health insurance contract based on risk categories will ensure that the high risk client chooses to pay higher preiu while both high and low risk clients will fully insure. However, if it is not possible to write a health insurance contract based on the risk categories due to either legislation or other restrictions then there could be two possible scenarios. A healthier (low risk) individual will chose to buy less than coplete coverage while the less healthy (high risk) individuals will buy full insurance, so the insurance arket will still function although this will not the the ost e cient outcoe like the case where there could be risk based contracts. The second scenario happens if further regulations restrict or prohibit the selling of less than full insurance, then we have a proble that the healthy or low risk individuals will stop by coverage which eans that gradually the insured population will be ade up of ore high risk individuals and the insurance copany has to pay out ore often. This will cause the preiu to go up, so healthier individuals will drop their coverage even further and this cycle will nally result in a total collapse of the insurance arket. This scenario is refered to as the Adverse Seclection Death Spiral [Buchueller and DiNardo (000)]. It has been a conventional wisdo that if counity rating for setting health insurance preius is enforced (where all the insured people have to pay exactly the sae preiu irrespective of their age, sex, and health conditions), ore and ore healthier (younger) ebers of a group insurance policy would drop out of the insurance plan. To nd the evidence of any existence of Adverse Selection Death Spiral, the state of New York where legislation for enforcing counity rating was enacted in 993 can be copared with Pennsylvania, where there was no such legislations enacted. We would like to test for the di erence between the age distributions of the adult civilian population between 8 and 64, before and after 993 for each state to verify whether there is any di erence solely due to the death 3

16 spiral story. The data is fro March Current Population Survey covering questions on whether an individual have insurance coverage, and if so whether the coverage is through a sall eployer and other siilar questions. New York and Pennsylvania were selected because these states are very siilar both geographically as well as deographically. Our objective is to use Neyan s sooth test to deterine if there is a di erence between the PDF of the age distributions in a state before and after 993. The population selected was individuals who were covered by group insurance policies sponsored by their eployers who had 00 or less eployees. This was divided into two parts, one for the individuals before 993 and the other for those after 993. Fig. 4

17 4A and 4B gives the estiated PDFs using kernel density estiator. 3 Fig 4A. Density estiates for New York. 3 We estiated the PDF f (x) of the saple x ; x ; :::; x n before 993 using kernel density estiator (see Fig. 4A and Fig. 4B) ^f bef (x) = b nx x K xj ; (3) b where the bandwidth is quadruple of b = :06 in(^ x ; IQR=:34)n 5, ^ x is the estiated standard deviation of x 0 js and IQR is the interquartile range for the saple and K (:) is the kernel suggested by Parzen (96) [also see Silveran (986) pp ]. 5

18 Fig 4B. Density estiates for Pennsylvania. Suppose now, that the saple y ; y ; :::; y coes fro a population with PDF g (z) after 993. The probability integral transfors (PIT) of each z i based on the density estiates are z j = F n (y j ) ; j = ; ; :::; : (33) We estiate the cuulative distribution function (CDF) based on the equation (33) by siply calculating the epirical distribution function (EDF) of x. If the null hypothesis of the correct speci cation of the odel is true then z ; z ; :::; z should be distributed as U (0; ) (see Fig. 5B). However, if the null hypothesis is not true, then Neyan sooth test ight give us the direction (s) of departure(s) fro the 6

19 null hypothesis. Fig 5A. Age distributions of sall groups. Fig 5B. PIT histogra for age after 993. Kologorov-Sirnov (KS) or the Craér-von Mises (CvM) types of tests are the ost coonly used tests based on the EDF for coparing two distributions. KS statistic is the axiu distance between the two EDF s in g. 5A while CvM statistic is a weighted su of the squares of the di erence between the two CDFs. Table. gives us the values of all the coonly used test statistics based on the EDF of individuals covered under sall eployer sponsored group insurance in New York and Pennsylvania along with the.% critical values of all the odi ed statistics [Stephens (970)]. Test Statistic Critical Values New York Pennsylvania Upper.% D D KS Kuiper CvM A-D W Table. Goodness-of-Fit Statistics based on EDF. Except for the test on D + [which considers the positive part of ()]; all the tests clearly 7

20 indicate that the two distributions are di erent. However, these test statistics based on the EDF fails to indicate the nature of the deviation fro the null hypothesis. In order to attept to answer the above question we use Neyan s sooth test based on the test statistic k = u j; u j = p X j (z i ) ; j = ; :::; k; i = ; :::; ; (34) where we chose k = 4 and j (:) as the j th order noralized Legendre polynoial discussed in Neyan (937). Table gives the results on the data on all the individuals under sall eployer sponsored group in New York (with n=4548 and =57). Source u u u 3 u 4 Test Statistic p-value Table. Neyan s sooth statistic and coponents. In place of taking the whole population if we just use a rando saple of size = 500 of individuals covered under sall eployer sponsored group insurance in New York, the results we are given in Table A. Source u u u 3 u 4 Test Statistic p-value Table A. Neyan s sooth statistic and coponents for a saple. Under the null, k should have a distribution with k (here we have k = 4) degrees of freedo and each of the u j should be independent with degree of freedo. We can interpret the coponents of given by u j as soe easure of the j th order of the departure fro the null hypothesis. If we perfor the sae test on all the individuals who has any health insurance coverage ( = 60:3078) and those who purchased individual insurance ( = 0:8446) we see the departure fro the null hypothesis with varying degrees. Although, based on these results it appears that the population of civilian adults who had insurance before 993 and after 993 were distinctly di erent, however, as 8

21 shown in the Table 3. the suary statistics of the two distribution are siilar. State New York Pennsylvania Data Before 993 After 993 Before 993 After 993 Observations Mean Standard Deviation Skewness Coe cient Excess Kurtosis Miniu st Quartile Median rd Quartile Maxiu Table 3. Age distribution for sall eployers insured group in NY and PA. If we perfor a siple test for equality of eans in the saples for New York before and after 993 we get a p-value of while the test for di erence of variances gives Using the coponents of Neyan s sooth test we can conclude that the age distributions before and after 993 are distinctly di erent and the ajor departure sees to be due to a di erence in the rst and second order ters, although third order ter also plays soe part in the di erence in the two distributions. 5 Monte Carlo Study of Sooth test with EDF Soe preliinary size and power Monte Carlo studies have shown that the result of the sooth test have better size properties for saples where the evaluation saple size (which is used to estiate the EDF) is uch bigger than the test saple size (which is used for perforing the sooth test). Table 4. shows the actual size of a 5% noinal level test where the saples are drawn at rando fro the population of insured civilian adults who has health coverage sponsored by sall eployers (with less than 00 eployees) before 993 when the counity rating legislation was 9

22 enacted. Source u u u 3 u 4 Null Distribution 4 actual size (n = 500; = 500) actual size (n = 500; = 50) Table 4. Actual sizes for 5% noinal size sooth tests. The results are based on evaluation saple size n = 500 and the test saple size = 500 and = 50;, with 0000 replications. Fro the results we see that if we have the test saple size n to be substantially saller than the evaluation saple size n iproves the size distortion of the resulting sooth test. We have shown in Theore that having to be in the order of p n is optial fro the point of view of size distortion. Fig 6A. Saple size and test size for n = 900; increasing ; no. of repl.=500. 0

23 Fig 6B. Saple size and test size for n = 500; increasing ; no. of repl.=00. In Fig 9 shows a plot the size of the test with the saple size of the test saple drawn at rando fro the population of eployees covered by sall eployers group insurance. The evaluation saple of size n = 900 is also drawn at rando fro the sae population. For calculating the size of the sooth test we took 500 replications and evaluated the size for each value of = 0 to = 900 in increents of 0. The results with increents of 5 is also siilar shown in gure. We observe that the test size increases ore or less with the saple size after a while. Furtherore, there could be several values of for n = 900 where we have a test with a 5% actual size. Although, subsapling fro the original data of sall insurance population before 993 in New York gives us an idea about how size ight be a ected by the saple size keeping the evaluation saple size the sae, it will not give us uch idea about the power properties. Therefore, we redo the whole exercise with siulated data using a ixture of Gaussian and log-noral distributions which gives us a density close to the real data at least visually. The test size and saple size plot is given in Fig. 4. In order to get a better idea about the relative size and power

24 properties for a data fro a population of the variable X which is X = B X + ( B) X (35) where B Bernoulli(0:3), X has a log noral distribution with = :, = 4 and X is N ( = :; = :). Suppose, we also consider the alternative where B Bernoulli(0:5), ln X is N ( 0:; ) and nally X has N (:75; 0:8) : In the following charts Fig 7 and Fig 7A, we have considered n = 65 and gradually increased in increents of 5 to calculate the size of the tests based on 00 replications. Fig 7A is a agni cation of Fig 7 upto = 50: Fig 7. Saple size and the size of sooth test with siulated data using EDF.

25 Fig 7A. Saple size and the size of sooth test with siulated data using EDF. Fig 8 shows the plot of the saple size with the power of the sooth test based 3

26 on the EDF of a saple of size n = 65: Fig 8. Saple size and power of the sooth test with EDF with siulated data. Putting the above two charts together in Fig 9, we can get soe idea about the range of the optial saple size for the test saple given a xed value n of the evaluation saple (here n = 65): It is also worth noting that the power properties draatically increase as we increase the size of the test saple : For exaple, if n = 500 then using = 50; the power of the sooth test is while the size is If we plot the test saple size with the size and power of the test for n = 000 and between 0 and 600, we observe that the size distortion is less pronounced (see Fig. 0). 4

27 Fig 9. Size and Power of the sooth test for siulated data ( = 65) 5

28 Fig 0. Size and Power of the sooth test for siulated data (n = 000). The optial value of relative to n and also the size and power of the sooth test can be obtained fro a range of values of with acceptable value of size. For a ore analytical answer, we have already shown that the size of should be in the order of p n for iniizing the size distortion, however the exact relationship of and n will depend on the values of the coe cients of the higher order ters in the Cornish-Fisher expansion discussed in the Section 3 under Theore. 6 Conclusion and Future Research We proposed a sooth test for coparing two densities in a two saple setup. Unlike traditional goodness of t tests like the Kologorov-Sirnov or the Craér-von Mises classes of tests, the sooth test for coparing two densities helps us identify the nature of the discrepancy between the two densities, and hence, the sources of departure fro the null hypothesis that the two densities are identical. We have already seen that the sooth test is nothing but Rao s score test, and hence it enjoys all the optiality properties of the score test (Bera and Bilias, 00). We have also shown that the choices of the relative sizes of the estiation and test saples are 6

29 very iportant to get a consistent test with iniu size distortion. Unfortunately, in the current setup the exact values of the relative sizes cannot be obtained, we can only give the test saple size upto the p n -order of the estiation saple size. Hence, we have to rely on Monte Carlo siulation to get what the relative saple sizes should be. Our Monte Carlo siulation also revealed that the sooth test has very good power properties and an optial choice of the test saple size will also ensure that the true size is close to the noinal size of the test even in nite saples. There are several directions of future research that we want to pursue. One of the questions that we would like to answer is how we can adjust one distribution to reove the e ect of another distribution, very uch in the line of what the di erencein-di erence estiate tries to accoplish (Dinardo and Buchuller, 00). Only in that case, we can provide a viable alternative procedure for exaining whether the Adverse Selection death spiral can be observed in the insurance arket. We have looked at the global characteristics of the two distributions we are coparing, however, there could be soe very local chracteristics of the density function that we have not focused on. We can extend the sooth test for coparison of two densities siilar to what Fan (996) proposed adaptively using both Neyan sooth test technique and a wavelet based test for coparing global and local departures fro the null hypothesis. Last but not the least, we have to accoodate for possible dependence in the data when coparing two distributions particularly in the context of tie series or panel data, we have not addressed that issue in this context. There are several possible applications of this testing technique, we can list just a few of the. Sooth test can be used for coparing wage distributions between di erent genders, di erent ethnic groups or di erent residential neighborhoods like public housing projects or wealthier localites in Toronto (Oreopoulos 00). Regional di erences between uneployent rates in 50 Spanish provinces between two di erent regies, 985 and 997 can also be investigated using the sooth test technique (López-Baro, Barrio and Artis, 00). There has been an extensive literature in deography and public policy econoics on causes adverse birth outcoes in ters of low birth weight (LBW) including socio-econoic conditions and behavioral patterns like soking habits of others during pregnancy [see Evans and Ringel (999)]. Sooth test based techniques can also be used for several other studies explaining the infant ortality gap between two ethnic groups by coparing the distributions of birth weight (Miller 00). In nance, we can look at the proble of coparing distribution of pre-tax and after-tax returns of utual funds for taxable investors that can deterine the cash in ows of these funds (Bergstresser and Poterba 000). 7

30 7 Appendix 7. Proof of Theore We rst give two Leas that can be used in our proof. Lea If we de ne for P l (z) = (z ) l for any l k and l, then for any r l d r dz r P l (z) l 4 X in(l;l r) j=0 3 l! (l j)! 5 : (36) j! (l j)! (l j r)! Proof. Let us use the following sipli cation of the expression (z ) l ; Since z (0; ) ; P l (z) = (z ) l = 4z 4z + l = 4 l z l (z ) l = l lx d dz P l (z) = l j=0 lx j=0 in(l;l ) X = l l! j! (l j)! zl j (37) l! j (l j) zl j! (l j)! j=0 l! (l j)! j! (l j)! (l j )! zl j : (38) Thus (36) holds for r = : Now suppose that the equality in (38) is true for the rth derivative of P l (z) for any arbitrary 0 r l ; then we will show it also holds 8

31 for r + : We have d r+ dz P r+ l (z) = d d r dz dz P r l (z) 0 3 = d in(l;l r) l 4 l! (l j)! dz j! (l j)! (l j r)! zl j r 5A j=0 3 in(l;l r ) X = l 4 l! (l j)! j r (l j r) zl 5 j! (l j)! (l j r)! j=0 3 in(l;l r ) X = l 4 l! (l j)! j! (l j)! (l j r )! zl j r 5 : (39) j=0 Therefore, for any r l; d r dz r P l (z) = l 4 X in(l;l r) j=0 Since z (0; ) ; fro (40) the result follows. 3 l! (l j)! j! (l j)! (l j r)! zl j r 5 : (40) Lea If l (:) is the noralized Legendre polynoial of degree l de ned on (0; ) ; then the rst order Lipschitz condition holds; that is j l (^z) l (z)j M j^z zj (4) where M is a positive constant, and z and ^z are any two points between 0 and : Proof. The Legendre polynoials are de ned as [see, for instance, Kendall and Stuart (973, p. 460)] L l (z) = l! l where Z d l n dz l z l o L l (z) L k (z) dz = 0 if l 6= k; = l+ if l = k: (4) 9

32 The corresponding orthonoral polynoials on (0; ) are Therefore, l (z) = (l + ) Ll z = (l + ) Ll (z ) = (l + ) l! l d l = (l + ) l! l+ d l d (z ) l (z ) dz l (z ) l l : (43) l (^z) l (z) = (l + ) l! l+ d l d^z l (^z ) l dl dz l (z ) l : (44) For jz zj < j^z zj the ean value theore gives d l dz P l l (^z) = dl dz P l l (z) + (^z d l i.e., dz P d l l l (^z) dz P l l (z) = (^z z) dl+ dz l+ P l (z ) z) dl+ dz l+ P l (z ) : (45) Hence, using (4) and (45) and applying Lea, we have j l (^z) l (z)j = (l + ) l! l+ d l dz P d l l l (^z) dz P l l (z) = (l + ) l! l+ (^z z) dl+ dz P l+ l (z ) j^z zj (l + ) l! l+ l 4 where M is a nite nuber: Proof of Theore We only need to show that X in(l;l ) j=0 3 l! (l j)! 5 j! (l j)! (l j )! = M j^z zj ; (46) b k k = o p (): 30

33 Notice that where b k = " X p j ( Z b i ) # " X X = j (Z i ) + " X = j (Z i )# + " # " X X + j (Z i ) = R ;;n = k + R ;;n + R ;;n R ;;n = " X h j ( Z b i # i ) j (Z i ) " X h j ( b Z i ) j (Z i ) h j ( Z b i # i ) j (Z i ) ; " X # " X j (Z i ) We show that, as ; n! ; and (log log n)=n! 0, h j ( Z b i # i ) j (Z i ) i # h j ( b Z i ) j (Z i )i # : R ;;n = o p (); R ;;n = o p (): We rst look at R ;;n. R ;;n = " X h j ( Z b i # i ) j (Z i ) " X j ( Z b i ) j (Z i ) # : By Lea, the orthonoral polynoials j () satisfy the rst-order Lipschitz condition that j j (u) j (v)j M ju vj ; 3

34 for soe nite nuber M > 0. Thus R ;;n M M " X j ( Z b i ) j (Z i ) " X # Z b i Z i ax i # Z b i Z i Notice that for each xed Y i ; by Donsker s Theore we have that p n [Fn (Y i ) F (Y i )] converges in distribution to a noral variate. In addition, sup u jf n (u) F (u)j is axially of order p (log log n)=n as n! (see, e.g. Shorack and Wellner (986)). Thus and ax i R ;;n M Z b i ax i Z i = Op ( p (log log n)=n) Z b i which converges to 0 if (log log n)=n! 0 as ; n! : Siilarly, jr ;;n j = Z i = O p ((log log n)=n); X X h j (Z i ) j ( b i Z i ) j (Z i ) " X # X j (Z i ) j ( Z b i ) j (Z i ) : Notice that p P j(z i ) converges to a standard noral variable and X j (Z i ) = O p ( = ): By a siilar analysis to that for R ;;n, we have X j ( Z b i ) j (Z i ) M X Z b i Z i M ax i Z b i Z i = Op ( p (log log n)=n): 3

35 Thus " X # X jr ;;n j j (Z i ) j ( b Z i ) j (Z i ) = O p = p (log log n)=n = O p ( p (log log n)=n); converging to 0 when (log log n)=n! 0 as ; n! : 7. Proof of Theore >Fro the proof of Theore, we know that the asyptotic test statistic b k can be decoposed into k + R ;;n + R ;;n : We analyze each of these ters to show how the relative agnitude of and n a ects the testing size. For the leading ter k = " X j (Z i )# : By construction, conditional on X, for each j; V ji = j (Z i ) (i = ; :::::; ) are independent and identically distributed rando variables with ean zero and unit variance, thus X p V ji ) N(0; ) j ; where j (j = ; ::::; k) are k independent standard noral variates. In addition, assuing that V ji possesses oents up to the forth order, by standard result of Edgeworth expansion (see, e.g., Rothenberg 984 and references therein), the probability density function of p P V ji has the following expansion f(x) '(x) + k 3H 3 (x) 6 p + 3k 4H 4 (x) + k3h 6 (x) 7 where ' is the density of standard noral, k r is the r-th cuulant, and H r is the Herite polynoial of degree r de ned as H r (x) = ( ) r ' (r) (x)='(x). In this sense, P = V ji can be expanded into a leading ter of standard noral variable j 33

36 plus a second order ter, say p A j, of O p ( = ) and a third ter, B j, of order O p ( ) : X p V ji j + p A j + B j: Therefore, the ter can be expanded as k = " # X j (Z i ) j + p A + B + o p( ) where the leading ter P k j is the rando variable with k degree of freedo and the second ter p A is of order O p ( = ): To obtain a good approxiation of the distribution, a large is preferred. Now we turn to the estiation of distribution functions. R ;;n = = n " = n C: " X h j ( Z b i # i ) j (Z i ) X p h n j ( Z b i # i ) j (Z i ) Notice that U ji = p h n j ( Z b i i ) j (Z i ) = O p (); and P U ji = O p (); we have C = " X p h n j ( Z b i # i ) j (Z i ) = O p (); and thus R ;;n = O p n : 34

37 Siilarly, " X R ;;n = j (Z i ) r " X = p n r = O p n # " X h j ( Z b i # i ) j (Z i ) # " X j (Z i ) p h n j ( Z b i # i ) j (Z i ) since D = " # " X X p h p j (Z i ) n j ( Z b i ) j (Z i )i # = O p (): In suary b k = = k + R ;;n + R ;;n j + p A + r n D + o p( p + r n ); where p A, p D; etc., are higher order ters that brings size distortion in nite n saple, but are o p (). In particular, the leading ters are of order O p p and p O p n respectively. To iniize the distortion coing fro estiating the distribution function, we prefer n to be larger relative to. On the other hand, to obtain fast convergence to the liit, we want a large. Thus, a trade-o has to be ade to iniize size distortion. We balance these two ters so that they are of the sae order of agnitude, giving the optial relative agnitude as = O( p n): References [] Anderson, T. W.(96), Annals of Matheatical Statistics, 33, pp [] Anderson, T. W. and D.A. Darling (95), Asyptotic Theory of certain goodness of t criteria based on stochastic processes. Annals of Matheatical Statistics, 3, pp. 93- [3] Anderson, T. W. and D.A. Darling (954), A test of goodness of t. Journal of the Aerican Statistical Association, 49, pp

38 [4] Andrews, D. W. K.(994), Epirical Process ethods in Econoetrics. In Handbook of Econoetrics, Vol IV. Eds. R. F. Engle and D. L. McFadden. Elsevier Science: Asterda, pp [5] Andrews, D. W. K. and M. Buchinsky. A three-step ethod for choosing the nuber of bootstrap replications. Econoetrica, 68, pp [6] Bahadur, R. R. (966), A note on quantiles in large saples, Annals of Matheatical Statistics, 37, pp [7] Bera, A. K. and A. Ghosh (00), Neyan s sooth test and its applications in Econoetrics. In: Handbook of Applied Econoetrics and Statistical Inference, Eds. A. Ullah, A. Wan and A. Chaturvedi, Marcel Dekker, pp [8] Billingsley, P. (968). Convergence in Probability Measures. John Wiley: New York. [9] Ćwik, J and J. Mielniczuk (989), Estiating density ratio with application to discriinant analysis, Counications in Statistics: Theory and Methods, 8, pp [0] Darling, D.A. (955), The Craér-Sirnov test in the paraetric case. Annals of Matheatical Statistics, 6, pp. -0. [] Darling, D.A. (957), The Kologorov-Sirnov, Craér-von Mises tests. Annals of Matheatical Statistics, 8, pp [] DiNardo, J. and Buchueller T. (00), Did Counity Rating Induce an Adverse Selection Death Spiral? Evidence fro New York, Pennsylvania and Connecticut. anuscript. [3] Durbin, J.(973), Distribution Theory for tests based on the saple distribution function. In Regional Conference Series in Applied Matheatics, Vol. 9, SIAM: Philadelphia, PA. [4] Durbin, J. and M. Knott. Coponents of Craer-von Mises Statistics. I. Journal of the Royal Statistical Society. Series B (Methodological), 34:90-307, 97. [5] Eubank, R.L. and V.N. LaRiccia. Asyptotic coparison of Craér-von Mises and nonparaetric function estiation techniques for testing goodness-of- t. Annals of Statistics 0:4-45, 99. [6] Fan, J. (996) Test of signi cance based on wavelet thresholding and Neyan s truncation. Journal of the Aerican Statistical Association 9:

39 [7] Fan, J. and L-S. Huang (00). Goodness-of- t tests for paraetric regression odels. Journal of the Aerican Statistical Association, 96, pp [8] Finkelstein, H. (97) The Law of the Iterated Logarith for Epirical Distribution, Annals of Matheatical Statistics, 4, pp [9] Gaenssler, P. and W. Stute (979), Epirical Processes: A survey of results for independent and identically distributed rando variables, Annals of Probability, 7, pp [0] Gourieroux, C. and A. Monfort (99), Siulation based Inference in Models with Heterogeneity. Annales d Econonoie et de Statistique, 0/, pp [] Gourieroux, C. and A. Monfort (993), Pseudo-Likelihood Methods. In Handbook of Statistics, Vol., Eds, C. R. Rao, H. D. Vinod, Elsevier Science; Asterda, pp [] Gourieroux, C. and A. Monfort (996), Siulation-based Econoetric Methods. Oxford University Press: Oxford. [3] Hall, P. (99), The Bootstrap and Edgeworth Expansion. Springer-Verlag: New York. [4] Handcock M.S. and M. Morris (999), Relative Distribution Methods in Social Sciences. Springer: New York. [5] Kendall, M.G. and A. Stuart (969), The Advanced Theory of Statistics, 3rd Edition, Vol.. Hafner: New York. [6] Kagan, A.M., Yu. V. Linnik and C. R. Rao (Translated fro Russian text by B. Raachandran), (973) Characterization probles in atheatical statistics. Wiley: New York. [7] T. Ledwina (994), Data-driven version of Neyan s sooth test of t. Journal of the Aerican Statistical Association 89: [8] Lehann, E. L. (953), The Power of Rank Tests, Annals of Matheatical Statistics, 4, pp [9] Neyan J. Sooth test for goodness of t. Skandinaviske Aktuarietidskrift 0:50-99, 937. [30] Neyan, J. and E.S. Pearson (936). Contributions to the theory of testing statistical hypothesis I: Unbiased critical regions of Type A and A : Statistical Research Meoirs :

Keywords: Estimator, Bias, Mean-squared error, normality, generalized Pareto distribution

Keywords: Estimator, Bias, Mean-squared error, normality, generalized Pareto distribution Testing approxiate norality of an estiator using the estiated MSE and bias with an application to the shape paraeter of the generalized Pareto distribution J. Martin van Zyl Abstract In this work the norality

More information

Bootstrapping Dependent Data

Bootstrapping Dependent Data Bootstrapping Dependent Data One of the key issues confronting bootstrap resapling approxiations is how to deal with dependent data. Consider a sequence fx t g n t= of dependent rando variables. Clearly

More information

A Simple Regression Problem

A Simple Regression Problem A Siple Regression Proble R. M. Castro March 23, 2 In this brief note a siple regression proble will be introduced, illustrating clearly the bias-variance tradeoff. Let Y i f(x i ) + W i, i,..., n, where

More information

Testing equality of variances for multiple univariate normal populations

Testing equality of variances for multiple univariate normal populations University of Wollongong Research Online Centre for Statistical & Survey Methodology Working Paper Series Faculty of Engineering and Inforation Sciences 0 esting equality of variances for ultiple univariate

More information

TEST OF HOMOGENEITY OF PARALLEL SAMPLES FROM LOGNORMAL POPULATIONS WITH UNEQUAL VARIANCES

TEST OF HOMOGENEITY OF PARALLEL SAMPLES FROM LOGNORMAL POPULATIONS WITH UNEQUAL VARIANCES TEST OF HOMOGENEITY OF PARALLEL SAMPLES FROM LOGNORMAL POPULATIONS WITH UNEQUAL VARIANCES S. E. Ahed, R. J. Tokins and A. I. Volodin Departent of Matheatics and Statistics University of Regina Regina,

More information

DEPARTMENT OF ECONOMETRICS AND BUSINESS STATISTICS

DEPARTMENT OF ECONOMETRICS AND BUSINESS STATISTICS ISSN 1440-771X AUSTRALIA DEPARTMENT OF ECONOMETRICS AND BUSINESS STATISTICS An Iproved Method for Bandwidth Selection When Estiating ROC Curves Peter G Hall and Rob J Hyndan Working Paper 11/00 An iproved

More information

The proofs of Theorem 1-3 are along the lines of Wied and Galeano (2013).

The proofs of Theorem 1-3 are along the lines of Wied and Galeano (2013). A Appendix: Proofs The proofs of Theore 1-3 are along the lines of Wied and Galeano (2013) Proof of Theore 1 Let D[d 1, d 2 ] be the space of càdlàg functions on the interval [d 1, d 2 ] equipped with

More information

OBJECTIVES INTRODUCTION

OBJECTIVES INTRODUCTION M7 Chapter 3 Section 1 OBJECTIVES Suarize data using easures of central tendency, such as the ean, edian, ode, and idrange. Describe data using the easures of variation, such as the range, variance, and

More information

Non-Parametric Non-Line-of-Sight Identification 1

Non-Parametric Non-Line-of-Sight Identification 1 Non-Paraetric Non-Line-of-Sight Identification Sinan Gezici, Hisashi Kobayashi and H. Vincent Poor Departent of Electrical Engineering School of Engineering and Applied Science Princeton University, Princeton,

More information

Model Fitting. CURM Background Material, Fall 2014 Dr. Doreen De Leon

Model Fitting. CURM Background Material, Fall 2014 Dr. Doreen De Leon Model Fitting CURM Background Material, Fall 014 Dr. Doreen De Leon 1 Introduction Given a set of data points, we often want to fit a selected odel or type to the data (e.g., we suspect an exponential

More information

Probability Distributions

Probability Distributions Probability Distributions In Chapter, we ephasized the central role played by probability theory in the solution of pattern recognition probles. We turn now to an exploration of soe particular exaples

More information

Introduction to Machine Learning. Recitation 11

Introduction to Machine Learning. Recitation 11 Introduction to Machine Learning Lecturer: Regev Schweiger Recitation Fall Seester Scribe: Regev Schweiger. Kernel Ridge Regression We now take on the task of kernel-izing ridge regression. Let x,...,

More information

3.3 Variational Characterization of Singular Values

3.3 Variational Characterization of Singular Values 3.3. Variational Characterization of Singular Values 61 3.3 Variational Characterization of Singular Values Since the singular values are square roots of the eigenvalues of the Heritian atrices A A and

More information

EXTENDED NEYMAN SMOOTH GOODNESS-OF-FIT TESTS, APPLIED TO COMPETING HEAVY-TAILED DISTRIBUTIONS. J. Huston McCulloch and E. Richard Percy, Jr.

EXTENDED NEYMAN SMOOTH GOODNESS-OF-FIT TESTS, APPLIED TO COMPETING HEAVY-TAILED DISTRIBUTIONS. J. Huston McCulloch and E. Richard Percy, Jr. EXTENDED NEYMAN SMOOTH GOODNESS-OF-FIT TESTS, APPLIED TO COMPETING HEAVY-TAILED DISTRIBUTIONS J. Huston McCulloch and E. Richard Percy, Jr.* J. Econoetrics Special Issue Subission: Sept. 4, 200 First Revision:

More information

Extension of CSRSM for the Parametric Study of the Face Stability of Pressurized Tunnels

Extension of CSRSM for the Parametric Study of the Face Stability of Pressurized Tunnels Extension of CSRSM for the Paraetric Study of the Face Stability of Pressurized Tunnels Guilhe Mollon 1, Daniel Dias 2, and Abdul-Haid Soubra 3, M.ASCE 1 LGCIE, INSA Lyon, Université de Lyon, Doaine scientifique

More information

ESTIMATING AND FORMING CONFIDENCE INTERVALS FOR EXTREMA OF RANDOM POLYNOMIALS. A Thesis. Presented to. The Faculty of the Department of Mathematics

ESTIMATING AND FORMING CONFIDENCE INTERVALS FOR EXTREMA OF RANDOM POLYNOMIALS. A Thesis. Presented to. The Faculty of the Department of Mathematics ESTIMATING AND FORMING CONFIDENCE INTERVALS FOR EXTREMA OF RANDOM POLYNOMIALS A Thesis Presented to The Faculty of the Departent of Matheatics San Jose State University In Partial Fulfillent of the Requireents

More information

are equal to zero, where, q = p 1. For each gene j, the pairwise null and alternative hypotheses are,

are equal to zero, where, q = p 1. For each gene j, the pairwise null and alternative hypotheses are, Page of 8 Suppleentary Materials: A ultiple testing procedure for ulti-diensional pairwise coparisons with application to gene expression studies Anjana Grandhi, Wenge Guo, Shyaal D. Peddada S Notations

More information

e-companion ONLY AVAILABLE IN ELECTRONIC FORM

e-companion ONLY AVAILABLE IN ELECTRONIC FORM OPERATIONS RESEARCH doi 10.1287/opre.1070.0427ec pp. ec1 ec5 e-copanion ONLY AVAILABLE IN ELECTRONIC FORM infors 07 INFORMS Electronic Copanion A Learning Approach for Interactive Marketing to a Custoer

More information

Meta-Analytic Interval Estimation for Bivariate Correlations

Meta-Analytic Interval Estimation for Bivariate Correlations Psychological Methods 2008, Vol. 13, No. 3, 173 181 Copyright 2008 by the Aerican Psychological Association 1082-989X/08/$12.00 DOI: 10.1037/a0012868 Meta-Analytic Interval Estiation for Bivariate Correlations

More information

Machine Learning Basics: Estimators, Bias and Variance

Machine Learning Basics: Estimators, Bias and Variance Machine Learning Basics: Estiators, Bias and Variance Sargur N. srihari@cedar.buffalo.edu This is part of lecture slides on Deep Learning: http://www.cedar.buffalo.edu/~srihari/cse676 1 Topics in Basics

More information

Using EM To Estimate A Probablity Density With A Mixture Of Gaussians

Using EM To Estimate A Probablity Density With A Mixture Of Gaussians Using EM To Estiate A Probablity Density With A Mixture Of Gaussians Aaron A. D Souza adsouza@usc.edu Introduction The proble we are trying to address in this note is siple. Given a set of data points

More information

ASSUME a source over an alphabet size m, from which a sequence of n independent samples are drawn. The classical

ASSUME a source over an alphabet size m, from which a sequence of n independent samples are drawn. The classical IEEE TRANSACTIONS ON INFORMATION THEORY Large Alphabet Source Coding using Independent Coponent Analysis Aichai Painsky, Meber, IEEE, Saharon Rosset and Meir Feder, Fellow, IEEE arxiv:67.7v [cs.it] Jul

More information

The Distribution of the Covariance Matrix for a Subset of Elliptical Distributions with Extension to Two Kurtosis Parameters

The Distribution of the Covariance Matrix for a Subset of Elliptical Distributions with Extension to Two Kurtosis Parameters journal of ultivariate analysis 58, 96106 (1996) article no. 0041 The Distribution of the Covariance Matrix for a Subset of Elliptical Distributions with Extension to Two Kurtosis Paraeters H. S. Steyn

More information

The Weierstrass Approximation Theorem

The Weierstrass Approximation Theorem 36 The Weierstrass Approxiation Theore Recall that the fundaental idea underlying the construction of the real nubers is approxiation by the sipler rational nubers. Firstly, nubers are often deterined

More information

Intelligent Systems: Reasoning and Recognition. Perceptrons and Support Vector Machines

Intelligent Systems: Reasoning and Recognition. Perceptrons and Support Vector Machines Intelligent Systes: Reasoning and Recognition Jaes L. Crowley osig 1 Winter Seester 2018 Lesson 6 27 February 2018 Outline Perceptrons and Support Vector achines Notation...2 Linear odels...3 Lines, Planes

More information

In this chapter, we consider several graph-theoretic and probabilistic models

In this chapter, we consider several graph-theoretic and probabilistic models THREE ONE GRAPH-THEORETIC AND STATISTICAL MODELS 3.1 INTRODUCTION In this chapter, we consider several graph-theoretic and probabilistic odels for a social network, which we do under different assuptions

More information

E0 370 Statistical Learning Theory Lecture 6 (Aug 30, 2011) Margin Analysis

E0 370 Statistical Learning Theory Lecture 6 (Aug 30, 2011) Margin Analysis E0 370 tatistical Learning Theory Lecture 6 (Aug 30, 20) Margin Analysis Lecturer: hivani Agarwal cribe: Narasihan R Introduction In the last few lectures we have seen how to obtain high confidence bounds

More information

AN OPTIMAL SHRINKAGE FACTOR IN PREDICTION OF ORDERED RANDOM EFFECTS

AN OPTIMAL SHRINKAGE FACTOR IN PREDICTION OF ORDERED RANDOM EFFECTS Statistica Sinica 6 016, 1709-178 doi:http://dx.doi.org/10.5705/ss.0014.0034 AN OPTIMAL SHRINKAGE FACTOR IN PREDICTION OF ORDERED RANDOM EFFECTS Nilabja Guha 1, Anindya Roy, Yaakov Malinovsky and Gauri

More information

Soft Computing Techniques Help Assign Weights to Different Factors in Vulnerability Analysis

Soft Computing Techniques Help Assign Weights to Different Factors in Vulnerability Analysis Soft Coputing Techniques Help Assign Weights to Different Factors in Vulnerability Analysis Beverly Rivera 1,2, Irbis Gallegos 1, and Vladik Kreinovich 2 1 Regional Cyber and Energy Security Center RCES

More information

A Note on the Applied Use of MDL Approximations

A Note on the Applied Use of MDL Approximations A Note on the Applied Use of MDL Approxiations Daniel J. Navarro Departent of Psychology Ohio State University Abstract An applied proble is discussed in which two nested psychological odels of retention

More information

COS 424: Interacting with Data. Written Exercises

COS 424: Interacting with Data. Written Exercises COS 424: Interacting with Data Hoework #4 Spring 2007 Regression Due: Wednesday, April 18 Written Exercises See the course website for iportant inforation about collaboration and late policies, as well

More information

Ensemble Based on Data Envelopment Analysis

Ensemble Based on Data Envelopment Analysis Enseble Based on Data Envelopent Analysis So Young Sohn & Hong Choi Departent of Coputer Science & Industrial Systes Engineering, Yonsei University, Seoul, Korea Tel) 82-2-223-404, Fax) 82-2- 364-7807

More information

Estimating Parameters for a Gaussian pdf

Estimating Parameters for a Gaussian pdf Pattern Recognition and achine Learning Jaes L. Crowley ENSIAG 3 IS First Seester 00/0 Lesson 5 7 Noveber 00 Contents Estiating Paraeters for a Gaussian pdf Notation... The Pattern Recognition Proble...3

More information

Federal Reserve Bank of New York Staff Reports

Federal Reserve Bank of New York Staff Reports Federal Reserve Bank of New York Staff Reports Bootstrapping Density-Weighted Average Derivatives Matias D. Cattaneo Richard K. Crup Michael Jansson Staff Report no. 45 May 00 This paper presents preliinary

More information

GEE ESTIMATORS IN MIXTURE MODEL WITH VARYING CONCENTRATIONS

GEE ESTIMATORS IN MIXTURE MODEL WITH VARYING CONCENTRATIONS ACTA UIVERSITATIS LODZIESIS FOLIA OECOOMICA 3(3142015 http://dx.doi.org/10.18778/0208-6018.314.03 Olesii Doronin *, Rostislav Maiboroda ** GEE ESTIMATORS I MIXTURE MODEL WITH VARYIG COCETRATIOS Abstract.

More information

arxiv: v1 [cs.ds] 3 Feb 2014

arxiv: v1 [cs.ds] 3 Feb 2014 arxiv:40.043v [cs.ds] 3 Feb 04 A Bound on the Expected Optiality of Rando Feasible Solutions to Cobinatorial Optiization Probles Evan A. Sultani The Johns Hopins University APL evan@sultani.co http://www.sultani.co/

More information

Biostatistics Department Technical Report

Biostatistics Department Technical Report Biostatistics Departent Technical Report BST006-00 Estiation of Prevalence by Pool Screening With Equal Sized Pools and a egative Binoial Sapling Model Charles R. Katholi, Ph.D. Eeritus Professor Departent

More information

IN modern society that various systems have become more

IN modern society that various systems have become more Developent of Reliability Function in -Coponent Standby Redundant Syste with Priority Based on Maxiu Entropy Principle Ryosuke Hirata, Ikuo Arizono, Ryosuke Toohiro, Satoshi Oigawa, and Yasuhiko Takeoto

More information

Algorithms for parallel processor scheduling with distinct due windows and unit-time jobs

Algorithms for parallel processor scheduling with distinct due windows and unit-time jobs BULLETIN OF THE POLISH ACADEMY OF SCIENCES TECHNICAL SCIENCES Vol. 57, No. 3, 2009 Algoriths for parallel processor scheduling with distinct due windows and unit-tie obs A. JANIAK 1, W.A. JANIAK 2, and

More information

This model assumes that the probability of a gap has size i is proportional to 1/i. i.e., i log m e. j=1. E[gap size] = i P r(i) = N f t.

This model assumes that the probability of a gap has size i is proportional to 1/i. i.e., i log m e. j=1. E[gap size] = i P r(i) = N f t. CS 493: Algoriths for Massive Data Sets Feb 2, 2002 Local Models, Bloo Filter Scribe: Qin Lv Local Models In global odels, every inverted file entry is copressed with the sae odel. This work wells when

More information

Measures of average are called measures of central tendency and include the mean, median, mode, and midrange.

Measures of average are called measures of central tendency and include the mean, median, mode, and midrange. CHAPTER 3 Data Description Objectives Suarize data using easures of central tendency, such as the ean, edian, ode, and idrange. Describe data using the easures of variation, such as the range, variance,

More information

On Conditions for Linearity of Optimal Estimation

On Conditions for Linearity of Optimal Estimation On Conditions for Linearity of Optial Estiation Erah Akyol, Kuar Viswanatha and Kenneth Rose {eakyol, kuar, rose}@ece.ucsb.edu Departent of Electrical and Coputer Engineering University of California at

More information

A Self-Organizing Model for Logical Regression Jerry Farlow 1 University of Maine. (1900 words)

A Self-Organizing Model for Logical Regression Jerry Farlow 1 University of Maine. (1900 words) 1 A Self-Organizing Model for Logical Regression Jerry Farlow 1 University of Maine (1900 words) Contact: Jerry Farlow Dept of Matheatics Univeristy of Maine Orono, ME 04469 Tel (07) 866-3540 Eail: farlow@ath.uaine.edu

More information

RAFIA(MBA) TUTOR S UPLOADED FILE Course STA301: Statistics and Probability Lecture No 1 to 5

RAFIA(MBA) TUTOR S UPLOADED FILE Course STA301: Statistics and Probability Lecture No 1 to 5 Course STA0: Statistics and Probability Lecture No to 5 Multiple Choice Questions:. Statistics deals with: a) Observations b) Aggregates of facts*** c) Individuals d) Isolated ites. A nuber of students

More information

Nonlinear Log-Periodogram Regression for Perturbed Fractional Processes

Nonlinear Log-Periodogram Regression for Perturbed Fractional Processes Nonlinear Log-Periodogra Regression for Perturbed Fractional Processes Yixiao Sun Departent of Econoics Yale University Peter C. B. Phillips Cowles Foundation for Research in Econoics Yale University First

More information

Proc. of the IEEE/OES Seventh Working Conference on Current Measurement Technology UNCERTAINTIES IN SEASONDE CURRENT VELOCITIES

Proc. of the IEEE/OES Seventh Working Conference on Current Measurement Technology UNCERTAINTIES IN SEASONDE CURRENT VELOCITIES Proc. of the IEEE/OES Seventh Working Conference on Current Measureent Technology UNCERTAINTIES IN SEASONDE CURRENT VELOCITIES Belinda Lipa Codar Ocean Sensors 15 La Sandra Way, Portola Valley, CA 98 blipa@pogo.co

More information

About the definition of parameters and regimes of active two-port networks with variable loads on the basis of projective geometry

About the definition of parameters and regimes of active two-port networks with variable loads on the basis of projective geometry About the definition of paraeters and regies of active two-port networks with variable loads on the basis of projective geoetry PENN ALEXANDR nstitute of Electronic Engineering and Nanotechnologies "D

More information

Chapter 6: Economic Inequality

Chapter 6: Economic Inequality Chapter 6: Econoic Inequality We are interested in inequality ainly for two reasons: First, there are philosophical and ethical grounds for aversion to inequality per se. Second, even if we are not interested

More information

Inspection; structural health monitoring; reliability; Bayesian analysis; updating; decision analysis; value of information

Inspection; structural health monitoring; reliability; Bayesian analysis; updating; decision analysis; value of information Cite as: Straub D. (2014). Value of inforation analysis with structural reliability ethods. Structural Safety, 49: 75-86. Value of Inforation Analysis with Structural Reliability Methods Daniel Straub

More information

Block designs and statistics

Block designs and statistics Bloc designs and statistics Notes for Math 447 May 3, 2011 The ain paraeters of a bloc design are nuber of varieties v, bloc size, nuber of blocs b. A design is built on a set of v eleents. Each eleent

More information

When Short Runs Beat Long Runs

When Short Runs Beat Long Runs When Short Runs Beat Long Runs Sean Luke George Mason University http://www.cs.gu.edu/ sean/ Abstract What will yield the best results: doing one run n generations long or doing runs n/ generations long

More information

arxiv: v1 [stat.ot] 7 Jul 2010

arxiv: v1 [stat.ot] 7 Jul 2010 Hotelling s test for highly correlated data P. Bubeliny e-ail: bubeliny@karlin.ff.cuni.cz Charles University, Faculty of Matheatics and Physics, KPMS, Sokolovska 83, Prague, Czech Republic, 8675. arxiv:007.094v

More information

Best Linear Unbiased and Invariant Reconstructors for the Past Records

Best Linear Unbiased and Invariant Reconstructors for the Past Records BULLETIN of the MALAYSIAN MATHEMATICAL SCIENCES SOCIETY http:/athusy/bulletin Bull Malays Math Sci Soc (2) 37(4) (2014), 1017 1028 Best Linear Unbiased and Invariant Reconstructors for the Past Records

More information

CS Lecture 13. More Maximum Likelihood

CS Lecture 13. More Maximum Likelihood CS 6347 Lecture 13 More Maxiu Likelihood Recap Last tie: Introduction to axiu likelihood estiation MLE for Bayesian networks Optial CPTs correspond to epirical counts Today: MLE for CRFs 2 Maxiu Likelihood

More information

Recovering Data from Underdetermined Quadratic Measurements (CS 229a Project: Final Writeup)

Recovering Data from Underdetermined Quadratic Measurements (CS 229a Project: Final Writeup) Recovering Data fro Underdeterined Quadratic Measureents (CS 229a Project: Final Writeup) Mahdi Soltanolkotabi Deceber 16, 2011 1 Introduction Data that arises fro engineering applications often contains

More information

The Transactional Nature of Quantum Information

The Transactional Nature of Quantum Information The Transactional Nature of Quantu Inforation Subhash Kak Departent of Coputer Science Oklahoa State University Stillwater, OK 7478 ABSTRACT Inforation, in its counications sense, is a transactional property.

More information

Support recovery in compressed sensing: An estimation theoretic approach

Support recovery in compressed sensing: An estimation theoretic approach Support recovery in copressed sensing: An estiation theoretic approach Ain Karbasi, Ali Horati, Soheil Mohajer, Martin Vetterli School of Coputer and Counication Sciences École Polytechnique Fédérale de

More information

Course Notes for EE227C (Spring 2018): Convex Optimization and Approximation

Course Notes for EE227C (Spring 2018): Convex Optimization and Approximation Course Notes for EE227C (Spring 2018): Convex Optiization and Approxiation Instructor: Moritz Hardt Eail: hardt+ee227c@berkeley.edu Graduate Instructor: Max Sichowitz Eail: sichow+ee227c@berkeley.edu October

More information

13.2 Fully Polynomial Randomized Approximation Scheme for Permanent of Random 0-1 Matrices

13.2 Fully Polynomial Randomized Approximation Scheme for Permanent of Random 0-1 Matrices CS71 Randoness & Coputation Spring 018 Instructor: Alistair Sinclair Lecture 13: February 7 Disclaier: These notes have not been subjected to the usual scrutiny accorded to foral publications. They ay

More information

C na (1) a=l. c = CO + Clm + CZ TWO-STAGE SAMPLE DESIGN WITH SMALL CLUSTERS. 1. Introduction

C na (1) a=l. c = CO + Clm + CZ TWO-STAGE SAMPLE DESIGN WITH SMALL CLUSTERS. 1. Introduction TWO-STGE SMPLE DESIGN WITH SMLL CLUSTERS Robert G. Clark and David G. Steel School of Matheatics and pplied Statistics, University of Wollongong, NSW 5 ustralia. (robert.clark@abs.gov.au) Key Words: saple

More information

Lower Bounds for Quantized Matrix Completion

Lower Bounds for Quantized Matrix Completion Lower Bounds for Quantized Matrix Copletion Mary Wootters and Yaniv Plan Departent of Matheatics University of Michigan Ann Arbor, MI Eail: wootters, yplan}@uich.edu Mark A. Davenport School of Elec. &

More information

Topic 5a Introduction to Curve Fitting & Linear Regression

Topic 5a Introduction to Curve Fitting & Linear Regression /7/08 Course Instructor Dr. Rayond C. Rup Oice: A 337 Phone: (95) 747 6958 E ail: rcrup@utep.edu opic 5a Introduction to Curve Fitting & Linear Regression EE 4386/530 Coputational ethods in EE Outline

More information

On the Use of A Priori Information for Sparse Signal Approximations

On the Use of A Priori Information for Sparse Signal Approximations ITS TECHNICAL REPORT NO. 3/4 On the Use of A Priori Inforation for Sparse Signal Approxiations Oscar Divorra Escoda, Lorenzo Granai and Pierre Vandergheynst Signal Processing Institute ITS) Ecole Polytechnique

More information

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and

This article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and This article appeared in a ournal published by Elsevier. The attached copy is furnished to the author for internal non-coercial research and education use, including for instruction at the authors institution

More information

Sharp Time Data Tradeoffs for Linear Inverse Problems

Sharp Time Data Tradeoffs for Linear Inverse Problems Sharp Tie Data Tradeoffs for Linear Inverse Probles Saet Oyak Benjain Recht Mahdi Soltanolkotabi January 016 Abstract In this paper we characterize sharp tie-data tradeoffs for optiization probles used

More information

Optimal Jackknife for Discrete Time and Continuous Time Unit Root Models

Optimal Jackknife for Discrete Time and Continuous Time Unit Root Models Optial Jackknife for Discrete Tie and Continuous Tie Unit Root Models Ye Chen and Jun Yu Singapore Manageent University January 6, Abstract Maxiu likelihood estiation of the persistence paraeter in the

More information

Estimation of the Mean of the Exponential Distribution Using Maximum Ranked Set Sampling with Unequal Samples

Estimation of the Mean of the Exponential Distribution Using Maximum Ranked Set Sampling with Unequal Samples Open Journal of Statistics, 4, 4, 64-649 Published Online Septeber 4 in SciRes http//wwwscirporg/ournal/os http//ddoiorg/436/os4486 Estiation of the Mean of the Eponential Distribution Using Maiu Ranked

More information

ESE 523 Information Theory

ESE 523 Information Theory ESE 53 Inforation Theory Joseph A. O Sullivan Sauel C. Sachs Professor Electrical and Systes Engineering Washington University 11 Urbauer Hall 10E Green Hall 314-935-4173 (Lynda Marha Answers) jao@wustl.edu

More information

Quantum algorithms (CO 781, Winter 2008) Prof. Andrew Childs, University of Waterloo LECTURE 15: Unstructured search and spatial search

Quantum algorithms (CO 781, Winter 2008) Prof. Andrew Childs, University of Waterloo LECTURE 15: Unstructured search and spatial search Quantu algoriths (CO 781, Winter 2008) Prof Andrew Childs, University of Waterloo LECTURE 15: Unstructured search and spatial search ow we begin to discuss applications of quantu walks to search algoriths

More information

Chapter 6 1-D Continuous Groups

Chapter 6 1-D Continuous Groups Chapter 6 1-D Continuous Groups Continuous groups consist of group eleents labelled by one or ore continuous variables, say a 1, a 2,, a r, where each variable has a well- defined range. This chapter explores:

More information

Constructing Locally Best Invariant Tests of the Linear Regression Model Using the Density Function of a Maximal Invariant

Constructing Locally Best Invariant Tests of the Linear Regression Model Using the Density Function of a Maximal Invariant Aerican Journal of Matheatics and Statistics 03, 3(): 45-5 DOI: 0.593/j.ajs.03030.07 Constructing Locally Best Invariant Tests of the Linear Regression Model Using the Density Function of a Maxial Invariant

More information

Kernel Methods and Support Vector Machines

Kernel Methods and Support Vector Machines Intelligent Systes: Reasoning and Recognition Jaes L. Crowley ENSIAG 2 / osig 1 Second Seester 2012/2013 Lesson 20 2 ay 2013 Kernel ethods and Support Vector achines Contents Kernel Functions...2 Quadratic

More information

Distributed Subgradient Methods for Multi-agent Optimization

Distributed Subgradient Methods for Multi-agent Optimization 1 Distributed Subgradient Methods for Multi-agent Optiization Angelia Nedić and Asuan Ozdaglar October 29, 2007 Abstract We study a distributed coputation odel for optiizing a su of convex objective functions

More information

Testing for a Break in the Memory Parameter

Testing for a Break in the Memory Parameter Testing for a Break in the Meory Paraeter Fabrizio Iacone Econoics Departent London School of Econoics Houghton Street London WCA AE April 5, 005 Abstract It is often of interest in the applied analysis

More information

W-BASED VS LATENT VARIABLES SPATIAL AUTOREGRESSIVE MODELS: EVIDENCE FROM MONTE CARLO SIMULATIONS

W-BASED VS LATENT VARIABLES SPATIAL AUTOREGRESSIVE MODELS: EVIDENCE FROM MONTE CARLO SIMULATIONS W-BASED VS LATENT VARIABLES SPATIAL AUTOREGRESSIVE MODELS: EVIDENCE FROM MONTE CARLO SIMULATIONS. Introduction When it coes to applying econoetric odels to analyze georeferenced data, researchers are well

More information

Bootstrapping Density-Weighted Average Derivatives. May 17, 2010

Bootstrapping Density-Weighted Average Derivatives. May 17, 2010 Bootstrapping Density-Weighted Average Derivatives M D. C D E, U M R K. C F R B N Y M J D E, UC B CREATES May 17, 010 A. Eploying the sall bandwidth asyptotic fraework of Cattaneo, Crup, and Jansson (009),

More information

Best Procedures For Sample-Free Item Analysis

Best Procedures For Sample-Free Item Analysis Best Procedures For Saple-Free Ite Analysis Benjain D. Wright University of Chicago Graha A. Douglas University of Western Australia Wright s (1969) widely used "unconditional" procedure for Rasch saple-free

More information

DERIVING PROPER UNIFORM PRIORS FOR REGRESSION COEFFICIENTS

DERIVING PROPER UNIFORM PRIORS FOR REGRESSION COEFFICIENTS DERIVING PROPER UNIFORM PRIORS FOR REGRESSION COEFFICIENTS N. van Erp and P. van Gelder Structural Hydraulic and Probabilistic Design, TU Delft Delft, The Netherlands Abstract. In probles of odel coparison

More information

On the Communication Complexity of Lipschitzian Optimization for the Coordinated Model of Computation

On the Communication Complexity of Lipschitzian Optimization for the Coordinated Model of Computation journal of coplexity 6, 459473 (2000) doi:0.006jco.2000.0544, available online at http:www.idealibrary.co on On the Counication Coplexity of Lipschitzian Optiization for the Coordinated Model of Coputation

More information

Uniform Approximation and Bernstein Polynomials with Coefficients in the Unit Interval

Uniform Approximation and Bernstein Polynomials with Coefficients in the Unit Interval Unifor Approxiation and Bernstein Polynoials with Coefficients in the Unit Interval Weiang Qian and Marc D. Riedel Electrical and Coputer Engineering, University of Minnesota 200 Union St. S.E. Minneapolis,

More information

Compression and Predictive Distributions for Large Alphabet i.i.d and Markov models

Compression and Predictive Distributions for Large Alphabet i.i.d and Markov models 2014 IEEE International Syposiu on Inforation Theory Copression and Predictive Distributions for Large Alphabet i.i.d and Markov odels Xiao Yang Departent of Statistics Yale University New Haven, CT, 06511

More information

Fairness via priority scheduling

Fairness via priority scheduling Fairness via priority scheduling Veeraruna Kavitha, N Heachandra and Debayan Das IEOR, IIT Bobay, Mubai, 400076, India vavitha,nh,debayan}@iitbacin Abstract In the context of ulti-agent resource allocation

More information

1 Bounding the Margin

1 Bounding the Margin COS 511: Theoretical Machine Learning Lecturer: Rob Schapire Lecture #12 Scribe: Jian Min Si March 14, 2013 1 Bounding the Margin We are continuing the proof of a bound on the generalization error of AdaBoost

More information

Analyzing Simulation Results

Analyzing Simulation Results Analyzing Siulation Results Dr. John Mellor-Cruey Departent of Coputer Science Rice University johnc@cs.rice.edu COMP 528 Lecture 20 31 March 2005 Topics for Today Model verification Model validation Transient

More information

International Journal of Pure and Applied Mathematics Volume 37 No , IMPROVED DATA DRIVEN CONTROL CHARTS

International Journal of Pure and Applied Mathematics Volume 37 No , IMPROVED DATA DRIVEN CONTROL CHARTS International Journal of Pure and Applied Matheatics Volue 37 No. 3 2007, 423-438 IMPROVED DATA DRIVEN CONTROL CHARTS Wille Albers 1, Wilbert C.M. Kallenberg 2 1,2 Departent of Applied Matheatics Faculty

More information

Estimation of the Population Mean Based on Extremes Ranked Set Sampling

Estimation of the Population Mean Based on Extremes Ranked Set Sampling Aerican Journal of Matheatics Statistics 05, 5(: 3-3 DOI: 0.593/j.ajs.05050.05 Estiation of the Population Mean Based on Extrees Ranked Set Sapling B. S. Biradar,*, Santosha C. D. Departent of Studies

More information

Computable Shell Decomposition Bounds

Computable Shell Decomposition Bounds Coputable Shell Decoposition Bounds John Langford TTI-Chicago jcl@cs.cu.edu David McAllester TTI-Chicago dac@autoreason.co Editor: Leslie Pack Kaelbling and David Cohn Abstract Haussler, Kearns, Seung

More information

3.8 Three Types of Convergence

3.8 Three Types of Convergence 3.8 Three Types of Convergence 3.8 Three Types of Convergence 93 Suppose that we are given a sequence functions {f k } k N on a set X and another function f on X. What does it ean for f k to converge to

More information

Multi-Scale/Multi-Resolution: Wavelet Transform

Multi-Scale/Multi-Resolution: Wavelet Transform Multi-Scale/Multi-Resolution: Wavelet Transfor Proble with Fourier Fourier analysis -- breaks down a signal into constituent sinusoids of different frequencies. A serious drawback in transforing to the

More information

lecture 36: Linear Multistep Mehods: Zero Stability

lecture 36: Linear Multistep Mehods: Zero Stability 95 lecture 36: Linear Multistep Mehods: Zero Stability 5.6 Linear ultistep ethods: zero stability Does consistency iply convergence for linear ultistep ethods? This is always the case for one-step ethods,

More information

Stochastic Subgradient Methods

Stochastic Subgradient Methods Stochastic Subgradient Methods Lingjie Weng Yutian Chen Bren School of Inforation and Coputer Science University of California, Irvine {wengl, yutianc}@ics.uci.edu Abstract Stochastic subgradient ethods

More information

Inference in the Presence of Likelihood Monotonicity for Polytomous and Logistic Regression

Inference in the Presence of Likelihood Monotonicity for Polytomous and Logistic Regression Advances in Pure Matheatics, 206, 6, 33-34 Published Online April 206 in SciRes. http://www.scirp.org/journal/ap http://dx.doi.org/0.4236/ap.206.65024 Inference in the Presence of Likelihood Monotonicity

More information

Testing the lag length of vector autoregressive models: A power comparison between portmanteau and Lagrange multiplier tests

Testing the lag length of vector autoregressive models: A power comparison between portmanteau and Lagrange multiplier tests Working Papers 2017-03 Testing the lag length of vector autoregressive odels: A power coparison between portanteau and Lagrange ultiplier tests Raja Ben Hajria National Engineering School, University of

More information

Research in Area of Longevity of Sylphon Scraies

Research in Area of Longevity of Sylphon Scraies IOP Conference Series: Earth and Environental Science PAPER OPEN ACCESS Research in Area of Longevity of Sylphon Scraies To cite this article: Natalia Y Golovina and Svetlana Y Krivosheeva 2018 IOP Conf.

More information

A general forulation of the cross-nested logit odel Michel Bierlaire, Dpt of Matheatics, EPFL, Lausanne Phone: Fax:

A general forulation of the cross-nested logit odel Michel Bierlaire, Dpt of Matheatics, EPFL, Lausanne Phone: Fax: A general forulation of the cross-nested logit odel Michel Bierlaire, EPFL Conference paper STRC 2001 Session: Choices A general forulation of the cross-nested logit odel Michel Bierlaire, Dpt of Matheatics,

More information

Using a De-Convolution Window for Operating Modal Analysis

Using a De-Convolution Window for Operating Modal Analysis Using a De-Convolution Window for Operating Modal Analysis Brian Schwarz Vibrant Technology, Inc. Scotts Valley, CA Mark Richardson Vibrant Technology, Inc. Scotts Valley, CA Abstract Operating Modal Analysis

More information

Best Arm Identification: A Unified Approach to Fixed Budget and Fixed Confidence

Best Arm Identification: A Unified Approach to Fixed Budget and Fixed Confidence Best Ar Identification: A Unified Approach to Fixed Budget and Fixed Confidence Victor Gabillon Mohaad Ghavazadeh Alessandro Lazaric INRIA Lille - Nord Europe, Tea SequeL {victor.gabillon,ohaad.ghavazadeh,alessandro.lazaric}@inria.fr

More information

Lecture October 23. Scribes: Ruixin Qiang and Alana Shine

Lecture October 23. Scribes: Ruixin Qiang and Alana Shine CSCI699: Topics in Learning and Gae Theory Lecture October 23 Lecturer: Ilias Scribes: Ruixin Qiang and Alana Shine Today s topic is auction with saples. 1 Introduction to auctions Definition 1. In a single

More information

Q ESTIMATION WITHIN A FORMATION PROGRAM q_estimation

Q ESTIMATION WITHIN A FORMATION PROGRAM q_estimation Foration Attributes Progra q_estiation Q ESTIMATION WITHIN A FOMATION POGAM q_estiation Estiating Q between stratal slices Progra q_estiation estiate seisic attenuation (1/Q) on coplex stratal slices using

More information

Computable Shell Decomposition Bounds

Computable Shell Decomposition Bounds Journal of Machine Learning Research 5 (2004) 529-547 Subitted 1/03; Revised 8/03; Published 5/04 Coputable Shell Decoposition Bounds John Langford David McAllester Toyota Technology Institute at Chicago

More information