A KULLBACK-LEIBLER MEASURE OF CONDITIONAL SEGREGATION


Working Paper 10-15, Economic Series
June 2010
Departamento de Economía, Universidad Carlos III de Madrid
Calle Madrid, Getafe (Spain)
Fax (34)

Ricardo Mora and Javier Ruiz-Castillo
Departamento de Economía, Universidad Carlos III de Madrid

Abstract

In this paper the Kullback-Leibler notion of discrepancy (Kullback and Leibler, 1951) is used to propose a measure of multigroup segregation over a set of organizational units within a multivariate framework. Among the main results of the paper it is established that the Mutual Information index of segregation, M, first proposed by Theil and Finizza (1971), whose ranking has been fully characterized in terms of seven ordinal axioms by Frankel and Volij (2009), can be decomposed to isolate a term which captures segregation conditional on any vector of covariates. Furthermore, consistent estimators for M and the terms in its decomposition are proposed, and their asymptotic properties are obtained. The usefulness of the approach is illustrated by looking at patterns of multiracial segregation across public schools in the U.S. for the academic years 1989 and 2005. It is found that most within-cities segregation and a significant part of within-districts segregation is accounted for by county-level income per capita and wages per job, and teachers per pupil at school level.

Keywords: Kullback-Leibler Discrepancy; Conditional Segregation; Asymptotic Properties; Econometric Models.

[1] This is a completely new version of a 2007 Working Paper entitled The Statistical Properties of the Mutual Information Index of Multigroup Segregation. The authors acknowledge financial support from the Spanish DGI, Grants ECO and SEJ

I. INTRODUCTION

Social scientists have long been interested in the measurement of occupational segregation by gender, as well as in residential and educational segregation by ethnic group.[2] Mathematically, these problems are similar in the sense that both involve summarizing by means of a real number the information contained in the frequency of individuals (workers, residents, students) over a finite set of organizational units (occupations, neighbourhoods, schools) and a finite set of demographic groups (defined in terms of gender, racial, or ethnic categories). Such a real number is referred to as an index of segregation. For concreteness, this paper will use the example of school segregation in the multiracial case.

The main question we address is how to account for racial group and school differences in socioeconomic variables in the measurement of segregation. To place this issue into a proper perspective, think of school segregation at a national level as arising from two forces. Firstly, given the partition of cities into school districts, school segregation arises from politically determined segregative or integrative rules in the assignment of students to schools within a given district (see inter alia Rivkin, 1994, and Clotfelter, 1999). Secondly, imagine a situation without within-districts school segregation, that is, a situation where school district authorities all over the country are able to implement a policy that reproduces in all schools the racial mix of the district to which they belong. In this scenario, the student population would still experience some segregation arising from the residential choices adopted by their parents or caretakers: as long as the racial composition at the school district level differs from the racial composition at the city and/or the national level, there will be between-cities and between-districts (or within-cities) school segregation in the country as a whole.

Preferences and opportunities behind residential decisions may directly depend on a number of socioeconomic variables, giving rise to the main issue addressed in this paper. Assume, for instance, that there is a statistical association between student race and household income levels.
[2] For a treatise on occupational segregation by gender, see Flückiger and Silber (1999), and for a recent useful contribution on residential and school segregation, see Reardon and Firebaugh (2002).

In

so far as household income is a potential determinant of residential and school choice, it can be said that multigroup school segregation may be partially due to income inequality. Therefore, for both explanatory and policy reasons it is important to identify the extent to which the value of segregation arises from income and other socioeconomic characteristics.

In the absence of a better strategy, one can discretize the vector of socioeconomic controls and use indices of segregation which are additively decomposable into between and within discrete categories, such as in Reardon et al. (2000), Mora and Ruiz-Castillo (2003), and Frankel and Volij (2009). However, this strategy has a practical limitation and a conceptual drawback. The practical limitation stems from the curse of dimensionality: to avoid serious aggregation bias, one should consider as many categories as possible for each control, but with usual sample sizes this is implementable in practice only when the vector of controls has few dimensions. The conceptual drawback is due to the absence of a clear interpretation of the between term, as the discrete categories used are only arguable approximations of the actual values. Therefore, the use of indices which are additively decomposable into between and within discrete categories only partially answers this question because it does not deal properly with many continuous controls.

Other researchers have tried to develop notions of conditional segregation which should be implementable in a general multivariate framework. Frequently, their ultimate purpose is to assess to what extent segregation can be explained by the determinants of individual choice; to do so, they borrow the tools used in the literature on discrete choice. For example, in their analysis of occupational segregation, Spriggs and Williams (1996) propose a modified Duncan dissimilarity index which uses gender (and race) differences in estimated probabilities of being in an occupation obtained from multinomial logit models. Following closely Carrington and Troske (1997), other researchers have proposed indices of segregation which attempt to control for systematic differences in the distribution of covariates across groups.
For example, in the context of occupational segregation by immigrant

status, Åslund and Skans (2009) propose estimating the propensity score for each group given the vector of characteristics to create the benchmark random allocation (conditional on the covariates) for any segregation index.[3] They then develop a test of conditional segregation using an index of exposure.

The most important drawback of these strategies is that the indices obtained are neither characterized in terms of axiomatic properties, nor related in an unambiguous way to indices which are fully characterized. This implies that although the procedures suggested sometimes have a clear intuitive appeal, it is not clear how they relate to unconditional measures of segregation and one cannot be certain of what the resulting index actually measures.

In this paper, a multivariate statistical framework to analyse multigroup school segregation is set up by borrowing the Kullback and Leibler (1951) notion of discrepancy from Information Theory. A measure of segregation, $M$, is then proposed and shown to satisfy several important properties. Firstly, $M$ coincides with the Mutual Information index, first proposed by Theil and Finizza (1971) as a measure of racial school segregation at district level, and whose ranking has been recently characterized by Frankel and Volij (2009) in terms of seven ordinal desirable axioms. Secondly, Frankel and Volij (2009) show that, for any variable $d$ which partitions the set of schools or the set of racial categories, $M$ is strongly decomposable and the within term in this decomposition can be interpreted as segregation conditional on $d$. In this paper, this result is generalized to condition segregation on any vector of (possibly continuous) student and school characteristics $x$. In particular, the $M$ index can be decomposed into a between term, $M^{B}_{KL}$, which is a Kullback-Leibler measure of discrepancy and captures the statistical dependence between race status (or school membership) and $x$, and a within term, $M^{W}_{KL}$, which captures multigroup school segregation conditional on $x$.
[3] See also Hellerstein and Neumark (2008) and Kalter (2000) for related methodological proposals.

Because $M^{B}_{KL}$ and $M^{W}_{KL}$ are independent, in the sense that it is possible to introduce changes in the population to eliminate conditional segregation $M^{W}_{KL}$ keeping conditioning segregation $M^{B}_{KL}$ constant, this

decomposition allows us to answer questions such as to what extent is racial segregation at school level associated with racial differences in socioeconomic variables?[4] Moreover, since $M^{B}_{KL}$ and $M^{W}_{KL}$ are functions of terms that can be interpreted as qualitative response models, the decomposition provides an intuitive unifying econometric framework for studies of segregation using segregation indices and econometric models.

Thirdly, since segregation measures are routinely computed using samples, it is usually of interest to study their statistical significance. The simplest approach to this problem involves reporting t-statistics using computer intensive methods such as the bootstrap as in Boisso et al. (1994). A related approach consists of standardizing the segregation measure, using as mean and standard deviations estimates obtained from resampling under random assignment into groups and organizational units, as in Carrington and Troske (1997). Other authors have made use of a statistical framework for the empirical analysis of segregation, as in Kakwani (1994).

In this paper, for any sample of size $T$, estimators for both the $M$ index, $\hat{M}$, and also the between and within terms in its decomposition, $\hat{M}^{B}$ and $\hat{M}^{W}$, are proposed using the principle of analogy. $\hat{M}$ is shown to be a monotonic transformation of the likelihood-ratio statistic for testing statistical independence between school membership and racial status. Furthermore, when the vector of covariates $x$ only includes discrete variables, it is shown that $\hat{M}^{W}$ can be interpreted as a monotone transformation of the likelihood-ratio statistic for testing statistical independence between school membership and racial status given $x$. Finally, sufficient conditions are provided to obtain under all segregation scenarios the asymptotic properties of $\hat{M}$, $\hat{M}^{B}$, and $\hat{M}^{W}$, both in the case when all variables are discrete and also when there is at least one continuous variable in $x$.
[4] In the field of income inequality, between-groups income inequality can also be interpreted as the amount by which overall income inequality is reduced when the differences between subgroup income means are eliminated by making them equal to the population income mean (see, inter alia, Shorrocks, 1984). As shown by Mora and Ruiz-Castillo (2009), the corresponding interpretation is logically impossible in segregation studies.

To summarize, it has been shown elsewhere that $M$ is well grounded on an axiomatic notion of segregation. In this paper, we show that it can be used to estimate the level of segregation which does not arise from the statistical association between the demographic groups and any set of covariates.

The usefulness of the approach is illustrated by applying it to the analysis of multiracial segregation in the U.S. public schools. More specifically, we study to what extent the measures of within-cities and within-districts segregation are due to the statistical association between racial group membership and three continuous variables: county income per capita and wages per job, and teachers per pupil at district and school level. Results show that around 64% and 20% of, respectively, within-cities and within-districts segregation is accounted for by these three covariates, and that these shares are strongly significant.

The rest of the paper contains four sections. Section 2 sets up the general statistical framework, and defines $M$ and its decomposition in a multivariate framework. Section 3 proposes estimators $\hat{M}$, $\hat{M}^{B}$, and $\hat{M}^{W}$ and presents the asymptotic results. Section 4 contains the empirical illustration, while Section 5 offers some concluding comments.

II. A GENERAL STATISTICAL MODEL OF MULTIGROUP SCHOOL SEGREGATION

II.1. Measures of Segregation

It is useful to refer to a specific segregation problem. For consistency with the empirical illustration in Section IV, the case discussed throughout the paper is the multigroup school segregation problem. Assume a city $X$ consisting of $N$ schools, indexed by $n = 1, \ldots, N$. Each student belongs to any of $G$ racial groups, indexed by $g = 1, \ldots, G$. The data available can be organized into the following $G \times N$ matrix:

$$X = \{t_{gn}\} = \begin{pmatrix} t_{11} & \cdots & t_{1N} \\ \vdots & \ddots & \vdots \\ t_{G1} & \cdots & t_{GN} \end{pmatrix}$$

where $t_{gn}$ is the number of individuals of racial group $g$ attending school $n$, so that $t = \sum_{n=1}^{N} \sum_{g=1}^{G} t_{gn}$ is the total student population.

The information contained in the joint absolute frequencies of racial groups and schools, $t_{gn}$, is usually summarized by means of numerical indices of segregation. Let $X(G, N)$ be the set of all cities with $G$ groups and $N$ schools. A segregation index $S$ is a real valued function defined in $X(G, N)$, where $S(X)$ provides the extent of school segregation for any city $X \in X(G, N)$. Let $p_{gn} = t_{gn}/t$, and denote by $P_{GN} = \{p_{gn}\}_{g=1,\,n=1}^{G,\,N}$ the joint distribution of racial groups and schools in a city $X \in X(G, N)$. In the following section, the discussion will be restricted to indices that capture a relative view of segregation in which all that matters is the joint distribution, i.e. indices which admit a representation as a function of $P_{GN}$.[5]

II.2. A Kullback-Leibler Measure of Segregation

Consider the probability space $(\Omega, \mathcal{F}, \mu)$ where $\Omega$ is the set of possible samples $\{g, n, x\} \in \Omega$, where $x \in \Lambda \subseteq R^{k}$ is a vector of $k$ covariates. $\mathcal{F}$ is the σ-algebra of subsets of $\Omega$, and $\mu$ is a measure of the probability of the events in $\mathcal{F}$. Assume that there are two absolutely continuous measures with respect to $\mu$, $\mu_1$ and $\mu_2$, and two generalized density functions, $f_1(g, n, x)$ and $f_2(g, n, x)$, such that

$$\mu_i(E) = \int_{E} f_i(g, n, x)\, d\mu, \quad i = 1, 2,$$

for all $E \in \mathcal{F}$. The elements in $x$ may be univariate or multivariate, discrete or continuous, qualitative or quantitative, and the generalized density functions $f_i$ are known at most up to a parameter vector.

[5] This property, satisfied by most segregation indices, is referred to as Size Invariance in James and Taeuber (1985).

Consider the partition of $\Omega$ into $G \times N$ sets $D_{gn} = \{(r, s, x) \in \Omega : r = g,\ s = n,\ x \in \Lambda\}$ and let

$$\mu_i(g, n) = \int_{D_{gn}} f_i(r, s, x)\, d\mu, \quad i = 1, 2,$$

so that the probability that a student is of race $g$ and belongs to school $n$ under the probability measure $\mu_1$ is $p_{gn} = \mu_1(g, n) = \int_{D_{gn}} f_1(r, s, x)\, d\mu$, where $p_{gn} \geq 0$ and $\sum_{g=1}^{G} \sum_{n=1}^{N} p_{gn} = 1$. The marginal probabilities for race status and school membership are $p_{g} = \sum_{n=1}^{N} p_{gn}$ and $p_{n} = \sum_{g=1}^{G} p_{gn}$, respectively. For all $g$ and $n$ such that $\mu_i(g, n) > 0$, $i = 1, 2$, the generalized conditional density given race and school status is

$$f_i(x \mid g, n) = \frac{f_i(g, n, x)}{\mu_i(g, n)}.$$

Following Kullback (1959), a Kullback-Leibler, KL, measure of discrepancy between $f_1$ and $f_2$ is defined as:

$$I(1:2) = \int f_1(g, n, x) \log \frac{f_1(g, n, x)}{f_2(g, n, x)}\, d\mu. \tag{1}$$

Let $H_i$, $i = 1, 2$, represent the hypothesis that $(g, n, x)$ belongs to the statistical population with probability measure $\mu_i$, and define the logarithm of the likelihood ratio, $\log \frac{f_1(g, n, x)}{f_2(g, n, x)}$, as the information in $(g, n, x)$ for discrimination in favour of $H_1$ against $H_2$.[6] Then $I(1:2)$ can be interpreted as the mean discrepancy (or information for discrimination) in favour of $H_1$ against $H_2$ per observation from $\mu_1$ (see Kullback, 1959, p. 5).

Define the conditional probability of school membership $n$ given race status $g$ as $p_{n|g} = p_{gn}/p_{g}$, and let $P_{N|g} = \{p_{n|g}\}_{n=1}^{N}$ represent the conditional distribution of students from group $g$ across schools.

[6] The base of the logarithm is immaterial, providing essentially a unit of measure. The natural logarithm is used throughout the paper.
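For the purely discrete case (no covariates, or covariates already integrated out), the discrepancy in equation (1) reduces to a finite sum over cells. The following is a minimal sketch under that assumption; the function name `kl_discrepancy` and the flattened-list representation of the two distributions are illustrative choices, not notation from the paper.

```python
import math


def kl_discrepancy(f1, f2):
    """Discrete analogue of I(1:2): sum over cells of f1 * log(f1 / f2).

    f1 and f2 are sequences of cell probabilities over the same cells.
    Natural logarithms are used, as in the text. Cells with f1 == 0
    contribute nothing; f2 is assumed positive wherever f1 is positive.
    """
    total = 0.0
    for a, b in zip(f1, f2):
        if a > 0:
            total += a * math.log(a / b)
    return total


# Hypothetical two-cell example: mean information per observation from f1
# for discriminating in favour of H1 (f1) against H2 (f2).
f1 = [0.5, 0.5]
f2 = [0.25, 0.75]
print(kl_discrepancy(f1, f2))
```

The measure is not symmetric in its arguments, which is why the text speaks of discrimination "in favour of $H_1$ against $H_2$".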

Similarly, define the conditional probability of racial status $g$ given school membership $n$ as $p_{g|n} = p_{gn}/p_{n}$, and denote by $P_{G|n} = \{p_{g|n}\}_{g=1}^{G}$ the racial mix within school $n$. Indices in the segregation literature associate the absence of segregation with two situations. Firstly, racial groups are not segregated if the relative frequency with which a student attends school $n$ is constant, regardless of her racial group, i.e. $p_{n|g} = p_{n}$.[7] Secondly, the racial composition at all schools is fully representative of the population if the relative frequency with which students belong to racial group $g$ is constant regardless of the school which they attend, i.e. $p_{g|n} = p_{g}$.[8] These two notions of absence of segregation are equivalent and coincide with the concept of statistical independence between race status and school membership:

$$p_{gn} = p_{g}\, p_{n} \iff p_{n|g} = p_{n} \iff p_{g|n} = p_{g}.$$

Under the following three assumptions the KL notion of discrepancy between dependence and independence of race and school membership becomes a measure of segregation. For all $g = 1, \ldots, G$, $n = 1, \ldots, N$, and $x \in \Lambda \subseteq R^{k}$:

A1: $p_{gn} > 0$.

A2: $f_i(x \mid g, n) = f(x \mid g, n) > 0$ a.s., $i = 1, 2$.

A3: $\mu_2(g, n) = p_{g}\, p_{n} = \left(\sum_{n=1}^{N} p_{gn}\right)\left(\sum_{g=1}^{G} p_{gn}\right)$.

[7] Absence of segregation in this sense is consistent with the notion of segregation as evenness, advocated by James and Taeuber (1985), according to which segregation is seen as the tendency of racial groups to have different distributions across schools.

[8] Absence of segregation in this sense follows the idea of representativeness, emphasized by Frankel and Volij (2009), which asks to what extent schools have different racial compositions from the population as a whole, and it is closely related to the idea of isolation distinguished by Massey and Denton (1988) in the two-group case.

A1 eliminates from consideration combinations of races and schools that are a priori impossible to observe. A2 ensures that the marginal probabilities $p_{gn}$ are sufficient statistics with respect to the measure of discrepancy, so that no information is lost by disregarding $x$. Finally, A3 identifies $H_2$ with the notion

of statistical independence between race and school membership. Given equation (1), the following remark results.

Remark 1: Under assumptions A1 to A3, the notion of discrepancy $I(1:2)$ coincides with the Mutual Information index, $M$, i.e.

$$I(1:2) = \sum_{g=1}^{G} \sum_{n=1}^{N} p_{gn} \log \frac{p_{gn}}{p_{g}\, p_{n}} = \sum_{g=1}^{G} p_{g} \sum_{n=1}^{N} p_{n|g} \log \frac{p_{n|g}}{p_{n}} = M.$$

Theil (1972) shows that $M$ is bounded. The lower bound 0 is achieved whenever $p_{gn} = p_{g}\, p_{n}$ for all $g$ and $n$, while the upper bound is $\min\{\log(G), \log(N)\}$.

II.3. Multigroup School Conditional Segregation

Assumptions A1 to A3 do not require independence between race status (or school membership) and any of the covariates in $x$. Thus, as is pointed out in the Introduction, it will be generally of interest to evaluate the extent to which $M$ can be attributed to the statistical association between the covariates $x$ and the racial groups (or schools). Without loss of generality, let us consider the statistical association between racial groups and covariates $x$. It is always possible to factorize the generalized density $f_i(g, n, x)$ as $f_i(g, n, x) = f_i(n \mid g, x)\, f_i(g, x)$, where $f_i(g, x) = \sum_{n=1}^{N} f_i(g, n, x)$, $i = 1, 2$. Therefore, any measure of discrepancy $I(1:2)$ can always be decomposed into two terms:

$$I(1:2) = \int f_1(g, n, x) \log \frac{f_1(g, x)}{f_2(g, x)}\, d\mu + \int f_1(g, n, x) \log \frac{f_1(n \mid g, x)}{f_2(n \mid g, x)}\, d\mu. \tag{2}$$

The first term captures the discrepancy between $f_1(g, x)$ and $f_2(g, x)$, while the second term captures the discrepancy in conditional school assignment rules $f_1(n \mid g, x)$ and $f_2(n \mid g, x)$. In addition to A1, A2, and A3, the following four assumptions are sufficient to obtain a decomposition of $M$ so that one

term can be interpreted as racial discrepancy across covariates and the other term can be interpreted as conditional school segregation:

A4: $f_1(g, x) = f(g \mid x; \alpha)\, f(x)$, where $f(g \mid x; \alpha)$ is known up to parameter vector $\alpha \in R^{k_\alpha}$.

A5: $f_2(g, x) = p_{g}\, f(x)$ with $p_{g} = \int_{x \in \Lambda} f(g \mid x; \alpha)\, f(x)\, dx$ not uniquely identified by $\alpha$.

A6: $f_1(g, n \mid x) = f(g, n \mid x; \beta)$, where $f(g, n \mid x; \beta)$ is known up to parameter vector $\beta \in R^{k_\beta}$ which is not a function of $(\gamma, \alpha)^{t}$, where $\gamma = (p_{1}, \ldots, p_{G})$.

A7: $f_2(g, n \mid x) = f(g \mid x; \alpha)\, f(n \mid x; \beta)$, where $f(n \mid x; \beta) = \sum_{g=1}^{G} f(g, n \mid x; \beta)$.

Under assumptions A1 to A7, the $M$ index can be decomposed into a between term which captures the statistical dependence between race status (or school membership) and $x$, and a within term which captures multigroup school segregation conditional on $x$ (see Proposition 1 in Appendix):

$$M = M^{B}(\gamma, \alpha) + M^{W}(\alpha, \beta) \tag{3}$$

where

$$M^{B}(\gamma, \alpha) = \int_{x \in \Lambda} f(x) \sum_{g=1}^{G} f(g \mid x; \alpha) \log \frac{f(g \mid x; \alpha)}{p_{g}}\, dx$$

and

$$M^{W}(\alpha, \beta) = \int_{x \in \Lambda} f(x) \sum_{g=1}^{G} \sum_{n=1}^{N} f(g, n \mid x; \beta) \log \frac{f(g, n \mid x; \beta)}{f(g \mid x; \alpha)\, f(n \mid x; \beta)}\, dx.$$

The term $M^{B}(\gamma, \alpha)$ identifies the level of segregation which would remain if there were no segregation after controlling for the statistical dependence between the vector of covariates $x$ and racial status. Since $M^{W}(\alpha, \beta)$ is the level of segregation which is not related to racial discrepancy by covariates $x$, it can be referred to as school segregation by race conditional on $x$.

Decomposition (3) is appealing for at least two reasons. Firstly, $M^{W}(\alpha, \beta)$ and $M^{B}(\gamma, \alpha)$ are

independent in the sense that it is possible to introduce changes in the densities to eliminate $M^{W}(\alpha, \beta)$ keeping $M^{B}(\gamma, \alpha)$ constant. Secondly, conditional densities $f(n \mid x; \beta)$ and $f(n \mid g, x; \beta) = f(g, n \mid x; \beta)/f(g \mid x; \beta)$ can be interpreted as qualitative response models which stem from economic agents' utility maximizing choices under constraints. Thus, decomposition (3) provides an intuitive, unifying, econometric framework for studies of segregation using segregation indices and qualitative response econometric models.

Kullback (1959) points out that any KL discrepancy can be recursively decomposed into more than two terms. This is trivially seen with decomposition (3), as the first term is itself a KL discrepancy measure and, hence, it can itself be decomposed. A direct application of this property to the problem of multigroup school segregation permits the decomposition of $M$ into three terms capturing between-cities, within-cities, and within-districts school segregation.[9] For reasons of brevity, we leave to the reader the details of decompositions of more than two terms in the model.

One final point needs to be clarified. Suppose that all covariates $x$ are discrete, and that they partition the set of schools into disjoint subsets, such as when schools in a city are organized into a set of school districts. More specifically, assume that each school belongs to one of $K$ different school districts and let $p_{gn|d}$ denote the proportion of students of racial group $g$ at school $n$ within district $d$, $p_{gn|d} = p_{gn}/p_{d}$ for $n \in d$. Define $p_{gd}$ as the joint probability of race and district membership and let $p_{d}$ and $p_{gn|d}$ denote the marginal distribution of districts and the joint distribution of race and school membership conditional on district $d$, respectively. Finally, let $p_{g|d}$ and $p_{n|d}$ be the marginal distributions within district $d$ of race and school membership.

[9] See also Herranz et al. (2005) for an application of this principle in the context of occupational segregation by gender and Frankel and Volij (2009) for sequential clustering of racial categories in multiracial school segregation.

[10] See Frankel and Volij (2009) and Mora and Ruiz-Castillo (2009). For the two-group case, see Mora and Ruiz-Castillo (2003, 2004), and Herranz et al. (2005).

It has previously been shown that the $M$ index is decomposable for any partition of the schools into $K$ school districts into a between and a within term:[10]

$$M = M^{B} + M^{W}, \tag{4}$$

where

$$M^{B} = \sum_{d=1}^{K} \sum_{g=1}^{G} p_{gd} \log \frac{p_{gd}}{p_{g}\, p_{d}} \quad \text{and} \quad M^{W} = \sum_{d=1}^{K} p_{d} \sum_{n \in d} \sum_{g=1}^{G} p_{gn|d} \log \frac{p_{gn|d}}{p_{g|d}\, p_{n|d}}.$$

How does decomposition (4) relate to decomposition (3)? If $f(x) = p_{d}$, $f(g \mid x; \alpha) = p_{g|d}$, $f(g, n \mid x; \beta) = p_{gn|d}$, and $f(n \mid x; \beta) = p_{n|d}$, it can readily be shown that $M^{B}(\gamma, \alpha) = M^{B}$ and $M^{W}(\alpha, \beta) = M^{W}$. Thus, when the vector of covariates includes only discrete variables, the general decomposition in equation (3) exactly matches the decomposition of the $M$ index previously proposed in the literature for any partition of schools (or groups) as in equation (4).

III. ESTIMATION AND ASYMPTOTIC PROPERTIES

III.1. Estimation and Asymptotics of the M Index

Assume that a sample of $T$ observations from students with information on their race status, school membership, and covariates $x$, $\{g_i, n_i, x_i\}_{i=1}^{T}$, is available. Let $T_{gn}$ be the number of students of racial group $g$ in school $n$, so that $T = \sum_{n=1}^{N} \sum_{g=1}^{G} T_{gn}$. Let $T_{g} = \sum_{n=1}^{N} T_{gn} > 0$ and $T_{n} = \sum_{g=1}^{G} T_{gn} > 0$. Note that under assumptions A1 to A7, race and school status are jointly distributed as a nonparametric multinomial model. Without loss of generality, denote by $\mathcal{GN} = \{(g, n) : g = 1, \ldots, G,\ n = 1, \ldots, N,\ (g, n) \neq (G, N)\}$ the set of all race and school combinations except combination $(G, N)$. Then the marginal probabilities of race and school membership are fully identified by the vector $\theta = (p_{11}, p_{12}, \ldots, p_{G,N-1})^{t} \in \Theta \subset R^{GN-1}$, with $\Theta = \{p_{gn} : 0 < p_{gn} < 1,\ (g, n) \in \mathcal{GN}\}$ and $p_{GN} = 1 - \sum_{(g, n) \in \mathcal{GN}} p_{gn} > 0$. The $M$ index is bounded and continuous. Moreover, it can always be

estimated by

$$\hat{M} = \sum_{g=1}^{G} \sum_{n=1}^{N} \hat{p}_{gn} \log \frac{\hat{p}_{gn}}{\hat{p}_{g}\, \hat{p}_{n}},$$

where $\hat{p}_{gn} = T_{gn}/T$, $\hat{p}_{g} = T_{g}/T$, $\hat{p}_{n} = T_{n}/T$, and $0 \log(0) = 0$.

As is shown in Proposition 2 in the Appendix, under assumptions A1 to A3, $\operatorname{plim} \hat{M} = M$. An implication of this consistency result is that $\hat{M}$ converges in probability to 0 if and only if $p_{gn} = p_{g}\, p_{n}$ for all $g$ and $n$. Moreover, whenever two cities are to be ranked according to $M$, the implicit ordering from $\hat{M}$ converges in probability to the ordering induced by $M$.

The relation between Kullback-Leibler discrepancy measures and likelihood-ratio statistics for testing independence across categorical variables in contingency tables is well known (Kullback, 1959, p. 58). Here, to implement the likelihood ratio statistic for the independence of racial status and school membership, an additional parametric assumption on the conditional density for covariates $x$ is sufficient:

A8: $f(x \mid g, n) = f(x \mid g, n; \varphi)$ such that $f(x) = \sum_{g=1}^{G} \sum_{n=1}^{N} f(x \mid g, n; \varphi)\, p_{gn}$ and $\varphi \in R^{k_\varphi}$ does not depend on $\theta$.

Consider testing for the independence of race and school membership, i.e. $H_0: p_{gn} = p_{g}\, p_{n}$ for all $(g, n) \in \mathcal{GN}$, versus $H_1: p_{gn} \neq p_{g}\, p_{n}$. Let $\ell(\hat{\theta}, \hat{\varphi})$ be the log-likelihood evaluated at the maximum likelihood (hereafter ML) estimator, and let $\ell(\hat{\theta}_0, \hat{\varphi}_0)$ be the log-likelihood for the model under $H_0$ evaluated at the restricted ML estimator, so that $-2\log(\lambda) = -2\left(\ell(\hat{\theta}_0, \hat{\varphi}_0) - \ell(\hat{\theta}, \hat{\varphi})\right)$ is the log-likelihood ratio statistic and $\lambda$ is the likelihood ratio.

Remark 2: Under assumptions A1, A2, A3 and A8, $\hat{M} = -\log(\lambda)/T$.
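The plug-in estimator and the identity in Remark 2 can be checked numerically on a small table of counts. The sketch below is illustrative only: the function names `m_hat` and `neg_log_lambda` and the list-of-lists count table are assumptions made here, and only the multinomial part of the likelihood is evaluated, on the understanding that the covariate density is common to the restricted and unrestricted models under A8.

```python
import math


def m_hat(counts):
    """Plug-in Mutual Information index for a G x N table of counts T_gn."""
    total = sum(sum(row) for row in counts)
    row_tot = [sum(row) for row in counts]          # T_g, race totals
    col_tot = [sum(col) for col in zip(*counts)]    # T_n, school totals
    value = 0.0
    for g, row in enumerate(counts):
        for n, t in enumerate(row):
            if t > 0:  # convention 0 log(0) = 0
                # p_gn / (p_g p_n) = T_gn * T / (T_g * T_n)
                value += (t / total) * math.log(t * total / (row_tot[g] * col_tot[n]))
    return value


def neg_log_lambda(counts):
    """Unrestricted minus restricted multinomial log-likelihood, i.e. -log(lambda)."""
    total = sum(sum(row) for row in counts)
    row_tot = [sum(row) for row in counts]
    col_tot = [sum(col) for col in zip(*counts)]
    unrestricted = 0.0
    restricted = 0.0
    for g, row in enumerate(counts):
        for n, t in enumerate(row):
            if t > 0:
                unrestricted += t * math.log(t / total)
                restricted += t * math.log(row_tot[g] * col_tot[n] / total ** 2)
    return unrestricted - restricted


# Hypothetical city with 2 racial groups (rows) and 2 schools (columns).
counts = [[30, 10], [10, 30]]
T = sum(sum(row) for row in counts)
print(m_hat(counts), neg_log_lambda(counts) / T)
```

The multinomial coefficient is omitted from both log-likelihoods because it cancels in the ratio.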

Therefore, $\hat{M}$ is a monotonic transformation of the likelihood-ratio statistic for testing statistical independence between school membership and racial status. This implies that the ordering across cities provided by comparisons of city-specific log-likelihood ratios divided by city size is uniquely defined by the seven ordinal properties that characterize the $M$ index as shown by Frankel and Volij (2009). Note that $-\log(\lambda)$ is less appealing than $\hat{M}$ as a measure of segregation because the ordering induced by $\hat{M}$ is size invariant, while the ordering induced by $-\log(\lambda)$ is sensitive to sample size for any given set of relative frequencies.

The value $-\log(\lambda)$ can be seen to be a particular case of a general KL divergence test for the null hypothesis that $r$ independent samples are drawn from an identical distribution, whose functional form is known up to a vector of parameters of dimension $k$. Kupperman (1957) showed that, under certain regularity conditions, this general KL divergence test is asymptotically distributed as a chi-square distribution with $(r-1)k$ degrees of freedom. For the statistical model set up in the previous section, it is possible to invoke earlier well-known results on the properties of the $\lambda$ statistic under the null of independence to show that $2T\hat{M} \xrightarrow{d} \chi^{2}_{(G-1)(N-1)}$ (see Theorem 1 in the Appendix).[11]

In many practical situations, the hypothesis of independence or absence of segregation will be false, so that the relevant statistical properties for the index of segregation will be those under the true alternative. Salicrú et al. (1994) studied the asymptotic distribution of a family of estimators for which KL divided by sample size is a limiting case. Using the delta method, they find square-root convergence to a normal distribution under the alternative. In Theorem 2 in the Appendix it is shown that, under assumptions A1, A2, A3, and A8, if $p_{gn} \neq p_{g}\, p_{n}$ for at least one $(g, n) \in \mathcal{GN}$, then $T^{1/2}(\hat{M} - M)$ converges in distribution to a normal distribution with mean zero and positive variance.

[11] Morales et al. (1995) consider $\hat{M}$ as a particular case of a more general family of divergence measures between two consistent estimates of a discrete distribution. They find that, under the null, the chi-square distribution is an asymptotic approximation for all members of the family.
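The chi-square limit under the null can be illustrated with a small Monte Carlo experiment: when race and school are drawn independently, the average of $2T\hat{M}$ across replications should be close to the degrees of freedom $(G-1)(N-1)$. This is a rough sketch, not the paper's procedure; the function names, the marginal probabilities, the sample size, and the number of replications are all hypothetical choices.

```python
import math
import random


def m_hat(counts):
    """Plug-in Mutual Information index for a G x N table of counts."""
    total = sum(sum(row) for row in counts)
    row_tot = [sum(row) for row in counts]
    col_tot = [sum(col) for col in zip(*counts)]
    value = 0.0
    for g, row in enumerate(counts):
        for n, t in enumerate(row):
            if t > 0:  # convention 0 log(0) = 0
                value += (t / total) * math.log(t * total / (row_tot[g] * col_tot[n]))
    return value


def simulate_null_statistics(p_race, p_school, sample_size, replications, seed=0):
    """Draw samples with independent race and school; return 2*T*M_hat per draw."""
    rng = random.Random(seed)
    cells = [(g, n) for g in range(len(p_race)) for n in range(len(p_school))]
    weights = [p_race[g] * p_school[n] for g, n in cells]  # H0: p_gn = p_g * p_n
    stats = []
    for _ in range(replications):
        counts = [[0] * len(p_school) for _ in p_race]
        for g, n in rng.choices(cells, weights=weights, k=sample_size):
            counts[g][n] += 1
        stats.append(2 * sample_size * m_hat(counts))
    return stats


# Hypothetical design: G = 2 groups, N = 3 schools, so (G-1)(N-1) = 2.
stats = simulate_null_statistics([0.6, 0.4], [0.5, 0.3, 0.2],
                                 sample_size=300, replications=400)
print(sum(stats) / len(stats), (2 - 1) * (3 - 1))
```

Comparing only the mean is a weak check; a fuller exercise would compare simulated quantiles with chi-square critical values.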

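The asymptotic power of $\hat{M}$ can be estimated for fixed alternatives using this last result. Note, however, that the normal approximation will likely be poor if the sample is not large, and bootstrap inference may provide better approximations to the small sample distribution of $\hat{M}$.

III.2. Estimation and Asymptotics of Conditional Segregation

When a sample of iid observations of size $T$ is available, estimation of decomposition (3) can be carried out using the principle of analogy. The following four estimators will be considered:

$$\hat{M}^{B}(\hat{\gamma}, \hat{\alpha}) = \frac{1}{T} \sum_{i=1}^{T} \sum_{g=1}^{G} f(g \mid x_i; \hat{\alpha}) \log \frac{f(g \mid x_i; \hat{\alpha})}{\hat{p}_{g}},$$

$$\hat{M}^{W}(\hat{\alpha}, \hat{\beta}) = \frac{1}{T} \sum_{i=1}^{T} \sum_{g=1}^{G} \sum_{n=1}^{N} f(g, n \mid x_i; \hat{\beta}) \log \frac{f(g, n \mid x_i; \hat{\beta})}{f(g \mid x_i; \hat{\alpha})\, f(n \mid x_i; \hat{\beta})},$$

$$\hat{M}^{W}(\hat{\gamma}, \hat{\alpha}) = \hat{M} - \hat{M}^{B}(\hat{\gamma}, \hat{\alpha}),$$

$$\hat{M}(\hat{\gamma}, \hat{\alpha}, \hat{\beta}) = \hat{M}^{B}(\hat{\gamma}, \hat{\alpha}) + \hat{M}^{W}(\hat{\alpha}, \hat{\beta}),$$

where $(\hat{\gamma}, \hat{\alpha}, \hat{\beta})$ are estimates for $(\gamma, \alpha, \beta)$.

For the case in which all covariates $x$ are discrete and partition the set of schools, the sample analogues of decomposition (4) require no functional form assumptions for the densities of the variables. In this case:

$$\hat{M}^{B}(\hat{\gamma}, \hat{\alpha}) = \hat{M}^{B} = \sum_{d=1}^{K} \sum_{g=1}^{G} \hat{p}_{gd} \log \frac{\hat{p}_{gd}}{\hat{p}_{g}\, \hat{p}_{d}},$$

$$\hat{M}^{W}(\hat{\alpha}, \hat{\beta}) = \hat{M}^{W}(\hat{\gamma}, \hat{\alpha}) = \hat{M}^{W} = \sum_{d=1}^{K} \hat{p}_{d} \sum_{n \in d} \sum_{g=1}^{G} \hat{p}_{gn|d} \log \frac{\hat{p}_{gn|d}}{\hat{p}_{g|d}\, \hat{p}_{n|d}},$$

and $\hat{M} = \hat{M}^{B} + \hat{M}^{W}$ always.

The remainder of this section is devoted to the properties of these estimators. Results for the case in which all covariates are discrete and for the case in which at least one covariate is not discrete are presented in III.2.1 and III.2.2, respectively.

III.2.1. Estimation and Asymptotics of District versus School Segregation

Decomposition (4) aims to answer to what extent race segregation at district level can explain a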

significant amount of school segregation by race. In this section, the statistical properties of the estimators for decomposition (4), $\hat{M}^{B}$ and $\hat{M}^{W}$, are studied.[12] The term $\hat{M}^{B}$ is itself a mutual information index so that it converges in probability to the KL measure $M^{B}$ and, by Remark 2, can be motivated as the likelihood-ratio test for the independence between race and district membership. Asymptotic distributions for $\hat{M}^{B}$ both in the presence and the absence of dependence between race and district membership can be obtained using Theorems 1 and 2 in the Appendix after a trivial change in notation.

The within term $\hat{M}^{W}$ can also be motivated as a likelihood-ratio test. Consider testing for the independence of race and school membership within any district $d$, i.e. $H_0^{d}: p_{gn|d} = p_{g|d}\, p_{n|d}$, $(g, n) \in \mathcal{GN}_d$, $d = 1, \ldots, D$, versus the alternative $H_1^{d}: p_{gn|d} \neq p_{g|d}\, p_{n|d}$ for at least one combination $(d, g, n)$. Let $\ell(\{\hat{p}_{d}\}, \{\hat{p}_{gn|d}\})$ be the log-likelihood evaluated at the ML estimator, and let $\ell(\{\hat{p}^{0}_{d}\}, \{\hat{p}^{0}_{gn|d}\})$ be the log-likelihood for the model under $H_0^{d}$ evaluated at the restricted ML estimator, so that $-2\log(\lambda^{d}) = -2\left(\ell(\{\hat{p}^{0}_{d}\}, \{\hat{p}^{0}_{gn|d}\}) - \ell(\{\hat{p}_{d}\}, \{\hat{p}_{gn|d}\})\right)$ is the log-likelihood ratio statistic and $\lambda^{d}$ is the likelihood ratio.

Remark 3: Suppose that assumptions A1 to A8 hold, and that the vector $x$ includes only district code $d$. Then $\hat{M}^{W} = -\log(\lambda^{d})/T$.

[12] By interchanging the notation for groups and organizational units, the results presented here can be applied to decompositions when the set of racial groups is partitioned into supergroups.

Remark 3 provides an intuitive statistical interpretation for $\hat{M}^{W}$. We are not aware of any other within-groups term in a decomposition of an index of segregation so closely related to a classical statistical test. Remark 3 can be applied to any cluster of school districts, such as cities or regions, so that

the within terms in the resulting decompositions can be interpreted as monotone transformations of likelihood-ratio tests for the independence between race and school membership within the districts of the corresponding cluster.

The discussion of the discrete case ends with two results which characterize the asymptotic properties of $\hat{M}^{W}$. Firstly, it is shown in Theorem 3a in the Appendix that, under general conditions, if $p_{gn|d} = p_{g|d}\, p_{n|d}$ for all $(d, g, n)$, then $\hat{M}^{W}$ (scaled by the sample size $T$) converges in distribution to a quadratic form. Intuitively, under absence of segregation, the index of segregation for each district converges by Theorem 1 in the Appendix to a chi-square distribution. Since the within term $\hat{M}^{W}$ is a weighted average of these terms, it does not generally converge to a chi-square distribution. Secondly, if $p_{gn|d} \neq p_{g|d}\, p_{n|d}$ for at least one $(d, g, n)$, then $T^{1/2}(\hat{M}^{W} - M^{W})$ converges in distribution to a normal distribution with zero mean and positive variance (see Theorem 3b in the Appendix).

III.2.2. Conditional Segregation with Non-Discrete Covariates

Although decomposition (3) includes decomposition (4) as a special case, the former cannot be considered a true generalization of the latter. The reason is that, while in the finite-covariates situation no restrictive functional-form assumptions for the conditional densities are required to implement decomposition (3), in the presence of countable or continuous covariates identifying functional-form assumptions are implicit in parametric assumptions A4 to A7. This has two important implications. Firstly, the sum $\hat{M}^{B}(\hat{\gamma}, \hat{\alpha}) + \hat{M}^{W}(\hat{\alpha}, \hat{\beta})$ need not be equal to $\hat{M}$ for small samples. Secondly, there is generally no monotonic relation between the likelihood-ratio test and $\hat{M}^{B}(\hat{\gamma}, \hat{\alpha})$ and $\hat{M}^{W}(\hat{\alpha}, \hat{\beta})$.

Clearly, the fact that $\hat{M}^{B}(\hat{\gamma}, \hat{\alpha})$ and $\hat{M}^{W}(\hat{\alpha}, \hat{\beta})$ are unrelated to likelihood-ratio tests does not imply that they cannot be interpreted as statistical tests. The asymptotic properties of both estimators are next studied under different hypotheses, therefore providing their asymptotic motivation as

statistical tests. Sufficient conditions for asymptotic normality for $T^{1/2}\left(\hat{M}^{B}(\hat{\gamma}, \hat{\alpha}) - M^{B}(\gamma_0, \alpha_0)\right)$ and $T^{1/2}\left(\hat{M}^{W}(\hat{\alpha}, \hat{\beta}) - M^{W}(\alpha_0, \beta_0)\right)$ are given in the Appendix in Theorem 4 and Theorem 5, respectively. Asymptotic normality for $\hat{M}^{W}(\hat{\gamma}, \hat{\alpha}) = \hat{M} - \hat{M}^{B}(\hat{\gamma}, \hat{\alpha})$ and $\hat{M}(\hat{\gamma}, \hat{\alpha}, \hat{\beta}) = \hat{M}^{B}(\hat{\gamma}, \hat{\alpha}) + \hat{M}^{W}(\hat{\alpha}, \hat{\beta})$ follows directly. Note that these results apply to any set of consistent estimators for $(\gamma, \alpha, \beta)$. For ML estimators, the sufficient conditions will be satisfied under general regularity conditions.

IV. U.S. SCHOOL SEGREGATION

During the past decades, the U.S. has become increasingly racially and ethnically diverse, due to higher fertility and/or immigration rates among minorities, which have led to a faster population growth than that of the white population. The demographic advances of Hispanics and Asians are concentrated in certain parts of the country, while scarcely apparent in many others.[13] The main objective of this section is to illustrate the usefulness of the decompositions proposed in this paper to analyze racial school segregation patterns under the described changing environment.

IV.1. Data

We use the Common Core of Data (CCD) compiled by the National Center for Educational Statistics (NCES). This dataset contains school enrolment records according to racial/ethnic group from all public schools in the United States. Results are reported for the school years 1989 (the first year for which complete enrolment data is available) and 2005. Schools are retrospectively assigned Core Based Statistical Area (CBSA) codes based on 2005 ZIP codes so that comparisons over time can be made, without changes in city boundary definitions affecting the results.[14]

[13] The terms white, black, Asian and Native American are used throughout this section to refer to non-Hispanic members of these racial groups. Asians include Native Hawaiian and Pacific Islanders; Native Americans include American Indians and Alaska Natives (Inuit or Aleut). The term Hispanic is an ethnic rather than a racial category since Hispanic persons may belong to any race. The term racial group is used throughout to refer to each of these five racial/ethnic categories.

[14] CBSAs were published by the Office of Management and Budget in 2003 and refer collectively to urban clusters of at least 10,000 people. They replace the Metropolitan Statistical Areas (MSAs) which were used during the period.

The sample is restricted to

20 open regular shools 5 loated n 960 CSA odes referred to as tes n the 50 states and the Dstrt of Columba. hs overs approxmately 74% of the student populaton attendng U.S. publ shools n here s full nformaton for all targeted shools n For 989, however, a number of shools n a few states faled to report the data. As a onsequene, data for 989 only nludes 839 tes. 6 Unless otherwse spefed, results pertan to those shools for whh raal and ethn nformaton s avalable both n 989 and n Fousng on the shools whh provde nformaton n both years probably gves a farer omparson between the dstrbutons observed n 989 and n 2005, sne t does not nlude those shools whh reported n 2005 but had faled to do so n 989. However, nterpretablty of the results s also potentally ompromsed by the fat that some new shools were reated whlst others dsappeared between 989 and Nevertheless, use of all observatons does not sfantly hange the results (avalable upon request), suggestng that the seleton mehansms at work are not drvng the results of our analyss. IV.2. Raal Shool Segregaton n the U.S.: 989 and 2005 able presents the 989 and 2005 shool enrolment by rae, the overall raal ompostons n the U.S. urban publ shools, and the M ndex of overall raal shool segregaton. Natve Ameran, Asan, blak, and Hspan students already made up 34.8% of the total enrolment n 989. Sne growth rates n mnorty enrolment were larger than among whtes, by 2005 mnortes aounted for almost half, 48.05%, of total enrolment. Although all mnorty raal groups nreased ther share at the expense of whte students, the largest nreases are by far from Hspans, who, n 2005, were already the largest mnorty group n U.S. publ shools. he last row of able presents the M ndex, whh measures the expeted nformaton of the 5 hese are all operatonal shools, exept those foused on voatonal, speal, or other alternatve types of eduaton. 6 Reardon et al. 
(2000) study the publ shool populaton between 989 and 995 n 27 out of 323 MSAs, as defned by the Census ureau n 993. Frankel and Volj (2009) present results only for the 2005/2006 shool year, restrtng the sample to dstrts n CSAs wth at least two shools whh serve grades K-2. hus, the three papers study a smlar phenomenon, although ours overs a larger populaton durng a longer perod. 20

21 message that transforms the set of U.S. raal shares presented n the prevous panel of able to the set of shools raal shares. In 989, the ndex of segregaton (multpled by 00) was 43.92, and the ndex nreased by.3% to between 989 and able IV.3. Dstrt vs. Shool Segregaton n the U.S. Shools are organzed nto a set of shool dstrts whh are themselves organzed nto a set of tes. 7 hus, the overall ndex of segregaton an be deomposed nto three terms. he frst term results from dfferenes n raal shares between the tes and the natonal raal shares, so that t an be referred to as C (etween Ctes segregaton). he seond term aptures dfferenes n raal shares between the tes and the eduatonal dstrts, and s referred to as C (thn Ctes segregaton). Fnally, the last term n the deomposton aptures dfferenes n raal shares between the dstrts and the shools, and s referred to as D (thn Dstrts segregaton). able 2 presents the deomposton of overall raal shool segregaton nto the three omponents both for 989 and he results are n lne wth those reported n prevous empral studes. Frst ly, the C term, losely lnked to parental hoes of resdene at ty level, ontrbutes the most to overall shool segregaton. Seondly, the D term, losely lnked to the dstrt eduatonal authortes desons, s a relatvely small part around 9% of overall segregaton. able 2 heorems 2 and 3 from the Appendx an be nvoked to justfy the use of resamplng methods. able 2 presents 5% onfdene ntervals based on the normal approxmaton. Upper and lower lmts are obtaned from bootstrap estmates of the varane, usng 250 bootstrap samples of eah ndvdual student raal status wthn shools. Gven the very hgh level of aggregaton and the large sample szes, t s hardly surprsng to onfrm that all terms are sfantly dfferent from zero. Lookng at 7 he data orgnally onsst of 5,834 dstrts n 989 and 7,704 dstrts n For the ommon sample there are 5,429 dstrts. 2

22 dfferenes between the 989 and 2005 results, there s supportve evdene that the nreases observed n all terms are also sfant, and of the same order of matude: lose to 2% for C,.5% for C, and 0% for D. 8 Aggregaton at natonal level may mask large dfferenes n segregaton at ty and dstrt level. y Remark 3, both C and D an be nterpreted as lkelhood ratos for the null of absene of segregaton n any of the more than 800 tes, or the more than 5,000 dstrts. Clearly, ths does not mply that there s segregaton n all tes and all dstrts. A dret way to fnd out how many tes and dstrts have sfant levels of segregaton s to look dretly at eah ty and eah dstrt s loal ndex of raal segregaton. y Remark 2, these ndes are formal tests for the ndependene of raal and organzatonal unt status wthn eah geographal luster. he dstrbuton of these loal ndes under the absene of segregaton an be approxmated usng the h-square dstrbuton wth the approprate degrees of freedom. A naïve proedure to assess how many tes and dstrts present sfant levels of segregaton apples the test to eah ty and dstrt and then ounts those tes and dstrts for whh the null annot be rejeted at a gven onfdene level. Usng ths proedure wth a % onfdene nterval, segregaton s found to be sfant n all tes and the vast majorty (99%) of dstrts n both years. hs approah has the well-known drawbak that, by des, we should expet a postve number of rejetons even f the null s always true. Several orretons have been proposed n the lterature (see, for example, Romano et al., 2008). Usng the Holm orreton, segregaton remans sfant n all tes and n most dstrts, although the perentage of dstrts for whh segregaton s sfantly dfferent from zero slghtly dereases (98%). A related queston s whether segregaton levels are very dfferent among tes and dstrts. 
Given that the M index represents a unique ordering of clusters of the organizational units satisfying a set of desirable properties, it is useful to address this question by assessing whether the ranking of cities and the ranking of districts is significant. There are several ways to define ranking significance. For brevity, here we only mean whether the position in the ranking for each of the cities and each of the districts is precisely estimated. One simple way to address this issue is by bootstrapping the rankings and reporting basic bootstrap confidence limits for the ranking for each city and district. Figure 1 presents this information graphically. The y-axis for each plot shows both the index values and 10% bootstrap confidence limits for each city and district by year. Cities and districts with the lowest levels of segregation are ranked first so that they are represented to the left on the x-axis. Thus, all graphs present a positive slope by construction.

Figure 1 shows that school districts with large segregation values tend to be ranked more precisely than school districts with low segregation values. The rank of those districts with the lowest levels of segregation is, in fact, very poorly estimated and its confidence intervals often range in the hundreds of positions. Regarding cities, however, the availability of large samples allows us to obtain precise estimates of the rank in most cases. Finally, a note of caution is due regarding the interpretation of Figure 1: since the ordering is specific for each year, Figure 1 does not show ranking dynamics between 1989 and 2005.

⁸ Using all schools in 1989 and 2005 results in slightly larger increases for BC: around 16%.

Figure 1

IV.4. Multigroup Conditional School Segregation: The Role of Income, Wages, and Teachers per Pupil

This subsection considers to what extent the measures of WC and WD presented so far are due to the statistical association between racial group membership and socioeconomic covariates, using the methodological framework developed in subsection III.2.2. We focus on two sets of controls. Firstly, it has been argued in section II.2 that, given that household income is a potential determinant of residential and school choice, it would be interesting to identify the extent to which multigroup school segregation arises from income differences across races. In addition, residential choices may potentially be affected by the composition of earnings into wage and non-wage income in the presence of credit market restrictions. Therefore, we would hope to identify the extent to which multigroup school segregation arises from race differentials in the wage to income ratio.

Secondly, the impact of class size (or its inverse, the number of teachers per pupil) on academic performance and other outcomes has long been subject to debate in academic studies and political circles, where the reduction of class sizes is frequently seen as an operational way for educational authorities to effectively increase resources in schools with special needs. At the same time, parents aware of the potential positive effects of small class size on their children's educational achievements will likely make their residential and school choices dependent on how schools differ in this dimension. Thus, it would be interesting to identify the extent to which multigroup school segregation arises from class size differences across schools.

Our empirical illustration tentatively addresses these two issues by merging the CCD data with aggregated measures of income and wages at county level. We specifically study the contribution to the measurements of WC and WD in 2005 of the discrepancy in the racial mix by city and by district for different values of average annual income per capita at county level, average annual wages per job at county level, and teachers per pupil at school level.⁹ Both for WC and WD, we estimate components $M^{B}(g, a)$ and $M^{W}(g, a) = M - M^{B}(g, a)$ using estimates of conditional densities $f(g \mid x; a)$ based on logistic regressions carried out at district and school level. In particular, for each racial group at city (district) level we assume that

$$f(g \mid x; a) = \left(1 + e^{-x' a_g}\right)^{-1},$$

where $x$ includes per capita income, wages per job, and teachers per pupil at county (school) level in addition to dummy variables for city (district) to control for between city (district) segregation. Logistic regressions for each of the five racial groups are run, using as the dependent variable in each of the regressions the logistic transformation of the observed frequency of students of a given race in a given district (school), and as controls the averages at district (school) level for income, wages, and teachers per pupil, in addition to the city (district) dummies. Table 3 presents a summary of the results.

⁹ Since income includes non-wage income, income and wages are correlated in our county sample but not perfectly collinear. The variable teachers per pupil at school level can be constructed using the information on the number of teachers and pupils reported by most schools since 2002. County codes, also available from 2002 onwards, allow us to merge the 2005 dataset with the 2004 annual per capita personal income and average wage per job by county published by the Bureau of Economic Analysis of the U.S. Department of Commerce.

Table 3

Estimated marginal effects can be interpreted as the expected change in probability (in percentage terms) associated with a one-percent increase in each of the controls. For example, a 1% increase in per capita personal income at district level is associated with a 0.83% expected increase in the probability of a student being white, and a 0.32% expected decrease in the probability of a student being black. At district level, increases in per capita income are, ceteris paribus, associated with increases in whites and Asians and decreases in blacks and Hispanics, whilst increases in wages per job are associated with decreases in whites and increases in blacks and Hispanics. These results arguably reflect the higher probability of black and Hispanic students having parents who have lower overall per capita income and who are more likely to be salaried workers. Increases in teachers per pupil are associated with decreases in all minority groups. With respect to whites, the point estimate of the relation is positive, although statistically not significant.

At school level, increases in county per capita income are associated again with significant increases in whites and significant decreases in blacks. The signs of the estimates for the other groups are similar to those obtained for the district level regression, but the estimates are not significant. With respect to wages per job, results are again similar for whites, Hispanics and blacks, while the parameter estimates for Native Americans and Asians are not significant. Finally, the effect of increases in teachers per pupil is reversed for blacks at school level: a 1% increase in the teachers per pupil in a school increases the probability that a given student is black by 0.05%. Obviously, a causal interpretation should not be attached to these estimates. Nevertheless, the strong significance of these effects suggests that a significant part of WC and WD stems from the statistical association of these covariates with race.
This central issue is addressed in the last panel of Table 3.
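To make the role of the fitted conditional probabilities concrete, the following is a minimal sketch (not the authors' code) of a plug-in version of the conditional term. Following the expression for $M^{B}(g, a)$ in Proposition 1 in the Appendix, it averages over observed units the Kullback-Leibler discrepancy between the fitted racial shares $f(g \mid x; \hat{a})$ and the overall shares $p_g$. The function name, the use of enrolment weights to approximate the integral over $x$, and the toy numbers are assumptions made for illustration only.

```python
import math


def conditional_between_term(fitted, weights, overall):
    """Weighted average over units of sum_g f_ig * log(f_ig / p_g).

    fitted  : list of rows, one per unit, with fitted shares f(g | x_i; a_hat)
    weights : enrolment of each unit (hypothetical stand-in for f(x))
    overall : overall racial shares p_g
    """
    total_weight = sum(weights)
    value = 0.0
    for probs, weight in zip(fitted, weights):
        # Kullback-Leibler discrepancy of this unit's fitted shares from p_g;
        # cells with zero fitted probability contribute nothing.
        kl = sum(f * math.log(f / p) for f, p in zip(probs, overall) if f > 0)
        value += (weight / total_weight) * kl
    return value


# Toy example: two equally sized units and two racial groups (made-up numbers).
fitted_shares = [[0.8, 0.2], [0.4, 0.6]]
enrolment = [100, 100]
overall_shares = [0.6, 0.4]
m_between = conditional_between_term(fitted_shares, enrolment, overall_shares)
```

When every unit's fitted shares coincide with the overall shares, the term is zero, which matches the reading of the conditional term as the part of segregation attributable to the covariates.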

Once estimates $\hat{a}$ are obtained using the logistic regressions carried out at district and school level, the term $M^{B}(g, a)$ can be estimated using $\hat{M}^{B}(\hat{g}, \hat{a})$ with $\hat{g} = (\hat{p}_1, \ldots, \hat{p}_G)'$. The term All controls represents $\hat{M}^{B}(\hat{g}, \hat{a})$ at city and district level (segregation at city and district level stemming from the statistical association between race membership and per capita personal income, wages per job, and teachers per pupil). Asymptotic standard errors (using the results from Theorem 4 in the Appendix) are shown in parentheses. Results show that most of WC (around 64%) and over 20% of WD is accounted for by these three covariates. These effects are significant even in the within-districts case. Finally, to evaluate the potentially attenuating effect on segregation of teachers per pupil, the conditional segregation terms are simulated as if this control had no effect. The average effect remains the same for WD, while it decreases very slightly for WC.

V. CONCLUSIONS

The starting point of this paper is the use of the Kullback-Leibler notion of discrepancy (Kullback and Leibler, 1951) to propose a decomposition of the Mutual Information index of segregation, M, first introduced by Theil and Finizza (1971), to isolate segregation conditional on any vector of socioeconomic characteristics. Estimators for M and the terms in its decomposition are proposed, and their asymptotic properties are obtained. The usefulness of the approach is illustrated by looking at patterns of multigroup school segregation in the U.S. for the 1989 and 2005 school years.

Several interesting results stem from direct application of the tools developed in the paper. Overall multigroup school segregation, which is measured as the discrepancy between the set of U.S. racial shares and the set of schools' racial shares, is significantly positive and has significantly increased during the 15-year period. In the decomposition of overall segregation into between-cities, within-cities, and within-districts segregation, the findings are in line with previous studies: between-cities segregation, closely linked to parental choices of residence at city level, contributes the most to overall school segregation. In contrast, within-districts segregation, potentially linked to policies by the district educational authorities, represents around 9% of overall segregation. All terms in the decomposition of overall segregation are significantly different from zero, and evidence is found that all of them significantly increased during the period.

Aggregation at national level may mask large differences in segregation at city and district level. However, when segregation is studied recursively by city and district, it is found to be significant in all cities and the vast majority of districts in both years. A related question is whether the ranking of cities and the ranking of districts is significant. Using bootstrap techniques, it is found that the rank of those districts with the lowest levels of segregation is, in fact, very poorly estimated and confidence intervals often range in the hundreds of positions. Regarding the ranking of cities, however, the availability of large samples allows us to obtain precise rank estimates for most cities.

Finally, we study to what extent the measures of within-cities and within-districts segregation are due to the statistical association between racial group membership and three continuous variables: annual per capita county income, wages per job at county level, and teachers per pupil at school level. Results show that around 64% and 20% of, respectively, within-cities and within-districts segregation is accounted for by these three covariates, and that the effects are strongly significant. These results illustrate why, for both explanatory and policy reasons, it is important to identify the extent to which the value of segregation arises from income and other socioeconomic characteristics. Results suggest that to reduce school segregation levels it may prove necessary to reduce income and other inequalities across races.

APPENDIX

Proposition 1: Under assumptions A1 to A7, $M = M^{B}(g, a) + M^{W}(a, b)$, where

$$M^{B}(g, a) = \int_{x \in \Lambda} f(x) \sum_{g=1}^{G} f(g \mid x; a) \log \frac{f(g \mid x; a)}{p_g}\, dx,$$

$$M^{W}(a, b) = \int_{x \in \Lambda} f(x) \sum_{g=1}^{G} \sum_{n=1}^{N} f(g, n \mid x; b) \log \frac{f(g, n \mid x; b)}{f(g \mid x; a)\, f(n \mid x; b)}\, dx.$$

Proof. First note that, given assumptions A4 and A5, $\frac{f_1(g, x)}{f_2(g, x)} = \frac{f(g \mid x; a)}{p_g}$, and thus

$$\int f(g, n, x) \log \frac{f_1(g, x)}{f_2(g, x)}\, d\mu = \int_{x \in \Lambda} f(x) \sum_{g=1}^{G} f(g \mid x; a) \log \frac{f(g \mid x; a)}{p_g}\, dx = M^{B}(g, a).$$

From assumptions A5 and A6, $f_1(n \mid g, x) = \frac{f(g, n \mid x; b)}{f(g \mid x; a)}$ and $f(g, n, x) = f(g, n \mid x; b)\, f(x)$. Given assumption A7 we directly see that:

$$\int f(g, n, x) \log \frac{f_1(n \mid g, x)}{f_2(n \mid g, x)}\, d\mu = \int_{x \in \Lambda} f(x) \sum_{g=1}^{G} \sum_{n=1}^{N} f(g, n \mid x; b) \log \frac{f(g, n \mid x; b)}{f(g \mid x; a)\, f(n \mid x; b)}\, dx = M^{W}(a, b).$$

Proposition 2: Under assumptions A1 to A3, $\operatorname{plim} \hat{M} = M$.

Proof. We first note that by direct application of Lemma 1 in Rao (1957), the sample frequencies converge in probability to the actual probabilities, i.e. $\operatorname{plim} \hat{p}_{gn} = p_{gn}$. Define the mutual information index as a function of the parameter vector $q \in \Theta$, which collects the cell probabilities $p_{gn}$ for $(g, n) \neq (G, N)$:

$$m(q) = \sum_{(g, n) \neq (G, N)} p_{gn} \log \frac{p_{gn}}{p_g(q)\, p_n(q)} + p_{GN}(q) \log \frac{p_{GN}(q)}{p_G(q)\, p_N(q)},$$

where $p_{GN}(q) = 1 - \sum_{(g, n) \neq (G, N)} p_{gn}$,

$$p_g(q) = \begin{cases} \sum_{n=1}^{N} p_{gn} & \text{if } g < G, \\ 1 - \sum_{j=1}^{G-1} \sum_{n=1}^{N} p_{jn} & \text{if } g = G, \end{cases} \qquad p_n(q) = \begin{cases} \sum_{g=1}^{G} p_{gn} & \text{if } n < N, \\ 1 - \sum_{g=1}^{G} \sum_{j=1}^{N-1} p_{gj} & \text{if } n = N. \end{cases}$$

Let $q^0$ be the vector containing the true probabilities. Since $m(q)$ is continuous at $q = q^0$, by the Slutsky theorem it follows that $\operatorname{plim} \hat{M} = \operatorname{plim} m(\hat{q}) = m(q^0) = M$.

Theorem 1: Suppose that A1, A2, A3, and A8 hold. If $p_{gn} = p_g\, p_n$ for all $(g, n) \in GN$, then:

$$2T\hat{M} \xrightarrow{d} \chi^2_{(G-1)(N-1)}.$$
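As an illustration of how Theorem 1 can be put to work, the sketch below computes the sample M index from a table of counts (racial groups in rows, schools in columns) and the statistic $2T\hat{M}$, which coincides with the usual likelihood-ratio statistic for independence in a contingency table. The function names and the toy counts are assumptions for illustration; natural logarithms are used throughout.

```python
import math


def mutual_information_index(counts):
    """Sample M index: sum over cells of p_gn * log(p_gn / (p_g * p_n))."""
    total = sum(sum(row) for row in counts)
    row_totals = [sum(row) for row in counts]        # group totals
    col_totals = [sum(col) for col in zip(*counts)]  # school totals
    m = 0.0
    for g, row in enumerate(counts):
        for n, cell in enumerate(row):
            if cell > 0:  # empty cells contribute zero
                # p_gn / (p_g * p_n) simplifies to cell * total / (row * col)
                ratio = cell * total / (row_totals[g] * col_totals[n])
                m += (cell / total) * math.log(ratio)
    return m


def independence_statistic(counts):
    """Return (2 * T * M_hat, degrees of freedom (G - 1) * (N - 1))."""
    total = sum(sum(row) for row in counts)
    n_groups, n_units = len(counts), len(counts[0])
    stat = 2 * total * mutual_information_index(counts)
    return stat, (n_groups - 1) * (n_units - 1)


# Toy example: two groups, two schools (made-up counts).
toy_counts = [[30, 10], [10, 30]]
toy_stat, toy_df = independence_statistic(toy_counts)
```

Under the null of no segregation, the statistic can be compared with a chi-square distribution with the returned degrees of freedom; with SciPy installed, for instance, `scipy.stats.chi2.sf(toy_stat, toy_df)` would give the corresponding p-value.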


Lecture 6: Introduction to Linear Regression Lecture 6: Introducton to Lnear Regresson An Manchakul amancha@jhsph.edu 24 Aprl 27 Lnear regresson: man dea Lnear regresson can be used to study an outcome as a lnear functon of a predctor Example: 6

More information

LINEAR REGRESSION ANALYSIS. MODULE VIII Lecture Indicator Variables

LINEAR REGRESSION ANALYSIS. MODULE VIII Lecture Indicator Variables LINEAR REGRESSION ANALYSIS MODULE VIII Lecture - 7 Indcator Varables Dr. Shalabh Department of Maematcs and Statstcs Indan Insttute of Technology Kanpur Indcator varables versus quanttatve explanatory

More information

First day August 1, Problems and Solutions

First day August 1, Problems and Solutions FOURTH INTERNATIONAL COMPETITION FOR UNIVERSITY STUDENTS IN MATHEMATICS July 30 August 4, 997, Plovdv, BULGARIA Frst day August, 997 Problems and Solutons Problem. Let {ε n } n= be a sequence of postve

More information

Uniform Price Mechanisms for Threshold Public Goods Provision with Private Value Information: Theory and Experiment

Uniform Price Mechanisms for Threshold Public Goods Provision with Private Value Information: Theory and Experiment Unform Pre Mehansms for Threshold Publ Goods Provson wth Prvate Value Informaton: Theory and Experment Zh L *, Chrstopher Anderson, and Stephen Swallow Abstrat Ths paper ompares two novel unform pre mehansms

More information

NUMERICAL DIFFERENTIATION

NUMERICAL DIFFERENTIATION NUMERICAL DIFFERENTIATION 1 Introducton Dfferentaton s a method to compute the rate at whch a dependent output y changes wth respect to the change n the ndependent nput x. Ths rate of change s called the

More information

2016 Wiley. Study Session 2: Ethical and Professional Standards Application

2016 Wiley. Study Session 2: Ethical and Professional Standards Application 6 Wley Study Sesson : Ethcal and Professonal Standards Applcaton LESSON : CORRECTION ANALYSIS Readng 9: Correlaton and Regresson LOS 9a: Calculate and nterpret a sample covarance and a sample correlaton

More information

Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur

Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur Analyss of Varance and Desgn of Experment-I MODULE VII LECTURE - 3 ANALYSIS OF COVARIANCE Dr Shalabh Department of Mathematcs and Statstcs Indan Insttute of Technology Kanpur Any scentfc experment s performed

More information

The Order Relation and Trace Inequalities for. Hermitian Operators

The Order Relation and Trace Inequalities for. Hermitian Operators Internatonal Mathematcal Forum, Vol 3, 08, no, 507-57 HIKARI Ltd, wwwm-hkarcom https://doorg/0988/mf088055 The Order Relaton and Trace Inequaltes for Hermtan Operators Y Huang School of Informaton Scence

More information

4 Analysis of Variance (ANOVA) 5 ANOVA. 5.1 Introduction. 5.2 Fixed Effects ANOVA

4 Analysis of Variance (ANOVA) 5 ANOVA. 5.1 Introduction. 5.2 Fixed Effects ANOVA 4 Analyss of Varance (ANOVA) 5 ANOVA 51 Introducton ANOVA ANOVA s a way to estmate and test the means of multple populatons We wll start wth one-way ANOVA If the populatons ncluded n the study are selected

More information

Structure and Drive Paul A. Jensen Copyright July 20, 2003

Structure and Drive Paul A. Jensen Copyright July 20, 2003 Structure and Drve Paul A. Jensen Copyrght July 20, 2003 A system s made up of several operatons wth flow passng between them. The structure of the system descrbes the flow paths from nputs to outputs.

More information

Using the estimated penetrances to determine the range of the underlying genetic model in casecontrol

Using the estimated penetrances to determine the range of the underlying genetic model in casecontrol Georgetown Unversty From the SelectedWorks of Mark J Meyer 8 Usng the estmated penetrances to determne the range of the underlyng genetc model n casecontrol desgn Mark J Meyer Neal Jeffres Gang Zheng Avalable

More information

First Year Examination Department of Statistics, University of Florida

First Year Examination Department of Statistics, University of Florida Frst Year Examnaton Department of Statstcs, Unversty of Florda May 7, 010, 8:00 am - 1:00 noon Instructons: 1. You have four hours to answer questons n ths examnaton.. You must show your work to receve

More information

technische universiteit eindhoven Analysis of one product /one location inventory control models prof.dr. A.G. de Kok 1

technische universiteit eindhoven Analysis of one product /one location inventory control models prof.dr. A.G. de Kok 1 TU/e tehnshe unverstet endhoven Analyss of one produt /one loaton nventory ontrol models prof.dr. A.G. de Kok Aknowledgements: I would lke to thank Leonard Fortun for translatng ths ourse materal nto Englsh

More information

Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur

Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur Analyss of Varance and Desgn of Exerments-I MODULE III LECTURE - 2 EXPERIMENTAL DESIGN MODELS Dr. Shalabh Deartment of Mathematcs and Statstcs Indan Insttute of Technology Kanur 2 We consder the models

More information

since [1-( 0+ 1x1i+ 2x2 i)] [ 0+ 1x1i+ assumed to be a reasonable approximation

since [1-( 0+ 1x1i+ 2x2 i)] [ 0+ 1x1i+ assumed to be a reasonable approximation Econ 388 R. Butler 204 revsons Lecture 4 Dummy Dependent Varables I. Lnear Probablty Model: the Regresson model wth a dummy varables as the dependent varable assumpton, mplcaton regular multple regresson

More information

Lecture 4 Hypothesis Testing

Lecture 4 Hypothesis Testing Lecture 4 Hypothess Testng We may wsh to test pror hypotheses about the coeffcents we estmate. We can use the estmates to test whether the data rejects our hypothess. An example mght be that we wsh to

More information

More metrics on cartesian products

More metrics on cartesian products More metrcs on cartesan products If (X, d ) are metrc spaces for 1 n, then n Secton II4 of the lecture notes we defned three metrcs on X whose underlyng topologes are the product topology The purpose of

More information

Simulated Power of the Discrete Cramér-von Mises Goodness-of-Fit Tests

Simulated Power of the Discrete Cramér-von Mises Goodness-of-Fit Tests Smulated of the Cramér-von Mses Goodness-of-Ft Tests Steele, M., Chaselng, J. and 3 Hurst, C. School of Mathematcal and Physcal Scences, James Cook Unversty, Australan School of Envronmental Studes, Grffth

More information

Chapter 14 Simple Linear Regression

Chapter 14 Simple Linear Regression Chapter 4 Smple Lnear Regresson Chapter 4 - Smple Lnear Regresson Manageral decsons often are based on the relatonshp between two or more varables. Regresson analss can be used to develop an equaton showng

More information

APPENDIX A Some Linear Algebra

APPENDIX A Some Linear Algebra APPENDIX A Some Lnear Algebra The collecton of m, n matrces A.1 Matrces a 1,1,..., a 1,n A = a m,1,..., a m,n wth real elements a,j s denoted by R m,n. If n = 1 then A s called a column vector. Smlarly,

More information

Homework Math 180: Introduction to GR Temple-Winter (3) Summarize the article:

Homework Math 180: Introduction to GR Temple-Winter (3) Summarize the article: Homework Math 80: Introduton to GR Temple-Wnter 208 (3) Summarze the artle: https://www.udas.edu/news/dongwthout-dark-energy/ (4) Assume only the transformaton laws for etors. Let X P = a = a α y = Y α

More information

Chapter 5 Multilevel Models

Chapter 5 Multilevel Models Chapter 5 Multlevel Models 5.1 Cross-sectonal multlevel models 5.1.1 Two-level models 5.1.2 Multple level models 5.1.3 Multple level modelng n other felds 5.2 Longtudnal multlevel models 5.2.1 Two-level

More information

Foundations of Arithmetic

Foundations of Arithmetic Foundatons of Arthmetc Notaton We shall denote the sum and product of numbers n the usual notaton as a 2 + a 2 + a 3 + + a = a, a 1 a 2 a 3 a = a The notaton a b means a dvdes b,.e. ac = b where c s an

More information

/ n ) are compared. The logic is: if the two

/ n ) are compared. The logic is: if the two STAT C141, Sprng 2005 Lecture 13 Two sample tests One sample tests: examples of goodness of ft tests, where we are testng whether our data supports predctons. Two sample tests: called as tests of ndependence

More information

Outline. Clustering: Similarity-Based Clustering. Supervised Learning vs. Unsupervised Learning. Clustering. Applications of Clustering

Outline. Clustering: Similarity-Based Clustering. Supervised Learning vs. Unsupervised Learning. Clustering. Applications of Clustering Clusterng: Smlarty-Based Clusterng CS4780/5780 Mahne Learnng Fall 2013 Thorsten Joahms Cornell Unversty Supervsed vs. Unsupervsed Learnng Herarhal Clusterng Herarhal Agglomeratve Clusterng (HAC) Non-Herarhal

More information

Online Appendix to: Axiomatization and measurement of Quasi-hyperbolic Discounting

Online Appendix to: Axiomatization and measurement of Quasi-hyperbolic Discounting Onlne Appendx to: Axomatzaton and measurement of Quas-hyperbolc Dscountng José Lus Montel Olea Tomasz Strzaleck 1 Sample Selecton As dscussed before our ntal sample conssts of two groups of subjects. Group

More information

Lectures - Week 4 Matrix norms, Conditioning, Vector Spaces, Linear Independence, Spanning sets and Basis, Null space and Range of a Matrix

Lectures - Week 4 Matrix norms, Conditioning, Vector Spaces, Linear Independence, Spanning sets and Basis, Null space and Range of a Matrix Lectures - Week 4 Matrx norms, Condtonng, Vector Spaces, Lnear Independence, Spannng sets and Bass, Null space and Range of a Matrx Matrx Norms Now we turn to assocatng a number to each matrx. We could

More information

PHYSICS 212 MIDTERM II 19 February 2003

PHYSICS 212 MIDTERM II 19 February 2003 PHYSICS 1 MIDERM II 19 Feruary 003 Exam s losed ook, losed notes. Use only your formula sheet. Wrte all work and answers n exam ooklets. he aks of pages wll not e graded unless you so request on the front

More information

Dynamics of social networks (the rise and fall of a networked society)

Dynamics of social networks (the rise and fall of a networked society) Dynams of soal networks (the rse and fall of a networked soety Matteo Marsl, ICTP Treste Frantsek Slanna, Prague, Fernando Vega-Redondo, Alante Motvaton & Bakground Soal nteraton and nformaton Smple model

More information

Comparison of Regression Lines

Comparison of Regression Lines STATGRAPHICS Rev. 9/13/2013 Comparson of Regresson Lnes Summary... 1 Data Input... 3 Analyss Summary... 4 Plot of Ftted Model... 6 Condtonal Sums of Squares... 6 Analyss Optons... 7 Forecasts... 8 Confdence

More information

Maximum Likelihood Estimation of Binary Dependent Variables Models: Probit and Logit. 1. General Formulation of Binary Dependent Variables Models

Maximum Likelihood Estimation of Binary Dependent Variables Models: Probit and Logit. 1. General Formulation of Binary Dependent Variables Models ECO 452 -- OE 4: Probt and Logt Models ECO 452 -- OE 4 Maxmum Lkelhood Estmaton of Bnary Dependent Varables Models: Probt and Logt hs note demonstrates how to formulate bnary dependent varables models

More information

Statistics for Economics & Business

Statistics for Economics & Business Statstcs for Economcs & Busness Smple Lnear Regresson Learnng Objectves In ths chapter, you learn: How to use regresson analyss to predct the value of a dependent varable based on an ndependent varable

More information

Partner Choice and the Marital College Premium: Analyzing Marital Patterns Over Several Decades

Partner Choice and the Marital College Premium: Analyzing Marital Patterns Over Several Decades Partner Choe and the Martal College Premum: Analyzng Martal Patterns Over Several Deades Perre-André Chappor Bernard Salané Yoram Wess Deember 25, 2014 Abstrat We onstrut a strutural model of household

More information

Partner Choice and the Marital College Premium

Partner Choice and the Marital College Premium Partner Choe and the Martal College Premum Perre-André Chappor Bernard Salané Yoram Wess September 6, 2012 Abstrat We onstrut a strutural model of household deson-makng and mathng and estmate the returns

More information

Joint Statistical Meetings - Biopharmaceutical Section

Joint Statistical Meetings - Biopharmaceutical Section Iteratve Ch-Square Test for Equvalence of Multple Treatment Groups Te-Hua Ng*, U.S. Food and Drug Admnstraton 1401 Rockvlle Pke, #200S, HFM-217, Rockvlle, MD 20852-1448 Key Words: Equvalence Testng; Actve

More information

Chapter 3 Describing Data Using Numerical Measures

Chapter 3 Describing Data Using Numerical Measures Chapter 3 Student Lecture Notes 3-1 Chapter 3 Descrbng Data Usng Numercal Measures Fall 2006 Fundamentals of Busness Statstcs 1 Chapter Goals To establsh the usefulness of summary measures of data. The

More information

Horizontal mergers for buyer power. Abstract

Horizontal mergers for buyer power. Abstract Horzontal mergers for buyer power Ramon Faul-Oller Unverstat d'alaant Llus Bru Unverstat de les Illes Balears Abstrat Salant et al. (1983) showed n a Cournot settng that horzontal mergers are unproftable

More information

1 Binary Response Models

1 Binary Response Models Bnary and Ordered Multnomal Response Models Dscrete qualtatve response models deal wth dscrete dependent varables. bnary: yes/no, partcpaton/non-partcpaton lnear probablty model LPM, probt or logt models

More information

Predictive Analytics : QM901.1x Prof U Dinesh Kumar, IIMB. All Rights Reserved, Indian Institute of Management Bangalore

Predictive Analytics : QM901.1x Prof U Dinesh Kumar, IIMB. All Rights Reserved, Indian Institute of Management Bangalore Sesson Outlne Introducton to classfcaton problems and dscrete choce models. Introducton to Logstcs Regresson. Logstc functon and Logt functon. Maxmum Lkelhood Estmator (MLE) for estmaton of LR parameters.

More information

Partner Choice and the Marital College Premium

Partner Choice and the Marital College Premium Partner Choe and the Martal College Premum Perre-André Chappor Bernard Salané Yoram Wess September 4, 2012 Abstrat Several theoretal ontrbutons have argued that the returns to shoolng wthn marrage play

More information

AS-Level Maths: Statistics 1 for Edexcel

AS-Level Maths: Statistics 1 for Edexcel 1 of 6 AS-Level Maths: Statstcs 1 for Edecel S1. Calculatng means and standard devatons Ths con ndcates the slde contans actvtes created n Flash. These actvtes are not edtable. For more detaled nstructons,

More information

Topic 23 - Randomized Complete Block Designs (RCBD)

Topic 23 - Randomized Complete Block Designs (RCBD) Topc 3 ANOVA (III) 3-1 Topc 3 - Randomzed Complete Block Desgns (RCBD) Defn: A Randomzed Complete Block Desgn s a varant of the completely randomzed desgn (CRD) that we recently learned. In ths desgn,

More information

An (almost) unbiased estimator for the S-Gini index

An (almost) unbiased estimator for the S-Gini index An (almost unbased estmator for the S-Gn ndex Thomas Demuynck February 25, 2009 Abstract Ths note provdes an unbased estmator for the absolute S-Gn and an almost unbased estmator for the relatve S-Gn for

More information

Linear Regression Analysis: Terminology and Notation

Linear Regression Analysis: Terminology and Notation ECON 35* -- Secton : Basc Concepts of Regresson Analyss (Page ) Lnear Regresson Analyss: Termnology and Notaton Consder the generc verson of the smple (two-varable) lnear regresson model. It s represented

More information

Chapter 1 Probability Theory. Definition: A set is a collection of finite or infinite elements where ordering and multiplicity are generally ignored.

Chapter 1 Probability Theory. Definition: A set is a collection of finite or infinite elements where ordering and multiplicity are generally ignored. Chapter 1 for BST 695: Speal Tops n Statstal Theory, Ku Zhang, 2011 Chapter 1 Probablty Theory Chapter 11 Set Theory Defnton: A set s a olleton of fnte or nfnte elements where orderng and multplty are

More information

Department of Quantitative Methods & Information Systems. Time Series and Their Components QMIS 320. Chapter 6

Department of Quantitative Methods & Information Systems. Time Series and Their Components QMIS 320. Chapter 6 Department of Quanttatve Methods & Informaton Systems Tme Seres and Ther Components QMIS 30 Chapter 6 Fall 00 Dr. Mohammad Zanal These sldes were modfed from ther orgnal source for educatonal purpose only.

More information

Chapter 2 - The Simple Linear Regression Model S =0. e i is a random error. S β2 β. This is a minimization problem. Solution is a calculus exercise.

Chapter 2 - The Simple Linear Regression Model S =0. e i is a random error. S β2 β. This is a minimization problem. Solution is a calculus exercise. Chapter - The Smple Lnear Regresson Model The lnear regresson equaton s: where y + = β + β e for =,..., y and are observable varables e s a random error How can an estmaton rule be constructed for the

More information

e i is a random error

e i is a random error Chapter - The Smple Lnear Regresson Model The lnear regresson equaton s: where + β + β e for,..., and are observable varables e s a random error How can an estmaton rule be constructed for the unknown

More information

j) = 1 (note sigma notation) ii. Continuous random variable (e.g. Normal distribution) 1. density function: f ( x) 0 and f ( x) dx = 1

j) = 1 (note sigma notation) ii. Continuous random variable (e.g. Normal distribution) 1. density function: f ( x) 0 and f ( x) dx = 1 Random varables Measure of central tendences and varablty (means and varances) Jont densty functons and ndependence Measures of assocaton (covarance and correlaton) Interestng result Condtonal dstrbutons

More information

Lecture 3: Probability Distributions

Lecture 3: Probability Distributions Lecture 3: Probablty Dstrbutons Random Varables Let us begn by defnng a sample space as a set of outcomes from an experment. We denote ths by S. A random varable s a functon whch maps outcomes nto the

More information

Charged Particle in a Magnetic Field

Charged Particle in a Magnetic Field Charged Partle n a Magnet Feld Mhael Fowler 1/16/08 Introduton Classall, the fore on a harged partle n eletr and magnet felds s gven b the Lorentz fore law: v B F = q E+ Ths velot-dependent fore s qute

More information

x i1 =1 for all i (the constant ).

x i1 =1 for all i (the constant ). Chapter 5 The Multple Regresson Model Consder an economc model where the dependent varable s a functon of K explanatory varables. The economc model has the form: y = f ( x,x,..., ) xk Approxmate ths by

More information

Module 9. Lecture 6. Duality in Assignment Problems

Module 9. Lecture 6. Duality in Assignment Problems Module 9 1 Lecture 6 Dualty n Assgnment Problems In ths lecture we attempt to answer few other mportant questons posed n earler lecture for (AP) and see how some of them can be explaned through the concept

More information

Chapter 12 Analysis of Covariance

Chapter 12 Analysis of Covariance Chapter Analyss of Covarance Any scentfc experment s performed to know somethng that s unknown about a group of treatments and to test certan hypothess about the correspondng treatment effect When varablty

More information

THE ROYAL STATISTICAL SOCIETY 2006 EXAMINATIONS SOLUTIONS HIGHER CERTIFICATE

THE ROYAL STATISTICAL SOCIETY 2006 EXAMINATIONS SOLUTIONS HIGHER CERTIFICATE THE ROYAL STATISTICAL SOCIETY 6 EXAMINATIONS SOLUTIONS HIGHER CERTIFICATE PAPER I STATISTICAL THEORY The Socety provdes these solutons to assst canddates preparng for the eamnatons n future years and for

More information

Week3, Chapter 4. Position and Displacement. Motion in Two Dimensions. Instantaneous Velocity. Average Velocity

Week3, Chapter 4. Position and Displacement. Motion in Two Dimensions. Instantaneous Velocity. Average Velocity Week3, Chapter 4 Moton n Two Dmensons Lecture Quz A partcle confned to moton along the x axs moves wth constant acceleraton from x =.0 m to x = 8.0 m durng a 1-s tme nterval. The velocty of the partcle

More information