MODIFICATION OF FRIEDMAN-RUBIN'S CLUSTERING ALGORITHM FOR USE IN STRATIFIED PPS SAMPLING

Size: px
Start display at page:

Download "MODIFICATION OF FRIEDMAN-RUBIN'S CLUSTERING ALGORITHM FOR USE IN STRATIFIED PPS SAMPLING"

Transcription

1 MODFCATON OF FREDMAN-RUBN'S CLUSTERNG ALGORTHM FOR USE N STRATFED PPS SAMPLNG Donna Kostanch, Davd udkns, Rajendra Sngh, and Mnd Schautz U.S. Bureau of e Census. NTRODUCTON Major research actvtes are currently underway at e Census Bureau to mprove e qualty of e Current Populaton Survey (CPS). Obtanng a meod whch reduces e frst-stage component of varance s a hgh prorty project at has been and s stll beng nvestgated. Ths paper deals w some of e research conducted us far on e use of a partcular clusterng algorm to acheve s varance reducton. The followng secton gves a bref dscusson of e survey's objectves and e reason for e concern n reducng s component of varance. Secton of s paper dscusses e general form of s varance component as t relates to bo e desgn of e CPS and a general clusterng stuaton. Also presented n s secton are e modfcatons made to Fredman and Rubn's clusterng algorm (see {2}) for e above stated purpose. A descrpton of e computer program at has been created to satsfy our modfcatons along w oer related optons s gven n Secton V. Emprcal results on e effect of ese modfcatons under varous realstc samplng stuatons are shown n Secton V.. BACKGROUND One of e major purposes of e CPS s to produce labor force statstcs at bo e natonal and state levels w e prmary objectve of obtanng relable estmates on e number of unemployed. The emphass of e CPS had recently, durng e latter part of 1978, shfted towards producng more relable unemployment estmates at a substate level. For a more n-dep dscusson on e desgn and objectves of e CPS durng e last decade see {4}. More recently (.e. early 1981), e am of e CPS has shfted back to obtanng relable state estmates. Ths shft was caused by budgetary problems. Addtonally, ere has been consderable nterest n mnorty estmates. The above mentoned changes n objectves, especally e antcpated shft towards substate estmates, had created concern over e frst-stage contrbuton of varance. For natonal estmates, s component of varance due to e selecton of prmary samplng unts (PSUs) s relatvely small compared to e total varance. Ths proporton ncreases for mnorty estmates at e natonal level. For some states, however, and even more so for some substate areas, s component of varance can be qute large. Therefore, major research s underway n order to reduce e between-psu varance for several varables at e state level.. GENERAL The between-psu varance n e CPS results from e selecton of one sample PSU per stratum w probablty proportonate to sze (PPS). n e current CPS desgn, PSUs are generally defned to be countes or a contguous group of countes w e excepton at mnor cvl dvsons form PSUs n some noreastern states. The results presented n s paper always consder sngle countes as PSUs. Ths component of varance for estmated levels w a gven characterstc s gven by e followng formula: g nh Ph...Ph \\ 2 (~" Var B = h=z = ---- Ph ---- Ph U h - U h ) x / where g = e number of strata n~ = e number of PSUs n e h L stratum Ph ~ = t~ populaton of e PSU n e h" stratum Ph = e populaton of e h stratum / \ /oe. Ph = zh Ph ),, =l Uh = e actual number of persons w att~ertan charac~t~rstcs n e PSU of e h "" stratum and U h = e number of persons w~h characterstc n e h / rl h \ [,.e. Uh = = Uh ) a certan stratum The above formula () can be rewrtten n terms of proportons by lettng Xh Uh / Uh = / Ph and X h = <. Ths gves: g nh (2) Var B = E Z Ph Ph \ fxh - Xh \2 h=l =l \ / Consequently, e purpose of s study s to produce a stratfcaton procedure at smultaneously reduces (2) for several key labor force statstcs for a gven number of strata. Alough g s not actually known, t can be assumed to be fxed snce ts value depends heavly on e desred stratum workload whch s fxed and dependent on e sze of e stratum. Oer factors ncludng relablty requrements and e value of Var B acheved for unemployed also affect e value of g. n attempts to accomplsh e above, a clusterng algorm suggested by Fredman and Rubn n {2} was used as a bass for reducng s frst stage varance component for several varables. n general, er clusterng algorm attempts to optmze a crteron functon for a fxed number of clusters. One of e crteron functons suggested n {2} to be mnmzed s e wn cluster sums of squared errors for several varables, whch s referred to as e trace W crteron. Note at W s defned to be e wn cluster dsperson matrx. Usng notaton defned above, an element of W, wj,j, s defned as: g n h ~, _ w.j,j., = h=z E l \Xh j - ~Xh j, - Xhj,, 285

2 for j, j' =, 2,... p where e addtonal subscrpt j s for desgnaton of e P varables and _ / n X Xhj = Z h Xhj~ /n h ', =l Thu s, P g n / _ \2 (3) Trace W = Z Z Z h ~\Xh j - Xhj ) j=l h=l =l j Note at f e between PSU varance gven n (2) was summed over e key labor force statstcs, s functon would be very smlar to at gven n (3) above. The only dfferences are e Ph and Ph terms and manner n whch e cluster means are computed. Thus, t seems approprate to modfy (3) to obtan a sum of varances n (2) to use as e crteron functon to mnmze n order to reduce e between-psu varance. Ths new crteron functon wll be defned as: P (4) Betvar = j_z_ VarB, where VarB,; s e quantty stated n formula () above for e j varable and p s e number of varables to be used n e clusterng algorm. Ths appears to be a worwhle addton because of e effect of dfferng PSU szes (Ph s) on bo e squared devatons and e clusters means. Ths modfcaton of trace W was en used as e crteron functon along w parts of Fredman and Rubn's clusterng algorm to form a stratfcaton algorm. t was later dscovered at an dentcal crteron functon had been used w a very smlar clusterng algorm for samplng purposes (see {}). The algorm specfed n {2} actually conssts of ree separate passes; a hll clmbng pass, a forcng pass and a reassgnment pass. The hll clmbng pass examnes each object (PSU) one at a tme and moves s object to e group at produces e most mprovement n e crteron functon. f no mprovement occurs e object remans n ts current cluster. A one move local mnmum occurs when an entre pass of e objects produces no moves. The oer two passes are heurstc procedures whch attempt to obtan a near global mnmum. The forcng pass attempts to optmze e crteron by placng groups of objects from one cluster nto e oer clusters. For e reassgnment pass e algorm consders movng all objects nto dfferent clusters at one tme. Due to e lmted avalablty of programmng resources, e effectveness of ese two passes was evaluated usng e wn cluster sums of squared errors gven n formula (3) above. These results, presented n {3}, show at a number of ntal starts wll produce results smlar to ose obtaned usng e forcng and reassgnment passes. Therefore, t was decded at programmng resources could be more effcently utlzed by ncorporatng random ntal starts and oer modfcatons nto e program raer an usng ese two passes. The next secton of s paper descrbes more oroughly e program created for e stratf- caton algorm ncludng oer addtons at have been made to assst n s research as t apples to samplng, V. STRATFCATON PROGRAM Ths stratfcaton program, developed at e Census Bureau, bascally conssts of e hllclmbng pass n {2} and a crteron functon (4) for between-psu varances. Oer optons have also been ncorporated nto s program for e prmary purpose of formng strata for PPS samplng, However, s program can be used for e more conventonal clusterng problem due to a number of optons avalable to e user. The program s run on a Unvac 1100/44 machne. The program sze s 39-43k, w 'k' beng defned as bt words. The sze of e program depends on e optons ncorporated n a partcular run. The program language s ALGOL. The clusterng program can be altered rough e selecton of avalable optons, whch nvolve usng alternate peces of code n e program. The followng are e optons at alter e clusterng program tself: ) OPTON FOR MAXMUM/MNMUM CLUSTER SZE CON- STRANTS. Any reasonable maxmum and mnmum cluster szes can be assgned, whch are referred to as sze constrants. Sze constrants have been ncorporated nto e algorm n order to balance ntervewer workloads. Only one mnmum and two maxmum sze constrantsmay be specfed. One maxmum sze constrant can be used to exclude unusually large elements or PSUs, whch wll en be sampled as self-representng PSUs, and e oer can be used as e maxmum sze constrant n e clusterng algorm. 2) OPTON FOR NTAL ALLOCATON OF PSUs TO CLUSTERS. The followng are e ree avalable procedures for assgnng e ntal allocaton of e elements or PSUs to e clusters: a) Ths allocaton s defned by an external procedure (such as ntuton, prncpal components, graphng, etc.). The number of elements to be n each cluster, nh, s also defned by e external procedure. A cluster code s placed on each record and e fle s en sorted usng s feld, The frst n elements are put nto e frst cluster and e next n 2 elements are placed nto e second cluster, etc. b) Anoer meod s to frst sort e fle by e predomnant varable(s), such as unemployment. The ntal allocaton s determned by formng approxmately equal szed clusters from s sorted fle. c) A random clusterng of e elements may also be acheved by usng a procedure at ncorporates a random number generator. The elements are randomly placed nto clusters so at e clustems meet any reasonable mnmum or maxmum sze constrants. Frst, PSUs or elements are assgned to meet e mnmum sze constrant. f, after e mnmum sze constrant s met, some elements are not yet assgned to a cluster, en e procedure attempts to randomly assgn all remanng elements to a cluster. f at any tme a clusterng scheme cannot be devsed such at all elements are ncluded n one of e groups and all clusters satsfy e sze constrants, e process s stopped and started agan by randomly 286

3 assgnng all elements. The number of random starts to use s left up to e user. 3) OPTON FOR CRTERON FUNCTONS. The avalable functons to be mnmzed are: a) Trace W; suggested by Fredman and Rubn and also gven n (3) b) Betvar; functon of varances gven n (4) c) Determnant W'; W' s specfed n Secton V. The Determnant W' opton s smlar to e Determnant W crtera gven n {2} but ncorporates e PSU szes nto e wn cluster dsperson matrx W. Some results showng e applcablty of s functon as a crteron to mnmze are ncluded n e next secton of s paper 4) OPTON TO STANDARDZE VARABLES. The varables beng used to determne e clusters may be standardzed by one of e followng meods (no standardzaton s also an opton): a) probablty proportonate to sze scalng. For s opton e scalng factor, sf;, appled to all observatons on e j varable, s determned by],. max P1 P (Xl _ Xl )2 sf. = k V =l l,k,k / N \ -E- P Pl (Xl,j -Xl,j)e for j and k =, 2,... p where p s e number of varables, N s e total number of PSUs or elements and all PSUs are assumed to be n one cluster (.e. h = mples at Y n h = N and E Ph = P )" Note at P~ s e populaton of e entre NSR area. b) equal sze scalng. For s opton all observatons or elements are consdered equally mportant n determnng e scalng factors sf ' appled to all observatons on e j These factors are obtaned by: max ~ ~ (- _ ~, )2 k =l Xl,k k sf. = N ~ =~(Xl,j - X,j)2 varable.. where j, k, p and N are defned above n a). Scalng meods a.) and b.) above, respectvely, equalze e between-psu varance or e sums of squares for all varables gven at all elements are n one cluster. 5) OPTON TO CHOOSE NUMBER OF CLUSTERS. The number of clusters to be formed s determned by e user; however, e actual number of clusters formed may be nternally reduced by: a) e removal of any ndvdual element whose sze (Ph) s greater an e maxmum allowable cluster sze; b) e creaton of an ntal empty cluster rough e random starts procedure. Ths was allowed n order to avod complextes, snce an empty cluster would not meet e mnmum sze constrant. The possblty of s occurrng s very slght f one chooses a reasonable number of clusters and reasonable sze constrants. 6) OPTON TO CHOOSE PREFERENCE FACTORS. Any preference factors or varable weghts may be assgned to each varable. f factors are n- dcated here, ey are ncorporated nto e Betvar crteron functon n order to weght e contrbuton from some varables more an oers. Consequently, s forces a greater reducton n between-psu varance for e more heavly weghted varables as determned by e user. f e varables were not standardzed ntally, e contrbuton of each varable to e Betvar crteron would dffer. These contrbutons would be dependent on e general magntude of e varances raer an e mportance of e actual varable. f s opton s not specfed, all preference factors are assumed equal to one. V. EMPRCAL RESULTS Some prelmnary results showng e effectveness at e above mentoned modfcatons have on reducng e between-psu varance are presented n e attached tables. Strata were formed wn ree test states w PSUs beng defned as countes. Not all countes n a state were ncluded n s testng; only ose consdered to be nonself-representng (NSR) were used to form strata. Countes excluded were assumed to be self-representng and were desgnated as such snce er populaton would yeld a sample large enough to support an ntervewer. For example, countes ncluded n e Pttsburgh SMSA were not stratfed and would be n sample w certanty. These ree states were arbtrarly selected for testng as ey seemed to be somewhat representatve of e country. For each state e algorm was tested usng four key labor force varables, whch are mportant n e CPS. Each of ese states ncludes e number of unemployed and e number n e cvlan labor force (CLF) as varables and 2, respectvely, and e resultng between-psu varances are desgnated by VarB,l and Var B,2. Varables 3 and 4 represent smlar mnorty characterstcs of unemployed and CLF respectvely, where Spansh was used n and blacks n and. The number of strata formed n each state s an approxmaton of e number at would be needed to satsfy e antcpated relablty requrements on e state estmate of e number of unemployed for e redesgned CPS. The NSR PSUs n each of ese states were clustered nto strata usng e followng crtera: 1.) Betvar j=l VarB, 4 g n 2.) Trace W = j=l r.. h=l r. =l r'h (Xhj - Xhj ) 2 3.) Determnant W'; where W' can be wrtten as: ~ -VarB, Cov B,1,2 Cov B,1,3 C VB, 1 4-] " C VB,1, 2 VarB, 2 C VB,2, 3 C VB' 214 C VB, 1,3 C VB,2,3 VarB,3 C VB, 3, ~ 1 C VB, 1, 4 C VB,2,4 C VB,3, 4 VarB, 4 by defnng e covarance terms as: C VB,j,j ' = h~l Ph Ph (Xhj-Xhj )(Xh j '-~j,) 287

4 4 4.) Betvar = j_e_ Pfj VarB, j where e Pf s represent preference factors ds- cussed n secton V of s paper, for whch e followng values were assgned: Pfl = 2 and Pf2 = Pf3 = Pf4 = Ths was tested n order to examne e effect of weghtng e most mportant labor force varable, unemployed, more heavly. 5.) Betvar = Var B, 6.) Betvar = Vat B,2 7.) Betvar = Vat B,3 8.) Betvar = Var B,4 Note at ese last four crtera produce a stratfcaton based on only one of e four key labor force varables and e results gve some ndcaton of e amount of reducton n between- PSU varance at can be acheved for each varable. Each of e above eght crtera were used to cluster NSR PSUs nto strata n each of e ree test states. Also for each crteron and state a stratfcaton was produced by mposng ree types of sze constrants on e strata, none, loose and tght. Three random starts were mplemented for each type of sze constrant and e resultng average between-psu varances, Va--~ B,j' were obtaned and are presented n e attached tables. Note also at for a gven sze constrant, dentcal ntal starts were used n clusterng e PSUs w each of e eght crtera. For all stratfcatons e actual varables were frst standardzed w scalng factors determned by e probablty proportonate to sze meod dscussed n secton V of s paper. Tables, 2, 3 and 4 present e average between-psu varances at were obtaned from e frst four crtera gven above, respectvely. The average between-psu varances resultng from stratfyng on each varable separately (.e. crtera 5.)-8.) above) are shown n Table 5. For purposes of comparson, e percent reductons n between-psu varance for each varable are also gven n ese tables. These percent reductons are based on what e between-psu varance would be f e NSR PSUs were not stratfed and g PSUs were selected w probablty proportonate to sze w replacement. These nonstratfed between-psu varances, Var~ were calculated for each test state by: '' N _-E.l Pl P1 (Xlj-Xj)2 VarB,j = g for j = 1,2, 3 and4 where N represents e number of NSR PSUs n e state. Chart A summarzes e nonstratled between-psu varances for each state and varable and also gves oer stratfcaton parameters used n mplementng e algorm. The effect of ncorporatng probabltes of selecton n e clusterng crteron can be seen by comparng e results n Table w ose n Table 2. Greater reductons n between PSU varances occurred w e Betvar crteron for all states and all varables when no sze constrants were mposed on e strata. n fact, n one nstance e Trace W crteron produced an ncrease n e between-psu varance for a partcular varable. Ths happened n to varable 2, CLF, when no sze constrants were mposed; as can be seen from e negatve percent reducton n Table 2. When sze constrants are mposed on e strata, generally e Betvar crteron produces gans over e Trace W crteron alough ese gans are smaller. n and, w tght sze constrants, one of e varables, varable 3 - Spansh unemployed n and varable - unemployed n, acheved greater varance reducton w e Trace W crteron; however, for all varables concerned e Betvar crteron produced a better stratfcaton. For, when loose sze constrants were mposed e Trace W crteron produced an overall better stratfcaton (.e. ree varables acheved a greater varance reducton). However w tght sze constrants half e varances were smaller usng e Trace W crteron e oers were reduced more w e Betvar crteron. Ths prelmnary comparson shows qute strongly at e modfcatons made to e Trace W crteron wll produce sgnfcant reductons n e frst stage varance component when no sze constrants are mposed. t also appears at n general and under certan stuatons smaller gans n varance reducton can be acheved even w sze constrants on strata. The varances for stratfcatons produced by mnmzng e Determnant W' are shown n Table 3. The applcablty of s crteron to stratfcaton s of nterest snce t s a functon of bo between-psu varances and wn cluster covarances. By comparng ese results w ose obtaned by mnmzng only e between-psu varances, (see Table ), t appears at e wn cluster covarances do not n general mprove e stratfcaton. Most cases where greater varance reductons are acheved w e Determnant, e Betvar crteron does not produce substantally larger varances. However, an unusual stuaton dd occur for unemployed n, smaller varances were consstently obtaned from e Determnant for all ree types of sze constrants. Also ncluded n s study s an example of e effect of preference factors, pf~ w s, on e stratfcatons. As prevously dscussed e pf.'s are ncorporated n e Betvar crteron 3 n order to weght partcular varables more heavly an oers. Ths example smply weghts e VarB, (varance for unemployed) twce at of e oers. As antcpated, greater reductons n e between-psu varance for unemployed (.e. see Tables and 4) dd occur, but by dfferng amounts. The decreases were generally substantal, partcularly when no sze constrants were mposed. t appears at e between-psu varances for e oer ree varables can also ncrease consderably. The results presented n Table 5 somewhat ndcate e amount of varance reducton at s possble for a varable under varous condtons, snce ese stratfcatons were based on only one varable. t should be noted at s procedure wll not produce e mnmal 288

5 varance because results are somewhat dependent on e ntal clusterng due to e fact at e algorm only produces a local mnmum. However, ese varances can serve as a bass for determnng e effectveness of a multvarate stratfcaton. Also wory of menton s at greater reductons n between-psu varances can be acheved f strata szes are allowed to vary. Ths can be seen n all e stratfcatons produced, but s especally emphaszed when only one stratfcaton varable s used (.e. Table 5). Alough, e results presented n s paper are not conclusve, research s currently focused on ways to acheve greater varance reducton when sze constrants are mposed on strata. Ths s beng nvestgated under e assumpton at e Betvar crteron along w preference factors wll be e most practcal way to form strata for e CPS. FOOTNOTE Some of e average between-psu varances for were obtaned from only two random starts. BBLOGRAPHY {} DahmstrSm, P. and Hagnell, M. "Multvarate Stratfcaton of Prmary Samplng Unts n a General Purpose Survey." Scand.. Statst. 1978: 5, {2} Fredman, H. P. and Rubn,. (1967). "On Some nvarant Crtera for Groupng Data." ournal of e Amercan Statstcal Assocaton {3} u--~ns, D. R. and Sngh, R. P. "Usng Clusterng Algorms to Stratfy Prm~y Samplng Unts." Presented at e 141-'" Annual Meetng of e Amercan Statstcal Assocaton, August {4} Moore, T.; Bettn, P.; Kostanch, D.; and Shapro, G. "Overvew of Current Populaton Survey Sample Desgn." Presented at 139 Annual Meetng of e Amercan Statstcal Assocaton, August Chart A Summary of Stratfcaton Parameters and Nonstratfed Varances --~-~- ~ State Parameter... ~--_~.._._..._=~ Number of PSUs(N) Number of Strata(g) Vary, (x 105 ) Vary, 2 (x l0 S ) Vary, 3 (x l0 s ) Vary, 4 (x l0 s ) Scalng factors: sf sf 2 sf 3 sf , , t , , , , State and Sze Constrants Table. Reductons n Varance Due to Stratfcaton (Based on Betvar / Crteron) ~'~B, ~/ (% Reducton) -~B,2 ~/ (% Reducton) V-~rB,3~/ (% Reducton) V'-~B,4 ~/ (% Reducton) (62%) (65%) (41%) (32%) (36%) (34%) 3.62 (76%) 6.69 (55%) 6.35 (58%) (78%) (64%) (52%) (83%) (84%) (56%) (81%) 9.32 (93%) (76%) 9.55 (92%) (61%) (75%) (88%) (87%) 1, (72%) (85%) (61%) (53%) 8, (81%) 13, (67%) 16, (61%).71 (92%) 4.35 (52%) 4.35 (52%) (96%) 1, (52%) 1, (52%) 4 Ths crteron s defned as: Betvar = Y~ Var B j=l ' ~/ The average between-psu varances are x

6 18.91 (82%) (75%) (78%) (68%) (59%) (50%) Table 2. Reductons n Varance Due to Stratfcaton Based on Trace W Crteron) "State and Sze Constrants ~'~B,-- /, Reducton 1~'~-arB, 2-/ (% Reducton) ~T~B,3~/ (% Reducton)=~ B 4~/(% - Reduct~: (% t (45%) 1, (-0%~ 4.98 (67%) } (71%) (35%) (28%) 8.58 (43%) (42%) (31%) (33%) 6.02 (60%) (48%) MsssspP (80%) 8, (80%) (75%) 9, (78%) [ (48%) 12, (71%) _/ The average between-psu (90%) (84%) (90%) (85%) (75%) 1, (68%) 1.75 (81%) (89%) 2.25 (75%) (64%) 3.94 (57%) 1, (50%) l State and Sze ~ / (% Reducton) '' VarB,2-- / (% Reducton)... ~ VarB, 31/ -- (% Reducton) _ / (% Reducton Constrants arb,1-- VarB, (56%) (30%) (41%) r80%) %~ (58%~ (87%~ (67%) (64%)... / The average between-psu varances are x 105. Reductons n Varance Due to Stratfcaton (Based on Determnant W" Crteron) ,62 (49%) 7.30 (51%) (37%) 6.29 (58%) (28%) 7.68 (49%) (66%) (86%) (58%) (80%) (47%) (68) 10, (76%~ 1.01 (89%) 16, (60%) 4.39 (52%) 15, (63%) 4.41 (51%) (60%) (65%) (55%) (84%) 1, (78%) 2, (55%) (95%) 1, (50%) 1, (50%) Table 4. Reductons n Varance Due to Stratfcaton (Based on B e t vat-- / Crteron w Preference Factors) sta... d Sze W- 2/ (% Reducton~ ' ~~ 2/ C% Reducton) VarB,3 ~/ (% Reducton) -V~B,4~/ (% Reducton) Constrants arb, - arb, (78%) (50%) (44%) 8.74 (92%) (85%) (75%) 7O7.98 (39%) (23%) (25%) L (70%) (60%) 1, (33%) (91%) 111, (73%) (72%) ~ 17, (59%) (61%) 24, (42%) ~ (75%) 7.19 (52%) 7.37 (51%) (91%) (82%) (70%) (88%) 5.35 (41%) 4.52 (50%) (74%) (58%) (48%) (85%) 1, (69%) 2, (45%) (95%) 1, (48%) 1, (50%) 4 / Ths crteron s defned as: Betvar = ~ pf~ VarB, j w Pfl = 2 and Pf2 = Pf3 = Pf4 =. jl The average between-psu varances are x 105. ~Crteron State and Sze Constrants M s s.ss pp.. VarB.l "-~ / (% Reducton) VarB, (94%) (56%) (51%) 1.20 (99%) (79%) (82%) Reductons n Varance Due to a One-Varable Stratfcaton (Based on a dfferent Betvar Crteron for Each Varable) r....varb, 2 VarB,3 L... VarB~ V / (% Reducton... / ~VarB (% Reducton) arb, 2- VarB,3 - (% Reducton),4 / r (95%) (51%) (39%) (99%) (82%) (81%) ' %) (98%) (75%) 10, %) (51%) 14, (66%) (96%) 6.08 (59%) 6.01 (60%) (99%) (84%) (81%),..07 (99%) 4.55 (50%) 4.05 (55%) %) %) (58%) (99%) (94%) (83%) 9.03 (99.7%) 1, (54%) 1, (54%) / The average between-psu 105. Note at e average between-psu varances for e four varables represent dfferent stratfcatons. 2.q0

Maximizing Overlap of Large Primary Sampling Units in Repeated Sampling: A comparison of Ernst s Method with Ohlsson s Method

Maximizing Overlap of Large Primary Sampling Units in Repeated Sampling: A comparison of Ernst s Method with Ohlsson s Method Maxmzng Overlap of Large Prmary Samplng Unts n Repeated Samplng: A comparson of Ernst s Method wth Ohlsson s Method Red Rottach and Padrac Murphy 1 U.S. Census Bureau 4600 Slver Hll Road, Washngton DC

More information

x = , so that calculated

x = , so that calculated Stat 4, secton Sngle Factor ANOVA notes by Tm Plachowsk n chapter 8 we conducted hypothess tests n whch we compared a sngle sample s mean or proporton to some hypotheszed value Chapter 9 expanded ths to

More information

Chapter 8 Indicator Variables

Chapter 8 Indicator Variables Chapter 8 Indcator Varables In general, e explanatory varables n any regresson analyss are assumed to be quanttatve n nature. For example, e varables lke temperature, dstance, age etc. are quanttatve n

More information

Econ107 Applied Econometrics Topic 3: Classical Model (Studenmund, Chapter 4)

Econ107 Applied Econometrics Topic 3: Classical Model (Studenmund, Chapter 4) I. Classcal Assumptons Econ7 Appled Econometrcs Topc 3: Classcal Model (Studenmund, Chapter 4) We have defned OLS and studed some algebrac propertes of OLS. In ths topc we wll study statstcal propertes

More information

Simulated Power of the Discrete Cramér-von Mises Goodness-of-Fit Tests

Simulated Power of the Discrete Cramér-von Mises Goodness-of-Fit Tests Smulated of the Cramér-von Mses Goodness-of-Ft Tests Steele, M., Chaselng, J. and 3 Hurst, C. School of Mathematcal and Physcal Scences, James Cook Unversty, Australan School of Envronmental Studes, Grffth

More information

Negative Binomial Regression

Negative Binomial Regression STATGRAPHICS Rev. 9/16/2013 Negatve Bnomal Regresson Summary... 1 Data Input... 3 Statstcal Model... 3 Analyss Summary... 4 Analyss Optons... 7 Plot of Ftted Model... 8 Observed Versus Predcted... 10 Predctons...

More information

LINEAR REGRESSION ANALYSIS. MODULE VIII Lecture Indicator Variables

LINEAR REGRESSION ANALYSIS. MODULE VIII Lecture Indicator Variables LINEAR REGRESSION ANALYSIS MODULE VIII Lecture - 7 Indcator Varables Dr. Shalabh Department of Maematcs and Statstcs Indan Insttute of Technology Kanpur Indcator varables versus quanttatve explanatory

More information

Winter 2008 CS567 Stochastic Linear/Integer Programming Guest Lecturer: Xu, Huan

Winter 2008 CS567 Stochastic Linear/Integer Programming Guest Lecturer: Xu, Huan Wnter 2008 CS567 Stochastc Lnear/Integer Programmng Guest Lecturer: Xu, Huan Class 2: More Modelng Examples 1 Capacty Expanson Capacty expanson models optmal choces of the tmng and levels of nvestments

More information

A LINEAR PROGRAM TO COMPARE MULTIPLE GROSS CREDIT LOSS FORECASTS. Dr. Derald E. Wentzien, Wesley College, (302) ,

A LINEAR PROGRAM TO COMPARE MULTIPLE GROSS CREDIT LOSS FORECASTS. Dr. Derald E. Wentzien, Wesley College, (302) , A LINEAR PROGRAM TO COMPARE MULTIPLE GROSS CREDIT LOSS FORECASTS Dr. Derald E. Wentzen, Wesley College, (302) 736-2574, wentzde@wesley.edu ABSTRACT A lnear programmng model s developed and used to compare

More information

Estimation: Part 2. Chapter GREG estimation

Estimation: Part 2. Chapter GREG estimation Chapter 9 Estmaton: Part 2 9. GREG estmaton In Chapter 8, we have seen that the regresson estmator s an effcent estmator when there s a lnear relatonshp between y and x. In ths chapter, we generalzed the

More information

Topic 23 - Randomized Complete Block Designs (RCBD)

Topic 23 - Randomized Complete Block Designs (RCBD) Topc 3 ANOVA (III) 3-1 Topc 3 - Randomzed Complete Block Desgns (RCBD) Defn: A Randomzed Complete Block Desgn s a varant of the completely randomzed desgn (CRD) that we recently learned. In ths desgn,

More information

Copyright 2010 Cengage Learning, Inc. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part.

Copyright 2010 Cengage Learning, Inc. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. yes to (3) two-sample problem? no to (4) underlyng dstrbuton normal or can centrallmt theorem be assumed to hold? and yes to (5) underlyng dstrbuton bnomal? We now refer to the flowchart at the end of

More information

Sampling Theory MODULE VII LECTURE - 23 VARYING PROBABILITY SAMPLING

Sampling Theory MODULE VII LECTURE - 23 VARYING PROBABILITY SAMPLING Samplng heory MODULE VII LECURE - 3 VARYIG PROBABILIY SAMPLIG DR. SHALABH DEPARME OF MAHEMAICS AD SAISICS IDIA ISIUE OF ECHOLOGY KAPUR he smple random samplng scheme provdes a random sample where every

More information

Psychology 282 Lecture #24 Outline Regression Diagnostics: Outliers

Psychology 282 Lecture #24 Outline Regression Diagnostics: Outliers Psychology 282 Lecture #24 Outlne Regresson Dagnostcs: Outlers In an earler lecture we studed the statstcal assumptons underlyng the regresson model, ncludng the followng ponts: Formal statement of assumptons.

More information

Discussion of Extensions of the Gauss-Markov Theorem to the Case of Stochastic Regression Coefficients Ed Stanek

Discussion of Extensions of the Gauss-Markov Theorem to the Case of Stochastic Regression Coefficients Ed Stanek Dscusson of Extensons of the Gauss-arkov Theorem to the Case of Stochastc Regresson Coeffcents Ed Stanek Introducton Pfeffermann (984 dscusses extensons to the Gauss-arkov Theorem n settngs where regresson

More information

Multivariate Ratio Estimator of the Population Total under Stratified Random Sampling

Multivariate Ratio Estimator of the Population Total under Stratified Random Sampling Open Journal of Statstcs, 0,, 300-304 ttp://dx.do.org/0.436/ojs.0.3036 Publsed Onlne July 0 (ttp://www.scrp.org/journal/ojs) Multvarate Rato Estmator of te Populaton Total under Stratfed Random Samplng

More information

Module 9. Lecture 6. Duality in Assignment Problems

Module 9. Lecture 6. Duality in Assignment Problems Module 9 1 Lecture 6 Dualty n Assgnment Problems In ths lecture we attempt to answer few other mportant questons posed n earler lecture for (AP) and see how some of them can be explaned through the concept

More information

Generalized Linear Methods

Generalized Linear Methods Generalzed Lnear Methods 1 Introducton In the Ensemble Methods the general dea s that usng a combnaton of several weak learner one could make a better learner. More formally, assume that we have a set

More information

Kernel Methods and SVMs Extension

Kernel Methods and SVMs Extension Kernel Methods and SVMs Extenson The purpose of ths document s to revew materal covered n Machne Learnng 1 Supervsed Learnng regardng support vector machnes (SVMs). Ths document also provdes a general

More information

The Multiple Classical Linear Regression Model (CLRM): Specification and Assumptions. 1. Introduction

The Multiple Classical Linear Regression Model (CLRM): Specification and Assumptions. 1. Introduction ECONOMICS 5* -- NOTE (Summary) ECON 5* -- NOTE The Multple Classcal Lnear Regresson Model (CLRM): Specfcaton and Assumptons. Introducton CLRM stands for the Classcal Lnear Regresson Model. The CLRM s also

More information

Comparison of Regression Lines

Comparison of Regression Lines STATGRAPHICS Rev. 9/13/2013 Comparson of Regresson Lnes Summary... 1 Data Input... 3 Analyss Summary... 4 Plot of Ftted Model... 6 Condtonal Sums of Squares... 6 Analyss Optons... 7 Forecasts... 8 Confdence

More information

3.1 Expectation of Functions of Several Random Variables. )' be a k-dimensional discrete or continuous random vector, with joint PMF p (, E X E X1 E X

3.1 Expectation of Functions of Several Random Variables. )' be a k-dimensional discrete or continuous random vector, with joint PMF p (, E X E X1 E X Statstcs 1: Probablty Theory II 37 3 EPECTATION OF SEVERAL RANDOM VARIABLES As n Probablty Theory I, the nterest n most stuatons les not on the actual dstrbuton of a random vector, but rather on a number

More information

LINEAR REGRESSION ANALYSIS. MODULE IX Lecture Multicollinearity

LINEAR REGRESSION ANALYSIS. MODULE IX Lecture Multicollinearity LINEAR REGRESSION ANALYSIS MODULE IX Lecture - 31 Multcollnearty Dr. Shalabh Department of Mathematcs and Statstcs Indan Insttute of Technology Kanpur 6. Rdge regresson The OLSE s the best lnear unbased

More information

Linear Approximation with Regularization and Moving Least Squares

Linear Approximation with Regularization and Moving Least Squares Lnear Approxmaton wth Regularzaton and Movng Least Squares Igor Grešovn May 007 Revson 4.6 (Revson : March 004). 5 4 3 0.5 3 3.5 4 Contents: Lnear Fttng...4. Weghted Least Squares n Functon Approxmaton...

More information

A Robust Method for Calculating the Correlation Coefficient

A Robust Method for Calculating the Correlation Coefficient A Robust Method for Calculatng the Correlaton Coeffcent E.B. Nven and C. V. Deutsch Relatonshps between prmary and secondary data are frequently quantfed usng the correlaton coeffcent; however, the tradtonal

More information

CSci 6974 and ECSE 6966 Math. Tech. for Vision, Graphics and Robotics Lecture 21, April 17, 2006 Estimating A Plane Homography

CSci 6974 and ECSE 6966 Math. Tech. for Vision, Graphics and Robotics Lecture 21, April 17, 2006 Estimating A Plane Homography CSc 6974 and ECSE 6966 Math. Tech. for Vson, Graphcs and Robotcs Lecture 21, Aprl 17, 2006 Estmatng A Plane Homography Overvew We contnue wth a dscusson of the major ssues, usng estmaton of plane projectve

More information

Statistical Evaluation of WATFLOOD

Statistical Evaluation of WATFLOOD tatstcal Evaluaton of WATFLD By: Angela MacLean, Dept. of Cvl & Envronmental Engneerng, Unversty of Waterloo, n. ctober, 005 The statstcs program assocated wth WATFLD uses spl.csv fle that s produced wth

More information

Feature Selection: Part 1

Feature Selection: Part 1 CSE 546: Machne Learnng Lecture 5 Feature Selecton: Part 1 Instructor: Sham Kakade 1 Regresson n the hgh dmensonal settng How do we learn when the number of features d s greater than the sample sze n?

More information

1. Inference on Regression Parameters a. Finding Mean, s.d and covariance amongst estimates. 2. Confidence Intervals and Working Hotelling Bands

1. Inference on Regression Parameters a. Finding Mean, s.d and covariance amongst estimates. 2. Confidence Intervals and Working Hotelling Bands Content. Inference on Regresson Parameters a. Fndng Mean, s.d and covarance amongst estmates.. Confdence Intervals and Workng Hotellng Bands 3. Cochran s Theorem 4. General Lnear Testng 5. Measures of

More information

Efficient nonresponse weighting adjustment using estimated response probability

Efficient nonresponse weighting adjustment using estimated response probability Effcent nonresponse weghtng adjustment usng estmated response probablty Jae Kwang Km Department of Appled Statstcs, Yonse Unversty, Seoul, 120-749, KOREA Key Words: Regresson estmator, Propensty score,

More information

Chapter 3 Describing Data Using Numerical Measures

Chapter 3 Describing Data Using Numerical Measures Chapter 3 Student Lecture Notes 3-1 Chapter 3 Descrbng Data Usng Numercal Measures Fall 2006 Fundamentals of Busness Statstcs 1 Chapter Goals To establsh the usefulness of summary measures of data. The

More information

Simultaneous Optimization of Berth Allocation, Quay Crane Assignment and Quay Crane Scheduling Problems in Container Terminals

Simultaneous Optimization of Berth Allocation, Quay Crane Assignment and Quay Crane Scheduling Problems in Container Terminals Smultaneous Optmzaton of Berth Allocaton, Quay Crane Assgnment and Quay Crane Schedulng Problems n Contaner Termnals Necat Aras, Yavuz Türkoğulları, Z. Caner Taşkın, Kuban Altınel Abstract In ths work,

More information

Module 3 LOSSY IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur

Module 3 LOSSY IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur Module 3 LOSSY IMAGE COMPRESSION SYSTEMS Verson ECE IIT, Kharagpur Lesson 6 Theory of Quantzaton Verson ECE IIT, Kharagpur Instructonal Objectves At the end of ths lesson, the students should be able to:

More information

BOOTSTRAP METHOD FOR TESTING OF EQUALITY OF SEVERAL MEANS. M. Krishna Reddy, B. Naveen Kumar and Y. Ramu

BOOTSTRAP METHOD FOR TESTING OF EQUALITY OF SEVERAL MEANS. M. Krishna Reddy, B. Naveen Kumar and Y. Ramu BOOTSTRAP METHOD FOR TESTING OF EQUALITY OF SEVERAL MEANS M. Krshna Reddy, B. Naveen Kumar and Y. Ramu Department of Statstcs, Osmana Unversty, Hyderabad -500 007, Inda. nanbyrozu@gmal.com, ramu0@gmal.com

More information

Chapter 13: Multiple Regression

Chapter 13: Multiple Regression Chapter 13: Multple Regresson 13.1 Developng the multple-regresson Model The general model can be descrbed as: It smplfes for two ndependent varables: The sample ft parameter b 0, b 1, and b are used to

More information

ANSWERS. Problem 1. and the moment generating function (mgf) by. defined for any real t. Use this to show that E( U) var( U)

ANSWERS. Problem 1. and the moment generating function (mgf) by. defined for any real t. Use this to show that E( U) var( U) Econ 413 Exam 13 H ANSWERS Settet er nndelt 9 deloppgaver, A,B,C, som alle anbefales å telle lkt for å gøre det ltt lettere å stå. Svar er gtt . Unfortunately, there s a prntng error n the hnt of

More information

Markov Chain Monte Carlo Lecture 6

Markov Chain Monte Carlo Lecture 6 where (x 1,..., x N ) X N, N s called the populaton sze, f(x) f (x) for at least one {1, 2,..., N}, and those dfferent from f(x) are called the tral dstrbutons n terms of mportance samplng. Dfferent ways

More information

DETERMINATION OF UNCERTAINTY ASSOCIATED WITH QUANTIZATION ERRORS USING THE BAYESIAN APPROACH

DETERMINATION OF UNCERTAINTY ASSOCIATED WITH QUANTIZATION ERRORS USING THE BAYESIAN APPROACH Proceedngs, XVII IMEKO World Congress, June 7, 3, Dubrovn, Croata Proceedngs, XVII IMEKO World Congress, June 7, 3, Dubrovn, Croata TC XVII IMEKO World Congress Metrology n the 3rd Mllennum June 7, 3,

More information

Gaussian Mixture Models

Gaussian Mixture Models Lab Gaussan Mxture Models Lab Objectve: Understand the formulaton of Gaussan Mxture Models (GMMs) and how to estmate GMM parameters. You ve already seen GMMs as the observaton dstrbuton n certan contnuous

More information

COMPARISON OF SOME RELIABILITY CHARACTERISTICS BETWEEN REDUNDANT SYSTEMS REQUIRING SUPPORTING UNITS FOR THEIR OPERATIONS

COMPARISON OF SOME RELIABILITY CHARACTERISTICS BETWEEN REDUNDANT SYSTEMS REQUIRING SUPPORTING UNITS FOR THEIR OPERATIONS Avalable onlne at http://sck.org J. Math. Comput. Sc. 3 (3), No., 6-3 ISSN: 97-537 COMPARISON OF SOME RELIABILITY CHARACTERISTICS BETWEEN REDUNDANT SYSTEMS REQUIRING SUPPORTING UNITS FOR THEIR OPERATIONS

More information

LINEAR REGRESSION ANALYSIS. MODULE IX Lecture Multicollinearity

LINEAR REGRESSION ANALYSIS. MODULE IX Lecture Multicollinearity LINEAR REGRESSION ANALYSIS MODULE IX Lecture - 30 Multcollnearty Dr. Shalabh Department of Mathematcs and Statstcs Indan Insttute of Technology Kanpur 2 Remedes for multcollnearty Varous technques have

More information

Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur

Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur Analyss of Varance and Desgn of Experment-I MODULE VIII LECTURE - 34 ANALYSIS OF VARIANCE IN RANDOM-EFFECTS MODEL AND MIXED-EFFECTS EFFECTS MODEL Dr Shalabh Department of Mathematcs and Statstcs Indan

More information

Chapter 2 - The Simple Linear Regression Model S =0. e i is a random error. S β2 β. This is a minimization problem. Solution is a calculus exercise.

Chapter 2 - The Simple Linear Regression Model S =0. e i is a random error. S β2 β. This is a minimization problem. Solution is a calculus exercise. Chapter - The Smple Lnear Regresson Model The lnear regresson equaton s: where y + = β + β e for =,..., y and are observable varables e s a random error How can an estmaton rule be constructed for the

More information

Computation of Higher Order Moments from Two Multinomial Overdispersion Likelihood Models

Computation of Higher Order Moments from Two Multinomial Overdispersion Likelihood Models Computaton of Hgher Order Moments from Two Multnomal Overdsperson Lkelhood Models BY J. T. NEWCOMER, N. K. NEERCHAL Department of Mathematcs and Statstcs, Unversty of Maryland, Baltmore County, Baltmore,

More information

Chapter 6. Supplemental Text Material

Chapter 6. Supplemental Text Material Chapter 6. Supplemental Text Materal S6-. actor Effect Estmates are Least Squares Estmates We have gven heurstc or ntutve explanatons of how the estmates of the factor effects are obtaned n the textboo.

More information

UNIVERSITY OF TORONTO Faculty of Arts and Science. December 2005 Examinations STA437H1F/STA1005HF. Duration - 3 hours

UNIVERSITY OF TORONTO Faculty of Arts and Science. December 2005 Examinations STA437H1F/STA1005HF. Duration - 3 hours UNIVERSITY OF TORONTO Faculty of Arts and Scence December 005 Examnatons STA47HF/STA005HF Duraton - hours AIDS ALLOWED: (to be suppled by the student) Non-programmable calculator One handwrtten 8.5'' x

More information

STAT 511 FINAL EXAM NAME Spring 2001

STAT 511 FINAL EXAM NAME Spring 2001 STAT 5 FINAL EXAM NAME Sprng Instructons: Ths s a closed book exam. No notes or books are allowed. ou may use a calculator but you are not allowed to store notes or formulas n the calculator. Please wrte

More information

x i1 =1 for all i (the constant ).

x i1 =1 for all i (the constant ). Chapter 5 The Multple Regresson Model Consder an economc model where the dependent varable s a functon of K explanatory varables. The economc model has the form: y = f ( x,x,..., ) xk Approxmate ths by

More information

COS 521: Advanced Algorithms Game Theory and Linear Programming

COS 521: Advanced Algorithms Game Theory and Linear Programming COS 521: Advanced Algorthms Game Theory and Lnear Programmng Moses Charkar February 27, 2013 In these notes, we ntroduce some basc concepts n game theory and lnear programmng (LP). We show a connecton

More information

Notes on Frequency Estimation in Data Streams

Notes on Frequency Estimation in Data Streams Notes on Frequency Estmaton n Data Streams In (one of) the data streamng model(s), the data s a sequence of arrvals a 1, a 2,..., a m of the form a j = (, v) where s the dentty of the tem and belongs to

More information

Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur

Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur Analyss of Varance and Desgn of Exerments-I MODULE III LECTURE - 2 EXPERIMENTAL DESIGN MODELS Dr. Shalabh Deartment of Mathematcs and Statstcs Indan Insttute of Technology Kanur 2 We consder the models

More information

CHAPTER 5 NUMERICAL EVALUATION OF DYNAMIC RESPONSE

CHAPTER 5 NUMERICAL EVALUATION OF DYNAMIC RESPONSE CHAPTER 5 NUMERICAL EVALUATION OF DYNAMIC RESPONSE Analytcal soluton s usually not possble when exctaton vares arbtrarly wth tme or f the system s nonlnear. Such problems can be solved by numercal tmesteppng

More information

j) = 1 (note sigma notation) ii. Continuous random variable (e.g. Normal distribution) 1. density function: f ( x) 0 and f ( x) dx = 1

j) = 1 (note sigma notation) ii. Continuous random variable (e.g. Normal distribution) 1. density function: f ( x) 0 and f ( x) dx = 1 Random varables Measure of central tendences and varablty (means and varances) Jont densty functons and ndependence Measures of assocaton (covarance and correlaton) Interestng result Condtonal dstrbutons

More information

NEW ASTERISKS IN VERSION 2.0 OF ACTIVEPI

NEW ASTERISKS IN VERSION 2.0 OF ACTIVEPI NEW ASTERISKS IN VERSION 2.0 OF ACTIVEPI ASTERISK ADDED ON LESSON PAGE 3-1 after the second sentence under Clncal Trals Effcacy versus Effectveness versus Effcency The apprasal of a new or exstng healthcare

More information

Department of Statistics University of Toronto STA305H1S / 1004 HS Design and Analysis of Experiments Term Test - Winter Solution

Department of Statistics University of Toronto STA305H1S / 1004 HS Design and Analysis of Experiments Term Test - Winter Solution Department of Statstcs Unversty of Toronto STA35HS / HS Desgn and Analyss of Experments Term Test - Wnter - Soluton February, Last Name: Frst Name: Student Number: Instructons: Tme: hours. Ads: a non-programmable

More information

USE OF DOUBLE SAMPLING SCHEME IN ESTIMATING THE MEAN OF STRATIFIED POPULATION UNDER NON-RESPONSE

USE OF DOUBLE SAMPLING SCHEME IN ESTIMATING THE MEAN OF STRATIFIED POPULATION UNDER NON-RESPONSE STATISTICA, anno LXXV, n. 4, 015 USE OF DOUBLE SAMPLING SCHEME IN ESTIMATING THE MEAN OF STRATIFIED POPULATION UNDER NON-RESPONSE Manoj K. Chaudhary 1 Department of Statstcs, Banaras Hndu Unversty, Varanas,

More information

Copyright 2017 by Taylor Enterprises, Inc., All Rights Reserved. Adjusted Control Limits for P Charts. Dr. Wayne A. Taylor

Copyright 2017 by Taylor Enterprises, Inc., All Rights Reserved. Adjusted Control Limits for P Charts. Dr. Wayne A. Taylor Taylor Enterprses, Inc. Control Lmts for P Charts Copyrght 2017 by Taylor Enterprses, Inc., All Rghts Reserved. Control Lmts for P Charts Dr. Wayne A. Taylor Abstract: P charts are used for count data

More information

4 Analysis of Variance (ANOVA) 5 ANOVA. 5.1 Introduction. 5.2 Fixed Effects ANOVA

4 Analysis of Variance (ANOVA) 5 ANOVA. 5.1 Introduction. 5.2 Fixed Effects ANOVA 4 Analyss of Varance (ANOVA) 5 ANOVA 51 Introducton ANOVA ANOVA s a way to estmate and test the means of multple populatons We wll start wth one-way ANOVA If the populatons ncluded n the study are selected

More information

Econ Statistical Properties of the OLS estimator. Sanjaya DeSilva

Econ Statistical Properties of the OLS estimator. Sanjaya DeSilva Econ 39 - Statstcal Propertes of the OLS estmator Sanjaya DeSlva September, 008 1 Overvew Recall that the true regresson model s Y = β 0 + β 1 X + u (1) Applyng the OLS method to a sample of data, we estmate

More information

See Book Chapter 11 2 nd Edition (Chapter 10 1 st Edition)

See Book Chapter 11 2 nd Edition (Chapter 10 1 st Edition) Count Data Models See Book Chapter 11 2 nd Edton (Chapter 10 1 st Edton) Count data consst of non-negatve nteger values Examples: number of drver route changes per week, the number of trp departure changes

More information

e i is a random error

e i is a random error Chapter - The Smple Lnear Regresson Model The lnear regresson equaton s: where + β + β e for,..., and are observable varables e s a random error How can an estmaton rule be constructed for the unknown

More information

Statistical tables are provided Two Hours UNIVERSITY OF MANCHESTER. Date: Wednesday 4 th June 2008 Time: 1400 to 1600

Statistical tables are provided Two Hours UNIVERSITY OF MANCHESTER. Date: Wednesday 4 th June 2008 Time: 1400 to 1600 Statstcal tables are provded Two Hours UNIVERSITY OF MNCHESTER Medcal Statstcs Date: Wednesday 4 th June 008 Tme: 1400 to 1600 MT3807 Electronc calculators may be used provded that they conform to Unversty

More information

Problem Set 9 Solutions

Problem Set 9 Solutions Desgn and Analyss of Algorthms May 4, 2015 Massachusetts Insttute of Technology 6.046J/18.410J Profs. Erk Demane, Srn Devadas, and Nancy Lynch Problem Set 9 Solutons Problem Set 9 Solutons Ths problem

More information

A note on regression estimation with unknown population size

A note on regression estimation with unknown population size Statstcs Publcatons Statstcs 6-016 A note on regresson estmaton wth unknown populaton sze Mchael A. Hdroglou Statstcs Canada Jae Kwang Km Iowa State Unversty jkm@astate.edu Chrstan Olver Nambeu Statstcs

More information

Stat 642, Lecture notes for 01/27/ d i = 1 t. n i t nj. n j

Stat 642, Lecture notes for 01/27/ d i = 1 t. n i t nj. n j Stat 642, Lecture notes for 01/27/05 18 Rate Standardzaton Contnued: Note that f T n t where T s the cumulatve follow-up tme and n s the number of subjects at rsk at the mdpont or nterval, and d s the

More information

An Interactive Optimisation Tool for Allocation Problems

An Interactive Optimisation Tool for Allocation Problems An Interactve Optmsaton ool for Allocaton Problems Fredr Bonäs, Joam Westerlund and apo Westerlund Process Desgn Laboratory, Faculty of echnology, Åbo Aadem Unversty, uru 20500, Fnland hs paper presents

More information

Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur

Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur Analyss of Varance and Desgn of Experment-I MODULE VII LECTURE - 3 ANALYSIS OF COVARIANCE Dr Shalabh Department of Mathematcs and Statstcs Indan Insttute of Technology Kanpur Any scentfc experment s performed

More information

SIMPLE LINEAR REGRESSION

SIMPLE LINEAR REGRESSION Smple Lnear Regresson and Correlaton Introducton Prevousl, our attenton has been focused on one varable whch we desgnated b x. Frequentl, t s desrable to learn somethng about the relatonshp between two

More information

Multiple Contrasts (Simulation)

Multiple Contrasts (Simulation) Chapter 590 Multple Contrasts (Smulaton) Introducton Ths procedure uses smulaton to analyze the power and sgnfcance level of two multple-comparson procedures that perform two-sded hypothess tests of contrasts

More information

Chapter 12 Analysis of Covariance

Chapter 12 Analysis of Covariance Chapter Analyss of Covarance Any scentfc experment s performed to know somethng that s unknown about a group of treatments and to test certan hypothess about the correspondng treatment effect When varablty

More information

FREQUENCY DISTRIBUTIONS Page 1 of The idea of a frequency distribution for sets of observations will be introduced,

FREQUENCY DISTRIBUTIONS Page 1 of The idea of a frequency distribution for sets of observations will be introduced, FREQUENCY DISTRIBUTIONS Page 1 of 6 I. Introducton 1. The dea of a frequency dstrbuton for sets of observatons wll be ntroduced, together wth some of the mechancs for constructng dstrbutons of data. Then

More information

On the Multicriteria Integer Network Flow Problem

On the Multicriteria Integer Network Flow Problem BULGARIAN ACADEMY OF SCIENCES CYBERNETICS AND INFORMATION TECHNOLOGIES Volume 5, No 2 Sofa 2005 On the Multcrtera Integer Network Flow Problem Vassl Vasslev, Marana Nkolova, Maryana Vassleva Insttute of

More information

Chapter Newton s Method

Chapter Newton s Method Chapter 9. Newton s Method After readng ths chapter, you should be able to:. Understand how Newton s method s dfferent from the Golden Secton Search method. Understand how Newton s method works 3. Solve

More information

Resource Allocation with a Budget Constraint for Computing Independent Tasks in the Cloud

Resource Allocation with a Budget Constraint for Computing Independent Tasks in the Cloud Resource Allocaton wth a Budget Constrant for Computng Independent Tasks n the Cloud Wemng Sh and Bo Hong School of Electrcal and Computer Engneerng Georga Insttute of Technology, USA 2nd IEEE Internatonal

More information

Here is the rationale: If X and y have a strong positive relationship to one another, then ( x x) will tend to be positive when ( y y)

Here is the rationale: If X and y have a strong positive relationship to one another, then ( x x) will tend to be positive when ( y y) Secton 1.5 Correlaton In the prevous sectons, we looked at regresson and the value r was a measurement of how much of the varaton n y can be attrbuted to the lnear relatonshp between y and x. In ths secton,

More information

Lecture 3: Probability Distributions

Lecture 3: Probability Distributions Lecture 3: Probablty Dstrbutons Random Varables Let us begn by defnng a sample space as a set of outcomes from an experment. We denote ths by S. A random varable s a functon whch maps outcomes nto the

More information

VQ widely used in coding speech, image, and video

VQ widely used in coding speech, image, and video at Scalar quantzers are specal cases of vector quantzers (VQ): they are constraned to look at one sample at a tme (memoryless) VQ does not have such constrant better RD perfomance expected Source codng

More information

REPLICATION VARIANCE ESTIMATION UNDER TWO-PHASE SAMPLING IN THE PRESENCE OF NON-RESPONSE

REPLICATION VARIANCE ESTIMATION UNDER TWO-PHASE SAMPLING IN THE PRESENCE OF NON-RESPONSE STATISTICA, anno LXXIV, n. 3, 2014 REPLICATION VARIANCE ESTIMATION UNDER TWO-PHASE SAMPLING IN THE PRESENCE OF NON-RESPONSE Muqaddas Javed 1 Natonal College of Busness Admnstraton and Economcs, Lahore,

More information

College of Computer & Information Science Fall 2009 Northeastern University 20 October 2009

College of Computer & Information Science Fall 2009 Northeastern University 20 October 2009 College of Computer & Informaton Scence Fall 2009 Northeastern Unversty 20 October 2009 CS7880: Algorthmc Power Tools Scrbe: Jan Wen and Laura Poplawsk Lecture Outlne: Prmal-dual schema Network Desgn:

More information

LECTURE 9 CANONICAL CORRELATION ANALYSIS

LECTURE 9 CANONICAL CORRELATION ANALYSIS LECURE 9 CANONICAL CORRELAION ANALYSIS Introducton he concept of canoncal correlaton arses when we want to quantfy the assocatons between two sets of varables. For example, suppose that the frst set of

More information

Boostrapaggregating (Bagging)

Boostrapaggregating (Bagging) Boostrapaggregatng (Baggng) An ensemble meta-algorthm desgned to mprove the stablty and accuracy of machne learnng algorthms Can be used n both regresson and classfcaton Reduces varance and helps to avod

More information

Chapter 5. Solution of System of Linear Equations. Module No. 6. Solution of Inconsistent and Ill Conditioned Systems

Chapter 5. Solution of System of Linear Equations. Module No. 6. Solution of Inconsistent and Ill Conditioned Systems Numercal Analyss by Dr. Anta Pal Assstant Professor Department of Mathematcs Natonal Insttute of Technology Durgapur Durgapur-713209 emal: anta.bue@gmal.com 1 . Chapter 5 Soluton of System of Lnear Equatons

More information

Open Systems: Chemical Potential and Partial Molar Quantities Chemical Potential

Open Systems: Chemical Potential and Partial Molar Quantities Chemical Potential Open Systems: Chemcal Potental and Partal Molar Quanttes Chemcal Potental For closed systems, we have derved the followng relatonshps: du = TdS pdv dh = TdS + Vdp da = SdT pdv dg = VdP SdT For open systems,

More information

A new construction of 3-separable matrices via an improved decoding of Macula s construction

A new construction of 3-separable matrices via an improved decoding of Macula s construction Dscrete Optmzaton 5 008 700 704 Contents lsts avalable at ScenceDrect Dscrete Optmzaton journal homepage: wwwelsevercom/locate/dsopt A new constructon of 3-separable matrces va an mproved decodng of Macula

More information

Phase I Monitoring of Nonlinear Profiles

Phase I Monitoring of Nonlinear Profiles Phase I Montorng of Nonlnear Profles James D. Wllams Wllam H. Woodall Jeffrey B. Brch May, 003 J.D. Wllams, Bll Woodall, Jeff Brch, Vrgna Tech 003 Qualty & Productvty Research Conference, Yorktown Heghts,

More information

1 Convex Optimization

1 Convex Optimization Convex Optmzaton We wll consder convex optmzaton problems. Namely, mnmzaton problems where the objectve s convex (we assume no constrants for now). Such problems often arse n machne learnng. For example,

More information

Convergence of random processes

Convergence of random processes DS-GA 12 Lecture notes 6 Fall 216 Convergence of random processes 1 Introducton In these notes we study convergence of dscrete random processes. Ths allows to characterze phenomena such as the law of large

More information

PARTICIPATION FACTOR IN MODAL ANALYSIS OF POWER SYSTEMS STABILITY

PARTICIPATION FACTOR IN MODAL ANALYSIS OF POWER SYSTEMS STABILITY POZNAN UNIVE RSITY OF TE CHNOLOGY ACADE MIC JOURNALS No 86 Electrcal Engneerng 6 Volodymyr KONOVAL* Roman PRYTULA** PARTICIPATION FACTOR IN MODAL ANALYSIS OF POWER SYSTEMS STABILITY Ths paper provdes a

More information

Lecture 10 Support Vector Machines II

Lecture 10 Support Vector Machines II Lecture 10 Support Vector Machnes II 22 February 2016 Taylor B. Arnold Yale Statstcs STAT 365/665 1/28 Notes: Problem 3 s posted and due ths upcomng Frday There was an early bug n the fake-test data; fxed

More information

STATIC OPTIMIZATION: BASICS

STATIC OPTIMIZATION: BASICS STATIC OPTIMIZATION: BASICS 7A- Lecture Overvew What s optmzaton? What applcatons? How can optmzaton be mplemented? How can optmzaton problems be solved? Why should optmzaton apply n human movement? How

More information

Annexes. EC.1. Cycle-base move illustration. EC.2. Problem Instances

Annexes. EC.1. Cycle-base move illustration. EC.2. Problem Instances ec Annexes Ths Annex frst llustrates a cycle-based move n the dynamc-block generaton tabu search. It then dsplays the characterstcs of the nstance sets, followed by detaled results of the parametercalbraton

More information

Uncertainty as the Overlap of Alternate Conditional Distributions

Uncertainty as the Overlap of Alternate Conditional Distributions Uncertanty as the Overlap of Alternate Condtonal Dstrbutons Olena Babak and Clayton V. Deutsch Centre for Computatonal Geostatstcs Department of Cvl & Envronmental Engneerng Unversty of Alberta An mportant

More information

4DVAR, according to the name, is a four-dimensional variational method.

4DVAR, according to the name, is a four-dimensional variational method. 4D-Varatonal Data Assmlaton (4D-Var) 4DVAR, accordng to the name, s a four-dmensonal varatonal method. 4D-Var s actually a drect generalzaton of 3D-Var to handle observatons that are dstrbuted n tme. The

More information

ISSN: ISO 9001:2008 Certified International Journal of Engineering and Innovative Technology (IJEIT) Volume 3, Issue 1, July 2013

ISSN: ISO 9001:2008 Certified International Journal of Engineering and Innovative Technology (IJEIT) Volume 3, Issue 1, July 2013 ISSN: 2277-375 Constructon of Trend Free Run Orders for Orthogonal rrays Usng Codes bstract: Sometmes when the expermental runs are carred out n a tme order sequence, the response can depend on the run

More information

Lectures - Week 4 Matrix norms, Conditioning, Vector Spaces, Linear Independence, Spanning sets and Basis, Null space and Range of a Matrix

Lectures - Week 4 Matrix norms, Conditioning, Vector Spaces, Linear Independence, Spanning sets and Basis, Null space and Range of a Matrix Lectures - Week 4 Matrx norms, Condtonng, Vector Spaces, Lnear Independence, Spannng sets and Bass, Null space and Range of a Matrx Matrx Norms Now we turn to assocatng a number to each matrx. We could

More information

Topic- 11 The Analysis of Variance

Topic- 11 The Analysis of Variance Topc- 11 The Analyss of Varance Expermental Desgn The samplng plan or expermental desgn determnes the way that a sample s selected. In an observatonal study, the expermenter observes data that already

More information

Bayesian Planning of Hit-Miss Inspection Tests

Bayesian Planning of Hit-Miss Inspection Tests Bayesan Plannng of Ht-Mss Inspecton Tests Yew-Meng Koh a and Wllam Q Meeker a a Center for Nondestructve Evaluaton, Department of Statstcs, Iowa State Unversty, Ames, Iowa 5000 Abstract Although some useful

More information

Resource Allocation and Decision Analysis (ECON 8010) Spring 2014 Foundations of Regression Analysis

Resource Allocation and Decision Analysis (ECON 8010) Spring 2014 Foundations of Regression Analysis Resource Allocaton and Decson Analss (ECON 800) Sprng 04 Foundatons of Regresson Analss Readng: Regresson Analss (ECON 800 Coursepak, Page 3) Defntons and Concepts: Regresson Analss statstcal technques

More information

Online Appendix to: Axiomatization and measurement of Quasi-hyperbolic Discounting

Online Appendix to: Axiomatization and measurement of Quasi-hyperbolic Discounting Onlne Appendx to: Axomatzaton and measurement of Quas-hyperbolc Dscountng José Lus Montel Olea Tomasz Strzaleck 1 Sample Selecton As dscussed before our ntal sample conssts of two groups of subjects. Group

More information

Statistics Chapter 4

Statistics Chapter 4 Statstcs Chapter 4 "There are three knds of les: les, damned les, and statstcs." Benjamn Dsrael, 1895 (Brtsh statesman) Gaussan Dstrbuton, 4-1 If a measurement s repeated many tmes a statstcal treatment

More information