Chapter 2: Simple Random Sampling
Sampling Theory, Shalabh, IIT Kanpur

Simple random sampling (SRS) is a method of selecting a sample of $n$ sampling units out of a population of $N$ sampling units such that every sampling unit has an equal chance of being chosen.

The samples can be drawn in two possible ways.
• The sampling units are chosen without replacement, in the sense that units once chosen are not placed back in the population.
• The sampling units are chosen with replacement, in the sense that the chosen units are placed back in the population.

1. Simple random sampling without replacement (SRSWOR):
SRSWOR is a method of selecting $n$ units out of the $N$ units one by one such that at any stage of selection, each of the remaining units has the same chance of being selected. Unconditionally, every unit has probability $1/N$ of being selected at any given draw.

2. Simple random sampling with replacement (SRSWR):
SRSWR is a method of selecting $n$ units out of the $N$ units one by one such that at each stage of selection, each unit has an equal chance of being selected, i.e., $1/N$.

Procedure of selection of a random sample:
The selection of a random sample follows these steps:
1. Identify the $N$ units in the population with the numbers 1 to $N$.
2. Choose any random number arbitrarily in the random number table and start reading numbers.
3. Choose the sampling unit whose serial number corresponds to the random number drawn from the table of random numbers.
4. In the case of SRSWR, all the random numbers are accepted even if repeated more than once. In the case of SRSWOR, if any random number is repeated, then it is ignored and more numbers are drawn.
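The four steps above can be sketched in code, with a pseudo-random number generator standing in for the random number table. This is a minimal illustration, not a prescribed implementation: the function name `draw_srs`, the seed, and the values $N = 10$, $n = 4$ are assumptions made for the example.

```python
import random

def draw_srs(N, n, replace, rng):
    """Draw n serial numbers from 1..N by reading random numbers one at a time.

    Hypothetical helper: mirrors the table-based procedure, not a library API.
    """
    if not replace and n > N:
        raise ValueError("SRSWOR requires n <= N")
    sample = []
    while len(sample) < n:
        number = rng.randint(1, N)        # step 2: read a random number in 1..N
        if not replace and number in sample:
            continue                      # step 4 (SRSWOR): ignore a repeated number
        sample.append(number)             # step 3: select the unit with this serial number
    return sample

rng = random.Random(2024)                 # fixed seed only to make the run repeatable
wor = draw_srs(N=10, n=4, replace=False, rng=rng)
wr = draw_srs(N=10, n=4, replace=True, rng=rng)
print("SRSWOR sample:", wor)
print("SRSWR sample: ", wr)
```

In everyday use, Python's standard library already offers `random.sample(population, k)` for sampling without replacement and `random.choices(population, k=k)` for sampling with replacement; the loop above is written out only to make the accept/ignore rule visible.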
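Such a process can be implemented through programming using the discrete uniform distribution. Any number between 1 and $N$ can be generated from this distribution, and the corresponding unit can be selected into the sample by associating an index with each sampling unit. Many statistical software packages such as R and SAS have built-in functions for drawing a sample using SRSWOR or SRSWR.

Notations:
The following notations will be used in further notes:
$N$ : number of sampling units in the population (population size)
$n$ : number of sampling units in the sample (sample size)
$Y$ : the characteristic under consideration
$Y_i$ : value of the characteristic for the $i$th unit of the population
$\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$ : sample mean
$\bar{Y} = \frac{1}{N}\sum_{i=1}^{N} Y_i$ : population mean
$$S^2 = \frac{1}{N-1}\sum_{i=1}^{N}(Y_i-\bar{Y})^2 = \frac{1}{N-1}\left(\sum_{i=1}^{N}Y_i^2 - N\bar{Y}^2\right)$$
$$\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}(Y_i-\bar{Y})^2 = \frac{1}{N}\left(\sum_{i=1}^{N}Y_i^2 - N\bar{Y}^2\right)$$
$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(y_i-\bar{y})^2 = \frac{1}{n-1}\left(\sum_{i=1}^{n}y_i^2 - n\bar{y}^2\right)$$

Probability of drawing a sample:

1. SRSWOR:
If $n$ units are selected by SRSWOR, the total number of possible samples is $\binom{N}{n}$. So the probability of selecting any one of these samples is $1/\binom{N}{n}$.

Note that a unit can be selected at any one of the $n$ draws. Let $u_i$ be the $i$th unit selected in the sample. This unit can be selected in the sample either at the first draw, the second draw, ..., or the $n$th draw.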
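Let $P_j(i)$ denote the probability of selection of $u_i$ at the $j$th draw, $j = 1, 2, \ldots, n$. Then the probability that $u_i$ is included in the sample is
$$P(u_i) = P_1(i) + P_2(i) + \cdots + P_n(i) = \underbrace{\frac{1}{N} + \frac{1}{N} + \cdots + \frac{1}{N}}_{n \text{ times}} = \frac{n}{N}.$$
Now if $u_1, u_2, \ldots, u_n$ are the $n$ units selected in the sample, then the probability of their selection is
$$P(u_1, u_2, \ldots, u_n) = P(u_1)\,P(u_2)\cdots P(u_n),$$
where each factor is evaluated given the units already selected. Note that when the second unit is to be selected, there are $(n-1)$ units left to be selected in the sample from the population of $(N-1)$ units. Similarly, when the third unit is to be selected, there are $(n-2)$ units left to be selected in the sample from the population of $(N-2)$ units, and so on. If $P(u_1) = \frac{n}{N}$, then
$$P(u_2) = \frac{n-1}{N-1}, \;\ldots,\; P(u_n) = \frac{1}{N-n+1}.$$
Thus
$$P(u_1, u_2, \ldots, u_n) = \frac{n}{N}\cdot\frac{n-1}{N-1}\cdot\frac{n-2}{N-2}\cdots\frac{1}{N-n+1} = \frac{1}{\binom{N}{n}}.$$

Alternative approach:
The probability of drawing a sample in SRSWOR can alternatively be found as follows. Let $u_{i(k)}$ denote the $i$th unit drawn at the $k$th draw. Note that the $i$th unit can be any unit out of the $N$ units. Then $s_o = (u_{i(1)}, u_{i(2)}, \ldots, u_{i(n)})$ is an ordered sample in which the order in which the units are drawn (that is, $u_{i(1)}$ drawn at the first draw, $u_{i(2)}$ drawn at the second draw, and so on) is also considered. The probability of selection of such an ordered sample is
$$P(s_o) = P(u_{i(1)})\,P(u_{i(2)} \mid u_{i(1)})\,P(u_{i(3)} \mid u_{i(1)} u_{i(2)})\cdots P(u_{i(n)} \mid u_{i(1)} u_{i(2)} \cdots u_{i(n-1)}).$$
Here $P(u_{i(k)} \mid u_{i(1)} u_{i(2)} \cdots u_{i(k-1)})$ is the probability of drawing $u_{i(k)}$ at the $k$th draw given that $u_{i(1)}, u_{i(2)}, \ldots, u_{i(k-1)}$ have already been drawn in the first $(k-1)$ draws.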
Such a probability is obtained as
$$P(u_{i(k)} \mid u_{i(1)} u_{i(2)} \cdots u_{i(k-1)}) = \frac{1}{N-k+1}.$$
So
$$P(s_o) = \prod_{k=1}^{n}\frac{1}{N-k+1} = \frac{(N-n)!}{N!}.$$
The number of orders in which a sample of size $n$ can be drawn is $n!$.
The probability of drawing a sample in a given order is $\frac{(N-n)!}{N!}$.
So the probability of drawing a sample in which the order of the units is irrelevant is
$$n!\,\frac{(N-n)!}{N!} = \frac{1}{\binom{N}{n}}.$$

2. SRSWR:
When $n$ units are selected with SRSWR, the total number of possible samples is $N^n$. The probability of drawing a sample is $\frac{1}{N^n}$.

Alternatively, let $u_i$ be the $i$th unit selected in the sample. This unit can be selected in the sample either at the first draw, the second draw, ..., or the $n$th draw. At any stage, there are always $N$ units in the population in the case of SRSWR, so the probability of selection of $u_i$ at any stage is $1/N$ for all $i = 1, 2, \ldots, n$. Then the probability of selection of the $n$ units $u_1, u_2, \ldots, u_n$ in the sample is
$$P(u_1, u_2, \ldots, u_n) = P(u_1)\,P(u_2)\cdots P(u_n) = \frac{1}{N}\cdot\frac{1}{N}\cdots\frac{1}{N} = \frac{1}{N^n}.$$
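The counts and probabilities above can be checked by brute-force enumeration on a small case. The following sketch assumes illustrative values $N = 5$ and $n = 3$; the variable names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations, permutations, product
from math import comb, factorial

N, n = 5, 3                                   # assumed small sizes for illustration
units = range(1, N + 1)

unordered_wor = list(combinations(units, n))  # SRSWOR samples, order ignored
ordered_wor = list(permutations(units, n))    # SRSWOR samples, order kept
ordered_wr = list(product(units, repeat=n))   # SRSWR samples (ordered, repeats allowed)

# Probability of one ordered SRSWOR sample: (N-n)!/N!
p_ordered = Fraction(factorial(N - n), factorial(N))
# Each unordered sample arises from n! orderings.
p_unordered = factorial(n) * p_ordered
# Probability of one SRSWR sample: 1/N^n
p_wr = Fraction(1, N ** n)

print(len(unordered_wor), comb(N, n), p_unordered)   # 10 10 1/10
print(len(ordered_wor), p_ordered)                   # 60 1/60
print(len(ordered_wr), p_wr)                         # 125 1/125
```

Exact fractions are used instead of floats so that the equalities hold without rounding concerns.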
Probability of drawing a unit

1. SRSWOR
Let $A_\ell$ denote the event that a particular unit $u_j$ is not selected at the $\ell$th draw. The probability of selecting, say, the $j$th unit at the $k$th draw is
$$P(\text{selection of } u_j \text{ at the } k\text{th draw}) = P(A_1 \cap A_2 \cap \cdots \cap A_{k-1} \cap \bar{A}_k)$$
$$= P(A_1)\,P(A_2 \mid A_1)\,P(A_3 \mid A_1 A_2)\cdots P(A_{k-1} \mid A_1, A_2, \ldots, A_{k-2})\,P(\bar{A}_k \mid A_1, A_2, \ldots, A_{k-1})$$
$$= \left(1-\frac{1}{N}\right)\left(1-\frac{1}{N-1}\right)\left(1-\frac{1}{N-2}\right)\cdots\left(1-\frac{1}{N-k+2}\right)\frac{1}{N-k+1}$$
$$= \frac{N-1}{N}\cdot\frac{N-2}{N-1}\cdots\frac{N-k+1}{N-k+2}\cdot\frac{1}{N-k+1} = \frac{1}{N}.$$

2. SRSWR
$$P[\text{selection of } u_j \text{ at the } k\text{th draw}] = \frac{1}{N}.$$

Estimation of population mean and population variance

One of the main objectives after the selection of a sample is to learn about the tendency of the data to cluster around a central value and the scatter of the data around that central value. Among the various indicators of central tendency and dispersion, the popular choices are the arithmetic mean and the variance. So the population mean and population variability are generally measured by the arithmetic mean (or weighted arithmetic mean) and the variance, respectively. There are various popular estimators for estimating the population mean and population variance. Among them, the sample arithmetic mean and the sample variance are more popular than other estimators. One reason to use these estimators is that they possess nice statistical properties. Moreover, they are also obtained through well-established statistical estimation procedures such as maximum likelihood estimation, least squares estimation, and the method of moments under several standard statistical distributions. One may also consider other indicators such as the median, mode, geometric mean, and harmonic mean for measuring central tendency, and the mean deviation, absolute deviation, Pitman nearness, etc. for measuring dispersion. The properties of such estimators can be studied by numerical procedures such as bootstrapping.
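1. Estimation of population mean

Let us consider the sample arithmetic mean $\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$ as an estimator of the population mean $\bar{Y} = \frac{1}{N}\sum_{i=1}^{N} Y_i$ and verify that $\bar{y}$ is an unbiased estimator of $\bar{Y}$ under the two cases.

SRSWOR
Let $t_m = \sum_{i=1}^{n} y_i$ denote the total of the $m$th of the $\binom{N}{n}$ possible samples. Then
$$E(\bar{y}) = \frac{1}{n}E\left(\sum_{i=1}^{n} y_i\right) = \frac{1}{n}E(t_m) = \frac{1}{n}\cdot\frac{1}{\binom{N}{n}}\sum_{m=1}^{\binom{N}{n}} t_m.$$
When $n$ units are sampled from $N$ units without replacement, each unit of the population can occur together with $(n-1)$ other units selected out of the remaining $(N-1)$ units in the population, so each unit occurs in $\binom{N-1}{n-1}$ of the $\binom{N}{n}$ possible samples. So
$$\sum_{m=1}^{\binom{N}{n}} t_m = \binom{N-1}{n-1}\sum_{i=1}^{N} Y_i.$$
Now
$$E(\bar{y}) = \frac{(N-1)!}{(n-1)!\,(N-n)!}\cdot\frac{n!\,(N-n)!}{n\,N!}\sum_{i=1}^{N} Y_i = \frac{1}{N}\sum_{i=1}^{N} Y_i = \bar{Y}.$$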
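Thus $\bar{y}$ is an unbiased estimator of $\bar{Y}$. Alternatively, the following approach can also be adopted to show the unbiasedness property:
$$E(\bar{y}) = \frac{1}{n}\sum_{j=1}^{n}E(y_j) = \frac{1}{n}\sum_{j=1}^{n}\left[\sum_{i=1}^{N}Y_i P_j(i)\right] = \frac{1}{n}\sum_{j=1}^{n}\left[\sum_{i=1}^{N}Y_i\cdot\frac{1}{N}\right] = \frac{1}{n}\sum_{j=1}^{n}\bar{Y} = \bar{Y},$$
where $P_j(i)$ denotes the probability of selection of the $i$th unit at the $j$th stage.

SRSWR
$$E(\bar{y}) = \frac{1}{n}\sum_{i=1}^{n}E(y_i) = \frac{1}{n}\sum_{i=1}^{n}(Y_1P_1 + \cdots + Y_NP_N) = \frac{1}{n}\sum_{i=1}^{n}\bar{Y} = \bar{Y},$$
where $P_i = \frac{1}{N}$ for all $i = 1, 2, \ldots, N$ is the probability of selection of a unit. Thus $\bar{y}$ is an unbiased estimator of the population mean under SRSWR also.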
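Unbiasedness can be confirmed numerically by averaging the sample mean over every possible sample. The sketch below assumes a small made-up population `Y` of five values and $n = 2$; it is an illustration of the result, not part of the derivation.

```python
from fractions import Fraction
from itertools import combinations, product

Y = [2, 4, 7, 9, 13]              # hypothetical population values
N, n = len(Y), 2
pop_mean = Fraction(sum(Y), N)    # population mean, here 7

def expected_sample_mean(samples):
    """Average of the sample means over equally likely samples."""
    means = [Fraction(sum(s), n) for s in samples]
    return sum(means) / len(means)

# SRSWOR: all C(N, n) unordered samples are equally likely.
e_wor = expected_sample_mean(list(combinations(Y, n)))
# SRSWR: all N^n ordered samples are equally likely.
e_wr = expected_sample_mean(list(product(Y, repeat=n)))

print(pop_mean, e_wor, e_wr)      # 7 7 7
```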
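Variance of the estimate

Assume that each observation has some variance $\sigma^2$. Then
$$V(\bar{y}) = E(\bar{y}-\bar{Y})^2 = E\left[\frac{1}{n}\sum_{i=1}^{n}(y_i-\bar{Y})\right]^2$$
$$= E\left[\frac{1}{n^2}\sum_{i=1}^{n}(y_i-\bar{Y})^2 + \frac{1}{n^2}\sum_{i\neq j}^{n}(y_i-\bar{Y})(y_j-\bar{Y})\right]$$
$$= \frac{1}{n^2}\sum_{i=1}^{n}E(y_i-\bar{Y})^2 + \frac{1}{n^2}\sum_{i\neq j}^{n}E(y_i-\bar{Y})(y_j-\bar{Y})$$
$$= \frac{1}{n^2}\sum_{i=1}^{n}\sigma^2 + \frac{K}{n^2} = \frac{\sigma^2}{n} + \frac{K}{n^2} = \frac{N-1}{Nn}S^2 + \frac{K}{n^2},$$
where $K = \sum_{i\neq j}^{n}E(y_i-\bar{Y})(y_j-\bar{Y})$, assuming that each observation has variance $\sigma^2 = \frac{N-1}{N}S^2$. Now we find $K$ under the setups of SRSWR and SRSWOR.

SRSWOR
$$K = \sum_{i\neq j}^{n}E(y_i-\bar{Y})(y_j-\bar{Y}).$$
Consider
$$E(y_i-\bar{Y})(y_j-\bar{Y}) = \frac{1}{N(N-1)}\sum_{k\neq \ell}^{N}(Y_k-\bar{Y})(Y_\ell-\bar{Y}).$$
Since
$$\left[\sum_{k=1}^{N}(Y_k-\bar{Y})\right]^2 = \sum_{k=1}^{N}(Y_k-\bar{Y})^2 + \sum_{k\neq \ell}^{N}(Y_k-\bar{Y})(Y_\ell-\bar{Y}),$$
we have
$$0 = (N-1)S^2 + \sum_{k\neq \ell}^{N}(Y_k-\bar{Y})(Y_\ell-\bar{Y}),$$
and therefore
$$\frac{1}{N(N-1)}\sum_{k\neq \ell}^{N}(Y_k-\bar{Y})(Y_\ell-\bar{Y}) = \frac{1}{N(N-1)}\left[-(N-1)S^2\right] = -\frac{S^2}{N}.$$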
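Thus $K = -n(n-1)\frac{S^2}{N}$, and substituting this value of $K$, the variance of $\bar{y}$ under SRSWOR is
$$V(\bar{y}_{WOR}) = \frac{N-1}{Nn}S^2 - \frac{1}{n^2}\,n(n-1)\frac{S^2}{N} = \frac{N-n}{Nn}S^2.$$

SRSWR
$$K = \sum_{i\neq j}^{n}E(y_i-\bar{Y})(y_j-\bar{Y}) = \sum_{i\neq j}^{n}E(y_i-\bar{Y})\,E(y_j-\bar{Y}) = 0,$$
because the $i$th and $j$th draws ($i \neq j$) are independent. Thus the variance of $\bar{y}$ under SRSWR is
$$V(\bar{y}_{WR}) = \frac{N-1}{Nn}S^2.$$

It is to be noted that if $N$ is infinite (large enough), then
$$V(\bar{y}) = \frac{S^2}{n}$$
in both the cases of SRSWOR and SRSWR. So the factor $\frac{N-n}{N}$ is responsible for changing the variance of $\bar{y}$ when the sample is drawn from a finite population in comparison to an infinite population. This is why $\frac{N-n}{N}$ is called the finite population correction (fpc). It may be noted that $\frac{N-n}{N} = 1-\frac{n}{N}$, so $\frac{N-n}{N}$ is close to 1 if the ratio of sample size to population size, $\frac{n}{N}$, is very small or negligible. The term $\frac{n}{N}$ is called the sampling fraction. In practice, the fpc can be ignored whenever $\frac{n}{N} < 5\%$, and for many purposes even if it is as high as 10%. Ignoring the fpc will result in overestimation of the variance of $\bar{y}$.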
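Both variance formulas can be verified exactly by enumerating all samples from a small population. The sketch assumes the made-up values in `Y` and $n = 2$; the helper name `variance_of_sample_mean` is hypothetical.

```python
from fractions import Fraction
from itertools import combinations, product

Y = [2, 4, 7, 9, 13]                            # hypothetical population values
N, n = len(Y), 2
Y_bar = Fraction(sum(Y), N)
S2 = sum((y - Y_bar) ** 2 for y in Y) / (N - 1)  # S^2 with divisor N-1

def variance_of_sample_mean(samples):
    """Exact variance of the sample mean over equally likely samples."""
    means = [Fraction(sum(s), n) for s in samples]
    centre = sum(means) / len(means)
    return sum((m - centre) ** 2 for m in means) / len(means)

var_wor = variance_of_sample_mean(list(combinations(Y, n)))
var_wr = variance_of_sample_mean(list(product(Y, repeat=n)))

fpc = Fraction(N - n, N)                        # finite population correction
print(var_wor == fpc * S2 / n)                  # SRSWOR: (N-n)/(Nn) S^2
print(var_wr == Fraction(N - 1, N * n) * S2)    # SRSWR:  (N-1)/(Nn) S^2
print(float(var_wor), float(var_wr))
```

With these values the sampling fraction is $n/N = 40\%$, far above the 5% rule of thumb, so dropping the fpc here would noticeably overstate the SRSWOR variance.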
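Efficiency of $\bar{y}$ under SRSWOR over SRSWR
$$V(\bar{y}_{WOR}) = \frac{N-n}{Nn}S^2$$
$$V(\bar{y}_{WR}) = \frac{N-1}{Nn}S^2 = \frac{N-n}{Nn}S^2 + \frac{n-1}{Nn}S^2 = V(\bar{y}_{WOR}) + \text{a positive quantity}.$$
Thus
$$V(\bar{y}_{WR}) > V(\bar{y}_{WOR})$$
and so SRSWOR is more efficient than SRSWR.

Estimation of variance from a sample

Since the expressions for the variance of the sample mean involve $S^2$, which is based on population values, these expressions cannot be used in real-life applications. In order to estimate the variance of $\bar{y}$ on the basis of a sample, an estimator of $S^2$ (or equivalently $\sigma^2$) is needed. Consider $s^2$ as an estimator of $S^2$ (or $\sigma^2$); we investigate its biasedness for $S^2$ in the cases of SRSWOR and SRSWR.

Consider
$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(y_i-\bar{y})^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left[(y_i-\bar{Y})-(\bar{y}-\bar{Y})\right]^2 = \frac{1}{n-1}\left[\sum_{i=1}^{n}(y_i-\bar{Y})^2 - n(\bar{y}-\bar{Y})^2\right]$$
$$E(s^2) = \frac{1}{n-1}\left[\sum_{i=1}^{n}E(y_i-\bar{Y})^2 - nE(\bar{y}-\bar{Y})^2\right] = \frac{1}{n-1}\left[\sum_{i=1}^{n}Var(y_i) - n\,Var(\bar{y})\right]$$
$$= \frac{1}{n-1}\left[n\sigma^2 - n\,Var(\bar{y})\right] = \frac{n}{n-1}\left[\sigma^2 - Var(\bar{y})\right].$$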
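In the case of SRSWOR,
$$V(\bar{y}_{WOR}) = \frac{N-n}{Nn}S^2$$
and so
$$E(s^2) = \frac{n}{n-1}\left[\sigma^2 - \frac{N-n}{Nn}S^2\right] = \frac{n}{n-1}\left[\frac{N-1}{N}S^2 - \frac{N-n}{Nn}S^2\right] = S^2.$$
In the case of SRSWR,
$$V(\bar{y}_{WR}) = \frac{N-1}{Nn}S^2$$
and so
$$E(s^2) = \frac{n}{n-1}\left[\sigma^2 - \frac{N-1}{Nn}S^2\right] = \frac{n}{n-1}\left[\frac{N-1}{N}S^2 - \frac{N-1}{Nn}S^2\right] = \frac{N-1}{N}S^2 = \sigma^2.$$
Hence
$$E(s^2) = \begin{cases} S^2 & \text{in SRSWOR} \\ \sigma^2 & \text{in SRSWR.} \end{cases}$$
An unbiased estimate of $Var(\bar{y})$ is
$$\widehat{V}(\bar{y}_{WOR}) = \frac{N-n}{Nn}s^2$$
in the case of SRSWOR and
$$\widehat{V}(\bar{y}_{WR}) = \frac{N-1}{Nn}\cdot\frac{N}{N-1}s^2 = \frac{s^2}{n}$$
in the case of SRSWR.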
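The two expectations of $s^2$ can be checked by enumeration as well. This sketch assumes the same kind of made-up population and takes $n = 3$; the helper names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations, product

Y = [2, 4, 7, 9, 13]                              # hypothetical population values
N, n = len(Y), 3
Y_bar = Fraction(sum(Y), N)
S2 = sum((y - Y_bar) ** 2 for y in Y) / (N - 1)    # population S^2
sigma2 = sum((y - Y_bar) ** 2 for y in Y) / N      # population sigma^2

def sample_variance(sample):
    """s^2 with divisor n-1, computed exactly."""
    m = Fraction(sum(sample), len(sample))
    return sum((y - m) ** 2 for y in sample) / (len(sample) - 1)

def expectation(values):
    """Mean over equally likely outcomes."""
    values = list(values)
    return sum(values) / len(values)

E_s2_wor = expectation(sample_variance(s) for s in combinations(Y, n))
E_s2_wr = expectation(sample_variance(s) for s in product(Y, repeat=n))

print(E_s2_wor == S2)       # True: s^2 is unbiased for S^2 under SRSWOR
print(E_s2_wr == sigma2)    # True: s^2 is unbiased for sigma^2 under SRSWR
```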
Standard errors

The standard error of $\bar{y}$ is defined as $\sqrt{Var(\bar{y})}$.

In order to estimate the standard error, one simple option is to consider the square root of the estimate of the variance of the sample mean.
• Under SRSWOR, a possible estimator is $\hat{\sigma}(\bar{y}) = \sqrt{\frac{N-n}{Nn}}\,s$.
• Under SRSWR, a possible estimator is $\hat{\sigma}(\bar{y}) = \frac{s}{\sqrt{n}}$.

It is to be noted that this estimator does not possess the same properties as $\widehat{Var}(\bar{y})$. The reason is that if $\hat{\theta}$ is an unbiased estimator of $\theta$, then $\sqrt{\hat{\theta}}$ is not necessarily an unbiased estimator of $\sqrt{\theta}$. In fact, $\hat{\sigma}(\bar{y})$ is a negatively biased estimator under SRSWOR.

The approximate expressions for the large $N$ case are as follows.
(Reference: Sampling Theory of Surveys with Applications, P.V. Sukhatme, B.V. Sukhatme, S. Sukhatme, C. Asok, Iowa State University Press and Indian Society of Agricultural Statistics, 1984, India)

Consider $s$ as an estimator of $S$. Let
$$s^2 = S^2 + \varepsilon \quad\text{with}\quad E(\varepsilon) = 0,\; E(\varepsilon^2) = Var(s^2).$$
Write
$$s = (S^2+\varepsilon)^{1/2} = S\left(1+\frac{\varepsilon}{S^2}\right)^{1/2} = S\left(1+\frac{\varepsilon}{2S^2}-\frac{\varepsilon^2}{8S^4}+\cdots\right),$$
assuming $\varepsilon$ will be small compared to $S^2$; as $n$ becomes large, the probability of such an event approaches one. Neglecting the powers of $\varepsilon$ higher than two and taking expectation, we have
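$$E(s) = \left[1-\frac{Var(s^2)}{8S^4}\right]S,$$
where
$$Var(s^2) = \frac{2S^4}{n-1}\left[1+\frac{n-1}{2n}(\beta_2-3)\right] \quad\text{for large } N,$$
$$\mu_j = \frac{1}{N}\sum_{i=1}^{N}(Y_i-\bar{Y})^j,$$
$$\beta_2 = \frac{\mu_4}{S^4} : \text{coefficient of kurtosis.}$$
Thus
$$E(s) = S\left[1-\frac{1}{4(n-1)}-\frac{\beta_2-3}{8n}\right]$$
$$Var(s) = S^2 - S^2\left[1-\frac{Var(s^2)}{8S^4}\right]^2 \approx \frac{Var(s^2)}{4S^2} = \frac{S^2}{2(n-1)}\left[1+\frac{n-1}{2n}(\beta_2-3)\right].$$
Note that for a normal distribution, $\beta_2 = 3$ and we obtain
$$Var(s) = \frac{S^2}{2(n-1)}.$$
Both $Var(s)$ and $Var(s^2)$ are inflated due to nonnormality to the same extent, by the inflation factor
$$\left[1+\frac{n-1}{2n}(\beta_2-3)\right],$$
and this does not depend on the coefficient of skewness.

This is an important result to be kept in mind while determining the sample size, where it is assumed that $S^2$ is known. If the inflation factor is ignored and the population is non-normal, then the reliability placed on $s^2$ may be misleading.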
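Alternative approach:
The results for the unbiasedness property and the variance of the sample mean can also be proved in an alternative way as follows.

(i) SRSWOR
With the $i$th unit of the population, we associate a random variable $a_i$ defined as follows:
$$a_i = \begin{cases} 1, & \text{if the } i\text{th unit occurs in the sample} \\ 0, & \text{if the } i\text{th unit does not occur in the sample} \end{cases} \qquad (i = 1, 2, \ldots, N).$$
Then
$$E(a_i) = 1 \times \text{Probability that the } i\text{th unit is included in the sample} = \frac{n}{N}, \quad i = 1, 2, \ldots, N,$$
$$E(a_i^2) = 1 \times \text{Probability that the } i\text{th unit is included in the sample} = \frac{n}{N}, \quad i = 1, 2, \ldots, N,$$
$$E(a_ia_j) = 1 \times \text{Probability that the } i\text{th and } j\text{th units are included in the sample} = \frac{n(n-1)}{N(N-1)}, \quad i \neq j = 1, 2, \ldots, N.$$
From these results, we can obtain
$$Var(a_i) = E(a_i^2) - \left(E(a_i)\right)^2 = \frac{n(N-n)}{N^2}, \quad i = 1, 2, \ldots, N,$$
$$Cov(a_i, a_j) = E(a_ia_j) - E(a_i)E(a_j) = -\frac{n(N-n)}{N^2(N-1)}, \quad i \neq j = 1, 2, \ldots, N.$$
We can rewrite the sample mean as
$$\bar{y} = \frac{1}{n}\sum_{i=1}^{N}a_iY_i.$$
Then
$$E(\bar{y}) = \frac{1}{n}\sum_{i=1}^{N}E(a_i)Y_i = \bar{Y}$$
and
$$Var(\bar{y}) = \frac{1}{n^2}Var\left(\sum_{i=1}^{N}a_iY_i\right) = \frac{1}{n^2}\left[\sum_{i=1}^{N}Var(a_i)Y_i^2 + \sum_{i\neq j}^{N}Cov(a_i, a_j)Y_iY_j\right].$$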
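Substituting the values of $Var(a_i)$ and $Cov(a_i, a_j)$ in the expression of $Var(\bar{y})$ and simplifying, we get
$$Var(\bar{y}) = \frac{N-n}{Nn}S^2.$$
To show that $E(s^2) = S^2$, consider
$$s^2 = \frac{1}{n-1}\left[\sum_{i=1}^{n}y_i^2 - n\bar{y}^2\right] = \frac{1}{n-1}\left[\sum_{i=1}^{N}a_iY_i^2 - n\bar{y}^2\right].$$
Hence, taking expectation, we get
$$E(s^2) = \frac{1}{n-1}\left[\sum_{i=1}^{N}E(a_i)Y_i^2 - n\left\{Var(\bar{y})+\bar{Y}^2\right\}\right].$$
Substituting the values of $E(a_i)$ and $Var(\bar{y})$ in this expression and simplifying, we get $E(s^2) = S^2$.

(ii) SRSWR
Let a random variable $a_i$ associated with the $i$th unit of the population denote the number of times the $i$th unit occurs in the sample, $i = 1, 2, \ldots, N$. So $a_i$ assumes the values $0, 1, 2, \ldots, n$. The joint distribution of $a_1, a_2, \ldots, a_N$ is the multinomial distribution given by
$$P(a_1, a_2, \ldots, a_N) = \frac{n!}{\prod_{i=1}^{N}a_i!}\cdot\frac{1}{N^n},$$
where $\sum_{i=1}^{N}a_i = n$. For this multinomial distribution, we have
$$E(a_i) = \frac{n}{N},$$
$$Var(a_i) = \frac{n(N-1)}{N^2}, \quad i = 1, 2, \ldots, N,$$
$$Cov(a_i, a_j) = -\frac{n}{N^2}, \quad i \neq j = 1, 2, \ldots, N.$$
We rewrite the sample mean as
$$\bar{y} = \frac{1}{n}\sum_{i=1}^{N}a_iY_i.$$
Hence, taking the expectation of $\bar{y}$ and substituting the value $E(a_i) = n/N$, we obtain $E(\bar{y}) = \bar{Y}$.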
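Further,
$$Var(\bar{y}) = \frac{1}{n^2}\left[\sum_{i=1}^{N}Var(a_i)Y_i^2 + \sum_{i\neq j}^{N}Cov(a_i, a_j)Y_iY_j\right].$$
Substituting the values $Var(a_i) = n(N-1)/N^2$ and $Cov(a_i, a_j) = -n/N^2$ and simplifying, we get
$$Var(\bar{y}) = \frac{N-1}{Nn}S^2.$$
To prove that $E(s^2) = \frac{N-1}{N}S^2 = \sigma^2$ in SRSWR, consider
$$(n-1)s^2 = \sum_{i=1}^{n}y_i^2 - n\bar{y}^2 = \sum_{i=1}^{N}a_iY_i^2 - n\bar{y}^2,$$
$$(n-1)E(s^2) = \sum_{i=1}^{N}E(a_i)Y_i^2 - n\left\{Var(\bar{y})+\bar{Y}^2\right\}$$
$$= \frac{n}{N}\sum_{i=1}^{N}Y_i^2 - \frac{N-1}{N}S^2 - n\bar{Y}^2$$
$$= \frac{(n-1)(N-1)}{N}S^2,$$
$$E(s^2) = \frac{N-1}{N}S^2 = \sigma^2.$$

Estimator of population total:
Sometimes, it is also of interest to estimate the population total, e.g. total household income, total expenditures, etc. Let
$$Y_T = \sum_{i=1}^{N}Y_i = N\bar{Y}$$
denote the population total, which can be estimated by
$$\hat{Y}_T = N\hat{\bar{Y}} = N\bar{y}.$$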
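Obviously
$$E(\hat{Y}_T) = N\,E(\bar{y}) = N\bar{Y}$$
$$Var(\hat{Y}_T) = N^2\,Var(\bar{y}) = \begin{cases} N^2\left(\dfrac{N-n}{Nn}\right)S^2 = \dfrac{N(N-n)}{n}S^2 & \text{for SRSWOR} \\[2ex] N^2\left(\dfrac{N-1}{Nn}\right)S^2 = \dfrac{N(N-1)}{n}S^2 & \text{for SRSWR} \end{cases}$$
and the estimates of the variance of $\hat{Y}_T$ are
$$\widehat{Var}(\hat{Y}_T) = \begin{cases} \dfrac{N(N-n)}{n}s^2 & \text{for SRSWOR} \\[2ex] \dfrac{N^2}{n}s^2 & \text{for SRSWR.} \end{cases}$$

Confidence limits for the population mean

Now we construct the $100(1-\alpha)\%$ confidence interval for the population mean. Assume that the population is normally distributed $N(\mu, \sigma^2)$ with mean $\mu$ and variance $\sigma^2$. Then $\frac{\bar{y}-\bar{Y}}{\sqrt{Var(\bar{y})}}$ follows $N(0,1)$ when $\sigma^2$ is known. If $\sigma^2$ is unknown and is estimated from the sample, then $\frac{\bar{y}-\bar{Y}}{\sqrt{\widehat{Var}(\bar{y})}}$ follows a $t$-distribution with $(n-1)$ degrees of freedom. When $\sigma^2$ is known, the $100(1-\alpha)\%$ confidence interval is given by
$$P\left[-Z_{\alpha/2} \le \frac{\bar{y}-\bar{Y}}{\sqrt{Var(\bar{y})}} \le Z_{\alpha/2}\right] = 1-\alpha$$
or
$$P\left[\bar{y} - Z_{\alpha/2}\sqrt{Var(\bar{y})} \le \bar{Y} \le \bar{y} + Z_{\alpha/2}\sqrt{Var(\bar{y})}\right] = 1-\alpha$$
and the confidence limits are
$$\left(\bar{y} - Z_{\alpha/2}\sqrt{Var(\bar{y})},\; \bar{y} + Z_{\alpha/2}\sqrt{Var(\bar{y})}\right)$$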
where $Z_{\alpha/2}$ denotes the upper $100(\alpha/2)\%$ point of the $N(0,1)$ distribution. Similarly, when $\sigma^2$ is unknown, the $100(1-\alpha)\%$ confidence interval is
$$P\left[-t_{\alpha/2} \le \frac{\bar{y}-\bar{Y}}{\sqrt{\widehat{Var}(\bar{y})}} \le t_{\alpha/2}\right] = 1-\alpha$$
or
$$P\left[\bar{y} - t_{\alpha/2}\sqrt{\widehat{Var}(\bar{y})} \le \bar{Y} \le \bar{y} + t_{\alpha/2}\sqrt{\widehat{Var}(\bar{y})}\right] = 1-\alpha$$
and the confidence limits are
$$\left(\bar{y} - t_{\alpha/2}\sqrt{\widehat{Var}(\bar{y})},\; \bar{y} + t_{\alpha/2}\sqrt{\widehat{Var}(\bar{y})}\right)$$
where $t_{\alpha/2}$ denotes the upper $100(\alpha/2)\%$ point of the $t$-distribution with $(n-1)$ degrees of freedom.

Determination of sample size

The size of the sample is needed before the survey starts and goes into operation. One point to be kept in mind is that when the sample size increases, the variance of the estimators decreases but the cost of the survey increases, and vice versa. So there has to be a balance between the two aspects. The sample size can be determined on the basis of prescribed values of the standard error of the sample mean, the error of estimation, the width of the confidence interval, the coefficient of variation of the sample mean, the relative error of the sample mean, or the total cost, among several others.

An important constraint in determining the sample size is that information regarding the population standard deviation $S$ should be known for these criteria. The reason and need for this will become clear when we derive the sample size in the next section. A question arises about how to have information about $S$ beforehand. One possible solution is to conduct a pilot survey and collect a preliminary sample of small size, estimate $S^2$, and use it as the known value of $S^2$. Alternatively, such information can also be collected from past data, past experience, long association of the experimenter with the experiment, prior information, etc.

Now we find the sample size under different criteria, assuming that the samples have been drawn using SRSWOR. The case of SRSWR can be derived similarly.
1. Pre-specified variance
The sample size is to be determined such that the variance of $\bar{y}$ should not exceed a given value, say $V$. In this case, find $n$ such that
$$Var(\bar{y}) \le V$$
or
$$\frac{N-n}{Nn}S^2 \le V$$
or
$$\frac{1}{n}-\frac{1}{N} \le \frac{V}{S^2}$$
or
$$\frac{1}{n}-\frac{1}{N} \le \frac{1}{n_e}$$
or
$$n \ge \frac{n_e}{1+\dfrac{n_e}{N}},$$
where $n_e = \frac{S^2}{V}$.
It may be noted here that $n_e$ can be known only when $S^2$ is known. This compels us to assume that $S$ is known. The same reasoning will also be seen in the other cases.
The smallest sample size needed in this case is
$$n_{smallest} = \frac{n_e}{1+\dfrac{n_e}{N}}.$$
If $N$ is large, then the required $n$ is $n \ge n_e$ and $n_{smallest} = n_e$.

2. Pre-specified estimation error
It may be possible to have some prior knowledge of the population mean $\bar{Y}$, and it may be required that the sample mean $\bar{y}$ should not differ from it by more than a specified amount $e$ of absolute estimation error, which is a small quantity. Such a requirement can be satisfied by associating a probability $(1-\alpha)$ with it, and can be expressed as
$$P\left[\,|\bar{y}-\bar{Y}| \le e\,\right] = (1-\alpha).$$
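Since $\bar{y}$ follows $N\left(\bar{Y}, \frac{N-n}{Nn}S^2\right)$ assuming the normal distribution for the population, we can write
$$P\left[\frac{|\bar{y}-\bar{Y}|}{\sqrt{Var(\bar{y})}} \le \frac{e}{\sqrt{Var(\bar{y})}}\right] = 1-\alpha,$$
which implies that
$$\frac{e}{\sqrt{Var(\bar{y})}} = Z_{\alpha/2}$$
or
$$Z_{\alpha/2}^2\,Var(\bar{y}) = e^2$$
or
$$Z_{\alpha/2}^2\,\frac{N-n}{Nn}S^2 = e^2$$
or
$$n = \frac{\left(\dfrac{Z_{\alpha/2}S}{e}\right)^2}{1+\dfrac{1}{N}\left(\dfrac{Z_{\alpha/2}S}{e}\right)^2},$$
which is the required sample size. If $N$ is large, then
$$n = \left(\frac{Z_{\alpha/2}S}{e}\right)^2.$$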
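The sample-size formula for a pre-specified estimation error can be evaluated as follows. This is a sketch under stated assumptions: the function name `n_for_error` and the inputs ($N = 1000$, $S = 20$, $e = 2$, $\alpha = 0.05$) are made up for illustration, and rounding up to the next integer is an added convention, since the text gives $n$ as a real-valued expression.

```python
from math import ceil
from statistics import NormalDist

def n_for_error(N, S, e, alpha):
    """Sample size so that P(|ybar - Ybar| <= e) = 1 - alpha under SRSWOR.

    Returns (finite-population n, large-N n), both rounded up (an assumption).
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)   # Z_{alpha/2}, upper alpha/2 point of N(0,1)
    n0 = (z * S / e) ** 2                     # large-N sample size (Z S / e)^2
    return ceil(n0 / (1 + n0 / N)), ceil(n0)

n_finite, n_large = n_for_error(N=1000, S=20, e=2, alpha=0.05)
print(n_finite, n_large)                      # 278 385
```

The gap between the two numbers shows how much the finite population correction reduces the required sample size when $n$ is a sizeable fraction of $N$.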
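3. Pre-specified width of confidence interval
If the requirement is that the width of the confidence interval of $\bar{y}$ with confidence coefficient $(1-\alpha)$ should not exceed a pre-specified amount $W$, then the sample size $n$ is determined such that
$$2Z_{\alpha/2}\sqrt{Var(\bar{y})} \le W,$$
assuming $\sigma^2$ is known and the population is normally distributed. This can be expressed as
$$2Z_{\alpha/2}\sqrt{\frac{N-n}{Nn}}\,S \le W$$
or
$$4Z_{\alpha/2}^2\left(\frac{1}{n}-\frac{1}{N}\right)S^2 \le W^2$$
or
$$\frac{1}{n} \le \frac{1}{N}+\frac{W^2}{4Z_{\alpha/2}^2S^2}$$
or
$$n \ge \frac{\dfrac{4Z_{\alpha/2}^2S^2}{W^2}}{1+\dfrac{4Z_{\alpha/2}^2S^2}{NW^2}}.$$
The minimum sample size required is
$$n_{smallest} = \frac{\dfrac{4Z_{\alpha/2}^2S^2}{W^2}}{1+\dfrac{4Z_{\alpha/2}^2S^2}{NW^2}}.$$
If $N$ is large, then
$$n \ge \frac{4Z_{\alpha/2}^2S^2}{W^2}$$
and the minimum sample size needed is
$$n_{smallest} = \frac{4Z_{\alpha/2}^2S^2}{W^2}.$$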
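A short sketch of the width criterion follows. The function name `n_for_width` and the inputs are illustrative assumptions, and rounding up to an integer is an added convention. Because the interval width is twice the half-width, choosing $W = 2e$ should reproduce the sample size obtained from an absolute estimation error $e$.

```python
from math import ceil
from statistics import NormalDist

def n_for_width(N, S, W, alpha):
    """Smallest integer n with 2 * Z_{alpha/2} * sqrt(Var(ybar)) <= W under SRSWOR."""
    z = NormalDist().inv_cdf(1 - alpha / 2)   # Z_{alpha/2}
    n0 = 4 * z ** 2 * S ** 2 / W ** 2         # large-N sample size 4 Z^2 S^2 / W^2
    return ceil(n0 / (1 + n0 / N))            # finite-population version

n_w4 = n_for_width(N=1000, S=20, W=4, alpha=0.05)   # W = 4 corresponds to e = 2
n_w2 = n_for_width(N=1000, S=20, W=2, alpha=0.05)   # halving the width
print(n_w4, n_w2)
```

Halving the allowed width roughly quadruples the large-$N$ requirement, although the finite population correction softens this for small populations.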
4. Pre-specified coefficient of variation
The coefficient of variation (CV) is defined as the ratio of the standard error (or standard deviation) to the mean. Knowledge of the coefficient of variation has played an important role in sampling theory, as this information has helped in deriving efficient estimators.
If it is desired that the coefficient of variation of $\bar{y}$ should not exceed a given or pre-specified value, say $C_0$, then the required sample size $n$ is to be determined such that
$$CV(\bar{y}) \le C_0$$
or
$$\frac{\sqrt{Var(\bar{y})}}{\bar{Y}} \le C_0$$
or
$$\frac{\dfrac{N-n}{Nn}S^2}{\bar{Y}^2} \le C_0^2$$
or
$$\left(\frac{1}{n}-\frac{1}{N}\right)C^2 \le C_0^2$$
or
$$n \ge \frac{\dfrac{C^2}{C_0^2}}{1+\dfrac{C^2}{NC_0^2}}$$
is the required sample size, where $C = \frac{S}{\bar{Y}}$ is the population coefficient of variation.
The smallest sample size needed in this case is
$$n_{smallest} = \frac{\dfrac{C^2}{C_0^2}}{1+\dfrac{C^2}{NC_0^2}}.$$
If $N$ is large, then
$$n \ge \frac{C^2}{C_0^2} \quad\text{and}\quad n_{smallest} = \frac{C^2}{C_0^2}.$$
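As a worked illustration of the coefficient-of-variation criterion, suppose (purely hypothetical values) that $N = 1000$, the population coefficient of variation is $C = 0.5$, and the target is $C_0 = 0.05$. Rounding up to the next integer is an added convention.

```latex
\begin{align*}
\frac{C^2}{C_0^2} &= \frac{0.25}{0.0025} = 100, \\
n &\ge \frac{100}{1 + \dfrac{100}{1000}} = \frac{100}{1.1} \approx 90.9
  \quad\Rightarrow\quad n_{smallest} = 91, \\
\text{large } N:\quad n_{smallest} &= \frac{C^2}{C_0^2} = 100 .
\end{align*}
```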
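5. Pre-specified relative error
When $\bar{y}$ is used for estimating the population mean $\bar{Y}$, the relative estimation error is defined as $\frac{\bar{y}-\bar{Y}}{\bar{Y}}$. If it is required that such relative estimation error should not exceed a pre-specified value $R$ with probability $(1-\alpha)$, then the requirement can be expressed as
$$P\left[\frac{|\bar{y}-\bar{Y}|}{\sqrt{Var(\bar{y})}} \le \frac{R\bar{Y}}{\sqrt{Var(\bar{y})}}\right] = 1-\alpha.$$
Assuming the population to be normally distributed, $\bar{y}$ follows $N\left(\bar{Y}, \frac{N-n}{Nn}S^2\right)$.
So it can be written that
$$\frac{R\bar{Y}}{\sqrt{Var(\bar{y})}} = Z_{\alpha/2}$$
or
$$Z_{\alpha/2}^2\left(\frac{N-n}{Nn}\right)S^2 = R^2\bar{Y}^2$$
or
$$\frac{1}{n}-\frac{1}{N} = \frac{R^2}{C^2Z_{\alpha/2}^2}$$
or
$$n = \frac{\left(\dfrac{Z_{\alpha/2}C}{R}\right)^2}{1+\dfrac{1}{N}\left(\dfrac{Z_{\alpha/2}C}{R}\right)^2},$$
where $C = \frac{S}{\bar{Y}}$ is the population coefficient of variation and should be known.
If $N$ is large, then
$$n = \left(\frac{Z_{\alpha/2}C}{R}\right)^2.$$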
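The relative-error formula can be evaluated in the same way. In this sketch the function name `n_for_relative_error` and the inputs ($N = 500$, $C = 0.4$, $R = 0.1$, $\alpha = 0.05$) are assumptions for illustration, and the results are rounded up to integers.

```python
from math import ceil
from statistics import NormalDist

def n_for_relative_error(N, C, R, alpha):
    """Sample size so that the relative error stays within R with probability 1 - alpha.

    C is the population coefficient of variation S / Ybar (assumed known).
    Returns (finite-population n, large-N n), both rounded up (an assumption).
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)   # Z_{alpha/2}
    n0 = (z * C / R) ** 2                     # large-N sample size (Z C / R)^2
    return ceil(n0 / (1 + n0 / N)), ceil(n0)

n_finite, n_large = n_for_relative_error(N=500, C=0.4, R=0.1, alpha=0.05)
print(n_finite, n_large)                      # 55 62
```

Only the ratio $C/R$ matters here, so the population mean itself need not be known as long as the coefficient of variation is.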
6. Pre-specified cost
Let an amount of money $C$ be designated for a sample survey to collect $n$ observations, let $C_0$ be the overhead cost, and let $C_1$ be the cost of collecting one unit in the sample. Then the total cost $C$ can be expressed as
$$C = C_0 + nC_1$$
or
$$n = \frac{C - C_0}{C_1}$$
is the required sample size.
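As a worked illustration with hypothetical figures, take a total budget $C = 50000$, an overhead $C_0 = 5000$, and a per-unit cost $C_1 = 150$ (all in the same currency unit). If the quotient were not an integer, rounding down would keep the survey within budget; that rounding rule is an assumption, not part of the formula.

```latex
\begin{align*}
n = \frac{C - C_0}{C_1} = \frac{50000 - 5000}{150} = \frac{45000}{150} = 300 .
\end{align*}
```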