Elements of Statistical Methods Lots of Data or Large Samples (Ch 8)
1 Elements of Statistical Methods: Lots of Data or Large Samples (Ch 8)
Fritz Scholz, Spring Quarter 2010, February 26, 2010
2 $\bar{x}$ and $\bar{X}$

We introduced the sample mean $\bar{x}$ as the average of the observed sample values $x = \{x_1,\dots,x_n\}$, using the plug-in principle.

In parallel, we can also consider the average of the random variables $X_1,\dots,X_n$,

$$\bar{X} = \frac{X_1 + \dots + X_n}{n} = \frac{1}{n}\sum_{i=1}^{n} X_i ,$$

and view it as a random variable $\bar{X}: S \to \mathbb{R}$. $\bar{X}$ is just the average of such functions $X_i: S \to \mathbb{R}$.

Since the $x_i$ are just the observed values of the $X_i$, we can view $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$ as the observed value of $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$.

$\bar{X}$, viewed as a random variable, has a distribution. Let us experiment! R makes it very easy to get hands-on experience.
3 Behavior of $\bar{X}$ when Sampling $\chi^2(3)$

Take a sample of size $n = 5$ from $\chi^2(3)$ via x <- rchisq(5,3) and then compute its mean with mean(x).

$X \sim \chi^2(3) \Rightarrow \mu = EX = 3$.

Repeat this several times, i.e., get several observed values $\bar{x}_5$ of $\bar{X}_5$.

    > x <- rchisq(5,3)
    > mean(x)
    [1]
    > x <- rchisq(5,3)
    > mean(x)
    [1]
    > x <- rchisq(5,3)
    > mean(x)
    [1]

These values scatter widely around $\mu = 3$: sampling variability!
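The repeated draws described above can also be collected in a single call. This is a minimal sketch, not part of the slides: the seed value and the name `xbar5` are illustrative assumptions.

```r
set.seed(1)                                  # assumed seed, only for reproducibility
# Ten independent samples of size 5 from chi-square(3), one mean per sample
xbar5 <- replicate(10, mean(rchisq(5, 3)))
print(round(xbar5, 2))                       # ten observed values of the sample mean
print(range(xbar5))                          # how far they spread around mu = 3
```

Changing the first argument of `replicate` gives more or fewer observed means.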
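4 Sampling Variability of $\bar{X}_5$

To get a less haphazard view of this sampling variability, we repeat this process $N_{\mathrm{sim}} = 1000$ times and look at these 1000 observed sample means using a kernel density plot. This is best implemented in a function with a loop.

    chi2averagesim <- function(Nsim=1000, n=5, k=3){
      # Nsim = number of simulated samples, n = sample size,
      # k = degrees of freedom of the chi-square distribution
      Xbar <- numeric(Nsim)
      for(i in 1:Nsim){
        x <- rchisq(n, k)        # one sample of size n
        Xbar[i] <- mean(x)       # its sample mean
      }
      # kernel density estimate of the Nsim sample means
      plot(density(Xbar), xlim=c(0,9), ylim=c(0,1.4), main="")
      abline(h=0)
    }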
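5 [Figure: kernel density plot, "Sampling Variation of $\bar{X}_5$: $X_i \sim \chi^2(3)$"; vertical axis: Density; N = 1000]
6 [Figure: kernel density plot, "Sampling Variation of $\bar{X}_{20}$: $X_i \sim \chi^2(3)$"; vertical axis: Density; N = 1000]
7 [Figure: kernel density plot, "Sampling Variation of $\bar{X}_{80}$: $X_i \sim \chi^2(3)$"; vertical axis: Density; N = 1000]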
8 Comments

All three kernel density plots are on the same horizontal and vertical scale.

We see that they are all centered more or less on $\mu = 3$.

The sampling variability of $\bar{X}$, as we go from $n = 5$ to $n = 20$ to $n = 80$, decreases visibly, almost by a factor of $2 = \sqrt{4}$ each time (for a reason).

The mild skew to the right for $n = 5$ seems to disappear as $n$ gets larger. The distributions start to look more normal for larger $n$.

Experiment with chi2averagesim(Nsim=1000,n=5,k=3), replacing $n = 5$ by $n = 20$ and $n = 80$.
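The "factor of 2" can be checked numerically rather than read off the plots. The following is a rough sketch under assumptions of my own: the helper name `sd_of_means`, the seed, and the choice of 10000 simulations are not from the slides.

```r
set.seed(2010)                               # assumed seed
# Standard deviation of Nsim simulated sample means from chi-square(k)
sd_of_means <- function(n, Nsim = 10000, k = 3) {
  sd(replicate(Nsim, mean(rchisq(n, k))))
}
sds <- sapply(c(5, 20, 80), sd_of_means)
names(sds) <- c("n=5", "n=20", "n=80")
print(sds)
print(sds[-3] / sds[-1])                     # successive ratios, each expected near 2
```

The two printed ratios compare $n = 5$ with $n = 20$ and $n = 20$ with $n = 80$; simulation noise keeps them from being exactly 2.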
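9 Averaging Decreases Variation in the $\bar{X}$ Distribution

What we saw experimentally, when sampling from $\chi^2(3)$, we will now generalize.

Let $X_1,\dots,X_n$ be i.i.d. $\sim F$, some cdf with finite mean $\mu = EX_i$ and finite variance $\sigma^2 = \operatorname{var} X_i$.

$$E\bar{X} = E\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right) = \frac{1}{n}\sum_{i=1}^{n} EX_i = \frac{1}{n}\sum_{i=1}^{n} \mu = \mu \qquad \text{(independence not used)}$$

i.e., the mean of the $\bar{X}$ population is the same as that of the sampled population.

$$\operatorname{var}\bar{X} = \operatorname{var}\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right) = \frac{1}{n^2}\sum_{i=1}^{n} \operatorname{var} X_i = \frac{1}{n^2}\sum_{i=1}^{n} \sigma^2 = \frac{\sigma^2}{n} \qquad \text{(independence is used)}$$

$\Rightarrow \sigma(\bar{X}) = \sigma/\sqrt{n}$, i.e., quadrupling $n$ cuts $\sigma(\bar{X})$ by a factor of 2: $1/\sqrt{4n} = 1/(2\sqrt{n})$.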
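As a worked instance of $\sigma(\bar{X}) = \sigma/\sqrt{n}$ for the $\chi^2(3)$ experiment: a $\chi^2(k)$ distribution has variance $2k$ (a standard fact, not stated on the slide), so $\sigma^2 = 6$ here, which gives

```latex
\sigma(\bar{X}_5)    = \sqrt{6/5}  \approx 1.095, \qquad
\sigma(\bar{X}_{20}) = \sqrt{6/20} \approx 0.548, \qquad
\sigma(\bar{X}_{80}) = \sqrt{6/80} \approx 0.274 .
```

Each quadrupling of $n$ halves the standard deviation, in line with the narrowing seen in the three density plots.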
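10 The Weak Law of Large Numbers (WLLN)

Recall our previous definition of convergence: $y_n \to c$ as $n \to \infty$ iff for any $\varepsilon > 0$ we can find a natural number $N$ such that $y_n \in (c-\varepsilon, c+\varepsilon)$ for all $n \ge N$.

We now replace the number sequence $y_n$ by a sequence $Y_n$ of random variables.

Definition: A sequence of random variables $\{Y_n\}$ converges in probability to a constant $c$, written $Y_n \xrightarrow{P} c$, iff for any $\varepsilon > 0$,

$$\lim_{n\to\infty} P\big(Y_n \in (c-\varepsilon, c+\varepsilon)\big) = 1 ,$$

i.e., $Y_n$ gets arbitrarily close to $c$ with probability closer and closer to 1 as $n \to \infty$.

In the continuous case, denoting the density of $Y_n$ by $f_n$,

$$\int_{c-\varepsilon}^{c+\varepsilon} f_n(x)\,dx = \operatorname{Area}_{(c-\varepsilon,\,c+\varepsilon)}(f_n) \to 1 \quad \text{as } n \to \infty .$$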
11 [Figure: density $f_n(x)$ of $\bar{X}_n$ for $n = 5, 20, 80, 320$, with the interval from $c - \varepsilon$ to $c + \varepsilon$ marked around $c$]
12 The Weak Law of Large Numbers (WLLN)

Theorem (WLLN): Let $X_1, X_2, \dots$ be a sequence of independent, identically distributed random variables with finite mean $\mu$ and finite variance $\sigma^2$. Then

$$\bar{X}_n \xrightarrow{P} \mu \quad \text{or equivalently} \quad \bar{X}_n - \mu \xrightarrow{P} 0 \quad \text{as } n \to \infty .$$

The average $\bar{X}_n$ of more and more observations $X_i$ will get closer and closer to the mean $\mu$ of the sampled population with probability tending to 1.

Large sample sizes are good! That is why it is important to report the sample size used in surveys.
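The statement of the WLLN can be illustrated by estimating $P(|\bar{X}_n - \mu| < \varepsilon)$ for growing $n$. This is only a simulation sketch; the helper `prob_close`, the choice $\varepsilon = 0.5$, the seed, and the use of $\chi^2(3)$ (where $\mu = 3$) are assumptions for illustration.

```r
set.seed(12)                                 # assumed seed
# Estimate P(|Xbar_n - mu| < eps) by simulation for chi-square(k), where mu = k
prob_close <- function(n, eps = 0.5, Nsim = 5000, k = 3) {
  xbar <- replicate(Nsim, mean(rchisq(n, k)))
  mean(abs(xbar - k) < eps)                  # fraction of means inside (mu - eps, mu + eps)
}
ns <- c(5, 20, 80, 320)
probs <- sapply(ns, prob_close)
print(data.frame(n = ns, prob = probs))      # the estimated probabilities rise toward 1
```

Shrinking `eps` slows the approach to 1 but does not change the limit.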
13 The Frequentist's Basis for Interpretation of Probability

Corollary: Let $A$ be an event and consider a sequence of independent and identical experiments for which we record whether the event $A$ occurs or not. Let $p = P(A)$ and define i.i.d. Bernoulli random variables

$$X_i = \begin{cases} 1 & \text{if } A \text{ occurs} \\ 0 & \text{if } A^c \text{ occurs.} \end{cases}$$

Then $\bar{X}_n$ is the relative frequency with which the event $A$ occurs in $n$ trials.

Since $\mu = EX_i = E\bar{X}_n = p = P(A)$, the WLLN $\Rightarrow \bar{X}_n \xrightarrow{P} p$ as $n \to \infty$.

Thus the axiomatic model of probability, enriched by the concept of independence, proves the frequentist's interpretation of probability.
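The corollary can be watched in action by tracking the relative frequency of an event over many trials. In this sketch the value $p = 0.3$, the seed, and the number of trials are assumed for illustration.

```r
set.seed(7)                                  # assumed seed
p <- 0.3                                     # assumed P(A), chosen for illustration
x <- rbinom(10000, size = 1, prob = p)       # Bernoulli indicators of the event A
rel_freq <- cumsum(x) / seq_along(x)         # relative frequency after each trial
print(rel_freq[c(10, 100, 1000, 10000)])     # drifts toward p as trials accumulate
```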
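14 Empirical Probabilities and Plug-In Principle

Recall that we defined the empirical probability of observing a random variable $X_i$ with value $x_i$ in an event $A \subset \mathbb{R}$ as

$$\hat{P}_n(A) = \frac{\#\{x_i \in A\}}{n} \qquad (\hat{p}_n(A) \text{ may be more appropriate notation}).$$

When viewing this in terms of $X_i$ instead of $x_i$ we have

$$\hat{P}_n(A) = \frac{\#\{X_i \in A\}}{n} \xrightarrow{P} p = P(A) \quad \text{as } n \to \infty .$$

The WLLN gives us a justification for approximating $P(A)$ by $\hat{P}_n(A)$.

This is often referred to as the fundamental theorem of statistics, especially when using $A = (-\infty, a]$ and then

$$\hat{F}_n(a) = \frac{\#\{X_i \le a\}}{n} \xrightarrow{P} F(a) = P(X_i \le a) \quad \text{as } n \to \infty .$$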
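To see $\hat{F}_n(a)$ approach $F(a)$, one can compare the empirical proportion with the exact cdf. This is a sketch with assumed choices: the evaluation point $a = 4$, the $\chi^2(3)$ population, the seed, and the sample sizes are not from the slides.

```r
set.seed(3)                                  # assumed seed
a <- 4                                       # assumed evaluation point
for (n in c(10, 100, 10000)) {
  x <- rchisq(n, 3)
  Fhat <- mean(x <= a)                       # empirical cdf at a: #{x_i <= a} / n
  cat("n =", n, " Fhat(a) =", Fhat, " F(a) =", pchisq(a, 3), "\n")
}
```

The base R function `ecdf(x)` returns the whole empirical cdf as a function if more than one point is needed.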
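15 Standardization of a Random Variable

A random variable $X$ with finite mean $\mu = EX$ and finite variance $\sigma^2$ is in its standardized form $Z$ when $Z = (X - \mu)/\sigma$.

$$EZ = \frac{E(X-\mu)}{\sigma} = \frac{\mu - \mu}{\sigma} = 0, \qquad \operatorname{var} Z = \frac{1}{\sigma^2}\operatorname{var}(X-\mu) = \frac{1}{\sigma^2}\operatorname{var}(X) = \frac{\sigma^2}{\sigma^2} = 1 .$$

The following are the standardized versions of $X_i$, $X_1 + \dots + X_n$ and $\bar{X}$:

random variable | expected value | standard deviation | standard units
$X_i$ | $\mu$ | $\sigma$ | $(X_i - \mu)/\sigma$
$X_1 + \dots + X_n$ | $n\mu$ | $\sqrt{n}\,\sigma$ | $\left(\sum_{i=1}^{n} X_i - n\mu\right)/(\sqrt{n}\,\sigma)$
$\bar{X}$ | $\mu$ | $\sigma/\sqrt{n}$ | $(\bar{X} - \mu)/(\sigma/\sqrt{n})$

Note the equivalence

$$\frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{\sum_{i=1}^{n} X_i - n\mu}{\sqrt{n}\,\sigma} .$$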
16 Comments on Standardization

The basic shape of the distribution remains unchanged by standardization.

$X \sim \mathrm{Bernoulli}(0.5) \Rightarrow \mu = p = 0.5$ and $\sigma = \sqrt{p(1-p)} = 0.5$. Then $Z = (X - 0.5)/0.5 = 2X - 1$ takes on the two values $(1 - 0.5)/0.5 = 1$ and $(0 - 0.5)/0.5 = -1$ with equal probability $p = 0.5$.

Standardization does not imply that the standardized random variable is normally distributed. This misconception may come from the frequent interchangeable use of the words standardization and normalization (normal in the sense of normative).

Standardization only turns $X \sim N(\mu, \sigma^2)$ into $Z = (X - \mu)/\sigma \sim N(0,1)$, i.e., you start with normality and you end up with normality.

Approximate distributional normality is due to different effects.
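Both points can be checked directly. The sketch below first standardizes the two Bernoulli(0.5) values from the slide, then standardizes a skewed $\chi^2(3)$ sample and compares a moment-based skewness before and after; the helper `skew`, the seed, and the $\chi^2(3)$ sample are assumptions for illustration.

```r
# Bernoulli(0.5): standardization just relabels the two values
x <- c(0, 1)
mu <- 0.5
sigma <- sqrt(0.5 * (1 - 0.5))               # sqrt(p(1-p)) = 0.5
z <- (x - mu) / sigma
print(z)                                     # the two values -1 and 1

# A skewed sample stays exactly as skewed after standardization
set.seed(5)                                  # assumed seed
y <- rchisq(1000, 3)
zy <- (y - mean(y)) / sd(y)
skew <- function(v) mean((v - mean(v))^3) / mean((v - mean(v))^2)^1.5
print(c(before = skew(y), after = skew(zy))) # identical up to rounding error
```

Skewness is unchanged because it is invariant under shifting and positive rescaling, which is all that standardization does.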
17 We have $\bar{X} - \mu \xrightarrow{P} 0$ (Speed?)

$$\operatorname{var}\big[\sqrt{n}\,(\bar{X} - \mu)\big] = n\operatorname{var}(\bar{X} - \mu) = n\operatorname{var}\bar{X} = n\,\frac{\sigma^2}{n} = \sigma^2$$

$\bar{X} - \mu$, multiplied by the factor $a_n = \sqrt{n}$, has mean zero and fixed variance $\sigma^2$.

Thus it appears that $a_n = \sqrt{n}$ is just the right factor to counteract the collapse, i.e., we can view $1/\sqrt{n}$ as the rate of the collapse of $\bar{X} - \mu$ to zero.

Aside from a stable mean zero and variance $\sigma^2$ for $\sqrt{n}\,(\bar{X} - \mu)$, can we say more about its distribution as $n \to \infty$?

This question is addressed by the Central Limit Theorem (CLT). A mechanical display of the CLT, the Galton Board or quincunx, is on display at the Pacific Science Center.
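The stabilizing effect of the factor $\sqrt{n}$ can be seen in a small simulation. This is a sketch under assumptions: the $\chi^2(3)$ population (for which $\mu = 3$ and $\sigma^2 = 6$), the helper `scaled_var`, and the seed are illustrative choices.

```r
set.seed(17)                                 # assumed seed
# Variance of (Xbar - mu) and of sqrt(n) * (Xbar - mu) for chi-square(k), mu = k
scaled_var <- function(n, Nsim = 10000, k = 3) {
  xbar <- replicate(Nsim, mean(rchisq(n, k)))
  c(raw = var(xbar - k), scaled = var(sqrt(n) * (xbar - k)))
}
res <- sapply(c(5, 20, 80), scaled_var)
colnames(res) <- c("n=5", "n=20", "n=80")
print(res)   # row "raw" shrinks roughly like 6/n, row "scaled" stays near 6
```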
18 Galton Board [Figure: simulated Galton board, $N_{\mathrm{sim}} = 5000$; horizontal axis $X_1 + \dots + X_n$]
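19 The Central Limit Theorem

Theorem: Let $X_1,\dots,X_n$ be i.i.d. $\sim F$, with finite mean $\mu$ and finite variance $\sigma^2$. Denote by $F_n$ the cdf of the standardized version of $\bar{X}_n$ and of $X_1 + \dots + X_n$, i.e., of

$$Z_n = \frac{\bar{X}_n - \mu}{\sigma/\sqrt{n}} = \frac{X_1 + \dots + X_n - n\mu}{\sqrt{n}\,\sigma} .$$

Then for all $z \in \mathbb{R}$

$$P(Z_n \le z) = F_n(z) \to \Phi(z) \quad \text{as } n \to \infty ,$$

where $\Phi$ is the standard normal cdf.

The distribution $F$ of the $X_i$ can be any distribution with finite $\mu$ and $\sigma^2$.

We also write $F_n(z) \approx \Phi(z)$ or $Z_n \approx N(0,1)$ to express this approximation result.

Note that

$$Z_n = \frac{\bar{X}_n - \mu}{\sigma/\sqrt{n}} = \frac{\sqrt{n}\,(\bar{X}_n - \mu)}{\sigma} \approx N(0,1) \quad \text{or} \quad \bar{X}_n - \mu \approx N(0, \sigma^2/n) \quad \text{(the collapse)}.$$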
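A quick way to see the theorem at work is to compare the simulated cdf of $Z_n$ with $\Phi$ at a few points. The sketch assumes a $\chi^2(3)$ population (mean $k$, variance $2k$), $n = 50$, and a seed; none of these choices come from the slide.

```r
set.seed(19)                                 # assumed seed
n <- 50; k <- 3                              # assumed sample size and degrees of freedom
mu <- k; sigma <- sqrt(2 * k)                # mean and sd of chi-square(k)
z <- replicate(10000, (mean(rchisq(n, k)) - mu) / (sigma / sqrt(n)))
zs <- c(-2, -1, 0, 1, 2)
cmp <- cbind(z   = zs,
             Fn  = sapply(zs, function(t) mean(z <= t)),  # simulated cdf of Z_n
             Phi = pnorm(zs))                             # standard normal cdf
print(round(cmp, 3))
```

Small remaining gaps between the two columns reflect both simulation noise and the residual skewness of the $\chi^2(3)$ population at $n = 50$.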
20 CLT for Binomial = Sum of Bernoulli R.V.s [Figure: density histogram of $X_1 + \dots + X_{50} \sim \mathrm{Binomial}(50, 0.4)$ with the approximating normal density as a blue line; normal Q-Q plot of sample quantiles against theoretical quantiles]
21 Speed of Convergence in CLT

How fast is the convergence $F_n(z) \to \Phi(z)$ in relation to $n$?

Under mild conditions: $\max_z |F_n(z) - \Phi(z)| \to 0$, again at a rate of about $c/\sqrt{n}$.

A rule of thumb: the normal approximation is usually adequate when $n \ge 30$. Often a much smaller $n$, say $n = 5$, is already quite adequate.

It all depends on what is meant by adequate.

When $|z|$ is large we have $\Phi(z) \approx 0$ or $1$, and the same will hold for $F_n$. Then the relative errors $|F_n(z) - \Phi(z)|/F_n(z)$ or $|F_n(z) - \Phi(z)|/(1 - F_n(z))$ may be more relevant.
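For a binomial sum the maximal error can be computed exactly, with no simulation. In this sketch the helper `max_dist` and the choice $p = 0.4$ are assumptions; the constant $c$ is not given on the slide, so the last column only shows whether $\sqrt{n}$ times the error stays roughly stable.

```r
# Exact maximal distance between the standardized Binomial(n, p) cdf and Phi.
# F_n is a step function and Phi is increasing, so the maximum is attained
# at a jump point, either at the jump or just before it.
max_dist <- function(n, p = 0.4) {
  k <- 0:n
  z <- (k - n * p) / sqrt(n * p * (1 - p))   # standardized jump points
  at_jump     <- abs(pbinom(k, n, p) - pnorm(z))
  before_jump <- abs(pbinom(k - 1, n, p) - pnorm(z))
  max(at_jump, before_jump)
}
ns <- c(25, 100, 400, 1600)
d <- sapply(ns, max_dist)
print(data.frame(n = ns, max_dist = d, sqrt_n_times_dist = sqrt(ns) * d))
```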
22 Measurement Example

Nuclear magnetic resonance (NMR) spectroscopy is used to measure the distance between nearby hydrogen atoms.

Known: The expected value of this measurement is the actual distance (no bias), and the standard deviation is $\sigma = 0.5$ angstroms.

If the measurement process is repeated 36 times, what is the chance that the average measured value $\bar{X}_{36}$ falls within 0.1 angstrom of the true value $\mu$?

$$P(\mu - 0.1 < \bar{X}_{36} < \mu + 0.1) = P(\mu - 0.1 - \mu < \bar{X}_{36} - \mu < \mu + 0.1 - \mu) = P(-0.1 < \bar{X}_{36} - \mu < 0.1)$$

$$= P\left(\frac{-0.1}{\sigma/\sqrt{n}} < \frac{\bar{X}_{36} - \mu}{\sigma/\sqrt{n}} < \frac{0.1}{\sigma/\sqrt{n}}\right) \approx P\left(\frac{-0.1}{0.5/6} < Z < \frac{0.1}{0.5/6}\right) = P(-1.2 < Z < 1.2)$$

$$P(-1.2 < Z < 1.2) = \Phi(1.2) - \Phi(-1.2) = \texttt{pnorm(1.2)} - \texttt{pnorm(-1.2)} \approx 0.7699$$
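The calculation can be reproduced in a few lines of R. The variable names are illustrative; the numbers are those of the example.

```r
sigma <- 0.5; n <- 36; delta <- 0.1          # values from the example
z <- delta / (sigma / sqrt(n))               # 0.1 / (0.5 / 6) = 1.2
p <- pnorm(z) - pnorm(-z)                    # normal approximation to the probability
print(z)
print(p)
```

Changing `n` shows how quickly the chance of landing within 0.1 angstrom grows with the number of repeated measurements.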
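23 Measurement Example (continued)

$$\text{CLT} \Rightarrow \sum_{i=1}^{n} X_i \approx N(n\mu, n\sigma^2) \quad \text{and} \quad \bar{X} \approx N(\mu, \sigma^2/n)$$

$X \sim D(\mu, \sigma^2)$ means that $X$ has some distribution with mean $\mu$ and variance $\sigma^2$.

If someone else independently replicates the previous experiment 64 times, what is the chance that the two averages are within 0.1 angstroms of each other?

$$X_1,\dots,X_{36} \sim D(\mu,\sigma^2) \Rightarrow \bar{X}_{36} \approx N(\mu, \sigma^2/36), \qquad Y_1,\dots,Y_{64} \sim D(\mu,\sigma^2) \Rightarrow \bar{Y}_{64} \approx N(\mu, \sigma^2/64)$$

$$\Rightarrow \bar{X}_{36} - \bar{Y}_{64} = \bar{X}_{36} + (-\bar{Y}_{64}) \approx N\big(\mu + (-\mu),\ \sigma^2/36 + \sigma^2/64\big) = N(0,\ \sigma^2/36 + \sigma^2/64)$$

$$= N\big(0,\ 0.25\,(36 + 64)/(36 \cdot 64)\big) = N(0,\ 5^2/48^2)$$

$$P(-0.1 < \bar{X}_{36} - \bar{Y}_{64} < 0.1) = P\left(\frac{-0.1}{5/48} < \frac{\bar{X}_{36} - \bar{Y}_{64}}{5/48} < \frac{0.1}{5/48}\right) \approx \Phi(0.96) - \Phi(-0.96) = \texttt{pnorm(.96)} - \texttt{pnorm(-.96)} \approx 0.6629$$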
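The two-lab calculation can be checked the same way. Variable names are illustrative; the numbers are those of the example.

```r
sigma <- 0.5
sd_diff <- sqrt(sigma^2 / 36 + sigma^2 / 64) # sd of Xbar_36 - Ybar_64, equals 5/48
z <- 0.1 / sd_diff                           # 0.1 / (5/48) = 0.96
p <- pnorm(z) - pnorm(-z)                    # normal approximation to the probability
print(c(sd_diff = sd_diff, z = z, p = p))
```

The variances add because the two averages are independent, which is why the difference is more variable than either average alone.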
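24 More General CLT

In our previous CLT we required the summands $X_i$ to be identically distributed.

Theorem: Let $X_i$ be independent random variables with respective finite means $\mu_i$ and variances $\sigma_i^2$, $i = 1,\dots,n$. Under additional (technical) assumptions, of which the following is most relevant,

$$\frac{\max(\sigma_1^2,\dots,\sigma_n^2)}{\sigma_1^2 + \dots + \sigma_n^2} \to 0 \quad \text{as } n \to \infty \qquad (1)$$

we get that the standardized sum

$$Z_n = \frac{X_1 + \dots + X_n - (\mu_1 + \dots + \mu_n)}{\sqrt{\sigma_1^2 + \dots + \sigma_n^2}}$$

has cdf $F_n$ such that

$$P(Z_n \le z) = F_n(z) \to \Phi(z) \quad \text{as } n \to \infty .$$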
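25 Sampled Densities [Figure: the five sampled densities of $X_1,\dots,X_5$, each with its mean $\mu$ marked]
26 CLT in Non-IID Case, $n = 5$ [Figure: frequency histogram of $X_1 + \dots + X_5$ and normal Q-Q plot of $X_1 + \dots + X_5$ against theoretical quantiles]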
27 Comments

The variance condition (1) makes sure that none of the variances dominates. All the variances contribute relatively small amounts to the total variability.

For example, if $X_1 \sim \mathrm{Uniform}(0,1000)$ with a very large variance and all the other random variables $X_i \sim N(0,1)$, $i = 2,\dots,n$, then for not so large $n$ the sum $X_1 + \dots + X_n$ will not be well approximated by a normal distribution, but will inherit mainly the uniform distribution character of $X_1$ (see next slide).
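The dominance of $X_1$ can be quantified for $n = 10$. The sketch below evaluates the ratio in condition (1) and compares quartiles of the simulated sum with those of a normal distribution having the same mean and standard deviation; the seed and the number of simulations are assumed, and the variance $1000^2/12$ of Uniform(0,1000) is a standard fact not stated on the slide.

```r
set.seed(23)                                 # assumed seed
n <- 10
vars <- c(1000^2 / 12, rep(1, n - 1))        # variances of Uniform(0,1000) and of nine N(0,1)
ratio <- max(vars) / sum(vars)               # left side of condition (1) at n = 10
print(ratio)                                 # close to 1: one variance dominates

# Simulate the sum X_1 + ... + X_10
s <- replicate(5000, runif(1, 0, 1000) + sum(rnorm(n - 1)))
q <- c(0.25, 0.5, 0.75)
print(rbind(simulated  = quantile(s, q),
            normal_fit = qnorm(q, mean(s), sd(s))))
```

The simulated quartiles sit near those of a uniform distribution on (0, 1000), noticeably farther apart than the quartiles of the fitted normal.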
28 $X_1 \sim \mathrm{Uniform}(0,1000)$ & $X_i \sim N(0,1)$, $i = 2,\dots,10$ [Figure: frequency histogram of $X_1 + \dots + X_{10}$]
29 Further Comments on the CLT

The initial version of the CLT in the iid case is useful in many situations when an experiment is repeated independently many times and we consider the average $\bar{X}$ as our main focus of interest.

The broader non-iid version of the CLT is very useful in rationalizing or modeling a normal distribution for random variables $X_i$ observed in experiments.

This rationalization consists in probing to what extent $X_i$ can be viewed as the sum of many random effects that act more or less independently.

For example, the time to complete a task can be viewed as the sum of the random times to complete many subtasks into which the main task can be decomposed.

Any measurement can be affected by many different sources of small errors.
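30 A Slight Extension of the CLT

Theorem: Let $X_1, X_2, \dots$ be a sequence of iid random variables with finite mean $\mu$ and finite variance $\sigma^2$. Suppose that $D_1, D_2, \dots$ is a sequence of random variables such that $D_n^2 \xrightarrow{P} \sigma^2$ as $n \to \infty$, and let

$$T_n = \frac{\bar{X}_n - \mu}{D_n/\sqrt{n}} = \frac{\bar{X}_n - \mu}{\sigma/\sqrt{n}} \cdot \frac{\sigma}{D_n} .$$

Then for any $t \in \mathbb{R}$ we have

$$F_n(t) = P(T_n \le t) \to \Phi(t) \quad \text{as } n \to \infty .$$

Note that $D_n/\sigma$ and its reciprocal basically behave like the constant 1 as $n \to \infty$.

In our previous measurement example we assumed a known $\sigma = 0.5$ angstrom.

Typically $\sigma$ is not known, but one can get an estimate of $\sigma^2$, say the plug-in sample estimate $\hat{\sigma}^2$.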
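The practical content of this extension is that replacing $\sigma$ by an estimate leaves the normal approximation intact for large $n$. The sketch below uses R's `sd()` (the $n-1$ version of the sample standard deviation) as $D_n$, which is an assumption of this example, as are the $\chi^2(3)$ population, $n = 100$, and the seed.

```r
set.seed(29)                                 # assumed seed
n <- 100; k <- 3; mu <- k                    # assumed sample size; chi-square(3) has mean 3
t_stat <- replicate(10000, {
  x <- rchisq(n, k)
  (mean(x) - mu) / (sd(x) / sqrt(n))         # T_n with D_n = sample standard deviation
})
print(mean(abs(t_stat) <= 1.96))             # simulated P(|T_n| <= 1.96)
print(pnorm(1.96) - pnorm(-1.96))            # normal value it is compared with
```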
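31 $\hat{\sigma}_n^2 \xrightarrow{P} \sigma^2$

Recall

$$\hat{\sigma}_n^2 = \frac{1}{n}\sum_{i=1}^{n} (X_i - \bar{X}_n)^2 = \frac{1}{n}\sum_{i=1}^{n} X_i^2 - \bar{X}_n^2 .$$

By the WLLN applied to the averages of the $X_i$ and of the $X_i^2$,

$$\bar{X}_n \xrightarrow{P} \mu \Rightarrow \bar{X}_n^2 \xrightarrow{P} \mu^2 \quad \text{and} \quad \frac{1}{n}\sum_{i=1}^{n} X_i^2 \xrightarrow{P} E(X_i^2)$$

$$\Rightarrow \frac{1}{n}\sum_{i=1}^{n} X_i^2 - \bar{X}_n^2 \xrightarrow{P} E(X_i^2) - \mu^2 = \sigma^2 .$$

While the above sequence of conclusions still requires some technical details, our understanding of $\xrightarrow{P}$ should make them quite evident.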
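Both the algebraic identity and the convergence can be checked on a large simulated sample. This is a sketch assuming a $\chi^2(3)$ population, for which $\sigma^2 = 2 \cdot 3 = 6$; the seed and sample size are illustrative.

```r
set.seed(31)                                 # assumed seed
x <- rchisq(100000, 3)                       # population variance sigma^2 = 6
v1 <- mean((x - mean(x))^2)                  # (1/n) * sum (x_i - xbar)^2
v2 <- mean(x^2) - mean(x)^2                  # (1/n) * sum x_i^2 - xbar^2
print(c(v1 = v1, v2 = v2))                   # equal up to rounding, both near 6
```

Note that R's built-in `var()` divides by $n - 1$ rather than $n$, so it differs from the plug-in estimate by the factor $n/(n-1)$.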
More information5. INEQUALITIES, LIMIT THEOREMS AND GEOMETRIC PROBABILITY
IA Probability Let Term 5 INEQUALITIES, LIMIT THEOREMS AND GEOMETRIC PROBABILITY 51 Iequalities Suppose that X 0 is a radom variable takig o-egative values ad that c > 0 is a costat The P X c E X, c is
More informationNotes 19 : Martingale CLT
Notes 9 : Martigale CLT Math 733-734: Theory of Probability Lecturer: Sebastie Roch Refereces: [Bil95, Chapter 35], [Roc, Chapter 3]. Sice we have ot ecoutered weak covergece i some time, we first recall
More information1 Introduction to reducing variance in Monte Carlo simulations
Copyright c 010 by Karl Sigma 1 Itroductio to reducig variace i Mote Carlo simulatios 11 Review of cofidece itervals for estimatig a mea I statistics, we estimate a ukow mea µ = E(X) of a distributio by
More informationThis exam contains 19 pages (including this cover page) and 10 questions. A Formulae sheet is provided with the exam.
Probability ad Statistics FS 07 Secod Sessio Exam 09.0.08 Time Limit: 80 Miutes Name: Studet ID: This exam cotais 9 pages (icludig this cover page) ad 0 questios. A Formulae sheet is provided with the
More informationStat 421-SP2012 Interval Estimation Section
Stat 41-SP01 Iterval Estimatio Sectio 11.1-11. We ow uderstad (Chapter 10) how to fid poit estimators of a ukow parameter. o However, a poit estimate does ot provide ay iformatio about the ucertaity (possible
More informationSequences A sequence of numbers is a function whose domain is the positive integers. We can see that the sequence
Sequeces A sequece of umbers is a fuctio whose domai is the positive itegers. We ca see that the sequece 1, 1, 2, 2, 3, 3,... is a fuctio from the positive itegers whe we write the first sequece elemet
More informationStatistical Inference (Chapter 10) Statistical inference = learn about a population based on the information provided by a sample.
Statistical Iferece (Chapter 10) Statistical iferece = lear about a populatio based o the iformatio provided by a sample. Populatio: The set of all values of a radom variable X of iterest. Characterized
More informationProbability and statistics: basic terms
Probability ad statistics: basic terms M. Veeraraghava August 203 A radom variable is a rule that assigs a umerical value to each possible outcome of a experimet. Outcomes of a experimet form the sample
More informationMath 525: Lecture 5. January 18, 2018
Math 525: Lecture 5 Jauary 18, 2018 1 Series (review) Defiitio 1.1. A sequece (a ) R coverges to a poit L R (writte a L or lim a = L) if for each ǫ > 0, we ca fid N such that a L < ǫ for all N. If the
More informationBig Picture. 5. Data, Estimates, and Models: quantifying the accuracy of estimates.
5. Data, Estimates, ad Models: quatifyig the accuracy of estimates. 5. Estimatig a Normal Mea 5.2 The Distributio of the Normal Sample Mea 5.3 Normal data, cofidece iterval for, kow 5.4 Normal data, cofidece
More informationBasics of Probability Theory (for Theory of Computation courses)
Basics of Probability Theory (for Theory of Computatio courses) Oded Goldreich Departmet of Computer Sciece Weizma Istitute of Sciece Rehovot, Israel. oded.goldreich@weizma.ac.il November 24, 2008 Preface.
More informationAsymptotic Results for the Linear Regression Model
Asymptotic Results for the Liear Regressio Model C. Fli November 29, 2000 1. Asymptotic Results uder Classical Assumptios The followig results apply to the liear regressio model y = Xβ + ε, where X is
More informationStatistical Theory; Why is the Gaussian Distribution so popular?
Statistical Theory; Why is the Gaussia Distributio so popular? Rob Nicholls MRC LMB Statistics Course 2014 Cotets Cotiuous Radom Variables Expectatio ad Variace Momets The Law of Large Numbers (LLN) The
More information1 = δ2 (0, ), Y Y n nδ. , T n = Y Y n n. ( U n,k + X ) ( f U n,k + Y ) n 2n f U n,k + θ Y ) 2 E X1 2 X1
8. The cetral limit theorems 8.1. The cetral limit theorem for i.i.d. sequeces. ecall that C ( is N -separatig. Theorem 8.1. Let X 1, X,... be i.i.d. radom variables with EX 1 = ad EX 1 = σ (,. Suppose
More informationBayesian Methods: Introduction to Multi-parameter Models
Bayesia Methods: Itroductio to Multi-parameter Models Parameter: θ = ( θ, θ) Give Likelihood p(y θ) ad prior p(θ ), the posterior p proportioal to p(y θ) x p(θ ) Margial posterior ( θ, θ y) is Iterested
More informationComparing Two Populations. Topic 15 - Two Sample Inference I. Comparing Two Means. Comparing Two Pop Means. Background Reading
Topic 15 - Two Sample Iferece I STAT 511 Professor Bruce Craig Comparig Two Populatios Research ofte ivolves the compariso of two or more samples from differet populatios Graphical summaries provide visual
More informationST5215: Advanced Statistical Theory
ST525: Advaced Statistical Theory Departmet of Statistics & Applied Probability Tuesday, September 7, 2 ST525: Advaced Statistical Theory Lecture : The law of large umbers The Law of Large Numbers The
More informationStatistics 511 Additional Materials
Cofidece Itervals o mu Statistics 511 Additioal Materials This topic officially moves us from probability to statistics. We begi to discuss makig ifereces about the populatio. Oe way to differetiate probability
More informationOutput Analysis and Run-Length Control
IEOR E4703: Mote Carlo Simulatio Columbia Uiversity c 2017 by Marti Haugh Output Aalysis ad Ru-Legth Cotrol I these otes we describe how the Cetral Limit Theorem ca be used to costruct approximate (1 α%
More informationThe Central Limit Theorem
Chapter The Cetral Limit Theorem Deote by Z the stadard ormal radom variable with desity 2π e x2 /2. Lemma.. Ee itz = e t2 /2 Proof. We use the same calculatio as for the momet geeratig fuctio: exp(itx
More information5. Limit Theorems, Part II: Central Limit Theorem. ECE 302 Fall 2009 TR 3 4:15pm Purdue University, School of ECE Prof.
5. Limit Theorems, Part II: Cetral Limit Theorem ECE 302 Fall 2009 TR 3 4:15pm Purdue Uiversity, School of ECE Prof. Ilya Pollak WLLN ad CLT X 1,, X i.i.d. with fiite mea μ ad variace σ 2 WLLN ad CLT X
More information