Lecture 11 October 27
STATS 300A: Theory of Statistics, Fall 2015
Lecture 11, October 27
Lecturer: Lester Mackey. Scribe: Viswajith Venugopal, Vivek Bagaria, Steve Yadlowsky
Warning: These notes may contain factual and/or typographic errors.

11.1 Summary

In this lecture, we will discuss the identification of minimax estimators via submodels, the admissibility of minimax estimators, and simultaneous estimation and the James-Stein estimator. This will conclude our discussion of estimation; in the future we will focus on the decision problem of hypothesis testing.

11.2 Minimax Estimators and Submodels

Recall that an estimator $\delta^M$ is minimax if its maximum risk is minimal:

$$\inf_{\delta} \sup_{\theta \in \Omega} R(\theta, \delta) = \sup_{\theta \in \Omega} R(\theta, \delta^M).$$

We saw how to derive the minimax estimator using least favourable priors in Lecture 10. In this lecture we will consider a different approach, based on the following lemma.

Lemma 1 (TPE 5.1.15). Suppose that $\delta$ is minimax for a submodel $\theta \in \Omega_0 \subset \Omega$ and

$$\sup_{\theta \in \Omega_0} R(\theta, \delta) = \sup_{\theta \in \Omega} R(\theta, \delta).$$

Then $\delta$ is minimax for the full model, $\theta \in \Omega$.

This lemma allows us to find a minimax estimator for a particular tractable submodel, and then show that the worst-case risk for the full model is equal to that of the submodel (that is, the worst-case risk does not rise as we go to the full model). In that case, by the lemma, the estimator we found is also minimax for the full model. This is similar to how we justified minimaxity of the estimator of a Normal mean with bounded variance last lecture. Here is a fairly simple example.

Example 1. Let $X_1, \dots, X_n$ be i.i.d. $N(\mu, \sigma^2)$, where both $\mu$ and $\sigma^2$ are unknown. Thus our parameter vector is $\theta = (\mu, \sigma^2)$ and our parameter space is $\Omega = \mathbb{R} \times \mathbb{R}_+$. Our task is to estimate $\mu$. Our loss function is the relative squared error loss, given by:

$$L((\mu, \sigma^2), d) = \frac{(d - \mu)^2}{\sigma^2}.$$
We consider this loss function to make the question of minimaxity more interesting: under ordinary squared error loss the worst-case risk is unbounded over the full model, since the risk is proportional to the variance $\sigma^2$, which is unbounded.

We consider the submodel where $\sigma^2 = 1$. That is, $\Omega_0 = \mathbb{R} \times \{1\}$, and our loss function simplifies to the usual squared error loss: $L((\mu, 1), d) = (d - \mu)^2$. We saw in an example of Lecture 10 that under this loss $\bar{X}$ is minimax for $\Omega_0$. Moreover,

$$R((\mu, \sigma^2), \bar{X}) = \frac{1}{n} \quad \text{for all } (\mu, \sigma^2) \in \Omega.$$

Thus, the risk does not depend on $\sigma^2$. Since $R((\mu, 1), \bar{X}) = R((\mu, \sigma^2), \bar{X})$, the maximum risks are equal. That is, $\sup_{\theta \in \Omega_0} R(\theta, \bar{X}) = \sup_{\theta \in \Omega} R(\theta, \bar{X})$. Therefore, it follows from Lemma 1 that $\bar{X}$ is minimax on $\Omega$. Note that, thanks to our new loss function, we do not need to impose boundedness on the variance (as we did in the previous lecture) to establish minimaxity in a meaningful way.

This example is parametric, like many of the examples we have seen so far. Assuming that we know the form of the distribution, and that the variables are i.i.d., are both strong assumptions. We now consider a more ambitious example in a non-parametric setting, which is therefore more general.

Example 2 (TPE Example 5.1.16). Suppose $X_1, X_2, \dots, X_n$ are i.i.d. with common CDF $F$, with mean $\mu(F) < \infty$ and variance $\sigma^2(F) < \infty$. Our goal is to find a minimax estimator of $\mu(F)$ under squared error loss. Without further restriction on $F$, the worst-case risk is unbounded for every estimator, so every estimator is (trivially) minimax. We will therefore restrict the family so that the worst-case risk is finite and meaningful minimax estimators can be obtained.

Constraint (a). Assume $\sigma^2(F) \le B$. We saw in the previous lecture that $\bar{X}$ is minimax for the Gaussian submodel in this case, so a natural guess is that $\bar{X}$ is minimax here too. We verify this by applying Lemma 1. First, we compute the supremum risk for the full model:

$$R(F, \bar{X}) = \frac{1}{n^2} \sum_{i=1}^{n} E(X_i - \mu(F))^2 = \frac{\sigma^2(F)}{n}.$$

Since $\sigma^2(F) \in [0, B]$ by assumption, we get:
$$\sup_{F} R(F, \bar{X}) = \frac{B}{n}.$$

Now, we saw in Lecture 10 that $\bar{X}$ is minimax for the submodel $\mathcal{F}_0 = \{N(\mu, \sigma^2) : \sigma^2 \le B\}$. Further, the supremum risk in this submodel is identical to that of the full model:

$$\sup_{F \in \mathcal{F}_0} R(F, \bar{X}) = \frac{B}{n}.$$

Thus, using Lemma 1, we conclude that $\bar{X}$ is minimax for the full model (that is, the non-parametric model still constrained to have $\sigma^2(F) \le B$).
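The identity $R(F, \bar{X}) = \sigma^2(F)/n$ used in Constraint (a) depends on $F$ only through its variance. The following is a minimal Monte Carlo sketch of that fact, not part of the original notes: it compares a Gaussian with a symmetric two-point distribution, both with mean 0 and variance $B$. The function name `mc_risk_of_mean`, the values of `n` and `B`, the number of replications, and the seed are all illustrative assumptions.

```python
import random

def mc_risk_of_mean(sampler, mu, n, reps, rng):
    """Monte Carlo estimate of E[(Xbar - mu)^2] for i.i.d. samples of size n."""
    total = 0.0
    for _ in range(reps):
        xbar = sum(sampler(rng) for _ in range(n)) / n
        total += (xbar - mu) ** 2
    return total / reps

rng = random.Random(0)   # fixed seed (assumption) so the run is reproducible
n, B = 10, 4.0           # illustrative sample size and variance bound
root_b = B ** 0.5

def gauss(r):
    # N(0, B): mean 0, variance B
    return r.gauss(0.0, root_b)

def two_point(r):
    # +sqrt(B) or -sqrt(B) with equal probability: mean 0, variance B
    return root_b if r.random() < 0.5 else -root_b

risk_gauss = mc_risk_of_mean(gauss, 0.0, n, 20000, rng)
risk_two_point = mc_risk_of_mean(two_point, 0.0, n, 20000, rng)

# Both estimates should be close to B / n, up to Monte Carlo error.
print(risk_gauss, risk_two_point, B / n)
```

Both distributions sit on the boundary $\sigma^2(F) = B$, so each should attain (approximately) the supremum risk $B/n$ of the sample mean.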
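Constraint (b). Assume $F \in \mathcal{F}$, where $\mathcal{F}$ is the set of all CDFs with support contained in $[0, 1]$. Is $\bar{X}$ minimax for this model? We have reason to believe that it is not, based on the minimax estimator we derived in Lecture 9 for the Binomial submodel, and in fact it is not. To show this, first consider the submodel $\mathcal{F}_0 = \{\mathrm{Ber}(\theta)\}_{\theta \in (0,1)}$. Let $Y = \sum_{i=1}^{n} X_i$, so that $Y \sim \mathrm{Bin}(n, \theta)$ and $\bar{X} = Y/n$. Recall from Lecture 9 that the minimax estimator for $\mu(F) = \theta$ in the Binomial case is

$$\delta(X) = \frac{\sqrt{n}}{1 + \sqrt{n}} \bar{X} + \frac{1}{1 + \sqrt{n}} \cdot \frac{1}{2},$$

which has supremum risk $\frac{1}{4(1 + \sqrt{n})^2}$. So

$$\sup_{\theta} R(\theta, \bar{X}) = \frac{1}{4n} > \frac{1}{4(1 + \sqrt{n})^2} = \sup_{\theta} R(\theta, \delta).$$

Thus $\bar{X}$ has a higher worst-case risk than $\delta(X)$ on the submodel, and hence also on the full model, so $\bar{X}$ is not minimax.

Now let us be more ambitious and look for the minimax estimator under the full model. We know that it cannot be $\bar{X}$, but it could be $\delta(X)$. We conjecture that $\delta(X)$ is also minimax under the full model. To establish this using Lemma 1, we need to show that the supremum risk of $\delta(X)$ under the full model is no more than $\frac{1}{4(1 + \sqrt{n})^2}$ (the supremum risk for the Binomial submodel). Let us compute:

$$
\begin{aligned}
E_F[(\delta(X) - \mu(F))^2]
&= E_F\left[\left(\frac{\sqrt{n}}{1 + \sqrt{n}} (\bar{X} - \mu(F)) + \frac{1}{1 + \sqrt{n}} \left(\frac{1}{2} - \mu(F)\right)\right)^2\right] \\
&= \left(\frac{1}{1 + \sqrt{n}}\right)^2 \left[ n \,\mathrm{Var}_F(\bar{X}) + \left(\frac{1}{2} - \mu(F)\right)^2 \right] \\
&= \left(\frac{1}{1 + \sqrt{n}}\right)^2 \left[ E_F(X_1^2) - \mu(F)^2 + \frac{1}{4} - \mu(F) + \mu(F)^2 \right] \\
&= \left(\frac{1}{1 + \sqrt{n}}\right)^2 \left[ E_F(X_1^2) + \frac{1}{4} - \mu(F) \right],
\end{aligned}
$$

where the second step uses the fact that the cross term vanishes because $E_F(\bar{X} - \mu(F)) = 0$, and the third step follows from $n \,\mathrm{Var}(\bar{X}) = \mathrm{Var}(X_1) = E[X_1^2] - (E[X_1])^2 = E[X_1^2] - \mu(F)^2$. By assumption $X_1 \in [0, 1]$, so $X_1^2 \le X_1$ and we can bound the risk:

$$E_F[(\delta(X) - \mu(F))^2] \le \left(\frac{1}{1 + \sqrt{n}}\right)^2 \left[ E_F(X_1) + \frac{1}{4} - \mu(F) \right] = \frac{1}{4(1 + \sqrt{n})^2}.$$

So $\delta(X)$ is minimax for the Binomial submodel, and its worst-case risk is the same for the full model as for the Binomial submodel. Therefore, applying Lemma 1, we conclude that $\delta(X)$ is minimax for the full model.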
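The comparison of worst-case risks in the Bernoulli submodel can be checked exactly, since $Y \sim \mathrm{Bin}(n, \theta)$ takes only $n + 1$ values. The sketch below is an illustration rather than part of the original notes; the helper names `exact_risk`, `sample_mean`, and `shrunk_mean`, the choice $n = 25$, and the grid over $\theta$ are assumptions made for the example.

```python
import math

def exact_risk(estimator, theta, n):
    """Exact squared-error risk under Y ~ Bin(n, theta); estimator sees xbar = y / n."""
    risk = 0.0
    for y in range(n + 1):
        pmf = math.comb(n, y) * theta**y * (1 - theta) ** (n - y)
        risk += pmf * (estimator(y / n, n) - theta) ** 2
    return risk

def sample_mean(xbar, n):
    return xbar

def shrunk_mean(xbar, n):
    # delta(X) = sqrt(n)/(1+sqrt(n)) * xbar + 1/(1+sqrt(n)) * 1/2
    root = math.sqrt(n)
    return (root * xbar + 0.5) / (1 + root)

n = 25                                   # illustrative sample size
grid = [i / 100 for i in range(1, 100)]  # theta in {0.01, ..., 0.99}

max_risk_mean = max(exact_risk(sample_mean, t, n) for t in grid)
max_risk_shrunk = max(exact_risk(shrunk_mean, t, n) for t in grid)

# Closed forms from the text: 1/(4n) for the sample mean,
# 1/(4(1+sqrt(n))^2) for the shrunk estimator (whose risk is constant in theta).
print(max_risk_mean, 1 / (4 * n))
print(max_risk_shrunk, 1 / (4 * (1 + math.sqrt(n)) ** 2))
```

The grid contains $\theta = 0.5$, where the risk $\theta(1-\theta)/n$ of the sample mean is maximized, so the grid maximum matches $1/(4n)$ up to floating-point error.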
11.3 Admissibility of minimax estimators

Let us now turn to the question of admissibility of minimax estimators. This question is particularly important for minimax estimators. When we worked with unbiased estimators we also found dominating estimators, but those were biased, so we lost the property (unbiasedness) that we were interested in. In contrast, any estimator that dominates a minimax estimator is itself minimax.

As an aside, admissibility can give rise to minimaxity: if $\delta$ is admissible with constant risk, then $\delta$ is also minimax. (This is not hard to show. Let the constant risk of $\delta$ be $r$. Then $r$ is also the worst-case risk of $\delta$. If $\delta$ is not minimax, there exists a different estimator, say $\delta'$, whose worst-case risk $r'$ satisfies $r' < r$. But then the risk of $\delta'$ is at most $r' < r$ for every $\theta$, so $\delta'$ dominates $\delta$. This contradicts the admissibility of $\delta$, and therefore $\delta$ is minimax.)

Note that minimaxity does not guarantee admissibility; it only ensures that the worst-case risk is optimal. Admissibility needs to be checked separately. The following example illustrates several standard ways of doing so.

Example 3. Let $X_1, X_2, \dots, X_n \overset{iid}{\sim} N(\theta, \sigma^2)$, where $\sigma^2$ is known and $\theta$ is the estimand. Then $\bar{X}$ is the minimax estimator under squared error loss, and we would like to determine whether $\bar{X}$ is admissible. Instead of answering this directly, we answer a more general question: when is $a\bar{X} + b$, with $a, b \in \mathbb{R}$ (that is, any affine function of $\bar{X}$), admissible?

Case 1: $0 < a < 1$. In this case $a\bar{X} + b$ is a convex combination of $\bar{X}$ and the constant $b/(1 - a)$. By results we saw in the previous lecture, it is a Bayes estimator with respect to some Gaussian prior on $\theta$. Further, since we are using squared error loss, which is strictly convex, this Bayes estimator is unique.
So, by Theorem 5.2.4 (which tells us that a unique Bayes estimator is always admissible), $a\bar{X} + b$ is admissible.

Case 2: $a = 0$. In this case $b$ is also a unique Bayes estimator, with respect to a degenerate prior distribution with unit mass at $\theta = b$. So by Theorem 5.2.4, $b$ is admissible.

Case 3: $a = 1$, $b \ne 0$. In this case $\bar{X} + b$ is not admissible because it is dominated by $\bar{X}$. To see this, note that $\bar{X}$ has the same variance as $\bar{X} + b$, but strictly smaller bias in absolute value (zero rather than $b$).

The next few cases use the following result. In general, the risk of $a\bar{X} + b$ is

$$E[(a\bar{X} + b - \theta)^2] = E\left[\left(a(\bar{X} - \theta) + b + \theta(a - 1)\right)^2\right] = \frac{a^2 \sigma^2}{n} + (b + \theta(a - 1))^2,$$

where, in the first step, we added and subtracted $a\theta$ inside the square.

Case 4: $a > 1$. Applying the general risk formula, we have:

$$E[(a\bar{X} + b - \theta)^2] \ge \frac{a^2 \sigma^2}{n} > \frac{\sigma^2}{n} = R(\theta, \bar{X}).$$
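The first inequality follows because the second summand in the general risk formula is always nonnegative. Thus $\bar{X}$ dominates $a\bar{X} + b$ when $a > 1$, and so in this case $a\bar{X} + b$ is inadmissible.

Case 5: $a < 0$. Here

$$E[(a\bar{X} + b - \theta)^2] > (b + \theta(a - 1))^2 = (a - 1)^2 \left(\theta + \frac{b}{a - 1}\right)^2 \ge \left(\theta + \frac{b}{a - 1}\right)^2,$$

where the first inequality is strict because $a^2\sigma^2/n > 0$ and the second holds because $(a - 1)^2 > 1$. The right-hand side is the risk of predicting the constant $-b/(a - 1)$. So the constant $-b/(a - 1)$ dominates $a\bar{X} + b$, and therefore $a\bar{X} + b$ is again inadmissible.

We have now considered every case except for the estimator $\bar{X}$ itself. It turns out that $\bar{X}$ is admissible. The argument in this case is more involved, and proceeds by contradiction.

Case 6: $a = 1$, $b = 0$. Here, we use a limiting Bayes argument. Suppose $\bar{X}$ is inadmissible. Assuming w.l.o.g. that $\sigma^2 = 1$, we have

$$R(\theta, \bar{X}) = \frac{1}{n}.$$

By our hypothesis, there must exist an estimator $\delta'$ such that $R(\theta, \delta') \le 1/n$ for all $\theta$ and $R(\theta, \delta') < 1/n$ for at least one $\theta \in \Omega$. Because $R(\theta, \delta')$ is continuous in $\theta$, there must exist $\varepsilon > 0$ and an interval $(\theta_0, \theta_1)$ containing that $\theta$ such that

$$R(\theta, \delta') < \frac{1}{n} - \varepsilon \quad \text{for all } \theta \in (\theta_0, \theta_1). \qquad (11.1)$$

Let $r'_\tau$ be the average risk of $\delta'$ with respect to the prior distribution $N(0, \tau^2)$ on $\theta$. (Note that this is exactly the prior we used to prove that $\bar{X}$ is a limit of Bayes estimators, and hence minimax. We did this by letting $\tau \to \infty$, so that the prior tends to the improper prior $\pi(\theta) = 1$ for all $\theta$.) Let $r_\tau$ be the average risk of a Bayes estimator $\delta_\tau$ under the same prior. Note that $\delta_\tau \ne \delta'$, because $R(\theta, \delta_\tau) \to \infty$ as $\theta \to \infty$, which is not consistent with $R(\theta, \delta') \le 1/n$ for all $\theta \in \mathbb{R}$. So $r_\tau < r'_\tau$, because the Bayes estimator is unique almost surely with respect to the marginal distribution of the data.

We will look at the following ratio, which is chosen to simplify the algebra later. We will show that this ratio becomes arbitrarily large, which contradicts $r_\tau < r'_\tau$. Using the form of the Bayes risk $r_\tau$ computed in a previous lecture (see TPE Example 5.1.14), we can write:

$$\frac{\frac{1}{n} - r'_\tau}{\frac{1}{n} - r_\tau} = \frac{\displaystyle \int_{-\infty}^{\infty} \left[\frac{1}{n} - R(\theta, \delta')\right] \frac{1}{\sqrt{2\pi}\,\tau} \exp\left(-\frac{\theta^2}{2\tau^2}\right) d\theta}{\dfrac{1}{n} - \dfrac{\tau^2}{1 + n\tau^2}}.$$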
Applying (11.1), and using $\frac{1}{n} - \frac{\tau^2}{1 + n\tau^2} = \frac{1}{n(1 + n\tau^2)}$, we find:

$$\frac{\frac{1}{n} - r'_\tau}{\frac{1}{n} - r_\tau} \ge \frac{\displaystyle \frac{1}{\sqrt{2\pi}\,\tau} \int_{\theta_0}^{\theta_1} \varepsilon\, e^{-\theta^2/(2\tau^2)}\, d\theta}{\dfrac{1}{n(1 + n\tau^2)}} = \frac{n(1 + n\tau^2)\,\varepsilon}{\tau \sqrt{2\pi}} \int_{\theta_0}^{\theta_1} e^{-\theta^2/(2\tau^2)}\, d\theta.$$

As $\tau \to \infty$, the first factor $n(1 + n\tau^2)\varepsilon / (\tau\sqrt{2\pi}) \to \infty$, and since the integrand converges monotonically to 1, Lebesgue's monotone convergence theorem ensures that the integral approaches the positive quantity $\theta_1 - \theta_0$. So, for sufficiently large $\tau$, we must have

$$\frac{\frac{1}{n} - r'_\tau}{\frac{1}{n} - r_\tau} > 1.$$

This means that $r'_\tau < r_\tau$. However, this is a contradiction, because $r_\tau$ is the optimal average risk (it is the Bayes risk). So our assumption that there was a dominating estimator was false, and in this case $a\bar{X} + b = \bar{X}$ is admissible.

11.4 Simultaneous estimation

Up to this point, we have considered only situations where a single real-valued parameter is of interest. In practice, however, we often care about several parameters and wish to estimate them all at once. In this section we consider the admissibility of estimators of several parameters, that is, of simultaneous estimation.

Example 4. Let $X_1, X_2, \dots, X_p$ be independent with $X_i \sim N(\theta_i, \sigma^2)$ for $1 \le i \le p$. For the sake of simplicity, say $\sigma^2 = 1$. Our goal is to estimate $\theta = (\theta_1, \theta_2, \dots, \theta_p)$ under the loss function

$$L(\theta, d) = \sum_{i=1}^{p} (d_i - \theta_i)^2.$$

A natural estimator for $\theta$ is $X = (X_1, X_2, \dots, X_p)$. It can be shown that $X$ is the UMRUE, the maximum likelihood estimator, a generalized Bayes estimator, and a minimax estimator for $\theta$. So it would be natural to think that $X$ is admissible. However, counterintuitively, this is not the case when $p \ge 3$. When $p \ge 3$, $X$ is (strictly) dominated by the James-Stein estimator:

$$\delta(X) = (\delta_1(X), \delta_2(X), \dots, \delta_p(X)) \quad \text{where} \quad \delta_i(X) = \left(1 - \frac{p - 2}{\|X\|_2^2}\right) X_i.$$

Here $\|\cdot\|_2$ is the 2-norm, so $\|X\|_2^2 = \sum_{j=1}^{p} X_j^2$.
The J-S estimator makes use of the entire data vector when estimating each $\theta_i$, so it is surprising that this is beneficial given the assumed independence among the components of $X$. A well-known example applies the James-Stein estimator to the estimation of batting averages.

It turns out that the James-Stein estimator is not itself admissible, because it is dominated by the positive-part James-Stein estimator (TPE Theorem 5.5.4):

$$\delta_i(X) = \max\left(1 - \frac{p - 2}{\|X\|_2^2},\ 0\right) X_i.$$

To add insult to injury, even this estimator can be shown to be inadmissible, although that proof is non-constructive.

11.4.1 Motivation for the J-S estimator

To motivate the J-S estimator, we consider how it can arise in an empirical Bayes framework. The empirical Bayes approach (which builds on principles of Bayesian estimation, but is not strictly Bayesian) is a two-step process:

1. Introduce a prior family indexed by a hyperparameter (this is the Bayesian aspect).
2. Estimate the hyperparameter from the data (this is the empirical aspect).

Applying this procedure to the problem at hand:

1. Suppose $\theta_i \overset{iid}{\sim} N(0, A)$. Then the Bayes estimator for $\theta_i$ is

$$\delta_{A,i}(X) = \frac{A}{A + 1} X_i = \left(1 - \frac{1}{A + 1}\right) X_i.$$

2. In this step we must choose $A$. Marginalizing over $\theta$, we see that $X$ has the distribution $X_i \overset{iid}{\sim} N(0, A + 1)$ (Exercise: verify this). We will use $X$ and the knowledge of this marginal distribution to find an estimate of $\frac{1}{A + 1}$. One could, in principle, use any estimate of $A$, and it is common to use a maximum likelihood estimate, but here we will use an unbiased estimate. It can be shown that

$$E\left[\frac{1}{\|X\|_2^2}\right] = \frac{1}{(p - 2)(A + 1)}$$

(Exercise: verify this. Hint: $\frac{1}{A + 1}\|X\|_2^2$ follows a $\chi^2_p$ distribution). So $\frac{p - 2}{\|X\|_2^2}$ is UMVU for $\frac{1}{A + 1}$.

If we plug this estimator into our Bayes estimator we obtain the J-S estimator:

$$\delta_i(X) = \left(1 - \frac{p - 2}{\|X\|_2^2}\right) X_i.$$
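The unbiasedness claim in step 2 can be sanity-checked by simulation. The sketch below is illustrative and not part of the original notes: it draws $X_i \sim N(0, A + 1)$ directly from the marginal distribution and averages $(p - 2)/\|X\|_2^2$. The function name `mc_mean_shrinkage_estimate` and the values of `p`, `A`, `reps`, and `seed` are assumptions chosen for the example.

```python
import random

def mc_mean_shrinkage_estimate(p, A, reps, seed):
    """Monte Carlo average of (p - 2) / ||X||^2 when X_i ~ N(0, A + 1) i.i.d."""
    rng = random.Random(seed)
    sd = (A + 1.0) ** 0.5   # marginal standard deviation of each X_i
    total = 0.0
    for _ in range(reps):
        norm_sq = sum(rng.gauss(0.0, sd) ** 2 for _ in range(p))
        total += (p - 2) / norm_sq
    return total / reps

p, A = 10, 3.0              # illustrative dimension and prior variance
estimate = mc_mean_shrinkage_estimate(p, A, reps=50000, seed=1)
target = 1.0 / (A + 1.0)    # the quantity being estimated, 1 / (A + 1)

# The two numbers should agree up to Monte Carlo error.
print(estimate, target)
```

The check requires $p \ge 3$; for $p \le 2$ the expectation $E[1/\|X\|_2^2]$ is infinite and the average does not stabilize.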
11.4.2 James-Stein domination

Intuitively, the problem with the estimate $X$ is that $\|X\|_2^2$ is typically much larger than $\|\theta\|_2^2$:

$$E[\|X\|_2^2] = E\left[\sum_{j=1}^{p} X_j^2\right] = p + \sum_{i=1}^{p} \theta_i^2 = p + \|\theta\|_2^2,$$

where the term $p$ is really $\sigma^2 p$, which equals $p$ here because $\sigma^2 = 1$. So we may view the J-S estimator as a method for correcting the bias in the size of $X$. It achieves this by shrinking each coordinate of $X$ toward 0. The uniform superiority of the J-S estimator over $X$ can be formalised (see Keener 11.2).

Theorem 1 (TPE Theorem 5.5.1). The James-Stein estimator $\delta$ has uniformly smaller risk than $X$ if $p \ge 3$.

The proof, given on p. 355 of TPE, compares the risk of the J-S estimator directly to that of $X$.
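A small simulation can illustrate the domination result at a single parameter value (it does not prove uniform domination). The sketch below is not part of the original notes: it estimates the risks of $X$, the James-Stein estimator, and the positive-part James-Stein estimator by Monte Carlo with $\sigma^2 = 1$. The function names, the choice $\theta = (1, \dots, 1)$ with $p = 5$, the number of replications, and the seed are illustrative assumptions.

```python
import random

def james_stein(x):
    """James-Stein estimator: scale x by 1 - (p - 2) / ||x||^2."""
    p = len(x)
    shrink = 1.0 - (p - 2) / sum(v * v for v in x)
    return [shrink * v for v in x]

def positive_part_james_stein(x):
    """Positive-part variant: the shrinkage factor is truncated at 0."""
    p = len(x)
    shrink = max(1.0 - (p - 2) / sum(v * v for v in x), 0.0)
    return [shrink * v for v in x]

def squared_error(theta, d):
    return sum((di - ti) ** 2 for di, ti in zip(d, theta))

def mc_risks(theta, reps, seed):
    """Monte Carlo risks of X, J-S, and positive-part J-S at a fixed theta."""
    rng = random.Random(seed)
    totals = {"x": 0.0, "js": 0.0, "js_plus": 0.0}
    for _ in range(reps):
        x = [rng.gauss(t, 1.0) for t in theta]   # X_i ~ N(theta_i, 1)
        totals["x"] += squared_error(theta, x)
        totals["js"] += squared_error(theta, james_stein(x))
        totals["js_plus"] += squared_error(theta, positive_part_james_stein(x))
    return {name: total / reps for name, total in totals.items()}

theta = [1.0] * 5                      # illustrative parameter, p = 5
risks = mc_risks(theta, reps=20000, seed=0)

# The risk of X is p = 5 for every theta; both shrinkage estimators
# should come out noticeably smaller at this theta.
print(risks)
```

The gain from shrinkage is largest when $\|\theta\|_2$ is small and fades as $\|\theta\|_2$ grows, so trying a `theta` far from the origin should show the three risks moving closer together.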
More informationDiscrete Mathematics for CS Spring 2005 Clancy/Wagner Notes 21. Some Important Distributions
CS 70 Discrete Mathematics for CS Sprig 2005 Clacy/Wager Notes 21 Some Importat Distributios Questio: A biased coi with Heads probability p is tossed repeatedly util the first Head appears. What is the
More informationDiscrete Mathematics and Probability Theory Summer 2014 James Cook Note 15
CS 70 Discrete Mathematics ad Probability Theory Summer 2014 James Cook Note 15 Some Importat Distributios I this ote we will itroduce three importat probability distributios that are widely used to model
More informationLecture 9: September 19
36-700: Probability ad Mathematical Statistics I Fall 206 Lecturer: Siva Balakrisha Lecture 9: September 9 9. Review ad Outlie Last class we discussed: Statistical estimatio broadly Pot estimatio Bias-Variace
More informationECE 901 Lecture 14: Maximum Likelihood Estimation and Complexity Regularization
ECE 90 Lecture 4: Maximum Likelihood Estimatio ad Complexity Regularizatio R Nowak 5/7/009 Review : Maximum Likelihood Estimatio We have iid observatios draw from a ukow distributio Y i iid p θ, i,, where
More informationNotes 27 : Brownian motion: path properties
Notes 27 : Browia motio: path properties Math 733-734: Theory of Probability Lecturer: Sebastie Roch Refereces:[Dur10, Sectio 8.1], [MP10, Sectio 1.1, 1.2, 1.3]. Recall: DEF 27.1 (Covariace) Let X = (X
More information(A sequence also can be thought of as the list of function values attained for a function f :ℵ X, where f (n) = x n for n 1.) x 1 x N +k x N +4 x 3
MATH 337 Sequeces Dr. Neal, WKU Let X be a metric space with distace fuctio d. We shall defie the geeral cocept of sequece ad limit i a metric space, the apply the results i particular to some special
More informationMAT1026 Calculus II Basic Convergence Tests for Series
MAT026 Calculus II Basic Covergece Tests for Series Egi MERMUT 202.03.08 Dokuz Eylül Uiversity Faculty of Sciece Departmet of Mathematics İzmir/TURKEY Cotets Mootoe Covergece Theorem 2 2 Series of Real
More informationDefinition 4.2. (a) A sequence {x n } in a Banach space X is a basis for X if. unique scalars a n (x) such that x = n. a n (x) x n. (4.
4. BASES I BAACH SPACES 39 4. BASES I BAACH SPACES Sice a Baach space X is a vector space, it must possess a Hamel, or vector space, basis, i.e., a subset {x γ } γ Γ whose fiite liear spa is all of X ad
More informationMASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 21 11/27/2013
MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.265/15.070J Fall 2013 Lecture 21 11/27/2013 Fuctioal Law of Large Numbers. Costructio of the Wieer Measure Cotet. 1. Additioal techical results o weak covergece
More informationAgnostic Learning and Concentration Inequalities
ECE901 Sprig 2004 Statistical Regularizatio ad Learig Theory Lecture: 7 Agostic Learig ad Cocetratio Iequalities Lecturer: Rob Nowak Scribe: Aravid Kailas 1 Itroductio 1.1 Motivatio I the last lecture
More informationElement sampling: Part 2
Chapter 4 Elemet samplig: Part 2 4.1 Itroductio We ow cosider uequal probability samplig desigs which is very popular i practice. I the uequal probability samplig, we ca improve the efficiecy of the resultig
More informationDistribution of Random Samples & Limit theorems
STAT/MATH 395 A - PROBABILITY II UW Witer Quarter 2017 Néhémy Lim Distributio of Radom Samples & Limit theorems 1 Distributio of i.i.d. Samples Motivatig example. Assume that the goal of a study is to
More informationMachine Learning Brett Bernstein
Machie Learig Brett Berstei Week Lecture: Cocept Check Exercises Starred problems are optioal. Statistical Learig Theory. Suppose A = Y = R ad X is some other set. Furthermore, assume P X Y is a discrete
More informationUniversity of Colorado Denver Dept. Math. & Stat. Sciences Applied Analysis Preliminary Exam 13 January 2012, 10:00 am 2:00 pm. Good luck!
Uiversity of Colorado Dever Dept. Math. & Stat. Scieces Applied Aalysis Prelimiary Exam 13 Jauary 01, 10:00 am :00 pm Name: The proctor will let you read the followig coditios before the exam begis, ad
More informationSTAT331. Example of Martingale CLT with Cox s Model
STAT33 Example of Martigale CLT with Cox s Model I this uit we illustrate the Martigale Cetral Limit Theorem by applyig it to the partial likelihood score fuctio from Cox s model. For simplicity of presetatio
More informationLecture Notes 15 Hypothesis Testing (Chapter 10)
1 Itroductio Lecture Notes 15 Hypothesis Testig Chapter 10) Let X 1,..., X p θ x). Suppose we we wat to kow if θ = θ 0 or ot, where θ 0 is a specific value of θ. For example, if we are flippig a coi, we
More informationSingular Continuous Measures by Michael Pejic 5/14/10
Sigular Cotiuous Measures by Michael Peic 5/4/0 Prelimiaries Give a set X, a σ-algebra o X is a collectio of subsets of X that cotais X ad ad is closed uder complemetatio ad coutable uios hece, coutable
More informationNotes 5 : More on the a.s. convergence of sums
Notes 5 : More o the a.s. covergece of sums Math 733-734: Theory of Probability Lecturer: Sebastie Roch Refereces: Dur0, Sectios.5; Wil9, Sectio 4.7, Shi96, Sectio IV.4, Dur0, Sectio.. Radom series. Three-series
More informationCSE 527, Additional notes on MLE & EM
CSE 57 Lecture Notes: MLE & EM CSE 57, Additioal otes o MLE & EM Based o earlier otes by C. Grat & M. Narasimha Itroductio Last lecture we bega a examiatio of model based clusterig. This lecture will be
More information4.3 Growth Rates of Solutions to Recurrences
4.3. GROWTH RATES OF SOLUTIONS TO RECURRENCES 81 4.3 Growth Rates of Solutios to Recurreces 4.3.1 Divide ad Coquer Algorithms Oe of the most basic ad powerful algorithmic techiques is divide ad coquer.
More information2.1. The Algebraic and Order Properties of R Definition. A binary operation on a set F is a function B : F F! F.
CHAPTER 2 The Real Numbers 2.. The Algebraic ad Order Properties of R Defiitio. A biary operatio o a set F is a fuctio B : F F! F. For the biary operatios of + ad, we replace B(a, b) by a + b ad a b, respectively.
More informationBertrand s Postulate
Bertrad s Postulate Lola Thompso Ross Program July 3, 2009 Lola Thompso (Ross Program Bertrad s Postulate July 3, 2009 1 / 33 Bertrad s Postulate I ve said it oce ad I ll say it agai: There s always a
More informationMaximum Likelihood Estimation and Complexity Regularization
ECE90 Sprig 004 Statistical Regularizatio ad Learig Theory Lecture: 4 Maximum Likelihood Estimatio ad Complexity Regularizatio Lecturer: Rob Nowak Scribe: Pam Limpiti Review : Maximum Likelihood Estimatio
More informationAda Boost, Risk Bounds, Concentration Inequalities. 1 AdaBoost and Estimates of Conditional Probabilities
CS8B/Stat4B Sprig 008) Statistical Learig Theory Lecture: Ada Boost, Risk Bouds, Cocetratio Iequalities Lecturer: Peter Bartlett Scribe: Subhrasu Maji AdaBoost ad Estimates of Coditioal Probabilities We
More informationECE 8527: Introduction to Machine Learning and Pattern Recognition Midterm # 1. Vaishali Amin Fall, 2015
ECE 8527: Itroductio to Machie Learig ad Patter Recogitio Midterm # 1 Vaishali Ami Fall, 2015 tue39624@temple.edu Problem No. 1: Cosider a two-class discrete distributio problem: ω 1 :{[0,0], [2,0], [2,2],
More informationEXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY
EXAMINATIONS OF THE ROYAL STATISTICAL SOCIETY GRADUATE DIPLOMA, 016 MODULE : Statistical Iferece Time allowed: Three hours Cadidates should aswer FIVE questios. All questios carry equal marks. The umber
More informationStat410 Probability and Statistics II (F16)
Some Basic Cocepts of Statistical Iferece (Sec 5.) Suppose we have a rv X that has a pdf/pmf deoted by f(x; θ) or p(x; θ), where θ is called the parameter. I previous lectures, we focus o probability problems
More informationMATH301 Real Analysis (2008 Fall) Tutorial Note #7. k=1 f k (x) converges pointwise to S(x) on E if and
MATH01 Real Aalysis (2008 Fall) Tutorial Note #7 Sequece ad Series of fuctio 1: Poitwise Covergece ad Uiform Covergece Part I: Poitwise Covergece Defiitio of poitwise covergece: A sequece of fuctios f
More informationBayesian Methods: Introduction to Multi-parameter Models
Bayesia Methods: Itroductio to Multi-parameter Models Parameter: θ = ( θ, θ) Give Likelihood p(y θ) ad prior p(θ ), the posterior p proportioal to p(y θ) x p(θ ) Margial posterior ( θ, θ y) is Iterested
More informationSection 11.8: Power Series
Sectio 11.8: Power Series 1. Power Series I this sectio, we cosider geeralizig the cocept of a series. Recall that a series is a ifiite sum of umbers a. We ca talk about whether or ot it coverges ad i
More information