Nonparametric estimation of conditional distributions


László Györfi (1) and Michael Kohler (2)

(1) Department of Computer Science and Information Theory, Budapest University of Technology and Economics, 1521 Stoczek, U.2, Budapest, Hungary, gyorfi@szit.bme.hu

(2) Fachrichtung 6.1-Mathematik, Universität des Saarlandes, Postfach, D Saarbrücken, Germany, kohler@math.uni-sb.de

November 9, 2005

Abstract

Estimation of conditional distributions is considered. It is assumed that the conditional distribution is either discrete or that it has a density with respect to the Lebesgue-Borel measure. Partitioning estimates of the conditional distribution are constructed, and results concerning consistency and rate of convergence of the integrated total variation error of the estimates are presented.

Key words and phrases: conditional density, conditional distribution, confidence sets, partitioning estimate, Poisson regression, rate of convergence, universal consistency.

Running title: Estimation of conditional distributions

Please send correspondence and proofs to: Michael Kohler, Fachrichtung 6.1-Mathematik, Universität des Saarlandes, Postfach, D Saarbrücken, Germany, kohler@math.uni-sb.de

1 Introduction

One of the main tasks in statistics is to estimate a distribution from a given sample. Let $\mu$ be a probability distribution on $\mathbb{R}^d$ and let $X_1, X_2, \dots$ be independent and identically distributed random variables with distribution $\mu$. A simple but powerful estimate of $\mu$ is the empirical distribution

$$\mu_n(A) = \frac{1}{n} \sum_{i=1}^n I_A(X_i),$$

where $I_A$ denotes the indicator function of the set $A$. By the strong law of large numbers we have

$$\mu_n(A) \to \mu(A) \quad \text{a.s.} \tag{1}$$

for each Borel set $A$. If we want to make statistical inference about $\mu$, it is not enough to have (1) for each set individually; instead we need convergence of $\mu_n$ to $\mu$ uniformly over classes of sets. By the Glivenko-Cantelli theorem the empirical distribution satisfies

$$\sup_{x \in \mathbb{R}^d} \left| \mu_n((-\infty, x]) - \mu((-\infty, x]) \right| \to 0 \quad \text{a.s.}, \tag{2}$$

where $(-\infty, x] = (-\infty, x^{(1)}] \times \dots \times (-\infty, x^{(d)}]$ for $x = (x^{(1)}, \dots, x^{(d)}) \in \mathbb{R}^d$. This suffices if we want to make statistical inference about intervals, but for more general investigations it would be much better to control the error in total variation, defined as

$$\sup_{A \in \mathcal{B}_d} \left| \mu_n(A) - \mu(A) \right|, \tag{3}$$

where $\mathcal{B}_d$ denotes the Borel sets in $\mathbb{R}^d$. Clearly, for the empirical distribution the error (3) does not converge to zero in general, since if $\mu$ has a continuous distribution function we have $\mu(\{X_1, \dots, X_n\}) = 0$ and $\mu_n(\{X_1, \dots, X_n\}) = 1$. If we are able to construct estimates $\hat\mu_n$ of $\mu$ such that

$$\sup_{A \in \mathcal{B}_d} \left| \hat\mu_n(A) - \mu(A) \right| \to 0 \quad \text{a.s.}, \tag{4}$$

then it is easy to construct confidence sets $\hat A_n$ for the values of $X_1$ such that they have asymptotically level $\alpha$ for given $\alpha \in (0, 1)$, i.e. such that

$$\liminf_{n \to \infty} \mu(\hat A_n) \ge 1 - \alpha \quad \text{a.s.}$$

Indeed, any set $\hat A_n$ with $\hat\mu_n(\hat A_n) \ge 1 - \alpha$ has this property since

$$\mu(\hat A_n) = \hat\mu_n(\hat A_n) - \left( \hat\mu_n(\hat A_n) - \mu(\hat A_n) \right) \ge 1 - \alpha - \sup_{A \in \mathcal{B}_d} \left| \hat\mu_n(A) - \mu(A) \right|.$$

Unfortunately, as was shown in Devroye and Györfi (1990), it is impossible to construct estimates $\hat\mu_n$ such that (4) holds for all distributions $\mu$. However, it follows from Barron et al. (1992) that if we restrict ourselves to distributions whose nonatomic part is absolutely continuous with respect to a known dominating measure, it is possible to construct estimates such that (4) holds for all such distributions. Special cases include discrete measures (where we assume for notational convenience that $\mu(\mathbb{N}_0) = 1$) and measures which have a density with respect to the Lebesgue-Borel measure. By Scheffé's theorem it suffices in these cases to construct estimates $(\hat\mu_n(\{k\}))_{k \in \mathbb{N}_0}$ of $(\mu(\{k\}))_{k \in \mathbb{N}_0}$ and estimates $\hat f_n$ of the density $f$ of $\mu$, resp., which satisfy

$$\sum_{k=0}^{\infty} \left| \hat\mu_n(\{k\}) - \mu(\{k\}) \right| \to 0 \quad \text{a.s.} \tag{5}$$

and

$$\int \left| \hat f_n(x) - f(x) \right| \lambda(dx) \to 0 \quad \text{a.s.}, \tag{6}$$

where $\lambda$ denotes the Lebesgue-Borel measure. Here one estimates $\mu(A)$ by

$$\hat\mu_n(A) = \sum_{k \in \mathbb{N}_0 \cap A} \hat\mu_n(\{k\}) \quad \text{and} \quad \hat\mu_n(A) = \int_A \hat f_n(x)\, dx,$$

resp.
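The confidence-set construction above only requires some set whose estimated mass is at least $1 - \alpha$. The following minimal Python sketch illustrates this for the discrete case, under two assumptions that are not taken from the paper: the relative frequencies are used as the estimate of $\mu(\{k\})$, and the set is built greedily from the most probable points. The function names `empirical_pmf` and `confidence_set` are hypothetical.

```python
from collections import Counter


def empirical_pmf(sample):
    """Relative frequencies of the observed values (assumed estimate of mu({k}))."""
    n = len(sample)
    return {k: c / n for k, c in Counter(sample).items()}


def confidence_set(pmf_hat, alpha):
    """Greedily add the most probable points until the estimated mass is >= 1 - alpha.

    The greedy choice is an illustration only; any set with estimated mass
    at least 1 - alpha qualifies in the argument above.
    """
    chosen, mass = set(), 0.0
    # Sort by decreasing estimated probability, breaking ties by value.
    for k, p in sorted(pmf_hat.items(), key=lambda kv: (-kv[1], kv[0])):
        if mass >= 1.0 - alpha:
            break
        chosen.add(k)
        mass += p
    return chosen, mass


sample = [0, 1, 1, 2, 1, 0, 3, 1, 2, 1]
pmf_hat = empirical_pmf(sample)
conf_set, mass = confidence_set(pmf_hat, alpha=0.2)
print(conf_set, mass)
```

A greedy set of this kind tends to be small for the given level, but the asymptotic level guarantee in the text does not depend on how the set is chosen.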

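Many estimates which satisfy (6) universally for all densities are constructed in Devroye and Györfi (1985a). In this paper we want to apply the above ideas in the regression context. Here we are given independent and identically distributed random vectors $(X, Y), (X_1, Y_1), \dots$ with values in $\mathbb{R}^d \times \mathbb{R}^{d'}$. Given the sample $D_n = \{(X_1, Y_1), \dots, (X_n, Y_n)\}$ of the distribution of $(X, Y)$ we want to construct estimates $\hat P_n\{\cdot \mid x\}$ of the conditional distribution $\mathbf{P}\{Y \in \cdot \mid X = x\}$ of $Y$ given $X$ such that

$$\int \sup_{B \in \mathcal{B}_{d'}} \left| \hat P_n\{B \mid x\} - \mathbf{P}\{Y \in B \mid X = x\} \right| \mu(dx) \to 0 \quad \text{a.s.}, \tag{7}$$

where $\mu$ denotes again the distribution of $X$. In contrast to standard regression, where $d' = 1$ and where only the mean $\mathbf{E}\{Y \mid X = x\}$ of the conditional distribution is estimated (cf., e.g., Györfi et al. (2002)), we can use estimates with the property (7) not only for prediction of the value of $Y$ for a given value of $X$, but also to construct confidence regions for the value of $Y$ given the value of $X$. Indeed, similarly as above one gets that (7) implies that any set $C_n(x)$ with $\hat P_n\{C_n(x) \mid x\} \ge 1 - \alpha$ satisfies

$$\liminf_{n \to \infty} \mathbf{P}\{Y \in C_n(X) \mid D_n\} \ge 1 - \alpha \quad \text{a.s.},$$

since we have with $\mathbf{P}_n\{\cdot\} = \mathbf{P}\{\cdot \mid D_n\}$

$$\begin{aligned}
\mathbf{P}\{Y \in C_n(X) \mid D_n\}
&= \int \mathbf{P}_n\{Y \in C_n(x) \mid X = x\}\, \mu(dx) \\
&\ge \int \hat P_n\{C_n(x) \mid x\}\, \mu(dx) - \int \left| \hat P_n\{C_n(x) \mid x\} - \mathbf{P}_n\{Y \in C_n(x) \mid X = x\} \right| \mu(dx) \\
&\ge 1 - \alpha - \int \sup_{B \in \mathcal{B}_{d'}} \left| \hat P_n\{B \mid x\} - \mathbf{P}\{Y \in B \mid X = x\} \right| \mu(dx).
\end{aligned}$$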

In order to construct estimates with the property (7), we consider two special cases. In the first case the conditional distribution of $Y$ given $X$ is discrete (and for notational convenience we assume again that the support is contained in $\mathbb{N}_0$). In the second case the conditional distribution of $Y$ given $X = x$ has a density $f(\cdot \mid x)$ with respect to the Lebesgue-Borel measure. In both cases Scheffé's theorem implies that in order to have (7) we have to construct estimates $\hat P_n\{k \mid x\}$ of $\mathbf{P}\{Y = k \mid X = x\}$ and $f_n(\cdot \mid x)$ of $f(\cdot \mid x)$ such that

$$\int \sum_{k=0}^{\infty} \left| \hat P_n\{k \mid x\} - \mathbf{P}\{Y = k \mid X = x\} \right| \mu(dx) \to 0 \quad \text{a.s.} \tag{8}$$

and

$$\int\!\!\int \left| f_n(y \mid x) - f(y \mid x) \right| \lambda(dy)\, \mu(dx) \to 0 \quad \text{a.s.}, \tag{9}$$

resp. In order to construct in the first case estimates with the property (8) we use two different approaches. In the first approach we consider for each $y \in \mathbb{N}_0$

$$\mathbf{P}\{Y = y \mid X = x\} = \mathbf{E}\{I_{\{Y = y\}} \mid X = x\}$$

as a regression function and estimate it by applying a partitioning estimate to a sample of $(X, I_{\{Y = y\}})$. In the second approach we consider Poisson regression, i.e., we make a parametric assumption on the way the conditional distribution of $Y$ given $X = x$ depends on $m(x)$ and assume that

$$\mathbf{P}\{Y = y \mid X = x\} = \frac{m(x)^y}{y!}\, e^{-m(x)} \quad (y \in \mathbb{N}_0)$$

for some $m : \mathbb{R}^d \to (0, \infty)$, where $m$ is completely unknown. In this case we estimate $m(x) = \mathbf{E}\{Y \mid X = x\}$ by a partitioning estimate $m_n(x)$ applied to a sample of $(X, Y)$, and consider the plug-in estimate

$$\hat P_n\{Y = y \mid X = x\} = \frac{m_n(x)^y}{y!}\, e^{-m_n(x)} \quad (y \in \mathbb{N}_0).$$
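The first approach, regressing the indicator $I_{\{Y = y\}}$ on $X$ with a partitioning estimate, can be sketched in a few lines of Python. This is an illustration only, under assumptions not fixed by the text: $d = 1$, a partition into intervals of width `h` anchored at the origin, and the value 0 on empty cells. The names `fit_partition_counts` and `cond_prob` are hypothetical.

```python
import math
from collections import defaultdict


def fit_partition_counts(xs, ys, h):
    """For each cell [j*h, (j+1)*h), count how often each label y occurs."""
    counts = defaultdict(lambda: defaultdict(int))
    for x, y in zip(xs, ys):
        counts[math.floor(x / h)][y] += 1
    return counts


def cond_prob(counts, h, x, y):
    """Partitioning estimate of P{Y = y | X = x}: the relative frequency of
    label y among the sample points falling into the cell of x
    (assumed to be 0 when that cell is empty)."""
    cell = counts.get(math.floor(x / h))
    if not cell:
        return 0.0
    return cell.get(y, 0) / sum(cell.values())


xs = [0.1, 0.2, 0.3, 0.6, 0.7, 0.9]
ys = [0, 1, 1, 2, 2, 0]
counts = fit_partition_counts(xs, ys, h=0.5)
print(cond_prob(counts, 0.5, 0.25, 1))  # frequency of label 1 in the cell [0, 0.5)
```

On every non-empty cell the estimated probabilities sum to one over $y$, so the estimate is itself a discrete distribution there.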

In both approaches we present results concerning universal consistency, i.e. we show (8) for all corresponding discrete conditional distributions, and we analyze the rate of convergence of the estimates. Estimates of the conditional density in the second case are defined as suitable partitioning estimates. We present results concerning universal consistency, i.e., we show (9) for all conditional distributions with density, and we analyze the rate of convergence under regularity assumptions on the smoothness of the conditional density.

The paper is organized as follows: our main results concerning estimation of discrete conditional distributions and conditional densities are described in Sections 2 and 3, resp. The proofs are given in Section 4.

2 The estimation of discrete conditional distributions

In this section we study partitioning estimates of discrete conditional distributions. In our first two theorems each conditional probability $\mathbf{P}\{Y = y \mid X = x\}$ is estimated separately. We have the following result concerning consistency of the estimate.

Theorem 1 Let $\mathcal{P}_n = \{A_{n,j} : j\}$ be a partition of $\mathbb{R}^d$ and for $x \in \mathbb{R}^d$ denote by $A_n(x)$ that cell $A_{n,j}$ of $\mathcal{P}_n$ that contains $x$. Let

$$\hat P_n\{y \mid x\} = \frac{\sum_{i=1}^n I_{A_n(x)}(X_i)\, I_{\{Y_i = y\}}}{\sum_{j=1}^n I_{A_n(x)}(X_j)}$$

be the partitioning estimate of $\mathbf{P}\{Y = y \mid X = x\}$. Assume that the underlying partition $\mathcal{P}_n = \{A_{n,j} : j\}$ satisfies for each sphere $S$ centered at the origin

$$\lim_{n \to \infty} \max_{j :\, A_{n,j} \cap S \ne \emptyset} \operatorname{diam}(A_{n,j}) = 0 \tag{10}$$

and

$$\lim_{n \to \infty} \frac{\left| \{ j : A_{n,j} \cap S \ne \emptyset \} \right|}{n} = 0, \tag{11}$$

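where $\operatorname{diam}(A)$ denotes the diameter of the set $A$. Then

$$\sum_{y=0}^{\infty} \int \left| \hat P_n\{y \mid x\} - \mathbf{P}\{Y = y \mid X = x\} \right| \mu(dx) \to 0 \quad \text{a.s.}$$

Next we consider the rate of convergence of the above partitioning estimate. It is well known that in order to derive nontrivial rate of convergence results in nonparametric regression one needs smoothness assumptions on the underlying regression function (cf. Devroye (1982)). In our next result we assume that the conditional probabilities are locally Lipschitz continuous, such that the integral over the sum of the Lipschitz constants is finite.

Theorem 2 Assume $X$ is bounded a.s.,

$$\left| \mathbf{P}\{Y = y \mid X = x\} - \mathbf{P}\{Y = y \mid X = z\} \right| \le C_y(x)\, \|x - z\|$$

for all $x, z$ from the bounded support of $X$ and for some local Lipschitz constants $C_y(x)$ satisfying

$$\sum_{y=0}^{\infty} \int C_y(x)\, \mu(dx) = C < \infty,$$

and assume

$$\sum_{y=0}^{\infty} \sqrt{\mathbf{P}\{Y = y\}} < \infty.$$

Let $\hat P_n\{y \mid x\}$ be the partitioning estimate of $\mathbf{P}\{Y = y \mid X = x\}$ with respect to a partition of $\mathbb{R}^d$ consisting of cubes with side-length $h_n$. Then

$$\sum_{y=0}^{\infty} \mathbf{E} \int \left| \hat P_n\{y \mid x\} - \mathbf{P}\{Y = y \mid X = x\} \right| \mu(dx) \le c_1\, \frac{1 + \sum_{y=0}^{\infty} \sqrt{\mathbf{P}\{Y = y\}}}{\sqrt{n h_n^d}} + \sqrt{d}\, C\, h_n,$$

so for

$$h_n = c_2\, n^{-1/(d+2)}$$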

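we get

$$\sum_{y=0}^{\infty} \mathbf{E} \int \left| \hat P_n\{y \mid x\} - \mathbf{P}\{Y = y \mid X = x\} \right| \mu(dx) \le c_3\, n^{-\frac{1}{d+2}}.$$

In the next theorem we consider Poisson regression. Here the conditional distribution of $Y$ given $X$ is given by

$$\mathbf{P}\{Y = y \mid X = x\} = \frac{m(x)^y}{y!}\, e^{-m(x)} \quad (y \in \mathbb{N}_0)$$

for some $m : \mathbb{R}^d \to (0, \infty)$. Because of $m(x) = \mathbf{E}\{Y \mid X = x\}$ we can estimate it by applying a partitioning estimate to $D_n$ and use a plug-in estimate

$$\hat P_n\{y \mid x\} = \frac{m_n(x)^y}{y!}\, e^{-m_n(x)} \quad (y \in \mathbb{N}_0)$$

to estimate the conditional distribution of $Y$ given $X$. For this estimate we have the following result.

Theorem 3 Assume that $\mathbf{E}\{Y\} < \infty$ and

$$\mathbf{P}\{Y = y \mid X = x\} = \frac{m(x)^y}{y!}\, e^{-m(x)} \quad (y \in \mathbb{N}_0)$$

for some $m : \mathbb{R}^d \to (0, \infty)$. Let

$$m_n(x) = \begin{cases} \dfrac{\sum_{i=1}^n I_{A_n(x)}(X_i)\, Y_i}{\sum_{i=1}^n I_{A_n(x)}(X_i)} & \text{if } \sum_{i=1}^n I_{A_n(x)}(X_i) > \log n, \\[2ex] 0 & \text{otherwise} \end{cases}$$

be the (modified) partitioning estimate of $m$ with partition $\mathcal{P}_n = \{A_{n,j} : j\}$ and set

$$\hat P_n\{y \mid x\} = \frac{m_n(x)^y}{y!}\, e^{-m_n(x)} \quad (y \in \mathbb{N}_0).$$

a) Assume that the underlying partition $\mathcal{P}_n$ satisfies (10) and for each sphere $S$ centered at the origin

$$\lim_{n \to \infty} \frac{\left| \{ j : A_{n,j} \cap S \ne \emptyset \} \right| \log n}{n} = 0. \tag{12}$$

Then

$$\sum_{y=0}^{\infty} \int \left| \hat P_n\{y \mid x\} - \mathbf{P}\{Y = y \mid X = x\} \right| \mu(dx) \to 0 \quad \text{a.s.}$$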

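b) Assume $X$ is bounded a.s. and assume that $\mathbf{E}\{Y^2\} < \infty$ and $m$ is Lipschitz continuous, i.e.

$$|m(x) - m(z)| \le C\, \|x - z\|$$

for some constant $C \in \mathbb{R}_+$. Choose the underlying partition such that it consists of cubes of side-length $h_n$. Then

$$\sum_{y=0}^{\infty} \mathbf{E} \int \left| \hat P_n\{y \mid x\} - \mathbf{P}\{Y = y \mid X = x\} \right| \mu(dx) \le \frac{c_4}{\sqrt{n h_n^d}} + c_5\, h_n,$$

so for

$$h_n = c_6\, n^{-1/(d+2)}$$

we get

$$\sum_{y=0}^{\infty} \mathbf{E} \int \left| \hat P_n\{y \mid x\} - \mathbf{P}\{Y = y \mid X = x\} \right| \mu(dx) \le c_7\, n^{-\frac{1}{d+2}}.$$

Remark 1. Assume that the assumptions of Theorem 3 b) hold, and let $A$ be an upper bound on $m$ on the bounded support of $X$. The function $f(u) = u^y e^{-u} / y!$ satisfies for $u \in [0, A]$

$$|f'(u)| = \left| \frac{y\, u^{y-1}}{y!}\, e^{-u} - \frac{u^y}{y!}\, e^{-u} \right| \le \frac{(A + 1)^{y-1}}{(y - 1)!},$$

so by boundedness of the Lipschitz continuous regression function $m$ we get for $y > 0$

$$\left| \mathbf{P}\{Y = y \mid X = x\} - \mathbf{P}\{Y = y \mid X = z\} \right| \le \frac{(A + 1)^{y-1}}{(y - 1)!}\, C\, \|x - z\|.$$

This implies that the conditional probabilities are Lipschitz continuous and that the integral over the sum of the Lipschitz constants is bounded by

$$\left( 1 + \sum_{y=1}^{\infty} \frac{(A + 1)^{y-1}}{(y - 1)!} \right) C = \left( 1 + e^{A + 1} \right) C,$$

hence under the assumptions of Theorem 3 b) the estimate in Theorem 2 achieves the same rate of convergence, although it does not depend on the particular form of the conditional distribution.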
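To make the estimate of Theorem 3 concrete, here is a minimal Python sketch of the modified partitioning estimate $m_n$ (cell average, set to 0 unless the cell holds more than $\log n$ points) and the Poisson plug-in built from it. The setting $d = 1$ with intervals of width `h` anchored at the origin is an assumption made for illustration, and the names `fit_cells`, `m_hat` and `poisson_plugin` are hypothetical.

```python
import math
from collections import defaultdict


def fit_cells(xs, ys, h):
    """Per cell [j*h, (j+1)*h): number of sample points and the sum of their Y values."""
    cells = defaultdict(lambda: [0, 0.0])
    for x, y in zip(xs, ys):
        cell = cells[math.floor(x / h)]
        cell[0] += 1
        cell[1] += y
    return cells


def m_hat(cells, n, h, x):
    """Modified partitioning estimate: the cell average of Y if the cell of x
    contains more than log(n) sample points, and 0 otherwise."""
    count, total = cells.get(math.floor(x / h), (0, 0.0))
    return total / count if count > math.log(n) else 0.0


def poisson_plugin(cells, n, h, x, y):
    """Plug-in estimate of P{Y = y | X = x}: Poisson probability with mean m_hat(x)."""
    m = m_hat(cells, n, h, x)
    return m ** y / math.factorial(y) * math.exp(-m)


xs = [0.1, 0.2, 0.3, 0.4, 0.45, 0.6, 0.7, 1.2]
ys = [1, 2, 3, 2, 2, 0, 1, 5]
n = len(xs)
cells = fit_cells(xs, ys, h=0.5)
print(m_hat(cells, n, 0.5, 0.25))              # average of Y over the five points in [0, 0.5)
print(poisson_plugin(cells, n, 0.5, 0.25, 0))  # exp(-m_hat) at x = 0.25
```

In sparsely populated cells the estimate falls back to $m_n(x) = 0$, so the plug-in distribution there puts all its mass on $y = 0$.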

Remark 2. Under more restrictive regularity assumptions on the underlying distribution, consistency of a localized log-likelihood Poisson regression estimate was shown in Kohler and Krzyżak (2005).

3 The estimation of conditional densities

In this section assume that $Y$ takes values in $\mathbb{R}^{d'}$. Our aim is to estimate the conditional distribution of $Y$ given $X$ consistently in total variation. We assume that $Y$ has an absolutely continuous distribution, and the conditional density of $Y$ given $X$ is denoted by $f(y \mid x)$. For estimating $f(y \mid x)$, introduce a histogram estimate. Let $\mathcal{Q}_n = \{B_{n,j} : j\}$ be a partition of $\mathbb{R}^{d'}$ such that the Lebesgue measure $\lambda$ of each cell is positive and finite. Let $B_n(y)$ be the cell of $\mathcal{Q}_n$ into which $y$ falls. As before let $\mathcal{P}_n = \{A_{n,j} : j\}$ be a partition of $\mathbb{R}^d$ and denote the cell into which $x$ falls by $A_n(x)$. Put

$$\nu_n(A, B) = \frac{1}{n} \sum_{i=1}^n I_{\{X_i \in A,\, Y_i \in B\}},$$

then the histogram estimate is as follows:

$$f_n(y \mid x) = \frac{\nu_n(A_n(x), B_n(y))}{\mu_n(A_n(x))\, \lambda(B_n(y))}.$$

We will use the following conditions: assume that for each sphere $S$ centered at the origin we have

$$\lim_{n \to \infty} \max_{j :\, B_{n,j} \cap S \ne \emptyset} \operatorname{diam}(B_{n,j}) = 0 \tag{13}$$

and

$$\lim_{n \to \infty} \frac{\left| \{ j : B_{n,j} \cap S \ne \emptyset \} \right|}{n} = 0. \tag{14}$$

The next theorem extends the density-free strong consistency result of Abou-Jaoude (1976) to conditional density estimation.
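The histogram estimate $f_n(y \mid x)$ defined above can be sketched in Python as follows. The sketch assumes $d = d' = 1$, interval partitions of widths `h` (for $x$) and `H` (for $y$) anchored at the origin, so that $\lambda(B_n(y)) = H$, and the value 0 on empty $x$-cells; these choices and the names `fit_histogram` and `cond_density` are illustrative assumptions, not part of the paper.

```python
import math
from collections import defaultdict


def fit_histogram(xs, ys, h, H):
    """Cell counts: joint[(a, b)] = n * nu_n(A, B) and marg[a] = n * mu_n(A)."""
    joint = defaultdict(int)
    marg = defaultdict(int)
    for x, y in zip(xs, ys):
        a, b = math.floor(x / h), math.floor(y / H)
        joint[(a, b)] += 1
        marg[a] += 1
    return joint, marg


def cond_density(joint, marg, h, H, x, y):
    """Histogram estimate nu_n(A_n(x), B_n(y)) / (mu_n(A_n(x)) * lambda(B_n(y))),
    with lambda(B_n(y)) = H in one dimension (assumed 0 on empty x-cells)."""
    a, b = math.floor(x / h), math.floor(y / H)
    if marg.get(a, 0) == 0:
        return 0.0
    return joint.get((a, b), 0) / (marg[a] * H)


xs = [0.1, 0.2, 0.3, 0.4, 0.6]
ys = [0.05, 0.15, 0.30, 0.35, 0.9]
joint, marg = fit_histogram(xs, ys, h=0.5, H=0.25)
print(cond_density(joint, marg, 0.5, 0.25, 0.25, 0.1))
```

The factor $1/n$ cancels between $\nu_n$ and $\mu_n$, which is why raw counts suffice; for every non-empty $x$-cell the estimate integrates to one in $y$.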

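Theorem 4 Assume that the partitions $\mathcal{P}_n$ and $\mathcal{Q}_n$ satisfy (10), (11), (13) and (14), resp. Then

$$\int\!\!\int \left| f_n(y \mid x) - f(y \mid x) \right| \lambda(dy)\, \mu(dx) \to 0 \quad \text{a.s.}$$

Devroye and Györfi (1985a), and Beirlant and Györfi (1998) calculated the rate of convergence of the expected $L_1$ error of the histogram. Next we extend these results to the estimates of conditional densities.

Theorem 5 Assume $X$ and $Y$ are bounded a.s., and

$$|f(u \mid x) - f(y \mid x)| \le C_1(x)\, \|u - y\| \quad \text{and} \quad |f(y \mid z) - f(y \mid x)| \le C_2(y)\, \|x - z\|$$

for all $x, z$ from the bounded support of $X$ and for all $y, u$ from the bounded support of $Y$, such that

$$\int C_1(z)\, \mu(dz) < \infty \quad \text{and} \quad \int C_2(y)\, \lambda(dy) < \infty.$$

Let $f_n(y \mid x)$ be the histogram estimate of $f(y \mid x)$ with respect to partitions $\mathcal{P}_n$ and $\mathcal{Q}_n$ consisting of cubes with side-lengths $h_n$ and $H_n$, resp. Then

$$\mathbf{E} \int\!\!\int \left| f_n(y \mid x) - f(y \mid x) \right| \lambda(dy)\, \mu(dx) \le \sqrt{\frac{c_8}{n h_n^d}} + \sqrt{\frac{c_9}{n h_n^d H_n^{d'}}} + \sqrt{d}\, c_{10}\, h_n + \sqrt{d'}\, c_{11}\, H_n,$$

so for

$$h_n = c_{12}\, n^{-1/(d+d'+2)} \quad \text{and} \quad H_n = c_{13}\, n^{-1/(d+d'+2)}$$

we get

$$\mathbf{E} \int\!\!\int \left| f_n(y \mid x) - f(y \mid x) \right| \lambda(dy)\, \mu(dx) \le c_{14}\, n^{-\frac{1}{d+d'+2}}.$$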

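4 Proofs

4.1 Proof of Theorem 1

Using

$$|a - b| = 2 (b - a)_+ + (a - b)$$

(where $x_+ = \max\{x, 0\}$) we get

$$\begin{aligned}
\sum_{y=0}^{\infty} \int \left| \hat P_n\{y \mid x\} - \mathbf{P}\{Y = y \mid X = x\} \right| \mu(dx)
&= 2 \sum_{y=0}^{\infty} \int \left( \mathbf{P}\{Y = y \mid X = x\} - \hat P_n\{y \mid x\} \right)_+ \mu(dx) \\
&\quad + \sum_{y=0}^{\infty} \int \hat P_n\{y \mid x\}\, \mu(dx) - \sum_{y=0}^{\infty} \int \mathbf{P}\{Y = y \mid X = x\}\, \mu(dx).
\end{aligned}$$

Using the Cauchy-Schwarz inequality and Theorem 23.1 in Györfi et al. (2002) we get for each fixed $y \in \mathbb{N}_0$

$$\begin{aligned}
\int \left( \mathbf{P}\{Y = y \mid X = x\} - \hat P_n\{y \mid x\} \right)_+ \mu(dx)
&\le \int \left| \hat P_n\{y \mid x\} - \mathbf{P}\{Y = y \mid X = x\} \right| \mu(dx) \\
&\le \sqrt{ \int \left( \hat P_n\{y \mid x\} - \mathbf{P}\{Y = y \mid X = x\} \right)^2 \mu(dx) } \to 0 \quad \text{a.s.},
\end{aligned}$$

which implies, together with the dominated convergence theorem (the integrand on the left is bounded by $\mathbf{P}\{Y = y \mid X = x\}$, whose integrals sum to one over $y$), that the first term on the right-hand side above converges to zero. Concerning the second term we observe

$$\begin{aligned}
\sum_{y=0}^{\infty} \int \hat P_n\{y \mid x\}\, \mu(dx) - \sum_{y=0}^{\infty} \int \mathbf{P}\{Y = y \mid X = x\}\, \mu(dx)
&= \int \sum_{y=0}^{\infty} \frac{\sum_{i=1}^n I_{A_n(x)}(X_i)\, I_{\{Y_i = y\}}}{\sum_{j=1}^n I_{A_n(x)}(X_j)}\, \mu(dx) - \int 1\, \mu(dx) \\
&= \int \left( I_{\{\sum_{j=1}^n I_{A_n(x)}(X_j) > 0\}} - 1 \right) \mu(dx) \\
&= - \sum_{j} I_{\{\sum_{i=1}^n I_{A_{n,j}}(X_i) = 0\}}\, \mu\{A_{n,j}\}.
\end{aligned}$$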

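Together with (11), it implies that

$$\left| \sum_{y=0}^{\infty} \int \hat P_n\{y \mid x\}\, \mu(dx) - \sum_{y=0}^{\infty} \int \mathbf{P}\{Y = y \mid X = x\}\, \mu(dx) \right| \le \sum_{j} \left| \mu\{A_{n,j}\} - \mu_n\{A_{n,j}\} \right| \to 0 \quad \text{a.s.}$$

(cf. Lemma 1 in Devroye and Györfi (1985b) or, with a better constant in the exponential upper bound, cf. the proof of Lemma 23.2 in Györfi et al. (2002)).

4.2 Proof of Theorem 2

In the sequel we use the notation

$$\nu_{n,y}(A) = \frac{1}{n} \sum_{i=1}^n I_{\{Y_i = y,\, X_i \in A\}},$$

and with this notation the partitioning estimate is given by

$$\hat P_n\{y \mid x\} = \frac{\nu_{n,y}(A_n(x))}{\mu_n(A_n(x))}.$$

Thus,

$$\begin{aligned}
\sum_{y=0}^{\infty} \int \left| \hat P_n\{y \mid x\} - \mathbf{P}\{Y = y \mid X = x\} \right| \mu(dx)
&= \sum_{y=0}^{\infty} \int \left| \frac{\nu_{n,y}(A_n(x))}{\mu_n(A_n(x))} - \mathbf{P}\{Y = y \mid X = x\} \right| \mu(dx) \\
&= \sum_{y=0}^{\infty} \sum_{A \in \mathcal{P}_n} \int_A \left| \frac{\nu_{n,y}(A)}{\mu_n(A)} - \mathbf{P}\{Y = y \mid X = x\} \right| \mu(dx) \\
&\le \sum_{y=0}^{\infty} \sum_{A \in \mathcal{P}_n} \int_A \left| \frac{\nu_{n,y}(A)}{\mu_n(A)} - \frac{\nu_{n,y}(A)}{\mu(A)} \right| \mu(dx) \\
&\quad + \sum_{y=0}^{\infty} \sum_{A \in \mathcal{P}_n} \int_A \left| \frac{\nu_{n,y}(A)}{\mu(A)} - \frac{\mathbf{P}\{Y = y, X \in A\}}{\mu(A)} \right| \mu(dx) \\
&\quad + \sum_{y=0}^{\infty} \sum_{A \in \mathcal{P}_n} \int_A \left| \frac{\mathbf{P}\{Y = y, X \in A\}}{\mu(A)} - \mathbf{P}\{Y = y \mid X = x\} \right| \mu(dx)
\end{aligned}$$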

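$$\begin{aligned}
&= \sum_{y=0}^{\infty} \sum_{A \in \mathcal{P}_n} \left| \frac{\nu_{n,y}(A)}{\mu_n(A)} - \frac{\nu_{n,y}(A)}{\mu(A)} \right| \mu(A)
 + \sum_{y=0}^{\infty} \sum_{A \in \mathcal{P}_n} \left| \frac{\nu_{n,y}(A)}{\mu(A)} - \frac{\mathbf{P}\{Y = y, X \in A\}}{\mu(A)} \right| \mu(A) \\
&\quad + \sum_{y=0}^{\infty} \sum_{A \in \mathcal{P}_n} \int_A \left| \frac{\mathbf{P}\{Y = y, X \in A\}}{\mu(A)} - \mathbf{P}\{Y = y \mid X = x\} \right| \mu(dx) \\
&= \sum_{y=0}^{\infty} \sum_{A \in \mathcal{P}_n} \nu_{n,y}(A) \left| \frac{1}{\mu_n(A)} - \frac{1}{\mu(A)} \right| \mu(A)
 + \sum_{y=0}^{\infty} \sum_{A \in \mathcal{P}_n} \left| \nu_{n,y}(A) - \mathbf{P}\{Y = y, X \in A\} \right| \\
&\quad + \sum_{y=0}^{\infty} \sum_{A \in \mathcal{P}_n} \int_A \left| \frac{\mathbf{P}\{Y = y, X \in A\}}{\mu(A)} - \mathbf{P}\{Y = y \mid X = x\} \right| \mu(dx) \\
&\le \sum_{A \in \mathcal{P}_n} \left| \mu_n(A) - \mu(A) \right|
 + \sum_{y=0}^{\infty} \sum_{A \in \mathcal{P}_n} \left| \nu_{n,y}(A) - \mathbf{P}\{Y = y, X \in A\} \right| \\
&\quad + \sum_{y=0}^{\infty} \sum_{A \in \mathcal{P}_n} \int_A \left| \frac{\mathbf{P}\{Y = y, X \in A\}}{\mu(A)} - \mathbf{P}\{Y = y \mid X = x\} \right| \mu(dx),
\end{aligned}$$

where we have used for the last inequality that

$$\sum_{y=0}^{\infty} \nu_{n,y}(A) = \mu_n(A).$$

Since $n \mu_n(A)$ is binomially distributed with parameters $n$ and $\mu(A)$, we get by the Cauchy-Schwarz inequality

$$\sum_{A \in \mathcal{P}_n} \mathbf{E}\{ |\mu_n(A) - \mu(A)| \} \le \sum_{A \in \mathcal{P}_n} \sqrt{ \mathbf{E}\{ (\mu_n(A) - \mu(A))^2 \} } \le \sum_{A \in \mathcal{P}_n} \sqrt{ \frac{\mu(A)}{n} }.$$

By Jensen's inequality we have

$$\left( \frac{a_1 + \dots + a_l}{l} \right)^2 \le \frac{a_1^2 + \dots + a_l^2}{l},$$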

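which implies

$$a_1 + \dots + a_l \le \sqrt{ l \left( a_1^2 + \dots + a_l^2 \right) }.$$

Using this inequality in the sum above for the $c_{15} / h_n^d$ many cells $A \in \mathcal{P}_n$ contained in the bounded support of $X$ (which are the only ones with $\mu(A) \ne 0$) we conclude

$$\sum_{A \in \mathcal{P}_n} \mathbf{E}\{ |\mu_n(A) - \mu(A)| \} \le \sqrt{ \frac{c_{15}}{h_n^d} \sum_{A \in \mathcal{P}_n} \left( \sqrt{\mu(A) / n} \right)^2 } = \sqrt{ \frac{c_{15}}{h_n^d} \sum_{A \in \mathcal{P}_n} \frac{\mu(A)}{n} } \le \sqrt{ \frac{c_{15}}{n h_n^d} }.$$

Similarly we get

$$\begin{aligned}
\sum_{A \in \mathcal{P}_n} \mathbf{E}\{ |\nu_{n,y}(A) - \mathbf{P}\{Y = y, X \in A\}| \}
&\le \sum_{A \in \mathcal{P}_n} \sqrt{ \mathbf{E}\{ (\nu_{n,y}(A) - \mathbf{P}\{Y = y, X \in A\})^2 \} } \\
&\le \sum_{A \in \mathcal{P}_n} \sqrt{ \frac{\mathbf{P}\{Y = y, X \in A\}}{n} } \\
&\le \sqrt{ \frac{c_{15}}{h_n^d} \sum_{A \in \mathcal{P}_n} \frac{\mathbf{P}\{Y = y, X \in A\}}{n} } \\
&= \sqrt{ \frac{c_{15}\, \mathbf{P}\{Y = y\}}{n h_n^d} } = \sqrt{ \frac{c_{15}}{n h_n^d} }\, \sqrt{ \mathbf{P}\{Y = y\} }.
\end{aligned}$$

Finally

$$\begin{aligned}
\sum_{A \in \mathcal{P}_n} \int_A \left| \frac{\mathbf{P}\{Y = y, X \in A\}}{\mu(A)} - \mathbf{P}\{Y = y \mid X = x\} \right| \mu(dx)
&= \sum_{A \in \mathcal{P}_n} \int_A \left| \frac{\int_A \mathbf{P}\{Y = y \mid X = z\}\, \mu(dz)}{\mu(A)} - \frac{\int_A \mathbf{P}\{Y = y \mid X = x\}\, \mu(dz)}{\mu(A)} \right| \mu(dx) \\
&\le \sum_{A \in \mathcal{P}_n} \int_A \frac{\int_A \left| \mathbf{P}\{Y = y \mid X = z\} - \mathbf{P}\{Y = y \mid X = x\} \right| \mu(dz)}{\mu(A)}\, \mu(dx)
\end{aligned}$$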

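$$\le \sum_{A \in \mathcal{P}_n} \int_A C_y(x)\, \operatorname{diam}(A)\, \frac{\mu(A)}{\mu(A)}\, \mu(dx) \le \sqrt{d}\, h_n \int C_y(x)\, \mu(dx),$$

and the sum of the right-hand side over $y$ equals $\sqrt{d}\, h_n\, C$. Summarizing the above results, the assertion follows.

4.3 Proof of Theorem 3

In the proof we will use the following lemma.

Lemma 1 For arbitrary $u, v \in \mathbb{R}_+$ we have

$$\sum_{j=0}^{\infty} \left| \frac{u^j}{j!}\, e^{-u} - \frac{v^j}{j!}\, e^{-v} \right| \le 2\, |u - v|.$$

Proof. W.l.o.g. assume $u < v$. Then

$$\begin{aligned}
\sum_{j=0}^{\infty} \left| \frac{u^j}{j!}\, e^{-u} - \frac{v^j}{j!}\, e^{-v} \right|
&\le \sum_{j=0}^{\infty} \left| \frac{u^j}{j!}\, e^{-u} - \frac{u^j}{j!}\, e^{-v} \right| + \sum_{j=0}^{\infty} \left| \frac{u^j}{j!}\, e^{-v} - \frac{v^j}{j!}\, e^{-v} \right| \\
&= \sum_{j=0}^{\infty} \left( \frac{u^j}{j!}\, e^{-u} - \frac{u^j}{j!}\, e^{-v} \right) + \sum_{j=0}^{\infty} \left( \frac{v^j}{j!}\, e^{-v} - \frac{u^j}{j!}\, e^{-v} \right) \\
&= e^{u} \left( e^{-u} - e^{-v} \right) + e^{v} e^{-v} - e^{u} e^{-v} \\
&= 2 \left( 1 - e^{-(v - u)} \right) \le 2\, |v - u|,
\end{aligned}$$

since $1 + x \le e^x$ $(x \in \mathbb{R})$.

Proof of Theorem 3. Proof of a): By Lemma 1 we get

$$\sum_{y=0}^{\infty} \int \left| \hat P_n\{y \mid x\} - \mathbf{P}\{Y = y \mid X = x\} \right| \mu(dx) = \int \sum_{y=0}^{\infty} \left| \frac{m_n(x)^y}{y!}\, e^{-m_n(x)} - \frac{m(x)^y}{y!}\, e^{-m(x)} \right| \mu(dx)$$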

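$$\le 2 \int |m_n(x) - m(x)|\, \mu(dx) \tag{15}$$

$$\to 0 \quad \text{a.s.}$$

by Györfi (1991) (see also Theorem 23.3 in Györfi et al. (2002)).

Proof of Part b): Using (15),

$$\sum_{y=0}^{\infty} \mathbf{E} \int \left| \hat P_n\{y \mid x\} - \mathbf{P}\{Y = y \mid X = x\} \right| \mu(dx) \le 2 \sqrt{ \mathbf{E}\left\{ \int |m_n(x) - m(x)|^2\, \mu(dx) \right\} } \le \frac{c_4}{\sqrt{n h_n^d}} + c_5\, h_n,$$

where the last step can be done in a similar way as the proof of Theorem 4.3 in Györfi et al. (2002).

4.4 Proof of Theorem 4

Introduce the notation

$$\nu(A, B) = \mathbf{E}\{\nu_n(A, B)\} = \mathbf{P}\{X \in A, Y \in B\},$$

then

$$\begin{aligned}
\int\!\!\int \left| f_n(y \mid x) - f(y \mid x) \right| \lambda(dy)\, \mu(dx)
&= \sum_{A \in \mathcal{P}_n} \sum_{B \in \mathcal{Q}_n} \int_A \int_B \left| \frac{\nu_n(A, B)}{\mu_n(A)\, \lambda(B)} - f(y \mid x) \right| \lambda(dy)\, \mu(dx) \\
&\le \sum_{A \in \mathcal{P}_n} \sum_{B \in \mathcal{Q}_n} \int_A \int_B \left| \frac{\nu_n(A, B)}{\mu_n(A)\, \lambda(B)} - \frac{\nu_n(A, B)}{\mu(A)\, \lambda(B)} \right| \lambda(dy)\, \mu(dx) \\
&\quad + \sum_{A \in \mathcal{P}_n} \sum_{B \in \mathcal{Q}_n} \int_A \int_B \left| \frac{\nu_n(A, B)}{\mu(A)\, \lambda(B)} - \frac{\nu(A, B)}{\mu(A)\, \lambda(B)} \right| \lambda(dy)\, \mu(dx) \\
&\quad + \sum_{A \in \mathcal{P}_n} \sum_{B \in \mathcal{Q}_n} \int_A \int_B \left| \frac{\nu(A, B)}{\mu(A)\, \lambda(B)} - f(y \mid x) \right| \lambda(dy)\, \mu(dx) \\
&= \sum_{A \in \mathcal{P}_n} \sum_{B \in \mathcal{Q}_n} \left| \frac{\nu_n(A, B)}{\mu_n(A)\, \lambda(B)} - \frac{\nu_n(A, B)}{\mu(A)\, \lambda(B)} \right| \mu(A)\, \lambda(B)
\end{aligned}$$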

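$$\begin{aligned}
&\quad + \sum_{A \in \mathcal{P}_n} \sum_{B \in \mathcal{Q}_n} \left| \frac{\nu_n(A, B)}{\mu(A)\, \lambda(B)} - \frac{\nu(A, B)}{\mu(A)\, \lambda(B)} \right| \mu(A)\, \lambda(B) \\
&\quad + \sum_{A \in \mathcal{P}_n} \sum_{B \in \mathcal{Q}_n} \int_A \int_B \left| \frac{\nu(A, B)}{\mu(A)\, \lambda(B)} - f(y \mid x) \right| \lambda(dy)\, \mu(dx),
\end{aligned}$$

therefore

$$\begin{aligned}
\int\!\!\int \left| f_n(y \mid x) - f(y \mid x) \right| \lambda(dy)\, \mu(dx)
&\le \sum_{A \in \mathcal{P}_n} \sum_{B \in \mathcal{Q}_n} \nu_n(A, B) \left| \frac{1}{\mu_n(A)} - \frac{1}{\mu(A)} \right| \mu(A)
 + \sum_{A \in \mathcal{P}_n} \sum_{B \in \mathcal{Q}_n} \left| \nu_n(A, B) - \nu(A, B) \right| \\
&\quad + \sum_{A \in \mathcal{P}_n} \sum_{B \in \mathcal{Q}_n} \int_A \int_B \left| \frac{\nu(A, B)}{\mu(A)\, \lambda(B)} - f(y \mid x) \right| \lambda(dy)\, \mu(dx)
\end{aligned}$$

$$\le \sum_{A \in \mathcal{P}_n} \left| \mu_n(A) - \mu(A) \right| \tag{16}$$

$$+ \sum_{A \in \mathcal{P}_n} \sum_{B \in \mathcal{Q}_n} \left| \nu_n(A, B) - \nu(A, B) \right| \tag{17}$$

$$+ \sum_{A \in \mathcal{P}_n} \sum_{B \in \mathcal{Q}_n} \int_A \int_B \left| \frac{\nu(A, B)}{\mu(A)\, \lambda(B)} - f(y \mid x) \right| \lambda(dy)\, \mu(dx), \tag{18}$$

where we have used for the last inequality that

$$\sum_{B \in \mathcal{Q}_n} \nu_n(A, B) = \mu_n(A).$$

Because of (11), (16) tends to 0 a.s., while (11) and (14) imply that (17) tends to 0 a.s. (cf. Lemma 1 in Devroye and Györfi (1985b)). Concerning the convergence of the bias term (18), introduce the notation

$$\bar f_n(y \mid x) = \frac{\int_{A_n(x)} \int_{B_n(y)} f(u \mid z)\, \lambda(du)\, \mu(dz)}{\mu(A_n(x))\, \lambda(B_n(y))},$$

then

$$\begin{aligned}
&\sum_{A \in \mathcal{P}_n} \sum_{B \in \mathcal{Q}_n} \int_A \int_B \left| \frac{\nu(A, B)}{\mu(A)\, \lambda(B)} - f(y \mid x) \right| \lambda(dy)\, \mu(dx) \\
&\quad = \sum_{A \in \mathcal{P}_n} \sum_{B \in \mathcal{Q}_n} \int_A \int_B \left| \frac{\int_A \int_B f(u \mid z)\, \lambda(du)\, \mu(dz)}{\mu(A)\, \lambda(B)} - f(y \mid x) \right| \lambda(dy)\, \mu(dx)
\end{aligned}$$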

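$$= \int\!\!\int \left| \bar f_n(y \mid x) - f(y \mid x) \right| \lambda(dy)\, \mu(dx) \to 0,$$

because of the conditions (10) and (13). Here $\bar f_n(y \mid x)$ denotes the average of $f$ over the cell $A_n(x) \times B_n(y)$ with respect to $\mu \times \lambda$. This convergence is obvious if $f(y \mid x)$ is continuous and has compact support. In general, we use that $f(y \mid x) \in L_1(\mu \times \lambda)$, and refer to the denseness result that the set of continuous functions in $L_1(\mu \times \lambda)$ with compact support is dense in $L_1(\mu \times \lambda)$ (cf., e.g., Devroye and Györfi (2002)). An alternative technique would be the Lebesgue density theorem (cf., e.g., Lemma 24.5 in Györfi et al. (2002)), which gives pointwise convergence; together with the Scheffé theorem and the dominated convergence theorem this completes the proof.

4.5 Proof of Theorem 5

Because of the proof of Theorem 4,

$$\begin{aligned}
\mathbf{E}\left\{ \int\!\!\int \left| f_n(y \mid x) - f(y \mid x) \right| \lambda(dy)\, \mu(dx) \right\}
&\le \sum_{A \in \mathcal{P}_n} \mathbf{E}\{ |\mu_n(A) - \mu(A)| \} + \sum_{A \in \mathcal{P}_n} \sum_{B \in \mathcal{Q}_n} \mathbf{E}\{ |\nu_n(A, B) - \nu(A, B)| \} \\
&\quad + \int\!\!\int \left| \frac{\nu(A_n(x), B_n(y))}{\mu(A_n(x))\, \lambda(B_n(y))} - f(y \mid x) \right| \lambda(dy)\, \mu(dx).
\end{aligned}$$

According to the proof of Theorem 2, the condition that $X$ is bounded implies that

$$\sum_{A \in \mathcal{P}_n} \mathbf{E}\{ |\mu_n(A) - \mu(A)| \} \le \sqrt{ \frac{c_{15}}{n h_n^d} },$$

and, similarly, using that $X$ and $Y$ are bounded we can show

$$\sum_{A \in \mathcal{P}_n} \sum_{B \in \mathcal{Q}_n} \mathbf{E}\{ |\nu_n(A, B) - \nu(A, B)| \} \le \sqrt{ \frac{c_{16}}{n h_n^d H_n^{d'}} }.$$

Concerning the rate of convergence of the bias term we observe

$$\int\!\!\int \left| \frac{\nu(A_n(x), B_n(y))}{\mu(A_n(x))\, \lambda(B_n(y))} - f(y \mid x) \right| \lambda(dy)\, \mu(dx)$$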

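$$\begin{aligned}
&= \sum_{A \in \mathcal{P}_n} \sum_{B \in \mathcal{Q}_n} \int_A \int_B \left| \frac{\nu(A, B)}{\mu(A)\, \lambda(B)} - f(y \mid x) \right| \lambda(dy)\, \mu(dx) \\
&= \sum_{A \in \mathcal{P}_n} \sum_{B \in \mathcal{Q}_n} \int_A \int_B \left| \frac{\int_A \int_B f(u \mid z)\, \lambda(du)\, \mu(dz)}{\mu(A)\, \lambda(B)} - f(y \mid x) \right| \lambda(dy)\, \mu(dx) \\
&\le \sum_{A \in \mathcal{P}_n} \sum_{B \in \mathcal{Q}_n} \int_A \int_B \frac{\int_A \int_B |f(u \mid z) - f(y \mid x)|\, \lambda(du)\, \mu(dz)}{\mu(A)\, \lambda(B)}\, \lambda(dy)\, \mu(dx) \\
&\le \sum_{A \in \mathcal{P}_n} \sum_{B \in \mathcal{Q}_n} \int_A \int_B \frac{\int_A \int_B |f(u \mid z) - f(y \mid z)|\, \lambda(du)\, \mu(dz)}{\mu(A)\, \lambda(B)}\, \lambda(dy)\, \mu(dx) \\
&\quad + \sum_{A \in \mathcal{P}_n} \sum_{B \in \mathcal{Q}_n} \int_A \int_B \frac{\int_A \int_B |f(y \mid z) - f(y \mid x)|\, \lambda(du)\, \mu(dz)}{\mu(A)\, \lambda(B)}\, \lambda(dy)\, \mu(dx).
\end{aligned}$$

Applying the conditions of the theorem we get that

$$\begin{aligned}
\int\!\!\int \left| \frac{\nu(A_n(x), B_n(y))}{\mu(A_n(x))\, \lambda(B_n(y))} - f(y \mid x) \right| \mu(dx)\, \lambda(dy)
&\le \sum_{A \in \mathcal{P}_n} \sum_{B \in \mathcal{Q}_n} \int_A \int_B \frac{\int_A \int_B C_1(z)\, \sqrt{d'}\, H_n\, \lambda(du)\, \mu(dz)}{\mu(A)\, \lambda(B)}\, \lambda(dy)\, \mu(dx) \\
&\quad + \sum_{A \in \mathcal{P}_n} \sum_{B \in \mathcal{Q}_n} \int_A \int_B \frac{\int_A \int_B C_2(y)\, \sqrt{d}\, h_n\, \lambda(du)\, \mu(dz)}{\mu(A)\, \lambda(B)}\, \lambda(dy)\, \mu(dx) \\
&\le \int C_1(z)\, \mu(dz)\, \lambda(S_Y)\, \sqrt{d'}\, H_n + \int C_2(y)\, \lambda(dy)\, \sqrt{d}\, h_n,
\end{aligned}$$

where $S_Y$ is the bounded support of $Y$.

References

[1] Abou-Jaoude, S. (1976). Conditions nécessaires et suffisantes de convergence $L_1$ en probabilité de l'histogramme pour une densité. Annales de l'Institut Henri Poincaré, XII.

[2] Barron, A. R., Györfi, L. and van der Meulen, E. C. (1992). Distribution estimation consistent in total variation and two types of information divergence. IEEE Trans. Information Theory 38.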

[3] Beirlant, J. and Györfi, L. (1998). On the $L_1$ error in histogram density estimation: the multidimensional case. Nonparametric Statistics 9.

[4] Devroye, L. (1982). Any discrimination rule can have arbitrarily bad probability of error for finite sample size. IEEE Transactions on Pattern Analysis and Machine Intelligence 4.

[5] Devroye, L. and Györfi, L. (1985a). Nonparametric Density Estimation: The $L_1$ View. John Wiley, New York.

[6] Devroye, L. and Györfi, L. (1985b). Distribution-free exponential bound for the $L_1$ error of partitioning estimates of a regression function. In Probability and Statistical Decision Theory, F. Konecny, J. Mogyoródi, W. Wertz, Eds., D. Reidel.

[7] Devroye, L. and Györfi, L. (1990). No empirical measure can converge in the total variation sense for all distributions. Annals of Statistics 18.

[8] Devroye, L. and Györfi, L. (2002). Distribution and density estimation. In Principles of Nonparametric Learning, L. Györfi (Ed.), Springer-Verlag, Wien.

[9] Györfi, L. (1991). Universal consistency of a regression estimate for unbounded regression functions. In Nonparametric Functional Estimation and Related Topics (ed. G. Roussas), NATO ASI Series, Kluwer Academic Publishers, Dordrecht.

[10] Györfi, L., Kohler, M., Krzyżak, A., and Walk, H. (2002). Distribution-Free Theory of Nonparametric Regression. Springer Series in Statistics, Springer.

[11] Kohler, M. and Krzyżak, A. (2005). Asymptotic confidence intervals for Poisson regression. Submitted for publication.

Estimation of the essential supremum of a regression function

Estimation of the essential supremum of a regression function Estimatio of the essetial supremum of a regressio fuctio Michael ohler, Adam rzyżak 2, ad Harro Walk 3 Fachbereich Mathematik, Techische Uiversität Darmstadt, Schlossgartestr. 7, 64289 Darmstadt, Germay,

More information

Let us give one more example of MLE. Example 3. The uniform distribution U[0, θ] on the interval [0, θ] has p.d.f.

Let us give one more example of MLE. Example 3. The uniform distribution U[0, θ] on the interval [0, θ] has p.d.f. Lecture 5 Let us give oe more example of MLE. Example 3. The uiform distributio U[0, ] o the iterval [0, ] has p.d.f. { 1 f(x =, 0 x, 0, otherwise The likelihood fuctio ϕ( = f(x i = 1 I(X 1,..., X [0,

More information

Sequences and Series of Functions

Sequences and Series of Functions Chapter 6 Sequeces ad Series of Fuctios 6.1. Covergece of a Sequece of Fuctios Poitwise Covergece. Defiitio 6.1. Let, for each N, fuctio f : A R be defied. If, for each x A, the sequece (f (x)) coverges

More information

Convergence of random variables. (telegram style notes) P.J.C. Spreij

Convergence of random variables. (telegram style notes) P.J.C. Spreij Covergece of radom variables (telegram style otes).j.c. Spreij this versio: September 6, 2005 Itroductio As we kow, radom variables are by defiitio measurable fuctios o some uderlyig measurable space

More information

Sieve Estimators: Consistency and Rates of Convergence

Sieve Estimators: Consistency and Rates of Convergence EECS 598: Statistical Learig Theory, Witer 2014 Topic 6 Sieve Estimators: Cosistecy ad Rates of Covergece Lecturer: Clayto Scott Scribe: Julia Katz-Samuels, Brado Oselio, Pi-Yu Che Disclaimer: These otes

More information

7.1 Convergence of sequences of random variables

7.1 Convergence of sequences of random variables Chapter 7 Limit Theorems Throughout this sectio we will assume a probability space (, F, P), i which is defied a ifiite sequece of radom variables (X ) ad a radom variable X. The fact that for every ifiite

More information

Journal of Multivariate Analysis. Superefficient estimation of the marginals by exploiting knowledge on the copula

Journal of Multivariate Analysis. Superefficient estimation of the marginals by exploiting knowledge on the copula Joural of Multivariate Aalysis 102 (2011) 1315 1319 Cotets lists available at ScieceDirect Joural of Multivariate Aalysis joural homepage: www.elsevier.com/locate/jmva Superefficiet estimatio of the margials

More information

Resampling Methods. X (1/2), i.e., Pr (X i m) = 1/2. We order the data: X (1) X (2) X (n). Define the sample median: ( n.

Resampling Methods. X (1/2), i.e., Pr (X i m) = 1/2. We order the data: X (1) X (2) X (n). Define the sample median: ( n. Jauary 1, 2019 Resamplig Methods Motivatio We have so may estimators with the property θ θ d N 0, σ 2 We ca also write θ a N θ, σ 2 /, where a meas approximately distributed as Oce we have a cosistet estimator

More information

Chapter 3. Strong convergence. 3.1 Definition of almost sure convergence

Chapter 3. Strong convergence. 3.1 Definition of almost sure convergence Chapter 3 Strog covergece As poited out i the Chapter 2, there are multiple ways to defie the otio of covergece of a sequece of radom variables. That chapter defied covergece i probability, covergece i

More information

On the Asymptotic Normality of an Estimate of a Regression Functional

On the Asymptotic Normality of an Estimate of a Regression Functional Joural of Machie Learig Research 6 205) 863-877 Submitted 6/5; Published 9/5 O the Asymptotic Normality of a stimate of a Regressio Fuctioal László Györfi Departmet of Computer Sciece Iformatio Theory

More information

University of Colorado Denver Dept. Math. & Stat. Sciences Applied Analysis Preliminary Exam 13 January 2012, 10:00 am 2:00 pm. Good luck!

University of Colorado Denver Dept. Math. & Stat. Sciences Applied Analysis Preliminary Exam 13 January 2012, 10:00 am 2:00 pm. Good luck! Uiversity of Colorado Dever Dept. Math. & Stat. Scieces Applied Aalysis Prelimiary Exam 13 Jauary 01, 10:00 am :00 pm Name: The proctor will let you read the followig coditios before the exam begis, ad

More information

Kernel density estimator

Kernel density estimator Jauary, 07 NONPARAMETRIC ERNEL DENSITY ESTIMATION I this lecture, we discuss kerel estimatio of probability desity fuctios PDF Noparametric desity estimatio is oe of the cetral problems i statistics I

More information

Integrable Functions. { f n } is called a determining sequence for f. If f is integrable with respect to, then f d does exist as a finite real number

Integrable Functions. { f n } is called a determining sequence for f. If f is integrable with respect to, then f d does exist as a finite real number MATH 532 Itegrable Fuctios Dr. Neal, WKU We ow shall defie what it meas for a measurable fuctio to be itegrable, show that all itegral properties of simple fuctios still hold, ad the give some coditios

More information

ECONOMETRIC THEORY. MODULE XIII Lecture - 34 Asymptotic Theory and Stochastic Regressors

ECONOMETRIC THEORY. MODULE XIII Lecture - 34 Asymptotic Theory and Stochastic Regressors ECONOMETRIC THEORY MODULE XIII Lecture - 34 Asymptotic Theory ad Stochastic Regressors Dr. Shalabh Departmet of Mathematics ad Statistics Idia Istitute of Techology Kapur Asymptotic theory The asymptotic

More information

Chapter 6 Infinite Series

Chapter 6 Infinite Series Chapter 6 Ifiite Series I the previous chapter we cosidered itegrals which were improper i the sese that the iterval of itegratio was ubouded. I this chapter we are goig to discuss a topic which is somewhat

More information

Advanced Stochastic Processes.

Advanced Stochastic Processes. Advaced Stochastic Processes. David Gamarik LECTURE 2 Radom variables ad measurable fuctios. Strog Law of Large Numbers (SLLN). Scary stuff cotiued... Outlie of Lecture Radom variables ad measurable fuctios.

More information

Detailed proofs of Propositions 3.1 and 3.2

Detailed proofs of Propositions 3.1 and 3.2 Detailed proofs of Propositios 3. ad 3. Proof of Propositio 3. NB: itegratio sets are geerally omitted for itegrals defied over a uit hypercube [0, s with ay s d. We first give four lemmas. The proof of

More information

ECE 901 Lecture 14: Maximum Likelihood Estimation and Complexity Regularization

ECE 901 Lecture 14: Maximum Likelihood Estimation and Complexity Regularization ECE 90 Lecture 4: Maximum Likelihood Estimatio ad Complexity Regularizatio R Nowak 5/7/009 Review : Maximum Likelihood Estimatio We have iid observatios draw from a ukow distributio Y i iid p θ, i,, where

More information

Product measures, Tonelli s and Fubini s theorems For use in MAT3400/4400, autumn 2014 Nadia S. Larsen. Version of 13 October 2014.

Product measures, Tonelli s and Fubini s theorems For use in MAT3400/4400, autumn 2014 Nadia S. Larsen. Version of 13 October 2014. Product measures, Toelli s ad Fubii s theorems For use i MAT3400/4400, autum 2014 Nadia S. Larse Versio of 13 October 2014. 1. Costructio of the product measure The purpose of these otes is to preset the

More information

Empirical Processes: Glivenko Cantelli Theorems

Empirical Processes: Glivenko Cantelli Theorems Empirical Processes: Gliveko Catelli Theorems Mouliath Baerjee Jue 6, 200 Gliveko Catelli classes of fuctios The reader is referred to Chapter.6 of Weller s Torgo otes, Chapter??? of VDVW ad Chapter 8.3

More information

Limit distributions for products of sums

Limit distributions for products of sums Statistics & Probability Letters 62 (23) 93 Limit distributios for products of sums Yogcheg Qi Departmet of Mathematics ad Statistics, Uiversity of Miesota-Duluth, Campus Ceter 4, 7 Uiversity Drive, Duluth,

More information

Lecture 8: Convergence of transformations and law of large numbers

Lecture 8: Convergence of transformations and law of large numbers Lecture 8: Covergece of trasformatios ad law of large umbers Trasformatio ad covergece Trasformatio is a importat tool i statistics. If X coverges to X i some sese, we ofte eed to check whether g(x ) coverges

More information

Random Walks on Discrete and Continuous Circles. by Jeffrey S. Rosenthal School of Mathematics, University of Minnesota, Minneapolis, MN, U.S.A.

Random Walks on Discrete and Continuous Circles. by Jeffrey S. Rosenthal School of Mathematics, University of Minnesota, Minneapolis, MN, U.S.A. Radom Walks o Discrete ad Cotiuous Circles by Jeffrey S. Rosethal School of Mathematics, Uiversity of Miesota, Mieapolis, MN, U.S.A. 55455 (Appeared i Joural of Applied Probability 30 (1993), 780 789.)

More information

1 The Haar functions and the Brownian motion

1 The Haar functions and the Brownian motion 1 The Haar fuctios ad the Browia motio 1.1 The Haar fuctios ad their completeess The Haar fuctios The basic Haar fuctio is 1 if x < 1/2, ψx) = 1 if 1/2 x < 1, otherwise. 1.1) It has mea zero 1 ψx)dx =,

More information

A Proof of Birkhoff s Ergodic Theorem

A Proof of Birkhoff s Ergodic Theorem A Proof of Birkhoff s Ergodic Theorem Joseph Hora September 2, 205 Itroductio I Fall 203, I was learig the basics of ergodic theory, ad I came across this theorem. Oe of my supervisors, Athoy Quas, showed

More information

Math Solutions to homework 6

Math Solutions to homework 6 Math 175 - Solutios to homework 6 Cédric De Groote November 16, 2017 Problem 1 (8.11 i the book): Let K be a compact Hermitia operator o a Hilbert space H ad let the kerel of K be {0}. Show that there

More information

EFFECTIVE WLLN, SLLN, AND CLT IN STATISTICAL MODELS

EFFECTIVE WLLN, SLLN, AND CLT IN STATISTICAL MODELS EFFECTIVE WLLN, SLLN, AND CLT IN STATISTICAL MODELS Ryszard Zieliński Ist Math Polish Acad Sc POBox 21, 00-956 Warszawa 10, Polad e-mail: rziel@impagovpl ABSTRACT Weak laws of large umbers (W LLN), strog

More information

1+x 1 + α+x. x = 2(α x2 ) 1+x

1+x 1 + α+x. x = 2(α x2 ) 1+x Math 2030 Homework 6 Solutios # [Problem 5] For coveiece we let α lim sup a ad β lim sup b. Without loss of geerality let us assume that α β. If α the by assumptio β < so i this case α + β. By Theorem

More information
