Which Extreme Values are Really Extremes?

Jesús Gonzalo*
Statistics and Econometrics
U. Carlos III de Madrid

José Olmo
Statistics and Econometrics
U. Carlos III de Madrid

January 2003

Abstract

The aim of this paper is to give a formal definition and consistent estimates of the extremes of a population. This definition relies on a threshold value that delimits the extremes and on the uniform convergence of the distribution of these extremes to a Pareto type distribution. The tail parameter of this Pareto type distribution is the tail index of the data distribution. The estimator of the threshold is anchored in the Kolmogorov-Smirnov distance between consistent estimates of those two distributions. Our estimator is consistent, and via the construction of confidence intervals for the tail index (derived from our threshold estimator) we overcome the bias problems of the usual tail index estimators (Hill or Pickands). The paper also explores the validity of our definition for standard sample sizes. For this purpose, a hypothesis test is designed in order to reject extreme estimates that are not really extremes. Applications for different stock returns are presented.

Keywords: Bootstrap, Goodness of fit test, Hill estimator, Kolmogorov-Smirnov distance, Balkema and de Haan, Pickands Theorem, Tail index.

JEL: C12, C13, C14, C15, G10

* Corresponding Address: Financial support DGCYT Grant (SEC ) is gratefully acknowledged. We thank participants in the Conference on Extremal Events in Finance held in Montreal.

1 Introduction

No one doubts that Risk Management is one of the most important innovations of the 20th century. The question one would like to answer is: "If things go wrong, how wrong can they go?" The variance used as a risk measure is unable to answer this question, and therefore alternative measures regarding possible values outside the range of the available information need to be defined. Extreme value theory (EVT) provides some tools to construct these new risk measures: Value at Risk (VaR), Expected Shortfall or the tail index of a distribution. All these measures need to start by identifying which values are extreme values. In practice this is done by graphical methods like the QQ-plot or the Sample Mean Excess Plot, or by other ad-hoc methods that impose an arbitrary threshold (5%, 10%, ...).

In this paper we propose a formal way of identifying which extreme values are really extremes. The goal is to estimate the lower bound of these extremes for finite samples, i.e. a threshold value. Our method is anchored in three key elements: the Pickands, Balkema-de Haan theorem (BHP), a distance based on a Kolmogorov-Smirnov (KS) statistic, and hypothesis testing via bootstrap methods. By the BHP theorem we know that the distribution of the exceedances of a random variable tends in the limit to a Pareto shape distribution. Therefore, extreme values considered as exceedances above a certain threshold will asymptotically have this type of Pareto distribution. In order to estimate this threshold point we propose an alternative to Pickands' estimator based on minimizing a Kolmogorov-Smirnov distance that takes into account the length of the sample tail. One of the contributions of our threshold estimator is that it yields confidence intervals for the tail index that capture the tail behavior of the data distribution. Moreover, the tail index estimators relying on our threshold estimator are consistent and allow us to test our definition of extreme values (uniform convergence of the sample distribution of extremes to a Pareto type distribution).
The paper concludes with some applications to extreme quantile estimation for simulated known distributions as well as for real financial series. For these series, extreme quantiles are the cornerstone of risk measures such as Value at Risk or Expected Shortfall.

The paper is structured as follows. In section 2 we present a summary of the existing methods to calculate the threshold value. Section 3 gives a brief review of the results from Extreme Value Theory that we will be using in the core of the paper. Section 4 is devoted to defining our concept of extreme value and to presenting a new estimation method for the threshold value that is used to define extreme observations. Section 5 introduces a bootstrap goodness of fit test to check the validity of our definition. The finite sample performance of our proposed method as well as some real applications are shown in section 6. The conclusions are given in section 7. All proofs as well as the notation used in the paper are gathered in the Appendices.

2 Existing ad-hoc Methods for Threshold Estimation

In the existing literature, there is no clear definition of the threshold value $\nu$ that determines the extremes. There exist different popular estimation methods to select a threshold ($\hat\nu_n$) relying on the asymptotic Pareto distribution of the exceedances:

• QQ-plot
• Sample Mean Excess Plot
• Simulation Procedures

This estimation of the threshold faces different challenges depending on how close $\hat\nu_n$ is to the right end point. A small $\hat\nu_n$ yields bias problems in the estimation of the parameters of the Pareto distribution. On the other hand, a large $\hat\nu_n$ implies problems of great variance due to the absence of points in the tail to estimate the Pareto distribution.

2.1 QQ-plot

The method is based on the following simple fact: if $U_{(1)} \le U_{(2)} \le \dots \le U_{(n)}$ are the order statistics from $n$ i.i.d. observations uniformly distributed on $[0,1]$, then by symmetry $E(U_{(i+1)} - U_{(i)}) = \frac{1}{n+1}$ and hence $E(U_{(i)}) = \frac{i}{n+1}$. Since $U_{(i)}$ should be close to its mean $\frac{i}{n+1}$, the plot of $\{(\frac{i}{n+1}, U_{(i)}),\ 1 \le i \le n\}$ has to be linear. Suppose now that $X_{(1)} \le X_{(2)} \le \dots \le X_{(n)}$ are the order statistics from an i.i.d. sample of size $n$ which is suspected to come from a particular continuous distribution $G$. The plot of $\{(\frac{i}{n+1}, G(X_{(i)})),\ 1 \le i \le n\}$ should be approximately linear, and hence the plot of $\{(G^{\leftarrow}(\frac{i}{n+1}), X_{(i)}),\ 1 \le i \le n\}$ should also be linear.
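The construction of the QQ-plot points can be sketched in a few lines. The following is only an illustration, assuming that the reference distribution $G$ is a Generalized Pareto distribution for the excesses over the threshold with known shape `xi` and scale `sigma`; the function names and the data are hypothetical and not taken from the paper.

```python
import math

def gpd_quantile(p, xi, sigma):
    """Quantile function of a GPD (shape xi, scale sigma) for excesses over a threshold."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if xi == 0.0:
        return -sigma * math.log(1.0 - p)
    return sigma / xi * ((1.0 - p) ** (-xi) - 1.0)

def qq_points(excesses, xi, sigma):
    """Pairs (G^{<-}(i/(n+1)), x_(i)), i = 1..n, to be plotted against each other."""
    ordered = sorted(excesses)
    n = len(ordered)
    return [(gpd_quantile(i / (n + 1), xi, sigma), x)
            for i, x in enumerate(ordered, start=1)]

# Illustrative excesses (assumed data, not from the paper).
excesses = [0.3, 1.9, 0.7, 4.2, 1.1]
for theoretical, observed in qq_points(excesses, xi=0.5, sigma=1.0):
    print(f"{theoretical:8.4f} {observed:8.4f}")
```

If the excesses really follow the assumed GPD, the printed pairs should lie close to the identity line; in practice `xi` and `sigma` would be replaced by estimates.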

Figure 2.1. QQ-plots of the negative tail of Nikkei Index returns over the period 05/ /2001 with $\hat\nu_n = x_{(\lceil 0.90n \rceil)}$ and $\hat\nu_n = x_{(\lceil 0.95n \rceil)}$.

It is not clear from Figure 2.1 which portion of the observations fits better the Generalized Pareto distribution $GPD_{\hat\theta}$, with parameters estimated from the sample observations.

2.2 Sample Mean Excess Plot

Another standard tool for choosing suitable thresholds is the sample mean excess plot $(\nu, e_n(\nu))$, where $e_n(\nu)$ is the sample mean excess function defined by

$$e_n(\nu) = \frac{\sum_{i=1}^{n} (X_i - \nu)^{+}}{\sum_{i=1}^{n} 1_{\{X_i > \nu\}}},$$

with $x^{+} = \max(x, 0)$. The sample mean excess function $e_n(\nu)$ is the empirical counterpart of the mean excess function, which is defined as $e(\nu) = E[X - \nu \mid X > \nu]$. If the empirical plot follows a reasonably straight line with positive gradient above a certain value of $\nu$, then this is an indication that the exceedances over this threshold follow a Generalized Pareto distribution with positive tail index ($\xi$) parameter. This is derived from the fact that, for a GPD with $\xi < 1$,

$$e(\nu) = \frac{\sigma + \xi\nu}{1 - \xi},$$

where $\sigma + \xi\nu > 0$ and $\sigma$ is the scale parameter of the GPD (see McNeil and Saladin, 2001).
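The sample mean excess function translates directly into code. The sketch below is illustrative only: the function names are hypothetical, and the thresholds are taken, as one possible convention, to be the distinct order statistics below the maximum.

```python
def sample_mean_excess(sample, nu):
    """e_n(nu): average excess over nu among the observations strictly above nu."""
    excesses = [x - nu for x in sample if x > nu]
    if not excesses:
        raise ValueError("no observation exceeds the threshold")
    return sum(excesses) / len(excesses)

def mean_excess_plot_points(sample):
    """Points (nu, e_n(nu)) with nu running over the distinct values below the maximum."""
    thresholds = sorted(set(sample))[:-1]
    return [(nu, sample_mean_excess(sample, nu)) for nu in thresholds]

# Illustrative data (assumed, not from the paper).
for nu, e in mean_excess_plot_points([1.0, 2.0, 3.0, 4.0]):
    print(nu, e)
```

A roughly linear, upward-sloping stretch of these points above some value of `nu` is what the plot is inspected for.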

Figure 2.2. Sample mean excess plot for negative Nikkei Index returns over the period 05/ /2001 (left graph) and for a sample from a normal distribution of size $n = 1000$ (right graph).

Focusing on Figure 2.2, different candidates can be selected for the estimated threshold value.

Other methods to choose the threshold value $\nu$ take advantage of simulation procedures for known distributions. The idea is to determine a threshold $\nu$ from a sample of size $n$ and consider the number of observations over this threshold, $N_\nu$. The goal is to obtain the sample size necessary to generate $N_\nu$ exceedances over the determined threshold $\nu$. This sample size is employed to estimate an extreme quantile closer to the right end point than the threshold $\nu$. In this way we can compare this extreme estimate with the actual extreme quantile of the known distribution and assess the reliability of the ad-hoc threshold estimate (see McNeil, 1997).

3 Extreme Value Theory Results

The mathematical foundation of EVT is the class of extreme value limit laws, first derived heuristically by Fisher and Tippett (1928) and later from a rigorous standpoint by Gnedenko (1943). Suppose $X_1, \dots, X_n$ are independent random variables with common distribution function $F(x) = P\{X \le x\}$ and let $M_n = \max(X_1, \dots, X_n)$. Under some continuity conditions on $F$ at its right end point, the maximum $M_n$, properly centered and normalized, has a limit law $H_\xi$, with $\xi$ the parameter of the limit distribution,

$$P\left\{\frac{M_n - d_n}{c_n} \le x\right\} = F^n(c_n x + d_n) \xrightarrow{d} H_\xi(x). \qquad (1)$$

The continuity of $F$ is a sufficient condition but it is not necessary. Only some smoothness near the right end point is required.

Theorem 3.1. Let $F$ be a distribution function with right end point $x_F \le \infty$ and let $\tau \in (0, \infty)$. There exists a sequence $(u_n)$ satisfying $n\bar F(u_n) \to \tau$, with $\bar F = 1 - F$, if and only if

$$\lim_{x \to x_F} \frac{\bar F(x)}{\bar F(x-)} = 1 \qquad (2)$$

(see Embrechts, Klüppelberg and Mikosch, 1997, p.117).

The condition $n\bar F(u_n) \to \tau$ is equivalent to saying that the sample maximum has a nondegenerate distribution of exponential type, $P(M_n \le u_n) \to e^{-\tau}$. The asymptotic distribution of the maximum is called the extreme value law. The key result of Fisher-Tippett and Gnedenko is that there are only three fundamental types of extreme value limit laws. These are

Type I (Gumbel): $\Lambda(x) = \exp(-e^{-x}), \quad -\infty < x < \infty;$

Type II (Fréchet): $\Phi_\alpha(x) = \begin{cases} 0 & x \le 0, \\ \exp(-x^{-\alpha}) & x > 0; \end{cases}$

Type III (Weibull): $\Psi_\alpha(x) = \begin{cases} 1 & x \ge 0, \\ \exp(-(-x)^{\alpha}) & x < 0. \end{cases}$

In Types II and III $\alpha$ is a positive parameter. The three types may also be combined into a single generalised extreme value distribution, first proposed by von Mises (1936), of the form

$$H_\xi(x) = \begin{cases} \exp\left(-(1 + \xi x)^{-1/\xi}\right) & \xi \ne 0, \\ \exp(-e^{-x}) & \xi = 0, \end{cases} \qquad (3)$$

with $1 + \xi x > 0$. The case $\xi > 0$ corresponds to Type II with $\alpha = 1/\xi$, the case $\xi < 0$ to Type III with $\alpha = -1/\xi$, and the limit case $\xi \to 0$ to Type I.

Figure 3.1. The density function of the extreme value limit laws. The dotted line is the Gumbel distribution. The Fréchet and Weibull distributions are plotted with $\alpha = 1$.
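As a small numerical illustration of expression (3), the sketch below evaluates $H_\xi$ and shows that the Gumbel law is recovered as $\xi \to 0$. The function name `gev_cdf` and the convention of returning 0 or 1 outside the support $1 + \xi x > 0$ are choices made here for illustration, not something specified in the paper.

```python
import math

def gev_cdf(x, xi):
    """H_xi(x) as in (3); outside the support it returns 0 (xi > 0) or 1 (xi < 0)."""
    if xi == 0.0:
        return math.exp(-math.exp(-x))
    t = 1.0 + xi * x
    if t <= 0.0:
        return 0.0 if xi > 0.0 else 1.0
    # (1 + xi*x)^(-1/xi) computed as exp(-log(1 + xi*x)/xi) for numerical stability
    return math.exp(-math.exp(-math.log1p(xi * x) / xi))

# The values approach the Gumbel value H_0(1) as xi shrinks.
for xi in (0.5, 0.1, 0.001, 0.0):
    print(xi, gev_cdf(1.0, xi))
```

With $\xi = 0.5$ the same function reproduces the Fréchet case with $\alpha = 2$ evaluated at $1 + \xi x$.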

Corollary 3.1. From expressions (1) and (3), the following relationships can be extracted depending on the value of the parameter $\xi$:

$$n\bar F(c_n x + d_n) \to (1 + \xi x)^{-1/\xi} \ \text{ if } \xi \ne 0, \qquad n\bar F(c_n x + d_n) \to e^{-x} \ \text{ if } \xi = 0.$$

These expressions can be considered as the survivor functions of a Generalized Pareto distribution. Moreover, the asymptotic distribution of the standardized tail of $F$ depends on a parameter $\xi$ (tail index), hence a distribution $F$ verifying (2) can be classified according to this parameter.

Definition 3.1. $F$ belongs to the Maximum Domain of Attraction of an Extreme Value Distribution $H_\xi$, $F \in MDA(H_\xi)$, if and only if there exist constants $c_n > 0$ and $d_n$ such that $c_n^{-1}(M_n - d_n) \xrightarrow{d} H_\xi$.

Notice that the commonly employed continuous distribution functions belong to the maximum domain of attraction of an extreme value limit law, $F \in MDA(H_\xi)$. From the results of Fisher-Tippett and Gnedenko it follows that there are only three types of maximum domains of attraction, in contrast with the number of domains of attraction of $\alpha$-stable processes. The maximum domain of attraction of $F$ depends on the sign of $\xi$.

Definition 3.2. A df $F$ such that the right tail satisfies

$$\lim_{x \to \infty} \frac{1 - F(tx)}{1 - F(x)} = t^{-\alpha}, \quad t > 0, \quad \alpha = \frac{1}{\xi} > 0 \qquad (4)$$

is called regularly varying with index $-\alpha$ ($\bar F \in RV_{-\alpha}$).

The tail of a distribution $F$ satisfying (4) decays polynomially ($F$ is heavy tailed). This condition can be rewritten as

$$1 - F(x) = x^{-1/\xi} L(x), \quad x \to \infty, \quad \xi > 0, \qquad (5)$$

where $L(x)$ is a slowly varying function,

$$\lim_{x \to \infty} \frac{L(tx)}{L(x)} = 1, \quad t > 0. \qquad (6)$$

A distribution function $F$ with positive tail index and verifying condition (2) indicates that the sample maximum should have a nondegenerate distribution of Type II.

Proposition 3.1. $\bar F \in RV_{-\alpha} \Leftrightarrow F \in MDA(\Phi_\alpha)$, where $\Phi_\alpha$ is the Fréchet EVD. The normalizing constants for this case are $d_n = 0$ and $c_n = F^{\leftarrow}(1 - \frac{1}{n})$ (see Embrechts, Klüppelberg and Mikosch, 1997, p.132).
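Proposition 3.1 can be checked numerically in the simplest heavy-tailed case. The sketch below assumes an exact Pareto distribution $F(x) = 1 - x^{-\alpha}$, $x \ge 1$ (an illustrative choice, not an example from the paper), for which $c_n = F^{\leftarrow}(1 - 1/n) = n^{1/\alpha}$ and $d_n = 0$, and compares $F^n(c_n x)$ with the Fréchet limit.

```python
import math

def pareto_max_cdf(x, alpha, n):
    """P{M_n / c_n <= x} = F^n(c_n x) for F(x) = 1 - x^(-alpha), x >= 1, c_n = n^(1/alpha)."""
    y = n ** (1.0 / alpha) * x
    if y <= 1.0:
        return 0.0
    return (1.0 - y ** (-alpha)) ** n

def frechet_cdf(x, alpha):
    """Type II limit law Phi_alpha(x)."""
    return math.exp(-x ** (-alpha)) if x > 0.0 else 0.0

# The gap to the Frechet limit shrinks as n grows.
for n in (10, 1000, 100000):
    print(n, abs(pareto_max_cdf(1.5, 2.0, n) - frechet_cdf(1.5, 2.0)))
```

For distributions whose tail is only regularly varying, the slowly varying factor $L$ in (5) slows this convergence, which is one reason the choice of threshold matters in finite samples.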

It is sufficient to know the tail index of a distribution $F$ to know the asymptotic distribution of the standardized maximum. Moreover, this parameter $\xi$ provides information about the behavior of the tail, and therefore the tail index can help to give a formal definition of the tail, avoiding the ad-hoc selection of arbitrary quantiles.

Definition 3.3. The tail of a distribution is the set of extreme values; these extremes are the exceedances over a determined threshold ($\nu$) with $\nu$ sufficiently large. The distribution of these large observations, $F_\nu(x)$, is called the conditional excess distribution function (cedf) over $\nu$ and is defined as

$$F_\nu(x) = P\{X \le x \mid X > \nu\}, \quad \nu \le x \le x_F, \qquad (7)$$

where $X$ is a random variable, $\nu$ is a given threshold and $x_F \le \infty$ is the right endpoint of $F$. This distribution can be written in terms of $F$,

$$F_\nu(x) = \frac{F(x) - F(\nu)}{1 - F(\nu)}, \quad x > \nu. \qquad (8)$$

From this expression it is deduced that $\bar F_\nu(x) = \frac{\bar F(x)}{\bar F(\nu)}$. The extremes of a population are determined by the threshold $\nu$ and by the tail index $\xi$ of the distribution $F$. These parameters define the asymptotic distribution of standardized extremes.

Theorem 3.2. (Balkema and de Haan (1974), Pickands (1975)) (BHP). Let $F$ be a distribution function such that $F \in MDA(H_\xi)$. Then the conditional excess distribution function $F_\nu(x)$, for $\nu$ large, satisfies

$$\lim_{\nu \to x_F} F_\nu(x) = GPD_{\xi,\sigma}(x - \nu),$$

where

$$GPD_{\xi,\sigma}(x - \nu) = \begin{cases} 1 - \left(1 + \xi\,\frac{x - \nu}{\sigma}\right)^{-1/\xi} & \text{if } \xi \ne 0, \\ 1 - \exp\left(-\frac{x - \nu}{\sigma}\right) & \text{if } \xi = 0 \end{cases} \qquad (9)$$

is the so-called Generalized Pareto distribution (GPD).

The Generalized Pareto is the asymptotic distribution of the extremes under some continuity conditions on $F$. If the distribution $F$ has a regularly varying tail, then the distribution of the extremes as $n$ goes to infinity can be reduced to a Pareto distribution.

Corollary 3.2. For a distribution function $F$ such that $F \in MDA(\Phi_\alpha)$, the conditional excess distribution function $F_\nu(x)$, for $\nu$ large, satisfies

$$\lim_{\nu \to x_F} F_\nu(x) = PD_\xi\left(\frac{x}{\nu}\right),$$

where $PD_\xi(\frac{x}{\nu}) = 1 - (\frac{x}{\nu})^{-1/\xi}$, $x > \nu$, is the Pareto distribution.

We bring together the two approaches to the asymptotic Pareto type distribution of the tail in the notation $G_\theta$,

$$G_\theta = \begin{cases} GPD_{\xi,\sigma}(x - \nu), & F \in MDA(H_\xi), \\ PD_\xi(\frac{x}{\nu}), & F \in MDA(\Phi_\alpha), \end{cases}$$

with $\theta = \{\xi, \sigma\}$ for the GPD case and $\theta = \{\xi\}$ for the Pareto distribution. It is important to notice that we are not contradicting the BHP theorem. The Pareto distribution is included in the Generalized Pareto family: $PD_\xi(\frac{x}{\nu}) = GPD_{\xi,\sigma}(x - \nu)$ with $\sigma = \nu\xi$.

4 Definition and Estimation of Extremes

Under Definition 3.3 the tail of a distribution is the set of extreme values. We can extend the BHP theorem to give a formal definition of the extremes based on uniform convergence between the two distribution functions involved.

Definition 4.1. Let $X$ be a random variable with a distribution $F \in MDA(H_\xi)$. Let $\nu \in$ support of $F = [x_0, x_F]$ with $x_F \le \infty$, let $F_\nu(x)$ be the conditional excess distribution function and let $G_\theta(x; \nu)$ be the Pareto type distribution. The extreme values of the distribution $F$ are defined by a parameter $\nu < x_F$ such that $F_\nu(x)$ converges uniformly to $G_\theta(x; \nu)$ as $\nu \to x_F$,

$$\lim_{\nu \to x_F} \sup_x |F_\nu(x) - G_\theta(x; \nu)| = 0, \qquad (10)$$

with $\xi \in \theta$ the tail index of the distribution $F$.

In order to obtain the extremes of a distribution $F$ from sample data we need to estimate the threshold parameter. Condition (10) cannot be checked for a given $n$; therefore we give a characterization of the definition for sample data.

Proposition 4.1. Let $F(x)$ be a distribution function verifying condition (2) and consider $\hat\nu_n$ an estimator of $\nu$. The extreme value estimates defined by $\hat\nu_n$ are extreme values if and only if

• $\sup_x |F_{\hat\nu_n}(x) - G_\theta(x; \hat\nu_n)| \xrightarrow{p} 0$.
• $F_{\hat\nu_n}(x) = G_\theta(x; \hat\nu_n)$ for almost every $x \in \mathbb{R}$ and given $n$.

The first condition provides the consistency of the estimator $\hat\nu_n$. This is a necessary but not sufficient condition to define the set of extremes. We can obtain consistent estimators of $\nu$ that determine extreme estimates that do not follow Pareto type distributions for a given sample size. This is the reason to impose the second condition. This condition usually cannot be checked because the parameters of the Pareto type distribution ($\theta$) are unknown. In consequence, we propose a goodness of fit test to circumvent this drawback.

4.1 Conditional Estimation of the Parameter Set $\theta$

The distribution of the extremes of a distribution depends on its tail behavior, i.e. on $\theta$. This set of parameters must be estimated from the available data sample. Maximum Likelihood (ml) is the most conventional method and has very desirable properties: consistency, asymptotic efficiency and normality. The estimation of the tail parameters is conditioned on the knowledge we have about the sample tail defined by the threshold $\nu$. For the GPD approach, $\hat\theta_n(\nu) = \{\hat\xi_{ml}(\nu), \hat\sigma_{ml}(\nu)\}$, and for the Pareto approach, $\hat\theta_n(\nu) = \hat\xi_{ml}(\nu)$.

Proposition 4.2. If $F \in MDA(\Phi_\alpha)$, $\hat\xi_{ml}(\nu)$ for $PD_\xi$ is the Hill estimator (see Hill, 1975),

$$\hat\xi_{Hill}(\nu) = \frac{1}{n - k} \sum_{i=k+1}^{n} \log\frac{x_{(i)}}{\nu}, \qquad (11)$$

with $\nu = x_{(k)}$ and $x_{(k+1)} \le \dots \le x_{(n)}$ the increasing order statistics.

The Hill estimator is gaining popularity in the EVT literature because it is easy to calculate and has good asymptotic properties, but in the financial literature it is employed even for distributions that are not heavy tailed. In that case consistency and asymptotic normality may no longer hold. There exists some confusion about the conditions under which it can be used.

Proposition 4.3. Let $\hat\xi_{ml}$ and $\hat\xi_{Hill}$ be the maximum likelihood estimators of the parameter $\xi$ of a Generalized Pareto and of a Pareto distribution respectively. These estimators are $\sqrt{n}$-consistent estimates of the tail index of a distribution function $F$ verifying condition (2) if $\xi > -\frac{1}{2}$ for $\hat\xi_{ml}$ (see Smith, 1984) and if $\xi > 0$ for $\hat\xi_{Hill}$ (Goldie and Smith, 1987).
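For concreteness, the Hill estimator (11) can be written as a short function. This is a minimal sketch: the name `hill_estimator`, the 1-based convention for `k`, and the assumption of strictly positive data are choices made here for illustration.

```python
import math

def hill_estimator(sample, k):
    """Hill estimate (11) with threshold nu = x_(k), k counted from 1 in increasing order."""
    xs = sorted(sample)
    n = len(xs)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < n")
    nu = xs[k - 1]
    if nu <= 0.0:
        raise ValueError("the threshold must be positive")
    # average log-ratio of the n - k exceedances x_(k+1), ..., x_(n) to the threshold
    return sum(math.log(x / nu) for x in xs[k:]) / (n - k)

# Illustrative data (assumed): threshold 1, exceedances e and e^2.
print(hill_estimator([math.e ** 2, 1.0, math.e], k=1))
```

In the toy call the two log-ratios are 1 and 2, so the estimate is their average.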
The drawback of these estimators is their bias (see Guillou & Hall, 2000). This bias has two different sources: the distribution of the data is not of a Pareto type, and the choice of the number of order statistics used to construct the estimator. Let us concentrate on the Hill estimator and assume $F \in MDA(\Phi_\alpha)$. By the BHP theorem the large observations $x_{(k_0+1)} \le \dots \le x_{(n)}$ greater than $\nu = x_{(k_0)}$ follow a $PD_\xi$ for $\nu$ sufficiently large, therefore we eliminate the first source of bias. It is shown in Hill (1975) that the random variable $V_i = i\,[\log x_{(n-i+1)} - \log x_{(n-i)}]$ follows an exponential distribution with mean $\xi$. Consider $\hat\nu_n = x_{(k)}$ an estimate of $\nu$ such that $x_{(k+1)} \le \dots \le x_{(k_0)} \le \dots \le x_{(n)}$ are the exceedances over the estimate.¹ The Hill estimator based on $x_{(k)}$ is $\hat\xi_{Hill}(\hat\nu_n) = \frac{1}{n-k}\sum_{i=k+1}^{n} \log\frac{x_{(i)}}{x_{(k)}}$. This estimator can be decomposed as

$$\hat\xi_{Hill}(\hat\nu_n) = \frac{1}{n-k}\sum_{i=k_0+1}^{n} \log\frac{x_{(i)}}{\nu} + \frac{1}{n-k}\sum_{i=k+1}^{k_0} \log\frac{x_{(i)}}{\nu} + \log\frac{\nu}{x_{(k)}}$$

$$= \frac{n-k_0}{n-k}\,\hat\xi_{Hill}(\nu) + \frac{1}{n-k}\sum_{i=k+1}^{k_0} \log\frac{x_{(i)}}{\nu} + \log\frac{\nu}{x_{(k)}}.$$

On the other hand, the Hill estimator based on the parameter $\nu$ can be expressed in terms of $V_i$, $\hat\xi_{Hill}(\nu) = \frac{1}{n-k_0}\sum_{i=1}^{n-k_0} V_i$. This estimator is unbiased ($E[\hat\xi_{Hill}(\nu)] = \xi$); however, the Hill estimator based on the estimate $\hat\nu_n$ is biased. This deviation from the parameter depends on the bias of the threshold estimator:

$$E[\hat\xi_{Hill}(\hat\nu_n)] = \frac{n-k_0}{n-k}\,\xi + \frac{1}{n-k}\sum_{i=k+1}^{k_0} E\log x_{(i)} - E\log x_{(k)} + \frac{n-k_0}{n-k}\log\nu.$$

Notice that the bias disappears if $k = k_0$.

Therefore, the bias of the Hill estimator of a distribution $F \in MDA(\Phi_\alpha)$ as $n$ goes to infinity depends only on the bias of the threshold estimate. The problem is that the parameter $\nu = x_{(k_0)}$ is unknown. In order to minimize bias problems, confidence intervals are proposed as estimators of the tail index. It is well known that the random variable $S_k = \sqrt{k}(\hat\xi_{Hill} - \xi)$ has an asymptotic $N(0, \xi^2)$ distribution. We construct nonparametric bootstrap confidence intervals to approximate the exact confidence intervals for the tail index,

$$\xi \in \left[\hat\xi_{Hill}(\hat\nu_n) - \frac{1}{\sqrt{k}}\,J_k^{-1}\left(F_n; 1 - \frac{\alpha}{2}\right),\ \hat\xi_{Hill}(\hat\nu_n) - \frac{1}{\sqrt{k}}\,J_k^{-1}\left(F_n; \frac{\alpha}{2}\right)\right], \qquad (12)$$

with $F_n$ the empirical distribution function of the data, $\alpha$ the significance level and $J_k(x; F_n)$ the approximate bootstrap distribution of $S_k$. Note that the same procedure can be applied to calculate confidence intervals for the tail index based on the maximum likelihood estimator $\hat\xi_{ml}(\hat\nu_n)$ of a Generalized Pareto distribution.
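The decomposition of the Hill estimator into a reweighted estimate at the true threshold plus two correction terms is a purely algebraic identity, so it can be verified numerically. The sketch below uses made-up data and hypothetical function names; `k` plays the role of the estimated threshold index and `k0` that of the true one, with `k < k0 < n`.

```python
import math

def hill(xs_sorted, k):
    """Hill estimate with threshold x_(k); xs_sorted must be in increasing order."""
    nu = xs_sorted[k - 1]
    return sum(math.log(x / nu) for x in xs_sorted[k:]) / (len(xs_sorted) - k)

def hill_decomposition(xs_sorted, k, k0):
    """Right-hand side of the decomposition, with nu = x_(k0) and k < k0 < n."""
    n = len(xs_sorted)
    nu = xs_sorted[k0 - 1]
    weighted = (n - k0) / (n - k) * hill(xs_sorted, k0)
    # observations x_(k+1), ..., x_(k0) lying between the two thresholds
    middle = sum(math.log(x / nu) for x in xs_sorted[k:k0]) / (n - k)
    shift = math.log(nu / xs_sorted[k - 1])
    return weighted + middle + shift

# Illustrative data (assumed, not from the paper).
xs = sorted([1.2, 1.5, 2.0, 2.7, 3.1, 4.4, 6.0, 9.5])
print(hill(xs, 2), hill_decomposition(xs, 2, 5))
```

The two printed numbers agree up to rounding; the last two terms are what the bias expression above takes expectations of.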
¹ Notice that if $\hat\nu_n > \nu$ there is no bias problem; it is only a matter of efficiency of the estimator $\hat\nu_n$.

4.2 Estimation Method for the Threshold Value $\nu$

Pickands (1975) proposed a method to estimate the threshold value $\nu$ based on uniform convergence ($d_\infty$) between the empirical distribution associated with $F_\nu$ and a Generalized Pareto distribution estimated from the data, $GPD_{\hat\theta^{Pic}_\nu}$:

$$\hat\nu^{Pic} = \arg\min_\nu\, d_\infty^{\nu}(F_{\nu,n}, GPD_{\hat\theta^{Pic}_\nu}), \qquad (13)$$

with $\hat\theta^{Pic}_\nu$ the estimated parameters of the GPD. This estimator of $\nu$ is consistent in the sense that $P\{\sup |F_{\hat\nu^{Pic}} - GPD_{\hat\theta^{Pic}}| > \varepsilon\} \to 0$. The estimators for the parameters of the GPD proposed by Pickands depend on the different values of $\nu$. Consider $\nu = X_{(n-4i+1)}$, $i = 1, \dots, n/4$. Then

$$\hat\xi(\nu) = \frac{1}{\log 2}\,\log\frac{X_{(n-i+1)} - X_{(n-2i+1)}}{X_{(n-2i+1)} - \nu}$$

for the tail index and

$$\hat\sigma(\nu) = \frac{X_{(n-2i+1)} - \nu}{\int_0^{\log 2} e^{\hat\xi u}\,du}$$

for the scale parameter.

This estimator for the tail index is consistent, but it is very sensitive to the choice of the order statistics and it is not efficient (Drees, 1995). For standard sample sizes the estimates of the tail index are biased and the confidence intervals for $\xi$ do not give reliable information. Alternative statistics have been proposed for the tail index to overcome these drawbacks (see Dekkers, Einmahl and de Haan, 1989). Goldie and Smith (1987) or Dekkers and de Haan (1993) establish the optimal number of order statistics for different estimators of the tail index.

On the other hand, Pickands' estimator for the threshold ($\hat\nu^{Pic}$) does not take into account the length of the sample tails defined by $\nu$ when computing the distances in (13). As $\nu \to x_F$ the available samples of the tails are smaller, yielding worse estimates of the tail parameters of the GPD. This implies a worse goodness of fit of the conditional distributions $F_\nu$ to the theoretical asymptotic distribution. This is caused not only by the lack of fit of the data to the theoretical GPD but also by the estimation mechanism of the tail parameters. Hence, $\hat\nu^{Pic}$ is not near the tail by its own construction. Consequently, the extreme estimates defined by Pickands' estimator can be very misleading for standard sample sizes (see Table 6.1).
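The Pickands-type parameter estimates above only involve three order statistics, so they are easy to compute. The sketch below is illustrative: the function name is hypothetical, the integral $\int_0^{\log 2} e^{\hat\xi u}\,du$ is evaluated in closed form as $(2^{\hat\xi} - 1)/\hat\xi$ (or $\log 2$ when $\hat\xi = 0$), and the three order statistics are assumed to be distinct.

```python
import math

def pickands_estimates(sample, i):
    """Return (xi_hat, sigma_hat) based on X_(n-i+1), X_(n-2i+1) and nu = X_(n-4i+1)."""
    xs = sorted(sample)
    n = len(xs)
    if i < 1 or 4 * i > n:
        raise ValueError("i must satisfy 1 <= i <= n/4")
    top = xs[n - i]        # X_(n-i+1) in 1-based notation
    mid = xs[n - 2 * i]    # X_(n-2i+1)
    nu = xs[n - 4 * i]     # X_(n-4i+1), the threshold
    xi = math.log((top - mid) / (mid - nu)) / math.log(2.0)
    # closed form of the integral of exp(xi * u) over [0, log 2]
    integral = math.log(2.0) if xi == 0.0 else (2.0 ** xi - 1.0) / xi
    return xi, (mid - nu) / integral

# Illustrative data (assumed): the spacings 13 - 5 and 5 - 1 have ratio 2.
print(pickands_estimates([1, 2, 3, 4, 5, 6, 13, 14], i=2))
```

Because only three observations enter each estimate, small changes in `i` can move the result considerably, which is the sensitivity mentioned above.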
A natural way to obtain an estimator that overcomes the drawbacks of Pickands' estimator in finite samples is to use a distance based on the Kolmogorov-Smirnov statistic.

Definition 4.2. (Kolmogorov-Smirnov distance) Let $F_{\nu,n}(x) = \frac{\sum_{i=1}^{n} 1_{\{\nu \le x_i \le x\}}}{\sum_{i=1}^{n} 1_{\{x_i > \nu\}}}$ be the empirical distribution function associated with $F_\nu$ and let $G_{\hat\theta_\nu}$ be a Pareto type distribution. The distance between $F_\nu$ and $G_{\hat\theta_\nu}$ is calculated by the following KS distance:

$$d_{ks}^{\nu}(F_{\nu,n}, G_{\hat\theta_\nu}) = \sqrt{\sum_{i=1}^{n} 1_{\{x_i > \nu\}}}\ \sup_x \left| \frac{\sum_{i=1}^{n} 1_{\{\nu \le x_i \le x\}}}{\sum_{i=1}^{n} 1_{\{x_i > \nu\}}} - G_{\hat\theta_\nu}(x; \nu) \right|. \qquad (14)$$

This statistic takes into account the number of observations in the available sample tails, giving less weight to distances from samples with less data in order to compensate for the poor estimation of the parameters of the theoretical distribution from small samples.

Definition 4.3. Let $d_{ks}^{\nu}$ be the KS distance of (14) and $x_n = \{x_1, \dots, x_n\}$ be a sample of size $n$ from a distribution $F$. The estimated threshold $\hat\nu_n$ is the order statistic $x_{(k)}$ that minimizes the distance $d_{ks}^{\nu}$,

$$\hat\nu_n = \arg\min_\nu\, d_{ks}^{\nu}(F_{\nu,n}, G_{\hat\theta_\nu}),$$

with $x_{(k)}$ such that $n - k \to \infty$, $\frac{n-k}{n} \to 0$.

The latter conditions are a consequence of the BHP theorem. As $n$ becomes large, $n - k$ should go to infinity to benefit from an increasing sample (more information as $n$ increases and therefore smaller variance). At the same time, unless a portion of the upper tail follows exactly a Pareto type distribution, we expect that $\frac{n-k}{n}$ tends to zero in order to improve the approximation to the theoretical distribution when $\nu \to x_F$, as the BHP theorem states (smaller bias).

Theorem 4.1. Let $\hat\nu_n$ be the threshold estimator derived from the KS distance ($d_{ks}^{\nu}$) and let $\hat\xi(\hat\nu_n)$ be a consistent estimator of the tail index based on $x_n$, with $\xi \in \theta$. Then $\hat\nu_n$ is a consistent estimator of the threshold parameter $\nu$ in the sense that $P\{\sup_x |F_{\hat\nu_n}(x) - G_\theta(x; \hat\nu_n)| > \varepsilon\} \to 0$, $\forall \varepsilon > 0$.

The concept of consistency can be puzzling in this context because the parameter $\nu$, according to the BHP theorem, must go to the right end point. Uniqueness of the threshold makes no sense, because as $\nu$ goes to $x_F$ the approximation of the conditional distribution gets better. In consequence, we prove the consistency of our estimator in the sense that it mimics the properties of the parameter $\nu$. However, other estimators can mimic the behavior of the parameter as well: $\nu \to x_F$, $\hat\xi(\nu) \xrightarrow{p} \xi$ and $F_\nu = G_\theta(x; \nu)$. In order to check the performance of these other estimators we propose a hypothesis test in the next section. In practice our estimator of the threshold is obtained in the following way.

Algorithm 4.1.
1. Fix a threshold $\nu = x_{(k)}$ ($k = k_0 = n/2$).²
2. Estimate³ $\hat\theta_\nu = \{\hat\xi_{ml}(\nu), \hat\sigma_{ml}(\nu)\}$ (GPD approach) or $\hat\theta_\nu = \hat\xi_{Hill}(\nu)$ (Pareto approach).
3. Compute $F_{\nu,n}(x) = \frac{\sum_{i=1}^{n} 1_{\{\nu \le x_i \le x\}}}{\sum_{i=1}^{n} 1_{\{x_i > \nu\}}}$.
4. Compute $G_{\hat\theta_\nu} = GPD_{\hat\xi,\hat\sigma}(x_i - \nu)$ (GPD approach) or $G_{\hat\theta_\nu} = PD_{\hat\xi_{Hill}}(\frac{x_i}{\nu})$ (Pareto approach).
5. Calculate the distance $d_{ks}^{\nu}(F_{\nu,n}, G_{\hat\theta_\nu}) = \sqrt{\sum_{i=1}^{n} 1_{\{x_i > \nu\}}}\ \sup_x \left| \frac{\sum_{i=1}^{n} 1_{\{\nu \le x_i \le x\}}}{\sum_{i=1}^{n} 1_{\{x_i > \nu\}}} - G_{\hat\theta_\nu}(x; \nu) \right|$.
6. $k{+}{+}$. Repeat the process until $k = n$.

Finally, we estimate $\hat\nu_n = x_{(\hat k)}$ such that $\hat\nu_n = \arg\min_\nu\, d_{ks}^{\nu}(F_{\nu,n}, G_{\hat\theta_\nu})$.

Alternative distance measures can be proposed for this threshold selection, for instance the ones based on the Cramér-von Mises or Anderson-Darling statistics,

• $W^2 = \int_{-\infty}^{\infty} (F_{\nu,n}(x) - G_{\hat\theta_\nu}(x))^2\, dG_{\hat\theta_\nu}(x)$
• $A^2 = \int_{-\infty}^{\infty} \frac{(F_{\nu,n}(x) - G_{\hat\theta_\nu}(x))^2}{G_{\hat\theta_\nu}(x)(1 - G_{\hat\theta_\nu}(x))}\, dG_{\hat\theta_\nu}(x)$.

These statistics rely on the Euclidean distance. The drawback of these measures with respect to KS type statistics for threshold selection is that the former are less sensitive to large deviations from the Pareto type distribution due to isolated observations (outliers).

² Considering $k = 1, \dots, n-1$ is computationally very costly. The method is implemented taking fractions of the sample, $x_{(k)}$ s.t. $k = n * i$; $i = 50, 60, 70, 80, 90, 91, \dots$
³ The algorithm to estimate the threshold depends on the maximum domain of attraction of the distribution $F$.

5 Hypothesis Testing

The threshold estimate $\hat\nu_n$ provides the lower limit of the estimation of the extreme values in finite samples. Our threshold estimator $\hat\nu_n$ is such that, as the sample size increases, condition (10) asymptotically holds ($F_{\hat\nu_n} = G_\theta$). The key question to answer is whether this condition can be rejected or not for the extreme estimates produced by the threshold value estimation. In other words, are these estimates really extreme values according to our definition of extremes? The answer boils down to testing

$$H_0: F_\nu = G_\theta \qquad (15)$$

with

$$G_\theta = \begin{cases} GPD_{\xi,\sigma}(x - \nu) & \text{GPD approach,} \\ PD_\xi(\frac{x_i}{\nu}) & \text{Pareto approach.} \end{cases}$$

The statistic proposed to test $H_0$ is the following goodness of fit test:

$$T_n(x_n; \theta) = \sqrt{n}\, \sup_x |F_{\hat\nu_n,n}(x) - G_\theta(x; \hat\nu_n)|. \qquad (16)$$

Although there are alternatives that are more sensitive to deviations from the null distribution that occur in both tails (modified KS tests, see Mason and Schuenemeyer, 1983), we concentrate on the standard KS test because our concern is the distribution of the largest observations exceeding the threshold value $\nu$.

The sampling distribution of this test statistic, $J_n(x; F, \theta) = P\{T_n(x_n; \theta) \le x\}$, is not known. The asymptotic null distribution $J(x; F)$ is parameter free (see Kolmogorov, 1933), but it is not possible to obtain a value of the statistic based on a sample $x_n$ because the set of parameters $\theta$ is unknown. Therefore, the test statistic needed to test the null hypothesis is $T_n(x_n; \hat\theta_n)$, where $\hat\theta_n$ is an estimate of the true $\theta$. This statistic follows asymptotically a functional of a centered Gaussian process that depends on $\theta$, see Durbin (1973). The asymptotic critical values vary with $H_0$ and with the estimation of this set of parameters.

Bootstrap methodology can be applied to calculate the sampling quantiles of the bootstrap distribution $J_n(x; \hat F_n, \hat\theta_n^{*})$, with $\hat\theta_n^{*}$ the estimated set of parameters from the bootstrap sample $x_n^{*} = \{x_1^{*}, \dots, x_n^{*}\}$ and with $\hat F_n$ an estimate of $F$. These quantiles will be close to the exact quantiles of the distribution of the statistic $J_n(x; F, \theta)$ if the bootstrap is consistent ($J_n(x; F, \hat\theta_n) \simeq J_n(x; \hat F_n, \hat\theta_n^{*})$) and if $\hat\theta_n$ is a $\sqrt{n}$-consistent estimator of $\theta$ ($J_n(x; F, \theta) \simeq J_n(x; F, \hat\theta_n)$); see Babu and Rao (2002) for details.

Proposition 5.1. Let $x_n$ be a sample of size $n$ from $F$.
Assume that ^F is a estimate of F based ox ad let J (x; F ; ^ ) be the true samplig distributio of the statistic T (x ; ^ ). If the followig two coditios hold ffl sup j ^F (x) F (x)j p! 0. ffl J (x; F ; ^ )! J(x; F ; ) with J(x; F ; ) beig a strictly icreasig cotiuous fuctio i x. 15

16 The, the Bootstrap approximatio J (x; ^F ; ^ Λ ) is cosistet (J (x; ^F ; ^ Λ ) ' J (x; F ; ^ )). 5.1 Methodology The statistic T (x ; ^ ) follows asymptotically a fuctioal of a cetered gaussia process. Therefore, i order to obtai a cosistet bootstrap approximatio (J (x; ^F ; ^ Λ )) of the true samplig distributio of T (x ; ^ ) we eed to costruct ^F verifyig uiform covergece i probability tof. Defiitio 5.1. Let ^F (x) be a mixture of F (x) for values smaller tha the estimated threshold ^ν ad of a Pareto type distributio for values above it: ^F (x) = 8 >< >: G ^ ^ν (x)+ 1 P 1 P 1 fxi»xg x» ^ν 1 fxi»^ν gg ^ ^ν (x) x>^ν : (17) It is obvious to check that ^F (x) i expressio ( 17) is a distributio fuctio. Propositio 5.2. Let x be a sample of size with distributio fuctio F (x). The, the distributio fuctio ^F is such that sup j ^F (x) F (x)j p! 0. The first task is to geerate a bootstrap sample x Λ of size from the distributio ^F. Algorithm 5.1. (Geeratig Process of Data): H Let ^ν = x (k) be the estimated threshold ad ^ be the estimated parameter space. 2. Geerate 0» j» 1 ad calculate dje 8 < : 3. x Λ x (dje) if dje»k i = z if dje >k P j 1 fxi»^ν g z = G ψ^ ^ν ( ) 4. i ++ Go to step 2 P 1 fxi>^ν g Oce a bootstrap sample is geerated it is immediate to calculate J (x; ^F ; ^ Λ ) uder Algorithm 5.2. (Bootstrap Distributio of T ): 1. l =1. 2. Geerate x Λ abootstrap sample comig from ^F. 16

17 3. Compute ^ Λ from the exceedaces of x Λ over the fixed threshold ^ν. 4. Compute T Λ l (xλ ; ^ Λ )= p 5. l =1;:::;B 6. J (x; ^F ; ^ Λ )= 1 B BP 1 ft Λ i»xg: sup j P 1 f^ν»x Λ i»xg P 1 fx Λ i >^ν g G ^ Λ^ν (x; ^ν )j: Notice that the set of parameters is cosistetly estimated by f^s 2 ; ^ο ml g for the Geeralized Pareto distributio, ad by ^ο Hill for the Pareto distributio. Both estimators of the tail idex are p -cosistet for some values of the tail idex (see propositio ( 4.3)). Therefore, the kowledge of J (x; ^F ; ^ Λ ) allows us to estimate the p-value of the test ( 16): p = P fj (x; F ; )>T (x ; )g 'P fj (x; ^F ; ^ Λ ) >T (x ; ^ )g = 1 B BP 1 ft Λ l >T g =^p: Large values of the test statistic imply rejectio of the ull hypothesis. I other words, it is rejected if ^p <fffor a give sigificace level ff. 5.2 Size of the Test Theorem 5.1. Let ^Q be a estimator of F based o a sample x of size that satisfies sup j p ^Q F (x)j! 0 wheever F 2 F H0. The, P ft (x; ^ ) >j (1 ff; ^Q ; ^ Λ )g!ff, with j (1 ff; ^Q ; ^ Λ ) the 1 ff quatile of the Bootstrap distributio J (x; ^Q ; ^ Λ ) of T (x; ^ ). The distributio fuctio ^F (x) of expressio ( 17) verifies the coditio of theorem 5.1, therefore j (1 ff; ^F ; ^ Λ ) ' j (1 ff; F ; ^ ). I cosequece, ^F is a good cadidate to estimate the size of the proposed test. Algorithm 5.3. : 1. j =1. 2. Estimate ^ν = x (k) ad G ^ ^ν by KS method from a sample x j; that follows F. (a) i =1 (b) Geerate a sample x Λ i; ο ^F from x j;. (c) Calculate T Λ i (x Λ i; ; ^ Λ ) (d) i ++. Go to step (b) while i» B: (e) Costruct J (x; ^F )= 1 B BP 3. Geerate a sample x 0 uder H Calculate T 0 (x 0 ; ^ 0 ). 1 ft Λ i»xg: 17

18 5. ^p = 1 B BX 1 ft Λ i >T 0 g : 6. Reject H 0 if ^p <ffwith ff the sigificace level. 7. ffi j = 8 < : 1 if H 0 is rejected 0 if H 0 is accepted. 8. j ++. Go to step 2 while j» m: 9. ^ff = 1 m mx ffi i, where ^ff is the estimatio of the type I error. ^ff should be close to the sigificace level ff. 5.3 Power of the Test The choice of ^Q ca brig some problems uder the alterative hypothesis (F 2 F H1 ). ^Q should satisfy three coditios uder the alterative hypothesis i order to avoid that the critical values of J (x; ^Q ; ^ Λ ) go to ifiity as icreases. ffl T (x ; ^ )!1uder F 2 F H1. ffl ^Q with F 2 F H1 such that ^Q fl F, but some F 0 uder (F H0 ). ffl The critical value should satisfy j (1 ff; ^Q ; ^ Λ ) ' j (1 ff; F 0 ; )!! j(1 ff; F 0 ) < 1. If these coditios hold, the by Slutsky's theorem, P ft (x; ^ ) >j (1 ff; ^Q ; ^ Λ )g'pft (x; ) >j (1 ff; F 0 )g!1as!1. Propositio 5.3. Let x be a sample of size from a distributio F uder the alterative hypothesis F H1 ad let T (x; ^ ) be the test statistic of ( 16) with ^ν ad G ^ ^ν estimated uder the ull hypothesis. The, T (x; ^ )!1. The problem is how to costruct ^Q such that does ot approach the distributio F, but F H0 whe the sample x comes from F H1. ^F is ot valid i this case because F 2 F H1 (x ο F H1 ). At least, a sample x o; of size uder F H0 is required to costruct ^Q, a cosistet estimate of F H0. ^Q (x) = 8 >< >: G ^ ^ν (x)+ 1 P 1 P 1 fx0;i»xg x» ^ν 1 fx0;i»^ν gg ^ ^ν (x) x>^ν (18) with ^ ^ν a cosistet estimate of uder the ull hypothesis. The algorithm to estimate the power is equivalet to the algorithm proposed for the size, but i step 3 the sample is geerated from F 2 F H1. Therefore, ^ff is a estimate of 18

the power of the test. The objective of this hypothesis test is to reject extreme estimates defined by $\hat\nu_n$ which are not really extremes. This situation can occur for small sample sizes, where $\hat\nu_n$ may not be near the right end point $x_F$, so that it defines more extreme estimates than really exist. We can also test whether the extreme estimates defined by another threshold $\tilde\nu_n$ are really extremes.

6 Simulations and Some Financial Applications

In this section we present how our estimation and testing methodology performs in finite samples, with simulated data from different distributions as well as with real data. Under our methodology the extremes of the distribution are well estimated by the observations exceeding a determined threshold value once the null hypothesis (15) is not rejected. The extreme quantile estimates and their bootstrap confidence intervals rely upon the construction of $\hat F_n(x)$. We distinguish two cases: if $F$ has heavy tails, $G_\theta$ is a $PD_\xi$ and a consistent estimator is given by
$$\hat F_n(x) = \begin{cases} \dfrac{1}{n}\sum_{i=1}^{n} 1\{x_i \le x\}, & x \le \hat\nu_n, \\[2ex] 1 - \dfrac{1}{n}\sum_{i=1}^{n} 1\{x_i > \hat\nu_n\}\left(\dfrac{x}{\hat\nu_n}\right)^{-1/\hat\xi}, & x > \hat\nu_n; \end{cases}$$
otherwise, $G_\theta$ is a $GPD_{\xi,\alpha}$ and a consistent estimate of $F$ is
$$\hat F_n(x) = \begin{cases} \dfrac{1}{n}\sum_{i=1}^{n} 1\{x_i \le x\}, & x \le \hat\nu_n, \\[2ex] 1 - \dfrac{1}{n}\sum_{i=1}^{n} 1\{x_i > \hat\nu_n\}\left(1 + \hat\xi\,\dfrac{x - \hat\nu_n}{\hat\alpha}\right)^{-1/\hat\xi}, & x > \hat\nu_n. \end{cases}$$
By the conditional probability theorem,
$$P\{X \le x\} = P\{X \le \nu\}\,P\{X \le x \mid X \le \nu\} + P\{X > \nu\}\,P\{X \le x \mid X > \nu\} \qquad (19)$$
with $P\{X \le x \mid X \le \nu\} = 1$ for $x > \nu$. The conditional probability $P\{X \le x \mid X > \nu\} = F_\nu(x)$ can be well approximated by a Pareto type distribution $G_\theta$ for $\nu$ large (BHP theorem). Consider $x_p$ such that $P\{X \le x_p\} = 1 - p$, $0 < p < 1$, and $\hat\nu_n = x_{(k)}$ estimated by our KS distance estimator. Then, converting expression (19) into its empirical counterpart and approximating $F_\nu(x)$ by $G_{\hat\theta_n}^{\hat\nu_n}$ we obtain
$$1 - p = \frac{1}{n}\sum_{i=1}^{n} 1\{x_i \le \hat\nu_n\} + \frac{1}{n}\sum_{i=1}^{n} 1\{x_i > \hat\nu_n\}\,G_{\hat\theta_n}^{\hat\nu_n}(\hat x_p).$$
For $F \in MDA(\Phi_\alpha)$, $G_{\hat\theta_n} = PD_{\hat\xi}$,
$$\hat x_p = \hat\nu_n\left(\frac{\frac{1}{n}\sum_{i=1}^{n} 1\{x_i > \hat\nu_n\}}{p}\right)^{\hat\xi}. \qquad (20)$$

For $F \in MDA(H_\xi)$, $G_{\hat\theta_n} = GPD_{\hat\xi,\hat\alpha}$,
$$\hat x_p = \hat\nu_n + \frac{\hat\alpha}{\hat\xi}\left(\left(\frac{\frac{1}{n}\sum_{i=1}^{n} 1\{x_i > \hat\nu_n\}}{p}\right)^{\hat\xi} - 1\right). \qquad (21)$$
Quantile estimation is very important as a risk measure in many fields. In Finance it is used as a risk indicator (Value at Risk), and in Hydrology or Meteorology to determine security levels for rainfall or floods.

Another application of $\hat F_n$ is to measure the uncertainty of the tail parameter estimates. There are two challenges in making inference about these parameters. First, $F$ and the true sampling distribution of the statistic $h_n(x_n;\theta)$ of the extreme parameter are not known; second, the asymptotic distribution of $h_n$ depends on nuisance parameters. $\hat F_n$ defined from $\hat\nu_n$ allows us to generate bootstrap samples $x_n^*$ in order to calculate the bootstrap sampling distribution $L_n(x;\hat F_n) = P(h_n(x_n^*;\hat\theta_n) \le x)$ of the statistic.

Proposition 6.1. Let $h_n(x_n;\theta(F))$ be a statistic that depends on the sample $x_n$ and on the parameter $\theta(F)$. Let $L_n(x;F)$ be the true sampling distribution of the statistic and $L_n(x;\hat F_n)$ be the bootstrap approximation. Consider $\hat\theta_n(\hat F_n)$ an estimator of $\theta(F)$. Then, if the bootstrap approximation is consistent ($L_n(x;F) \simeq L_n(x;\hat F_n)$),
$$P\{L_n^{-1}(\tfrac{\alpha}{2};\hat F_n) \le h_n(x_n;\theta(F)) \le L_n^{-1}(1-\tfrac{\alpha}{2};\hat F_n)\} \simeq 1 - \alpha.$$
Suppose $h_n(x_n;\theta(F)) = n^{\gamma}(\hat\theta_n(\hat F_n) - \theta(F))$, $\gamma > 0$. Then, a confidence interval for $\theta(F)$ at significance level $\alpha$ is
$$I.C.(\alpha) = \left[\hat\theta_n(\hat F_n) - n^{-\gamma}L_n^{-1}(1-\tfrac{\alpha}{2};\hat F_n),\; \hat\theta_n(\hat F_n) - n^{-\gamma}L_n^{-1}(\tfrac{\alpha}{2};\hat F_n)\right]. \qquad (22)$$
Confidence intervals for the tail index parameter proposed in (12) are calculated with this methodology but with no information about the tail behavior, i.e. $\hat F_n$ is the empirical distribution. Once the null hypothesis of (15) is not rejected, confidence intervals from expression (12) can be improved by approximating $F$ with our semi-parametric distribution $\hat F_n$, because we then have crucial information about the tail of $F$.

6.1 Finite Sample Performance

The scope of this section is to give simulated evidence about the finite sample properties of the different estimators of the threshold, as well as the impact of these estimators on the tail index estimators.
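The interval in expression (22) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the bootstrap statistic is taken to be $n^{\gamma}(\hat\theta_n^* - \hat\theta_n)$, and resampling is done from the plain empirical distribution, whereas the paper also considers resampling from the semi-parametric $\hat F_n$.

```python
import math
import random


def empirical_quantile(sorted_values, q):
    """Smallest value in the sorted list whose empirical CDF is at least q."""
    m = len(sorted_values)
    index = min(m - 1, max(0, math.ceil(q * m) - 1))
    return sorted_values[index]


def bootstrap_confidence_interval(sample, estimator, alpha=0.05, gamma=0.5,
                                  n_boot=1000, seed=0):
    """Interval of the form (22): theta_hat minus rescaled bootstrap quantiles.

    Assumption: resampling is from the empirical distribution of `sample`.
    """
    rng = random.Random(seed)
    n = len(sample)
    theta_hat = estimator(sample)
    h_star = []
    for _ in range(n_boot):
        resample = [rng.choice(sample) for _ in range(n)]
        # Bootstrap analogue of h_n = n^gamma (theta_hat - theta).
        h_star.append(n ** gamma * (estimator(resample) - theta_hat))
    h_star.sort()
    lower_q = empirical_quantile(h_star, alpha / 2)
    upper_q = empirical_quantile(h_star, 1 - alpha / 2)
    shrink = n ** (-gamma)
    # The upper quantile gives the lower end point, and vice versa.
    return theta_hat - shrink * upper_q, theta_hat - shrink * lower_q
```

Any estimator that maps a list of observations to a number can be passed as `estimator`, for instance a tail index estimator evaluated at a fixed threshold.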

Extremes are characterized by a threshold parameter $\nu_n$ that satisfies: $\hat\xi(\nu_n) \xrightarrow{p} \xi$, $\nu_n \to x_F$ and $F_{\nu_n} = G_\theta(x;\nu_n)$. Let us start with the tail index estimator. We consider three alternative estimators: $\hat\xi_{ml}(\hat\nu_n)$, based on a GPD with the threshold estimated by KS distance; $\hat\xi_{Hill}(\hat\nu_n)$, with $\hat\nu_n$ also estimated by KS distance; and $\hat\xi_{Pic}(\nu_{Pic})$, the Pickands estimator with the threshold estimated by Pickands method (see Section 4.2). These statistics depend on the threshold; therefore the method used to select $\hat\nu_n$ is crucial to minimize possible bias effects and to get consistency. We have constructed bootstrap confidence intervals for the tail index from these three different approaches. KS(GPD) and KS(PD) are the methods anchored in a Generalized Pareto and a Pareto distribution respectively. The Pickands method is constructed with the estimates of the Pickands estimator obtained from the values over the estimated threshold proposed by Pickands (1975).

| $F$ | $\xi$ | KS (GPD) | KS (PD) | Pickands |
|---|---|---|---|---|
| $N(0,1)$ | $\xi = 0$ | $[-0.41, 0.18]$ | $[0.08, 0.19]$ | $[-0.80, 0.35]$ |
| $Exp(1)$ | $\xi = 0$ | $[-0.23, 1.22]$ | $[-0.29, 0.25]$ | $[-0.34, 0.05]$ |
| $t_{60}$ | $\xi \sim 0$ | $[-0.39, 0.27]$ | $[0, 0.24]$ | $[-0.6, 0.31]$ |
| $t_{10}$ | $\xi \sim 0.1$ | $[-0.28, 0.48]$ | $[0.16, 0.30]$ | $[-0.67, 0.09]$ |
| $PD_{1/4,1}$ | $\xi = 0.25$ | $[0.02, 0.59]$ | $[0.16, 0.37]$ | $[0.13, 0.43]$ |
| $PD_{1/2,1}$ | $\xi = 0.5$ | $[-0.13, 1.41]$ | $[0.23, 0.81]$ | $[0.46, 0.79]$ |

Table 6.1. Confidence intervals at $\alpha = 0.05$ for the tail index $\xi$ from the three proposed estimators, $\hat\xi_{ml}(\hat\nu_n)$, $\hat\xi_{Hill}(\hat\nu_n)$ and $\hat\xi_{Pic}(\nu_{Pic})$, with $\nu$ estimated by the KS distance method and Pickands estimator, respectively. $B = 1000$ bootstrap samples of size $n = 1000$ have been generated from a sample of the distribution $F$.

It can be observed that the KS(GPD) confidence intervals always contain the parameter, although they are longer than the other ones. The KS(PD) method outperforms the GPD method when $F$ has heavy tails; in other cases, this estimator can produce biased confidence intervals. The Pickands method only performs well for distributions with heavy tails.
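For a fixed threshold, the Pareto-based estimator $\hat\xi_{Hill}$ is the average log-excess derived in the proof of Proposition 4.2 in the Appendix. A minimal Python sketch follows; the function name is hypothetical, and the threshold is assumed to be positive and supplied by the caller (for example by a KS-distance selection step).

```python
import math


def hill_tail_index(sample, threshold):
    """Average of log(x_i / threshold) over the observations above the threshold.

    Assumption: threshold > 0, so that the logarithm is well defined.
    """
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    exceedances = [x for x in sample if x > threshold]
    if not exceedances:
        raise ValueError("no observation exceeds the threshold")
    return sum(math.log(x / threshold) for x in exceedances) / len(exceedances)
```

With exact Pareto data of tail index $\xi$, the estimate should fluctuate around $\xi$ with a spread of roughly $\xi/\sqrt{k}$, where $k$ is the number of exceedances.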
It is important to notice that these bootstrap intervals rely on the empirical distribution function, $F_n$. For large sample sizes the choice of bootstrap approximation of $F$ is not relevant; however, as the sample size decreases it is better to use $\hat F_n$ of expression (17), because it provides us with information about the tail when there is not sufficient data available from $F$.
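To make the role of the semi-parametric $\hat F_n$ concrete, the heavy-tailed case can be sketched in Python: the empirical distribution is used up to the threshold and a Pareto tail above it, and the tail quantile follows expression (20). The names are hypothetical, and the threshold and tail index are assumed to have been estimated beforehand.

```python
import bisect
import random


def fit_semiparametric_pareto(sample, threshold, xi):
    """Return (cdf, quantile, draw) for the heavy-tailed semi-parametric estimate."""
    data = sorted(sample)
    n = len(data)
    body = [x for x in data if x <= threshold]
    tail_prob = (n - len(body)) / n  # share of observations above the threshold

    def cdf(x):
        if x <= threshold:
            return bisect.bisect_right(data, x) / n  # empirical part
        return 1.0 - tail_prob * (x / threshold) ** (-1.0 / xi)  # Pareto tail

    def quantile(p):
        """Tail quantile x_p with P{X > x_p} = p, as in expression (20)."""
        if not 0.0 < p <= tail_prob:
            raise ValueError("p must lie in (0, tail_prob]")
        return threshold * (tail_prob / p) ** xi

    def draw(rng):
        """One bootstrap draw by inverse transform."""
        if rng.random() < tail_prob:
            v = 1.0 - rng.random()  # uniform on (0, 1]
            return threshold * v ** (-xi)
        return rng.choice(body)

    return cdf, quantile, draw
```

A bootstrap sample of size $n$ is then `[draw(rng) for _ in range(n)]` with `rng = random.Random(seed)`; unlike the plain empirical bootstrap, it can contain values beyond the sample maximum.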

| $F$ | KS (GPD), $F_n$ | KS (GPD), $\hat F_n$ | KS (PD), $F_n$ | KS (PD), $\hat F_n$ |
|---|---|---|---|---|
| $N(0,1)$ | $[-0.48, 1.45]$ | $[-0.67, 0.11]$ | $[-1.38, 0.08]$ | $[0.04, 0.38]$ |
| $Exp(1)$ | $[-0.35, 1.39]$ | $[-0.48, 1.56]$ | $[0.02, 0.42]$ | $[-2.32, 0.13]$ |
| $t_{60}$ | $[-1.49, 1.50]$ | $[-0.62, 0.01]$ | $[-0.89, 0.32]$ | $[-0.03, 0.30]$ |
| $t_{10}$ | $[-0.39, 0.29]$ | $[-0.43, 0.31]$ | $[0.20, 0.59]$ | $[-0.25, 0.29]$ |
| $PD_{1/4,1}$ | $[-0.78, 0.66]$ | $[-0.14, 0.70]$ | $[0.10, 0.42]$ | $[0.19, 0.30]$ |
| $PD_{1/2,1}$ | $[0.06, 0.95]$ | $[0.11, 1.11]$ | $[0.18, 1.70]$ | $[0.37, 0.67]$ |

Table 6.2. Confidence intervals at $\alpha = 0.05$ for the tail index $\xi$ from $\hat\xi_{ml}(\hat\nu_n)$ and $\hat\xi_{Hill}(\hat\nu_n)$ with $\nu$ estimated by the KS distance method. $B = 1000$ bootstrap samples of size $n = 250$ have been generated from a sample of the distribution $F$.

In the rest of the section we will be using $F_n$ to construct the confidence intervals for the tail index, because in order to employ $\hat F_n$ we first have to accept the null hypothesis $F_\nu = G_\theta$. To check in more detail the performance of these estimators for heavy tails, in Table 6.3 we analyze Student-t distributions with different degrees of freedom.

| | $t_1$ ($\xi \sim 1$) | $t_3$ ($\xi \sim 0.33$) | $t_5$ ($\xi \sim 0.2$) | $t_{10}$ ($\xi \sim 0.1$) | $t_{30}$ ($\xi \sim 0$) |
|---|---|---|---|---|---|
| KS (GPD) | $[0.37, 1.11]$ | $[0.10, 1.53]$ | $[-0.17, 0.33]$ | $[-0.48, 0.14]$ | $[-1.31, 0.50]$ |
| KS (PD) | $[0.67, 1.24]$ | $[0.09, 0.42]$ | $[0.15, 0.39]$ | $[0.16, 0.30]$ | $[-0.03, 0.24]$ |
| Pickands | $[0.61, 1.36]$ | $[-0.44, 0.14]$ | $[0.01, 0.90]$ | $[-0.67, 0.09]$ | $[-0.83, 0.36]$ |

Table 6.3. Confidence intervals at $\alpha = 0.05$ for the tail index $\xi$ from the three proposed estimators, $\hat\xi_{ml}(\hat\nu_n)$, $\hat\xi_{Hill}(\hat\nu_n)$ and $\hat\xi_{Pic}(\nu_{Pic})$, with $\nu$ estimated by the KS distance method and Pickands estimator, respectively. $B = 1000$ bootstrap samples of size $n = 1000$ have been generated from a sample of the different Student-t distributions.

In practice, the problem arises when the generating process of the data is unknown and there is no information about the rate of decay of the tail. The tail index can be estimated by both methods (KS(GPD) and KS(PD)), and depending on the results we should apply an adequate estimator for the threshold parameters, $\hat\nu_{n,ks}^{GPD}$ or $\hat\nu_{n,ks}^{PD}$, to achieve more accurate and reliable estimates of the extremes.
Some financial indexes are considered in Table 6.4.

| | KS (GPD) | KS (PD) | C.I. Pickands |
|---|---|---|---|
| Dax | $[-0.18, 0.89]$ | $[0.23, 0.37]$ | $[-0.49, 0.15]$ |
| Ftse | $[-0.25, 0.07]$ | $[-0.31, 0.15]$ | $[-0.46, 0.06]$ |
| Ibex | $[-0.11, 0.87]$ | $[0.25, 0.47]$ | $[-0.46, 0.04]$ |
| Nikkei | $[-0.11, 0.56]$ | $[0.27, 0.41]$ | $[-0.36, 0.03]$ |
| Dow-Jones | $[-0.15, 1.55]$ | $[0.039, 0.53]$ | $[-0.43, 0.03]$ |

Table 6.4. Confidence intervals at $\alpha = 0.05$ for the tail index $\xi$ for real data over roughly the period 05/ /2001. $B = 1000$ bootstrap samples of size $n = 1000$ have been generated for the bootstrap intervals.

Almost all financial indexes analyzed in this table can be considered fat tailed, and the extremes of these distributions are well defined by $\hat\nu_n$ from the KS estimator and the Pareto distribution ($PD_\xi$), with $\xi$ contained in a precise confidence interval. Some doubts can exist with respect to the Ftse index. In this case we conclude that the extremes follow a $GPD_{\xi,\alpha}$.

Consider now the second property of the threshold parameter: $\nu_n \to x_F$. By consistency, the threshold estimators should go to the right end point as the sample size increases.

| Distribution | Estimator | $n = 500$ | $n = 1000$ | $n = 1500$ | $n = 2000$ | $n = 5000$ |
|---|---|---|---|---|---|---|
| $N(0,1)$ | $\hat\nu_{n,ks}^{GPD}$ | 1.19 (0.57) | 1.37 (0.49) | 1.45 (0.47) | 1.51 (0.46) | 1.67 (0.42) |
| | $\hat\nu_{Pic}$ | 0.44 (0.26) | 0.52 (0.29) | 0.59 (0.32) | 0.64 (0.33) | 0.88 (0.36) |
| $t_{10}$ | $\hat\nu_{n,ks}^{PD}$ | 2.18 (0.47) | 2.28 (0.43) | 2.33 (0.41) | 2.39 (0.38) | 2.49 (0.32) |
| | $\hat\nu_{Pic}$ | 0.47 (0.27) | 0.56 (0.31) | 0.63 (0.34) | 0.69 (0.36) | 0.96 (0.39) |
| $PD_{1/4,1}$ | $\hat\nu_{n,ks}^{PD}$ | 2.14 (0.64) | 2.13 (0.62) | 2.11 (0.62) | 2.07 (0.61) | 2.07 (0.61) |
| | $\hat\nu_{Pic}$ | 1.29 (0.07) | 1.29 (0.07) | 1.29 (0.08) | 1.29 (0.08) | 1.29 (0.08) |

Table 6.5. Threshold estimation with KS distance and Pickands estimators as $n$ increases. Samples of size $n$ from different distributions are generated. The unbiased estimated standard deviation of $\hat\nu_n$ from the simulations is displayed in brackets.
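The KS-distance threshold estimates in Table 6.5 can be illustrated with a short Python sketch for the Pareto case. It is an assumption-laden illustration rather than the authors' implementation: it minimizes the weighted distance $\sqrt{k}\,\sup_x |F_\nu(x) - PD_{\hat\xi}(x/\nu)|$ that appears as $X_n(\nu)$ in the proof of Theorem 4.1, fits $\hat\xi$ by the average log-excess at each candidate, and restricts the candidates to positive upper order statistics through the hypothetical parameters `min_exceedances` and `max_fraction`.

```python
import math


def weighted_ks_distance(exceedances_sorted, threshold, xi):
    """sqrt(k) times the KS distance between the exceedances and a Pareto tail."""
    k = len(exceedances_sorted)
    distance = 0.0
    for i, y in enumerate(exceedances_sorted, start=1):
        g = 1.0 - (y / threshold) ** (-1.0 / xi)  # Pareto CDF at y
        # The empirical CDF jumps from (i - 1) / k to i / k at y.
        distance = max(distance, i / k - g, g - (i - 1) / k)
    return math.sqrt(k) * distance


def ks_threshold(sample, min_exceedances=10, max_fraction=0.5):
    """Return (threshold, xi, distance) minimizing the weighted KS distance."""
    data = sorted(sample)
    n = len(data)
    best = None
    for j in range(int(n * (1 - max_fraction)), n - min_exceedances):
        threshold = data[j]
        if threshold <= 0:
            continue  # the Pareto fit needs a positive threshold
        exceedances = [x for x in data[j + 1:] if x > threshold]
        if len(exceedances) < min_exceedances:
            continue
        xi = sum(math.log(x / threshold) for x in exceedances) / len(exceedances)
        distance = weighted_ks_distance(exceedances, threshold, xi)
        if best is None or distance < best[0]:
            best = (distance, threshold, xi)
    if best is None:
        raise ValueError("no admissible threshold candidate")
    return best[1], best[2], best[0]
```

The candidate grid is a design choice of this sketch; a GPD-based variant would replace the Pareto CDF and the log-excess fit with a GPD fit at each candidate threshold.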

As $n$ increases, the two estimators go to the right end point of the distribution. The Pickands estimator provides estimates far from the right end point, and its variance slowly increases. This result points out that the extreme estimates produced by the Pickands method may not be very reliable. On the other hand, the estimators anchored in KS distance have decreasing variance and approach $x_F$ as $n \to \infty$. Notice that for the $PD_{1/4,1}$ distribution, the $\hat\nu_{n,ks}^{PD}$ estimator has a greater variance than before and $k \nrightarrow 0$. This is because this distribution is exactly of Pareto type, but the term of the KS statistic accounting for the sample length of the tails produces this uncertainty in the threshold estimates from the bootstrap samples. The Pickands estimator detects the shape of the distribution from the beginning.

One of the goals of this paper is to propose a test to check whether the extreme estimates from a proposed threshold estimator verify the third property: $F_{\nu_n} = G_\theta(x;\nu_n)$. The rejection of the null hypothesis means the extreme estimates defined by $\hat\nu_n$ are not really extremes. Tables 6.6 and 6.7 show the size and power of the goodness of fit test proposed in (16). The proposed alternatives to measure the power of this test are constructed as deviations from the theoretical distribution of the extremes. Table 6.6 shows the empirical rejection rates of our test for $F \in MDA(H_\xi)$.

| $n = 1000$ | Size 0.01 | Size 0.05 | Power (5%): $Exp(1)$ | Power (5%): $GPD_{1/4,1}$ | Power (5%): $GPD_{1/4,1}$ |
|---|---|---|---|---|---|
| $N(0,1)$ | 0.014 | 0.07 | 0.98 | 0.96 | 0.96 |
| $Exp(1)$ | 0.014 | 0.04 | 0.5 | 0.72 | 0.75 |
| $t_{60}$ | 0.02 | 0.05 | 0.97 | 0.95 | 0.96 |
| FTSE | 0.006 | 0. | | | |

Table 6.6. $B = 1000$ bootstrap samples of length $n = 1000$ of the different distributions with exponentially decaying tails. $m = 500$ simulations are generated for the bootstrap test.

Notice that the results from the exponential distribution reflect a certain lack of power of the test. This is because the tail of an exponential with mean 1 is a GPD with $\xi = 0$. Thus, our proposed alternatives are very close to the null hypothesis. Table 6.7 displays the empirical rejection rates of our test for $F \in MDA(\Phi_\alpha)$.
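The size and power figures are averages of bootstrap test decisions, as in steps 5 to 9 of Algorithm 5.3. A minimal Python sketch of these two counting steps is given below; the function names are hypothetical, and the observed and bootstrap test statistics are assumed to have been computed elsewhere.

```python
def bootstrap_p_value(t_observed, t_bootstrap):
    """Share of bootstrap statistics strictly larger than the observed one."""
    return sum(1 for t in t_bootstrap if t > t_observed) / len(t_bootstrap)


def empirical_rejection_rate(p_values, alpha=0.05):
    """Share of simulations in which the null is rejected (p-value below alpha)."""
    return sum(1 for p in p_values if p < alpha) / len(p_values)
```

When the tested samples are generated under the null hypothesis the rejection rate estimates the size of the test, and when they are generated under an alternative it estimates the power.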

| $n = 1000$ | Size 0.01 | Size 0.05 | Power (5%): $Exp(1)$ | Power (5%): $PD_{0.1,1}$ | Power (5%): $PD_{0.65,1}$ |
|---|---|---|---|---|---|
| $t_{10}$ | 0.012 | 0.038 | 0.79 | 0.74 | 0.97 |
| $PD_{1/4,1}$ | 0.012 | 0.056 | 0.75 | 0.92 | 0.95 |
| $PD_{1/2,1}$ | 0.01 | 0.046 | 0.98 | 0.99 | 0.67 |
| Nikkei | 0.014 | 0. | | | |

Table 6.7. $B = 1000$ bootstrap samples of length $n = 1000$ of the different heavy tailed distributions. $m = 500$ simulations are generated for the bootstrap test.

Another possibility for the alternative hypothesis is to consider more extremes than with our definition of extremes, i.e. $\tilde\nu_n < \hat\nu_n$. Let us concentrate on distributions with heavy tails. We should test $F_\nu = PD_\xi$ fixing the threshold $\tilde\nu_n$ in order to check whether there are more data in the population that follow a Pareto distribution with tail index $\xi$. In addition, the opposite case can be tested as well. Consider a smaller set of extremes than the one produced with our definition of extremes. In this case the null hypothesis should be accepted because $F_{\hat\nu_n} = PD_\xi$ implies $F_{\tilde\nu_n} = PD_\xi$ with $\hat\nu_n < \tilde\nu_n$.

| Data | $\hat\nu_n$ | $\tilde\nu_n = x_{(950)}$ | $\tilde\nu_n = x_{(900)}$ | $\tilde\nu_n = x_{(800)}$ | $\tilde\nu_n = x_{(700)}$ |
|---|---|---|---|---|---|
| $t_{10}$ | $\gamma_{0.97} = 2.27$ | 0.19 | 0.01 | 0.00 | 0.00 |
| | $\hat s = (0.42)$ | (0.29) | (0.07) | (0.00) | (0.00) |
| $t_3$ | $\gamma_{0.97} = 2.97$ | 0.29 | 0.13 | 0.0001 | 0.00 |
| | $\hat s = (0.97)$ | (0.33) | (0.26) | (0.002) | (0.00) |
| Dax | $x_{(910)} = 0.025$ | 0.69 | 0.20 | 0.00 | 0.00 |
| Nikkei | $x_{(920)} = 0.021$ | 0.97 | 0.05 | 0.00 | 0.00 |

Table 6.8. p-values of the bootstrap hypothesis tests $H_0: F_{\tilde\nu_n} = PD_\xi$ for samples of $n = 1000$ observations. For the Student-t distributions $m = 500$ iterations are generated. $\gamma_p$ is the extreme quantile $\hat\nu_n$ of the distribution. The unbiased estimated standard deviation of the p-values is displayed in brackets.

7 Conclusion

Risk and uncertainty are not the same thing (see Granger, 2002) and therefore they need to be characterized by different measures. It is accepted that variance is well designed to capture the latter but not the former. To measure risk, in other words, to respond to the question "if things go wrong, how wrong can they go?", it is first necessary to find an answer

to the question "which extreme values are really extremes?" This is the main goal of this paper, where, following the methodology of Pickands (1975), we not only define formally and analytically the set of extreme observations of a given population, but also propose a simple estimator of them and construct a test to answer the previous question. Identification of the extreme observations allows us to estimate very accurately risk measures such as Value at Risk or Expected Shortfall, as well as to make inference on different tail parameters of interest. Both issues are extensions of this paper and constitute ongoing research by the authors.

A Appendix: Proofs

Corollary 3.1: Taking logs in expression (1), we have $\log\big(1 - \bar F(c_n x + d_n)\big)^n \xrightarrow{d} \log H_\xi(x)$. Therefore, $n\log\big(1 - \bar F(c_n x + d_n)\big) \xrightarrow{d} \log H_\xi(x)$. This is equivalent to $n\bar F(c_n x + d_n) \xrightarrow{d} -\log H_\xi(x)$, with $H_\xi = e^{-(1+\xi x)^{-1/\xi}}$ if $\xi \ne 0$ and $H_\xi = e^{-e^{-x}}$ if $\xi = 0$. We obtain $n\bar F(c_n x + d_n) \xrightarrow{d} (1+\xi x)^{-1/\xi}$ if $\xi \ne 0$ and $n\bar F(c_n x + d_n) \xrightarrow{d} e^{-x}$ if $\xi = 0$. □

Corollary 3.2: Let $F \in MDA(\Phi_\alpha)$ and $M_n = \max(x_1,\ldots,x_n)$. By definition, there exist constants $c_n = F^{\leftarrow}(1 - \frac{1}{n})$ and $d_n = 0$ such that $c_n^{-1}(M_n - d_n) \xrightarrow{d} \Phi_\alpha$ with $\Phi_\alpha = e^{-x^{-\alpha}}$, $x > 0$, and $\alpha > 0$. By Proposition 3.1, $\bar F \in RV_{-\alpha}$. Consider $\nu, x \in \mathrm{support}(F)$ with $x_F = \infty$ and $x = \nu t$ with $t > 1$. Notice that for $0 < t \le 1$, $F_\nu(x) = 0$. Operating in expression (4),
$$\lim_{\nu\to\infty}\left(1 - \frac{1 - F(x)}{1 - F(\nu)}\right) = \lim_{\nu\to\infty}\frac{F(x) - F(\nu)}{1 - F(\nu)} = \lim_{\nu\to\infty} F_\nu(x) = 1 - \left(\frac{x}{\nu}\right)^{-\alpha} = PD_\xi\left(\frac{x}{\nu}\right). \quad □$$

Proposition 4.1: First the if part. Consider $\hat\nu_n$ a threshold estimator such that the values above it are extreme values. Therefore expression (10) can be written, replacing the parameter by the estimator, as
$$\lim_{\hat\nu_n \to x_F}\sup_x |F_{\hat\nu_n}(x) - G_\theta(x;\hat\nu_n)| = 0.$$
This implies $P\{\sup_x |F_{\hat\nu_n}(x) - G_\theta(x;\hat\nu_n)| > \varepsilon\} \to 0$. In addition, if $\hat\nu_n$ defines the set of extreme values there may exist a subset $A_n \subset \mathbb{R}$ such that $|F_{\hat\nu_n}(A_n) - G_\theta(A_n;\hat\nu_n)| > \varepsilon$, although from (10) $\sup_{x \in A_n} |F_{\hat\nu_n}(x) - G_\theta(x;\hat\nu_n)| \to 0$. Then, it is derived that $F_{\hat\nu_n}(x) = G_\theta(x;\hat\nu_n)$ $\forall x \in \mathbb{R} \setminus A_n$. With respect to the only if part, this result follows from condition (2).
The continuity near the right end point $x_F$ and the consistency of the estimator $\hat\nu_n$ imply that $\lim_{\hat\nu_n \to x_F}\sup_x |F_{\hat\nu_n}(x) - G_\theta(x;\hat\nu_n)| = 0$. □

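Proposition 4.2: Let $x_1,\ldots,x_k \sim PD_\xi$ with $PD_\xi(\frac{x}{\nu}) = 1 - (\frac{x}{\nu})^{-\alpha}$, $x > \nu$. The density function is $pd(x) = \alpha(\frac{x}{\nu})^{-(\alpha+1)}\frac{1}{\nu}$. Then, the likelihood function is $l(x_1,\ldots,x_k;\nu,\alpha) = (\frac{\alpha}{\nu})^k\prod_{i=1}^{k}(\frac{x_i}{\nu})^{-(\alpha+1)}$. Let $\xi = \frac{1}{\alpha}$; then from the first order conditions it is easy to obtain
$$\hat\xi = \frac{1}{k}\sum_{i=1}^{k}\log\frac{x_i}{\nu}. \quad □$$

Theorem 4.1: Let $\hat\nu_n$ be the threshold estimator derived from the KS distance and let $\hat\xi(\hat\nu_n)$ be a consistent estimator of the tail index based on $x_n$ with $\xi \in \theta$. Then
$$\sup_x |F_{\hat\nu_n}(x) - G_\theta(x;\hat\nu_n)| \le \sup_x |F_{\hat\nu_n}(x) - G_{\hat\theta_n}(x;\hat\nu_n)| + \sup_x |G_\theta(x;\hat\nu_n) - G_{\hat\theta_n}(x;\hat\nu_n)|.$$
$\hat\theta_n$ is a consistent estimator of $\theta$; therefore, $\sup_x |G_\theta(x;\hat\nu_n) - G_{\hat\theta_n}(x;\hat\nu_n)| \xrightarrow{p} 0$.

Let $X_n(\nu) = \sqrt{\sum_{i=1}^{n} 1\{x_i > \nu\}}\,\sup_x |F_\nu(x) - G_{\hat\theta_n}^{\nu}(x;\nu)|$ such that for values of $\nu$ sufficiently large $X_n(\nu)$ is a random variable that follows a functional of a centered Gaussian process depending on the parameter $\theta$ (see Durbin, 1973). Consider now $X_n(\hat\nu_n) = \min\{X_{n,1}(\nu_1),\ldots,X_{n,k}(\nu_k)\}$ with $\{\nu_1,\ldots,\nu_k\}$ greater than a $\nu_0$ verifying the BHP theorem and $X_{n,i}(\nu_i)$ random variables. $\hat\nu_n$ is the argument of the minimum of this finite set: $\hat\nu_n = \arg\min_\nu X_n(\nu)$. Then,
$$P(X_n(\hat\nu_n) > \varepsilon) = P(\min\{X_{n,1}(\nu_1),\ldots,X_{n,k}(\nu_k)\} > \varepsilon) = P(X_{n,i}(\nu_i) > \varepsilon)^k.$$
As $n$ goes to infinity $k$ increases as well. In addition, $P(X_n(\nu) > \varepsilon) < 1$; therefore, $P(X_n(\hat\nu_n) > \varepsilon) \to 0$ as $n, k \to \infty$. This expression is equivalent to
$$P\Big\{\sqrt{\textstyle\sum_{i=1}^{n} 1\{x_i > \hat\nu_n\}}\,\sup_x |F_{\hat\nu_n}(x) - G_{\hat\theta_n}^{\hat\nu_n}(x;\hat\nu_n)| > \varepsilon\Big\} \to 0.$$
Then, $P\{\sup_x |F_{\hat\nu_n}(x) - G_{\hat\theta_n}^{\hat\nu_n}(x;\hat\nu_n)| > \varepsilon^*\} \to 0$ with $0 < \varepsilon^* < \varepsilon$. □

Proposition 5.1: Let $x_n$ be a sample of size $n$ from $F$. Assume that $\hat F_n$ is an estimate of $F$ based on $x_n$ verifying $\sup_x |\hat F_n(x) - F(x)| \xrightarrow{p} 0$ and let $J_n(x;F,\hat\theta_n)$ be the true sampling distribution of the statistic $T_n(x_n;\hat\theta_n)$. This distribution is such that $J_n(x;F,\hat\theta_n) \to J(x;F,\theta)$, with $J(x;F,\theta)$ being a strictly increasing continuous function in $x$. Then, $P\{T_n(x;\hat\theta_n) \le J_n^{\leftarrow}(1-\alpha;F,\hat\theta_n)\} \to P\{T_n(x;\hat\theta_n) \le J^{\leftarrow}(1-\alpha;F,\theta)\} = 1-\alpha$. In addition, $J(x;F,\theta)$ is continuous and strictly increasing; therefore $J_n^{\leftarrow}(1-\alpha;F,\hat\theta_n) \to J^{\leftarrow}(1-\alpha;F,\theta)$. Then, as $n \to \infty$, $J_n^{\leftarrow}(1-\alpha;\hat F_n,\hat\theta_n^*) \to$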
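$J^{\leftarrow}(1-\alpha;F,\theta)$ because $\sup_x |\hat F_n(x) - F(x)| \xrightarrow{p} 0$. Consequently, $P\{T_n(x;\hat\theta_n) \le J_n^{\leftarrow}(1-\alpha;\hat F_n,\hat\theta_n^*)\} \to P\{T_n(x;\hat\theta_n) \le J^{\leftarrow}(1-\alpha;F,\theta)\} = 1-\alpha$. Then,
$$\sup_x |J_n(x;F,\hat\theta_n) - J_n(x;\hat F_n,\hat\theta_n^*)| \le \sup_x |J_n(x;F,\hat\theta_n) - J(x;F,\theta)| + \sup_x |J_n(x;\hat F_n,\hat\theta_n^*) - J(x;F,\theta)| \to 0. \quad □$$

Proposition 5.2: For $x \le \hat\nu_n$, $\hat F_n(x)$ is the empirical distribution function. By the Glivenko-Cantelli theorem, $\sup_{x \le \hat\nu_n} |\hat F_n(x) - F(x)| = \sup_{x \le \hat\nu_n} |F_n(x) - F(x)| \xrightarrow{a.s.} 0$. For $x > \hat\nu_n$, under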


More information

Lecture 6 Simple alternatives and the Neyman-Pearson lemma

Lecture 6 Simple alternatives and the Neyman-Pearson lemma STATS 00: Itroductio to Statistical Iferece Autum 06 Lecture 6 Simple alteratives ad the Neyma-Pearso lemma Last lecture, we discussed a umber of ways to costruct test statistics for testig a simple ull

More information

Parameter, Statistic and Random Samples

Parameter, Statistic and Random Samples Parameter, Statistic ad Radom Samples A parameter is a umber that describes the populatio. It is a fixed umber, but i practice we do ot kow its value. A statistic is a fuctio of the sample data, i.e.,

More information



