On Out-of-Sample Statistics for Financial Time-Series


François Gingras, Yoshua Bengio, Claude Nadeau

CRM-2585

January 1999

Département de physique, Université de Montréal
Laboratoire d'informatique des systèmes adaptatifs, Département d'informatique et recherche opérationnelle, Université de Montréal
Centre interuniversitaire de recherche en analyse des organisations


Abstract

This paper studies an out-of-sample statistic for time-series prediction that is analogous to the widely used $R^2$ in-sample statistic. We propose and study methods to estimate the variance of this out-of-sample statistic. We suggest that the out-of-sample statistic is more robust to the distributional and asymptotic assumptions behind many tests for in-sample statistics. Furthermore, we argue that it may be more important in some cases to choose a model that generalizes as well as possible rather than to choose the parameters that are closest to the true parameters. Comparative experiments are performed on artificial data as well as on a financial time-series (daily and monthly returns of the TSE300 index). The experiments are performed for varying prediction horizons and we study the relation between predictability (out-of-sample $R^2$), variability of the out-of-sample $R^2$ statistic, and the prediction horizon. In particular, we find that very different conclusions would be obtained when testing against the null hypothesis of no dependency rather than testing against the null hypothesis that the proposed model does not generalize better than a naive forecast.


1 Introduction

The purpose of the analysis of time-series such as financial time-series is often to take decisions based on this analysis. The analyst is given a certain quantity of historical data $D_T = \{z_1, \dots, z_T\}$ from which he will eventually come up with a decision. In this paper, we will focus on decisions which take the form of a prediction $\hat{y}_{T+h}$ of the future value of some variable⁴, say $Y_{T+h}$, with $Z_t = (X_{t-h}, Y_t)$. The quality of the prediction will be judged a posteriori according to some loss function, such as the squared difference $(\hat{y}_{T+h} - Y_{T+h})^2$ between the prediction $\hat{y}_{T+h}$ and the realization $Y_{T+h}$ of the predicted variable. A common approach is to use the historical data $D_T$ to infer a function $f$ that takes as input the value of some summarizing information $X_t$ available at time $t$ and produces as output $\hat{y}_{t+h} = f(X_t)$, which in the case of the above quadratic loss function would be an estimate of the conditional expectation $E[Y_{t+h} \mid X_t]$. The hope is that if this function worked well on observed past pairs $(x_t, y_{t+h})$, it should work well on $(X_T, Y_{T+h})$.

How should we choose the function $f$? A classical approach is to assume a parametrized class of functions (e.g., the affine functions), estimate the value of these parameters (e.g., by maximum likelihood or least squares), and then, in order to validate the model, perform statistical tests to verify whether these parameters differ significantly from the value that would be consistent with a null hypothesis (e.g., the parameters of the regression are significantly different from zero, so that there is really a linear dependency between the X's and the Y's). In particular, these tests are important to know whether one should use the proposed model at all, or to decide among several models. In this paper we will consider alternative approaches to address the last question, i.e., how a model should be validated and how several models should be compared.

It is very satisfying to obtain a result on the true value of the parameters (e.g., to use an efficient estimator, which converges as fast as possible to the true value of the parameters). But in many applications of time-series analysis, the end-user of the analysis may be more interested in knowing whether the model is going to work well, i.e., to generalize well to future cases. In fact, we will argue that sometimes (especially when data is scarce), the two objectives (estimating the true parameters or choosing the model that generalizes better) may yield quite different results. Another fundamental justification for the approach that we are putting forward is that we may not be sure that the true distribution of the data has the form (e.g., linear, Gaussian, etc.) that has been assumed. In this case it may not be meaningful to talk about the true value of the parameters. What may be more appropriate is the question of generalization performance: will the model yield good predictions in the future? where the notion of "good" can be used to compare two models.

To obtain answers to such questions, we will consider statistics that measure out-of-sample performance, i.e., performance measured on data that was not used to form the prediction function. In the machine learning community, it is common to use such measures of performance. For example, one can estimate out-of-sample error by first estimating the parameters on a large subset of the data (also called the training data) and testing the function on the rest of the data (also called the held-out or test data). With this method, when there is little historical data, the estimate of performance would be poor because it would be based on a small subset of the data.
An alternative is the K-fold cross-validation method, in which K train/test partitions are created, a separate parameter estimation is performed on each of the training subsets, testing is performed for each of these K functions on the corresponding K test sets, and the average of these test errors provides a (slightly pessimistic) estimate of generalization error (Efron and Tibshirani, 1993). When the data is sequential (and may be non-stationary), the above methods may not be applicable (and in the non-stationary case may yield optimistic estimates). A more honest estimate can be obtained

⁴ In this paper we will normally use upper case for random variables and lower case for their value.

with a sequential cross-validation procedure, described in this paper. This estimate essentially attempts to measure the predictability of the time-series when a particular class of models is used. In this context, what the analyst will try to choose is not just a function from $x_t$ to $E[Y_{t+h} \mid x_t]$, but a functional $F$ that maps historical data $D_t$ into such a function (and will be applied to many consecutive time steps, as more historical data is gathered).

What do we mean by predictability? For practitioners of quantitative finance, predictability equates to beating the market. This goal is, usually and for a large part, from the domain of financial engineering. Depending on the market, and things like the investment horizon and the nature of the investor, it requires the incorporation of transaction fees, liquidity of the market, risk management, tax and legal considerations, and so on. But, among the details on which an applicable model depends, one which is crucial is that in order to obtain the decision or the prediction at time t, only information that is available at time t can be used. This includes not only the inputs of the function but also the data that is used to estimate the parameters of this function. In this paper, we define predictability of sequential data, such as market returns, as the possibility to identify, on the basis of known information at time t (that will be used as input), a function among a set of possible functions (from the input to a prediction) that outperforms (in terms of generalization performance) a naive model that does not use any input, but can be trained on the same (past) data.

One objective of this paper is to study whether the use of generalization error allows us to recover the statistical results obtained by traditional in-sample inference, but without making any assumption on the properties of the data and the residuals of the model. We apply the method to a simple linear model where we can, in the case of the in-sample statistic, construct an autocorrelation-consistent standard error. Here the results obtained with the out-of-sample statistic are consistent with those obtained with the in-sample approach, but the method has the advantage of being directly extensible to any model, linear or not (e.g., neural networks), for which the distributions of the in-sample statistics are unknown or rely on delicate hypotheses. Several empirical tests based on bootstrapping techniques are proposed to compare the model to a null hypothesis in which the inputs do not contain information about future outputs. Another major objective of this paper is to establish a distinction between two apparently close null hypotheses: (1) no relationship between the inputs and the outputs, and (2) no better predictive power of a given model with respect to a naive model. We propose a method to test against the second null hypothesis, and we show that these two types of tests yield very different results on our financial returns data.

In section 2, we present the classical notions of generalization error, empirical risk minimization, and cross-validation, and we extend these notions to non-i.i.d. data. We also present the notion of a naive model used to establish a comparison benchmark (and null hypotheses) and establish the connection between it and the in-sample variance of the data. In section 3, we recall some aspects of ordinary least squares regression applied to sequential data. We recall the definition of explained variance, the usual in-sample $R^2$, used in predictability tests (Campbell et al., 1997; Kaul, 1996). We define an out-of-sample analogue of $R^2$ that we denote $R_o^2$, and a related but unbiased statistic that we denote $D_o$.
The $R_o^2$ introduced in this paper is related to a definition of forecastability proposed by Granger and Newbold (1976). Section 4 describes the financial time-series data and presents some preliminary results. In section 5, we test the hypothesis of no relation between the inputs and the outputs. We generate bootstrap samples similar to the original series but for which we know that the null hypothesis is true, and we compare the test statistics observed on the original series to the distribution of the statistics obtained with this bootstrap process. Although this hypothesis is of no direct relevance

to this paper, it allows us to nicely introduce some difficult issues with the data at hand (such as the dependency induced by overlapping) and the type of methodologies used later on, including the bootstrap as mentioned above. It also allows us to make the distinction between the absence of relationship between inputs and outputs and the inability of inputs to forecast outputs. In order to study some properties of the statistics used to test the hypothesis of no relation between the inputs and the outputs, we also generated stationary artificial data for which we know the nature of the relation between the inputs and the outputs. Using this data, we compare the power of in-sample and out-of-sample statistics when testing against a hypothesis of no linear dependency.

Section 6 aims at assessing whether inputs may be used to produce forecasts that would outperform a naive forecast. Following section 3, we test whether $R_o^2 = 0$ against the alternative that it is positive. To do so, we use the statistic $\hat{R}_o^2$ and various bootstrap schemes. The experiments are performed for varying prediction horizons and we study the relation between predictability (the out-of-sample $R_o^2$), variability of the out-of-sample $R_o^2$ statistic, and the prediction horizon. The results are compared to those obtained when trying to reject the null hypothesis of no dependency.

2 Expected Risk and Sequential Validation

This section reviews notions from the generalization theory of Vapnik (1995), and it presents an extension to non-i.i.d. data of the concepts of generalization error and cross-validation. We also define a naive model that will be used as a reference for the $R_o^2$ statistic.

First let us consider the usual i.i.d. case (Vapnik, 1995). Let $Z = (X, Y)$ be a random variable with an unknown density $P(Z)$, and let the training set $D_l$ be a set of $l$ examples $z_1, \dots, z_l$ drawn independently from this distribution. In our case, we will suppose that $X \in \mathbb{R}^n$ and $Y \in \mathbb{R}$. Let $\mathcal{F}$ be a set of functions from $\mathbb{R}^n$ to $\mathbb{R}$. A measure of loss is defined which specifies how well a particular function $f \in \mathcal{F}$ performs the generalization task for a particular $Z$: $Q(f, Z)$ is a functional from $\mathcal{F} \times \mathbb{R}^{n+1}$ to $\mathbb{R}$. For example, in this paper we will use the quadratic error $Q(f, Z) = (Y - f(X))^2$. The objective is to find a function $f \in \mathcal{F}$ that minimizes the expectation of the loss $Q(f, Z)$, that is, the generalization error of $f$:

$$G(f) = E[Q(f, Z)] = \int Q(f, z) P(z)\,dz \tag{1}$$

Since the density $P(z)$ is unknown, we cannot measure, much less minimize, $G(f)$, but we can minimize the corresponding empirical error:

$$G_{emp}(f, D_l) = \frac{1}{l} \sum_{z_i \in D_l} Q(f, z_i) = \frac{1}{l} \sum_{i=1}^{l} Q(f, z_i) \tag{2}$$

where the $z_i$ are sampled i.i.d. from the unknown distribution $P(Z)$. When $f$ is chosen independently of $D_l$, this is an unbiased estimator of $G(f)$, since $E[G_{emp}(f, D_l)] = G(f)$. Empirical risk minimization (Vapnik, 1982, 1995) simply chooses

$$f^* = F(D_l) = \arg\min_{f \in \mathcal{F}} G_{emp}(f, D_l)$$

where we have denoted by $F(D_l)$ the functional that maps a data set into a decision function. Vapnik has shown various bounds on the maximum difference between $G(f)$ and $G_{emp}(f, D_l)$ for $f \in \mathcal{F}$, which depend on the so-called VC-dimension or capacity (Vapnik, 1982, 1995) of the set of functions $\mathcal{F}$.
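To fix ideas, here is a minimal Python sketch (our own illustration, not from the paper; names like `fit_affine` are invented) that computes the empirical error (2) and applies empirical risk minimization within the affine class:

```python
import numpy as np

def empirical_error(f, X, Y):
    """G_emp(f, D_l) of equation (2): average quadratic loss over the sample."""
    return np.mean((Y - f(X)) ** 2)

def fit_affine(X, Y):
    """F(D_l): the empirical risk minimizer within the affine class (least squares)."""
    b, a = np.polyfit(X, Y, deg=1)  # polyfit returns the highest-degree coefficient first
    return lambda x: a + b * x

rng = np.random.default_rng(0)
X = rng.normal(size=500)
Y = 0.3 * X + rng.normal(scale=0.2, size=500)
f_star = fit_affine(X, Y)
print(empirical_error(f_star, X, Y))  # training error: slightly optimistic for G(f_star)
```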

Note that the capacity of an ordinary linear regression model is simply the number of free parameters. When the sample size $l$ is small relative to the capacity, there will be a significant discrepancy between the generalization error (1) and the in-sample training error $G_{emp}(F(D_l), D_l)$, but it can be controlled through the capacity of the set of functions $\mathcal{F}$ (Vapnik, 1995), e.g., in the case of linear models, by using fewer inputs or by regularizing (for example by constraining the parameters to be small).

An empirical estimate of $G(F(D))$, the generalization error of a functional $F$ (from a data set $D$ to a function $f \in \mathcal{F}$), can be obtained by partitioning the data into two subsets: a training subset $D_1$ to pick $f = F(D_1) \in \mathcal{F}$ which minimizes the empirical error on $D_1$, and a held-out or test subset $D_2$ which gives an unbiased estimate of $G(F(D_1))$, the generalization error of $f$, and a slightly pessimistic estimate of $G(F(D))$, the generalization error associated to the functional $F$ when it is applied to $D = D_1 \cup D_2$. When there is not much data, it is preferable but computationally more expensive to use the K-fold cross-validation procedure described in Bishop (1995); Efron and Tibshirani (1993).

However, in the case where the data are not i.i.d., the results of learning theory are not directly applicable, nor are the procedures for estimating generalization error. Consider a sequence of points $z_1, z_2, \dots$, with $z_t \in \mathbb{R}^{n+1}$, generated by an unknown process such that the $z_t$'s may be dependent and have different distributions. Nevertheless, at each time step $t$, in order to make a prediction, we are allowed to choose a function $f_t$ from a set of functions $\mathcal{F}$ using the past observations $z_1^t = (z_1, z_2, \dots, z_t)$, i.e., we choose $f_t = F(z_1^t)$. In our applications $z_t$ is a pair $(x_{t-h}, y_t)$⁵ and the functions $f \in \mathcal{F}$ take an $x_t$ as input to take a decision that will be evaluated against $y_{t+h}$ through the loss function $Q(f_t, Z_{t+h})$. Here we call $h$ the horizon because it corresponds to the prediction horizon in the case of prediction problems. More generally, it is the number of time steps from a decision to the time when the quality of this decision can be evaluated. In this paper, we consider the quadratic loss $Q(f, Z_{t+h}) = Q(f, (X_t, Y_{t+h})) = (Y_{t+h} - f(X_t))^2$. We then define the expected generalization error $G_t$ for the decision at time $t$ as

$$G_t(f) = E[Q(f, Z_{t+h}) \mid Z_1^t] = \int Q(f, z_{t+h}) P_{t+h}(z_{t+h} \mid Z_1^t)\,dz_{t+h}. \tag{3}$$

The objective of learning is to find, on the basis of the empirical data $z_1^t$, the function $f \in \mathcal{F}$ which has the lowest expected generalization error $G_t(f)$. The process $Z_t$ may be non-stationary, but as long as the generalization errors made by a good model are rather stable in time, we can hope to use the data $z_1^t$ to pick a function which would have worked well in the past and will work well in the future. Therefore, we will extend the above empirical and generalization errors (equations 2 and 1). However, we consider not the error of a single function $f$ but the error associated with a functional $F$ (which maps a data set $D_t = z_1^t$ into a function $f \in \mathcal{F}$). Now let us first consider the empirical error, which is the analogue for non-i.i.d. data of the K-fold cross-validation procedure. We call it the sequential cross-validation procedure and it measures the out-of-sample error of the functional $F$ as follows:

$$C_T(F, z_1^T) = \frac{1}{T - M + 1} \sum_{t=M}^{T} Q(F(z_1^{t-h}), z_t) \tag{4}$$

where $f_t = F(z_1^t)$ is the choice of the training algorithm using the data $z_1^t$ (see equation 7 below), and $M - h > 0$ is the minimum number of training examples required for $F(z_1^{M-h})$ to provide meaningful results.

⁵ The first observable $x$ is called $x_{1-h}$ rather than $x_1$.
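The sequential cost (4) can be computed with an expanding-window loop that, at each step, re-estimates the model on data ending $h$ steps before the evaluation point. Here is a minimal sketch under our own conventions (not the paper's code): arrays are aligned so that `x[i]` is the input observed $h$ steps before `y[i]`, and `fit` is any training functional:

```python
import numpy as np

def sequential_cost(x, y, h, M, fit):
    """C_T(F, z_1^T), equation (4): average of Q(F(z_1^{t-h}), z_t) for t = M, ..., T.
    x[i] is the input paired with y[i] and observed h steps earlier, so at time t
    only the pairs with index < t - h are available for training."""
    T = len(y)
    losses = []
    for t in range(M, T + 1):                # t is 1-based, as in the paper
        f = fit(x[: t - h], y[: t - h])      # train on z_1^{t-h} only
        losses.append((y[t - 1] - f(x[t - 1])) ** 2)
    return float(np.mean(losses))
```

Any `fit` that maps a training set to a prediction function can be plugged in; the naive and linear functionals defined below are the two used in the paper.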

We define the generalization error associated to a functional $F$ for decisions or predictions with a horizon $h$ as follows:

$$E_{Gen}(F) = E[C_T(F, z_1^T)] = \frac{1}{T - M + 1} \sum_{t=M}^{T} \int Q(F(z_1^{t-h}), z_t) P(z_1^T)\,dz_1^T = \frac{1}{T - M + 1} \sum_{t=M}^{T} E[G_{t-h}(F(Z_1^{t-h}))] \tag{5}$$

where $P(z_1^T)$ is the probability of the sequence $z_1^T$ under the generating process. In that case, we readily see that (4) is the empirical version of (5), that is, (4) estimates (5) by definition. In the case of the quadratic loss, we have

$$E_{Gen}(F) = \frac{1}{T - M + 1} \sum_{t=M}^{T} E\Big[\mathrm{Var}\big[F(Z_1^{t-h})(X_{t-h}) - Y_t \mid X_{1-h}^{T-h}\big] + E^2\big[F(Z_1^{t-h})(X_{t-h}) - Y_t \mid X_{1-h}^{T-h}\big]\Big] \tag{6}$$

To complete the picture, let us simply mention that the functional $F$ may be chosen as

$$F(z_1^t) = \arg\min_{f \in \mathcal{F}} \left[ R(f) + \frac{1}{t} \sum_{s=1}^{t} Q(f, z_s) \right] \tag{7}$$

where $R(f)$ might be used as a regularizer, to define a preference among the functions of $\mathcal{F}$, e.g., those that are smoother.

For example, consider a sequence of observations $z_t = (x_{t-h}, y_t)$. A simple class of functions $\mathcal{F}$ is the class of constant functions, which do not depend on the argument $x$, i.e., $f(x) = \mu$. Applying the principle of empirical risk minimization to this class of functions with the quadratic loss $Q(f, (x_t, y_{t+h})) = (y_{t+h} - f(x_t))^2$ yields

$$f_t^{naive} = F^{naive}(z_1^t) = \arg\min_{\mu} \sum_{s=1}^{t} (y_s - \mu)^2 = \bar{y}_t = \frac{1}{t} \sum_{s=1}^{t} y_s, \tag{8}$$

the historical average of the $y$'s up to the current time $t$. We call this unconditional predictor the naive model, and its average out-of-sample error $C_T(F^{naive}, z_1^T) = \frac{1}{T - M + 1} \sum_{t=M}^{T} (\bar{y}_{t-h} - y_t)^2$ is called the out-of-sample naive cost.

3 Out-of-Sample $R^2$

To assess the generalization ability of a functional $F$ for a more interesting class of functions, which depend on their argument $x$, let us introduce the relative measure of performance

$$R_o^2(F) = 1 - \frac{E_{Gen}(F)}{E_{Gen}(F^{naive})} = 1 - \frac{E[C_T(F, z_1^T)]}{E[C_T(F^{naive}, z_1^T)]} \tag{9}$$

where $E_{Gen}(\cdot)$, $C_T(\cdot, \cdot)$ and the $F^{naive}$ functional were discussed in the previous section.

$R_o^2$ will be negative, null or positive according to whether the functional $F$ generalizes worse than, as well as, or better than $F^{naive}$. Broadly speaking, when $R_o^2$ is positive it means that there is a dependency between the inputs and the outputs. In other words, when there is no dependency and we use a model with more capacity (e.g., degrees of freedom) than the naive model, then $R_o^2$ will be negative. The converse is not true: $R_o^2 < 0$ does not imply no dependency, but indicates that the dependency (if present) is not captured by the class of functions in the image of $F$. So in cases where the signal-to-noise ratio is small, it may be preferable not to try to capture the signal for making predictions.

The empirical version, or estimator, of $R_o^2$, called the out-of-sample $R^2$, is defined as the statistic

$$\hat{R}_o^2(F) = 1 - \frac{C_T(F, z_1^T)}{C_T(F^{naive}, z_1^T)} = 1 - \frac{\sum_{t=M}^{T} (e_t^F)^2}{\sum_{t=M}^{T} (e_t^{naive})^2}, \tag{10}$$

where $e_t^F = y_t - F(z_1^{t-h})(x_{t-h})$ denotes the prevision error made on $y_t$ by the functional $F$. To ease notation, we let $e_t^{naive}$ stand for $e_t^{F^{naive}}$. This empirical $\hat{R}_o^2$ is a noisy estimate (due to the finite sample), and thus might be positive even when $R_o^2$ is negative (or vice versa). Furthermore, this estimate $\hat{R}_o^2$ may be biased because its expectation is 1 minus the expectation of a ratio of two random variables ($C_T(F, z_1^T)$ and $C_T(F^{naive}, z_1^T)$), which is different from $R_o^2$, which is 1 minus the ratio of the expectations of these same variables. However, unless there is some strange dependency between these two variables, we can expect that $\hat{R}_o^2$ underestimates $R_o^2$ (which is preferable to over-estimating it, meaning that a more conservative estimate is made). It is therefore important to analyze how noisy this estimate is in order to conclude on the dependency between the inputs and the outputs. This matter will be addressed in a later section, using a related statistic that is unbiased (in the sense that the expectation of the empirical estimate is equal to the true value), which we denote $D_o$:

$$D_o(F) = E_{Gen}(F^{naive}) - E_{Gen}(F) = E[C_T(F^{naive}, z_1^T)] - E[C_T(F, z_1^T)] \tag{11}$$

with empirical estimate

$$\hat{D}_o(F) = C_T(F^{naive}, z_1^T) - C_T(F, z_1^T) = \frac{1}{T - M + 1}\left[\sum_{t=M}^{T} (e_t^{naive})^2 - \sum_{t=M}^{T} (e_t^F)^2\right].$$

To understand the name out-of-sample $R^2$, notice that $\hat{R}_o^2$ looks like the usual $R^2$, which is

$$\hat{R}^2 = 1 - \frac{\sum_{t=1}^{T} (\tilde{e}_t^F)^2}{\sum_{t=1}^{T} (\tilde{e}_t^{naive})^2} = 1 - \frac{\sum_{t=1}^{T} (y_t - F(z_1^T)(x_{t-h}))^2}{\sum_{t=1}^{T} (y_t - \bar{y}_T)^2}, \tag{12}$$

where $\tilde{e}_t^F = y_t - F(z_1^T)(x_{t-h})$ is the usual in-sample residual and $\tilde{e}_t^{naive} = \tilde{e}_t^{F^{naive}}$. However, note that $\hat{R}_o^2$, like $R_o^2$, may be negative, which contrasts with $R^2$, which is non-negative whenever the constant functions belong to $\mathcal{F}$. The terms "in" and "out of" sample underline the difference between $\tilde{e}_t^F$, which depends on the whole sample through $F(z_1^T)$, and $e_t^F$, which depends solely on $y_t$ and the sample $z_1^{t-h}$ up to time $t - h$. In other words, $e_t^F$ is a genuine forecast error and $\tilde{e}_t^F$ is not, as $F(z_1^T)$ is available only at time $T$, so that $F(z_1^T)(x_{t-h})$ cannot be used at time $t - h$.
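Continuing the sketch above (same conventions, invented names), the naive and linear functionals and the statistics $\hat{R}_o^2$ (10) and $\hat{D}_o$ are then a few lines each:

```python
import numpy as np

def fit_naive(x_train, y_train):
    """F_naive, equation (8): the historical mean, ignoring the input."""
    mu = float(np.mean(y_train))
    return lambda x_new: mu

def fit_linear(x_train, y_train):
    """F_lin: least squares regression of y on x, using only past data."""
    b, a = np.polyfit(x_train, y_train, deg=1)
    return lambda x_new: a + b * x_new

def out_of_sample_stats(x, y, h, M):
    """hat R_o^2 (10) and hat D_o (empirical version of (11)); uses sequential_cost above."""
    c_naive = sequential_cost(x, y, h, M, fit_naive)
    c_lin = sequential_cost(x, y, h, M, fit_linear)
    return 1.0 - c_lin / c_naive, c_naive - c_lin
```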

An example may clarify all of the above. Take $n = 1$ and let $\mathcal{F}^{lin}$ be the set of affine functions, i.e., linear models $f(x) = \alpha + \beta x$. Sticking with the quadratic loss and no regularization, we have that $f_t^{lin}(x) = F^{lin}(z_1^t)(x) = \hat{\alpha}_t + \hat{\beta}_t x$, where $(\hat{\alpha}_t, \hat{\beta}_t)$, minimizing

$$\sum_{s=1}^{t} (y_s - \alpha - \beta x_{s-h})^2,$$

are the least squares estimates of the linear regression of $y_s$ on $x_{s-h}$, $s = 1, \dots, t$, and rely only on data known up to time $t$, i.e., $z_1^t$. Thus

$$e_t^{naive} = y_t - F^{naive}(z_1^{t-h})(x_{t-h}) = y_t - \bar{y}_{t-h}$$
$$e_t^{lin} = y_t - F^{lin}(z_1^{t-h})(x_{t-h}) = y_t - \hat{\alpha}_{t-h} - \hat{\beta}_{t-h} x_{t-h}$$
$$\tilde{e}_t^{naive} = y_t - F^{naive}(z_1^T)(x_{t-h}) = y_t - \bar{y}_T$$
$$\tilde{e}_t^{lin} = y_t - F^{lin}(z_1^T)(x_{t-h}) = y_t - \hat{\alpha}_T - \hat{\beta}_T x_{t-h}$$

If we assume that the $Z_t$'s are independent with expectation $E[Y_t \mid x_{t-h}] = \alpha + \beta x_{t-h}$ and variance $\mathrm{Var}[Y_t \mid x_{t-h}] = \sigma^2$, then (6) yields

$$(T - M + 1) E_{Gen}(F^{naive}) = \sigma^2 \sum_{t=M}^{T} \left[1 + \frac{1}{t-h}\right] + \beta^2 \sum_{t=M}^{T} E[(X_{t-h} - \bar{X}_{t-2h})^2]$$

and

$$(T - M + 1) E_{Gen}(F^{lin}) = \sigma^2 \sum_{t=M}^{T} \left[1 + \frac{1}{t-h}\right] + \sigma^2 \sum_{t=M}^{T} E\left[\frac{(X_{t-h} - \bar{X}_{t-2h})^2}{\sum_{s=1}^{t-h} (X_{s-h} - \bar{X}_{t-2h})^2}\right],$$

where $\bar{X}_{t-h} = \frac{1}{t} \sum_{s=1-h}^{t-h} X_s$ is the mean of the $X$'s up to $X_{t-h}$. We then see that $R_o^2$ is negative, null or positive according to whether $\beta^2$ is smaller than, equal to, or greater than

$$\sigma^2\, \frac{\sum_{t=M}^{T} E\left[\frac{(X_{t-h} - \bar{X}_{t-2h})^2}{\sum_{s=1}^{t-h} (X_{s-h} - \bar{X}_{t-2h})^2}\right]}{\sum_{t=M}^{T} E[(X_{t-h} - \bar{X}_{t-2h})^2]}. \tag{13}$$

This illustrates the comment made earlier regarding the fact that $R_o^2 < 0$ means that the signal-to-noise ratio ($\frac{\beta^2}{\sigma^2}$ here) is too small for $F^{lin}$ to outperform $F^{naive}$. This result shows, in this particular case, that even if the true generating model has $\beta \ne 0$, a model trained from the class of models with $\beta = 0$ (the naive model) should be chosen for its better forecast generalization, rather than a model from the class $\beta \ne 0$.

Let us now consider a more complex case where the distribution is closer to the kind of data studied in this paper. If we assume that $E[Y_t \mid x_{1-h}^{T-h}] = \alpha + \beta x_{t-h}$ and $\mathrm{Var}[Y_t \mid x_{1-h}^{T-h}] = \sigma^2$, with $\mathrm{Cov}[Y_t, Y_{t+k} \mid x_{1-h}^{T-h}] = 0$ whenever $|k| \ge h$, then (6) yields

$$(T - M + 1) E_{Gen}(F^{naive}) = \sum_{t=M}^{T} \big(\sigma^2 + E[\mathrm{Var}[\bar{Y}_{t-h} \mid X_{1-h}^{T-h}]]\big) + \beta^2 \sum_{t=M}^{T} E[(X_{t-h} - \bar{X}_{t-2h})^2]$$

and

$$(T - M + 1) E_{Gen}(F^{lin}) = \sum_{t=M}^{T} \big(\sigma^2 + E[\mathrm{Var}[\bar{Y}_{t-h} + \hat{\beta}_{t-h}(X_{t-h} - \bar{X}_{t-2h}) \mid X_{1-h}^{T-h}]]\big)$$

where $\bar{X}_{t-h} = \frac{1}{t} \sum_{s=1-h}^{t-h} X_s$ is the mean of the $X$'s up to $X_{t-h}$. We then see that $R_o^2$ is negative, null or positive according to whether $\beta^2$ is smaller than, equal to, or greater than

$$\sigma^2\, \frac{\sum_{t=M}^{T} E\big[\mathrm{Var}[\bar{Y}_{t-h} + \hat{\beta}_{t-h}(X_{t-h} - \bar{X}_{t-2h}) \mid X_{1-h}^{T-h}] - \mathrm{Var}[\bar{Y}_{t-h} \mid X_{1-h}^{T-h}]\big] / \sigma^2}{\sum_{t=M}^{T} E[(X_{t-h} - \bar{X}_{t-2h})^2]}. \tag{14}$$

Note that it can be shown that the above numerator is free of $\sigma$, as it involves only expectations of expressions in the $X_t$'s (like the denominator). This again illustrates the comment made earlier regarding the fact that $R_o^2 < 0$ means that the signal-to-noise ratio ($\frac{\beta^2}{\sigma^2}$ here) is too small for $F^{lin}$ to outperform $F^{naive}$. It also illustrates the point made in the introduction that when the amount of data is finite, choosing a model according to its expected generalization error may yield a different answer than choosing the model that is closest to the true generating model, and that for the purpose of making forecasts or taking decisions regarding yet unseen (i.e., out-of-sample) data, it reflects our true objective better. See also (Vapnik, 1982, section 8.6) for an example of the difference in out-of-sample generalization performance between the model obtained when looking for the true generating model versus choosing the model which has a better chance to generalize (in this case using bounds on generalization error, for polynomial regression).

4 The financial data and preliminary results

Experiments on the in-sample and out-of-sample statistics were performed on a financial time-series. The data is based on the daily total return, including capital gains as well as dividends, for the Toronto Stock Exchange TSE300 index, from January 1982 up to July 1998. The total return series $TR_t$, $t = 0, 1, \dots, 4178$, can be described as the result at time $t$ of an initial investment of 1 dollar and the reinvestment of all dividends received. We construct, for different values of $h$, the log-return series on a horizon $h$:

$$r_t(h) = \log\left(\frac{TR_t}{TR_{t-h}}\right) = \log(TR_t) - \log(TR_{t-h}) \tag{15}$$

where $TR$ means total return, and $t$ represents days. Thus $r_t(h)$ represents the logarithm of the total return at day $t$ over the past $h$ day(s). There are 4179 trading days in the sample. We consider that there are twenty-one trading days per month, or 252 trading days per year. The real number of trading days, when trading activities can occur, can vary slightly from month to month, depending on holidays or exceptional events, but 21 is a good approximation if we want to work with a fixed number of trading days per month. A horizon of $H = N$ months will mean $h = 21N$ days.

Using and predicting returns on a horizon greater than the sampling period creates an overlapping effect. Indeed, upon defining the daily log-returns $r_t = r_t(1)$, $t = 1, \dots, 4178$, we can write

$$r_t(h) = \log(TR_t) - \log(TR_{t-h}) = \sum_{s=t-h+1}^{t} (\log(TR_s) - \log(TR_{s-1})) = \sum_{s=t-h+1}^{t} r_s \tag{16}$$

as a moving sum of the $r_t$'s.
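As an illustration of (15) and (16), the horizon-$h$ log-return series can be built directly from the total-return series; a small sketch (variable names are ours):

```python
import numpy as np

def log_returns(TR, h):
    """r_t(h) = log(TR_t) - log(TR_{t-h}), equation (15), for t = h, ..., T."""
    logTR = np.log(np.asarray(TR, dtype=float))
    return logTR[h:] - logTR[:-h]

# Equivalently, by equation (16), r_t(h) is the window-h moving sum of the daily
# returns: np.convolve(log_returns(TR, 1), np.ones(h), mode="valid") gives the
# same series. A horizon of one month corresponds to h = 21.
```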

Figure 1: Left: daily logarithm of the TSE300 index from January 1982 to the end of July 1998. Right: daily log returns of the TSE300 for the same period.

Table 1: Sample skewness and sample kurtosis of the TSE300 return over horizons of 1 day, 1 month and 3 months. The statistics and their standard deviations (shown in parentheses) have been computed according to formulas described in Campbell et al. (1997).

Horizon | skewness | kurtosis
1 day | (0.04) | (0.08)
1 month | (0.17) | (0.35)
3 months | (0.30) | 3.93 (0.60)

We will work on monthly returns, as it has been suggested from empirical evidence (Campbell et al., 1997; Fama and French, 1988) that they can be useful for forecasting, whereas such results are not documented for daily returns, except for non-profitable trading effects. So our horizon will be a multiple of 21 days. Data are slightly better behaved when we take monthly returns instead of daily ones. For instance, the daily return series is far from being normally distributed. It is known that stock index return distributions have more mass in their tails than the normal distribution (Campbell et al., 1997). But returns over longer horizons get closer to normality, thanks to equation (16) and the central limit theorem. For example, table 1 shows the sample skewness and kurtosis for the daily, monthly and quarterly returns. We readily notice that these higher moments are more in line with those of the normal distribution (skewness = 0, kurtosis = 3) when we consider longer-term returns instead of daily returns.

Table 1 is the first illustration of the touchy problem of the overlapping effect. For instance, you will notice that the standard deviations are not the same for daily and monthly returns. This is because the daily return statistics are based on $r_1, \dots, r_{4178}$, whereas their monthly counterparts are based on $r_{21}(21), r_{42}(21), \dots$, that is, approximately 21 times fewer points than in the daily case. The reason for this is that we want independent monthly returns. If we assumed that the daily returns were independent, then monthly returns would have to be at least one month apart to also be independent. For instance, $r_{21}(21)$ and $r_{40}(21)$ would not be independent, as they share $r_{20}$ and $r_{21}$.
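To reproduce the kind of numbers in table 1 under the non-overlapping convention just described, one can subsample every 21st monthly return; a sketch (our helper names, reusing `log_returns` from above, and assuming a total-return array `TR` as in section 4):

```python
import numpy as np

def sample_skew_kurt(r):
    """Sample skewness and (non-excess) kurtosis; a normal sample gives about (0, 3)."""
    z = (np.asarray(r, dtype=float) - np.mean(r)) / np.std(r)
    return float(np.mean(z ** 3)), float(np.mean(z ** 4))

# Non-overlapping monthly returns: r_21(21), r_42(21), ..., one month apart,
# so that consecutive returns share no daily term.
monthly = log_returns(TR, 21)[::21]
skew, kurt = sample_skew_kurt(monthly)
```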

Figure 2: Evolution of the in-sample $R^2$ with the prediction horizon. The $R^2$ values seem to indicate that the strongest input/output relation is for a horizon of around a year.

Figures 2 and 3 depict the values of $\hat{R}^2$ and $\hat{R}_o^2$ obtained on the TSE data for $H = 1, 2, \dots, 24$. The first plot suggests that there appears to be little relationship between past and future returns except, perhaps, when we aggregate the returns over a period of about one year ($H = 12$). Figure 3 tells a similar story: at best, predictability of future returns seems possible only for yearly returns or so. But how can we decide (formally) whether there is a relationship between past and future returns, and whether such a relationship might be useful for forecasting? This is the goal of the next section.

5 Testing the hypothesis of no relation between Y and X

Consider testing the hypothesis that there is no relationship between successive returns of horizon $h$, i.e., $H_0: E[r_t(h) \mid r_{t-h}(h)] = \mu$. Note that $r_t(h)$ and $r_{t-h}(h)$ do not overlap but are contiguous $h$-day returns. To put it in section 4's notation, we have $y_t = x_t = r_{t+2h-1}(h)$, so that, for instance, $x_{1-h} = r_h(h)$ is the first observable $x$. We wish to test $E[Y_t \mid x_{t-h}] = \mu$. As mentioned in the introduction, this hypothesis is not what we are actually interested in, but what we do in this section proves useful in section 6, as it allows us to introduce the bootstrap, among other things.

To perform a test of hypothesis, one needs a statistic whose behavior depends on whether $H_0$ is true or false. We will mainly consider two statistics here. First, we have $\hat{R}_o^2$, which will take smaller values under $H_0$ than otherwise.

Figure 3: Evolution of the out-of-sample $\hat{R}_o^2$ with the prediction horizon.

The other approach to testing $H_0$ is to notice that if $E[r_t(h) \mid r_{t-h}(h)]$ does not depend on $r_{t-h}(h)$, then the correlation between $r_{t-h}(h)$ and $r_t(h)$ is null, $\rho(r_t(h), r_{t-h}(h)) = 0$. Thus we will use $\hat{\rho}(r_t(h), r_{t-h}(h))$, an estimator of $\rho(r_t(h), r_{t-h}(h))$, to test $H_0$, as it will tend to be closer to 0 under $H_0$ than otherwise.

The second thing needed in a test of hypothesis is the distribution of the chosen statistic under $H_0$. This may be obtained from theoretical results or approximated from a bootstrap as explained later. In the case of $\hat{\rho}(r_t(h), r_{t-h}(h))$, we do have such a theoretical result (Bartlett, 1946; Anderson, 1984; Box and Jenkins, 1970). First let us formally define

$$\hat{\rho}(r_t(h), r_{t-h}(h)) = \frac{\sum_{t=2h}^{T} (r_t(h) - \bar{r}(h))(r_{t-h}(h) - \bar{r}(h))}{\sum_{t=h}^{T} (r_t(h) - \bar{r}(h))^2}, \tag{17}$$

with $\bar{r}(h)$ being the sample mean of $r_h(h), \dots, r_T(h)$. Assuming that the $r_t$'s are independent and identically distributed with finite variance, then

$$\sqrt{T - h + 1}\,\big(\hat{\rho}(r_t(h), r_{t-h}(h)) - \rho(r_t(h), r_{t-h}(h))\big) \to N(0, W) \quad\text{with}\quad W = \sum_{v=1}^{\infty} (\rho_{v+h} + \rho_{v-h} - 2\rho_h \rho_v)^2, \tag{18}$$

where $\rho_k$ stands for $\rho(r_{t+k}(h), r_t(h))$. If the $r_t$'s are independent and the $r_t(h)$ are running sums of $r_t$'s as shown in equation (16), then

$$\rho_k = \frac{(h - |k|)^+}{h},$$

where $u^+ = \max(u, 0)$. Therefore we have

$$W = \sum_{v=1}^{2h-1} \rho_{v-h}^2 = \sum_{v=1-h}^{h-1} \rho_v^2 = 1 + \frac{2}{h^2} \sum_{v=1}^{h-1} (h - v)^2 = 1 + \frac{(h-1)(2h-1)}{3h} \tag{19}$$

where the identity $\sum_{v=1}^{N} v^2 = \frac{N(N+1)(2N+1)}{6}$ was used in the last equality. Large values of $\sqrt{\frac{T-h+1}{W}}\,\hat{\rho}(r_t(h), r_{t-h}(h))$ are unfavorable to $H_0$ and their significance is obtained from a $N(0,1)$ table.
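For the i.i.d.-daily-returns null, the variance (19) and the normalized statistic translate directly into code; a minimal sketch (our names), where `rh` is the series $r_h(h), \dots, r_T(h)$:

```python
import numpy as np

def bartlett_W(h):
    """W = 1 + (h-1)(2h-1)/(3h), equation (19)."""
    return 1.0 + (h - 1) * (2 * h - 1) / (3.0 * h)

def rho_hat(rh, h):
    """Lag-h sample autocorrelation of the aggregated returns, equation (17)."""
    d = np.asarray(rh, dtype=float) - np.mean(rh)
    return float(np.sum(d[h:] * d[:-h]) / np.sum(d ** 2))

def rho_test_z(rh, h):
    """sqrt(n / W) * rho_hat, to be compared to a N(0,1) table
    (n is the number of available r_t(h) values)."""
    return np.sqrt(len(rh) / bartlett_W(h)) * rho_hat(rh, h)
```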

In the case of the $\hat{R}_o^2$ statistic, its distribution is unknown. However, we may find an approximation of it by simulation (bootstrap). So we have to generate data satisfying the hypothesis $H_0: E[Y_t \mid x_{t-h}] = \mu$ (i.e., not depending on $x_{t-h}$). This can be done in at least four ways.

1. Generate a set of independent $r_t$'s and compute the $Y_t = r_t(h)$'s and the $x_{t-h} = r_{t-h}(h)$'s in the usual way.
2. Keep the $Y_t$ obtained from the actual data, but compute the $x_{t-h}$ as suggested in 1.
3. Keep the $x_{t-h}$ obtained from the actual data, but compute the $Y_t$ as suggested in 1.
4. Generate a set of independent $r_t$'s and compute the $Y_t = r_t(h)$'s. Then generate another set of $r_t$'s, independently of the first set, and compute the $x_{t-h} = r_{t-h}(h)$'s on those.

The generation of the $r_t$'s may come from the empirical distribution of the actual $r_t$'s (i.e., resampling with replacement) or another distribution deemed appropriate. We have considered both the empirical distribution and the $N(0,1)$ distribution⁶. We believe generation scheme 1 to be the most appropriate here, since it looks more like the way the original data was treated: $Y_t$ and $x_{t-h}$ obtained from a single set of $r_t$'s. Once we have chosen a simulation scheme, we may obtain as many samples as we want ($B$, say) and thus get $B$ independent realizations of the statistic $\hat{R}_o^2$. We then check whether the out-of-sample statistic takes values that are large even in this case, compared to the value observed on the original data series. Formally, we compute p-value $= \frac{A}{B}$, where $A$ is the number of simulated $\hat{R}_o^2$ greater than or equal to the $\hat{R}_o^2$ computed on the actual data. This measures the plausibility of $H_0$; small p-values indicate that $H_0$ is not plausible in the light of the actual data observed. Another way to use the bootstrap values of $\hat{R}_o^2$ is to assume that the distribution of $\hat{R}_o^2$ under $H_0$ is $N(\hat{E}[\hat{R}_o^2], \hat{V}[\hat{R}_o^2])$, where $\hat{E}[\hat{R}_o^2]$ and $\hat{V}[\hat{R}_o^2]$ are the sample mean and the sample variance of the $B$ bootstrap values of $\hat{R}_o^2$. Comparing the actual $\hat{R}_o^2$ to this distribution yields the normalized bootstrap p-value. For the type 1 method we simply compute the p-value of the observed $\hat{R}_o^2$ under the null hypothesis of no relationship between the inputs and the outputs, using the empirical histogram of this statistic over the bootstrap replications. When the p-value is small, a more meaningful quantity might be the mean and the standard deviation of the statistic over the bootstrap replications, to provide a z-statistic.

Of course, this bootstrap approach may be used even in the case where the (asymptotic) distribution of a statistic is known. Therefore, we will compute bootstrap p-values for the statistic $\hat{\rho}(r_t(h), r_{t-h}(h))$ as well as its theoretical p-value, for comparison purposes. Finally, one may wonder why Fisher's test statistic $F = (T - h - 2)\frac{R^2}{1 - R^2}$, where $R^2$ is the standard in-sample $R^2$, was not used to test $H_0$. That is because the famous result $F \sim F_{1, T-h-2}$ (under $H_0$) holds in a rather strict framework where, among other things, the $Y$'s have to be independent (which is not the case here). The usual theoretical p-value would be terribly misleading. The actual distribution of $F$ not being known, an interesting exercise is to compute the bootstrap p-values on $F$ to compare with the (wrong) theoretical p-values.

⁶ Since the out-of-sample $R_o^2$, just like the in-sample $R^2$, is location-scale invariant, we don't have to bother about matching the mean and variance of the actual series.
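A sketch of generation scheme 1 with a pure bootstrap p-value (our function names; `stat` is any scalar statistic, and the helpers of sections 2 and 3 can supply it):

```python
import numpy as np

def make_pairs(r, h):
    """From daily returns r, build aligned arrays x (= r_{t-h}(h)) and y (= r_t(h))
    for contiguous, non-overlapping h-day returns."""
    c = np.concatenate(([0.0], np.cumsum(np.asarray(r, dtype=float))))
    agg = c[h:] - c[:-h]          # r_t(h) for t = h, ..., T
    return agg[:-h], agg[h:]      # pairs (r_{t-h}(h), r_t(h))

def bootstrap_pvalue(r, h, M, stat, B=1000, seed=0):
    """Scheme 1: resample the daily returns with replacement (i.i.d. under H_0),
    rebuild y_t and x_{t-h} from the single resampled set, and return p-value = A/B."""
    rng = np.random.default_rng(seed)
    observed = stat(*make_pairs(r, h), h, M)
    null = np.array([stat(*make_pairs(rng.choice(r, size=len(r), replace=True), h), h, M)
                     for _ in range(B)])
    return float(np.mean(null >= observed))
```

With `stat = lambda x, y, h, M: out_of_sample_stats(x, y, h, M)[0]`, this yields the pure bootstrap p-value for $\hat{R}_o^2$; replacing the resampling by $N(0,1)$ draws gives the other generation distribution considered above.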

5.1 Results on artificial data

There are some tricky points concerning the financial data and the appropriateness of the asymptotic and autocorrelation-consistent standard error of $\hat{\rho}(r_t(h), r_{t-h}(h))$. The results of Bartlett (1946) hold for Gaussian and stationary data. A slight violation of these assumptions can complicate the comparison of the in-sample and out-of-sample statistics. Hence we generated artificial data and tested the null hypothesis of no relationship on them. We chose an autoregressive process of order 1, $y_{t+1} = \beta y_t + \epsilon_t$, for which we vary the autoregression coefficient $\beta$ over a range of values between zero and 1, and where $\epsilon_t$ is drawn from a normal distribution $N(0, \sigma_\epsilon)$, with here $\sigma_\epsilon = 0.2$. We conduct the tests of the null hypothesis on series of lengths in the set $N = \{200, 500, 1000, 2000, 4000, 8000, 16000\}$.

Table 2: Empirical critical points of four statistics estimated on series of different lengths. We can observe the presence of a negative skew in the empirical distribution of $\hat{\rho}$ for series of length 200 and 500. For the same series, the critical points of the out-of-sample statistics contain the value zero.

N | $\hat{\rho}$ | $\hat{R}^2$ | $\hat{D}_o$ | $\hat{R}_o^2$
200 | [-0.121, 0.099] | | [-0.026, 0.006] | [-0.16, 0.038]
500 | [-0.077, 0.066] | | [-0.012, 0.001] | [-0.21, 0.026]
1000 | [-0.050, 0.050] | | [-0.006, ] | [-0.25, 0.007]
2000 | [-0.037, 0.035] | | | [-0.27, 0.035]
4000 | [-0.026, 0.028] | | [-0.002, ] | [-0.30, 0.045]
8000 | [-0.020, 0.020] | | [-0.001, 0.002] | [-0.33, 0.077]
16000 | [-0.014, 0.014] | | | [-0.35, 0.099]

We first generated, for each value of $n$ in $N$, one thousand series for which $\beta = 0$. For each of these series we construct the empirical distribution of 4 statistics, namely the autocorrelation $\hat{\rho}$ (equation 17), the in-sample $R^2$, the out-of-sample $\hat{R}_o^2$, and the difference between the out-of-sample naive cost and the out-of-sample linear model's cost, named $D_o(F)$ and formally defined earlier (equation 11) as

$$D_o(F) = E_{Gen}(F^{naive}) - E_{Gen}(F) = E[C_T(F^{naive}, z_1^T)] - E[C_T(F, z_1^T)]$$

with the empirical estimate

$$\hat{D}_o(F) = C_T(F^{naive}, z_1^T) - C_T(F, z_1^T),$$

which does not suffer from the possible bias of $\hat{R}_o^2(F)$ discussed in section 3, because the expectation of a difference of two random variables is equal to the difference of their expectations:

$$E[C_T(F^{naive}, z_1^T) - C_T(F, z_1^T)] = E[C_T(F^{naive}, z_1^T)] - E[C_T(F, z_1^T)].$$

From these empirical distributions, we estimated the critical points at 10%, $[L_{5\%}, H_{5\%}]$, except for the in-sample $R^2$, for which we estimated the 10% critical point at the right of the empirical distribution. For the out-of-sample statistics, we chose $M = 50$ as the minimum number of training examples before computing generalization errors (see equation 4). The values of these critical points are presented in table 2.

After having established the critical points at 10%, we want to study the power of these tests, i.e., how useful each statistic is to reject the null hypothesis when the null hypothesis is false. To this end, we generated one thousand series for different values of $\beta$, some smaller than $\sigma_\epsilon$ and some larger. We estimated on these series the values of the four statistics considered in table 2, and computed, for the different values of $\beta/\sigma_\epsilon$, the number of times each of these statistics falls outside the interval delimited by the critical values, or is greater than the critical value in the case of $R^2$. The results are presented in tables 3 through 8. We can observe from these tables that the power of the test of $H_0$ (no relationship between the inputs and the outputs) based on the out-of-sample statistics $\hat{R}_o^2$ and $\hat{D}_o$ is less than the power of the test based on the in-sample statistic. This seems particularly true for values of $0.01 < \beta < 0.05$, corresponding to a signal-to-noise ratio in the range $0.05 < \beta/\sigma_\epsilon < 0.25$.

Table 3: Statistics on artificial data of length 200. Each entry is the percentage of the 1000 series for which the statistic falls outside its 10% critical range, for increasing values of $\beta/\sigma_\epsilon$.

$\hat{\rho}$ | 10.0% | 10.5% | 11.4% | 13.5% | 15.4% | 14.6% | 25.3% | 46.9%
$\hat{R}^2$ | | 10.5% | 10.9% | 11.0% | 13.4% | 12.7% | 21.1% | 40.7%
$\hat{R}_o^2$ | | 11.8% | 11.1% | 12.8% | 15.8% | 15.3% | 18.9% | 33.6%
$\hat{D}_o$ | 10.9% | 11.4% | 11.3% | 12.1% | 15.8% | 14.8% | 18.1% | 32.3%

Table 4: Statistics on artificial data of length 1000.

$\hat{\rho}$ | 9.6% | 9.5% | 11.8% | 13.2% | 15.3% | 16.4% | 47.9% | 93.7%
$\hat{R}^2$ | 9.3% | 9.6% | 11.5% | 11.7% | 12.9% | 13.2% | 46.7% | 93.3%
$\hat{R}_o^2$ | 9.7% | 8.9% | 12.0% | 12.2% | 11.1% | 12.5% | 38.3% | 86.1%
$\hat{D}_o$ | 9.6% | 8.7% | 11.7% | 12.0% | 13.3% | 12.2% | 37.7% | 85.8%

Table 5: Statistics on artificial data of length 2000.

$\hat{\rho}$ | 10.6% | 12.2% | 14.3% | 17.0% | 25.4% | 27.6% | 75.1% | 99.8%
$\hat{R}^2$ | | 12.6% | 14.0% | 16.9% | 24.8% | 26.9% | 74.0% | 99.8%
$\hat{R}_o^2$ | | 14.1% | 13.6% | 15.2% | 21.8% | 20.8% | 62.8% | 99.1%
$\hat{D}_o$ | 9.9% | 13.9% | 13.5% | 14.5% | 21.5% | 20.4% | 62.4% | 99.0%

Table 6: Statistics on artificial data of length 4000.

$\hat{\rho}$ | 10.5% | 10.7% | 13.1% | 21.1% | 27.2% | 35.1% | 91.2% | 100.0%
$\hat{R}^2$ | 9.4% | 9.9% | 14.5% | 23.1% | 29.7% | 37.5% | 93.3% | 100.0%
$\hat{R}_o^2$ | | 10.8% | 13.0% | 20.6% | 23.1% | 30.1% | 85.6% | 100.0%
$\hat{D}_o$ | 10.1% | 10.5% | 13.2% | 20.4% | 23.4% | 30.0% | 85.3% | 100.0%

Table 7: Statistics on artificial data of length 8000.

Table 8: Statistics on artificial data of length 16000.

Figure 4: The dots draw the empirical distribution of the $\hat{\rho}$ obtained with $\beta = 0$ in the autoregressive model used to generate the artificial data. The length of the series is 4000 and there are 1000 such series. The other points represent the empirical distribution of the $\hat{\rho}$ obtained with a positive $\beta/\sigma_\epsilon$; 25.5% of these values lie outside the critical range $[-0.026, 0.028]$ presented in table 2.

Figure 5: The dots draw the empirical distribution of the $\hat{R}_o^2$ obtained with $\beta = 0$ in the autoregressive model used to generate the artificial data. The length of the series is 4000 and there are 1000 such series. The other points represent the empirical distribution of the $\hat{R}_o^2$ obtained with a positive $\beta/\sigma_\epsilon$; 25.5% of these values lie outside the critical range $[-0.30, 0.045]$ presented in table 2.

For small values of $\beta/\sigma_\epsilon$, such as 0 and 0.005, there is not a clear difference between the power of the tests based on in-sample statistics and on out-of-sample statistics (although the out-of-sample statistics seem slightly better). Estimates of these probabilities of observing the statistics outside the critical range, computed on other samples of 1000 series, can lead to slightly different probabilities, indicating that we must use a bigger sample in order to be able to observe a significant discrepancy between the tests for small values of the signal-to-noise ratio. It would appear from these results that when we want to test against the null hypothesis of no dependency, the classical in-sample tests provide more power. However, there are two reasons why we may still be interested in looking at the out-of-sample statistics: first, we may care more about the out-of-sample performance (whether our model will generalize better than the naive model) than about the true value of $\beta$ (see the following section for a striking result concerning this point); second, the dependency may be non-linear or non-Gaussian.

5.2 Discussion of the results on financial data

In all cases $B = 1000$ bootstrap replications were generated and the out-of-sample statistic was computed on each of them with $M = 50$, yielding distributions of $\hat{R}_o^2$ for the null hypothesis, which is that the true $R_o^2$ is negative. For $\hat{\rho}(r_t(h), r_{t-h}(h))$, the theoretical p-values disagree with the two others (see table 9), indicating that the asymptotic normality of $\hat{\rho}(r_t(h), r_{t-h}(h))$ does not hold in our sample. For Fisher's $F$, we see that the theoretical p-values are awfully wrong. Even the two bootstrap p-values don't agree well, indicating that the null distribution of $F$ is not normal. We observed an asymmetry in the empirical distributions of Fisher's $F$: most of the values are near 0, with a decay in the frequency of $F$ for larger $F$. Typically, the skewness of the empirical distributions of $F$ is positive and around 2. So here, only the (pure) bootstrap p-value can be trusted, as it is valid generally. Regarding $\hat{R}_o^2$, we see (table 9) that a similar pattern is observed for the positive $\hat{R}_o^2$. The pure bootstrap p-values seem to indicate a possible dependence of the near-one-year return on the past year's return. Also, in this case, the empirical distributions of $\hat{R}_o^2$ are not normal: the observed skewness of these distributions is systematically negative, with values around -4. Theoretical p-values for this out-of-sample statistic are not known. Table 10 is presented to provide the correspondence between the $F$ statistic shown in table 9 and the value of the in-sample $\hat{R}^2$. Table 11 presents the results of the test of the null hypothesis of no relationship between inputs and outputs using the statistic $\hat{D}_o$. This test allows us to reject the null hypothesis of no linear dependency even more strongly than the test based on $\hat{R}_o^2$.

6 Test of $H_0: R_o^2 = 0$

Here we attack the problem we are actually interested in: assessing whether generalizations based on past returns are better than the naive generalizations. Here we consider linear forecasts, so that we want to know whether $F^{lin}$ generalizes better than $F^{naive}$. The statistic we will use to this end is $\hat{R}_o^2$, assuming that its bias, not occurring in $\hat{D}_o$, is not a major concern here. Its distribution not being known, we will have to turn to the bootstrap method and simulate values of $\hat{R}_o^2$ computed on samples generated under $H_0: R_o^2 = 0$. We assume that

$$E[r_t(h) \mid r_{t-h}(h)] = \alpha + \beta r_{t-h}(h). \tag{20}$$

Table 9: Test of the hypothesis of no relationship between inputs and outputs. Three statistics are used, and for each the theoretical (tpv), pure bootstrap (pbpv) and normalized (nbpv) p-values are computed; the tpv entries for $\hat{R}_o^2$ are NA since its theoretical distribution is unknown.

H | $\hat{R}_o^2$ | tpv | pbpv | nbpv | $\hat{\rho}$ | tpv | pbpv | nbpv | $\hat{F}$ | tpv | pbpv | nbpv

Table 10: The in-sample $\hat{R}^2$ presented in figure 2 and related to Fisher's $\hat{F}$ shown in table 9.

H | $\hat{R}^2$
7 | 1.0%
8 | 2.8%
9 | 4.5%

Table 11: The test based on the $\hat{D}_o$ statistic also gives strong evidence against $H_0$ (no relation between inputs and outputs). The empirical version used to estimate $D_o$ does not suffer from a bias like the empirical version of $R_o^2$.

H | $\hat{D}_o$ | p-value

This is the regression used by Fama and French (1988) to test the mean-reversion hypothesis of a supposed stationary component of stock prices. We saw earlier that this amounts to $\frac{\beta^2}{\sigma^2}$ being equal to the ratio shown in (14). If we let the $Y_t$'s (given $x_{1-h}^{T-h}$) have the correlation structure shown in section 3, we have

$$E[\mathrm{Var}[\bar{Y}_{t-h} \mid X_{1-h}^{T-h}]] = \frac{\sigma^2}{(t-h)^2 h} \sum_{s=1-h}^{h-1} (h - |s|)(t - h - |s|) = \frac{\sigma^2}{(t-h)^2 h} \left[ h(t-h) + 2\sum_{s=1}^{h-1} (h-s)(t-h-s) \right] = \frac{\sigma^2}{(t-h)^2 h} \left[ h(t-h) + 2\sum_{s=1}^{h-1} s(t-2h+s) \right] = \frac{\sigma^2}{(t-h)^2} \left[ (t-h) + (h-1)(t-2h) + \frac{(h-1)(2h-1)}{3} \right] \tag{21}$$

and

$$E[\mathrm{Var}[\bar{Y}_{t-h} + \hat{\beta}_{t-h}(X_{t-h} - \bar{X}_{t-2h}) \mid X_{1-h}^{T-h}]] = \sigma^2 E[c'Vc],$$

where $V$ is a $(t-h) \times (t-h)$ matrix with $V_{ij} = \frac{(h - |i-j|)^+}{h}$, and $c$ is a $(t-h) \times 1$ vector with

$$c_i = \frac{1}{t-h} + \frac{(X_{t-h} - \bar{X}_{t-2h})(X_{i-h} - \bar{X}_{t-2h})}{\sum_{j=1}^{t-h} (X_{j-h} - \bar{X}_{t-2h})^2}, \quad i = 1, \dots, t-h.$$
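Formula (21) is easy to check by simulation: generate the noise as window-$h$ moving sums of i.i.d. variates with variance $\sigma^2/h$ (as in the generation scheme described below) and compare the Monte Carlo variance of $\bar{Y}_{t-h}$ with the closed form. A sketch (our names, with $\sigma^2 = 1$):

```python
import numpy as np

def var_ybar(t, h, sigma2=1.0):
    """Closed form (21) for Var[ybar_{t-h}] under the h-dependent noise structure."""
    n = t - h
    return sigma2 * (n + (h - 1) * (t - 2 * h) + (h - 1) * (2 * h - 1) / 3.0) / n ** 2

rng = np.random.default_rng(0)
t, h, n_sim = 200, 21, 100000
n = t - h
u = rng.normal(scale=np.sqrt(1.0 / h), size=(n_sim, n + h - 1))
c = np.cumsum(u, axis=1)
eps = np.hstack([c[:, h - 1:h], c[:, h:] - c[:, :-h]])  # window-h moving sums, n columns
print(eps.mean(axis=1).var(), var_ybar(t, h))           # the two numbers nearly agree
```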

If we let $L$ be a $(t-1) \times (t-h)$ matrix with $L_{ij} = I[0 \le i - j < h]/\sqrt{h}$, then we may write $c'Vc$ as $W'W$ where $W = Lc$. This representation is useful if we need to compute $\mathrm{Var}[F^{lin}(Z_1^{t-h})(X_{t-h}) \mid X_{1-h}^{T-h}] = \sigma^2 c'Vc$ for various values of $t$, as recursive relations may be worked out in $W$.

Due to the location-scale invariance of $\hat{R}_o^2$ mentioned earlier, $\sigma^2$ and $\alpha$ may be chosen as one pleases (1 and 0, say). The expectations then depend, obviously, on the process generating the $X_t$'s. The simplest thing to do is to assume that $X_{1-h}^{T-h} \sim \delta_{x_{1-h}^{T-h}}$, that is, $X_{1-h}^{T-h}$ can only take the value $x_{1-h}^{T-h}$ observed. This makes the expectations easy to work out. Otherwise, these expectations can be worked out via simulations. Once the process of $X_{1-h}^{T-h}$ and $\alpha$, $\beta$, $\sigma^2$ have been chosen, we generate $Z_1^T = (X_{1-h}^{T-h}, Y_1^T)$ as follows.

1. Generate $X_{1-h}^{T-h}$.
2. Generate $\epsilon_1, \dots, \epsilon_T$ so that the $\epsilon_t$'s are independent of $X_{1-h}^{T-h}$, with $\mathrm{Var}[\epsilon_t] = \sigma^2$ and the covariance structure shown in section 3. This may be done by generating independent variates with variance equal to $\sigma^2/h$ and taking their moving sums (with window size $h$).
3. Put $Y_t = \alpha + \beta X_{t-h} + \epsilon_t$.

The bootstrap test of $H_0: R_o^2 = 0$ could be performed by generating $B$ samples in the way explained above, yielding $B$ bootstrap values of $\hat{R}_o^2$. These would be used to compute either a pure bootstrap p-value or a normalized bootstrap p-value. Needless to say, generating data under $H_{01}: R_o^2 = 0$ is more tedious than generating data under $H_{02}$: no relationship between inputs and outputs. Furthermore, the above approach relies heavily on the distributional assumptions of linearity and the given form of covariance, and we would like to devise a procedure that can be extended to non-linear relationships, for example. To get the distribution of $\hat{R}_o^2$ under $H_{01}$, we propose to consider an approximation saying that the distribution of $\hat{R}_o^2 - R_o^2$ is the same under $H_{01}$ and $H_{02}$. We will call this hypothesis the shifted distribution hypothesis (note that this hypothesis can only be approximately true, because for extreme values of $\hat{R}_o^2$ near 1 it cannot be true). This means that we are assuming that the distribution of $\hat{R}_o^2$ under $R_o^2 = 0$ has the same shape as its distribution under $\beta = 0$, but is shifted to the right (since it corresponds to a positive value of $\beta$). If that was the case, generating $\hat{R}_o^2$ under $H_{01}$ would be the same as simulating $\hat{R}_o^2 - R_o^2$ under $H_{02}$, which we have done previously, without subtracting off $R_o^2$. This $R_o^2$ can be obtained either analytically or estimated from the bootstrap as

$$1 - \frac{\sum_{b=1}^{B} C_T(F^{lin}, Z_1^T(b))}{\sum_{b=1}^{B} C_T(F^{naive}, Z_1^T(b))}.$$

Note, to make the notation clear, that the bootstrap $\hat{R}_o^2$'s are simply $1 - \frac{C_T(F^{lin}, Z_1^T(b))}{C_T(F^{naive}, Z_1^T(b))}$, $b = 1, \dots, B$. From these $\hat{R}_o^2 - R_o^2$'s, we obtain the bootstrap p-values and the normalized bootstrap p-values as usual. Note that the bootstrap p-values for $H_{01}$ and $H_{02}$ are the proportions of the $\hat{R}_o^2$'s (generated under $H_{02}$) that are greater than $\hat{R}_o^2(\text{observed}) + R_o^2$ and $\hat{R}_o^2(\text{observed})$ respectively. Since $R_o^2 < 0$ under $H_{02}$, we see that p-value($H_{02}$) $\le$ p-value($H_{01}$).
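Steps 1 to 3 translate directly into code; a sketch under the $\delta_x$ choice (inputs held at their observed values; names are ours):

```python
import numpy as np

def generate_under_H01(x, h, alpha, beta, sigma2, rng):
    """One sample Y_1, ..., Y_T following steps 1-3: epsilon is a window-h moving
    sum of i.i.d. N(0, sigma2/h) variates, so Var[eps_t] = sigma2 and
    Cov[eps_t, eps_{t+k}] = 0 for |k| >= h; x holds the observed inputs x_{t-h}."""
    T = len(x)
    u = rng.normal(scale=np.sqrt(sigma2 / h), size=T + h - 1)
    eps = np.convolve(u, np.ones(h), mode="valid")  # length T
    return alpha + beta * np.asarray(x, dtype=float) + eps
```

Computing $\hat{R}_o^2$ on $B$ such samples, with $\beta$ chosen so that $R_o^2 = 0$ in (14), gives the null distribution needed for the test; under the shifted distribution hypothesis one can instead reuse the $H_{02}$ bootstrap values and subtract the bootstrap estimate of $R_o^2$.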


CAUSAL INFERENCE. Technical Track Session I. Phillippe Leite. The World Bank CAUSAL INFERENCE Technical Track Sessin I Phillippe Leite The Wrld Bank These slides were develped by Christel Vermeersch and mdified by Phillippe Leite fr the purpse f this wrkshp Plicy questins are causal

More information

Lead/Lag Compensator Frequency Domain Properties and Design Methods

Lead/Lag Compensator Frequency Domain Properties and Design Methods Lectures 6 and 7 Lead/Lag Cmpensatr Frequency Dmain Prperties and Design Methds Definitin Cnsider the cmpensatr (ie cntrller Fr, it is called a lag cmpensatr s K Fr s, it is called a lead cmpensatr Ntatin

More information

COMP 551 Applied Machine Learning Lecture 9: Support Vector Machines (cont d)

COMP 551 Applied Machine Learning Lecture 9: Support Vector Machines (cont d) COMP 551 Applied Machine Learning Lecture 9: Supprt Vectr Machines (cnt d) Instructr: Herke van Hf (herke.vanhf@mail.mcgill.ca) Slides mstly by: Class web page: www.cs.mcgill.ca/~hvanh2/cmp551 Unless therwise

More information

3.4 Shrinkage Methods Prostate Cancer Data Example (Continued) Ridge Regression

3.4 Shrinkage Methods Prostate Cancer Data Example (Continued) Ridge Regression 3.3.4 Prstate Cancer Data Example (Cntinued) 3.4 Shrinkage Methds 61 Table 3.3 shws the cefficients frm a number f different selectin and shrinkage methds. They are best-subset selectin using an all-subsets

More information

Admissibility Conditions and Asymptotic Behavior of Strongly Regular Graphs

Admissibility Conditions and Asymptotic Behavior of Strongly Regular Graphs Admissibility Cnditins and Asympttic Behavir f Strngly Regular Graphs VASCO MOÇO MANO Department f Mathematics University f Prt Oprt PORTUGAL vascmcman@gmailcm LUÍS ANTÓNIO DE ALMEIDA VIEIRA Department

More information

On Huntsberger Type Shrinkage Estimator for the Mean of Normal Distribution ABSTRACT INTRODUCTION

On Huntsberger Type Shrinkage Estimator for the Mean of Normal Distribution ABSTRACT INTRODUCTION Malaysian Jurnal f Mathematical Sciences 4(): 7-4 () On Huntsberger Type Shrinkage Estimatr fr the Mean f Nrmal Distributin Department f Mathematical and Physical Sciences, University f Nizwa, Sultanate

More information

Inference in the Multiple-Regression

Inference in the Multiple-Regression Sectin 5 Mdel Inference in the Multiple-Regressin Kinds f hypthesis tests in a multiple regressin There are several distinct kinds f hypthesis tests we can run in a multiple regressin. Suppse that amng

More information

Kinetic Model Completeness

Kinetic Model Completeness 5.68J/10.652J Spring 2003 Lecture Ntes Tuesday April 15, 2003 Kinetic Mdel Cmpleteness We say a chemical kinetic mdel is cmplete fr a particular reactin cnditin when it cntains all the species and reactins

More information

Part 3 Introduction to statistical classification techniques

Part 3 Introduction to statistical classification techniques Part 3 Intrductin t statistical classificatin techniques Machine Learning, Part 3, March 07 Fabi Rli Preamble ØIn Part we have seen that if we knw: Psterir prbabilities P(ω i / ) Or the equivalent terms

More information

Eric Klein and Ning Sa

Eric Klein and Ning Sa Week 12. Statistical Appraches t Netwrks: p1 and p* Wasserman and Faust Chapter 15: Statistical Analysis f Single Relatinal Netwrks There are fur tasks in psitinal analysis: 1) Define Equivalence 2) Measure

More information

AP Statistics Practice Test Unit Three Exploring Relationships Between Variables. Name Period Date

AP Statistics Practice Test Unit Three Exploring Relationships Between Variables. Name Period Date AP Statistics Practice Test Unit Three Explring Relatinships Between Variables Name Perid Date True r False: 1. Crrelatin and regressin require explanatry and respnse variables. 1. 2. Every least squares

More information

COMP 551 Applied Machine Learning Lecture 4: Linear classification

COMP 551 Applied Machine Learning Lecture 4: Linear classification COMP 551 Applied Machine Learning Lecture 4: Linear classificatin Instructr: Jelle Pineau (jpineau@cs.mcgill.ca) Class web page: www.cs.mcgill.ca/~jpineau/cmp551 Unless therwise nted, all material psted

More information

We can see from the graph above that the intersection is, i.e., [ ).

We can see from the graph above that the intersection is, i.e., [ ). MTH 111 Cllege Algebra Lecture Ntes July 2, 2014 Functin Arithmetic: With nt t much difficulty, we ntice that inputs f functins are numbers, and utputs f functins are numbers. S whatever we can d with

More information

Tree Structured Classifier

Tree Structured Classifier Tree Structured Classifier Reference: Classificatin and Regressin Trees by L. Breiman, J. H. Friedman, R. A. Olshen, and C. J. Stne, Chapman & Hall, 98. A Medical Eample (CART): Predict high risk patients

More information

Lab 1 The Scientific Method

Lab 1 The Scientific Method INTRODUCTION The fllwing labratry exercise is designed t give yu, the student, an pprtunity t explre unknwn systems, r universes, and hypthesize pssible rules which may gvern the behavir within them. Scientific

More information

SIZE BIAS IN LINE TRANSECT SAMPLING: A FIELD TEST. Mark C. Otto Statistics Research Division, Bureau of the Census Washington, D.C , U.S.A.

SIZE BIAS IN LINE TRANSECT SAMPLING: A FIELD TEST. Mark C. Otto Statistics Research Division, Bureau of the Census Washington, D.C , U.S.A. SIZE BIAS IN LINE TRANSECT SAMPLING: A FIELD TEST Mark C. Ott Statistics Research Divisin, Bureau f the Census Washingtn, D.C. 20233, U.S.A. and Kenneth H. Pllck Department f Statistics, Nrth Carlina State

More information

Biplots in Practice MICHAEL GREENACRE. Professor of Statistics at the Pompeu Fabra University. Chapter 13 Offprint

Biplots in Practice MICHAEL GREENACRE. Professor of Statistics at the Pompeu Fabra University. Chapter 13 Offprint Biplts in Practice MICHAEL GREENACRE Prfessr f Statistics at the Pmpeu Fabra University Chapter 13 Offprint CASE STUDY BIOMEDICINE Cmparing Cancer Types Accrding t Gene Epressin Arrays First published:

More information

Determining the Accuracy of Modal Parameter Estimation Methods

Determining the Accuracy of Modal Parameter Estimation Methods Determining the Accuracy f Mdal Parameter Estimatin Methds by Michael Lee Ph.D., P.E. & Mar Richardsn Ph.D. Structural Measurement Systems Milpitas, CA Abstract The mst cmmn type f mdal testing system

More information

Least Squares Optimal Filtering with Multirate Observations

Least Squares Optimal Filtering with Multirate Observations Prc. 36th Asilmar Cnf. n Signals, Systems, and Cmputers, Pacific Grve, CA, Nvember 2002 Least Squares Optimal Filtering with Multirate Observatins Charles W. herrien and Anthny H. Hawes Department f Electrical

More information

CHAPTER 4 DIAGNOSTICS FOR INFLUENTIAL OBSERVATIONS

CHAPTER 4 DIAGNOSTICS FOR INFLUENTIAL OBSERVATIONS CHAPTER 4 DIAGNOSTICS FOR INFLUENTIAL OBSERVATIONS 1 Influential bservatins are bservatins whse presence in the data can have a distrting effect n the parameter estimates and pssibly the entire analysis,

More information

Modelling of Clock Behaviour. Don Percival. Applied Physics Laboratory University of Washington Seattle, Washington, USA

Modelling of Clock Behaviour. Don Percival. Applied Physics Laboratory University of Washington Seattle, Washington, USA Mdelling f Clck Behaviur Dn Percival Applied Physics Labratry University f Washingtn Seattle, Washingtn, USA verheads and paper fr talk available at http://faculty.washingtn.edu/dbp/talks.html 1 Overview

More information

Sections 15.1 to 15.12, 16.1 and 16.2 of the textbook (Robbins-Miller) cover the materials required for this topic.

Sections 15.1 to 15.12, 16.1 and 16.2 of the textbook (Robbins-Miller) cover the materials required for this topic. Tpic : AC Fundamentals, Sinusidal Wavefrm, and Phasrs Sectins 5. t 5., 6. and 6. f the textbk (Rbbins-Miller) cver the materials required fr this tpic.. Wavefrms in electrical systems are current r vltage

More information

CHAPTER 3 INEQUALITIES. Copyright -The Institute of Chartered Accountants of India

CHAPTER 3 INEQUALITIES. Copyright -The Institute of Chartered Accountants of India CHAPTER 3 INEQUALITIES Cpyright -The Institute f Chartered Accuntants f India INEQUALITIES LEARNING OBJECTIVES One f the widely used decisin making prblems, nwadays, is t decide n the ptimal mix f scarce

More information

MATHEMATICS SYLLABUS SECONDARY 5th YEAR

MATHEMATICS SYLLABUS SECONDARY 5th YEAR Eurpean Schls Office f the Secretary-General Pedaggical Develpment Unit Ref. : 011-01-D-8-en- Orig. : EN MATHEMATICS SYLLABUS SECONDARY 5th YEAR 6 perid/week curse APPROVED BY THE JOINT TEACHING COMMITTEE

More information

k-nearest Neighbor How to choose k Average of k points more reliable when: Large k: noise in attributes +o o noise in class labels

k-nearest Neighbor How to choose k Average of k points more reliable when: Large k: noise in attributes +o o noise in class labels Mtivating Example Memry-Based Learning Instance-Based Learning K-earest eighbr Inductive Assumptin Similar inputs map t similar utputs If nt true => learning is impssible If true => learning reduces t

More information

Chapter Summary. Mathematical Induction Strong Induction Recursive Definitions Structural Induction Recursive Algorithms

Chapter Summary. Mathematical Induction Strong Induction Recursive Definitions Structural Induction Recursive Algorithms Chapter 5 1 Chapter Summary Mathematical Inductin Strng Inductin Recursive Definitins Structural Inductin Recursive Algrithms Sectin 5.1 3 Sectin Summary Mathematical Inductin Examples f Prf by Mathematical

More information

5 th grade Common Core Standards

5 th grade Common Core Standards 5 th grade Cmmn Cre Standards In Grade 5, instructinal time shuld fcus n three critical areas: (1) develping fluency with additin and subtractin f fractins, and develping understanding f the multiplicatin

More information

Checking the resolved resonance region in EXFOR database

Checking the resolved resonance region in EXFOR database Checking the reslved resnance regin in EXFOR database Gttfried Bertn Sciété de Calcul Mathématique (SCM) Oscar Cabells OECD/NEA Data Bank JEFF Meetings - Sessin JEFF Experiments Nvember 0-4, 017 Bulgne-Billancurt,

More information

Homology groups of disks with holes

Homology groups of disks with holes Hmlgy grups f disks with hles THEOREM. Let p 1,, p k } be a sequence f distinct pints in the interir unit disk D n where n 2, and suppse that fr all j the sets E j Int D n are clsed, pairwise disjint subdisks.

More information

Math Foundations 20 Work Plan

Math Foundations 20 Work Plan Math Fundatins 20 Wrk Plan Units / Tpics 20.8 Demnstrate understanding f systems f linear inequalities in tw variables. Time Frame December 1-3 weeks 6-10 Majr Learning Indicatrs Identify situatins relevant

More information

x 1 Outline IAML: Logistic Regression Decision Boundaries Example Data

x 1 Outline IAML: Logistic Regression Decision Boundaries Example Data Outline IAML: Lgistic Regressin Charles Suttn and Victr Lavrenk Schl f Infrmatics Semester Lgistic functin Lgistic regressin Learning lgistic regressin Optimizatin The pwer f nn-linear basis functins Least-squares

More information

Revision: August 19, E Main Suite D Pullman, WA (509) Voice and Fax

Revision: August 19, E Main Suite D Pullman, WA (509) Voice and Fax .7.4: Direct frequency dmain circuit analysis Revisin: August 9, 00 5 E Main Suite D Pullman, WA 9963 (509) 334 6306 ice and Fax Overview n chapter.7., we determined the steadystate respnse f electrical

More information

Preparation work for A2 Mathematics [2017]

Preparation work for A2 Mathematics [2017] Preparatin wrk fr A2 Mathematics [2017] The wrk studied in Y12 after the return frm study leave is frm the Cre 3 mdule f the A2 Mathematics curse. This wrk will nly be reviewed during Year 13, it will

More information

B. Definition of an exponential

B. Definition of an exponential Expnents and Lgarithms Chapter IV - Expnents and Lgarithms A. Intrductin Starting with additin and defining the ntatins fr subtractin, multiplicatin and divisin, we discvered negative numbers and fractins.

More information

Five Whys How To Do It Better

Five Whys How To Do It Better Five Whys Definitin. As explained in the previus article, we define rt cause as simply the uncvering f hw the current prblem came int being. Fr a simple causal chain, it is the entire chain. Fr a cmplex

More information

Thermodynamics Partial Outline of Topics

Thermodynamics Partial Outline of Topics Thermdynamics Partial Outline f Tpics I. The secnd law f thermdynamics addresses the issue f spntaneity and invlves a functin called entrpy (S): If a prcess is spntaneus, then Suniverse > 0 (2 nd Law!)

More information

7 TH GRADE MATH STANDARDS

7 TH GRADE MATH STANDARDS ALGEBRA STANDARDS Gal 1: Students will use the language f algebra t explre, describe, represent, and analyze number expressins and relatins 7 TH GRADE MATH STANDARDS 7.M.1.1: (Cmprehensin) Select, use,

More information

MODULE FOUR. This module addresses functions. SC Academic Elementary Algebra Standards:

MODULE FOUR. This module addresses functions. SC Academic Elementary Algebra Standards: MODULE FOUR This mdule addresses functins SC Academic Standards: EA-3.1 Classify a relatinship as being either a functin r nt a functin when given data as a table, set f rdered pairs, r graph. EA-3.2 Use

More information

Dead-beat controller design

Dead-beat controller design J. Hetthéssy, A. Barta, R. Bars: Dead beat cntrller design Nvember, 4 Dead-beat cntrller design In sampled data cntrl systems the cntrller is realised by an intelligent device, typically by a PLC (Prgrammable

More information

Weathering. Title: Chemical and Mechanical Weathering. Grade Level: Subject/Content: Earth and Space Science

Weathering. Title: Chemical and Mechanical Weathering. Grade Level: Subject/Content: Earth and Space Science Weathering Title: Chemical and Mechanical Weathering Grade Level: 9-12 Subject/Cntent: Earth and Space Science Summary f Lessn: Students will test hw chemical and mechanical weathering can affect a rck

More information

Module 4: General Formulation of Electric Circuit Theory

Module 4: General Formulation of Electric Circuit Theory Mdule 4: General Frmulatin f Electric Circuit Thery 4. General Frmulatin f Electric Circuit Thery All electrmagnetic phenmena are described at a fundamental level by Maxwell's equatins and the assciated

More information

IAML: Support Vector Machines

IAML: Support Vector Machines 1 / 22 IAML: Supprt Vectr Machines Charles Suttn and Victr Lavrenk Schl f Infrmatics Semester 1 2 / 22 Outline Separating hyperplane with maimum margin Nn-separable training data Epanding the input int

More information

Module 3: Gaussian Process Parameter Estimation, Prediction Uncertainty, and Diagnostics

Module 3: Gaussian Process Parameter Estimation, Prediction Uncertainty, and Diagnostics Mdule 3: Gaussian Prcess Parameter Estimatin, Predictin Uncertainty, and Diagnstics Jerme Sacks and William J Welch Natinal Institute f Statistical Sciences and University f British Clumbia Adapted frm

More information

BASD HIGH SCHOOL FORMAL LAB REPORT

BASD HIGH SCHOOL FORMAL LAB REPORT BASD HIGH SCHOOL FORMAL LAB REPORT *WARNING: After an explanatin f what t include in each sectin, there is an example f hw the sectin might lk using a sample experiment Keep in mind, the sample lab used

More information

NUMBERS, MATHEMATICS AND EQUATIONS

NUMBERS, MATHEMATICS AND EQUATIONS AUSTRALIAN CURRICULUM PHYSICS GETTING STARTED WITH PHYSICS NUMBERS, MATHEMATICS AND EQUATIONS An integral part t the understanding f ur physical wrld is the use f mathematical mdels which can be used t

More information

Statistical Learning. 2.1 What Is Statistical Learning?

Statistical Learning. 2.1 What Is Statistical Learning? 2 Statistical Learning 2.1 What Is Statistical Learning? In rder t mtivate ur study f statistical learning, we begin with a simple example. Suppse that we are statistical cnsultants hired by a client t

More information

Performance Bounds for Detect and Avoid Signal Sensing

Performance Bounds for Detect and Avoid Signal Sensing Perfrmance unds fr Detect and Avid Signal Sensing Sam Reisenfeld Real-ime Infrmatin etwrks, University f echnlgy, Sydney, radway, SW 007, Australia samr@uts.edu.au Abstract Detect and Avid (DAA) is a Cgnitive

More information

A Few Basic Facts About Isothermal Mass Transfer in a Binary Mixture

A Few Basic Facts About Isothermal Mass Transfer in a Binary Mixture Few asic Facts but Isthermal Mass Transfer in a inary Miture David Keffer Department f Chemical Engineering University f Tennessee first begun: pril 22, 2004 last updated: January 13, 2006 dkeffer@utk.edu

More information

MATCHING TECHNIQUES. Technical Track Session VI. Emanuela Galasso. The World Bank

MATCHING TECHNIQUES. Technical Track Session VI. Emanuela Galasso. The World Bank MATCHING TECHNIQUES Technical Track Sessin VI Emanuela Galass The Wrld Bank These slides were develped by Christel Vermeersch and mdified by Emanuela Galass fr the purpse f this wrkshp When can we use

More information

CS 109 Lecture 23 May 18th, 2016

CS 109 Lecture 23 May 18th, 2016 CS 109 Lecture 23 May 18th, 2016 New Datasets Heart Ancestry Netflix Our Path Parameter Estimatin Machine Learning: Frmally Many different frms f Machine Learning We fcus n the prblem f predictin Want

More information

LHS Mathematics Department Honors Pre-Calculus Final Exam 2002 Answers

LHS Mathematics Department Honors Pre-Calculus Final Exam 2002 Answers LHS Mathematics Department Hnrs Pre-alculus Final Eam nswers Part Shrt Prblems The table at the right gives the ppulatin f Massachusetts ver the past several decades Using an epnential mdel, predict the

More information

Phys. 344 Ch 7 Lecture 8 Fri., April. 10 th,

Phys. 344 Ch 7 Lecture 8 Fri., April. 10 th, Phys. 344 Ch 7 Lecture 8 Fri., April. 0 th, 009 Fri. 4/0 8. Ising Mdel f Ferrmagnets HW30 66, 74 Mn. 4/3 Review Sat. 4/8 3pm Exam 3 HW Mnday: Review fr est 3. See n-line practice test lecture-prep is t

More information

Medium Scale Integrated (MSI) devices [Sections 2.9 and 2.10]

Medium Scale Integrated (MSI) devices [Sections 2.9 and 2.10] EECS 270, Winter 2017, Lecture 3 Page 1 f 6 Medium Scale Integrated (MSI) devices [Sectins 2.9 and 2.10] As we ve seen, it s smetimes nt reasnable t d all the design wrk at the gate-level smetimes we just

More information

NOTE ON A CASE-STUDY IN BOX-JENKINS SEASONAL FORECASTING OF TIME SERIES BY STEFFEN L. LAURITZEN TECHNICAL REPORT NO. 16 APRIL 1974

NOTE ON A CASE-STUDY IN BOX-JENKINS SEASONAL FORECASTING OF TIME SERIES BY STEFFEN L. LAURITZEN TECHNICAL REPORT NO. 16 APRIL 1974 NTE N A CASE-STUDY IN B-JENKINS SEASNAL FRECASTING F TIME SERIES BY STEFFEN L. LAURITZEN TECHNICAL REPRT N. 16 APRIL 1974 PREPARED UNDER CNTRACT N00014-67-A-0112-0030 (NR-042-034) FR THE FFICE F NAVAL

More information

UNIV1"'RSITY OF NORTH CAROLINA Department of Statistics Chapel Hill, N. C. CUMULATIVE SUM CONTROL CHARTS FOR THE FOLDED NORMAL DISTRIBUTION

UNIV1'RSITY OF NORTH CAROLINA Department of Statistics Chapel Hill, N. C. CUMULATIVE SUM CONTROL CHARTS FOR THE FOLDED NORMAL DISTRIBUTION UNIV1"'RSITY OF NORTH CAROLINA Department f Statistics Chapel Hill, N. C. CUMULATIVE SUM CONTROL CHARTS FOR THE FOLDED NORMAL DISTRIBUTION by N. L. Jlmsn December 1962 Grant N. AFOSR -62..148 Methds f

More information

Support-Vector Machines

Support-Vector Machines Supprt-Vectr Machines Intrductin Supprt vectr machine is a linear machine with sme very nice prperties. Haykin chapter 6. See Alpaydin chapter 13 fr similar cntent. Nte: Part f this lecture drew material

More information

Sequential Allocation with Minimal Switching

Sequential Allocation with Minimal Switching In Cmputing Science and Statistics 28 (1996), pp. 567 572 Sequential Allcatin with Minimal Switching Quentin F. Stut 1 Janis Hardwick 1 EECS Dept., University f Michigan Statistics Dept., Purdue University

More information

COMP 551 Applied Machine Learning Lecture 11: Support Vector Machines

COMP 551 Applied Machine Learning Lecture 11: Support Vector Machines COMP 551 Applied Machine Learning Lecture 11: Supprt Vectr Machines Instructr: (jpineau@cs.mcgill.ca) Class web page: www.cs.mcgill.ca/~jpineau/cmp551 Unless therwise nted, all material psted fr this curse

More information

Technical Bulletin. Generation Interconnection Procedures. Revisions to Cluster 4, Phase 1 Study Methodology

Technical Bulletin. Generation Interconnection Procedures. Revisions to Cluster 4, Phase 1 Study Methodology Technical Bulletin Generatin Intercnnectin Prcedures Revisins t Cluster 4, Phase 1 Study Methdlgy Release Date: Octber 20, 2011 (Finalizatin f the Draft Technical Bulletin released n September 19, 2011)

More information

WRITING THE REPORT. Organizing the report. Title Page. Table of Contents

WRITING THE REPORT. Organizing the report. Title Page. Table of Contents WRITING THE REPORT Organizing the reprt Mst reprts shuld be rganized in the fllwing manner. Smetime there is a valid reasn t include extra chapters in within the bdy f the reprt. 1. Title page 2. Executive

More information

ECE 5318/6352 Antenna Engineering. Spring 2006 Dr. Stuart Long. Chapter 6. Part 7 Schelkunoff s Polynomial

ECE 5318/6352 Antenna Engineering. Spring 2006 Dr. Stuart Long. Chapter 6. Part 7 Schelkunoff s Polynomial ECE 538/635 Antenna Engineering Spring 006 Dr. Stuart Lng Chapter 6 Part 7 Schelkunff s Plynmial 7 Schelkunff s Plynmial Representatin (fr discrete arrays) AF( ψ ) N n 0 A n e jnψ N number f elements in

More information

Interference is when two (or more) sets of waves meet and combine to produce a new pattern.

Interference is when two (or more) sets of waves meet and combine to produce a new pattern. Interference Interference is when tw (r mre) sets f waves meet and cmbine t prduce a new pattern. This pattern can vary depending n the riginal wave directin, wavelength, amplitude, etc. The tw mst extreme

More information

READING STATECHART DIAGRAMS

READING STATECHART DIAGRAMS READING STATECHART DIAGRAMS Figure 4.48 A Statechart diagram with events The diagram in Figure 4.48 shws all states that the bject plane can be in during the curse f its life. Furthermre, it shws the pssible

More information

Aerodynamic Separability in Tip Speed Ratio and Separability in Wind Speed- a Comparison

Aerodynamic Separability in Tip Speed Ratio and Separability in Wind Speed- a Comparison Jurnal f Physics: Cnference Series OPEN ACCESS Aerdynamic Separability in Tip Speed Rati and Separability in Wind Speed- a Cmparisn T cite this article: M L Gala Sants et al 14 J. Phys.: Cnf. Ser. 555

More information

initially lcated away frm the data set never win the cmpetitin, resulting in a nnptimal nal cdebk, [2] [3] [4] and [5]. Khnen's Self Organizing Featur

initially lcated away frm the data set never win the cmpetitin, resulting in a nnptimal nal cdebk, [2] [3] [4] and [5]. Khnen's Self Organizing Featur Cdewrd Distributin fr Frequency Sensitive Cmpetitive Learning with One Dimensinal Input Data Aristides S. Galanpuls and Stanley C. Ahalt Department f Electrical Engineering The Ohi State University Abstract

More information

We say that y is a linear function of x if. Chapter 13: The Correlation Coefficient and the Regression Line

We say that y is a linear function of x if. Chapter 13: The Correlation Coefficient and the Regression Line Chapter 13: The Crrelatin Cefficient and the Regressin Line We begin with a sme useful facts abut straight lines. Recall the x, y crdinate system, as pictured belw. 3 2 1 y = 2.5 y = 0.5x 3 2 1 1 2 3 1

More information

ECEN 4872/5827 Lecture Notes

ECEN 4872/5827 Lecture Notes ECEN 4872/5827 Lecture Ntes Lecture #5 Objectives fr lecture #5: 1. Analysis f precisin current reference 2. Appraches fr evaluating tlerances 3. Temperature Cefficients evaluatin technique 4. Fundamentals

More information