Robust Estimation of the Self-similarity Parameter in Network Traffic


1 Robust Estimation of the Self-similarity Parameter in Network Traffic
Haipeng Shen, Department of Statistics and Operations Research, University of North Carolina at Chapel Hill
Joint work with Thomas Lee (UC Davis) and Zhengyuan Zhu (Iowa State)
June 24, 2010
Haipeng Shen (UNC-CH), Statistics of Networks, Isaac Newton Institute

2 Outline
1 Motivation
2 Parameter Estimation
  Wavelet Estimators
  Level Shifts Removal
  Robust Regression
3 Simulation Studies
4 Analysis of Real Traces


4 Example: Abilene Trace
[Figure (a) Abilene: original trace. Packet count X(t) against sampling time t (100ms interval).]
2-hour trace, 100ms sampling unit
eastbound traffic on the Abilene Backbone Network between KC and Indianapolis

5 Long-range Dependence
X(t): a second-order stationary stochastic process
X(t) is long-range dependent (LRD) with parameter α if its autocorrelation function satisfies
$\gamma_X(k) \sim c_\gamma |k|^{-(1-\alpha)}$ as $k \to \infty$, for $\alpha \in (0, 1)$
$\sum_k \gamma_X(k) = \infty$; $\gamma_X(k)$ goes to zero very slowly
Alternative characterization: its spectrum function satisfies
$f_X(\nu) \sim c_f |\nu|^{-\alpha}$ as $\nu \to 0$, for $\alpha \in (0, 1)$   (1)
Example: $\mathrm{Var}(\bar{X}_n) \sim 2 c_\gamma n^{\alpha-1} / \{(1+\alpha)\alpha\}$, which decays more slowly than the usual $1/n$ rate
Doukhan, Oppenheim and Taqqu (2003)
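The variance example on this slide can be recovered with a short heuristic calculation. The sketch below assumes unit variance (so the autocorrelation equals the autocovariance) and replaces the sum by an integral; it is an illustration, not the argument given in the cited reference.

```latex
\begin{aligned}
\mathrm{Var}(\bar{X}_n)
  &= \frac{1}{n^2} \sum_{|k| < n} (n - |k|)\, \gamma_X(k) \\
  &\approx \frac{2 c_\gamma}{n^2} \int_0^n (n - k)\, k^{\alpha - 1}\, dk \\
  &= \frac{2 c_\gamma}{n^2}\, n^{\alpha + 1} \left( \frac{1}{\alpha} - \frac{1}{\alpha + 1} \right)
   = \frac{2 c_\gamma\, n^{\alpha - 1}}{\alpha (1 + \alpha)} .
\end{aligned}
```

For instance, with α = 0.8 and n = 10^4 the factor n^{α-1} is about 0.16, whereas 1/n is 10^{-4}: averaging reduces the variance far less than it would for uncorrelated data.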

6 Self-similarity
Y(t) is self-similar (SS) if and only if $c^{-H} Y(ct) \stackrel{d}{=} Y(t)$ for all $c > 0$
  $\stackrel{d}{=}$: equality in finite-dimensional distributions
  H: self-similarity parameter, or Hurst parameter
Close connection between LRD and SS
  If Y(t) has finite variance, and $1/2 < H < 1$, then its increments are LRD
  $H = (\alpha + 1)/2$, or $\alpha = 2H - 1$
Example:
  fractional Brownian motion (fbm): self-similar
  fractional Gaussian noise (fgn): LRD
Estimation of H or α (and $c_\gamma$)

7 Example: Abilene Trace ($\hat{H} = 1.19$)
[Figure (a) Abilene: original trace. Packet count X(t) against sampling time t (100ms interval).]

8 Example: UNC Trace I ($\hat{H} = 1.28$)
[Figure (a) UNC02 APR 09: original trace. Packet count X(t) against sampling time t (100ms interval).]

9 Example: UNC Trace II ($\hat{H} = 1.51$)
[Figure (a) UNC02 APR 13: original trace. Packet count X(t) against sampling time t (100ms interval).]

10 Example: UNC Trace III ($\hat{H} = 0.92$)
[Figure (a) UNC02 APR 11: missing values. Packet count X(t) against sampling time t (100ms interval).]

11 Example: UNC Trace III, zoomed-in view
[Figure (a) UNC02 APR 11: missing values. Packet count X(t) against sampling time t (100ms interval).]

12 Practical Challenges for Parameter Estimation
Non-stationarity
  gradual diurnal trends (maybe polynomial)
  abrupt mean level shifts (of various magnitudes)
  sudden drop of traffic level, missing values
  confounding between non-stationarity and LRD
Extreme values, both large and small
Our goals:
  robust estimation
  some characterization of the non-stationarity

13 Example: Abilene Trace, combined
[Figure 1: One Problematic Real Trace with Level Shifts. Panels: (a) Abilene: original trace, packet count X(t) against sampling time t (100ms interval); (b) Abilene: Logscale Diagram, y_j against octave j, with LD and RLD curves; (c) Abilene: level shifts, alpha(t) against sampling time t; (d) Abilene: level shifts removed, detrended trace beta(t) against sampling time t.]


16 Existing Estimators for H or α
Earlier estimators:
  the aggregated variance estimator
  the periodogram based estimator
  the Whittle estimator
  see Taqqu, Teverovsky and Willinger (1995)
Our focus: wavelet-based estimators
  Abry and Veitch (1998), Veitch and Abry (1999)
  Soltani, Simard and Boichu (2004)

17 The Abry-Veitch (AV) Wavelet Estimator
Consider a family of wavelet basis functions: $\{\psi_{j,k}(t) = 2^{-j/2}\psi_0(2^{-j}t - k),\ j = 1, \dots, J,\ k \in \mathbb{Z}\}$
X(t): second-order stationary, with LRD parameter α
Discrete wavelet transform coefficients: $d_X(j,k) = \langle X(t), \psi_{j,k} \rangle$
Veitch and Abry (1999):
  the $d_X(j,\cdot)$'s are iid Gaussian for a fixed j
  $d_X(j,\cdot)$ and $d_X(j',\cdot)$ are independent when $j \ne j'$
  for large j, $E\,d_X^2(j,\cdot) = 2^{j\alpha} c_f C$,   (2)
  where $C = C(\alpha, \psi_0)$

18 The AV Estimator
Equation (2) suggests that $\log_2\{E\,d_X^2(j,\cdot)\} = j\alpha + \log_2(c_f C)$
To estimate $E\,d_X^2(j,\cdot)$, consider $\mu_j = \frac{1}{n_j}\sum_{k=1}^{n_j} d_X^2(j,k)$, where $n_j$: number of coefficients at scale j
Hence, $\log_2\mu_j \stackrel{d}{=} j\alpha + \log_2(c_f C) - \log_2(n_j) + \ln X_{n_j}/\ln 2$, where $X_{n_j}$ is Chi-squared with $n_j$ degrees of freedom
$E(\log_2\mu_j) = j\alpha + \log_2(c_f C) + g_j$, $\mathrm{var}(\log_2\mu_j) = \zeta(2, n_j/2)/\ln^2 2$
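The slide does not spell out the bias term g_j. A short worked step, using the standard moments of the logarithm of a Chi-squared variable (with ψ the digamma function, a symbol not used on the slide), gives an explicit form consistent with the mean and variance stated above:

```latex
\begin{aligned}
E\{\ln X_{n_j}\} &= \psi(n_j/2) + \ln 2,
\qquad
\mathrm{var}\{\ln X_{n_j}\} = \psi'(n_j/2) = \zeta(2, n_j/2), \\
g_j &= -\log_2(n_j) + \frac{\psi(n_j/2) + \ln 2}{\ln 2}
     = \frac{\psi(n_j/2)}{\ln 2} - \log_2(n_j/2).
\end{aligned}
```

Since ψ(x) is close to ln x - 1/(2x) for large x, g_j is roughly -1/(n_j ln 2): negligible at fine scales, where n_j is large, and most noticeable at the coarsest octaves.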

19 The AV Estimator
To correct for bias, denote $y_j \equiv \log_2\mu_j - g_j$
Consider linear regression $y_j = j\alpha + \log_2(c_f C) + \epsilon_j$   (3)
$\epsilon_j$ has mean 0 and variance $\zeta(2, n_j/2)/\ln^2 2$
The AV estimator of α: the weighted-least-squares (WLS) estimate from Model (3)
Then, $\hat{H} = (\hat{\alpha} + 1)/2$
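The regression in Model (3) can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes a Haar wavelet, takes g_j = ψ(n_j/2)/ln 2 - log2(n_j/2) from the Chi-squared moments, and uses hypothetical helper names (`digamma`, `trigamma`, `haar_details`, `av_estimate`) and a user-chosen octave range `j1..j2`.

```python
import math
import random

def digamma(x):
    """Digamma via upward recurrence plus the standard asymptotic series."""
    acc = 0.0
    while x < 6.0:
        acc -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    return acc + math.log(x) - 0.5 / x - inv2 * (1 / 12 - inv2 * (1 / 120 - inv2 / 252))

def trigamma(x):
    """Trigamma, which equals the Hurwitz zeta value zeta(2, x)."""
    acc = 0.0
    while x < 6.0:
        acc += 1.0 / (x * x)
        x += 1.0
    inv2 = 1.0 / (x * x)
    return acc + 1.0 / x + 0.5 * inv2 + inv2 / x * (1 / 6 - inv2 * (1 / 30 - inv2 / 42))

def haar_details(x, n_octaves):
    """Haar detail coefficients d_X(j, .) for octaves j = 1..n_octaves."""
    approx, details = list(x), []
    for _ in range(n_octaves):
        pairs = len(approx) // 2
        d = [(approx[2 * k] - approx[2 * k + 1]) / math.sqrt(2.0) for k in range(pairs)]
        approx = [(approx[2 * k] + approx[2 * k + 1]) / math.sqrt(2.0) for k in range(pairs)]
        details.append(d)
    return details

def av_estimate(x, j1, j2):
    """WLS fit of the bias-corrected y_j on j over octaves j1..j2."""
    ln2 = math.log(2.0)
    js, ys, ws = [], [], []
    for j, d in enumerate(haar_details(x, j2), start=1):
        if j < j1:
            continue
        n_j = len(d)
        mu_j = sum(v * v for v in d) / n_j
        g_j = digamma(n_j / 2.0) / ln2 - math.log2(n_j / 2.0)
        js.append(j)
        ys.append(math.log2(mu_j) - g_j)
        ws.append(ln2 ** 2 / trigamma(n_j / 2.0))  # weight = 1 / var(eps_j)
    sw = sum(ws)
    jm = sum(w * j for w, j in zip(ws, js)) / sw
    ym = sum(w * y for w, y in zip(ws, ys)) / sw
    alpha = (sum(w * (j - jm) * (y - ym) for w, j, y in zip(ws, js, ys))
             / sum(w * (j - jm) ** 2 for w, j in zip(ws, js)))
    return alpha, (alpha + 1.0) / 2.0

# Usage: white noise has alpha = 0, so the estimate of H should be near 1/2.
rng = random.Random(1)
noise = [rng.gauss(0.0, 1.0) for _ in range(4096)]
alpha_hat, h_hat = av_estimate(noise, 1, 8)
```

In practice the lower octave `j1` would be chosen where the Logscale Diagram becomes linear; the fixed range used here is only for the white-noise check.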

20 The AV Estimator: Logscale Diagram
Logscale Diagram (LD): a plot of $y_j$ against j
For LRD processes, the upper part of the LD forms a straight line of slope α; a diagnostic tool for the existence of LRD
Selection of the onset of scaling: Veitch, Abry and Taqqu (2003), Park and Park (2009)
[Figure 3: Sample fgn with level shifts. Panels shown: (c) H = 0.9, synthetic fractional Gaussian trace, X(t) against sampling time t; (d) H = 0.9, y_j against octave j, with True LD, LD and RLD curves.]
Excerpt shown with the figure: For the trace shown in Figure 3(c), Figure 3(d) plots the Logscale Diagrams of the original fgn (True LD) and the level-shifts-added fgn (LD) along with the robust LD (RLD). Approximate Gaussian confidence intervals are also provided along the True LD using the variance of $\epsilon_j$ (Section 3.2). As one can see, the upper part of the RLD looks ...

21 The SSB Estimator
Soltani et al (2004) consider $D_{j,k} = \{d_X^2(j,k) + d_X^2(j,k+n_j/2)\}/2$
Then, $\mathcal{D}_{j,k} \equiv \log_2 D_{j,k} \stackrel{d}{=} j\alpha + \log_2(c_f C) - 1 + \ln X_2/\ln 2$,   (4)
where $X_2$: Chi-squared with two degrees of freedom
It follows that $\mathcal{D}_{j,k}$ has a negative Gumbel distribution
  mean: $j\alpha + \log_2(c_f C) - \delta/\ln 2$, where $\delta \approx 0.5772$ is Euler's constant
  variance: $\pi^2/(6\ln^2 2)$

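22 The SSB Estimator
Define $\bar{\mathcal{D}}_j \equiv \frac{1}{n_j/2}\sum_{k=1}^{n_j/2}\mathcal{D}_{j,k}$, the average at octave j of the logged quantities $\mathcal{D}_{j,k} = \log_2 D_{j,k}$
Then, $\bar{\mathcal{D}}_j \stackrel{d}{=} j\alpha + \log_2(c_f C) - \delta/\ln 2 + \epsilon_j$   (5)
Central Limit Theorem suggests that $\epsilon_j \sim N\left(0, \pi^2/(3 n_j \ln^2 2)\right)$
Finally, to estimate α, perform WLS regression of $\bar{\mathcal{D}}_j$ on j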
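The SSB recipe (pair the squared coefficients, take log base 2, average per octave, then run a weighted regression on j) can be sketched as follows. This is a sketch under stated assumptions rather than the published code: a Haar wavelet stands in for the unspecified wavelet, and `haar_details`, `ssb_estimate` and the octave range `j1..j2` are hypothetical names.

```python
import math
import random

def haar_details(x, n_octaves):
    """Haar detail coefficients d_X(j, .) for octaves j = 1..n_octaves."""
    approx, details = list(x), []
    for _ in range(n_octaves):
        pairs = len(approx) // 2
        d = [(approx[2 * k] - approx[2 * k + 1]) / math.sqrt(2.0) for k in range(pairs)]
        approx = [(approx[2 * k] + approx[2 * k + 1]) / math.sqrt(2.0) for k in range(pairs)]
        details.append(d)
    return details

def ssb_estimate(x, j1, j2):
    """WLS regression of the per-octave mean of log2 pair averages on j."""
    ln2 = math.log(2.0)
    js, means, ws = [], [], []
    for j, d in enumerate(haar_details(x, j2), start=1):
        if j < j1:
            continue
        half = len(d) // 2
        # log2 of the average of d^2(j, k) and d^2(j, k + n_j/2)
        logs = [math.log2((d[k] ** 2 + d[k + half] ** 2) / 2.0) for k in range(half)]
        js.append(j)
        means.append(sum(logs) / half)
        # weight = 1 / var(eps_j), with var(eps_j) = pi^2 / (3 n_j ln^2 2)
        ws.append(3.0 * (2 * half) * ln2 ** 2 / math.pi ** 2)
    sw = sum(ws)
    jm = sum(w * j for w, j in zip(ws, js)) / sw
    ym = sum(w * y for w, y in zip(ws, means)) / sw
    alpha = (sum(w * (j - jm) * (y - ym) for w, j, y in zip(ws, js, means))
             / sum(w * (j - jm) ** 2 for w, j in zip(ws, js)))
    return alpha, (alpha + 1.0) / 2.0

# Usage: for white noise the slope should be near 0, i.e. H near 1/2.
rng = random.Random(2)
noise = [rng.gauss(0.0, 1.0) for _ in range(4096)]
alpha_ssb, h_ssb = ssb_estimate(noise, 1, 8)
```

The constant offset -δ/ln 2 is the same at every octave, so it shifts the intercept only and needs no correction when estimating the slope.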

23 Comparison of the Two Estimators
AV: mean, then log
SSB: log, then mean
SSB performs slightly better than AV
  more immune to heavy-tailed fluctuations
  Stoev, Pipiras and Taqqu (2002)

24 Advantages of Wavelet Estimators
naturally incorporate the scaling
unbiased, asymptotically efficient
fast computation due to DWT (O(n))
robust to polynomial trends, depending on the number of vanishing moments of the wavelet basis
robust to (moderate) level shifts in mean and variance (Roughan and Veitch, 1999)

25 LRD under Non-stationarity
Roughan and Veitch (1999) considered a class of non-stationary LRD models
$X(t; m, \sigma, H, c_\gamma) = m(t) + \sigma(t)\,W(t; H, c_\gamma)$
where $W(t; H, c_\gamma)$: mean zero, unit variance LRD
They specifically looked at models of mean level shifts
$X(t) = T(t; J, S, n/2) + W(t; H, c_\gamma)$
where $T(t; J, S, L) = 1 + \frac{J}{2} + \frac{J}{\pi}\arctan\left(\frac{t - L}{S}\right)$
  L: location of the shift
  J: size of the shift
  S: smoothness of the shift
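To make the transition function concrete, here is a small sketch of T(t; J, S, L). The function name `transition` is hypothetical, and the treatment of S = 0 as a step function is an assumption corresponding to the limit of the arctan form as S decreases to 0.

```python
import math

def transition(t, J, S, L):
    """Mean level shift of size J centred at L with smoothness S.

    The level moves from 1 (far left) to 1 + J (far right).
    S = 0 is handled as the limiting step function (an assumption here).
    """
    if S == 0:
        if t < L:
            return 1.0
        if t > L:
            return 1.0 + J
        return 1.0 + J / 2.0
    return 1.0 + J / 2.0 + (J / math.pi) * math.atan((t - L) / S)

# Usage: mean level of a series of length n with a shift at n/2.
n = 16384
mean_level = [transition(t, 1.0, 300.0, n / 2) for t in range(n)]
```

Larger S spreads the change over a longer window: at t = L + S the shift is three quarters complete, since arctan(1) = π/4.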

26 A Class of Mean Level Shifts Models
Excerpt from Roughan and Veitch (1999), shown as a page image on the slide (gaps at the crop edges marked ...):
... specifically, we begin with a stationary LRD model, and define a class of non-stationary variations by transforming it to induce a change in the mean and/or variance, whilst the parameters measuring the LRD, including H, remain well defined and constant. In this way some time-varying properties are allowed, and are well defined, but important features of the original stationary model remain, and remain well defined also.
A class of non-stationary LRD models for the traffic rate X(t) is again given by transformation of the mean zero, unit variance LRD $W(t; H, c_\gamma)$, resulting in
$X(t; m, \sigma, H, c_\gamma) = m(t) + \sigma(t)\,W(t; H, c_\gamma)$   (6)
where m(t) and σ(t) are positive functions of time. Comparing with (5), we see that the location and scale parameters have become time varying, but the shape function $\gamma_W$, and its associated parameters $(H, c_\gamma)$, do not change. In fact $m_X(t) = m(t)$, $\sigma_X^2(t) = \sigma^2(t)$ and
$R_X(t, s) = \sigma(t)\sigma(s)\,\gamma_W(t - s; H, c_\gamma)$, $\gamma_X(t, s) = \gamma_W(t - s; H, c_\gamma) \equiv \gamma_W(k; H, c_\gamma)$.   (7)
Thus, although the autocovariance function is no longer a function of the lag only, the autocorrelation function retains this property despite the non-stationarities in location and scale. Since we have used a definition of LRD based on such an autocorrelation function, it remains well defined, and gives a precise meaning to the notion of non-stationary LRD models, where the LRD parameters $(H, c_\gamma)$ retain their physical meanings, and remain constant. Thus, in this framework the estimation of $(H, c_\gamma)$ has the meaning of measuring the stationary part of the non-stationary traffic model. In this paper we concentrate on the robust estimation of H. Although the estimation of $c_\gamma$ in the normal stationary context is well understood ([21], [20]), the estimation of $c_\gamma$ in the non-stationary context is more difficult and will be studied elsewhere.
For the remainder of the paper we will consider a particular form of m(t) and σ(t), namely that of a level shift a monotone ...
... members of the family are illustrated with smoothness values S = {...}, each with J = 1 and L = 8192. The same smoothness values are used in simulations although, due to space limitations, typically only results for S = {0, 300} will be shown. The case S = 0 corresponds to the limit of the above function as S → 0 from above, namely a step function. The smoothness parameter has the dimensions of time and gives a measure of the duration of the transition region. A dimensionless measure of the rapidity of change across the region, ...
[Fig 1: The transition functions, with jump size J = 1.0; curves labelled Smoothness = 0, 40, 300, 1200.]
[Fig 2: Non-stationary FGN (parameters shown on each subplot). The white lines show the mean, while the dashed lines show one standard deviation about the mean. The left (right) figure shows NS FGN's constructed according to Model I (resp II). Panel title: Non stationary FGN, model 1: H = 0.80, c_f = 0.28, sd = 10, jump size = 40, smoothness = 1200.]

27 Robustness against Mean (and Variance) Level Shifts
[TABLE I from Roughan and Veitch (1999): estimates for stationary FGN and for non-stationary FGN with a mean shift and with a variance shift. The numerical entries did not survive the transcription.]
the exact size/location of shifts: helpful for deciding LRD region
multiple shifts


29 Our Model
Suppose observing X(t) at a set of n discrete time points
The model:
$X(t) = \alpha(t) + \beta(t)$, $\alpha(t) = \sum_i \mu_i 1_{[t_i, t_{i+1})}(t)$, $t_0 = 1 < t_1 < \dots < t_{m-1} < t_m = n$
  α(t): mean level shifts
  $T = \{t_1, \dots, t_{m-1}\}$: the collection of the shift locations
  β(t): stationary LRD with a Hurst parameter H
Connection with the non-stationary LRD model (Roughan and Veitch, 1999), and the alpha-beta model (Sarvotham, Riedi and Baraniuk, 2001)
The issue of model selection:
  select model for α(t): m and T
  estimation of H

30 Defining A Best Fitting Model
Need to define the best combination of m and T
Bayesian Information Criterion (BIC) (Schwarz, 1978):
$\mathrm{BIC}(m, T \mid H_0) = -2\,l(m, T \mid H_0) + (2m - 1)\ln n$
  $l(m, T \mid H_0)$: the conditional log likelihood for the fitted model
  $2m - 1$: number of parameters in the fitted model
  trade-off between maximum likelihood and model size
Suppose β(t) is fgn with a Hurst parameter $H_0$. Then,
$l(m, T \mid H_0) = C(n) - \frac{n}{2}\ln\sum_{t,s}\{X(t) - \hat{\alpha}(t)\}\{X(s) - \hat{\alpha}(s)\}W_{ts}$,
where $W_{ts}$ is the (t, s)-th element of the inverse correlation matrix
The best model has (m, T) that minimizes BIC

31 Defining A Best Fitting Model
Once (m, T) is decided,
$\hat{\mu}_i = \frac{1}{t_{i+1} - t_i}\sum_{t = t_i}^{t_{i+1} - 1} X(t)$, $\hat{\alpha}(t) = \sum_i \hat{\mu}_i 1_{[\hat{t}_i, \hat{t}_{i+1})}(t)$
Residual Sum of Squares: $\mathrm{RSS}_m = \sum_t \{X(t) - \hat{\alpha}(t)\}^2$
Note that the BIC depends on an initial value $H_0$
To solve this (cyclic) problem
  consider a set of initial candidate values for $H_0$
  derive the best BIC under each candidate
  select $H_0$ as the one that gives the smallest BIC value
  determine the final level shifts under the corresponding (m, T)

32 Locating the Best Model
Non-trivial to minimize BIC
  n is typically huge, m varies amongst the models
Consider a greedy merging algorithm
  begin with an over-fitting model, for example, mean shifts at every other time point
  at each step, merge two adjacent segments to form a single bigger segment, which increases $\mathrm{RSS}_m$ the least
  continue the merging until there is only one segment left, corresponding to no mean shift
Choose the smallest BIC from the above nested sequence of models
Efficient updating of $\mathrm{RSS}_m$ exists
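The greedy merging steps can be sketched as below. One loud simplification: the sketch treats the stationary part as white noise (H_0 = 0.5), so the inverse correlation matrix is the identity and the likelihood's quadratic form reduces to the RSS; the slides' criterion uses the fgn correlation for a candidate H_0 instead. The function name `greedy_level_shifts` and its arguments are hypothetical. The RSS update uses the fact that merging neighbours of sizes n1, n2 with means m1, m2 raises the RSS by n1 n2 (m1 - m2)^2 / (n1 + n2).

```python
import math

def greedy_level_shifts(x, init_len=2):
    """Bottom-up merging of adjacent segments, keeping the model with the smallest BIC.

    Simplifying assumption (not the slides' criterion): white-noise errors,
    so BIC = n ln(RSS_m) + (2m - 1) ln(n) up to an additive constant.
    Returns (shift_locations, segment_means); locations are 0-based start indices.
    """
    n = len(x)
    # Over-fitted start: one segment per block of init_len points, stored as [start, count, sum].
    segs = []
    for s in range(0, n, init_len):
        block = x[s:s + init_len]
        segs.append([s, len(block), float(sum(block))])
    rss = sum((v - seg[2] / seg[1]) ** 2
              for seg in segs for v in x[seg[0]:seg[0] + seg[1]])

    def bic(rss_m, m):
        return n * math.log(max(rss_m, 1e-12)) + (2 * m - 1) * math.log(n)

    def merge_cost(i):
        # Increase in RSS caused by merging segments i and i + 1.
        (_, n1, s1), (_, n2, s2) = segs[i], segs[i + 1]
        return n1 * n2 / (n1 + n2) * (s1 / n1 - s2 / n2) ** 2

    best_score, best_segs = bic(rss, len(segs)), [seg[:] for seg in segs]
    while len(segs) > 1:
        i = min(range(len(segs) - 1), key=merge_cost)
        rss += merge_cost(i)
        segs[i] = [segs[i][0], segs[i][1] + segs[i + 1][1], segs[i][2] + segs[i + 1][2]]
        del segs[i + 1]
        score = bic(rss, len(segs))
        if score < best_score:
            best_score, best_segs = score, [seg[:] for seg in segs]
    shifts = [seg[0] for seg in best_segs[1:]]
    means = [seg[2] / seg[1] for seg in best_segs]
    return shifts, means

# Usage: a deterministic toy series with one level shift at index 100.
toy = [(t % 2) + (5 if t >= 100 else 0) for t in range(200)]
toy_shifts, toy_means = greedy_level_shifts(toy)
```

This version rescans all neighbouring pairs at every step, which is quadratic in the number of initial segments; for long traces a priority queue over merge costs would be the natural refinement.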


34 Robust Regression
Consider linear model $y_j = x_j^T\beta + \epsilon_j$
Least-squares (LS) regression estimates β by $\hat{\beta}_{LS} = \arg\min_\beta \sum_j (y_j - x_j^T\beta)^2$
Both AV and SSB estimators use LS
However, when the errors have a heavier tail than Gaussian, LS procedures can be very inefficient and unstable

35 Robust Regression: the SSB estimator
The SSB estimator is based on $\mathcal{D}_{j,k} = \log_2 D_{j,k}$
[Figure 2: Gaussian Quantile Plot of the $\mathcal{D}_{j,k}$'s at Octave ...; empirical quantiles against quantiles of the standard normal.]
The response $\bar{\mathcal{D}}_j \equiv \frac{1}{n_j/2}\sum_{k=1}^{n_j/2}\mathcal{D}_{j,k}$
  only the $\bar{\mathcal{D}}_j$ at large octaves are useful for estimating H
  however, not many wavelet coefficients at those scales
  the effect of CLT may not be significant
We propose to use robust regression techniques
Cropped paper text visible behind the figure (gaps marked ...): ... the sum of the first h ordered squared residuals respectively. LTS can ... 50%, but it is numerically more difficult to obtain. As described in Section 2, the AV estimator explores the log linear ... $d_X^2(j,k)$ (ie $\mu_j$) and j, and fits a WLS of $\log_2(\mu_j)$ (ie $y_j$) against j, ... the average of $\mathcal{D}_{j,k}$, against j. Our simulation results (Section 4) ...

36 Robust Regression
$L_1$ regression (Edgeworth, 1887): $\hat{\beta}_{L_1} = \arg\min_\beta \sum_j |y_j - x_j^T\beta|$
M-estimation (Huber, 1981): $\hat{\beta}_M = \arg\min_\beta \sum_j \rho(y_j - x_j^T\beta)$
  ρ(·) down-weights extreme observations
  Iteratively Re-weighted Least Squares (IRLS) algorithm (Heiberger and Becker, 1992)
Least trimmed squares estimation (Rousseeuw, 1984): $\hat{\beta}_{LTS} = \arg\min_\beta \sum_{j=1}^{h} r^2_{(j)}$,
where $r^2_{(1)} \le r^2_{(2)} \le \dots$ are the ordered squared residuals, with $r_j = y_j - x_j^T\beta$
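As an illustration of the IRLS idea for M-estimation, the sketch below fits a straight line with Huber weights. The slide does not say which ρ or scale estimate is used, so the Huber function with tuning constant c = 1.345 and the MAD-based scale are assumptions, and `wls_line` and `huber_irls` are hypothetical names.

```python
import statistics

def wls_line(x, y, w):
    """Weighted least-squares fit of y = a + b x; returns (a, b)."""
    sw = sum(w)
    xm = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ym = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xm) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xm) * (yi - ym) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return ym - slope * xm, slope

def huber_irls(x, y, c=1.345, max_iter=100, tol=1e-10):
    """Huber M-estimate of a line via iteratively re-weighted least squares."""
    a, b = wls_line(x, y, [1.0] * len(x))  # ordinary LS start
    for _ in range(max_iter):
        r = [yi - a - b * xi for xi, yi in zip(x, y)]
        med = statistics.median(r)
        # Robust scale: median absolute deviation, rescaled for Gaussian errors.
        scale = statistics.median(abs(ri - med) for ri in r) / 0.6745
        if scale <= 0.0:
            break
        # Huber weights: 1 inside the band, shrinking as c*scale/|r| outside it.
        w = [1.0 if abs(ri) <= c * scale else c * scale / abs(ri) for ri in r]
        a_new, b_new = wls_line(x, y, w)
        converged = abs(a_new - a) + abs(b_new - b) < tol
        a, b = a_new, b_new
        if converged:
            break
    return a, b

# Usage: points on the line y = 1 + 2x, with one gross outlier at the last point.
xs = list(range(10))
ys = [1.0 + 2.0 * xi for xi in xs]
ys[9] += 30.0
a_ls, b_ls = wls_line(xs, ys, [1.0] * len(xs))
a_rob, b_rob = huber_irls(xs, ys)
```

The outlier sits at the end of the design, where it has the most leverage on the LS slope; the re-weighting steadily shrinks its influence. In the wavelet setting the same idea would be applied to the regression of the logscale values on the octave j.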

37 Robust Wavelet Estimation of H
1 For each candidate of $H_0$, say $H_k$,
  1 Use the level shift removing method to determine the number and locations of mean level shifts $(m_k, T_k)$ in X(t)
  2 Choose the combination $(m_k, T_k, H_k)$ that gives the smallest BIC
2 Remove the estimated level shifts $\hat{\alpha}(t)$ from X(t), and obtain the estimated β(t) as $\hat{\beta}(t) = X(t) - \hat{\alpha}(t)$
3 Apply robust regression to the wavelet coefficients of $\hat{\beta}(t)$ to obtain a final robust estimate $\hat{H}$

38 Robust Logscale Diagram
A robust version of the Logscale Diagram
Instead of $y_j$ or $\bar{\mathcal{D}}_j$, find a robust "center" of $\mathcal{D}_{j,k}$, the logged wavelet coefficients
  depend on the robust regression technique
Fit the robust regression to the upper part of the diagram
[Figure 3: Sample fgn with level shifts. Panels shown: H = 0.9, X(t) against sampling time t; (d) H = 0.9, y_j against octave j, with True LD, LD and RLD curves.]


40 Simulation Setup
Model: $X_t = \mu_t + \epsilon_t$
$\epsilon_t$: fractional Gaussian noise (fgn) with variance $\sigma^2$ and autocorrelation function
$\gamma(k) = (|k + 1|^{2H} + |k - 1|^{2H} - 2|k|^{2H})/2$, $k \ge 0$   (6)
  H = 1/2: $\epsilon_t$ white noise
  1/2 < H < 1: LRD
Simulation I: fgn
Simulation II: fgn with missing values
Simulation III: fgn with mean level shifts
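A quick numerical look at the fgn autocorrelation in (6) shows how it ties back to the LRD definition: for large k it behaves like H(2H - 1) k^{2H - 2}, that is, a power law with exponent -(1 - α) and α = 2H - 1. The sketch below uses a hypothetical function name `fgn_acf`.

```python
def fgn_acf(k, H):
    """Autocorrelation of fractional Gaussian noise at lag k, as in equation (6)."""
    k = abs(k)
    return 0.5 * ((k + 1) ** (2 * H) + abs(k - 1) ** (2 * H) - 2 * k ** (2 * H))

# White noise: every non-zero lag has zero correlation.
white = [fgn_acf(k, 0.5) for k in range(1, 5)]

# LRD case: compare the exact value with the power-law approximation at a large lag.
H = 0.9
exact = fgn_acf(1000, H)
power_law = H * (2 * H - 1) * 1000 ** (2 * H - 2)
ratio = exact / power_law
```

For H = 0.9 the lag-1 autocorrelation is 2^{0.8} - 1, about 0.74, and the correlations decay so slowly that their sum diverges.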

41 Simulation I: fgn
Simulate 50 fgn traces using the circulant embedding method of Dietrich and Newsam (1997)
  $n = 2^{14}$, μ = σ = 20
  H = {0.5, 0.75, 0.9}
Compare bias, standard error (SE), root mean squared error (RMSE)
Consider SSB, $L_1$, IRLS, LTS
Level-shifts removal (correctly) found no or very few level shifts
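A compact sketch of circulant embedding for fgn is given below. It is an illustration under stated assumptions, not the code used for the talk: it relies on NumPy, embeds the covariance of lags 0..n in a circulant matrix of size 2n, and uses hypothetical names `fgn_autocov` and `simulate_fgn`. The output has unit variance and zero mean, so it would be rescaled and shifted to match the μ and σ on the slide.

```python
import numpy as np

def fgn_autocov(k, H):
    """Unit-variance fgn autocovariance at lag(s) k."""
    k = np.abs(np.asarray(k, dtype=float))
    return 0.5 * ((k + 1) ** (2 * H) + np.abs(k - 1) ** (2 * H) - 2 * k ** (2 * H))

def simulate_fgn(n, H, rng):
    """Draw one fgn path of length n by circulant embedding."""
    g = fgn_autocov(np.arange(n + 1), H)
    # First row of the symmetric circulant matrix: lags 0..n, then n-1..1.
    row = np.concatenate([g, g[-2:0:-1]])
    lam = np.fft.fft(row).real  # eigenvalues of the circulant matrix
    if lam.min() < -1e-8 * lam.max():
        raise ValueError("circulant embedding is not non-negative definite")
    lam = np.clip(lam, 0.0, None)
    m = row.size
    z = rng.standard_normal(m) + 1j * rng.standard_normal(m)
    y = np.fft.fft(np.sqrt(lam / m) * z)
    # The real part has the target covariance; keep the first n values.
    return y.real[:n]

# Usage: one trace per Hurst value considered in Simulation I.
rng = np.random.default_rng(0)
traces = {H: simulate_fgn(2 ** 12, H, rng) for H in (0.5, 0.75, 0.9)}
```

The imaginary part of `y` is a second, independent path with the same covariance, so each pair of FFTs can yield two traces if needed.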

42 Simulation I: fgn
[Fig 3: Sample fgn. Panels for H = 0.5, 0.75 and 0.9; X(t) against t.]
Cropped paper text shown on the slide (gaps at the crop edges marked ...):
The bias, standard error (se) and root mean squared errors (RMSE) of the 50 estimated H's under each combination of H and the estimation method are presented in Table I. From Table I we can see that all ... yield very accurate estimates of H when ... are exact fgn. IRLS gives marginally ... than LTS and $L_1$ regression. ... In sequel we ... IRLS in the simulation studies for robust ... IRLS is computational quicker than LTS ... all three robust procedures are expected to ... performances.
We also applied the level shifts removing ... the simulated traces. For H = 0.5 and 0.75 ... correctly found no level shifts for all 50 simulated traces. For H = 0.9, the method found no level shifts for ... traces, one level shift for 16 traces, two ... for five traces, and three level shifts for ... In all cases, the estimated H for the level shifts ... traces are almost identical to those of the ...

43 Comparison of SSB, IRLS, LTS, $L_1$ estimators for fgn
[Table 1: Comparison of the SSB estimator and three robust regression estimators. Columns: SSB, IRLS, LTS, $L_1$. Rows: bias, se and RMSE for each of H = 0.5, 0.75 and 0.9. The numerical entries did not survive the transcription.]
... true value of H, $\hat{H}_i$ is the estimated H for the i-th trace, and $\bar{H}$ is the ...

44 Simulation II: fgn with missing values
200 time points set to 0 (1% missing)
  at the beginning
  in the middle
[Fig 4: Sample fgn with missing values. Two typical fgn traces with H = 0.5 and missing values added; vertical dashed lines are superimposed to highlight the location of the missing values. X(t) against t.]

45 Comparison of SSB, IRLS and RW estimators for fgn with missing values
[Table: bias, SE and RMSE of the SSB, IRLS and RW estimators for H = 0.5, 0.75 and 0.9, reported separately for values missing at the beginning and missing in the middle. The numerical entries did not survive the transcription.]
... estimator employs the mean level shifts removing algorithm first before applying ...

46 Simulation III: Sample fgn with level shifts
Take the fgn traces in Simulation I
Add level shifts:
  for H = 0.5, 0.75, 1 unit of the grand mean
  for H = 0.9, 4 units of the grand mean
Selection of number of shifts:
  correct %: 98% (H = 0.5), 84% (H = 0.75), 68% (H = 0.9)
  For larger H, harder to distinguish level shifts from natural variation caused by strong auto-correlation

47 Simulation III: Sample fgn with level shifts
[Figure 3: Sample fgn with level shifts. Panels: (a) H = 0.5, (b) H = 0.75 and (c) H = 0.9, X(t) against sampling time t; (d) H = 0.9, y_j against octave j, with True LD, LD and RLD curves.]

48 Comparison of SSB, IRLS and RW estimators for fgn with level shifts
[Table 2: Comparison of the SSB, IRLS, and RW estimators for fgn with level shifts. Rows: bias, SE and RMSE for each of H = 0.5, 0.75 and 0.9. The numerical entries did not survive the transcription.]


50 Comparison of the AV and RW estimators for the four real traces
[Table: one row per trace (Abilene, UNC I, UNC II, UNC III); columns Octave and $\hat{H}$ under AV, and m and $\hat{H}$ under RW. The numerical entries did not survive the transcription.]

51 Example: UNC Trace I ($\hat{H} = 1.28$)
[Figure (a) UNC02 APR 09: original trace. Packet count X(t) against sampling time t (100ms interval); Logscale Diagram, y_j against octave j.]

52 Example: UNC Trace II ($\hat{H} = 1.51$)
[Figure (a) UNC02 APR 13: original trace. Packet count X(t) against sampling time t (100ms interval); Logscale Diagrams, y_j against octave j.]

53 Example: UNC Trace III ($\hat{H} = 0.92$)
[Figure (a) UNC02 APR 11: missing values. Packet count X(t) against sampling time t (100ms interval); Logscale Diagrams, y_j against octave j.]

54 Subtrace consistency
[Table: for each trace (Abilene, UNC I, UNC II, UNC III) and each estimator (AV, RW), the estimates on subtraces S1, S2, S3 and S4, and the RMSE. The numerical entries did not survive the transcription.]

55 Summary
Take home messages
  proposed a robust wavelet estimator for H or α
  level-shift removal + robust regression
  illustrated its performance via simulated and real traces
Future work
  model magnitude/duration of level-shifts, connection with bursty periods
  different model selection criterion
  incorporate changing variance


More information

Statistical analysis of peer-to-peer live streaming traffic

Statistical analysis of peer-to-peer live streaming traffic Statistical analysis of peer-to-peer live streaming traffic Levente Bodrog 1 Ákos Horváth 1 Miklós Telek 1 1 Technical University of Budapest Probability and Statistics with Applications, 2009 Outline

More information

Chapter 3: Regression Methods for Trends

Chapter 3: Regression Methods for Trends Chapter 3: Regression Methods for Trends Time series exhibiting trends over time have a mean function that is some simple function (not necessarily constant) of time. The example random walk graph from

More information

Regression, Ridge Regression, Lasso

Regression, Ridge Regression, Lasso Regression, Ridge Regression, Lasso Fabio G. Cozman - fgcozman@usp.br October 2, 2018 A general definition Regression studies the relationship between a response variable Y and covariates X 1,..., X n.

More information

Multiplicative Multifractal Modeling of. Long-Range-Dependent (LRD) Trac in. Computer Communications Networks. Jianbo Gao and Izhak Rubin

Multiplicative Multifractal Modeling of. Long-Range-Dependent (LRD) Trac in. Computer Communications Networks. Jianbo Gao and Izhak Rubin Multiplicative Multifractal Modeling of Long-Range-Dependent (LRD) Trac in Computer Communications Networks Jianbo Gao and Izhak Rubin Electrical Engineering Department, University of California, Los Angeles

More information

Central Limit Theorem ( 5.3)

Central Limit Theorem ( 5.3) Central Limit Theorem ( 5.3) Let X 1, X 2,... be a sequence of independent random variables, each having n mean µ and variance σ 2. Then the distribution of the partial sum S n = X i i=1 becomes approximately

More information
