
Modeling, Estimation and Control for Telecommunication Networks

Notes for the MGR-815 course
12 June 2010
School of Superior Technology
Professor Zbigniew Dziong

Table of Contents

Preface
1. Example of the analytic model of a VoIP system
   1.1. The basics and the packet loss rate
   1.2. Binomial distribution
   1.3. Control of connections admission
   1.4. Sizing the link
2. Introduction to time series and forecasting
   2.1. Decomposition of a time series
   2.2. Types of forecasting
   2.3. Forecasting error
   2.4. Forecasting techniques
   2.5. Choosing a forecasting technique
3. Basic Concepts
   3.1. Probability
   3.2. Probability distribution
   3.3. Normal distribution and Student's t law
   3.4. Central limit theorem
   3.5. Confidence interval
      3.5.1. Example of calculation of the confidence interval
      3.5.2. Application of the confidence interval for simulation results
   3.6. Regression
      3.6.1. Model of simple linear regression
      3.6.2. Model of general (multiple) linear regression
      3.6.3. Model of quadratic regression
4. Time series as stochastic processes
   4.1. Definition of stochastic process
   4.2. Stationarity of the process
   4.3. Correlogram
      4.3.1. Interpretation of correlograms
   4.4. Trend
   4.5. Filtering the series
      4.5.1. Removal of local fluctuations (low pass filters)
      4.5.2. Removal of long-term fluctuations (high pass filters)
5. Useful stochastic processes for the analysis of time series
   5.1. White noise
   5.2. Random walk
   5.3. Moving average
   5.4. Autoregressive process
      5.4.1. First order autoregressive process (Markov process)
   5.5. ARMA model
   5.6. ARIMA model
6. Development of forecasting models
   6.1. Identification of the type of the sample's model
      6.1.1. Correlogram of the sample
      6.1.2. Interpretation of the correlogram
   6.2. Estimation of the parameters of the autoregressive model
      6.2.1. Least square estimate method - LSE
         AR(1) parameters
         AR(2) parameters
      6.2.2. Estimation of the order of the autoregressive model
      6.2.3. Maximum likelihood estimate method - MLE
   6.3. Estimation of the parameters of the moving average model
      6.3.1. Estimation of the order of the moving average model
      6.3.2. Estimation of the parameters of the moving average model
   6.4. Model verification
7. Forecasting
   7.1. Forecasting by extrapolation
   7.2. Exponential smoothing
      7.2.1. Choice of α
      7.2.2. Generalized exponential smoothing
   7.3. Kalman filter
      7.3.1. State space model
      7.3.2. Scalar example of Kalman filter
8. Non-linear models for estimation and forecasting
   8.1. Extensions of ARMA models
      8.1.1. Non-linear autoregressive model NLAR
      8.1.2. Threshold autoregressive model TAR
      8.1.3. Bilinear models
   8.2. Neural networks
      8.2.1. Neural network optimization
      8.2.2. Neural network for forecasting
   8.3. Chaos
      8.3.1. Characteristics of a chaotic system
      8.3.2. Fractal behavior and self-similarity
9. Poisson process and exponential distribution
   9.1. Derivation of the Poisson distribution
   9.2. Transition probabilities for Δt
   9.3. Standardization method
10. Markov decision process
   10.1. Average reward
   10.2. The relative values
   10.3. Transformation of the iterative algorithm on the values
References

Preface

These MGR-815 course notes explain many concepts and derivations that are important for understanding the course. Nevertheless, the purpose of these notes is not to cover all the material of the course, but to help in understanding the material included in the other references and the material available on the course's website. Chapter 1 provides an example of how to calculate the performance of a Voice over IP system using a simplified but effective model. Chapters 2-7 present the problems of estimation and forecasting and cover the majority of the material necessary for Project 1 and Test 1. Chapters 4-8 follow, in many aspects, the book [Chatfield, 2003] and partly the book [Dziong, 1997], where more details can be found. The purpose of Chapter 9 is to help in following the overhead slides associated with the book [Bose, 2001] and the selected chapters of [Dziong, 1997], which deal with Markov processes and queueing analysis. Finally, Chapter 10 offers some basic concepts and derivations which facilitate the understanding of the material associated with the Markov decision process presented in the selected chapters of [Dziong, 1997]. A more advanced study of this subject is available in the book [Tijms, 1994]. Throughout the notes, the parts marked with a star (*) are not obligatory and can be omitted without causing problems in understanding the other parts of the notes.

1. Example of the analytic model of a VoIP system

In this section, we will analyze the performance of a link in a network serving a certain number, M, of VoIP (Voice over IP) connections. The purpose of this analysis is to find a relationship between the maximum number of VoIP connections, M_max, which we can admit on the link, and the link capacity, L. At the same time, we require certain guarantees of the quality of service for these connections. This guarantee is expressed as a constraint, B_p^c, on the probability of packet loss, B_p, which should not be exceeded. The architecture of this system is illustrated in Figure 1.1.

Figure 1.1. The system's architecture: M connections admitted on a link of capacity L; packets are lost with probability B_p

Here are the assumptions used in the analysis:
1. VoIP packets have a fixed length (e.g. 200 bytes)
2. An active connection generates a packet every 20 ms
3. During an inactivity period, a connection does not generate packets
4. A connection is active with probability p_a (the numerical examples below use p_a = 1/3)
5. The link works synchronously with a 20 ms time interval
6. In each interval, the link can serve a maximum of L packets
7. The link has no buffer for the packets. Therefore, during a given interval, if the number of arriving packets, A, exceeds the capacity of the link, A > L, the link will reject A - L packets.

Attention! Unlike assumption 7, real links have a buffer for the packets which cannot be served during the given interval, and these packets can be served during the following intervals. In this case, the quality of service constraint can be

defined as a constraint on the probability that the packet delay exceeds a certain value (e.g. 150 ms). However, the non-buffered system can be used for an approximate analysis of the buffered system, because we can establish an approximate correspondence between the two quality of service constraints.

1.1. The basics and the packet loss rate

In general, in an interval j of 20 ms, there is a certain number of packets, A_j, which arrive to be served by the link. Let us consider this number as the result of the j-th trial (or experiment). For n trials, we can define the following concepts:

N_k(n) - number of trials with A_j = k packets
f_k(n) = N_k(n)/n - relative frequency of having a trial with k packets
p(k) = \lim_{n\to\infty} f_k(n) - probability of having a trial with k packets
\bar{A}_n = \frac{1}{n}\sum_{j=1}^{n} A_j = \sum_{k=0}^{M} k\, f_k(n) - sample average
E[A] = \sum_{k=0}^{M} k\, p(k) - mathematical expectation (average value)

Using these concepts, we can define the probability of packet loss as the ratio of the average number of lost packets to the average number of arriving packets per interval:

B_p = \lim_{n\to\infty} \frac{\frac{1}{n}\sum_{k=L+1}^{M} (k-L)\, N_k(n)}{\frac{1}{n}\sum_{k=0}^{M} k\, N_k(n)} = \frac{\sum_{k=L+1}^{M} (k-L)\, p(k)}{\sum_{k=0}^{M} k\, p(k)}   (1.1)

1.2. Binomial distribution

Consider a sequence of three independent coin-toss trials, where each trial gives heads (H) with probability p and tails (T) with probability 1 - p. Here are the probabilities of all possible combinations:

P[{HHH}] = p^3
P[{HHT}] = P[{HTH}] = P[{THH}] = p^2 (1-p)
P[{HTT}] = P[{THT}] = P[{TTH}] = p (1-p)^2
P[{TTT}] = (1-p)^3

By grouping the sequences with the same number of heads and tails, it is possible to define the probability of having k heads in the three trials:

P(k) = \binom{3}{k} p^k (1-p)^{3-k}

By analyzing the structure of these formulas, we can find the generalization for any number of trials, n, in the sequence:

P(k) = \binom{n}{k} p^k (1-p)^{n-k}   (1.2)

where

\binom{n}{k} = \frac{n!}{k!\,(n-k)!}

expresses the number of combinations for the selection of k positions among n positions. The formula (1.2) defines the binomial distribution, which can also be calculated using the recurrence

P(k+1) = \frac{(n-k)\, p}{(k+1)(1-p)}\, P(k), \quad P(0) = (1-p)^n

1.3. Control of connections admission

Using (1.2) with p = p_a and n = M in (1.1), we reach

B_p(M, L) = \frac{\sum_{k=L+1}^{M} (k-L) \binom{M}{k} p_a^k (1-p_a)^{M-k}}{M\, p_a}   (1.3)

where the notation B_p(M, L) for the probability of packet loss stresses that it depends on the number of connections and on the capacity of the link, and where the denominator uses the fact that the average number of arriving packets is E[A] = M p_a. The formula (1.3) can be used to find the maximum number of VoIP connections, M_max, which we may admit on the link with the capacity L:

M_max = max { M : B_p(M, L) ≤ B_p^c }   (1.4)

Figure 1.2 shows an example of B_p(M, L) as a function of M for L = 20 and p_a = 1/3. Using this figure, we can find M_max for the required value of the constraint B_p^c. For example, for B_p^c = 5% we can accept a maximum of M_max = 56 connections.

Figure 1.2. Probability of packet loss B_p(M, L) as a function of M for L = 20

1.4. Sizing the link

Formula (1.3) may also be used for finding the minimum capacity, L_min, for a given number of connections, M:

L_min = min { L : B_p(M, L) ≤ B_p^c }   (1.5)

Figure 1.3 provides an example of B_p(M, L) as a function of L for M = 48 and p_a = 1/3. Using this figure, we can find L_min for the required value of the constraint B_p^c. For example, for B_p^c = 5% we need a minimum of L_min = 18 VoIP packets per interval.

Figure 1.3. Probability of packet loss B_p(M, L) as a function of L for M = 48
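The calculations of this chapter are easy to automate. Below is a minimal Python sketch (not part of the original notes) that evaluates the reconstructed formulas (1.3)-(1.5); the function names packet_loss, max_connections and min_capacity are illustrative.

```python
from math import comb

def packet_loss(M, L, p_a=1/3):
    """B_p(M, L): expected lost packets / expected arriving packets, eq. (1.3)."""
    lost = sum((k - L) * comb(M, k) * p_a**k * (1 - p_a)**(M - k)
               for k in range(L + 1, M + 1))
    return lost / (M * p_a)              # E[A] = M * p_a for the binomial

def max_connections(L, B_c, p_a=1/3):
    """Largest M with B_p(M, L) <= B_c (admission control, eq. (1.4))."""
    M = L                                # loss is zero while M <= L
    while packet_loss(M + 1, L, p_a) <= B_c:
        M += 1
    return M

def min_capacity(M, B_c, p_a=1/3):
    """Smallest L with B_p(M, L) <= B_c (link sizing, eq. (1.5))."""
    L = 1
    while packet_loss(M, L, p_a) > B_c:
        L += 1
    return L

print(max_connections(L=20, B_c=0.05))   # 56, matching Figure 1.2
print(min_capacity(M=48, B_c=0.05))      # 18, matching Figure 1.3
```

Running the sketch reproduces the two numerical examples above (M_max = 56 and L_min = 18 for B_p^c = 5%).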

2. Introduction to time series and forecasting

A time series may be defined by a variable X(t_i) that is a function of discrete time t_i: i = 1, 2, 3, ... Assume that we are at the moment of time t_n and that we know all the values of the series for i ≤ n. These values form the history of the series, and they are also called the samples of the series. One of the basic problems of time series analysis is the estimation of one (or several) future value(s) of X(t_i) for i > n, based on the full or partial history of the samples. This estimation is also called a forecast or a prediction. A simplified notation is often used for the index of discrete time: X_t: t = 1, 2, 3, ...

2.1. Decomposition of a time series

In general, a time series can be decomposed into four components:

X_t = T_t + C_t + S_t + R_t

where
T_t - trend (can be ascending or descending)
C_t - cycle (periodic repetition)
S_t - seasonal variation (a special case of a cycle)
R_t - random component

A series may have only a subset of these components. Often we want to analyze only one component, and in this case we must remove (filter) the other components.

2.2. Types of forecasting

In practice, a forecast is always associated with a forecast error. Accordingly, there are two types of forecasting that differ in the treatment of this error:
1. Point forecast: In this case, the result of forecasting is a particular value that should be equal to the future value of the series we seek. This type of forecasting gives no information on the size of the possible error of the forecast.
2. Confidence interval forecast: In this case, the result of forecasting is an interval that covers the real future value of the series we seek with a

specified probability (e.g. 0.9, 0.95 or 0.99). This interval gives confidence in the forecast because it indicates the probability of having an error of the specified size.

2.3. Forecasting error

In general, the forecast of a time series sample is not ideal, and we may define the forecasting error

e_t = x_t - \hat{x}_t

where x_t is the true value of the sample and \hat{x}_t is its forecast. We are often interested only in the absolute deviation of the error:

|e_t| = |x_t - \hat{x}_t|

To analyze the quality of forecasting properly, we need to use the errors over several samples. In this case we can apply one of two metrics (a small numerical sketch is given at the end of this chapter):

1. MAD (Mean Absolute Deviation) - the average absolute deviation of the error:

MAD = \frac{1}{n} \sum_{t=1}^{n} |e_t|

2. MSE (Mean Squared Error) - the average squared error:

MSE = \frac{1}{n} \sum_{t=1}^{n} e_t^2

The mean squared error is more popular because it gives more weight to the largest errors, which are the more "dangerous" ones.

2.4. Forecasting techniques

Here are some techniques that can be used for forecasting in time series:
1. Using the last sample as the forecast
2. Exponential smoothing
3. Time series regression
4. Autoregression of the time series (ARIMA)
5. Box-Jenkins methodology
6. Models based on the state space: Kalman filter

The techniques outlined in bold are covered in this course.

2.5. Choosing a forecasting technique

The choice of a forecasting technique depends on the sought characteristics, which can be divided into the following categories:
1. Type of forecast (point, confidence interval)
2. Horizon of the forecast (longer => more difficult)
3. Component of the series (T, C, S, R)
4. Cost of forecasting
   a. Development of the forecasting model
   b. Required memory
   c. Complexity
5. Accuracy of forecasting
6. Availability of data
7. Ease of operation and understanding.
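As announced in Section 2.3, here is a small Python sketch of the two error metrics; the sample values are invented for illustration.

```python
def mad(actual, forecast):
    """Mean Absolute Deviation: average of |e_t|."""
    return sum(abs(x - f) for x, f in zip(actual, forecast)) / len(actual)

def mse(actual, forecast):
    """Mean Squared Error: average of e_t**2 (penalizes large errors more)."""
    return sum((x - f) ** 2 for x, f in zip(actual, forecast)) / len(actual)

x  = [10.0, 12.0, 11.0, 13.0]   # true values
xf = [11.0, 11.5, 12.0, 12.0]   # forecasts
print(mad(x, xf), mse(x, xf))   # 0.875 and 0.8125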

3. Basic Concepts

Suppose that {y_1, y_2, ..., y_n} is a randomly selected sample of a variable Y. For this sample we can define:

1. The sample average: \bar{y} = \frac{1}{n} \sum_{i=1}^{n} y_i
2. The sample variance: s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (y_i - \bar{y})^2
3. The sample standard deviation: s = \sqrt{s^2}

Question: Why do we divide by n - 1, and not by n, in the formula for the variance?

3.1. Probability

Probability is a measure of the likelihood, or chance, that a variable is equal to a given value. It can be defined as

P(A) = \lim_{n\to\infty} \frac{n_A}{n}

where A is a particular value of the variable, n_A is the number of occurrences of the value A in the sample, and n is the total number of values in the sample.

3.2. Probability distribution

Here we must distinguish between continuous and discrete variables. For discrete variables we can define the probability function

p(x) = P(X = x)

and the cumulative distribution function

F(x) = P(X ≤ x)

The cumulative distribution function is also valid for continuous variables. On the other hand, the probability function is not defined for continuous variables, because in this case it is always equal to zero due to the infinite number of possible values. Nevertheless, we can define an equivalent function, called the probability density function, which is determined by

f(x) = \frac{dF(x)}{dx}

Therefore, the cumulative distribution function can be presented as

F(x) = \sum_{x_i \le x} p(x_i)

for discrete variables, and as

F(x) = \int_{-\infty}^{x} f(u)\, du

for continuous variables.

3.3. Normal distribution and Student's t law

The normal (Gaussian) distribution is very important for many applications and especially for statistics. The normal probability density function is defined by

f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left( -\frac{(x-\mu)^2}{2\sigma^2} \right)

where µ is the average and σ is the standard deviation of the distribution. The normal distribution is sometimes denoted N(µ, σ). If we normalize the variable x,

z = \frac{x - \mu}{\sigma}

we reach the standard normal distribution, with average equal to zero and variance equal to one. The normal distribution is symmetrical, as shown in Figure 3.1. For the analysis of the confidence interval, shown later, it is useful to present the probabilities with which the variable falls within the following intervals:

P(\mu - \sigma < X < \mu + \sigma) \approx 0.68
P(\mu - 2\sigma < X < \mu + 2\sigma) \approx 0.95   (3.3)
P(\mu - 3\sigma < X < \mu + 3\sigma) \approx 0.997

Student's t distribution is related to the normal distribution and is important for the calculation of the confidence interval discussed later. It is a symmetrical distribution with µ = 0 and σ > 1, which is characterized by a number of degrees of freedom, df. For large df, the t distribution approaches the normal distribution:

t(df) \to N(0, 1) \text{ as } df \to \infty

The density of the t law with df = 4 is compared with that of the standard normal law in Figure 3.1.

Figure 3.1. Density of the standard normal law (taller) and of the t law with df = 4 (wider)

3.4. Central limit theorem

Suppose that Y = X_1 + X_2 + ... + X_n is the sum of n iid random variables (independent with identical distribution) with average µ and finite variance σ². Let us consider the standardized random variable, with average zero and unit variance,

Z_n = \frac{Y - n\mu}{\sigma\sqrt{n}}

We can prove that

\lim_{n\to\infty} P(Z_n \le z) = \Phi(z)

where Φ is the standard normal cumulative distribution function. This means that the distribution of the sum of a large number of iid variables approaches the normal distribution. The importance of the central limit theorem

comes from the fact that these variables can have any distribution. The only requirement is that the average and the variance are finite.

3.5. Confidence interval

While estimating or forecasting a value of a series or a parameter W, we can use the notion of the confidence interval to indicate the precision of the estimation or forecast. This interval covers the true value of the parameter with probability 1 - α, where α is a parameter of the estimation or forecasting. Often this interval is presented as a deviation of the estimator (or of the forecast) from the point \hat{W}:

W \in [\hat{W} - \Delta, \hat{W} + \Delta]

3.5.1. Example of calculation of the confidence interval

Let us consider a sample of n independent values {y_1, ..., y_n} of a random variable Y having a distribution close to the normal distribution. The goal is the estimation of the confidence interval for the average µ of Y. Here are the steps:

1. Calculate the average \bar{y} and the standard deviation s of the sample {y_1, ..., y_n}. Let us note that \bar{y} is actually our estimator of µ.

2. Now, let us consider \bar{y} as a random variable. We can easily find the relations between the averages and the variances of \bar{y} and Y:

E[\bar{y}] = \mu, \quad Var(\bar{y}) = \frac{\sigma^2}{n}

which give the average and the standard deviation of the estimator \bar{y}:

\mu_{\bar{y}} = \mu, \quad \sigma_{\bar{y}} = \frac{\sigma}{\sqrt{n}} \text{ (estimated by } s/\sqrt{n})

3. Because we can prove that \bar{y} follows Student's t law with df = n - 1 degrees of freedom, the confidence interval for the average µ is given by

\mu \in \bar{y} \pm t_{df,\alpha/2} \frac{s}{\sqrt{n}}

where t_{df,α/2} is the value which gives

P(-t_{df,\alpha/2} < T < t_{df,\alpha/2}) = 1 - \alpha

and where T is a random variable that follows Student's t law. We can find the values t_{df,α/2} in mathematical tables. In practice, we can also use the normal distribution when n - 1 > 30 since, as mentioned before, the t law approaches the normal law for large df.

3.5.2. Application of the confidence interval for simulation results

Let us consider the estimation of the average µ of a random variable X while using a simulation. For example, X may be the number of connections (or packets) in the system. To obtain the statistics, the measured values x_j of X are recorded during the simulation from time to time. Note that, in order to avoid the influence of the initial conditions, it is good to start the measurements after an initial period, when the process becomes stationary. To obtain the confidence interval for the estimation of the average µ, we must have a sample of average values {\bar{x}_1, ..., \bar{x}_n}, where the \bar{x}_i = \frac{1}{m}\sum_{j=1}^{m} x_{ij} are independent. We can obtain it in two ways:

1. Realize n independent simulations while using different initial values of the random number generators in each simulation. Calculate \bar{x}_i = \frac{1}{m}\sum_{j=1}^{m} x_{ij} for each simulation.

2. Realize one simulation that we divide into n periods (of course, after the initial period). In each period, calculate \bar{x}_i = \frac{1}{m}\sum_{j=1}^{m} x_{ij}. The periods should be long enough for the \bar{x}_i to be independent.

Note that, regardless of the distribution of the x_j, the variables \bar{x}_i have a distribution close to the normal distribution for reasonably large m, based on the central limit theorem. So, once we have the sample of average values {\bar{x}_1, ..., \bar{x}_n}, the confidence interval is given by

\mu \in \bar{x} \pm t_{n-1,\alpha/2} \frac{s}{\sqrt{n}}

where

\bar{x} = \frac{1}{n}\sum_{i=1}^{n} \bar{x}_i, \quad s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (\bar{x}_i - \bar{x})^2

The confidence parameter α is chosen according to the need. Typically, we use α = 0.1, 0.05, 0.01.
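A minimal Python sketch of the second (batch-means) procedure follows; the warm-up length, the number of batches, and the exponential stand-in data are illustrative assumptions, and the t value is taken from tables for df = 19.

```python
import random
from statistics import mean, stdev

random.seed(1)
# Stand-in for recorded simulation measurements (true mean is 10 here).
measurements = [random.expovariate(0.1) for _ in range(10_500)]

warmup, n_batches = 500, 20                   # assumed warm-up and batch count
batch = (len(measurements) - warmup) // n_batches
x_bars = [mean(measurements[warmup + i * batch : warmup + (i + 1) * batch])
          for i in range(n_batches)]          # one average per period

xbar, s = mean(x_bars), stdev(x_bars)         # mean and std of the batch means
t = 2.093                                     # t_{df=19, alpha/2} for alpha = 0.05
half = t * s / n_batches ** 0.5
print(f"mu in [{xbar - half:.2f}, {xbar + half:.2f}] with 95% confidence")
```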

3.6. Regression

Regression deals with the description of the nature of the relation between two or more variables. In particular, it is concerned with the problem of estimating the value of a dependent variable on the basis of one or more independent variables. Below is a description of some models of linear regression.

3.6.1. Model of simple linear regression

The model of simple linear regression is defined as

y = \beta_0 + \beta_1 x + \varepsilon, \quad \mu_{y|x} = \beta_0 + \beta_1 x

where
y - dependent variable
x - independent variable
µ_{y|x} - average value of the dependent variable for a given x
β_i - regression coefficients
ε - error term (disturbance, or noise).

Let us note that ε captures all the factors influencing the dependent variable y other than the regressor x. In statistical problems, we focus on the estimation of {β_i} using the least squares method, where we estimate {β_i} by finding the minimum of the sum of squared errors,

S(\{\beta_i\}) = \sum_{j} \left( y_j - R(\{\beta_i\}, \{x_i\}_j) \right)^2

where the error is defined as the difference between the value of an observation, y_j, and the value given by the regression model, R({β_i}, {x_i}_j), which in general is a function of the values of the regression coefficients {β_i} and of the values of the independent variables {x_i}_j. In the case of simple linear regression, the estimators of the regression coefficients which minimize the sum of squared errors are

\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}, \quad \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}

where {y_1, ..., y_n}, {x_1, ..., x_n} are the samples of the dependent and independent variables, and \bar{y}, \bar{x} are their averages, respectively. Based on this result, it is possible to make a point estimation with simple linear regression:

\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x

3.6.2. Model of general (multiple) linear regression

The model of general linear regression (with multiple independent variables) is defined by

y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + \varepsilon

It is worth mentioning that in the analysis of the different types of regression, we almost always use the following assumptions for ε:
1. It is a random variable with the normal distribution
2. Its average is µ = 0
3. Its variance σ² is constant (does not depend on the values of x_i)
4. Its values are statistically independent.

3.6.3. Model of quadratic regression

The model of quadratic regression is defined by

y = \beta_0 + \beta_1 x + \beta_2 x^2 + \varepsilon

Let us note that this regression is still linear, because y is a linear function of the regression coefficients β_i.
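A short Python sketch of the simple linear regression estimators above; the sample data are invented (roughly y = 2x plus noise).

```python
from statistics import mean

def simple_linear_regression(x, y):
    """Least-squares estimators beta_0 and beta_1 of Section 3.6.1."""
    xbar, ybar = mean(x), mean(y)
    beta1 = (sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
             / sum((xi - xbar) ** 2 for xi in x))
    beta0 = ybar - beta1 * xbar
    return beta0, beta1

x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 8.0, 9.8]
b0, b1 = simple_linear_regression(x, y)
print(b0, b1)     # close to 0 and 2
```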

4. Time series as stochastic processes

Time series can be modeled and analyzed as stochastic processes. In this chapter we will describe the concepts and types of stochastic processes that are used in analyzing time series.

4.1. Definition of stochastic process

A stochastic process is a set of random variables that are time-ordered. Two types of stochastic processes are defined:
1. With continuous time: X(t)
2. With discrete time: X_t: t = 0, 1, 2, 3, ...

Obviously, to model and analyze time series, we use stochastic processes with discrete time. The basic problem in the analysis of stochastic processes is the estimation of the process's characteristics based on a sample of realizations of the process. It must be emphasized that a major difficulty in the analysis of time series is the fact that in most cases we have only one realization (observation) of the process available. A stochastic process can be defined by the multi-dimensional probability distribution for any set of times. Unfortunately, this approach is not practical because of the large size of the problem. The alternative approach is based on the moments. The exact version of this approach is also difficult, because we would need all the moments. However, in practice we can often limit ourselves to the first two moments: average, variance, and autocovariance. Let us note that normal processes are exactly defined by the first two moments.

4.2. Stationarity of the process

A process can be strictly stationary, or it may be characterized by weak stationarity (of second order). Here are the definitions:
1. Strict stationarity (two definitions)
   a. The multidimensional probability distributions are identical for any time sequences of the same length.

   b. All the moments are the same for any time sequences of the same length.
2. Weak stationarity (of second order)
   a. The average does not depend on time: E[X_t] = µ for all t
   b. The autocovariance depends only on the shift k: γ(t, t+k) = γ(k)

4.3. Correlogram

The correlogram is the autocorrelation of a process presented as a function of the shift. It is very useful for determining the type of a time series. Let us start with the definition of the autocovariance of a sample with n elements,

c_k' = \frac{1}{n-k} \sum_{t=1}^{n-k} (x_t - \bar{x})(x_{t+k} - \bar{x})

and its biased version, which is more popular:

c_k = \frac{1}{n} \sum_{t=1}^{n-k} (x_t - \bar{x})(x_{t+k} - \bar{x})

Based on this function, we can define the function (also known as the coefficient) of the sample's autocorrelation as the autocovariance normalized by the variance:

r_k = \frac{c_k}{c_0}

4.3.1. Interpretation of correlograms

In this section we will highlight the important features of correlograms for the different types of time series.

a. Random series

In this case the correlogram is characterized by values close to zero:

r_k \approx 0 \text{ for } k \ne 0

The values r_k are not exactly equal to zero because normally the number of values x_t in the sample, n, is limited, which causes a statistical error. We can demonstrate that this error has an approximately normal distribution N(0, 1/\sqrt{n}). Based on this characteristic and on (3.3), we can say that, on average, 19 out of 20 of the r_k values will be within the confidence interval

-\frac{2}{\sqrt{n}} < r_k < \frac{2}{\sqrt{n}}

An example of the correlogram of a random series, with n = 100 observations, is shown in Figure 4.1, together with the 95% confidence interval.

Figure 4.1. Random series correlogram with the confidence interval (taken from Chris Chatfield, The Analysis of Time Series: An Introduction, Chapman & Hall, 2003)

b. Short-term correlation series

In the case of a short-term correlation series, the correlogram has a significant value for k = 1, and one or two following coefficients also have significant values, smaller and progressively decreasing. An example of a short-term correlation series and its correlogram is shown in Figure 4.2.

Figure 4.2. A short-term correlation series and its correlogram (taken from Chris Chatfield, The Analysis of Time Series: An Introduction, Chapman & Hall, 2003)
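The correlogram and its ±2/√n band are easy to compute. Here is a short Python sketch, using the biased autocovariance c_k of Section 4.3 on a purely random series of n = 100 observations (compare with Figure 4.1); the random seed is arbitrary.

```python
import random
from statistics import mean

def correlogram(x, max_lag):
    """Sample autocorrelations r_k = c_k / c_0 with the biased c_k."""
    n, xbar = len(x), mean(x)
    c0 = sum((v - xbar) ** 2 for v in x) / n
    r = []
    for k in range(1, max_lag + 1):
        ck = sum((x[t] - xbar) * (x[t + k] - xbar) for t in range(n - k)) / n
        r.append(ck / c0)
    return r

random.seed(2)
x = [random.gauss(0, 1) for _ in range(100)]   # purely random series
bound = 2 / len(x) ** 0.5                      # 95% band: +/- 0.2 here
for k, rk in enumerate(correlogram(x, 10), start=1):
    mark = "outside" if abs(rk) > bound else "inside"
    print(f"r_{k} = {rk:+.3f} ({mark} the band)")
```

On average, about 19 out of 20 lags fall inside the band, as stated above.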

c. Alternating series

For an alternating series, the correlogram also has a tendency to alternate between positive and negative values, as shown in Figure 4.3.

Figure 4.3. Alternating series and its correlogram (taken from Chris Chatfield, The Analysis of Time Series: An Introduction, Chapman & Hall, 2003)

d. Non-stationary series

If we have a non-stationary series, its correlogram decreases slowly and reaches zero only for large shifts. In this case the trend component is dominant, and to analyze the other characteristics it must be removed. An example of a non-stationary series and its correlogram is shown in Figure 4.4.

Figure 4.4. Non-stationary series and its correlogram (taken from Chris Chatfield, The Analysis of Time Series: An Introduction, Chapman & Hall, 2003)

e. Series with seasonal fluctuations

Figure 4.5 (a) shows the correlogram of monthly temperature observations. This correlogram is dominated by the annual cycle. On the other hand, after the seasonal adjustment obtained by subtracting the (long-term) monthly average from each observation, we obtain the correlogram shown in Figure 4.5 (b), where the 95% confidence interval for zero values is also given. This correlogram indicates that the adjusted series has a short-term correlation. Based on this characteristic we can say that if a month is hotter, one or two of the following months will tend to be hotter too.

Figure 4.5. Correlogram of monthly temperature observations before (a) and after (b) seasonal adjustment (taken from Chris Chatfield, The Analysis of Time Series: An Introduction, Chapman & Hall, 2003)

4.4. Trend

As indicated in Section 2.1, a series may have a trend component. Here are some types of trend with a mathematical formulation:
a. No trend: T_t = β_0
b. Linear trend: T_t = β_0 + β_1 t + ε_t
c. Quadratic trend: T_t = β_0 + β_1 t + β_2 t² + ε_t
d. p-th order polynomial trend: T_t = β_0 + β_1 t + ... + β_p t^p + ε_t

where the β_i are constant coefficients and ε_t is a random error with average equal to zero. The least squares estimates of the coefficients β_i can be obtained by regression. Let us note that if the trend has a local character, the coefficients depend on time: β_i(t).

4.5. Filtering the series

In general, a time series may have several components associated with different time scales. We already mentioned the trend T_t, the cycle C_t, and the seasonal variation S_t, which are normally associated with a long time scale. On the other hand, the random component R_t is rather associated with a short or medium time scale. In addition, we may have a component ε_t associated with a short time scale. This component can be interpreted as a measurement error, a disturbance, or a noise. Filtering is used to remove certain components of the series, associated with a particular time scale, in order to obtain a series with only the components of interest. Below, we discuss the different types of filtering.

4.5.1. Removal of local fluctuations (low pass filters)

To remove local fluctuations, associated with a short time scale, we can use several filters. Here are the filters most often used:

a. General linear filter

The general linear filter transforms one series into another using the linear operation

Sm(x_t) = \sum_{r=-q}^{s} a_r x_{t+r}

which sums up the observations around x_t, in the window [t-q, t+s], weighted by the weights a_r. We often use \sum_r a_r = 1.

b. Symmetric linear filter

This is a special version of the general linear filter where the weights are identical for all the observations in a symmetric interval:

Sm(x_t) = \frac{1}{2q+1} \sum_{r=-q}^{q} x_{t+r}

c. Exponential smoothing

In this case, we sum up all the previous observations with weights which decrease exponentially (geometrically):

Sm(x_t) = \alpha \sum_{j=0}^{\infty} (1-\alpha)^j x_{t-j}

where α ∈ (0, 1) defines the weight given to the previous observations. The removal of local fluctuations, associated with a short time scale, is also treated as low pass filtering, since the resulting series has only the components associated with a long time scale.

4.5.2. Removal of long-term fluctuations (high pass filters)

In case we want to keep only the components associated with a short time scale, we must remove the long-term fluctuations. Here are some methods to do so.

a. General linear filter

After estimating the trend Sm(x_t) using the general linear filter, we can subtract it from the series:

Res(x_t) = x_t - Sm(x_t) = \sum_{r} b_r x_{t+r}

where \sum_r b_r = 0, b_0 = 1 - a_0, and b_r = -a_r for r ≠ 0, assuming that \sum_r a_r = 1. The residue Res(x_t) contains only the local fluctuations.

b. Differentiation to remove the trend

First order differentiation is a simple and very useful filter to remove the trend:

\nabla x_t = x_t - x_{t-1}

If the differentiated series is not yet stationary, we can apply a second order differentiation to reach a stationary series.

c. Differentiation to remove a seasonal variation

To remove seasonal variations, we can also use differentiation, but between the observations separated by the variation cycle m:

\nabla_m x_t = x_t - x_{t-m}

For example, for annual variations with monthly observations, we use m = 12. It is worth mentioning that, in general, we can use filters in series. For example, to obtain a series with the components associated with a medium time scale, we can use two filters in series: a low pass filter (to remove the noise, error or disturbance) and a high pass filter (to remove the trend or seasonal variations).
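A short Python sketch of filters in series, applied to a toy trend-plus-noise series (the slope and noise level are assumptions of the example): a symmetric moving average (low pass) followed by first order differentiation (high pass).

```python
import random

def moving_average(x, q):
    """Symmetric linear filter with equal weights over 2q+1 points."""
    return [sum(x[t - q : t + q + 1]) / (2 * q + 1)
            for t in range(q, len(x) - q)]

def difference(x):
    """First order differentiation: removes a linear trend."""
    return [x[t] - x[t - 1] for t in range(1, len(x))]

random.seed(3)
series = [0.5 * t + random.gauss(0, 1) for t in range(50)]  # trend + noise
smoothed = moving_average(series, q=2)   # local fluctuations removed
detrended = difference(smoothed)         # long-term trend removed
print(detrended[:5])                     # values near the slope 0.5
```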

5. Useful stochastic processes for the analysis of time series

In this chapter we present several types of stochastic processes that are useful in time series analysis.

5.1. White noise

The purely random process, often called white noise, is a sequence of random variables {Z_t} which are iid, which means:
a. independent (mutually)
b. identically distributed.

In this case, the autocovariance and the autocorrelation are equal to zero:

\gamma(k) = 0, \quad \rho(k) = 0 \text{ for } k \ne 0

In addition, we assume an average equal to zero, µ_z = 0, and a finite variance, σ_z² < ∞.

5.2. Random walk

The random walk is built from a sequence of iid random variables {Z_t} characterized by µ_z > 0 and a finite variance σ_z² < ∞. The relation is defined by

X_t = X_{t-1} + Z_t

Normally it is assumed that X_0 = 0, which gives

X_t = \sum_{i=1}^{t} Z_i

Consequently, the average and the variance of the random walk are given by

E[X_t] = t \mu_z, \quad Var(X_t) = t \sigma_z^2

indicating that the random walk is a non-stationary process. Let us note that if we apply differentiation to the random walk,

\nabla X_t = X_t - X_{t-1} = Z_t

we obtain the stationary series {Z_t}. A good example of a random walk is the price of a stock on the exchange market.

5.3. Moving average

Suppose that {Z_t} are iid with µ_z = 0 and σ_z² < ∞. The moving average MA(q) of order q is defined by

X_t = \beta_0 Z_t + \beta_1 Z_{t-1} + \dots + \beta_q Z_{t-q}

where the {β_i} are constant coefficients, with β_0 normally normalized to 1. The moments of MA(q) are given by

E[X_t] = 0, \quad \gamma(k) = \begin{cases} \sigma_z^2 \sum_{i=0}^{q-k} \beta_i \beta_{i+k} & k = 0, 1, \dots, q \\ 0 & k > q \end{cases}

Let us note that for MA(1) with β_0 = 1 we obtain

\rho(1) = \frac{\beta_1}{1 + \beta_1^2}   (5.5)

The formula for the moments indicates that MA(q) is stationary, independently of the values of the coefficients {β_i}.

5.4. Autoregressive process

Suppose that {Z_t} are iid with µ_z = 0 and σ_z² < ∞. The autoregressive process AR(p) of order p is defined by

X_t = \alpha_1 X_{t-1} + \alpha_2 X_{t-2} + \dots + \alpha_p X_{t-p} + Z_t

where the {α_i} are constant coefficients which define the autocorrelation of the process.

5.4.1. First order autoregressive process (Markov process)

The first order autoregressive process is important because it belongs to the class of Markov processes, which have many applications in the field of telecommunications. In this case we have p = 1, and the first order autoregressive process is described by

X_t = \alpha X_{t-1} + Z_t

Using successive substitutions, we reach the other form of the process,

X_t = Z_t + \alpha Z_{t-1} + \alpha^2 Z_{t-2} + \dots   (5.8)

which corresponds to a moving average of infinite order. Equation (5.8) may also be presented as

(1 - \alpha B) X_t = Z_t   (5.9)

where B is the backward operator: B X_t = X_{t-1}. Based on (5.8) or (5.9), we can easily find that for AR(1) we have E[X_t] = 0. It is important to note that we must have |α| < 1 for the variance to be finite. In this case, the second moments are given by

Var(X_t) = \frac{\sigma_z^2}{1 - \alpha^2}, \quad \rho(k) = \alpha^k

Because the moments do not depend on t, this shows the second order stationarity for |α| < 1. In Figure 5.1, we present the correlograms for three values of α, including 0.8 and 0.3.

Figure 5.1. Correlograms of the first order autoregressive process for three values of α (taken from Chris Chatfield, The Analysis of Time Series: An Introduction, Chapman & Hall, 2003)

5.5. ARMA model

The mixed autoregressive and moving average model, ARMA of order (p, q), is a composition of the autoregressive process and the moving average:

X_t = \alpha_1 X_{t-1} + \dots + \alpha_p X_{t-p} + Z_t + \beta_1 Z_{t-1} + \dots + \beta_q Z_{t-q}

The importance of ARMA is that a stationary series can be described by fewer coefficients using ARMA than when using MA or AR alone.

5.6. ARIMA model

The mixed integrated autoregressive and moving average model, ARIMA of order (p, d, q), is a composition of differentiation, the autoregressive process, and the moving average. It is used for non-stationary series. First, we remove the non-stationarity by differentiation of order d,

Y_t = \nabla^d X_t

and subsequently we apply the ARMA(p, q) model to the differentiated series Y_t. For example, the random walk can be modeled by ARIMA of order (0, 1, 0).
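To connect the AR(1) theory of Section 5.4.1 with the correlograms of Figure 5.1, here is a Python sketch that simulates X_t = αX_{t-1} + Z_t and compares the sample autocorrelations with α^k; the sample size, warm-up length and seed are arbitrary.

```python
import random

random.seed(4)
alpha, n = 0.8, 5000
x = [0.0]
for _ in range(n):
    x.append(alpha * x[-1] + random.gauss(0, 1))   # X_t = alpha*X_{t-1} + Z_t
x = x[500:]                                        # drop the warm-up samples

xbar = sum(x) / len(x)
c0 = sum((v - xbar) ** 2 for v in x) / len(x)
for k in (1, 2, 3):
    ck = sum((x[t] - xbar) * (x[t + k] - xbar)
             for t in range(len(x) - k)) / len(x)
    print(f"r_{k} = {ck / c0:+.3f}   vs   alpha**{k} = {alpha ** k:.3f}")
```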

6. Development of forecasting models

In this chapter we present a methodology for the development of forecasting models, illustrated by several examples. It can be broken down into four steps:

1) Identification of the type of model
   a) Application of the correlogram
2) Estimation of the model's parameters
   a) Least Square Estimate - LSE, or
   b) Maximum Likelihood Estimate - MLE
3) Model verification
   a) Analysis of the residue
4) Verification of the alternatives (if the model verification is not positive)
   a) Return to step 1

In the following sections, we illustrate this methodology while considering only the types of models belonging to the ARIMA family. Nevertheless, the methodology is also valid for other types of models.

6.1. Identification of the type of the sample's model

The correlogram is very useful for identifying the type of model. Therefore, it must first be estimated, based on the series' sample.

6.1.1. Correlogram of the sample

Suppose that n is the number of observations in the time series' sample. As already indicated in Section 4.3, we can consider two formulas for the estimation of the sample's autocovariance: the unbiased estimation

c_k' = \frac{1}{n-k} \sum_{t=1}^{n-k} (x_t - \bar{x})(x_{t+k} - \bar{x})   (6.1)

and the biased version

c_k = \frac{1}{n} \sum_{t=1}^{n-k} (x_t - \bar{x})(x_{t+k} - \bar{x})   (6.2)

where the bias has the order of 1/n. Nevertheless, the estimator (6.2) is asymptotically unbiased, because \lim_{n\to\infty} E(c_k) = \gamma(k). In addition, its finite Fourier transform is non-negative, which also gives a certain advantage in estimating the spectrum of

the series. On the other hand, even if the estimator (6.1) is unbiased, we can show that its variance Var(c_k') is higher than that of (6.2), so that (6.1) corresponds to the higher mean squared error of estimation. For these reasons, in practice we rather use the estimator (6.2). Once the autocovariance is estimated, the correlogram is defined by the values of the autocorrelation of the sample, given by

r_k = \frac{c_k}{c_0}   (6.3)

For most series, the autocorrelation ρ(k) reaches zero from a certain shift k'. The question is how we can identify k' from the sample if the estimator r_k has an error. To answer this question, we first can show that if {x_t} are observations of an iid variable with an arbitrary average, r_k has an asymptotically normal distribution with

E(r_k) \approx -\frac{1}{n}, \quad Var(r_k) \approx \frac{1}{n}

From these characteristics we can define the 95% confidence interval for the values r_k which correspond to ρ(k) = 0:

-\frac{1}{n} - \frac{2}{\sqrt{n}} < r_k < -\frac{1}{n} + \frac{2}{\sqrt{n}}

which is near ±2/\sqrt{n}, because normally 1/n ≪ 2/\sqrt{n}. An illustration of the use of the confidence interval to identify ρ(k) = 0 from the sample is given in Section 4.3.1 and in Figure 4.1. Note that the bias of the estimator (6.2) can be reduced using the jackknife technique, where the final estimator is given by

\tilde{c}_k = 2 c_k - \frac{c_k^{(1)} + c_k^{(2)}}{2}   (6.5)

where c_k, c_k^{(1)} and c_k^{(2)} are the estimations (6.2) based on all the observations, the first half of the observations, and the second half of the observations, respectively. We can show that (6.5) reduces the bias from the order of 1/n to 1/n². Similarly, we can reduce the bias of the estimator (6.3) using

\tilde{r}_k = 2 r_k - \frac{r_k^{(1)} + r_k^{(2)}}{2}   (6.6)

where r_k, r_k^{(1)} and r_k^{(2)} are the estimations (6.3) based on all the observations, the first half of the observations, and the second half of the observations, respectively.

6.1.2. Interpretation of the correlogram

The correlogram gives the possibility of a preliminary determination of the type of series. In this section we give certain rules for the series that can be modeled by a model of the ARIMA family. We assume that we are interested in the characteristics of the short-term fluctuations.

a) If the correlogram decreases slowly and reaches zero only for very large shifts, as shown in Figure 4.4, we assume that the series is not stationary. In this case, we must differentiate the series with the degree d which gives stationarity.

b) If the correlogram drops to zero immediately after a certain shift k, we assume that the series corresponds to a moving average of order MA(q = k).

c) If the correlogram is a mixture of damped exponentials and damped sinusoids, and it decreases (or alternates) slowly, we assume that the series corresponds to an autoregressive process. Unfortunately, it is difficult to find the order p of the AR model from the correlogram.

d) If the correlogram alternates and does not fall to zero, we can also assume that the series belongs to the ARMA category.

6.2. Estimation of the parameters of the autoregressive model

Assume that the analysis of a series' correlogram suggests the autoregressive model with average µ:

X_t - \mu = \alpha_1 (X_{t-1} - \mu) + \dots + \alpha_p (X_{t-p} - \mu) + Z_t   (6.7)

The objective is to estimate the values of the parameters µ, p and {α_i}. To facilitate the estimation, it is suitable to decompose the problem into two parts:
a) estimation of µ and {α_i} while assuming that the order p is known
b) estimation of the order p.

First, we will deal with the estimation of µ and {α_i}, while the estimation of the order p will be treated in Section 6.2.2.

Generally, to estimate µ and {α_i} we can use one of two methods:
a) the Least Square Estimate method - LSE
b) the Maximum Likelihood Estimate method - MLE

The two methods provide very similar or identical results. First, we focus on LSE, while MLE is discussed in Section 6.2.3.

6.2.1. Least Square Estimate method - LSE

Let us consider a generic model of the time series,

x_t = M(t, \theta) + \varepsilon_t

where θ represents the set of model parameters and ε_t is a random factor. In the least square method, we estimate θ by finding the minimum of the sum of squared errors,

S(\theta) = \sum_t \left( x_t - M(t, \theta) \right)^2   (6.9)

where the error is defined as the difference between the value of the observation x_t and the value given by the model M(t, θ), which may also be a function of the previous observations. Note that this technique is the same as the one used for the regression problem described in Section 3.6. Now, by applying the least square method (6.9) to estimate the parameters µ and {α_i}, while assuming that the order p is known in our autoregressive model (6.7), we reach

S(\mu, \{\alpha_i\}) = \sum_{t=p+1}^{n} \left( x_t - \mu - \sum_{i=1}^{p} \alpha_i (x_{t-i} - \mu) \right)^2   (6.10)

As already mentioned, this minimization is analogous to the least square method used in regression. An important difference is that, in regression, the variables x_i are independent, which is not the case for the x_t in the autoregressive model. Nevertheless, we can demonstrate that the basic results concerning the optimality of the solution for regression are also valid for the autoregressive model. In particular, the least square estimator has the minimal variance among the unbiased linear estimators.

AR(1) parameters

According to (6.7), the AR(1) model can be defined as

X_t - \mu = \alpha (X_{t-1} - \mu) + Z_t   (6.11)

For this model, the least square method (6.10) gives

S(\mu, \alpha) = \sum_{t=2}^{n} \left( x_t - \mu - \alpha (x_{t-1} - \mu) \right)^2

In this case we can analytically find the solution in the form

\hat{\mu} = \frac{\bar{x}_{(2)} - \hat{\alpha}\, \bar{x}_{(1)}}{1 - \hat{\alpha}}, \quad \hat{\alpha} = \frac{\sum_{t=2}^{n} (x_t - \bar{x}_{(2)})(x_{t-1} - \bar{x}_{(1)})}{\sum_{t=2}^{n} (x_{t-1} - \bar{x}_{(1)})^2}

where \bar{x}_{(1)}, \bar{x}_{(2)} are the means of the first and last n - 1 observations in the sample, respectively. Because we can assume that \bar{x}_{(1)} \approx \bar{x}_{(2)} \approx \bar{x}, we reach the solution in the form

\hat{\mu} = \bar{x}, \quad \hat{\alpha} = \frac{\sum_{t=2}^{n} (x_t - \bar{x})(x_{t-1} - \bar{x})}{\sum_{t=2}^{n} (x_{t-1} - \bar{x})^2} \approx r_1

Note that this solution is identical to the solution of AR(1) using ordinary regression. In this case, the element (X_{t-1} - µ) in the AR(1) equation (6.11) is treated as the independent variable. Note also that the coefficient \hat{\alpha}_1 is approximately equal to the autocorrelation coefficient r_1. A quick numerical check of the AR(1) solution is given below.

AR(2) parameters

According to (6.7), the AR(2) model can be defined as

X_t - \mu = \alpha_1 (X_{t-1} - \mu) + \alpha_2 (X_{t-2} - \mu) + Z_t

For this model, the least square method (6.10) gives

S(\mu, \alpha_1, \alpha_2) = \sum_{t=3}^{n} \left( x_t - \mu - \alpha_1 (x_{t-1} - \mu) - \alpha_2 (x_{t-2} - \mu) \right)^2

In this case we can analytically find a solution in the approximate form

\hat{\mu} = \bar{x}, \quad \hat{\alpha}_1 = \frac{r_1 (1 - r_2)}{1 - r_1^2}, \quad \hat{\alpha}_2 = \frac{r_2 - r_1^2}{1 - r_1^2}   (6.18)
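Here is the announced check: a Python sketch of the approximate AR(1) solution (µ̂ ≈ x̄, α̂ ≈ r_1), tested on simulated data; the true values µ = 10 and α = 0.6 are assumptions of the example.

```python
import random

random.seed(5)
alpha_true, mu_true = 0.6, 10.0
x = [mu_true]
for _ in range(5000):
    x.append(mu_true + alpha_true * (x[-1] - mu_true) + random.gauss(0, 1))

xbar = sum(x) / len(x)                                   # mu_hat ~ x_bar
num = sum((x[t] - xbar) * (x[t - 1] - xbar) for t in range(1, len(x)))
den = sum((x[t - 1] - xbar) ** 2 for t in range(1, len(x)))
print(f"mu_hat = {xbar:.2f}, alpha_hat = {num / den:.3f}")  # ~10 and ~0.6
```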

Returning to the AR(2) solution (6.18), the coefficient \hat{\alpha}_2 is also treated as the estimator of the partial autocorrelation of order p = 2, π_2. In general, the partial autocorrelation of order p, π_p = α_p, defines the excess autocorrelation compared with the order p - 1. This notion applies to any order p and will be used in the next section to find the order p of the AR model.

6.2.2. Estimation of the order of the autoregressive model

The determination of the order of an AR series, based on the sample, is not easy. The correlogram provides no specific information, as it does in the case of the moving average (see Section 6.3). We can only say that:
- for AR(p = 1) the correlogram decreases exponentially
- for AR(p > 1) the correlogram is a mixture of damped exponential functions or damped sinusoid functions.

Nevertheless, we can find the order of an AR series by analyzing the behavior of certain characteristics of the series, presented as a function of the order. Here are two approaches based on this idea (a code sketch of the first approach closes this chapter):

1) Analysis of the partial autocorrelations π_j:
   a. Estimate the partial autocorrelations π_j = \hat{\alpha}_j of the sample for a large enough, increasing sequence of orders j.
   b. Identify the order k from which we can assume that π_j = 0. Here, we must use the concept of the confidence interval, presented in Section 4.3.1, to identify the area where the correlogram is equal to zero. For example, for the 95% confidence interval we have |π_j| < 2/\sqrt{n}.
   c. The order of the AR series is estimated as the order of the last partial autocorrelation which has a significant value, |π_j| > 2/\sqrt{n}. In other words, p = k - 1.

2) Analysis of the minimum of the sum of squared errors S_min(j):
   a. Calculate the minimum of the sum of squared errors S_min(j) for the sample (see Section 6.2.1, equation (6.10)) for a large enough, increasing sequence of orders j.

   b. Identify the order k from which we can assume that the sum of squared errors S_min(j) of the sample is constant (and, of course, minimal).
   c. The order of the AR series is estimated as the minimal order which gives the minimal sum of squared errors S_min(j) for the sample. In other words, p = k.

Note that both methods are based on the same principle and involve the same complexity. Nevertheless, the method based on the analysis of the partial autocorrelations has the advantage of using the confidence interval to determine the order. On the other hand, in the method based on the analysis of the minimum of the sum of squared errors, the choice of the order is less precise with regard to stochastic variations.

6.2.3. Maximum Likelihood Estimate method - MLE*

Suppose that X_t: t = 1, ..., n is a sample of independent observations with a probability density function f(x_t). In this case the multi-dimensional probability density function is defined as

f(x_1, \dots, x_n) = \prod_{t=1}^{n} f(x_t)   (6.22)

Let us assume that the probability density function is a function of certain parameters θ: f(x_t, θ). In this case we can define the likelihood function

L(\theta) = \prod_{t=1}^{n} f(x_t, \theta)   (6.23)

The Maximum Likelihood Estimation (MLE) of the parameters θ is obtained by maximizing L(θ) over θ. Note that in the case of time series, the observations are not independent, so the equations (6.22, 6.23) should be adjusted by using the probability density functions f(x_t) and f(x_t, θ) conditioned on the previous values X^{t-1} = {X_1, X_2, ..., X_{t-1}}. In fact, the Kalman filter, discussed in the following chapters, produces such a conditional distribution for a general class of time series models.

We should emphasize that in the analysis of time series, the method of maximum likelihood is always used while assuming that the series is Gaussian (i.e. Z_t has a normal distribution).

Estimation of AR(2) parameters

In this section, we present an example of the application of the maximum likelihood method for the estimation of the parameters of an AR(2) series. In this case, the conditional probability density function is given by

f(x_t \mid x_{t-1}, x_{t-2}) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\!\left( -\frac{\left( x_t - \mu - \alpha_1 (x_{t-1} - \mu) - \alpha_2 (x_{t-2} - \mu) \right)^2}{2\sigma^2} \right)

and the likelihood function is given by

L = \prod_{t=3}^{n} f(x_t \mid x_{t-1}, x_{t-2})   (6.27)

Note that the product starts with t = 3, but if the number of observations is large, the effect of this approximation is minimal. To resolve (6.27) we apply the logarithmic operation, which gives

\ln L = -(n-2)\ln(\sqrt{2\pi}\,\sigma) - \frac{1}{2\sigma^2} \sum_{t=3}^{n} \left( x_t - \mu - \alpha_1 (x_{t-1} - \mu) - \alpha_2 (x_{t-2} - \mu) \right)^2

This equation shows that maximizing (6.27) is equivalent to the minimization of

\sum_{t=3}^{n} \left( x_t - \mu - \alpha_1 (x_{t-1} - \mu) - \alpha_2 (x_{t-2} - \mu) \right)^2

So we obtain a solution identical to the least square solution (LSE) given by (6.18). Although this is not always the case, LSE generally gives a good approximation of MLE. If σ² is not known, it can be estimated using LSE.

6.3. Estimation of the parameters of the moving average model

Assume that the analysis of a series' correlogram suggests the moving average model, MA, with average µ:

X_t = \mu + \beta_0 Z_t + \beta_1 Z_{t-1} + \dots + \beta_q Z_{t-q}

The objective is to estimate the values of the parameters µ, q, and {β_i}. To facilitate the estimation, it is suitable to decompose the problem into two parts:
a. estimation of the order q
b. estimation of the parameters µ and {β_i}

Contrary to the autoregressive model, the order estimation for the moving average model is easy. On the other hand, the estimation of the parameters µ and {β_i} is more difficult.

6.3.1. Estimation of the order of the moving average model

The order of an MA series, based on the sample, can easily be estimated by a correlogram analysis, because the theoretical autocorrelation is cut to zero for the shifts larger than q. In practice it is necessary to apply the following steps:
a. Calculate the sample's correlogram.
b. Identify the order k from which we can assume that r_j = 0. Here we must use the concept of the confidence interval, presented in Section 4.3.1, to identify the area where the correlogram is equal to zero. For example, for the 95% confidence interval we have |r_j| < 2/\sqrt{n}.
c. The order of the MA series is estimated as the order of the last autocorrelation coefficient which has a significant value, |r_j| > 2/\sqrt{n}. In other words, q = k - 1.

6.3.2. Estimation of the parameters of the moving average model

In general, the estimation of the parameters of the moving average model is difficult because there are no explicit estimators. To solve the problem we can use a numerical iteration. In the next section we describe such an approach for MA(1).

Estimation of MA(1) parameters

Let us consider the moving average model of order q = 1 with β_0 normalized to 1:

X_t = \mu + Z_t + \beta_1 Z_{t-1}

In fact, we could be tempted to use the theoretical equation (5.5), which defines the autocorrelation, to calculate \hat{\beta}_1 from r_1 under the condition |\hat{\beta}_1| < 1:

r_1 = \frac{\hat{\beta}_1}{1 + \hat{\beta}_1^2} \quad\Rightarrow\quad \hat{\beta}_1 = \frac{1 - \sqrt{1 - 4 r_1^2}}{2 r_1}   (6.31)

Unfortunately, we can show that such an estimator is inefficient. So we are forced to use the following numerical procedure (sketched in code below):
a. Start with \hat{\mu} = \bar{x} and \hat{\beta}_1 from (6.31).
b. Calculate the sum of squared residues, \sum_{t=1}^{n} z_t^2, using the recurrence

z_t = x_t - \hat{\mu} - \hat{\beta}_1 z_{t-1}

starting with z_0 = 0.
c. Minimize \sum_{t=1}^{n} z_t^2 through the inspection of values close to \hat{\mu} and \hat{\beta}_1.
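Here is the announced Python sketch of the MA(1) procedure; the grid resolution of the inspection in step c and the simulated data (µ = 5, β_1 = 0.5) are illustrative assumptions.

```python
import random

def sum_sq_residuals(x, mu, b1):
    """Sum of squared residuals z_t = x_t - mu - b1*z_{t-1}, z_0 = 0 (step b)."""
    z, total = 0.0, 0.0
    for xt in x:
        z = xt - mu - b1 * z
        total += z * z
    return total

def fit_ma1(x, mu0, b10, span=0.2, steps=41):
    """Inspect a (mu, b1) grid around the starting point (step c)."""
    best = None
    for i in range(steps):
        for j in range(steps):
            mu = mu0 - span + 2 * span * i / (steps - 1)
            b1 = b10 - span + 2 * span * j / (steps - 1)
            s = sum_sq_residuals(x, mu, b1)
            if best is None or s < best[0]:
                best = (s, mu, b1)
    return best[1], best[2]

# Simulated MA(1) data with mu = 5 and beta_1 = 0.5.
random.seed(6)
z_prev, x = 0.0, []
for _ in range(2000):
    z_new = random.gauss(0, 1)
    x.append(5.0 + z_new + 0.5 * z_prev)
    z_prev = z_new

# Step a: starting values mu = x_bar and beta_1 from equation (6.31).
xbar = sum(x) / len(x)
c0 = sum((v - xbar) ** 2 for v in x)
c1 = sum((x[t] - xbar) * (x[t + 1] - xbar) for t in range(len(x) - 1))
r1 = c1 / c0
b1_start = (1 - (1 - 4 * r1 * r1) ** 0.5) / (2 * r1)
print(fit_ma1(x, mu0=xbar, b10=b1_start))   # close to (5.0, 0.5)
```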

6.4. Model Verification

Model verification may be done through an analysis of the residue. The residue is defined as the difference between the observation value x_t and the estimated value \hat{x}_t of this observation, which is calculated using the model and the previous observations. Therefore the residue is defined as

z_t = x_t - \hat{x}_t

For example, for AR(1) the residue is given by

z_t = x_t - \hat{\mu} - \hat{\alpha} (x_{t-1} - \hat{\mu})

If the model is exact, the residues should be random and independent. To verify these characteristics we can analyze the residues' correlogram. Also, we can assume that the distribution of the autocorrelation coefficients is normal, which gives the means to calculate the confidence interval to verify whether the correlogram indicates a dependency. For the 95% confidence we can say that the values which are outside the interval ±2/\sqrt{n} are significantly different from zero. If the analysis of the residue gives an indication of dependency, the model should be improved until a random and independent residue is reached.
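To close the chapter, here is the order-selection sketch announced in Section 6.2.2. The partial autocorrelations π_j are obtained from the autocorrelations by the Durbin-Levinson recursion (a standard method; the notes do not specify how π_j is computed), and the test uses theoretical AR(1) autocorrelations r_k = 0.6^k, for which only π_1 should be significant.

```python
def pacf(r, max_order):
    """Partial autocorrelations pi_1..pi_max_order from autocorrelations
    r[0..max_order] (r[0] = 1), via the Durbin-Levinson recursion."""
    phi_prev, pis = [], []
    for k in range(1, max_order + 1):
        if k == 1:
            phi_kk = r[1]
            phi = [phi_kk]
        else:
            num = r[k] - sum(phi_prev[j] * r[k - 1 - j] for j in range(k - 1))
            den = 1 - sum(phi_prev[j] * r[j + 1] for j in range(k - 1))
            phi_kk = num / den
            phi = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j]
                   for j in range(k - 1)]
            phi.append(phi_kk)
        pis.append(phi_kk)
        phi_prev = phi
    return pis

r = [0.6 ** k for k in range(6)]            # theoretical AR(1) autocorrelations
print([round(p, 3) for p in pacf(r, 5)])    # [0.6, 0.0, 0.0, 0.0, 0.0] -> p = 1
```

In practice the r_k would be the sample autocorrelations of Section 6.1.1, and the cutoff would be tested against the ±2/√n band.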

7. Forecasting

This chapter presents three approaches to forecasting: forecasting by extrapolation, exponential smoothing, and the Kalman filter. Forecasting by extrapolation uses a time series model that can be developed using the methodology presented in the previous chapter, but it is limited to stationary series. Exponential smoothing is a much simpler iterative method, because it does not require the creation of a model, and it can adapt to slow changes in the series' parameters; but it is an intuitive method that gives no confidence interval on the forecast. The Kalman filter is also an iterative and adaptive method, but it is based on a theory which also gives the confidence interval.

7.1. Forecasting by extrapolation

Consider a time series X_t at the moment T. Let us assume that at this time we have all the observations x_t of the series for t ≤ T. Let \hat{x}_T(k) be the forecast of the series for the time T + k, where k is the horizon of the forecast. This forecast may be presented as a conditional expectation:

\hat{x}_T(k) = E[X_{T+k} \mid x_T, x_{T-1}, \dots]

If we have an ARMA model of the series, the forecast \hat{x}_T(k) may be obtained by the substitutions:
a) future values of Z by zeros
b) future values of X by their conditional expectations
c) old values of X and Z by the values of their observations.

For example, assume that the model of our series is AR(1) with zero mean, X_t = α X_{t-1} + Z_t, and that we need the forecast \hat{x}_T(3). In this case the prediction is calculated iteratively:

\hat{x}_T(1) = \alpha x_T
\hat{x}_T(2) = \alpha \hat{x}_T(1) = \alpha^2 x_T
\hat{x}_T(3) = \alpha \hat{x}_T(2) = \alpha^3 x_T

The main advantage of forecasting by extrapolation is that this approach is based on a good model. On the other hand, there are also some disadvantages:
a) complex calculation of the model's parameters
b) the need to memorize the series
c) the assumption of the series' stationarity.

A short code sketch of the AR(1) example follows.
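A minimal Python sketch of the AR(1) extrapolation above (zero-mean case); the values of x_T and α are illustrative.

```python
def ar1_forecast(x_T, alpha, horizon):
    """Iterative AR(1) forecasts: x_hat_T(k) = alpha * x_hat_T(k-1)."""
    forecasts, x = [], x_T
    for _ in range(horizon):
        x = alpha * x                 # future Z replaced by zero
        forecasts.append(x)
    return forecasts

print(ar1_forecast(x_T=2.0, alpha=0.8, horizon=3))  # [1.6, 1.28, 1.024]
```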

We can avoid these disadvantages by using the exponential smoothing described in this section.

7.2. Exponential Smoothing

In exponential smoothing we build a new series using a weight 0 < α < 1:

S_T = \alpha x_T + (1 - \alpha) S_{T-1}

This operation can also be presented as a sum,

S_T = \alpha \sum_{j=0}^{\infty} (1 - \alpha)^j x_{T-j}

which clearly explains the name of the method: the weight of the previous observations decreases exponentially (or, more precisely, geometrically) with the elapsed time j. Note that exponential smoothing has already been presented in Section 4.5 as a low pass filter to remove local fluctuations. However, exponential smoothing can also be used for forecasting. In particular,

\hat{x}_T(1) = S_T = \alpha x_T + (1 - \alpha) \hat{x}_{T-1}(1)

This forecast can also be presented as a function of the forecasting error e_T = x_T - \hat{x}_{T-1}(1):

\hat{x}_T(1) = \hat{x}_{T-1}(1) + \alpha e_T

Note that for the horizon k > 1 we always have the same value of the forecast: \hat{x}_T(k) = \hat{x}_T(1).

There are several advantages to forecasting using exponential smoothing:
a) It adapts well to slow changes in the series' parameters; thus the assumption of the series' stationarity is not required.
b) There is no need to memorize the series.
c) The calculation is very simple (see the sketch below).

On the other hand, exponential smoothing also has some disadvantages:
a) It is not based on a theory or a statistical model.
b) We can say that the method is intuitive.
c) It gives no confidence interval.
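A Python sketch of the update form above; the initialization with the first observation and the sample values are assumptions of the example.

```python
def exp_smoothing_forecasts(x, a):
    """One-step forecasts x_hat_T(1) updated as x_hat + a * e_T."""
    forecast, out = x[0], []           # initialization assumption
    for obs in x[1:]:
        e = obs - forecast             # forecasting error e_T
        forecast = forecast + a * e    # same as a*obs + (1 - a)*forecast
        out.append(forecast)
    return out                         # out[-1] serves every horizon k >= 1

series = [10.0, 10.4, 10.2, 10.8, 10.6, 11.0]
print(exp_smoothing_forecasts(series, a=0.3))
```

Note that only the last forecast has to be memorized, which is exactly the advantage b) above.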

It should be mentioned that we can avoid these disadvantages by using the Kalman filter, which is described at the end of this chapter.

7.2.1. Choice of α

The choice of α for exponential smoothing is not easy, and there is no unique approach. First, we can consider a theoretical method that can be applied under the assumptions that the series is infinite and that its autocorrelation has the form given in (7.6). In this case we can show that if α is selected according to the corresponding rule, the sum of squared errors is minimized. On the other hand, if the condition (7.6) is not met, to find the optimal value of α we have to minimize the sum of squared errors Q(α) using a numerical method. In many cases the optimal value of α is found in the interval 0.01 < α < 0.3. However, if the optimal value is close to 0 or 1 and the function Q(α) is flat around the optimal value, it is better to choose a less extreme value.

7.2.2. Generalized Exponential Smoothing

Consider a time series with a linear trend, defined by

x_t = a + bt

In this case the exponential smoothing method gives the forecast

\hat{x}_T(1) = S_T = \alpha \sum_{j=0}^{\infty} (1 - \alpha)^j x_{T-j}

which can be expressed as

follows. For large T we can show that

S_T \approx a + bT - \frac{(1 - \alpha)\, b}{\alpha}

Using this formulation, and comparing with the future value x_{T+1} = a + b(T+1), we reach

x_{T+1} - \hat{x}_T(1) \approx \frac{b}{\alpha}

This result shows that the forecast has a delay (sometimes termed a shift) of b/α. To remove or minimize this delay we can apply double exponential smoothing with two parameters, α and β:

M_t = \alpha x_t + (1 - \alpha)(M_{t-1} + D_{t-1})
D_t = \beta (M_t - M_{t-1}) + (1 - \beta) D_{t-1}

where M_t corresponds to the estimator of the series and D_t corresponds to the estimator of the shift (delay) in the time interval t - (t-1). Thus, the forecast is given by

\hat{x}_t(k) = M_t + k D_t

To find the optimal values of α and β, we must minimize the sum of squared errors Q(α, β) using a numerical method.

7.3. Kalman filter

In this section we introduce the Kalman filter, and the related notions, in a simplified and abbreviated way, using a particular application. For more details, see the book [Dziong, 1997] and the references included in that book. The Kalman filter provides the optimal estimator according to certain criteria. It is an iterative filter that is based on the state space model, S-S (state-space model). It is important that this filter can also give the confidence interval of the estimation.

7.3.1. State space model

In general, an observation Z_t at the time t can be presented as a sum of a certain signal and of measurement noise. In the state space model, the observation is a vector that is defined by the measurement equation

Z_t = H_t W_t + \mu_t   (7.10)

where
W_t - vector of the system's state, which we cannot observe directly
H_t - observation matrix, which gives the part of the state that we can observe
µ_t - error vector (or noise) of the observation (or measurement). This vector is a white sequence with average zero and covariance matrix Y_t.

In the state space model, we also define the system's equation

W_t = \Phi_t W_{t-1} + e_t   (7.11)

where
Φ_t - transition matrix of the system's state
e_t - vector of the system's deviation. This vector is a white sequence with average zero and covariance matrix Q_t.

Figure 7.1 shows the dynamics of a system which corresponds to the system's equation and the measurement equation.

Figure 7.1. Dynamics of a system with a measurement (in the figure: modèle de système = system model; modèle de mesure = measurement model)

7.3.2. Scalar example of Kalman filter

Consider a link that serves VoIP (voice over IP) connections. For the purpose of call admission, it is important to know the average rate of served packets which, in general, is a function of the number of served connections and of the types of connections. Figure 7.2 shows an example of the trajectory of the average rate of packets, which changes at each admission or departure of a call. We assume that in each period t with the same number of calls we make a measurement, M_t^m, of the average rate. Of course, the real average rate M_t during the period t is different from the

measurement M_t^m, due to the measurement error µ_t, which is caused by the fact that the arrival process of the packets is random. Thus, we can formulate the measurement equation analogous to (7.10),

M_t^m = M_t + \mu_t   (7.12)

where in our case H_t = 1. Note that, based on the measurement method and the length of the epoch t, in general we can also estimate the variance of the measurement error, v_t^m, which corresponds to the covariance matrix Y_t.

Figure 7.2. Trajectory of the system's state (in the figure: époque = epoch; temps = time; départ = departure; arrivée = arrival)

We can also assume that the type of each incoming or outgoing call is known. This gives the possibility of estimating the average rate m_t of the incoming (or outgoing) call at the beginning (or at the end) of the epoch t, based on its type and its historical statistics. Therefore, we can define the system equation analogous to (7.11) by

M_t = \Phi_t M_{t-1} + e_t   (7.13)

where \Phi_t = (M_{t-1} + m_t)/M_{t-1} is the transition matrix of the system state and e_t is the error of the estimator m_t, which can be calculated based on the type of call and its historical statistics. Note that, based on the same historical statistics, we can also estimate the variance of the estimator's error, q_t, which corresponds to the covariance matrix Q_t. Finally, (7.13) can also be transformed to

M_t = M_{t-1} + m_t + e_t   (7.14)

The dynamics of the system and measurement models are illustrated in the upper part of Figure 7.3, while the lower part of this figure illustrates the dynamics of the Kalman filter.

Figure 7.3. Dynamics of the Kalman filter (in the figure: gain de Kalman = Kalman gain; modèle de système = system model; modèle de mesure = measurement model; filtre de Kalman = Kalman filter)

The objective of the Kalman filter is to calculate the optimal estimator \hat{M}_t of the average rate M_t. As illustrated in Figure 7.3, using (7.14) and the estimator \hat{M}_{t-1} from the previous iteration, we can find an extrapolation of the average rate:

\hat{M}_t^e = \hat{M}_{t-1} + m_t   (7.15)

So we have two sources of information for the estimation of the average rate: the extrapolation of the average rate, \hat{M}_t^e, and the measurement of the average rate, M_t^m. Looking at Figure 7.3, we can easily find that the optimal estimator of the Kalman filter uses both values, with the weight determined by the Kalman gain K_t:

\hat{M}_t = \hat{M}_t^e + K_t (M_t^m - H_t \hat{M}_t^e)   (7.16)

Because in our case H_t = 1, we obtain

\hat{M}_t = (1 - K_t) \hat{M}_t^e + K_t M_t^m   (7.17)

where K_t is the Kalman gain, calculated as shown below.

The optimality of the Kalman filter comes from the fact that it is calculated using the information about the extrapolation and measurement errors contained in v_t^e and v_t^m, respectively. Specifically,

K_t = P_t^e / (P_t^e + v_t^m)        (7.19)

where P_t^e is the extrapolation of the covariance matrix of the estimation error P_t, defined as

P_t^e = Φ_t P_{t-1} Φ_t^T + v_t^e        (7.20)

Equations (7.19, 7.20) show that if the measurement error v_t^m is dominant, the Kalman gain decreases (closer to zero) and the estimator gives a greater weight to the extrapolation M_t^e. On the other hand, if the extrapolation error is dominant, the Kalman gain increases (closer to 1) and the estimator gives a greater weight to the measurement. Using the Kalman gain, we can also update the covariance matrix of the estimation error:

P_t = (1 - K_t) P_t^e

Apart from being used in the Kalman gain calculation of the next iteration, the covariance matrix of the estimation error can be used to find the confidence interval of the estimator M̂_t, assuming a normal distribution of the error. For forecasting, we can use the extrapolation, assuming that we have an estimate Φ̂ of the transition matrix of the system's state. In our example we get

M_{t+1}^e = Φ̂_{t+1} M̂_t for k = 1, and M_{t+k}^e = Φ̂_{t+k} M_{t+k-1}^e for k > 1.
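The whole scalar recursion fits in a few lines of code. The following is a minimal sketch, assuming H_t = 1 and the scalar forms of (7.19) and (7.20); the function name and all numeric values are our own, chosen only for illustration.

import numpy as np

def scalar_kalman_step(m_hat, p, phi, meas, v_e, v_m):
    """One iteration of the scalar Kalman filter (H_t = 1).

    m_hat : previous estimate M̂_{t-1}
    p     : previous estimation-error variance P_{t-1}
    phi   : transition coefficient Φ_t
    meas  : current measurement M̃_t
    v_e   : variance of the system (extrapolation) error e_t
    v_m   : variance of the measurement error μ_t
    """
    m_e = phi * m_hat                  # extrapolation M_t^e = Φ_t M̂_{t-1}
    p_e = phi**2 * p + v_e             # extrapolated error variance (7.20)
    k = p_e / (p_e + v_m)              # Kalman gain (7.19)
    m_new = (1 - k) * m_e + k * meas   # optimal estimator M̂_t
    p_new = (1 - k) * p_e              # updated estimation-error variance P_t
    return m_new, p_new, k

# one step with illustrative numbers: prior estimate 1000 pkt/s, variance 50
m, p, gain = scalar_kalman_step(1000.0, 50.0, 1.04, 1080.0, 25.0, 400.0)

Running the step with a large v_m pushes the gain toward zero (trust the extrapolation), while a large v_e pushes it toward one (trust the measurement), exactly as described above.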

8. Non-linear models for estimation and forecasting

8.1. Extensions of ARMA models

In this section we present some examples of non-linear models that have the form of extensions of ARMA models.

Non-linear autoregressive model NLAR

A general form of a non-linear autoregressive model of order p is

X_t = f(X_{t-1}, ..., X_{t-p}) + Z_t

For example, for p = 1 we arrive at the model

X_t = φ(X_{t-1}) X_{t-1} + Z_t

where |φ(x_t)| < 1 for stability. In this class we can also define a model with a non-stationary parameter,

X_t = φ_t X_{t-1} + Z_t,   φ_t = β φ_{t-1} + ε_t

where ε_t is an i.i.d. variable independent of Z_t. For β = 0 we arrive at the model with a random coefficient.

Threshold autoregressive model TAR

Here is an example of a threshold autoregressive (TAR) model:

X_t = φ^(1) X_{t-1} + Z_t   if X_{t-1} < r
X_t = φ^(2) X_{t-1} + Z_t   if X_{t-1} ≥ r

In this case, the estimation of the coefficients φ^(1) and φ^(2) demands an iterative procedure. The forecast for horizon k = 1 is simple, but for wider horizons it becomes complicated.

Bilinear models

The bilinear models can be seen as a non-linear extension of ARMA models. Here is an example:

X_t = φ X_{t-1} + Z_t + β Z_{t-1} X_{t-1}

where β Z_{t-1} X_{t-1} is a "cross product term".
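As an illustration of regime switching, here is a minimal simulation sketch of a two-regime TAR(1) model of the kind shown above; the threshold r = 0 and the two coefficients are invented for the example, not taken from the notes.

import numpy as np

rng = np.random.default_rng(1)

# Simulate a two-regime TAR(1) series:
#   X_t = phi1 * X_{t-1} + Z_t  if X_{t-1} <  r
#   X_t = phi2 * X_{t-1} + Z_t  if X_{t-1} >= r
phi1, phi2, r = 0.9, -0.4, 0.0
x = np.zeros(500)
for t in range(1, len(x)):
    phi = phi1 if x[t - 1] < r else phi2   # pick the regime from the last value
    x[t] = phi * x[t - 1] + rng.normal()

Because the active coefficient depends on the previous value, a one-step forecast only needs to check which regime the current observation falls in, while multi-step forecasts must account for future regime switches; this is what makes wider horizons complicated.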

8.2. Neural networks

Figure 8.1 shows an example of a feed-forward neural network structure. In this example there is only one hidden layer of neurons, but in general there can be more.

Figure 8.1. Example of a neural network structure
NB: couche d'entrée = input layer; couche cachée = hidden layer; couche de sortie = output layer

In general, the structure of a neural network is determined by an analyst experienced in the field. Specifically, he should determine three elements:
a) the number of layers
b) the number of neurons in each layer
c) the input and output functions of each neuron

Normally the input function is linear,

v_j = Σ_i ω_ij x_i

where ω_ij is the weight for input i of neuron j. The output function, also called the activation function, defines the neuron's value, z_j = f(v_j). It can be linear,

z_j = v_j

but more often it is nonlinear. Typical examples of nonlinear activation functions are the logistic function f(v) = 1 / (1 + e^(-v)) and the hyperbolic tangent f(v) = tanh(v).

In the case of one hidden layer, the output of the neural network can be presented as

ŷ = f_o( ω_0 + Σ_j ω_j f_j( Σ_i ω_ij x_i ) )        (8.2)

where
f_o, f_j - activation functions in the output layer and the hidden layer, respectively
ω_j, ω_ij - weights in the output layer and the hidden layer, respectively
ω_0 - weight of the direct link between the constant input and the output layer.

The activation function in the output layer is often chosen as the identity function, which means that the output layer is linear. Note that the form of equation (8.2) indicates that neural networks are related to linear regression models.

Neural Network Optimization

To optimize a neural network, we can apply the least squares method, which has already been presented in the section on the LSE method. In particular, based on the series' sample, we minimize the sum of squared errors of the predictions at horizon k = 1:

S = Σ_t (x_t - x̂_t)²

In general the optimization is done by trial-and-error search and backpropagation. Here are some recommendations and practical observations:
- It is good practice to divide the sample into two parts: one for optimization and the other for verification of the model.
- We often need thousands of iterations for convergence.
- Local minima are possible, due to the nonlinearity and the very large number of parameters.
- To avoid a large number of parameters, a penalty term can be added.
- A large number of parameters may minimize the squared error on the sample very well, but that does not necessarily give better forecasts.

There is also software that optimizes the neural network automatically, but sometimes the results are ridiculous.

Neural network for forecasting

Neural networks may be useful for long series with evident nonlinear characteristics. At the same time, to have confidence in the model we should have hundreds, and preferably thousands, of observations.
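To make equation (8.2) and the least-squares optimization concrete, here is a toy sketch of a one-hidden-layer network fitted by backpropagation for one-step-ahead forecasting. The network sizes, learning rate, number of iterations and the sample series are all invented for illustration; this is a sketch, not a recommended implementation.

import numpy as np

rng = np.random.default_rng(2)

# Inputs are the p previous values; hidden layer uses tanh, output layer
# is linear (identity activation); weights fitted by least squares via
# plain gradient descent (backpropagation).
p, hidden, lr, epochs = 3, 4, 0.01, 2000
series = np.sin(0.3 * np.arange(200)) + 0.1 * rng.normal(size=200)

X = np.array([series[t - p:t] for t in range(p, len(series))])
y = series[p:]

W1 = rng.normal(scale=0.5, size=(p, hidden))   # hidden-layer weights ω_ij
w2 = rng.normal(scale=0.5, size=hidden)        # output-layer weights ω_j
b2 = 0.0                                       # constant-input weight ω_0

for _ in range(epochs):
    z = np.tanh(X @ W1)               # hidden-layer outputs z_j = f(v_j)
    pred = z @ w2 + b2                # linear output layer, as in (8.2)
    err = pred - y                    # one-step-ahead prediction errors
    # gradients of the mean squared error
    g2 = z.T @ err / len(y)
    gb = err.mean()
    g1 = X.T @ ((err[:, None] * w2) * (1 - z**2)) / len(y)
    W1 -= lr * g1
    w2 -= lr * g2
    b2 -= lr * gb

In line with the recommendations above, the squared error should be checked on a held-out part of the series, not only on the part used for the gradient descent.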

8.3. Chaos

A time series that appears to be random may be:
a) random,
b) chaotic, i.e. non-linear but deterministic,
c) a combination of a) and b).

The best-known example of a chaotic series (or chaos) is the logistic map, also called the quadratic map. It is a deterministic series defined as

x_{t+1} = k x_t (1 - x_t)        (8.3)

We can show that for 0 < k ≤ 4 we have x_t ∈ (0,1), and that the series has the following characteristics for different values of k:
a) k = 4: the series lies on the quadratic curve shown in figure 8.2 (a); in addition, it has the characteristics of white noise without correlation (UWN - uncorrelated white noise).
b) 0 < k < 1: the series converges to zero, x_t → 0.
c) 1 ≤ k ≤ 3: the series converges to the attractor x_t → (1 - 1/k), as shown in figure 8.2 (b).
d) 3 < k < 3.57: the series has cyclical behavior.

Figure 8.2. Illustration of the logistic map (the quadratic map)
NB: attracteur = attractor

Characteristics of chaotic system

There are two interesting characteristics associated with chaotic series:

Butterfly effect
The name of this feature comes from the anecdote that the flapping of a butterfly's wings can cause a tropical storm. In the case of a chaotic series, a small change in the initial value can cause a big change in the following values. We can verify this by calculating x_3 for x_0 = 0.1 and for x_0 = 0.11 in the series defined by (8.3) with k = 4.

Non-integral dimension
Note that in the series defined by (8.3) with k = 4, all observations lie on the quadratic curve. Therefore, we can say that the dimension of this series is smaller than the dimension corresponding to a random series, which can take any value in the 1-by-1 square. In fact, there is a definition of non-integral dimension, and in particular series with fractal behavior may have non-integral dimensions.

Fractal behavior and self-similarity

A fractal is a geometric shape that can be cut into pieces which are (at least approximately) copies of the original form. This property is also called self-similarity, and it can be visible at any scale. Fractals can be obtained from iterative series such as (8.3). Sometimes the field of attraction gives very interesting geometric forms, as shown in Figure 8.3.

Figure 8.3. Examples of fractals
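The butterfly effect mentioned above can be checked with a few lines of code, iterating (8.3) with k = 4 from the two nearby starting points x_0 = 0.1 and x_0 = 0.11:

def logistic(x0, k=4.0, n=3):
    # iterate the logistic map x_{t+1} = k * x_t * (1 - x_t), n times
    x = x0
    for _ in range(n):
        x = k * x * (1 - x)
    return x

print(logistic(0.10))   # x_3 starting from x_0 = 0.10
print(logistic(0.11))   # x_3 starting from x_0 = 0.11

After only three iterations the two trajectories already differ in the first decimal place, even though the starting points differed by only 0.01.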
