ECONOMETRICS. Study supports


1 VYSOKÁ ŠKOLA BÁŇSKÁ-TECHNICKÁ UNIVERZITA OSTRAVA Fakulta metalurgie a materiálového inženýrství PROJECT IRP Creation of English Study Supports for Selected Subjects of the Follow-up Master Study in the Quality Management Study Field IRP/2015/104 ECONOMETRICS Study supports

2 Language review: Mgr. Gabriela Chudašová Title: Author: Ing., Ph.D. Edition: First, 2015 Number of pages: 80 Study materials for the study supports of, the Faculty of Metallurgy and Material Engineering. Intended for the IRP project of: Creation of English Study Supports for Selected Subjects of the Follow-up Master Study in the Quality Management Study Field. Number: IRP/2015/104 Execution: VŠB Technical University of Ostrava The project is co-financed by the Ministry of Education, Youth and Sports of the Czech Republic ISBN

STUDY INSTRUCTIONS You have received a study text for the course which is intended for students of the study programmes Quality Management and Economics and Management in Industry at the Faculty of Metallurgy and Material Engineering. The aim of the course is to introduce students to the foundations of econometric theory, which is widely used in industry, finance and other scientific or applied fields for the purposes of modelling relations among phenomena of interest. The study text is divided into parts and chapters, which logically divide the subject matter but are not equally comprehensive. The estimated study time of the chapters may vary considerably, which is why large chapters are further divided into numbered sub-chapters. The division corresponds to the structure described below. The subject matter is presented to students within the framework of the corresponding lectures, and is practised in seminars where students apply the presented topics. After studying the course, students should be able to:
- Define and estimate models relating many variables, using regression techniques.
- Analyse model suitability for practical purposes.
- Detect problems in modelling and remove them if they are present.
- Work with single-equation and multi-equation models.

4 STRUCTURE OF THE CHAPTERS Goal At the beginning of each chapter, objectives of that chapter are given, so that students can better orientate in terms of what is to be achieved after reading the chapter in question. Time to learning After a goal is set, the time necessary to study the subject matter is provided. The time is approximate and serves as a rough guide for the study layout. Example The text provides the reader with concrete examples which are to describe the practical aspect of work with the presented theory. The illustrative problems are solved step by step so that the ideas of the procedures used are more clear. Summary of terms At the end of each chapter, the most important terms are listed for convenience. These terms represent the part of the theory students should focus on. If a term has not been fully understood, the student should go back in the text, and read the corresponding explanatory part again. Questions There are also several theoretical or practical questions to verify that the student fully and well mastered the subject matter of the chapter. Answers to questions The practical questions are answered in the follow-up Answers to questions section. The author wishes readers a successful study of this textbook. Ing., Ph.D. 4

CONTENTS
STUDY INSTRUCTIONS
STRUCTURE OF THE CHAPTERS
INTRODUCTION
CLASSICAL LINEAR REGRESSION
  Regression line
    Model formulation
    Estimation of parameters of regression line
    Evaluation of model quality
    Properties of estimates
    Statistical properties of regression line
    Statistical inference
  Multivariate regression
    Model formulation
    The least squares estimation of model parameters
    Statistical properties of least squares
    Statistical inference
    Adjusted coefficient of determination
    Test of model
PROBLEMS IN CLASSICAL REGRESSION
  Multicollinearity
    Number of regressors
  Measurement errors
    Errors in y
    Errors in x
  Heteroscedasticity
    Formulation of the model and its consequences
    The generalized least squares
    Goldfeld-Quandt test
  Autocorrelation
    Formulation of the model and consequences
    Autocorrelation and the generalized least squares method
    Durbin-Watson test
SETS OF SIMULTANEOUS EQUATIONS
  Identification
  Estimation of simultaneous equations
    The method of indirect least squares
    Two stage least squares
TABLES
REFERENCES

7 INTRODUCTION Many economic subjects need to possess a deeper knowledge of the relations among diverse variables due to their own industrial or economic activity. One of the reasons why companies demand the knowledge might be the fact that the result of their work depends directly on the development of these variables. To give an example, banks monitor the progress of interest rates because banks financial profits are affected by the rates. Thus, knowing the relations between the rates and financial profits, a change in the rates might indicate the extent to which the financial profits will be altered. Another example involves industrial product makers who need to cut down their costs to stay competitive. To achieve this objective, they need to learn how different factors entering their production processes are related to each other and how they influence the firms industrial output. Understanding that, the levels of the factors can be set up in such a way that the output is realized at lower costs. Other examples could be worded to illustrate that different relations among different variables are of interest to both private and public companies. If relations are of interest, they should be described as best as possible, so that precise enough predictions can be formulated on the development of one set of variables, based on the setting of other variables. This allows companies to correct their behaviour in advance, so that the outcome of their activities satisfies both their and customers needs. is one of the disciplines that provides the theory and tools to describe the relations among variables. However, not only is it used to discover new relations, it is also exploited for empirical verifications of already formulated relations. There are many areas of human activity where econometrics can be used, but its economic and industrial applications probably dominate. At this stage we can say that the essential objective of econometrics is to describe and measure dependencies among variables. When relations among variables are of interest, they are usually described through a certain mathematical model that they are a part of. In the sciences, mathematical models are mostly deterministic, meaning that the relations hold exactly. In this context, the word deterministic signifies that once a set of variables takes on specific values, another set of variables, related to the former through a function, attains its own levels uniquely. A law in physics may be an example of this unique deterministic relationship. However, deterministic relations are often not the case both in practice and theory, even though some theories assume the deterministic nature of what they work with. This assumption might only be a simplification of the true state of things. If this is the case, only the most optimistic analyst, who works with the deterministic model, can presume the model corresponds to its real counterpart. More often than not, this will not be true, however. To give an example, let us look at production models. No matter how perfect such models might be, they will never encompass unexpected events, such as technical breakdowns of a machine that can temporarily stop production altogether. For these purposes, it is better if the formerly deterministic model is extended by a stochastic element, which leads to the concept of a stochastic model. 
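To make the distinction concrete, a deterministic relation and its stochastic counterpart can be contrasted as follows (a schematic illustration only; the symbols are generic and are not tied to any particular model used later in this text):

$$y = f(x_1, x_2, \dots, x_k) \quad \text{(deterministic)}, \qquad y = f(x_1, x_2, \dots, x_k) + \varepsilon \quad \text{(stochastic)},$$

where the random element $\varepsilon$ collects the effects of breakdowns, measurement errors and all other influences that the function $f$ does not capture.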
In a stochastic model, the behaviour of the variable we are interested in, such as production level, for instance, depends not only on deterministically defined other variables, but also on variables the effect of which on the variable of interest is hard or impossible to identify or measure. The overall effect of these unidentified variables is contained in or described by a random element which is inserted in the model, as well. When working with a stochastic model, it is no longer possible to predict the value of one variable, when knowing the values of other variables, since the model contains a random element the 7

8 value of which remains unknown until it is implemented. This distinguishes stochastic models from the deterministic ones. Stochastic models are less precise than deterministic models in the sense of whether the unique value of a variable corresponds exactly to the values of other variables. On the other hand, stochastic models often describe the reality more precisely than deterministic models. This is usually at the cost of simplifying the work with these models, however. The validity of a deterministic model can be refuted with a single empirical observation, whereas in the case of stochastic models, this would require many observations which predominantly are not in accordance with the model. Trying to approach the reality better means working with stochastic models. When building a stochastic model, economic and business theory is utilized, because the theory may require certain conditions to be met for the parameters entering the model. The theory may also define which variables can enter the model at all from the logical point of view. Mathematical tools must also be used as well as the theory of probability or its application in the form of mathematical statistics, since a stochastic model is dealt with. Furthermore, the model must be in line with what real data suggest, and so empirical data have to be at hand, as well, to underscore the validity of the built-up model. To sum up, econometrics is a complex scientific discipline in which the following three disciplines meet general mathematics, mathematical statistics and (business) economics. We can mention the first issue of the journal Econometrica, in which The Econometric Society wrote that the main objective of the society will be its support of studies which try to unify a quantitative theoretical approach with a quantitative empirical approach, and which rely on constructive and rigorous thinking similar to that of the sciences. However, a quantitative approach has several aspects none of which should be linked to econometrics alone. is not the same as economic statistics nor general economic theory, although a large part of it is quantitative in nature. Nor should econometrics be a synonym for the mathematics applied in economics. Experience showed that each of the three views, i.e. mathematics, economics and statistics, is necessary, and none of them can serve for these purposes by itself. It is their unification that turns them into a powerful tool. And this unification represents econometrics. How does an econometric model originate? At first, a model linking several variables is formulated. This model can draw on the validity of other theories from which it has been verified. The Cobb-Douglas production function may serve as an example. The formulation might also depend on the experience of the analyst who tries to build the model, the type of available information based on which the model is defined, and the character of the problem the model is built for. There may be more forms of the model, although none of them should contradict the theories. To give an example of the procedure, the Cobb-Douglas function can be taken as a starting point for modelling purposes, the function serving as an instance of a theory which has not been disputed, and the function may be further examined quantitatively in case of a specific company, since the function contains company-specific unknown parameters. 
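To illustrate the step from an economic theory to a quantitative model, the Cobb-Douglas production function mentioned above can be written out explicitly (a standard textbook form; the concrete symbols are not taken from this text):

$$Q = A\,K^{\alpha} L^{\beta},$$

where Q is output, K is capital, L is labour and A, α, β are the company-specific unknown parameters. Taking logarithms turns the relation into $\ln Q = \ln A + \alpha \ln K + \beta \ln L$, which is linear in the unknown parameters, and once a random component is appended it becomes exactly the kind of econometric model whose construction is described in the next step.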
In the next step, the formulated model is converted into an econometric model which represents an algebraic expression of the relations among the variables of interest, using the means of mathematics. This model contains the aforementioned random element and unknown parameters which reflect the intensity and direction of the mutual effects of the variables appearing in the model. The conversion requires that the variables to be included in the model are listed, and the 8

9 explanatory (independent) and explained (dependent) variables are determined. Thus, the direction of causal relations must be defined. Also, it must be decided whether one or more equations will make up the entire model, and what analytical form of the model it will be (linear vs nonlinear, for instance). If it can be assumed that the resulting econometric model contains all important variables and its random element has reasonable statistical properties, the unknown parameters of the model may be estimated, using proper statistical techniques. Under suitable conditions, the estimated parameters will have statistical properties that are optimal in a sense. To estimate the parameters means to determine a particular form of the model, based on concrete data, since at the beginning the model was defined only generally. The theory of econometrics strives to search for methods and conditions under which the estimated model has optimal statistical properties. It also analyses what happens to the model properties when the optimal conditions are violated and looks for ways how to solve these problems. A suitable estimation of the unknown parameters in the model is based on data gathered by the analyst who formulated the model. This is an integral part of econometricians work because the core of econometrics lies in a quantitative approach. In economic reality, a source of necessary data are official reports published by the country s statistical office, the results of polls run by marketing agencies, the information releases of central banks, companies office records, the reports from chambers of commerce, etc. This information represents the basis for modelling the relations between economic and industrial variables. After a specific model is built, it is imperative to verify its validity. Firstly, it must be verified whether the estimated parameters are in accordance with general economic and business theory. For instance, some model parameters should be positive according to economic theory. If it turns out to be negative, the model cannot be correct and must be modified. In this context, we talk about an economic verification of the model. Secondly, the quality of the model from the purely statistical point of view must be checked, as well. This requires that various statistical tests are run, and model quality characteristics are calculated. Different criteria checking model quality exist, and they are either a direct result of the theory of statistics, or they are artificially constructed, having a specific interpretation. In this case, a statistical verification of the model is performed. An econometric verification of the model represents yet another feedback about the model and its estimation. This type of verification is very important, since it checks whether all conditions necessary for estimating the model and its statistical verification have been met. As it turns out, building an econometric model is not a straightforward and easy task. It can be actually quite difficult because the quality of the mathematical description of relations among economic variables depends on more aspects involving the extent of the economic reality to be described, the character of variables to be used in the model, the mathematical form of the model or the stochastic character of the random component used in the model. 
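To make the three kinds of verification more tangible, the following minimal sketch shows one possible way to inspect an estimated model. The data and variable names are hypothetical and the statsmodels library is only one convenient tool; the text itself does not prescribe any particular software.

```python
import numpy as np
import statsmodels.api as sm

# Hypothetical data: production cost (y) explained by output volume (x1) and energy price (x2)
rng = np.random.default_rng(1)
n = 60
x1 = rng.uniform(10, 100, n)
x2 = rng.uniform(1, 5, n)
y = 4.0 + 0.8 * x1 + 2.5 * x2 + rng.normal(0, 3, n)

X = sm.add_constant(np.column_stack([x1, x2]))   # column of ones = absolute term
model = sm.OLS(y, X).fit()                       # least squares estimation

# Economic verification: do the signs of the estimates agree with theory (costs rise with output)?
print(model.params)

# Statistical verification: t-tests of the coefficients and the coefficient of determination
print(model.tvalues)
print(model.rsquared)

# Econometric verification: a first look at the residuals, whose assumed properties are
# examined in detail in the later chapters on heteroscedasticity and autocorrelation
print(model.resid.mean(), model.resid.std())
```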
The theory of econometrics tries to overcome these obstacles so that the models formulated were reasonable in terms of their precision, and could be suitably used in real-life situations. 9

10 is a valuable tool for studying economic and business phenomena, and as a theory it is advanced. On the other hand, its power should not be overestimated. We must realize that econometricians often observe economic events under conditions which do not allow them to run a controlled experiment. They cannot always monitor the reaction of a variable of interest, dependent on other influential variables or factors, the values of which they could control at their own free will. In these cases, they cannot expect to find deep relations through widespread and controlled data manipulation. What econometricians may have at hand are data of a nonexperimental nature because they are only the passive observers of events. This contributes to the fact that stochastic rather than deterministic models are worked with. The analyst can only hope that a data sample on implemented economic processes will be available under conditions which econometrics can handle, and the result of his or her efforts will be a narrower class of reasonably good models usable in practice. One might even say that the theory of econometrics is a good organizer of available economic and business data. It must be stressed that any model will always be a simplification of reality, and that it will not be usually possible to create a model which would be better than other models in every respect. This means that the objective of an econometrician should rather be to narrow down the set of all usable models to a subset of better ones which might serve as better candidates for the description of reality. As statisticians say, every model is poor, but some models are better than others. And it is the better model that econometricians should try to detect, based on all the data they can have. originated some 80 years ago, in the 1930s, at the time when the American Econometric Society was founded. As it is the case with other theories, econometrics is built in subsequent steps, starting with models corresponding to simpler situations, and ending up with models reflecting more complex economic settings. We shall follow this concept in the following chapters of this text. 10

1 CLASSICAL LINEAR REGRESSION

Goal: This chapter introduces the reader to fundamental concepts of regression modelling in its simplest form, which covers multivariate classical linear regression and its special case, the case of the regression line. The chapter covers estimation procedures for finding parameters in regression models, checking model quality and using models for statistical inference.

Time to learning: 10 hours.

We shall start the explanation of the theory of econometrics in the area of classical linear regression, which represents a starting point for more advanced econometric procedures. The term regression is already known from other fields, such as psychoanalysis, and it can be loosely translated as a reversed procedure. In econometrics, we do not analyse data based on already known functional relations among them. We rather proceed the opposite way, hence the term reversed. We try to find a functional relation or model, having gathered some empirical data in advance. This model should explain the form of the available data. In econometrics, the regression principles and techniques are utilized to a great extent. Regarding the modifier linear, the term relates to the analytical form of the models we are going to work with. Linearity is known from general mathematics, although within the econometric framework, it is perceived in a broader sense. We shall define the term linear regression model as the model of the form

$Y = \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_k X_k + \varepsilon$,   (1.1)

where $Y$ is the so-called explained or dependent variable, $X_j$, j = 1, 2, …, k, are explanatory or independent variables (yet another term is regressors), $\beta_j$, j = 1, 2, …, k, are unknown model parameters and $\varepsilon$ represents the random component of the model. To stress the meaning of the word linear, we shall write the relation 1-1 in a more general form

$Y = \beta_1 f_1 + \beta_2 f_2 + \dots + \beta_k f_k + \varepsilon$,   (1.2)

where $f_j = f_j(X_1, X_2, \dots, X_k)$ for j = 1, 2, …, k. Thus, when we talk about a linear model, we will have in mind the expression 1-2, which is linear in terms of how the unknown parameters appear analytically in the model. The explanatory variables can be nonlinear functions of their

12 arguments. For example, a model of the form = is still linear, while the model = + is not. The word classical is related to a set of properties required for the random model component and the regressors. The properties are formulated in a way which ensures that the estimation of the unknown model parameters will result in an econometric model of a reasonable quality. One of these conditions requires that the regressors are non-random variables (or more precisely, that they are degenerated random variables). This practically means that an econometrician working with the model defines the values of the regressors to be included in the model himself or herself. Having set the i-th value of the j-th variable, which shall be denoted, the econometrician then measures or obtains a value of the dependent variable corresponding to the values of regressors set in advance. If the subscript i represented a point in time (in that case, a more common subscript is t instead of i), the model contained only one explanatory variable - the amount of a company s savings, and the dependent variable Y would stand for a company s investments, then for i= 1, the econometrician would define a company s savings of and look for the investments of companies whose savings are valued at. If n values are defined for the explanatory variables in advance, the resulting data sample to be analysed will be of the form (,,,, ), i=1,2,, n. This sample shall be used to build an econometric model which would describe the relation(s) between the variable and the variables,,,. The obtained values of are realizations of the random variable Y. As was outlined in the introduction, the construction of a model can be divided into three major steps. In the first step, a general mathematical form of the model is defined, and statistical properties are set for the random component of the model. In the second step, the unknown parameters of the model are estimated, so that the general form of the model from the first step takes on a specific empirical form. For this, concrete empirical data must be available. In the final step, the overall quality of the model is assessed, and the feasibility of the assumptions about the properties of the random component is discussed. If the assumptions are met, the model will have a reasonable quality and can be worked with for the purposes of statistical inference. If the assumptions are not met, the approach to building the model must first be altered before it is reasonably used for statistical inference. We shall follow this concept in the next chapters. The theory of classical regression will now be divided into two parts. In part one, we shall discuss the concept of the regression line. This will serve us well in demonstrating some fundamental principles of econometrics [1]. In the second part, these principles will be generalized for the case when the regression model contains more independent variables. Thus, we will switch to model 1-1 in which k may be greater than one. 1.1 REGRESSION LINE MODEL FORMULATION The model of the regression line is of the form 12

13 = + +, = 1, 2,,. (1.3) Compared to 1-1, the model contains the subscript i, which simply says that we work with n specific forms/values of the regressor. Expression 1-3 is a special case of 1-1 when k = 1 and the equation is examined for n specific values of. As we shall see, concrete sets of values (, ), = 1, 2,,, will be utilized when building the model. These values are available before the unknown coefficients in the model are estimated. The values s are concrete realizations of the random variables s. Since 1-3 represents a set of n equations with unknown parameters,, it is possible to rewrite it in the matrix form = +, (1.4) where 1 = 1, = : : :, = :, =. 1 In the classical regression, the random components are required to meet the following conditions: ~(0, ) for = 1, 2,,,, = 0 for. In other words, the random elements are to be normally distributed with a zero mean and a constant variance. They should also be mutually or serially uncorrelated. When the condition of constant variance is met, we talk about homoscedasticity. In the opposite case, when the condition is violated, we talk about heteroscedasticity. If the condition of the zero correlation is satisfied, we say that there is no autocorrelation in the model. More precisely, the two conditions are usually written in the matrix form, which implies that the random components are not only uncorrelated, but also statistically independent: ~(, ), (1.5) where N denotes a multivariate normal distribution (of dimension n in this case), 0 is the vector of expected values (0, 0,, 0) and, with I being the identity matrix, is the so-called covariance matrix. The matrix contains variances of the random components of the model on its diagonal the diagonal elements are all equal in the classical model and covariances between the random components represent the off-diagonal elements of the matrix. In general, the element of the matrix lying in the i-th row and j-th column is =,, i, j = 1,, n. The classical case also has requirements for the matrix of regressors X. It is demanded that a) elements of are determined in advance, and thus are not realizations of 13

nondegenerated random variables. They are not assigned to the analyst by chance. b) columns of $X$, which is of type $n \times k$, i.e. it has n rows and k columns, need to be linearly independent. In other words, the rank of X is

$h(X) = k$.   (1.6)

1-6 implies that $k \le n$. We have six conditions altogether in the case of the classical regression model. Four of them concern the random components of the model (normality, zero expected values, constant variance and zero covariances, or put equivalently, zero correlations), while the remaining two conditions are related to the matrix of regressors. We add that in the general formulation of classical regression, the condition of normality is absent, since many important properties of the resulting estimated model may still be proved without this requirement. However, normality simplifies the subsequent statistical inference we shall do with the model, therefore we will assume the condition is met.

ESTIMATION OF PARAMETERS OF REGRESSION LINE

We will present the basic and most used method of estimation of the unknown parameters appearing in a regression line. The method is called the least squares method, and it can be used for more complex models, as well. Computationally, the method is simple, and it produces estimates with reasonable statistical properties. Its principle is such that the estimates of the two unknown parameters $\beta_0$, $\beta_1$, denoted $b_0$, $b_1$, are the values which minimize the expression

$S(b_0, b_1) = \sum_{i=1}^{n} (y_i - b_0 - b_1 x_i)^2$.   (1.7)

Formula 1-7 can be understood as a function of two variables $b_0$ and $b_1$, and so to find the estimates means to find the minimum of the function S, or to find the extreme of a function of two variables, generally speaking. To find the extreme, we derive function 1-7 with respect to $b_0$ and then with respect to $b_1$, and set the two partial derivatives equal to zero. This way, we arrive at the equations

$\frac{\partial S(b_0, b_1)}{\partial b_0} = 2 \sum_i (y_i - b_0 - b_1 x_i)(-1) = 0$,
$\frac{\partial S(b_0, b_1)}{\partial b_1} = 2 \sum_i (y_i - b_0 - b_1 x_i)(-x_i) = 0$.

This represents the set of normal equations. The set can be further adjusted as

$n b_0 + b_1 \sum_i x_i = \sum_i y_i$,
$b_0 \sum_i x_i + b_1 \sum_i x_i^2 = \sum_i x_i y_i$.

This leads to the solution

$b_1 = \frac{n \sum_i x_i y_i - \sum_i x_i \sum_i y_i}{n \sum_i x_i^2 - \left(\sum_i x_i\right)^2}$, $\quad b_0 = \bar{y} - b_1 \bar{x}$.   (1.8)

Expressions 1-8 can be simplified to

$b_1 = \frac{\mathrm{cov}(x, y)}{\mathrm{var}(x)}$, $\quad b_0 = \bar{y} - b_1 \bar{x}$,   (1.9)

where the symbol cov means the covariance of the variables x and y: $\mathrm{cov}(x, y) = (1/n) \sum_i x_i y_i - \left((1/n) \sum_i x_i\right)\left((1/n) \sum_i y_i\right)$, and the symbol var stands for the variance of x calculated according to the formula $(1/n) \sum_i (x_i - \bar{x})^2$. The symbols $b_0$ and $b_1$ are often used instead of $\hat{\beta}_0$ and $\hat{\beta}_1$ because this complies with the notation principles used in statistics. Greek letters are reserved for unknown parameters, whereas their estimates are denoted with letters of the Latin alphabet. The model $E(Y) = \beta_0 + \beta_1 X$, where the symbol E denotes the expected value of Y as usual, is sometimes called the theoretical regression function; its estimate $\hat{Y} = b_0 + b_1 X$ is called the sample regression function. The values $\hat{y}_i = b_0 + b_1 x_i$ are called fitted values. When the model is estimated, it is possible to estimate the i-th value of the random component $\varepsilon_i = y_i - \beta_0 - \beta_1 x_i$ with $\hat{\varepsilon}_i = y_i - \hat{y}_i$, which is more often denoted as $e_i$. The estimate of the random component is called a residual. Expression 1-7 may now be estimated by the residual sum of squares $S_R = \sum_i e_i^2$.

EXAMPLE 1

Using the least squares method, find the estimates $b_0$ and $b_1$ of the unknown coefficients in the model $Y = \beta_0 + \beta_1 X + \varepsilon$. The following data sample is available

Table 1: Data sample for Example 1
x y
Source: own

Solution: We shall use expressions 1-8. To do so, let us extend Table 1 to table

16 Table 2 x y xy x Summing the columns of the table, we get = = = 1.232, = Therefore, the sample regression line is of the form = The same result can be obtained, using formulas 1-9. In our case, (, ) = and () = 16. Further, = and = 8, so that = /16 = and = = In the next step, residuals can be calculated as well as the residual sum of squares. The results of these calculations are in Table 3. Table 3: Calculation of residuals in Example 1 x y fitted y e e The residual sum of squares is 4.82 (this is the sum of the last column of Table 3). The value represents the minimal value of function EVALUATION OF MODEL QUALITY After the model is estimated, it makes sense to make a judgement about its quality. We shall focus on the statistical verification of the model at this stage and on the general characteristics of model quality. Since we have not assigned a particular economic meaning to the model, we will 16

not perform the economic verification. As far as the econometric verification is concerned, it will be dealt with in greater detail in later chapters where we will be concerned with the causes and consequences of violations of the classical regression conditions for a general model. If we want to make a judgement about the quality of a model, we could theoretically use the criterion $S_R$, because its value tells us how well the model at hand runs through the measured values. The smaller the value of the residual sum of squares, the more optimistic we might be in relation to the model found. However, this criterion is not suitable, and the reason behind this statement is simple: $S_R$ depends on the physical units of Y, and so a change in the units changes $S_R$. Therefore, various dimensionless criteria of model quality are constructed, reflecting in a sense to what extent the model is suitable for use. One of these criteria is the coefficient of determination. Let us demonstrate its principle. Let

$S_T = \sum_i (y_i - \bar{y})^2$ and $S_E = \sum_i (\hat{y}_i - \bar{y})^2$.

The expression $S_T$ is called the total sum of squares, the expression $S_E$ is called the explained sum of squares, and it describes the variability of the fitted values. We may write

$\sum_i (y_i - \bar{y})^2 = \sum_i (\hat{y}_i - \bar{y} + e_i)^2 = \sum_i (\hat{y}_i - \bar{y})^2 + \sum_i e_i^2$,

because the cross term vanishes for the least squares fit. Thus, we have

$S_T = S_E + S_R$.   (1.10)

Dividing both sides of 1-10 by $S_T$, we get

$1 = S_E / S_T + S_R / S_T$.   (1.11)

The ratio $S_E / S_T$ is called the coefficient of determination, denoted $R^2$. The ratio describes what part of the total variability of the $y_i$s is explained by the variability of the fitted values, i.e. what part of the total variability is explained by the model. It follows from 1-11 that the coefficient can take on values from the interval [0, 1]. The higher the value of the coefficient, the more suitable the model seems to be. In the case of our regression line, $R^2 = 170/174.82 \approx 0.97$. This value suggests that the model is suitable for the description of the x, y relation. It must be stressed, however, that the coefficient is an elementary characteristic of model quality, and it has its drawbacks. We shall talk about it in the chapter on multivariate regression, where more regressors will be added to the model and a suitable modification of the coefficient of determination will be introduced. The coefficient of determination evaluates the suitability of the analytical form of the estimated model, but it does not say anything about the statistical properties of the estimated model

18 parameters or the entire model. Let us focus on these properties which are related to conditions 1-5 and 1-6 of the classical regression case PROPERTIES OF ESTIMATES To assess the statistical quality of a model and its estimated parameters, suitable criteria must be defined, against which the quality will be compared. The theory of econometrics works particularly with the following criteria: Unbiased estimate We say that b is an unbiased estimate of if the expected value of the estimate equals the unknown parameter, that is, if () =. Usually, a vector of unknown parameters = (,,, ), k >1, is being estimated, and in that case, a vector = (,,, ) is said to be an unbiased estimate of if ( ) =, i = 1,, k. The best unbiased estimate An unbiased estimate b is said to be the best unbiased estimate of the unknown parameter if ( ) (), where is any unbiased estimate of. This means no unbiased estimate has a smaller variance. If the entire vector = (,,, ), k >1, is estimated, then an estimate is said to be the best unbiased estimate if a linear combination = + +, where s are arbitrary real numbers, is the best unbiased estimate of. Asymptotically unbiased estimate We say that, where the subscript denotes the data sample size we work with, is an asymptotically unbiased estimate of if its expected value converges to the unknown parameter lim ( ) =. This means that for large enough samples, the estimate is an unbiased estimate of the unknown parameters, but only approximately. An analogous definition holds for the vector of estimates: the limit of the expected value is applied to each vector component separately. One may also define the best asymptotically unbiased estimate in a manner similar to that of the best unbiased estimate, but within the limit. Consistent estimate An estimate, where the subscript denotes the data sample size we work with, is a consistent estimate of the unknown parameter if it converges to in probability, i.e. if lim ({ > }) = 0 18

19 for any fixed > 0. A vector of estimates is said to be a consistent estimate of the vector of unknown parameters if each component of the vector of estimates is a consistent estimate of the corresponding component of the vector of unknown parameters. Convergence in distribution and asymptotic distribution of an estimate In many situations it is convenient to know the probability distribution of the estimated parameters of a regression model because it allows us to perform statistical inferences. It may happen, however, that the exact distribution is unknown when only a data sample of finite size is available, which is always the case in practice. What might be known, however, is the limiting distribution of such an estimate. It is then possible to say that the known distribution describes the stochastic behaviour of the estimate with a precision that grows as the data sample size increases. The term asymptotic distribution of an estimate may then be introduced in this context. Before doing so, however, the term limiting distribution must first be defined, which is related to what is called a convergence in distribution. In our case, all these terms will concern a normal probability distribution, in particular. Convergence in distribution Let { } be a sequence of random variables the behaviour of which is described by distribution functions (), and let X be a random variable described by a distribution function (). If lim () = () in the points of continuity of F, then is said to converge to X in distribution. The convergence in distribution is analogically defined for random vectors all that must be done in the definition just presented is that { } needs to be understood as a sequence of random vectors, X is taken as a random vector and the distribution functions discussed must be viewed as distribution functions of the corresponding random vectors. If univariate random variables converge in distribution to a normal variable, the notation (.,. ) is used to describe this fact. A similar notation is used for vectors of a multivariate normal distribution: (.,. ). Asymptotic distribution We say that the distribution of an estimate b is asymptotically normal with an expected value and a variance / if ( ) (0, ). We say that a vector of estimates b is asymptotically normally distributed with a vector of expected values and a covariance matrix / if ( ) (, ). Here, we deal with a multivariate normal distribution N and the covariance matrix of a random vector. Regarding the covariance matrix, see the beginning of this chapter, where this term has been described for the vector of random components of a regression model, which is a random vector, as well. V in this general definition may not be, however, a diagonal matrix. Some other statistical properties are defined in the context of econometrics, as well. These are the efficiency and asymptotic efficiency of an estimate [2]. The precise definitions of these terms are rather technical, dealing with regular probability distributions. We shall not present these definitions in this text, and will focus mainly on (the best) unbiasedness, (the best) linear 19

20 unbiasedness to be yet introduced, and on the consistency and distribution of an estimate. These terms are especially important both for practical and theoretical purposes STATISTICAL PROPERTIES OF REGRESSION LINE Let us look at some of the statistical properties of the estimated parameters ina regression line provided the conditions for classical regression are satisfied. We shall be interested in the most important properties for the case of finite data samples, and the properties that do not have a straightforward proof will only be mentioned without presenting the proof. A more comprehensive coverage of the properties will be done in the more general case of multivariate regression models in the next chapter. Regarding the estimate in the model = + +, we may write where = (,) ( ) = ( ) ( ). = ( ) ( ) ( ) Similarly as in 1-12, we can write =. = ( ) ( ) =, (1.12) Thus, both estimates can be expressed analytically in the form h. Since this expression is a linear function of y, the estimates are called linear estimates. Taking as a random variable now, not its specific realization, thus using instead of in 1-12, and substituting with equation = + + in 1-12, we have = ( ) ( ) ( ) = +, (1.13) since ( ) = ( ) and ( ) = 0. Therefore, ( ) = ( + ) = + ( ) = because of the condition ( ) = 0. Further, since () = ( ) = ( + ) = +, it follows that ( ) = ( ) = + ( ) =. We can say that the least squares method leads to unbiased estimates of the unknown parameters and. Let us calculate the variances of the estimated parameters now, as an exercise. Given 1-13 and given the fact that the random components of the model are uncorrelated, we may write ( ) = ( + ) = ( ) = = / ( ), 20

21 because generally, ( ) = ( ) + 2 (, ),. As for the parameter, it can be adjusted, using 1-12, as = = ( + + ) = + ( ) + = +, where = ( ) [3]. Therefore, ( ) =. We are now interested in how good these estimates are. Theoretically, the data at hand could have been handled another way, which would have led to other estimates of the model parameters. For instance, other estimates can result from using a different criterion than 1-7. It turns out, however, that such efforts would have been fruitless because in the class of all linear unbiased estimates, we would not have found estimates with smaller variances. If the conditions for classical regression hold true, the least squares method leads to the best linear unbiased estimates of the unknown parameters and. The class of linear estimates could be too narrow, however, and the question arises what the best unbiased estimate looks like, whether it is linear or not. This question has an answer in our classical case, because if 1-5 is satisfied, the least squares method gives estimates which are the best among all unbiased estimates, whatever their analytical form is STATISTICAL INFERENCE The main objective of building a model, which includes estimating its parameters among other things, is to use it for statistical inference. Statistical inference concerns testing statistical hypotheses and constructing confidence intervals for unknown model coefficients. This will be our focus at the very end of this chapter on the regression line. To perform statistical inference, the variances of the estimated coefficients must be known. These were derived in the previous paragraphs, but the resulting equations contain the parameter which is almost always unknown. Therefore, we must start with finding its appropriate estimate =. Since = ( ) = ( ) [( )] = ( ) and ( ) =, it looks like a reasonable candidate might be the expression /. And really, we are not far from the truth. We shall, however, alter the expression slightly to get a variance estimate with proper statistical properties. Let s start with = + +, = 1, 2,,. (1.14) Averaging both sides of 1-14, we get =. (1.15) Also, using 1-12, 1-14 and 1-15, we have 21

$e_i = y_i - \hat{y}_i = y_i - b_0 - b_1 x_i = (\beta_0 + \beta_1 x_i + \varepsilon_i) - b_0 - b_1 x_i$.

Therefore, substituting $b_0 = \bar{y} - b_1 \bar{x}$ and $\bar{y} = \beta_0 + \beta_1 \bar{x} + \bar{\varepsilon}$,

$e_i = (\varepsilon_i - \bar{\varepsilon}) - (b_1 - \beta_1)(x_i - \bar{x})$.   (1.16)

This equality implies that $E\left(\sum_i e_i^2\right) = \sigma^2 (n - 2)$. In other words, $s^2 = S_R/(n - 2)$ is an unbiased estimate of $\sigma^2$. We have now all the information we need to proceed with statistical inference. Since we know the unbiased estimate of $\sigma^2$, we know the unbiased estimates of the variances of the two estimated parameters in the regression line:

$\widehat{\mathrm{var}}(b_0) = \frac{s^2 \sum_i x_i^2}{n \sum_i (x_i - \bar{x})^2}$;  $\quad \widehat{\mathrm{var}}(b_1) = \frac{s^2}{\sum_i (x_i - \bar{x})^2}$.

Symbols $s^2(b_0)$ and $s^2(b_1)$ are used more frequently instead of $\widehat{\mathrm{var}}(b_0)$ and $\widehat{\mathrm{var}}(b_1)$. We also know that the estimated coefficients are unbiased estimates. Last but not least, when a random variable, such as an estimate of a coefficient, is a linear transformation of another normally distributed random variable, the estimate itself is normally distributed. Therefore, $b_i \sim N(\beta_i, \mathrm{var}(b_i))$. It follows then that $(b_i - \beta_i)/\sigma(b_i) \sim N(0, 1)$. When the unknown variance $\mathrm{var}(b_i)$ is replaced with its unbiased estimate $s^2(b_i)$, it can be proved that the variable obtained has the Student's t distribution with $n - 2$ degrees of freedom:

$(b_i - \beta_i)/s(b_i) \sim t_{n-2}$.   (1.17)

The construction of 1-17 is similar to that of the one-sample t-test criterion. The result 1-17 allows us to construct confidence intervals for the model coefficients, and to test hypotheses about these coefficients. We can test the hypothesis $H_0: \beta_i = k$, where k is a given number, against the alternative hypothesis $H_1: \beta_i \neq k$. The null hypothesis will be accepted at the significance level $\alpha$ of the test if

23 ( )/ ( ) < (), where the term on the right-hand side represents the corresponding critical value of the t- distribution, i.e. the value (), such that for a random variable ~, ({ ()}) =. In the opposite case, when ( )/ ( ) (), the null hypothesis is rejected. Usually, k = 0, and in that case, the statistical significance of is tested. Put in other words, it is tested whether there is any sense in adding the i-th regressor to the model. We may also construct confidence intervals for the parameters of the model. For the parameter, the 100(1 α) - per cent confidence interval is, according to 1-17, of the form b t,/ s(b ) β b + t,/ s(b ), (1.18) where,/ is the (1 /2) - quantile of the t-distribution, i.e. <,/ = MULTIVARIATE REGRESSION A regression line may be sufficient for the description of elementary relations. An example is the capital market line in the theory of investments. But for most purposes, this will be a too simple model, since the explained variable Y usually depends on more than one regressor. In these cases, we talk about multivariate regression, which is the contents of this chapter. We shall again explain the subject matter in three basic steps. Step one will concern the formulation of the model, its unknown coefficients will be estimated in step two, and in the final step three, statistical properties of the estimates will be discussed. Since multivariate regression is a generalization of the regression line, it is not surprising that many theoretical conclusions that will be formulated in this chapter resemble those that have been presented in the previous chapter MODEL FORMULATION The multivariate linear regression model is of the form = , = 1, 2,,. (1.19) For k = 1, we get the case of regression line. Model 1-19 may be written in the matrix form = +, where the symbols have the same meaning as before, in the case of line, except for the 23

24 matrix of regressors X appearing in this equation. In the case of multivariate regression, the matrix satisfies 1 1 = The model is linear in terms of its parameters, but not necessarily in terms of its regressors (see chapter one). Therefore, an example of the multivariate linear regression model is the equation = We shall still assume that the classical regression case is valid here, i.e. no theoretical problems occur when estimating the parameters of the model. Let us generalize the classical regression conditions for the case of multivariate regression. We shall do so, using the matrix notation, since this form of expressing ideas will be helpful in subsequent chapters. We assume that 1) ( ) = 0, = 1, 2,,, which can also be expressed as ( ) 0 ( () = ) = 0... =. ( ) 0 Considering 1-19, condition 1) implies that () =. 2) ( ) ( ) ( ) ( ) ( ) =.... ( ) ( ). ( ) 0. ( ) 0.. = ( ) This condition is related to the covariance matrix of the vector of random components = ( ). The element represents the covariance between the i-th and j-th component of the vector. The second condition says that all the random components have the same variance and are mutually uncorrelated:, = ( ) = 0 for. Condition 2) may also be written as ( ) =, where I is the identity matrix. 3) The matrix of regressors = ( ), where is the i-th value of the j-th variable appearing in the model, is nonrandom and its columns are linearly independent. Thus, the values in X are determined in advance by the statistician, the matrix is of type and the rank of the matrix is k. For the rank condition to be satisfied, must be necessarily satisfied, i.e. if <, then the rank of the matrix cannot be equal to k. 24

The third condition is important in relation to the rank of the matrix. The rank plays a significant role when the least squares estimation of the multivariate model coefficients is performed. Looking at the case of the regression line, if the rank condition is not satisfied, the second column of X would be a multiple of the first column of ones. This would mean the second column would be a column of constants, and in that case, we would have $\sum_i (x_i - \bar{x})^2 = 0$, being unable to use formula 1-9 due to division by zero. We would not be able to estimate the parameters of the regression line. As we shall see, a similar result would occur if we tried to estimate the parameters of the multivariate linear regression model provided the matrix X had linearly dependent columns.

4) The vector of random components $\varepsilon$ appearing in 1-19 has a multivariate normal distribution. Summarizing the conditions related to the vector $\varepsilon$, we may write: $\varepsilon \sim N(0, \sigma^2 I)$.

THE LEAST SQUARES ESTIMATION OF MODEL PARAMETERS

Let us now derive the formula that estimates the unknown parameters of model 1-19. The estimates are obtained using the least squares method, which means that the objective is to find values $b_0, b_1, \dots, b_k$ which minimize the expression

$S(b) = \sum_{i=1}^{n} (y_i - b_0 - b_1 x_{i1} - \dots - b_k x_{ik})^2$.   (1.20)

The expression is understood as a function of the variables $b_0, b_1, \dots, b_k$. To find their proper values means, as in the case of the regression line, to solve an optimization problem. To do so, partial derivatives of 1-20 are calculated and set to zero. Setting $e_i = y_i - b_0 - b_1 x_{i1} - \dots - b_k x_{ik}$ for the residuals of the model, we get

$\frac{\partial S(b)}{\partial b_j} = -2 \sum_{i=1}^{n} x_{ij} e_i = 0$, $\quad j = 0, 1, \dots, k$,   (1.21)

where $x_{i0} = 1$. Writing $b = (b_0, b_1, \dots, b_k)'$ and $y = (y_1, y_2, \dots, y_n)'$, $e = y - Xb$, 1-21 may be rewritten as

$\frac{\partial S(b)}{\partial b} = -2 X'(y - Xb) = -2 X'y + 2 X'X b = 0$,

and therefore the solution for b satisfies

$X'X b = X'y$.   (1.22)

Expression 1-22 represents a set of the so-called normal equations we have resolved in the case of the regression line. If the inverse $(X'X)^{-1}$ exists, the set of equations has the unique solution

$b = (X'X)^{-1} X'y$.   (1.23)

At b, the sum of least squares S(b) may theoretically be either minimized or maximized, or there may be no extreme of the function, as well. It can be shown that the function is minimized in this

26 case, and therefore, 1-23 is the solution we were looking for. Formulas 1-9 and 1-23 give the same result, or put another way, 1-9 is a special case of The vector of coefficients can be obtained uniquely only if the inverse ( ) exists, which brings us back to the conditions required in the case of classical regression. If the matrix X does not have the full rank, i.e. its columns are linearly dependent, the matrix does not have its inverse, and 1-23 cannot be applied. It is clearly seen why the classical regression conditions include the condition for the rank of the matrix of regressors. Once the model is estimated, the question about the statistical quality of the estimates arises again STATISTICAL PROPERTIES OF LEAST SQUARES We shall derive some basic properties of the estimates, the other more complex results will only be mentioned as a result. The vector of estimates, viewed as a random vector,satisfies = ( ) = ( ) ( + ) = + ( ), (1.24) and so its expected value satisfies () = () + {( ) }, = + ( ) (), = +. (1.25) When calculating the expected value of a vector, recall that this means that the expected value of each of its components is calculated. Expression 1-25 holds because the vector of unknown parameters is a vector of constants and the conditions 1) and 3) of classical regression apply. This means that the vector is an unbiased estimate of the vector. This is a generalization of the result obtained for regression lines. For regression lines, we have also derived the variances of the estimated coefficients. We shall generalize this procedure in the case of the multivariate model, as well. When working with a random vector, however, we talk, as we already know, about its covariance matrix = ( ), where denotes the element in the i-th row and j-th the column of the matrix, or the covariance between the i-th and j-th component of the random vector. For the vector, we can write =, = ( ( )) ( ). The entire covariance matrix of the vector is then () = {( ())( ()) }, or, in a more readable form according to 1-25, () = {( )( ) }. Since the condition 3) of classical regression holds, as well as 1-25, we may use the following 1 1 three matrix rules: ( AB ) B A, ( A) ( A ), var( Ax) A var( x) A for a random vector x. Thus, 26

27 () = {( )( ) }, = {( ) (( ) ) }, = ( ) ( )( ), = ( ) ( ), = ( ). (1.26) What matters most is the result which tells us how to find the covariance matrix of the vector of estimates. In fact, for a regression line we can also construct the covariance matrix of its estimated coefficients, since a line contains two parameters, or a vector of two parameters. The diagonal of such matrix, a special case of 1-26, would contain the variances of and, which have been derived, and the two matrix elements off its main diagonal would represent the covariances (, ) and (, ), (, ) = (, ). Matrix 1-26 is always symmetric, given the definition of covariance. Using the matrix ( ), it can be shown that the estimate is the best unbiased estimate of , provided the conditions of classical regression are satisfied and the coefficients were estimated by the least squares method according to As far as the probability distribution of is concerned, we assume that the random vector has a multivariate normal distribution. Therefore, the vector Y, a linear transformation of, and the vector, a linear transformation of Y, are also normally distributed. Summarizing the results so far, we have ~(, ( ) ). (1.27) For the sake of completeness, let us also say that under additional and very general conditions imposed on the behaviour of X, is also a consistent estimate of [3]. Relation 1-27 may be used for statistical inference STATISTICAL INFERENCE We shall test the hypothesis about the components of and construct confidence intervals for them, which is going to be an analogy to the case of the regression line. The starting point is expression For the i-th component of, we have ~(, ), where is the i-th diagonal element of the matrix ( ). This implies that ( )/ ~(0,1). (1.28) If we knew the parameter, we could use 1-28 for statistical inference. However, the parameter is rarely known, and so we have to resort to its reasonable estimate, as in the case of the regression line. The use of an estimate will change the probability distribution of It can be shown that an unbiased estimate of is = /( ), where p denotes the number of model parameters. For our generally defined multivariate model, = + 1. This result is a generalization of the case of the line, where we had = 2. Using the estimate, 27

28 ( )/ ~, (1.29) where represents the Student s t-distribution with degrees of freedom. Expression 1-29 can be used to test the hypotheses about the parameters and for construction of their confidence intervals. Therefore, formulas 1-17 and 1-18 hold with different degrees of freedom of the t-distribution. Let s demonstrate the procedures in classical regression by an example. EXAMPLE 2 The following table contains data on a variable Y which is assumed to depend on variables and, the relation being of the form = The goal is to estimate the unknown parameters s, test their significance and construct a 95% confidence interval for the parameters. Table 4: Data for Example 2 Source: own Solution: y x x The model can be written as = , where =, so we are working with three explanatory variables. For this purpose, let us expand the original table to table Table 5 y x x x 3 = x 1.x The matrix of regressors is 28

29 = Hence 8 16 = , ( ) = The vector of estimates is = ( ) = = , The model is of the form = Inserting the vector of the i-th values (,, ) in the resulting model, we can calculate the fitted values. This further allows us to evaluate the residuals =. Table 6 shows the calculations (i = 1, 2,, 8) together with second powers of the residuals. Table 6: Calculations in Example 2 y y-fitted e e

30 The residual sum of squares is = = The parameter is therefore estimated as /(8 4) The variances of the estimated coefficients are estimated as Table 7: Coefficients and their estimated variances coefficient b b b b Knowing the estimates of the variances, we can now construct 95% confidence intervals for the model coefficients. For this purpose, we use formula 1-18, where 2 degrees of freedom are replaced with 4 degrees of freedom, since there are = 4 parameters in the model. For = 0.05, we have,, = 2.776, which provides the following confidence intervals = ( 5.93, 7.35), = (0.028, 4.77), = ( 4.13, 3.86), = ( 1.21, 3.29). In the final step, we are going to test the significance of the model coefficients. To test the significance of the i-the coefficient means to examine the validity of the null hypothesis : = 0. We can use the statistic = ( ), just like in the case of regression line. If the null hypothesis is true, the statistic has the t- distribution with n-p degrees of freedom. For each of the unknown coefficients, we have 30

31 Table 8: Test of coefficient significance Coefficient T b b b b A coefficient is considered significant if (), because this is when the null hypothesis is rejected. In the example above, only is significant, since () = None the less, let us note that the absolute term is included in the regression models, as well, for various reasons even if it seems insignificant by the statistical test. The conclusion for our example could be such that we might be satisfied with modelling the behaviour of the variable Y with a regression line, reflecting the effects of a single regressor. The other regressors seem to have no effect on Y ADJUSTED COEFFICIENT OF DETERMINATION Let us return to the coefficient of determination which was previously used to assess the suitability of a regression line. In the multivariate case, the coefficient is calculated the same way. For the last example, the coefficient is = 93.84/ = If insignificant variables were excluded from the model (except for the absolute term), we would work with a regression line. A change in the number of regressors will generally have an effect on the estimates of and in the new model. The least squares method results in a line = The coefficient of determination for the line is As can be seen, adding new variables to the model of line increased the coefficient of determination. This is a general property of the coefficient. More regressors in the model mean that the coefficient will not decrease. This property is a weakness of the criterion. As will be seen later, enriching a simpler model with an insignificant regressor worsens the precision of model parameters. The quality will be worse in the sense that their variances will rise. The coefficient of determination, however, does not capture this deterioration. On the contrary, it signals a model improvement. Therefore it is necessary to have another criterion available which would penalize a model that resulted from an extension of a simpler model through adding insignificant regressors. The socalled adjusted coefficient of determination ranks among the criteria that have this property (to an extent). It is defined as = 1 (1 ). (1.30) When another variable is added to the model, will not decrease, but for a low enough t-test statistic of this variable, will decrease. It is convenient to use the adjusted version of the 31

criterion when the quality of two models is compared, one of the models being an extension of the other. The higher the criterion 1-30, the better the model. The adjusted coefficient can also take on negative values; when this happens, it is taken to be zero by definition.

TEST OF MODEL

The end of the chapter on classical regression is devoted to a statistical test often used in relation to regression: the F-test of model significance. The test examines the validity of the hypothesis

$$H_0: \beta_1 = \beta_2 = \dots = \beta_k = 0.$$

Thus, the test asks whether any of the regressors is significant. The t-tests of significance concerned each regressor alone. Although it might seem proper to test the significance of the regressors by applying a series of t-tests, such a procedure, as is known in statistics, is not adequate in the statistical sense of the word. If the regression model is of the form 1-19, the F-test criterion can be written as

$$F = \frac{(S_T - S_r)/k}{S_r/(n-k-1)}, \qquad (1.31)$$

where $S_T$ is the total and $S_r$ the residual sum of squares; the statistic has the Fisher distribution with k and n − k − 1 degrees of freedom if the null hypothesis is true. If the criterion is greater than (or equal to) the critical value of the distribution, $F_{k,\,n-k-1}(\alpha)$, the null hypothesis is rejected at the significance level $\alpha$ of the test. Since $R^2 = 1 - S_r/S_T$, expression 1-31 can also be written as

$$F = \frac{R^2/k}{(1-R^2)/(n-k-1)}. \qquad (1.32)$$

Returning to Example 2, we can test the significance of the original model with all three explanatory variables. Using 1-32, we have

$$F = \frac{0.87/3}{(1-0.87)/(8-3-1)} = 8.92,$$

which is greater than the critical value $F_{3,4}(0.05) = 6.59$. The hypothesis that the model is insignificant is rejected.

At the very end, it is important to note that it can happen that all regressors of a model are insignificant, as suggested by the t-tests, and yet the model as a whole is significant by the F-test. It may even be said that this situation occurs fairly often. The contradiction suggests there is a strong multicollinearity in the model, a problem we will deal with in the subsequent sections of this text.

In a final but important note, let us say that many of the results concerning the statistical properties of the parameter estimates can be attained without the assumption of normality. Some of the results are weaker (for instance, the estimates are not the best unbiased but only the best linear unbiased), and some statistical properties hold only asymptotically.
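The quantities discussed in this chapter (the least squares estimate, the coefficient of determination, its adjusted version, the t-statistics and the F-statistic) can all be computed directly from the matrix formulas. The following Python sketch illustrates the calculations on a small artificial data set; the data values and the random seed are invented for illustration only and do not reproduce Example 2.

```python
import numpy as np
from scipy import stats

# Artificial illustrative data: n = 8 observations, three regressors plus an absolute term.
rng = np.random.default_rng(1)
n, k = 8, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
y = X @ np.array([1.0, 2.0, 0.0, 0.0]) + rng.normal(scale=0.5, size=n)

p = X.shape[1]                           # number of parameters (k regressors + intercept)
b = np.linalg.solve(X.T @ X, X.T @ y)    # least squares estimate b = (X'X)^{-1} X'y
e = y - X @ b                            # residuals
S_r = e @ e                              # residual sum of squares
S_T = np.sum((y - y.mean())**2)          # total sum of squares
R2 = 1 - S_r / S_T
R2_adj = 1 - (n - 1) / (n - p) * (1 - R2)        # formula 1-30

# F-test of model significance (formula 1-32)
F = (R2 / k) / ((1 - R2) / (n - k - 1))
F_crit = stats.f.ppf(0.95, k, n - k - 1)

# t-tests of the individual coefficients
s2 = S_r / (n - p)                       # estimate of sigma^2
var_b = s2 * np.linalg.inv(X.T @ X)      # estimated covariance matrix of b
t_stats = b / np.sqrt(np.diag(var_b))
t_crit = stats.t.ppf(0.975, n - p)

print(b, R2, R2_adj, F, F_crit, t_stats, t_crit)
```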

33 Summary of terms: - Regression line - Multivariate regression - Linear regression model - Model coefficients - Random component of the model - Regressor - Explained and explanatory variable - Classical regression - The least squares estimation - Normal equations - Statistical inference in regression - T-test of coefficient significance - F-test of model significance - Confidence interval for a model coefficient - Expected value of a coefficient - Variance of a coefficient - Covariance matrix of vector of coefficients - Linear estimate - Unbiased and best unbiased estimate - Best linear unbiased estimate - Consistent estimate - Asymptotic distribution - Fitted value - Residual, residual sum of squares - Coefficient of determination - Adjusted coefficient of determination - Statistical, economic and econometric verification of a model Questions 1. Given the data in the following table, estimate the regression line (= model 1), describe the dependence of the variable y on the variable x, and calculate the fitted values of y, using the model. Make a judgement on the quality of the model, using the coefficient of determination. x y Source: own 33

34 2. The table below contains data on three variables. Find estimates of the unknown parameters in model 2: = + + +, and evaluate the quality of the model with the coefficient of determination. x x y Source: own 3. Compare Models 1 and 2 with the coefficient of determination and its adjusted version. Discuss which of the models seems more suitable. 4. Estimate the variances of the coefficients, from the models 1 and 2, and compare the variances. For which of the two models are the variances greater? 5. Test the significance of the regression coefficients in Model 1 and 2 (alpha = 0.05). 6. Construct a 95% confidence interval for the parameter of Model 1 and 2. Compare the intervals. Which of the intervals is broader? Explain why. 7. What is the difference between the random component of the model and its residual. 8. Explain the advantage of the adjusted coefficient of determination, as compared to its unadjusted version. 9. Explain the main idea behind the least squares estimation. 10. Explain the idea of a consistent estimate. Answers to questions 1) 1 b b (X X) X y b The fitted values yˆ satisfy i yˆi b0 b1 x i, and so x y-fitted The coefficient of determination is 34

35 R 2) 13 2 i1 13 i1 2 yˆ i y y y i 1 b b b b (X X) X y R 13 2 i1 13 i1 2 yˆ i y y y i 3) The coefficients of determination are similar for both models. Thus, from the practical point of view, the two models are more or less of the same quality. The problem is that the coefficient will be greater automatically for the second model because there are more regressors in the second model. Since the models have a different number of regressors, it is more suitable to compare their quality with the adjusted coefficient of determination: Model 1: R adj. (1 (1 R )) Model 2: R adj The two coefficients are again similar, though not the same. The coefficient is a bit higher for the first model, signalling that the second regressor x 2 does not have a significant influence on the variable y. 4) For Model 1, (XX) and the residual sum of squares is 13 2 ( y ˆ i yi ) Therefore, the estimate of the variance of b0 i1 is (3.77 / (13 2)) For b 1, we have (3.77 / (13 2)) For Model 2, (XX)

36 and the residual sum of squares is 13 2 ( y ˆ i yi ) Thus, the variance estimate for b0 is i1 (3.704 / (13 3)) For b 1, the estimate is (3.704/ (13 3)) In model 2, both coefficients have more than twice as large variances as those in Model 1. 5) The test criterion is T b s b ), where s b ) j ( j ( j is the estimate of the standard deviation of b j, or its standard error. The criterion is compared to the critical value of the Student s t distribution with 13-2 = 11 degrees of freedom (the case of model 1) or 13-3 = 10 degrees of freedom (the case of Model 2). Model 1: T Crit. value b(0) b(1) Both parameters can be regarded as significant, since the absolute value of T exceeds the critical value in both cases. Model 2: T Crit. value b(0) b(1) b(2) The test confirmed that the variable x 2 does not have a significant effect on y, as has been demonstrated by the adjusted coefficients of determination for Model 1 and 2. It is not reasonable to include the variable in the model (it is not reasonable to extend Model 1 to Model 2) because it is probably the primary cause of higher variances of the coefficients in Model 2. We shall discuss this phenomenon in the second chapter of the text. 6) The general form of the confidence interval is ( b1 s( b1 ) tn p ( ), b1 s( b1 ) tn p ( )). For Model 1, the interval is ( 1.34, 1.06), for Model 2, the interval is ( 1.39, 0.94). The second interval is 1.5 times wider than the first interval, which is a result of the higher variance of the estimated regressor. 7) Residual is an estimate. 8) See section ) See section ) See section

2 PROBLEMS IN CLASSICAL REGRESSION

Goal: This chapter covers various problems that may appear in classical regression. These problems concern measurement errors in data, the problem of how many regressors should be included in the model, the problem of multicollinearity, heteroscedasticity and autocorrelation.

Time to learning: 11 hours.

So far, we have assumed that all the conditions of classical regression are satisfied. This means we have worked under ideal conditions. Starting in this chapter, the theory will begin to depart from the ideal state, since it does not happen very often in reality that all the conditions are met. Violations of the conditions have different negative effects on the model being analysed. Some of the problems are related to how the model is formulated, others are linked to imperfections in the data used for building the model and are not related to the model itself. The explanation of the theory that follows will begin with the problem of multicollinearity, the problem of how many regressors should be inserted in the model, and the problem of measurement errors that might occur when gathering the data. The frequently occurring problems of heteroscedasticity and autocorrelation will then follow.

2.1 MULTICOLLINEARITY

The problem of multicollinearity arises when the condition of the linear independence of the columns of matrix X is either violated or almost violated. It has been explained that when applying the least squares method, the linear dependence of the columns prohibits using 1-23 to get estimates of the unknown coefficients. This is the case of so-called perfect multicollinearity. When this happens, one or more explanatory variables (regressors) can be expressed as linear combinations of the other explanatory variables. Multicollinearity rarely occurs in its perfect form, and when it does, it rather suggests that the model was ill-constructed (for instance, too many regressors were included in the model). More often, the condition of the linear independence of the columns of X is satisfied, i.e. the matrix has full rank, equal to the number of its columns, but an approximate linear dependence among the regressors nevertheless exists to a certain degree. In that case, we talk about the problem of imperfect multicollinearity. This type of approximate dependence substantially increases the variances of the coefficient estimates, which means that the precision of the estimates is reduced. As a result, the estimates will vary substantially from one data sample to another. These consequences are shown in Example 3, which follows below. Another consequence can be seen in the construction of the t-test criterion, used to test the significance of model coefficients.
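The effect of exact and near linear dependence on the matrix $X'X$ is easy to demonstrate numerically. The following Python sketch (invented data) builds a matrix of regressors whose last column is an exact linear combination of the intercept and the first regressor, so that $X'X$ is singular and the least squares formula cannot be applied, and then shows how an almost collinear column inflates the diagonal of $(X'X)^{-1}$, to which the coefficient variances are proportional.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 8
x1 = rng.normal(size=n)

# Perfect multicollinearity: x2 is an exact linear combination of the intercept and x1.
x2_perfect = 3.0 - 2.0 * x1
X_perfect = np.column_stack([np.ones(n), x1, x2_perfect])
print(np.linalg.matrix_rank(X_perfect.T @ X_perfect))   # rank 2 < 3, so (X'X)^{-1} does not exist

# Imperfect multicollinearity: x2 is almost, but not exactly, such a combination.
x2_near = 3.0 - 2.0 * x1 + rng.normal(scale=0.01, size=n)
X_near = np.column_stack([np.ones(n), x1, x2_near])

# Compare with the case of an unrelated second regressor.
x2_indep = rng.normal(size=n)
X_indep = np.column_stack([np.ones(n), x1, x2_indep])

print(np.diag(np.linalg.inv(X_near.T @ X_near)))    # very large diagonal entries
print(np.diag(np.linalg.inv(X_indep.T @ X_indep)))  # much smaller diagonal entries
```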

A stronger linear dependence among regressors distorts the conclusions based on the t-tests: the null hypothesis is accepted more often than it should be, due to the increased variance of the coefficient estimates. Thus, some regressors will appear to be insignificant only because their values were not defined properly. The increased variances of the coefficient estimates will, of course, also affect the width of the confidence intervals constructed for the model coefficients. To summarize, multicollinearity reduces the precision of the model as well as of conclusions made on the basis of statistical inference. On the positive side, multicollinearity does not introduce any bias into the estimated coefficients; they remain unbiased. This is evident from the derivation of unbiasedness, in which a specific form of the matrix of regressors played no role.

Multicollinearity originates for many reasons. It can be the case that regressors whose natural character is such that they simply are related to each other enter the model. This is often the case in the economic sphere, where one variable develops in accordance with another. Another reason occurs when lagged variables are used in the model. Such models are called dynamic, since they capture the dynamics of time: not only is the present value of a regressor used in the model, but also its values from the past, $x_{t-1}, x_{t-2}, \dots$. The model may then look like this: $y_t = \beta_0 + \beta_1 x_t + \beta_2 x_{t-1} + \beta_3 x_{t-2} + \varepsilon_t$. Such regressors are usually strongly correlated due to a certain inertia embedded in their development over time. When discrete regressors (variables whose range of values is a finite set) are used in modelling, multicollinearity may even be perfect when the model is wrongly specified. Such situations are usually mentioned in remarks related to the modelling of seasonality, which serve as a warning against perfect multicollinearity. The following example portrays the consequences of multicollinearity.

EXAMPLE 3

Using the data from Table 9, estimate the coefficients in the model $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon$ and calculate their variances.

Table 9: Data for Example 3 (columns y, $x_1$, $x_2$; source: own)

Solution: The least squares method gives the estimates $b_0 = 7.73$, $b_1 = 2.76$, $b_2 = 0.03$, and the residual sum of squares is $S_r = 40.01$. Thus, $s^2 = 40.01/(8-3) = 8$.

Using $s^2$ and the inverse $(X'X)^{-1}$, we get the following estimated variances of the estimated coefficients: $\widehat{\operatorname{var}}(b_0) = 2.5$, $\widehat{\operatorname{var}}(b_1) = 0.58$ and $\widehat{\operatorname{var}}(b_2) = 0.2$. The two regressors are not highly correlated: the correlation coefficient equals 0.1.

Let us assume now that the values of the regressors are as depicted in Table 10. The correlation between the two regressors is now much stronger.

Table 10: Alternative values of the regressors (columns $x_1$, $x_2$, y)

We have not inserted the values of Y, since they would be different for the different values of the regressors. Let the values be such that the estimate $s^2$ remains the same. This allows us to show the effect of the different values of the regressors on the variance estimates. Using the new inverse $(X'X)^{-1}$, we now have $\widehat{\operatorname{var}}(b_0) = 191.2$, $\widehat{\operatorname{var}}(b_1) = 4.64$ and a correspondingly higher value of $\widehat{\operatorname{var}}(b_2)$ as well. What a difference! The variance of the absolute term, for instance, rose more than 76 times!

When analysing multicollinearity, we are mainly interested in detecting its approximate form. Several techniques exist for these purposes. A simple approach lies in constructing the sample correlation matrix, which contains the pairwise correlation coefficients for the individual pairs of model regressors. If any of the matrix elements is greater than 0.8 in absolute value, this suggests the linear dependence is harmful. This approach is simple, but has a major disadvantage: it measures the dependence among the regressors only by pairwise correlation. More complex linear relations among them are not taken into account. Therefore it is recommended that multiple correlation coefficients be calculated. Such a coefficient measures the strength of the linear dependence of a selected regressor on all the other regressors. If any of the coefficients is high, multicollinearity is present. It is a known fact that the multiple correlation coefficient is related to the coefficient of determination from the auxiliary regression, and so the coefficient of determination $R_j^2$ itself may serve as an indicator of multicollinearity. As is known, the square root of the coefficient of determination gives the multiple correlation coefficient. There are various rules of thumb, based on experience from empirical studies, which suggest how high the coefficient of determination must be in order for the multicollinearity to be harmful. These rules say that if the coefficient is higher than 0.8, one should be concerned with the multicollinearity.
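The auxiliary-regression idea just described is straightforward to carry out numerically. The following Python sketch (artificial data, with the third regressor deliberately constructed to be nearly collinear with the first two) regresses each regressor on the remaining ones and reports the corresponding coefficient of determination $R_j^2$ and the multiple correlation coefficient, its square root.

```python
import numpy as np

def aux_r2(X, j):
    """Coefficient of determination of the auxiliary regression of column j of X
    on all the other columns (an intercept is added to the auxiliary regression)."""
    y = X[:, j]
    Z = np.column_stack([np.ones(len(y)), np.delete(X, j, axis=1)])
    b = np.linalg.lstsq(Z, y, rcond=None)[0]
    e = y - Z @ b
    return 1 - (e @ e) / np.sum((y - y.mean())**2)

rng = np.random.default_rng(3)
n = 50
x1, x2 = rng.normal(size=n), rng.normal(size=n)
x3 = 0.5 * x1 - 1.5 * x2 + rng.normal(scale=0.05, size=n)   # nearly collinear regressor
X = np.column_stack([x1, x2, x3])

for j in range(X.shape[1]):
    r2 = aux_r2(X, j)
    print(f"regressor {j+1}: R_j^2 = {r2:.3f}, multiple correlation = {np.sqrt(r2):.3f}")
# Values of R_j^2 above roughly 0.8-0.9 signal harmful multicollinearity.
```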

Sometimes a less strict criterion is used, advising caution when the coefficient is higher than 0.9. Last but not least, the test of significance of $R_j^2$ is recommended as well, i.e. auxiliary regressions of the form

$$x_{ij} = \alpha_0 + \sum_{m \ne j} \alpha_m x_{im} + u_i, \qquad j = 1, 2, \dots, k, \quad i = 1, 2, \dots, n, \qquad (2.1)$$

are potentially assumed and the F-tests of model significance are run for these regressions. For model 2-1, the test criterion is

$$F_j = \frac{R_j^2/(k-1)}{(1-R_j^2)/(n-k)}.$$

If any of the regressions is confirmed as significant, multicollinearity is present. Sometimes it is also a good habit to compare $R_j^2$ with the coefficient of determination $R^2$ calculated for the original regression model describing the dependence of Y on the regressors. If any of the coefficients $R_j^2$ is greater than $R^2$, the multicollinearity is again considered harmful. Since the multiple correlation coefficient is closely related to $R_j^2$, it can also be used as an indicator of a too strong linear relation among the regressors.

Having detected a too strong multicollinearity, the question is how to solve the problem. There are several ways of lessening the problem, although none of them necessarily leads to a completely satisfactory result. One possibility is to expand the size of the data sample being processed. Of course, there are two problems with this approach: gathered data are usually precious, and no expansion of the experiment might later be available, for financial reasons, for instance; also, this simple technique assumes that multicollinearity is milder in the corresponding wider population. If this is not true, one cannot expect any improvement in the character of the expanded data sample. Even controlled data expansion, leading to a muted linear dependence, requires a careful setting of the additional values of the regressors by the analyst, and this may be an overly tricky thing to do. Another possibility is to try to find other information about the model itself, not in the form of additional data. This information may take the form of an equality, an inequality or another mathematical constraint that should be incorporated into the model. Such information reduces the variances of the estimated coefficients, although the information must be accurate or, more generally, not too inaccurate. If there is any amount of inaccuracy in the information, the consequence is that the estimated coefficients will have a lower variance, but they will not be unbiased anymore, which complicates statistical inference. Yet another possibility is to exclude from the model the regressor which generates the unwanted linearity among the regressors. In the better case, such a regressor may actually have no effect on the modelled variable Y (see chapter 2.2). In the worse case, when it has a nonzero effect on Y, its exclusion may, as we shall learn, bring more harm than benefit. Other special techniques for dealing with multicollinearity exist as well. They include the so-called ridge regression and the technique of principal components. It must be stressed that these techniques are not always welcome, since they may reduce the variances of the estimated coefficients but render them biased, generally speaking. Biasedness is not a desired property

because it complicates the statistical inference to be done with the resulting model. If the amount of biasedness were known, inference could still be performed, but with these techniques it depends on unknown parameters. The method of principal components reduces the number of variables used so that the loss of information incurred is minimal in a certain sense, but the technique is sensitive to the physical units used for the regressors, and the interpretation of the estimated coefficients is also complicated, since the resulting estimates are mixtures of the original estimates. Selecting a reasonable approach to solving the problem of multicollinearity depends on the character of the situation, the experience of the analyst, and also on the purpose for which the model is to be utilized. The best way to avoid multicollinearity is prevention, of course, which means a suitable design of the matrix of regressors, if at all possible.

2.2 NUMBER OF REGRESSORS

The conclusions based on regression analysis depend on what regressors were used in the model. An optimal scenario is such that the model works only with significant regressors, having put aside the insignificant ones. Of course, it is usually not known whether this is the case. However, an analysis can be done as to what happens to the properties of the estimated coefficients if the model contains one or more insignificant variables or if, on the contrary, it lacks one or more significant regressors.

Let us assume that the correct form of the model is $y = X\beta + \varepsilon$, whereas a model of the form $y = X\beta + Z\gamma + \varepsilon$ is worked with. Here, $Z$ represents a matrix of insignificant regressors. This is the situation when there are too many regressors in the model. In the same way as in 1-25, it is straightforward to show that the coefficients estimated by the least squares will remain unbiased. There is, however, a problem with their variance (their efficiency). Only the greatest optimist can expect the additional insignificant regressors not to have any linear relation, at least to an extent, to the other, significant regressors enclosed in $X$. Usually there will be, at least approximately, a certain linear dependence between the values of the significant and the insignificant regressors. This will mean, as was seen in the case of multicollinearity, that the variances of the estimated coefficients will generally rise. In other words, the estimates will be less efficient. However, since unbiasedness is maintained, statistical inference can still be made (tests and intervals).

An opposite situation arises when some important regressors are missing from the model. Mathematically speaking, we work with the model $y = X\beta + \varepsilon$ instead of the true model $y = X\beta + Z\gamma + \varepsilon$, where $Z$ is the matrix of significant regressors. In this case, the consequences are far worse. The least squares method yields biased and also inconsistent estimates of the regression coefficients. The truth is that a lower variance of the estimates accompanies this procedure, but the biasedness inhibits statistical inference. The inference is also hampered by the biasedness of the estimate $s^2 = e'e/(n-p)$ in this case.
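Both situations, an unnecessary extra regressor and a missing relevant regressor, can be illustrated with a small Monte Carlo experiment. The following Python sketch uses invented data and parameter values; it compares the average value and the spread of the estimated coefficient of $x_1$ under the correct model, under a model with a superfluous regressor, and under a model that omits a relevant regressor correlated with $x_1$.

```python
import numpy as np

rng = np.random.default_rng(4)
n, reps = 40, 2000
beta1, beta2 = 2.0, 1.5          # true coefficients of x1 and x2

est_correct, est_extra, est_omit = [], [], []
for _ in range(reps):
    x1 = rng.normal(size=n)
    x2 = 0.6 * x1 + rng.normal(scale=0.8, size=n)   # relevant regressor, correlated with x1
    x3 = 0.7 * x1 + rng.normal(scale=0.5, size=n)   # irrelevant regressor, correlated with x1
    y = 1.0 + beta1 * x1 + beta2 * x2 + rng.normal(size=n)

    def slope_of_x1(cols):
        X = np.column_stack([np.ones(n)] + cols)
        b = np.linalg.lstsq(X, y, rcond=None)[0]
        return b[1]                                  # coefficient of x1

    est_correct.append(slope_of_x1([x1, x2]))        # correct specification
    est_extra.append(slope_of_x1([x1, x2, x3]))      # superfluous regressor added
    est_omit.append(slope_of_x1([x1]))               # relevant regressor x2 omitted

for name, est in [("correct", est_correct), ("extra", est_extra), ("omitted", est_omit)]:
    est = np.array(est)
    print(f"{name:8s}: mean = {est.mean():.3f}, std = {est.std():.3f}")
# Typical outcome: 'extra' stays centred at 2 but with a larger spread,
# 'omitted' is biased away from 2 because x2 is correlated with x1.
```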

2.3 MEASUREMENT ERRORS

Another type of problem that can be encountered in modelling has to do with measurement errors contained in the model variables. To keep the problem simple, we shall restrict the analysis to the case of the regression line. The character of the errors must be described mathematically so that their consequences become more obvious. What is usually assumed in this context is

$$y_i^* = y_i + v_i, \quad v_i \sim N(0, \sigma_v^2), \qquad x_i^* = x_i + w_i, \quad w_i \sim N(0, \sigma_w^2).$$

Here, $y_i$, $x_i$ are variables not encumbered with measurement errors, whereas $y_i^*$, $x_i^*$ are the variables containing errors; $v_i$, $w_i$ are normally distributed random variables with zero mean and their own specific variances. It is also assumed that $v_i$ and $w_i$ are uncorrelated, and the same is true about the pairs $v_i, \varepsilon_i$ and $w_i, \varepsilon_i$. What are the consequences of this situation? Let us look separately at errors contained in Y and errors contained in X.

ERRORS IN Y

Let us assume the correct form of the model is $y = X\beta + \varepsilon$, whereas we work with a model $y^* = X\beta + \varepsilon^*$, where $y^* = y + v$ and $\varepsilon^* = \varepsilon + v$. Then

$$b = (X'X)^{-1}X'y^* = (X'X)^{-1}X'(X\beta + \varepsilon + v) = \beta + (X'X)^{-1}X'(\varepsilon + v).$$

Thus

$$E(b) = \beta + (X'X)^{-1}X'E(\varepsilon + v) = \beta.$$

Therefore, the vector of estimates remains an unbiased estimate. As far as the variance of the estimate is concerned, i.e. its covariance matrix, we get, similarly as in 1-26,

$$\operatorname{var}(b) = (\sigma^2 + \sigma_v^2)(X'X)^{-1}.$$

Since $\sigma_v^2 > 0$, the estimate is no longer efficient. We can sum up that when the explained variable is measured with errors, the consequences for the estimate of the unknown coefficients are the same as when the model contains too many regressors.

ERRORS IN X

In this case, a model of the form $y = X^*\beta + \varepsilon^*$, $X^* = X + W$, is analysed, where $\varepsilon^* = \varepsilon - W\beta$, so that the model has the same outward form as before. The expected value satisfies

$$E(b) = \beta + E\{(X^{*\prime}X^*)^{-1}X^{*\prime}\varepsilon^*\} = \beta + E\{(X^{*\prime}X^*)^{-1}X^{*\prime}(\varepsilon - W\beta)\}.$$

Here, the far-right term is not equal to a zero vector because the matrix $X^*$ contains the random component $W$, and so the vector of regressors and the random component are not generally uncorrelated. Because of that,

$$E\{(X^{*\prime}X^*)^{-1}X^{*\prime}(\varepsilon - W\beta)\} \ne 0.$$

The result is that the estimate of the unknown coefficients is generally biased, which prevents statistical inference. Moreover, it can be shown that it is also generally inconsistent, so we cannot rely on the large-sample properties of least squares estimates either. The consequences of measurement errors contained in regressors are far more serious than in the case of measurement errors contained in the dependent variable. The same results are obtained for multivariate regression, where, however, the situation is even worse in the sense that it suffices to have a measurement error in only one regressor and all the other regressors will be contaminated by error as well.

The fundamental principle in dealing with measurement errors is prevention. This means that one should, for instance, draw on data that are officially published. It is not always possible to use this strategy, however. A technique of so-called instrumental variables might then be utilized as a method that solves problems with measurement errors. The technique provides estimates that are biased, but consistent, asymptotically unbiased and asymptotically normally distributed under fairly mild conditions. Let us demonstrate the idea behind the technique in the case of the regression line $y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$, $i = 1, 2, \dots, n$. The equation implies that $\bar{y} = \beta_0 + \beta_1\bar{x} + \bar{\varepsilon}$, so that we can write

$$(y_i - \bar{y}) = \beta_1(x_i - \bar{x}) + (\varepsilon_i - \bar{\varepsilon}), \qquad i = 1, 2, \dots, n. \qquad (2.2)$$

Equation 2-2 can be rewritten in the form $y_i^* = \beta_1 x_i^* + \varepsilon_i^*$, $i = 1, 2, \dots, n$, where the symbols with an asterisk denote the corresponding deviations. Multiplying the equation by the i-th value of a variable $z$, which will be specified in a moment, and summing both sides of the equation for $i = 1, 2, \dots, n$, we have

$$\sum_i z_i^* y_i^* = \beta_1 \sum_i z_i^* x_i^* + \sum_i z_i^* \varepsilon_i^*.$$

Let us note that if this equation were multiplied by (1/n), the left-hand side of the multiplied equation would denote an estimate of the covariance between Z and Y, and covariances would also be estimated on the right-hand side of that equation. If Z, the so-called instrumental variable, is selected in such a way that it is uncorrelated with $\varepsilon$ but correlated with x, this will be reflected in the sample, and the bigger the sample, the better. We may then, using the sample values of the variables, calculate the coefficient estimates as

$$b_1 = \frac{\sum_i z_i^* y_i^*}{\sum_i z_i^* x_i^*}, \qquad b_0 = \bar{y} - b_1\bar{x}.$$

The procedure can also be generalized and formulated for the case of multivariate regression. It must be stressed, however, that it is not always straightforward to find variables with the aforementioned statistical properties, i.e. variables uncorrelated with the random component of the model but correlated with the regressors. As the number of regressors contaminated with measurement errors increases, the difficulty of finding instrumental variables increases as well. Again, it is much more convenient to pay a lot of attention to prevention when it comes to measurement errors.
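The attenuation of the slope estimate caused by errors in x, and the way an instrumental variable restores consistency, can be illustrated by simulation. The Python sketch below uses invented data and parameter values; the variable z is constructed so that it is correlated with the true x but not with the measurement error or the random component, as the method requires.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 5000
beta0, beta1 = 1.0, 2.0

x = rng.normal(size=n)                    # true regressor (unobserved in practice)
z = x + rng.normal(scale=0.5, size=n)     # instrument: correlated with x, not with the errors
y = beta0 + beta1 * x + rng.normal(size=n)
x_obs = x + rng.normal(scale=1.0, size=n) # observed regressor, contaminated by error

# Ordinary least squares on the contaminated regressor (biased towards zero)
xd, yd = x_obs - x_obs.mean(), y - y.mean()
b1_ols = np.sum(xd * yd) / np.sum(xd * xd)

# Instrumental variable estimate: b1 = sum(z* y*) / sum(z* x*), working with deviations
zd = z - z.mean()
b1_iv = np.sum(zd * yd) / np.sum(zd * xd)
b0_iv = y.mean() - b1_iv * x_obs.mean()

print(f"OLS slope: {b1_ols:.3f}  (true value {beta1})")
print(f"IV  slope: {b1_iv:.3f},  IV intercept: {b0_iv:.3f}")
```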

Sometimes, however, the nature of a regressor is such that errors are unavoidable in its case.

2.4 HETEROSCEDASTICITY

In the previous chapters, we dealt with problems we can influence to a certain extent. To give an example, the problem of multicollinearity is such that it can be alleviated by selecting proper values of the regressors, provided that measurements of Y can be obtained for the modified values of the regressors. We can also affect the number of regressors, which, as was seen, may lead to some problems as well, and the same is true, to an extent, about measurement errors. Starting with this chapter, our attention will be turned to problems we cannot affect, because they are connected to the probability distributions that govern the behaviour of the random component. If this is the case, then we can either reconcile ourselves to this fact, or we can try to modify the regression procedures to lessen the extent of the problem. We shall start with the problem of heteroscedasticity. From now on, only the case of multivariate regression will be the subject of the discussion, so that the conclusions are general enough, regardless of how many regressors there are in the model.

FORMULATION OF THE MODEL AND ITS CONSEQUENCES

The classical multivariate regression model is altered when heteroscedasticity is present: it is assumed that $\operatorname{var}(\varepsilon_i) = \sigma_i^2$, $i = 1, 2, \dots, n$. In other words, the random components do not have the same variance any more. We shall assume that all other classical regression conditions are still valid. The covariance matrix of the vector of random components is now of the form

$$E(\varepsilon\varepsilon') = \begin{pmatrix} \sigma_1^2 & 0 & \cdots & 0 \\ 0 & \sigma_2^2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma_n^2 \end{pmatrix}. \qquad (2.3)$$

Expression 2-3 is usually written as $\sigma^2\Omega$, which means that the elements on the main diagonal of the matrix are expressed as multiples of a common scale factor $\sigma^2$. It will become clear later why this form of the covariance matrix is preferred. Heteroscedasticity is typically encountered in cross-sectional data. These are data that usually describe relations among subjects at one point in time. The development of these relations in time, if such a development exists at all, is of no primary interest in cross-sectional data analysis. In time series analysis, on the contrary, where the time development of one or more variables is observed in the first place, heteroscedasticity is not typically present.

As in the case of multicollinearity, heteroscedasticity originates for many reasons. It may be the case that an important explanatory variable is missing from the model. Heteroscedasticity can also be caused by measurement errors. This happens when the errors accumulate with an increase of one of the explanatory variables; the accumulation increases the variability of the random component of the model. As shall be seen, nonconstant variances also appear when the analyst stops working with the original data and starts to work with their averaged values. This is, generally speaking, the problem of aggregated data. Many other reasons explaining the origin of heteroscedasticity could be presented.

Regarding the consequences of heteroscedasticity when the least squares method of parameter estimation is used, the expected value of the estimate is

$$E(b) = \beta + (X'X)^{-1}X'E(\varepsilon) = \beta. \qquad (2.4)$$

The estimate is still unbiased. This is obvious, because the variability of the random components does not enter relation 2-4 in any way. As far as the covariance matrix of the estimate is concerned, the situation is different. Since $\operatorname{var}(Av) = A\operatorname{var}(v)A'$ for any nonstochastic matrix A and any random vector v, we have

$$\operatorname{var}(b) = \operatorname{var}\!\left((X'X)^{-1}X'\varepsilon\right) = \sigma^2 (X'X)^{-1}X'\Omega X(X'X)^{-1}. \qquad (2.5)$$

If heteroscedasticity is not present, matrix 2-5 collapses to the matrix known from classical regression, since then $\Omega = I$. It can hardly be expected that the expression on the right-hand side of 2-5 is generally equal to $\sigma^2(X'X)^{-1}$, the term obtained in classical regression. Additionally, if we realize that the vector of estimates is efficient when the classical regression conditions are satisfied, it is not surprising that the estimate whose covariance matrix is 2-5 cannot be more efficient than the one obtained in classical regression. Truly, its efficiency is lower, and it is not, generally speaking, the best linear unbiased estimate any more. The question of how much lower its efficiency is cannot be answered in general, however, as it differs from case to case. The lower efficiency of b is not the only consequence of heteroscedasticity. Another consequence is that the estimate $s^2 = e'e/(n-p)$ of $\sigma^2$ is biased, so any statistical inference based on the matrix $s^2(X'X)^{-1}$ is inappropriate, quite apart from the fact that this matrix has a different form compared with 2-5. The t-test of an individual model coefficient is based on this matrix, and the F-test of model significance is also designed under the condition of homoscedasticity. Thus, the tests are invalid.

THE GENERALIZED LEAST SQUARES

As we have seen, heteroscedasticity causes some problems: the parameters estimated by ordinary least squares cease to be precise enough, and similar consequences concern the subsequent statistical inference, if the inference can be done at all. The question is how to proceed when heteroscedasticity is present. One of the methods that tries to remove, or at least diminish, the problem is the so-called generalized least squares method. The main idea of this procedure is to transform the data, or the model, so that it satisfies the condition of homoscedasticity. The method then applies the ordinary least squares to the transformed model. To demonstrate the technique, let us assume that a regular matrix P can be found so that $P'P = \Omega^{-1}$. Multiplying the model $y = X\beta + \varepsilon$ by P from the left, we have

$$Py = PX\beta + P\varepsilon. \qquad (2.6)$$

Now, the transformation implies

$$E(\varepsilon^*\varepsilon^{*\prime}) = E(P\varepsilon\varepsilon'P') = P\,E(\varepsilon\varepsilon')\,P' = \sigma^2 P\Omega P' = \sigma^2 I,$$

where $\varepsilon^* = P\varepsilon$. Thus, rewriting 2-6 as

$$y^* = X^*\beta + \varepsilon^*, \qquad (2.7)$$

where $y^* = Py$, $X^* = PX$, $\varepsilon^* = P\varepsilon$, we can say that the random component now satisfies all the conditions of classical linear regression. Therefore, the proper estimate of the unknown coefficients satisfies

$$b = (X^{*\prime}X^*)^{-1}X^{*\prime}y^* = (X'P'PX)^{-1}X'P'Py = (X'\Omega^{-1}X)^{-1}X'\Omega^{-1}y. \qquad (2.8)$$

The estimate was obtained under the conditions of classical regression, and so it is the best linear unbiased estimate, or even the best unbiased estimate under normality. Its covariance matrix is

$$\operatorname{var}(b) = \sigma^2(X^{*\prime}X^*)^{-1} = \sigma^2(X'P'PX)^{-1} = \sigma^2(X'\Omega^{-1}X)^{-1}. \qquad (2.9)$$

Let us use a simple example to demonstrate what the matrix P could look like. Assuming that the variance of the i-th random component of the model is $\sigma_i^2 = \sigma^2 x_{ik}^2$, i.e. it rises as the value of the k-th regressor increases, the covariance matrix of the vector of random components is

$$E(\varepsilon\varepsilon') = \sigma^2\begin{pmatrix} x_{1k}^2 & 0 & \cdots & 0 \\ 0 & x_{2k}^2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & x_{nk}^2 \end{pmatrix}.$$

Since $E(\varepsilon\varepsilon') = \sigma^2\Omega$,

$$\Omega^{-1} = \begin{pmatrix} 1/x_{1k}^2 & 0 & \cdots & 0 \\ 0 & 1/x_{2k}^2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & 1/x_{nk}^2 \end{pmatrix}.$$

Now we are required to find the matrix P so that $P'P = \Omega^{-1}$. It is easy to see that

$$P = \begin{pmatrix} 1/x_{1k} & 0 & \cdots & 0 \\ 0 & 1/x_{2k} & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & 1/x_{nk} \end{pmatrix}. \qquad (2.10)$$
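The transformation described above is simple to carry out in practice. The following Python sketch uses artificial data whose random component has a variance proportional to the square of the second regressor; it compares the ordinary least squares estimate with the generalized (here weighted) least squares estimate obtained by dividing each observation by the corresponding value of that regressor, and checks the equivalent closed form 2-8.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 200
x1 = rng.uniform(1, 10, size=n)
x2 = rng.uniform(1, 10, size=n)
beta = np.array([2.0, 1.0, 0.5])

# Heteroscedastic random component: var(eps_i) proportional to x2_i^2
eps = rng.normal(scale=0.8 * x2)
X = np.column_stack([np.ones(n), x1, x2])
y = X @ beta + eps

# Ordinary least squares (unbiased here, but inefficient)
b_ols = np.linalg.solve(X.T @ X, X.T @ y)

# Generalized least squares: multiply each row by 1/x2_i (the matrix P of 2-10)
w = 1.0 / x2
Xs, ys = X * w[:, None], y * w
b_gls = np.linalg.solve(Xs.T @ Xs, Xs.T @ ys)

# Equivalent closed form b = (X' Omega^{-1} X)^{-1} X' Omega^{-1} y with Omega = diag(x2^2)
Omega_inv = np.diag(1.0 / x2**2)
b_gls2 = np.linalg.solve(X.T @ Omega_inv @ X, X.T @ Omega_inv @ y)

print(b_ols, b_gls, b_gls2)   # b_gls and b_gls2 coincide
```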

Expression 2-6 further implies that to write the general form of the regression model, i.e. to multiply equation 2-6 by matrix 2-10 from the left, means to perform the following division:

$$\frac{y_i}{x_{ik}} = \beta_0\frac{1}{x_{ik}} + \beta_1\frac{x_{i1}}{x_{ik}} + \dots + \beta_k + \frac{\varepsilon_i}{x_{ik}}, \qquad i = 1, 2, \dots, n.$$

Strictly speaking, $|x_{ik}|$ should be the divisor, but the simplification we have made by skipping the absolute value does not matter here, since the simplified transformation also leads to random components which satisfy the conditions of classical regression, that is, the condition of homoscedasticity, among other conditions.

This procedure can be used in a similar manner when group averages, or aggregated data, are used instead of the original data. In this case, the original n observations are divided into G groups, and an average observation is calculated within each group. This way, the original data are represented by group averages. This data organization spurs heteroscedasticity, since the variance of the random component of the model shifts as the size of the j-th group changes. This is the problem of aggregated data outlined earlier. Let us describe this situation formally. Instead of the original classical regression model $y_i = \beta_0 + \beta_1 x_{i1} + \dots + \beta_k x_{ik} + \varepsilon_i$, $i = 1, 2, \dots, n$, we work with a model

$$\bar{y}_j = \beta_0 + \beta_1\bar{x}_{j1} + \dots + \beta_k\bar{x}_{jk} + \bar{\varepsilon}_j, \qquad (2.11)$$

where $\bar{y}_j = \frac{1}{n_j}\sum y_i$, $\bar{x}_{jm} = \frac{1}{n_j}\sum x_{im}$, $\bar{\varepsilon}_j = \frac{1}{n_j}\sum\varepsilon_i$, for $j = 1, 2, \dots, G$, $m = 1, 2, \dots, k$, the sums running over the data in the j-th group of size $n_j$. There are G groups now. Since

$$\operatorname{var}(\bar{\varepsilon}_j) = \frac{1}{n_j^2}\,n_j\sigma^2 = \frac{\sigma^2}{n_j},$$

the variance of $\bar{\varepsilon}_j$ depends on the size of the j-th group. Thus, heteroscedasticity is present. A better way to estimate the parameters of 2-11 is therefore the technique of generalized least squares, or the corresponding transformation of 2-11 to 2-7, this being done by multiplying the j-th equation of 2-11 by $\sqrt{n_j}$. This is because the covariance matrix is in this case of the form

$$\Omega = \begin{pmatrix} 1/n_1 & 0 & \cdots & 0 \\ 0 & 1/n_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & 1/n_G \end{pmatrix},$$

which implies that

$$P = \begin{pmatrix} \sqrt{n_1} & 0 & \cdots & 0 \\ 0 & \sqrt{n_2} & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \sqrt{n_G} \end{pmatrix}.$$

We would like to make one final note concerning the group averages. To find the estimate of 2-11 means, as has just been presented, to find the parameters that minimize the expression

$$\sum_{j=1}^{G} n_j\left(\bar{y}_j - \beta_0 - \beta_1\bar{x}_{j1} - \dots - \beta_k\bar{x}_{jk}\right)^2.$$

The element $n_j$ may now be regarded as a weight. This is why the procedure is also labelled as the method of weighted least squares.

We know how heteroscedasticity can come into existence, what effect it has on the least squares estimates, and how it can be removed, at least theoretically. What remains is finding out how the problem can be detected. There are statistical tests for these purposes, and one of the most fundamental ones is the Goldfeld-Quandt test.

GOLDFELD-QUANDT TEST

Let the model of interest be $y = X\beta + \varepsilon$ with p unknown parameters and n observations. It is assumed that a heteroscedasticity of the form $\sigma_i^2 = \sigma^2 x_{ik}^2$, related to a selected regressor $x_k$, is suspected. The idea of the test lies in dividing the n observations into two groups and running a separate least squares regression in each group. A middle part of the original observations is left out and not used in the regressions. Using the two estimated regression models, residuals are calculated separately for each group, as well as the corresponding two residual sums of squares. A test criterion based on the sums is then constructed. There is often a discussion on how to create the two groups, because the conclusion of the test depends on the way the original data are divided. A general rule is to select the first and second group so that not more than one third of all the data are left out. The recommendation for the data division is such that the values of the selected regressor $x_k$ are sorted in ascending order, and if there are 30 observations altogether, the first regression is applied to the 11 observations of y corresponding to the 11 lowest sorted values of $x_k$, whereas the second regression is run using the 11 values of y related to the 11 highest values of $x_k$. Generally speaking, for n observations, it is recommended to skip the middle 8n/30 values. The number must be adjusted slightly if it is not an integer (see the example below). The test examines the validity of the null hypothesis $H_0$: the model satisfies the condition of homoscedasticity, against the alternative hypothesis $H_1: \sigma_i^2 = \sigma^2 x_{ik}^2$. We would like to draw attention to the formulation of the alternative hypothesis. If the form of heteroscedasticity is more complex, the test should not be used, and a different technique is required to detect the problem in the model.
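A minimal Python sketch of the test mechanics follows. The data are artificial; with 30 observations, the rule of thumb described above leaves out the middle 8 observations and fits a separate regression to the 11 lowest and the 11 highest values of the suspected regressor.

```python
import numpy as np
from scipy import stats

def rss(X, y):
    """Residual sum of squares of an ordinary least squares fit."""
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    e = y - X @ b
    return e @ e

rng = np.random.default_rng(7)
n = 30
x = rng.uniform(1, 10, size=n)
y = 2.0 + 1.5 * x + rng.normal(scale=0.4 * x)   # variance grows with x

# Sort by the suspected regressor and leave out the middle 8n/30 = 8 observations.
order = np.argsort(x)
m = n - round(8 * n / 30)            # 22 observations kept
m1 = m // 2                          # 11 in each group
lo, hi = order[:m1], order[-m1:]

X = np.column_stack([np.ones(n), x])
p = X.shape[1]
S1, S2 = rss(X[lo], y[lo]), rss(X[hi], y[hi])

T = max(S1, S2) / min(S1, S2)        # the larger sum of squares goes into the numerator
crit = stats.f.ppf(0.95, m1 - p, m1 - p)
print(f"T = {T:.2f}, critical value = {crit:.2f}, reject H0: {T >= crit}")
```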

The test also assumes multivariate normality of the vector of random components of the model. The test criterion is of the form

$$T = \frac{S_r(1)}{S_r(2)},$$

where $S_r(1)$ and $S_r(2)$ are the residual sums of squares calculated separately for each of the two groups. The numerator of T represents the bigger of the two sums of squares. The criterion is compared with the critical value of the Fisher distribution, $F_{M_1-p,\,M_2-p}(\alpha)$, for the significance level $\alpha$, where p is the number of parameters in the model, $M_1$ is the number of observations in the group corresponding to $S_r(1)$, and $M_2$ is the number of observations in the group corresponding to $S_r(2)$. If $T \ge F_{M_1-p,\,M_2-p}(\alpha)$, the null hypothesis is rejected. If the opposite inequality holds, the null hypothesis is accepted.

EXAMPLE 4

Table 11 represents the result of a poll run among 17 companies, which differ from one another by the time of their existence (variable $x_1$ in years) and by their annual net profit (variable $x_2$ in millions of crowns). The dependence of the companies' average monthly investments in information technology (variable y in thousands of crowns) on the two variables/regressors is examined. We shall perform the Goldfeld-Quandt test with respect to the variable $x_1$, and if the test turns out to be significant, we will use the generalized least squares method to adjust the model.

Table 11: Data of Example 4 (columns y, $x_1$, $x_2$; source: own)

Solution: There are 17 observations, so we shall leave out the middle $(8/30)\cdot 17 \approx 5$ values. Thus, the first group will contain 6 observations, and the same number of observations will be in the second group. We shall now comment on the results contained in the tables that follow. The following two tables are related to the test applied to $x_1$. The first six observations, utilized for the first regression, are in the left table; the regression leads to a fitted model for this group. The last six observations, used for the second regression, are in the right table; this regression results in another fitted model. The last columns of the tables contain the residuals of the related models.

50 The test criterion is = (1) (2) = = The significance level of the test is set at five per cent, which means that the critical value equals, (0.05) = Since the criterion exceeds the critical value, the hypothesis of homoscedasticity is rejected. Table 12:1 st group Table 13: 2 nd group y x 1 x 2 e y x 1 x 2 e Similarly, the test could be applied to the model with respect to the second regressor. The test has shown the first regressor causes a heteroscedasticity of the form =. Therefore, we shall improve the estimates of the coefficients in the original model by using the generalized least squares method. This means that each variable of the model will be multiplied by 1/, and the ordinary least squares will be applied to these transformed values. The transformed data are in Table 14. Table 14: Transformed data y x x Source: own Instead of the model = + + +, the model = is used, where = /, = / a = 1/. The ordinary least squares method, applied to the transformed model, gives the estimate = (,, ) = (6.13, 0.056, 0.664). The covariance matrix is in this case () = ,

51 where the first element on the diagonal is related to the parameter, the second element on the diagonal is related to and the last element, lying in the 3 rd row and 3 rd column of the matrix, concerns the parameter. If the ordinary least squares were applied to the original, untransformed model, the result would have been = (,, ) = ( 2.18, 6.813, 0.046) with the covariance matrix (see chapter ) () = ( ) ( )( ) = Comparing the results, it can be seen that the transformation led to an approximately twice as low variance for the first two coefficients (according to the estimated variances, at least). Variances / are Table 15: Comparison of variance / for the least squares and generalized least squares Least sq. Generalized least sq. b b b We shall finish this chapter by noting that the relation = represents only the simplest model that describes heteroscedasticity. Other, more complex models exist, as well. For instance, a statistical test can show that a regression of the form = + is more suitable for the description of. Such a relation will not be detected by the Goldfeld-Quandt test. In this model, is a random variable and is often expressed as a linear function of the explanatory variables appearing in the original regression, i.e. = + + +, or = ( ). To apply the generalized least squares method in this case means to estimate the parameters in the model for in the first place, so that the matrix P could be constructed. Following this procedure, however, leads to the estimate =, (2.12) where the symbol represents the estimate of, and contains the estimates. The vector will have different statistical properties than the estimate ( ) that should have been used. For small-size data samples, the properties of 2-12 are unknown, but its asymptotic properties may help. For the definition of asymptotic properties, see the first chapter of this text. If the estimates are consistent, the generalized least squares method calculated according to 2-12 gives an estimate which has the same asymptotic properties as the estimate ( ). These are the properties valid in the case of classical regression. This is the reason why the first attempts are usually aimed at finding consistent estimates, using various models for, when a more complicated form of heteroscedasticity occurs [4]. 51

2.5 AUTOCORRELATION

Another problem that deserves extra attention is autocorrelation. Whereas heteroscedasticity typically occurs in cross-sectional data, and not in time series data, the opposite is true in the case of autocorrelated random components. Time series are the domain of autocorrelation, cross-sectional data less frequently so. We shall therefore replace the subscript i used in regression models with the subscript t, which will denote a point in time. The general regression model takes the form

$$y_t = \beta_0 + \beta_1 x_{t1} + \dots + \beta_k x_{tk} + \varepsilon_t, \qquad t = 1, 2, \dots, T.$$

The subject matter of this chapter will be presented similarly as in the previous chapter. Firstly, we shall be preoccupied with the general formulation of the model and the consequences of autocorrelation. Later, the removal of problems caused by autocorrelation will be discussed, and the final parts of the chapter will deal with the detection of autocorrelation.

FORMULATION OF THE MODEL AND CONSEQUENCES

When autocorrelation is present, it is assumed that all the conditions of classical regression hold, save for the assumption of zero covariance between the random components, which now satisfies the condition $\operatorname{cov}(\varepsilon_t, \varepsilon_s) \ne 0$ for $t, s = 1, 2, \dots, T$, $t \ne s$. The condition of homoscedasticity, $\operatorname{cov}(\varepsilon_t, \varepsilon_t) = \operatorname{var}(\varepsilon_t) = \sigma^2 > 0$, is still satisfied. Under these conditions, the covariance matrix of the vector of random components of the model can be written as

$$E(\varepsilon\varepsilon') = \sigma^2\begin{pmatrix} 1 & \rho_{12} & \cdots & \rho_{1T} \\ \rho_{21} & 1 & \cdots & \rho_{2T} \\ \vdots & & \ddots & \vdots \\ \rho_{T1} & \rho_{T2} & \cdots & 1 \end{pmatrix} = \sigma^2\Omega, \qquad (2.13)$$

where $\rho_{ij}$ denotes the serial correlation between the i-th and j-th random components.

Autocorrelation originates for many reasons. It typically results from the natural association of the modelled economic variables. It is a distinctive feature that the past values of these variables affect, to an extent, their future values as well. The household consumption of electronics may serve as an example: it is logical that the more households supplied themselves with electronic devices in the past, the less they will buy them in the near future. Another reason behind autocorrelation is the situation when an important explanatory variable has not been included in the regression model, i.e. the model is incorrectly specified. Also, autocorrelation is a typical feature of dynamic models, which contain lagged variables. Autoregressive models are such an example: they contain not only explanatory variables, but also lagged values of the explained variable. Last but not least, autocorrelation may also result from a poor linear approximation of a nonlinear function.

The roughness of the approximation can introduce a correlated structure into the model.

As for the consequences of autocorrelation, their general formulation is more complex than in the case of heteroscedasticity because of the great variety of forms of autocorrelation. For instance, only random components that are placed next to one another on the time axis may be correlated, whereas all other components farther away from each other in time may not. The variety of forms of autocorrelation is suggested, after all, by matrix 2-13. Nonetheless, only a few of the many models that can be used to describe the behaviour of the random components are actually exploited in practice. This chapter will focus on one such model, the so-called autoregressive model of the first order, or AR(1). This model is used most often because it is simple and works surprisingly well in many situations. The model assumes that the random components satisfy

$$\varepsilon_t = \rho\varepsilon_{t-1} + u_t,$$

where $u_t$ is another random component which meets all the conditions of classical regression, together with a new condition that it is uncorrelated with $\varepsilon_{t-1}$. The component $u_t$ is also called white noise. This model implies that the correlation between $\varepsilon_t$ and $\varepsilon_{t-1}$ equals $\rho$, regardless of t, the correlation between $\varepsilon_t$ and $\varepsilon_{t-2}$ is equal to $\rho^2$, regardless of t, and so on. Generally, the correlation between $\varepsilon_t$ and $\varepsilon_{t-s}$ equals $\rho^s$, $s = 0, 1, 2, \dots$, regardless of t.

Let us now return to the consequences of autocorrelation that would have to be confronted if the ordinary least squares method were used to estimate the unknown parameters of the regression model. First of all, when looking at 2-4, it can be seen that the conclusions are the same as in the case of heteroscedasticity, as the derivation of the conclusions would again be based on working with the covariance matrix $\sigma^2\Omega$, regardless of what its parameters look like. This means that even if autocorrelation is present, the estimated regression coefficients remain unbiased. When it comes to their variances, let us first rewrite the relation $\varepsilon_t = \rho\varepsilon_{t-1} + u_t$ as

$$\varepsilon_t = \rho\varepsilon_{t-1} + u_t = \rho(\rho\varepsilon_{t-2} + u_{t-1}) + u_t = \dots = \rho^t\varepsilon_0 + \sum_{j=0}^{t-1}\rho^j u_{t-j}. \qquad (2.14)$$

If $|\rho| < 1$, the first term on the right-hand side of 2-14 converges, in a certain well-defined sense [5], to zero, which implies that 2-14 can be rewritten as $\varepsilon_t = \sum_{j=0}^{\infty}\rho^j u_{t-j}$. Further, since the operation of summing and the operation of calculating the variance can be interchanged here, the last equation and the uncorrelated nature of the $u_t$ result in

$$\operatorname{var}(\varepsilon_t) = \frac{\sigma_u^2}{1-\rho^2}.$$
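The variance formula just derived, and the geometric decay of the correlations, can be checked by a small simulation. The Python sketch below generates a long AR(1) series with invented parameter values and compares the sample variance and the first few sample autocorrelations with the theoretical values $\sigma_u^2/(1-\rho^2)$ and $\rho^s$.

```python
import numpy as np

rng = np.random.default_rng(8)
rho, sigma_u = 0.7, 1.0
T = 200_000

u = rng.normal(scale=sigma_u, size=T)
eps = np.empty(T)
eps[0] = u[0]
for t in range(1, T):
    eps[t] = rho * eps[t - 1] + u[t]      # AR(1): eps_t = rho * eps_{t-1} + u_t

print("sample variance:", eps.var(), " theory:", sigma_u**2 / (1 - rho**2))
for s in (1, 2, 3):
    corr = np.corrcoef(eps[s:], eps[:-s])[0, 1]
    print(f"lag {s}: sample correlation {corr:.3f}, theory {rho**s:.3f}")
```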

This means that the vector of estimated coefficients obtained by the ordinary least squares has the covariance matrix

$$\operatorname{var}(b) = \frac{\sigma_u^2}{1-\rho^2}\,(X'X)^{-1}X'\Omega X(X'X)^{-1},$$

where $\sigma_u^2$ is the variance of the white noise. This has a clear consequence: if there is an AR(1) autocorrelation in the model, the least squares estimates cease to be efficient. Similar results are obtained for other forms of autocorrelation, and so it can generally be said that autocorrelation reduces the precision of the coefficients estimated by the ordinary least squares method. Another problem is the expression $e'e/(n-p)$, which is used to estimate the unknown variance. This estimate is not unbiased any more, which means, among other things, that statistical inference based on the standard F-test and t-tests is not ideal. The consequences of autocorrelation are therefore analogous to those provoked by heteroscedasticity.

AUTOCORRELATION AND THE GENERALIZED LEAST SQUARES METHOD

Since the consequences of autocorrelation are similar to those of heteroscedasticity, there is no reason to alter the approach to solving the problem of autocorrelation, and the procedure used in the case of heteroscedasticity can be applied as well. The generalized least squares method leads to the same conclusions under autocorrelation, the only difference being the form of the transformation matrix P. Since we are working with the AR(1) model and the matrix satisfies 2-13, we can replace the elements $\rho_{ij}$ of the matrix with $\rho^{|i-j|}$, getting

$$\Omega = \begin{pmatrix} 1 & \rho & \rho^2 & \cdots & \rho^{T-1} \\ \rho & 1 & \rho & \cdots & \rho^{T-2} \\ \vdots & & & \ddots & \vdots \\ \rho^{T-1} & \rho^{T-2} & \cdots & \rho & 1 \end{pmatrix}. \qquad (2.15)$$

The inverse of matrix 2-15 is

$$\Omega^{-1} = \frac{1}{1-\rho^2}\begin{pmatrix} 1 & -\rho & 0 & \cdots & 0 \\ -\rho & 1+\rho^2 & -\rho & \cdots & 0 \\ 0 & -\rho & 1+\rho^2 & & \vdots \\ \vdots & & & \ddots & -\rho \\ 0 & 0 & \cdots & -\rho & 1 \end{pmatrix} \qquad (2.16)$$

and the matrix P such that $P'P = \Omega^{-1}$ (up to the constant factor $1-\rho^2$, which has no effect on the resulting estimate) is

$$P = \begin{pmatrix} \sqrt{1-\rho^2} & 0 & 0 & \cdots & 0 \\ -\rho & 1 & 0 & \cdots & 0 \\ 0 & -\rho & 1 & & \vdots \\ \vdots & & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & -\rho & 1 \end{pmatrix}. \qquad (2.17)$$
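The structure of these matrices is easy to verify numerically. The following Python sketch builds $\Omega$ and P for a small T and an illustrative value of $\rho$, and checks that $P'P$ is proportional to $\Omega^{-1}$ (the proportionality constant $1-\rho^2$ plays no role in the generalized least squares estimate).

```python
import numpy as np

rho, T = 0.6, 5
idx = np.arange(T)

Omega = rho ** np.abs(idx[:, None] - idx[None, :])   # Omega_{ts} = rho^{|t-s|}, formula 2-15
Omega_inv = np.linalg.inv(Omega)

# Transformation matrix, formula 2-17
P = np.eye(T)
P[0, 0] = np.sqrt(1 - rho**2)
for t in range(1, T):
    P[t, t - 1] = -rho

print(np.allclose(P.T @ P, (1 - rho**2) * Omega_inv))   # True
```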

As in the case of heteroscedasticity, we may again ask what this matrix represents when it comes to using it to transform the original model. Multiplying the model $y = X\beta + \varepsilon$ by P from the left, we get the set of equations that we would obtain if we multiplied the first equation of the set by the term $\sqrt{1-\rho^2}$, i.e.

$$\sqrt{1-\rho^2}\,y_1 = \beta_0\sqrt{1-\rho^2} + \beta_1\sqrt{1-\rho^2}\,x_{11} + \dots + \beta_k\sqrt{1-\rho^2}\,x_{1k} + \sqrt{1-\rho^2}\,\varepsilon_1, \qquad (2.18)$$

and the other equations (t = 2, ..., T) were transformed into the equations

$$y_t - \rho y_{t-1} = \beta_0(1-\rho) + \beta_1(x_{t1} - \rho x_{t-1,1}) + \dots + \beta_k(x_{tk} - \rho x_{t-1,k}) + (\varepsilon_t - \rho\varepsilon_{t-1}), \qquad t = 2, \dots, T. \qquad (2.19)$$

Formula 2-19 suggests that the equation describing the relation from the previous time period be multiplied by $\rho$ and subtracted from the equation related to the current time t. Sometimes, only transformation 2-19 is used, without applying 2-18 to the first equation of the model. The estimate obtained by applying the ordinary least squares method to the transformed model given by 2-19 is called the Cochrane-Orcutt estimate. A more efficient estimate, however, is the one obtained by transforming the first equation as well. If all the data are used, the transformations restore the case of classical regression, and so the resulting estimate has all the desired statistical properties. In order to apply the generalized least squares method given by 2-18 and 2-19, a suitable estimate of the unknown correlation coefficient $\rho$ must be available. Such an estimate is

$$r = \frac{\sum_{t=2}^{T} e_t e_{t-1}}{\sum_{t=2}^{T} e_{t-1}^2}, \qquad (2.20)$$

where the $e_t$ are the residuals obtained by the ordinary least squares method applied to the original model. The estimate r is a consistent estimate of $\rho$.
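The whole procedure (an ordinary least squares fit, the estimate of $\rho$ from the residuals by formula 2-20, the transformation 2-18 and 2-19, and a second least squares fit) can be sketched in a few lines of Python. The data and parameter values below are artificial; in practice the last two steps are sometimes iterated until the estimate of $\rho$ stabilizes.

```python
import numpy as np

def ols(X, y):
    return np.linalg.solve(X.T @ X, X.T @ y)

rng = np.random.default_rng(9)
T, rho_true = 120, 0.6
x = np.linspace(1, 12, T)

# Generate AR(1) random components and the dependent variable
u = rng.normal(scale=1.0, size=T)
eps = np.empty(T)
eps[0] = u[0] / np.sqrt(1 - rho_true**2)
for t in range(1, T):
    eps[t] = rho_true * eps[t - 1] + u[t]
y = 4.0 + 1.5 * x + eps

X = np.column_stack([np.ones(T), x])
b_ols = ols(X, y)
e = y - X @ b_ols

# Estimate of the correlation coefficient, formula 2-20
r = np.sum(e[1:] * e[:-1]) / np.sum(e[:-1] ** 2)

# Transformation 2-18 (first observation) and 2-19 (the remaining ones)
Xs, ys = np.empty_like(X), np.empty_like(y)
Xs[0], ys[0] = np.sqrt(1 - r**2) * X[0], np.sqrt(1 - r**2) * y[0]
Xs[1:], ys[1:] = X[1:] - r * X[:-1], y[1:] - r * y[:-1]

b_gls = ols(Xs, ys)
print("rho estimate:", r)
print("OLS:", b_ols, " GLS:", b_gls)
```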

DURBIN-WATSON TEST

To verify whether an AR(1) model can be assumed for the relation between the random components of a regression model, the Durbin-Watson test can be applied. The test scrutinizes the validity of the null hypothesis that there is no autocorrelation in the model. The alternative hypothesis assumes there is an AR(1) autocorrelation in the model. The test is carried out in several steps. First of all, the least squares estimates of the unknown coefficients of the original model are found, and the estimates are used to calculate the residuals of the model. Secondly, the test criterion

$$d = \frac{\sum_{t=2}^{T}(e_t - e_{t-1})^2}{\sum_{t=1}^{T} e_t^2} \qquad (2.21)$$

is obtained. Special statistical tables with critical values (see the end of this text) are then utilized to evaluate the criterion. The tables list the critical values of the test for a given number of observations T, significance level $\alpha$ and number of model parameters k, excluding the absolute term. Two values are found in the tables: a lower limit $d_L$ and an upper limit $d_H$. If the sample correlation r is positive, the null hypothesis is accepted if the test criterion d is greater than $d_H$; it is rejected if the criterion is smaller than $d_L$. If the correlation is negative, an alternative criterion $4 - d$ is calculated and evaluated in the same way, with the same critical values $d_L$ and $d_H$. If either of the two test criteria is greater than $d_L$ and smaller than $d_H$, the test is inconclusive. Nevertheless, if this is the case, it is recommended to assume, as a precaution, that autocorrelation is present in the model, because this is highly probable in the time series under scrutiny. The test is suitable for the detection of an AR(1) autocorrelation, and it is a strong test. It can also be used for the detection of autocorrelations of the form $\varepsilon_t = \rho\varepsilon_{t-s} + u_t$, $s > 1$. The testing procedure is the same, with the exception that $e_{t-s}$ appears in the numerator of 2-21 instead of $e_{t-1}$.

The Durbin-Watson test is subject to certain requirements. One of the conditions requires that there is no lagged dependent variable among the regressors of the original model. If the condition is not met, the so-called modified Durbin-Watson test must be applied instead. The null and alternative hypotheses are the same; the test criterion is of the form

$$h = (1 - 0.5\,d)\sqrt{\frac{T}{1 - T\,\widehat{\operatorname{var}}(b)}}, \qquad (2.22)$$

where d is the test criterion 2-21 and $\widehat{\operatorname{var}}(b)$ is the variance estimate for the coefficient of the lagged dependent variable. Test criterion 2-22 has approximately the standard normal distribution, which means that if its absolute value exceeds the corresponding quantile, or critical value, of N(0,1), the null hypothesis is rejected. The disadvantage of the test is that it cannot be used if $\widehat{\operatorname{var}}(b) \ge 1/T$.

As in the case of heteroscedasticity, different and more complicated forms of autocorrelation also exist. Different forms then change the transformation matrix P, the properties of the generalized least squares estimates are proved differently as well, and the detection of these autocorrelations requires other testing procedures. Example 5 shows the mechanism of working with the simple form of autocorrelation.
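The Durbin-Watson statistic itself is a one-line computation once the residuals are available. The following minimal Python sketch uses artificial residuals; the limits $d_L$ and $d_H$ supplied by hand below are the commonly tabulated values for T = 15, k = 1 and $\alpha$ = 0.05, and would in practice be read from the tables at the end of this text.

```python
import numpy as np

def durbin_watson(e):
    """Durbin-Watson criterion, formula 2-21."""
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

# Residuals from some previously estimated model (artificial values here)
rng = np.random.default_rng(10)
e = rng.normal(size=15)

d = durbin_watson(e)
r = np.sum(e[1:] * e[:-1]) / np.sum(e[:-1] ** 2)
stat = d if r > 0 else 4 - d          # use 4 - d when the sample correlation is negative

d_L, d_H = 1.08, 1.36                 # tabulated limits for T = 15, k = 1, alpha = 0.05
if stat < d_L:
    print("reject H0: AR(1) autocorrelation indicated")
elif stat > d_H:
    print("do not reject H0")
else:
    print("test inconclusive")
```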

EXAMPLE 5

The following Table 16 contains fictitious data on monthly household expenses on food in the Moravian region (in millions of Czech crowns). The data reflect the period from January 2000 (t = 1) to March 2001 (t = 15). In this example, $x_t = t$.

Table 16: Entry data for Example 5 (columns $y_t$ and $x_t$; source: own)

Given the character of the time series (see Fig. 1), a linear regression model with a single explanatory variable, $y_t = \beta_0 + \beta_1 x_t + \varepsilon_t$, shall be used to describe the data. The parameters of the model will be estimated by the ordinary least squares method and then tested for the presence of autocorrelation with the Durbin-Watson test. If it turns out that the model is burdened with a significant autocorrelation, its coefficients will be re-estimated with the generalized least squares method. It is assumed that autocorrelation might be the only problem in the model.

Figure 1: Time series $y_t$

The ordinary least squares method leads to the estimates $b = (b_0, b_1)' = (X'X)^{-1}X'y$. Inserting the estimates back into the model, we get the fitted values $\hat{y}_t = b_0 + b_1 x_t$ for the various values of t, and also the residuals $e_t = y_t - \hat{y}_t$ and their lagged values $e_{t-1}$. All the calculations are in Table 17, together with the other quantities to be used in the test of autocorrelation.

Table 17: Residuals and other calculations (columns $e_t$, $e_{t-1}$, $(e_t - e_{t-1})^2$, $e_t^2$, $e_t \cdot e_{t-1}$)

58 Sum The Durbin-Watson test criterion = /85.9 = 3.1. The estimated correlation between the random components of the model is, according to 2-21, = 48.73/85.9 = The Durbin-Watson s table of critical values shows that, given T = 15, the number of regressors without the absolute term = 1 and significance level = 0.05, the lower limit = and the upper limit = Since the correlation is negative, the alternative test criterion is calculated: 4 = 0.9 and compared with the limits. The alternative criterion is too small, suggesting that there is an AR(1) autocorrelation in the model. Thus, the model will be reestimated by transforming its terms according to 2-19 and This means the first observations are multiplied by 1 = 0.83, the other observations are reduced by the r-multiple of the corresponding observation from the preceding time period. Table 18 depicts the transformed data Table 18: Transformed data * y t abs.term * t * Now, applying the ordinary least squares method to the transformed data, we get a new estimate 58

59 = = ( ) = = The residuals now look as is described by Table 19. Table 19: Residuals from the transformed model e t 2 e t e t-1 (e t - e t-1 ) 2 e t. e t For the new model, = 95.17/61.19 = 1.55 and = 11.71/61.19 = Since the correlation is positive, we work with the statistic = The table at the end of this text provides us now with the lower limit and the upper limit The statistic DW is higher than the upper limit, suggesting we have managed to reduce the autocorrelation to a bearable level. This fact has also been reflected by the low sample correlation. Summary of terms: - Multicollinearity perfect and imperfect - Measurement errors errors in Y, errors in X - Method of instrumental variables - Heteroscedasticity and its consequences - Goldfeld-Quandt test - The generalized least squares in the case of Goldfeld-Quandt heteroscedasticity - The method of weighted least squares - Autocorrelation and its consequences 59

60 - The autoregressive process of order 1 and autocorrelation - Autocorrelation AR(1) and the generalized least squares - Cochran-Orcutt estimate - Durbin-Watson test, modified Durbin-Watson s test Questions 1. Using the table below, estimate the parameters of the model = A possible heteroscedasticity of the form = is assumed. Find out with the Goldfeld- Quandt test whether any of the regressors generate the heteroscedasticity. The significance level is 10%. (When creating the groups for the test, leave out the middle three observations). x x y Let s assume, regardless of the result in 1), that the model from 1) involves the heteroscedasticity discussed, which is caused by. In fact, the result given by the test may not be credulous due to the small size of the data. Estimate the model parameters using the generalized least squares (for the case of heteroscedasticity). 3. What would the estimated variances of the coefficients of the model in 1) look like if the ordinary least squares were applied to the original data from the table above, in spite of the presence of the heteroscedasticity? 4. Estimate the variances of the coefficients acquired through the generalized least squares, and compare them with the estimated variances of the coefficients obtained by the ordinary least squares applied to the original data. 5. Data from the two tables below are available. Find the parameters of the model = + + +, and verify with the Durbin-Watson test whether an autocorrelation is present in the model. The significance level is 5%. t x t x t y t t x t x t y t

6. Assume an AR(1) autocorrelation in the model from 5). Estimate the model parameters with the generalized least squares.

7. How do the estimated coefficients of the model from 5) change if they are calculated by the Cochrane-Orcutt procedure? Compare the results also with the estimates obtained by the ordinary least squares.

Answers to questions

1) The ordinary least squares give the estimates b_0 = -90.1, b_1 = 4.2, b_2 = 21.3. There are 13 values, thus the number of values to be left out in the subsequent calculations involved in the Goldfeld-Quandt test is (8/30) x 13 = 3.46, so we decided to skip the middle three observations. One regression is performed for the observations related to the five lowest values of the selected explanatory variable; another regression is performed for the observations related to the five highest values of this explanatory variable. The test criterion is compared with the 10% critical value of the F(2, 2) distribution, which equals 9. To test the heteroscedasticity caused by x_{i1}, the two regressions lead to the residual sums of squares Sr_1 and Sr_2 = 9.786, and the ratio Sr_2/Sr_1 is smaller than 9. Therefore, the first explanatory variable does not seem to cause any heteroscedasticity. As for the variable x_{i2}, one works in this case exceptionally with exactly the same data after sorting the values of the explanatory variable in ascending order, so the conclusion is exactly the same. None of the regressors causes any heteroscedasticity.
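The Goldfeld-Quandt computation in answer 1) can be organized as a short routine. The Python sketch below is an illustration on hypothetical arrays x1, x2 and y standing in for the exercise data; the function names and the synthetic data are assumptions of the sketch, and the F critical value is taken from scipy.

    # Sketch of the Goldfeld-Quandt test from answer 1); hypothetical data.
    import numpy as np
    from scipy import stats

    def rss(X, y):
        # Residual sum of squares of an OLS fit
        b = np.linalg.solve(X.T @ X, X.T @ y)
        e = y - X @ b
        return float(e @ e)

    def goldfeld_quandt(X, y, sort_col, drop=3, alpha=0.10):
        # Sort by the suspected regressor, drop the middle observations,
        # fit OLS on the low and the high group and compare Sr_high / Sr_low
        # with the F critical value.
        order = np.argsort(X[:, sort_col])
        X, y = X[order], y[order]
        n, p = X.shape
        n1 = (n - drop) // 2
        sr_low = rss(X[:n1], y[:n1])
        sr_high = rss(X[-n1:], y[-n1:])
        f_stat = sr_high / sr_low
        f_crit = stats.f.ppf(1 - alpha, n1 - p, n1 - p)
        return f_stat, f_crit

    rng = np.random.default_rng(2)
    x1 = rng.uniform(1, 10, 13)
    x2 = rng.uniform(1, 10, 13)
    y = 2 + 4 * x1 + 20 * x2 + rng.normal(0, x2, 13)   # error variance grows with x2
    X = np.column_stack([np.ones(13), x1, x2])

    print(goldfeld_quandt(X, y, sort_col=1))   # test driven by x1
    print(goldfeld_quandt(X, y, sort_col=2))   # test driven by x2

With five observations and three estimated parameters in each group, each residual sum of squares has 5 - 3 = 2 degrees of freedom, so the criterion is compared with the 10% critical value of the F(2, 2) distribution, which equals 9, the value quoted in answer 1).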

2) The covariance matrix Ω of the vector of random components of the model contains zero elements off its main diagonal. The i-th diagonal element of the matrix is σ²·x_{i2}². The transformation matrix is therefore diagonal as well, and its i-th diagonal element is 1/x_{i2}. The transformed model is

    Y_i* = β_0·x_{i0}* + β_1·x_{i1}* + β_2·x_{i2}* + ε_i*,

where

    Y_i* = Y_i / x_{i2},   x_{i0}* = 1 / x_{i2},   x_{i1}* = x_{i1} / x_{i2},   x_{i2}* = 1.

We now work with the new, transformed data (columns x_{i0}*, x_{i1}*, x_{i2}*, y_i*).

The ordinary least squares applied to these data give the estimates of the original coefficients: the coefficient of x_{i0}* estimates β_0, the coefficient 2.26 of x_{i1}* estimates β_1, and the coefficient 0.69 of x_{i2}* (the transformed absolute term) estimates β_2. The resulting model is ŷ_i = b_0 + 2.26·x_{i1} + 0.69·x_{i2}.

3) Using the ordinary least squares regardless of the heteroscedasticity leads to the following covariance matrix of the vector of estimated coefficients b:

    var(b) = σ²·(X'X)^(-1) (X'ΩX) (X'X)^(-1),

where Ω now denotes the diagonal matrix with the elements x_{i2}² on its diagonal. Evaluating (X'X)^(-1) and X'ΩX from the data and multiplying the matrices out gives the estimated variances of b_0, b_1 and b_2, up to the unknown factor σ².

4) The covariance matrix of the vector of estimated coefficients of the transformed model is σ²·(X*'X*)^(-1). We may now compare the estimated variances for the ordinary and the generalized least squares estimates, after the variances are divided by the unknown parameter σ²; the comparison table lists the variances of b_0, b_1 and b_2 for both methods.
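Answers 2) to 4) amount to two computations: a weighted (generalized) least squares fit obtained by dividing every observation by x_{i2}, and a comparison of the coefficient covariance matrices. The sketch below shows both on hypothetical data; the data and the chosen coefficients are illustrative assumptions, not the exercise values.

    # Sketch of answers 2)-4): weighted least squares for var(eps_i) = sigma^2 * x_i2^2
    # and the covariance matrices of the two estimators (hypothetical data).
    import numpy as np

    rng = np.random.default_rng(3)
    n = 13
    x1 = rng.uniform(1, 10, n)
    x2 = rng.uniform(1, 10, n)
    y = -90 + 4 * x1 + 21 * x2 + rng.normal(0, x2, n)   # error sd proportional to x2
    X = np.column_stack([np.ones(n), x1, x2])

    # 2) Generalized (weighted) least squares: divide every row by x_i2.
    Xs = X / x2[:, None]
    ys = y / x2
    b_gls = np.linalg.solve(Xs.T @ Xs, Xs.T @ ys)

    # 3) Ordinary least squares ignoring the heteroscedasticity, with the
    #    covariance matrix sigma^2 (X'X)^-1 (X' Omega X) (X'X)^-1, Omega = diag(x_i2^2).
    b_ols = np.linalg.solve(X.T @ X, X.T @ y)
    XtX_inv = np.linalg.inv(X.T @ X)
    Omega = np.diag(x2 ** 2)
    cov_ols = XtX_inv @ X.T @ Omega @ X @ XtX_inv       # up to the factor sigma^2

    # 4) Covariance matrix of the GLS coefficients, also up to sigma^2.
    cov_gls = np.linalg.inv(Xs.T @ Xs)

    print(np.diag(cov_ols))   # variances of b_0, b_1, b_2 under OLS
    print(np.diag(cov_gls))   # variances of b_0, b_1, b_2 under GLS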

5) The least squares method yields the estimates b_0 = 3.5, b_1 = 3.88 and b_2. The residuals e_t are:

    (table of residuals with columns t and e(t))

The sample correlation coefficient is positive:

    r = Σ_{t=2..17} e_t·e_{t-1} / Σ_{t=1..17} e_t², with Σ_t e_t·e_{t-1} = 40.3.

The Durbin-Watson statistic equals

    DW = Σ_{t=2..17} (e_t - e_{t-1})² / Σ_{t=1..17} e_t².

Since there are two parameters in the model (excluding the absolute term), k = 2, and the number of observations is n = 17. The lower and upper limits for the test, given the parameters k and n, are the values d_L and d_H read from the table at the end of the text. The test criterion lies between the two limits, meaning that it is not possible to determine whether an autocorrelation is or is not present.

6) To apply the generalized least squares method to the model Y_t = β_0·x_{t0} + β_1·x_{t1} + β_2·x_{t2} + ε_t, where x_{t0} = 1 for t = 1, 2, ..., 17, means to perform the following data transformation.

For t = 1:

    y_1* = √(1 - r²)·y_1,  x_{10}* = √(1 - r²)·x_{10},  x_{11}* = √(1 - r²)·x_{11},  x_{12}* = √(1 - r²)·x_{12};

for t = 2, 3, ..., 17:

    y_t* = y_t - r·y_{t-1},  x_{t0}* = x_{t0} - r·x_{t-1,0},  x_{t1}* = x_{t1} - r·x_{t-1,1},  x_{t2}* = x_{t2} - r·x_{t-1,2}.

The parameter r is the usual estimate of the unknown correlation coefficient ρ:

    r = Σ_{t=2..17} e_t·e_{t-1} / Σ_{t=1..17} e_t².

After the transformation, the model Y_t* = β_0·x_{t0}* + β_1·x_{t1}* + β_2·x_{t2}* + ε_t* is estimated. The transformed data are:

    (table of transformed data with columns t, x_{t0}*, x_{t1}*, x_{t2}*, y_t*)

The least squares method applied to the transformed data gives the estimates b_0*, b_1* and b_2*.

7) The procedure is similar to that in 6). The difference is that only the transformation of the original observations for t = 2, ..., 17 is done:

    y_t* = y_t - r·y_{t-1},  x_{t0}* = x_{t0} - r·x_{t-1,0},  x_{t1}* = x_{t1} - r·x_{t-1,1},  x_{t2}* = x_{t2} - r·x_{t-1,2}.

Thus, the column for t = 1 in the tables from 6) is left out, and the ordinary least squares method is then applied to the remaining transformed data. This leads to the estimate 2.53 for b_1, together with the corresponding estimates for b_0 and b_2. The ordinary least squares applied to the original, untransformed data gives the estimates b_0 = 3.5, b_1 = 3.88 and b_2; in particular, the slope estimate b_1 drops from 3.88 to 2.53 once the autocorrelation is taken into account.
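Answer 7) describes a single pass of the Cochrane-Orcutt procedure. The Python sketch below implements that single pass on hypothetical data and, as an optional extension not contained in the answer, allows the pass to be repeated with the updated estimate of ρ; the data-generating values are illustrative assumptions.

    # Sketch of the Cochrane-Orcutt procedure from answer 7) on hypothetical data.
    import numpy as np

    def ols_resid(X, y):
        b = np.linalg.solve(X.T @ X, X.T @ y)
        return b, y - X @ b

    def cochrane_orcutt(X, y, n_iter=1):
        # n_iter=1 reproduces the single pass of answer 7);
        # larger n_iter repeats the pass with the updated r.
        b, e = ols_resid(X, y)
        for _ in range(n_iter):
            r = np.sum(e[1:] * e[:-1]) / np.sum(e ** 2)   # estimate of rho
            Xs = X[1:] - r * X[:-1]                       # drop t = 1, quasi-difference
            ys = y[1:] - r * y[:-1]
            b, _ = ols_resid(Xs, ys)
            e = y - X @ b                                 # residuals in the original model
        return b, r

    rng = np.random.default_rng(4)
    n = 17
    x1, x2 = rng.uniform(0, 10, n), rng.uniform(0, 10, n)
    eps = np.zeros(n)
    for t in range(1, n):                                 # AR(1) errors
        eps[t] = 0.6 * eps[t - 1] + rng.normal()
    y = 3.5 + 3.9 * x1 + 1.5 * x2 + eps
    X = np.column_stack([np.ones(n), x1, x2])

    print(cochrane_orcutt(X, y, n_iter=1))                # single pass, as in answer 7)
    print(cochrane_orcutt(X, y, n_iter=10))               # iterated version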

3. SETS OF SIMULTANEOUS EQUATIONS

Goal: This chapter introduces the reader to sets of simultaneous equations, which have their own specific features that require altering the standard procedures used to estimate the unknown parameters of a model in the classical regression case. The chapter covers the problem of identification and presents frequently used methods of estimating unknown model coefficients.

Time to learning: 4 hours.

The presentation of fundamental regression concepts has focused so far on a model given by a single equation. In many cases, however, it is necessary to introduce more equations to describe properly all the relations among the variables of interest. This means that sets of regression equations are involved in the analysis. Sets of equations are applied to a great extent in economics, for instance, which often deals with situations that describe a balance in the market. These situations require a description of the supply side with one equation and the demand side with another equation, and they also require a third equation to describe the balance of supply and demand.

Generally speaking, different kinds of sets of regression equations exist, depending on what theoretical requirements the equations are supposed to comply with. The various kinds of sets differ in the extent to which the equations of a set are intertwined. The interconnection further depends on what variables appear in the equations in the first place. Since the classification of variables is important here, let us extend the vocabulary we have used to describe the variables of a regression model.

Looking at a particular regression equation, we see that the explained variable is generated by the system defined by that equation; the variable is a product of the equation. We call such variables endogenous. On the contrary, variables which come into existence outside the system described by the equation, i.e. which are not a product of that equation, are called exogenous. So far, we have worked with a single regression equation; in that equation, the variable Y was endogenous and the regressors were exogenous. There is also another category of variables, called lagged endogenous. These are, as the name clearly suggests, endogenous variables from previous time periods. To give an example, the equation Y_t = β_0 + β_1·x_t + β_2·Y_{t-1} + β_3·Y_{t-2} + ε_t contains two lagged endogenous variables, Y_{t-1} and Y_{t-2}. All exogenous variables together with lagged endogenous variables are called predetermined variables.

When it comes to different kinds of regression systems, one type is represented by M equations

    Y_j = X_j·β_j + ε_j,   j = 1, 2, ..., M,

where the equations may contain different regressors on their right-hand side and different coefficients as well, but only exogenous variables appear on the right-hand side of the equations.

Also, the random components from different equations may be correlated. These systems are called seemingly unrelated equations. Although such systems look like something quite new, their parameters can often be estimated using the theory presented so far, because a model consisting of one equation is actually itself a set of equations, each of them related to a particular observation. We shall not be preoccupied with these systems, however, and will focus on sets of equations whose mutual relations are much tighter because variables which are explanatory variables in one equation may be explained variables in another equation. These are the systems called simultaneous equations. An example of such a system is the set of two equations

    y_t1 = α_0 + α_1·y_t2 + α_2·x_t1 + ε_t1,
    y_t2 = β_0 + β_1·y_t1 + β_2·x_t2 + ε_t2.

As we can see in this example, the variable y_t2 is an explanatory variable in the first equation, whereas it is the explained variable in the second equation. The analogy holds true for the variable y_t1. The two equations are outside the framework of classical regression because some of the regressors, being stochastic in the first place, are correlated with the respective random components of the model. To see this, we observe that y_t2 depends on y_t1, which in turn depends on ε_t1. Thus, y_t2 depends on ε_t1, and we cannot assume that there is no correlation between the two variables. This causes a problem: if a nonzero correlation exists between a regressor and the corresponding random component, the ordinary least squares method gives biased and inconsistent estimates. This implies that we cannot use this method to estimate the parameters of each equation separately. (The problem would be further aggravated by a potential autocorrelation among the random components themselves, which is not considered here.) Methods exist, however, which provide us with estimates of reasonable statistical properties. It can be shown that all of them, in fact, belong to the category of instrumental-variable techniques; some are formally more complicated, others less so. In this chapter, two basic methods of estimation will be presented: the method of indirect least squares and the two-stage least squares method. Before doing so, however, the so-called identification problem must be resolved for any set of simultaneous equations.
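The claim that ordinary least squares is biased and inconsistent in this situation can be checked with a small simulation. The Python sketch below generates data from a two-equation system of the kind just shown, with coefficient values chosen arbitrarily for the illustration, and applies OLS to the first equation; the estimated coefficient of y_2 stays away from its true value even for a very large sample.

    # Simulation of simultaneity bias: OLS applied to one equation of a
    # simultaneous system (the coefficients below are chosen only for illustration).
    import numpy as np

    rng = np.random.default_rng(5)
    n = 100_000
    a0, a1, a2 = 1.0, 0.5, 2.0        # first equation:  y1 = a0 + a1*y2 + a2*x1 + e1
    b0, b1, b2 = -1.0, 0.8, 1.5       # second equation: y2 = b0 + b1*y1 + b2*x2 + e2

    x1, x2 = rng.normal(size=n), rng.normal(size=n)
    e1, e2 = rng.normal(size=n), rng.normal(size=n)

    # Solve the two equations jointly for y1 and y2 (reduced form).
    det = 1 - a1 * b1
    y1 = (a0 + a1*b0 + a2*x1 + a1*b2*x2 + e1 + a1*e2) / det
    y2 = (b0 + b1*a0 + b1*a2*x1 + b2*x2 + e2 + b1*e1) / det

    # OLS on the first equation treats y2 as if it were a fixed regressor.
    X = np.column_stack([np.ones(n), y2, x1])
    b_ols = np.linalg.solve(X.T @ X, X.T @ y1)
    print(b_ols)   # the coefficient of y2 does not converge to a1 = 0.5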

3.1 IDENTIFICATION

The problem of identification will be described for the following typical example of simultaneous equations:

    Q_t^s = α_0 + α_1·P_t + ε_t1,
    Q_t^d = β_0 + β_1·P_t + ε_t2,
    Q_t^s = Q_t^d.

The first equation describes the dependence of supply on the price level. The second equation describes the dependence of demand on the price level. The third, final equation says that there is a balance in the market: the amount of supplied products is the same as the amount of products demanded by the market. Let us imagine now that we are analysts who want to estimate the unknown regression parameters appearing in the equations. If we do that for the second equation, we will work with two sets of data when performing the estimation: a data array for Q_t (= Q_t^s = Q_t^d) and a data array for P_t. When we are done with the estimation, we may ask ourselves what has actually been estimated. Is it the first equation or the second equation (because Q_t^s = Q_t^d = Q_t)? We can even multiply the first equation by a number a and the second equation by a number b and add them, getting

    (a + b)·Q_t = (a·α_0 + b·β_0) + (a·α_1 + b·β_1)·P_t + (a·ε_t1 + b·ε_t2),

which, after dividing by a + b, has exactly the same form as the supply and the demand equations themselves. As suggested by the last equation, we might even have estimated a linear combination of the supply and demand equations. To sum up, the set of simultaneous equations is defined in such a way that we do not know what we estimate. In other words, we are confronted with the so-called problem of identification. Thus, before performing any estimation, conditions should be formulated which ensure that there is no problem with identification.

The problem of identification also manifests itself in more than one way. Another way, perhaps more natural, is based on the relation between the so-called structural and reduced form of the set of equations. Let us write the set of simultaneous equations for the times t = 1, 2, ..., T in the structural form

    β_11·y_t1 + β_12·y_t2 + ... + β_1m·y_tm + γ_11·x_t1 + ... + γ_1K·x_tK = ε_t1,
    β_21·y_t1 + β_22·y_t2 + ... + β_2m·y_tm + γ_21·x_t1 + ... + γ_2K·x_tK = ε_t2,
    ...
    β_m1·y_t1 + β_m2·y_t2 + ... + β_mm·y_tm + γ_m1·x_t1 + ... + γ_mK·x_tK = ε_tm.   (3.1)

Here, the y's are endogenous variables, and there are as many of them as there are equations. The x's represent predetermined variables. In the matrix form, the equations can be written as

    B·y_t + Γ·x_t = ε_t,   (3.2)

where

    y_t = (y_t1, y_t2, ..., y_tm)',  x_t = (x_t1, x_t2, ..., x_tK)',  ε_t = (ε_t1, ε_t2, ..., ε_tm)',

B is the m x m matrix of the coefficients β_ij of the endogenous variables, and Γ is the m x K matrix of the coefficients γ_ij of the predetermined variables. The set can also be written in the reduced form

    y_t = Π·x_t + v_t,  where  Π = -B^(-1)·Γ  and  v_t = B^(-1)·ε_t.   (3.3)

The question is whether the original coefficients of the set, contained in the matrices B and Γ, can be obtained from the coefficients of the matrix Π, because this could be the way of estimating the parameters of 3-2. We might estimate the reduced coefficients of 3-3 and then reconstruct the original, or structural, coefficients, knowing that B·Π = -Γ. If the original coefficients can be obtained uniquely from the reduced coefficients, the set 3-2 is said to be exactly identified and there is no problem with identification. If there are more equations in B·Π = -Γ than is necessary to obtain the structural coefficients, the set 3-2 is said to be overidentified. In this case, there is no problem with identification either, but the question is how to utilize the abundant information on the coefficients as effectively as possible. If there are too few equations in B·Π = -Γ to reconstruct the structural coefficients, the set 3-2 is said to be underidentified, and there is no way for us to identify the original coefficients. In order for the entire system of equations to be identified, each of its equations must be identified, which means that each equation must be either exactly identified or overidentified.

Conditions exist that tell us which of the aforementioned cases occurs when working with a set of simultaneous equations. There are two conditions: the order condition and the rank condition. The former is necessary for identification, but not sufficient; together with the latter, however, we get a sufficient condition of identification. Working with m equations, i.e. m endogenous variables, and K predetermined variables,

the order condition for an equation is: the number of variables (endogenous and predetermined) excluded from the equation must be equal to or greater than the number of endogenous variables in the system less one;

the rank condition for an equation is: the rank of the matrix of parameters of all the variables (endogenous and predetermined) excluded from the equation is equal to m - 1.

If the rank condition for an equation is met and the order condition is met in the form of equality, that equation is exactly identified. If the rank condition for an equation is met and the order condition is met in the form of inequality, that equation is overidentified. If all the equations are identified exactly, the whole system is exactly identified. If at least one equation is overidentified, the whole system is overidentified.
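The relation Π = -B^(-1)·Γ is easy to check numerically. The sketch below builds a small two-equation system with arbitrarily chosen coefficients (purely illustrative, not taken from the text), computes its reduced form, and verifies the identity B·Π = -Γ on which the identification argument and the indirect least squares of the next section rely.

    # Numerical check of the structural/reduced-form relation Pi = -B^(-1) Gamma
    # for a small two-equation system; the coefficient values are arbitrary.
    import numpy as np

    # Structural form B y_t + Gamma x_t = eps_t for the system
    #   y1 - 0.5*y2 - 2.0*x0 - 1.5*x1          = eps1
    #   y2 - 0.8*y1 - 1.0*x0          - 3.0*x2 = eps2
    # with x0 = 1 (absolute term) and predetermined variables x1, x2.
    B = np.array([[1.0, -0.5],
                  [-0.8, 1.0]])
    Gamma = np.array([[-2.0, -1.5, 0.0],
                      [-1.0, 0.0, -3.0]])

    Pi = -np.linalg.inv(B) @ Gamma        # reduced-form coefficients (equation 3.3)
    print(Pi)
    print(np.allclose(B @ Pi, -Gamma))    # True: the identity B Pi = -Gamma holds

Note that the zeros in Gamma encode the exclusion restrictions (x_t2 is absent from the first equation and x_t1 from the second), which is exactly the situation examined in Example 6 below.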

EXAMPLE 6

Determine the identification form of the system

    y_t1 = α_0 + α_1·y_t2 + α_2·x_t1 + ε_t1,
    y_t2 = β_0 + β_1·y_t1 + β_2·x_t2 + ε_t2,

where the y's are endogenous and the x's are predetermined.

Solution: As for the first equation, one variable is missing, namely the variable x_t2, whereas the number of endogenous variables in the system equals two. Thus, the order condition is met exactly. Analogously, only one variable (x_t1) is missing in the second equation, and so the order condition is met exactly for the second equation as well. Regarding the rank condition, let us rewrite the system as

    y_t1 - α_0 - α_1·y_t2 - α_2·x_t1 - ε_t1 = 0,
    y_t2 - β_0 - β_1·y_t1 - β_2·x_t2 - ε_t2 = 0.

Tabulating the coefficients of the endogenous and predetermined variables, we have

Table 20: Coefficients in the structural form of equations
                 y_t1     y_t2     1        x_t1     x_t2
    equation 1   1        -α_1     -α_0     -α_2     0
    equation 2   -β_1     1        -β_0     0        -β_2

The rank condition now reads as follows: for the first equation, the corresponding matrix of coefficients of the excluded variable x_t2 has a single element, -β_2; unless this parameter is zero, the rank condition is satisfied. For the second equation, the matrix of interest has the single element -α_2, and unless this element is zero, the rank condition is satisfied for this equation as well. To sum up, unless the two parameters α_2 and β_2 are zero, both equations are exactly identified.

3.2 ESTIMATION OF SIMULTANEOUS EQUATIONS

Let us focus now on how the unknown coefficients of the system could be estimated. We shall assume that the equations are either exactly identified or overidentified, because there is no procedure to estimate underidentified systems. The methods of estimation can be divided into two categories: category one is represented by the methods that are applied to each equation of the system separately; category two is represented by the methods which are applied to the whole system at once. We shall deal with the first category, which is represented particularly by the method of indirect least squares and by the method of two-stage least squares.

3.2.1 THE METHOD OF INDIRECT LEAST SQUARES

The principle of this method is simple, and has been outlined when the problem of identification was described. Let us assume a set of simultaneous equations in the structural form 3-2. This system can be written in the reduced form

    y_t = (-B^(-1)·Γ)·x_t + B^(-1)·ε_t = Π·x_t + v_t,

in which all the endogenous variables are functions of all the predetermined variables. We can apply the ordinary least squares method to each reduced-form equation separately, whereby acquiring the estimates of the parameters contained in the matrix Π. Knowing Π, we can get estimates of the original parameters by solving the set of equations B·Π = -Γ.

If the system 3-2 is exactly identified, the equations B·Π = -Γ have a unique solution for the estimates of the structural coefficients, i.e. the elements of B and Γ. This procedure is the method of indirect least squares, and it is applicable only to exactly identified systems of equations. The method provides consistent, asymptotically unbiased and asymptotically normal estimates under relatively mild conditions.

For the purpose of statistical inference, the estimated covariance matrix of the vector of estimated coefficients b for a given equation is the matrix

    var(b) = s²·(Z'Z)^(-1),   s² = Σ_t ê_t² / (T - p),

where the ê_t are the residuals obtained by inserting the estimated coefficients back into the structural equation, T is the number of observations and p is the number of parameters in the equation. Further, Z is the matrix whose columns are the values of the predetermined variables appearing in the equation and the fitted values of the explanatory endogenous variables appearing in that equation. The fitted values are obtained by regressing each explanatory endogenous variable on all predetermined variables of the system. Thus, the matrix Z looks similar to the matrix used in classical regression; the only difference is that the values of the endogenous variables appearing on the right-hand side of the equation are replaced with their respective fitted values. In other words, approximately b ~ N(β, s²·(Z'Z)^(-1)), where β is the vector of unknown coefficients in the given equation.

EXAMPLE 7

Let us demonstrate the method for the system of equations

    y_t1 = α_0 + α_1·y_t2 + α_2·x_t1 + ε_t1,
    y_t2 = β_0 + β_1·y_t1 + β_2·x_t2 + ε_t2,

which, as we have seen in Example 6, is exactly identified. To apply the method, the observations of Table 21 will be used.

Table 21: Data for Example 7 (columns t, y_t1, y_t2, x_t1, x_t2)
Source: own

Solution: To simplify the notation, we shall not use the subscript t in what follows. The reduced form of the system expresses each endogenous variable as a function of all the predetermined variables:

    y_1 = π_10 + π_11·x_1 + π_12·x_2 + v_1,
    y_2 = π_20 + π_21·x_1 + π_22·x_2 + v_2.

The ordinary least squares applied separately to each reduced-form equation give the estimates of the π's. Substituting one structural equation into the other shows that

    π_11 = α_2 / (1 - α_1·β_1),   π_12 = α_1·β_2 / (1 - α_1·β_1),
    π_21 = β_1·α_2 / (1 - α_1·β_1),   π_22 = β_2 / (1 - α_1·β_1),
    π_10 = (α_0 + α_1·β_0) / (1 - α_1·β_1),   π_20 = (β_0 + β_1·α_0) / (1 - α_1·β_1),

and because of these relations the structural coefficients can be recovered from the estimated π's:

    α_1 = π_12 / π_22,   β_1 = π_21 / π_11,
    α_2 = π_11·(1 - α_1·β_1),   β_2 = π_22·(1 - α_1·β_1),
    α_0 = π_10 - α_1·π_20,   β_0 = π_20 - β_1·π_10.

Substituting the numerical estimates of the π's obtained from the data of Table 21 into these relations, it also follows that the remaining structural coefficients are obtained as well (one of the resulting values being 2.821).
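The recovery of the structural coefficients from the reduced-form estimates can be written compactly in code. The sketch below performs the indirect least squares steps of Example 7 on simulated data; the data and the "true" coefficient values are hypothetical assumptions of the sketch and do not come from Table 21.

    # Indirect least squares for the exactly identified two-equation system of
    # Example 7, demonstrated on simulated data (hypothetical values).
    import numpy as np

    def ols(X, y):
        return np.linalg.solve(X.T @ X, X.T @ y)

    rng = np.random.default_rng(6)
    n = 200
    a0, a1, a2 = 2.0, 0.4, 1.5        # y1 = a0 + a1*y2 + a2*x1 + e1
    b0, b1, b2 = 1.0, 0.3, 2.5        # y2 = b0 + b1*y1 + b2*x2 + e2
    x1, x2 = rng.normal(size=n), rng.normal(size=n)
    e1, e2 = rng.normal(size=n), rng.normal(size=n)
    det = 1 - a1 * b1
    y1 = (a0 + a1*b0 + a2*x1 + a1*b2*x2 + e1 + a1*e2) / det
    y2 = (b0 + b1*a0 + b1*a2*x1 + b2*x2 + e2 + b1*e1) / det

    # Step 1: OLS on each reduced-form equation (all predetermined variables).
    X = np.column_stack([np.ones(n), x1, x2])
    p1 = ols(X, y1)            # pi_10, pi_11, pi_12
    p2 = ols(X, y2)            # pi_20, pi_21, pi_22

    # Step 2: solve B*Pi = -Gamma for the structural coefficients.
    a1_hat = p1[2] / p2[2]
    b1_hat = p2[1] / p1[1]
    a2_hat = p1[1] * (1 - a1_hat * b1_hat)
    b2_hat = p2[2] * (1 - a1_hat * b1_hat)
    a0_hat = p1[0] - a1_hat * p2[0]
    b0_hat = p2[0] - b1_hat * p1[0]
    print(a0_hat, a1_hat, a2_hat, b0_hat, b1_hat, b2_hat)

For large samples the printed values approach the coefficients used to generate the data, which illustrates the consistency of the indirect least squares estimator.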

3.2.2 TWO-STAGE LEAST SQUARES

As opposed to the indirect least squares, the method of two-stage least squares is applicable to sets of both exactly identified and overidentified equations. In the first step of the method, instrumental variables are searched for which would serve as a suitable replacement for the endogenous variables appearing as explanatory variables in the equations. Using these instruments, the ordinary least squares are then applied to each equation separately in the second step of the method, the instruments serving as a replacement for the endogenous explanatory variables. The most suitable instrument turns out to be the fitted values of the explanatory endogenous variable in question. The fitted values are obtained from regressing the given explanatory endogenous variable on all the predetermined variables of the system. The same statistical properties as in the case of indirect least squares hold for the resulting estimator here (both the properties and the asymptotic distribution).

EXAMPLE 8

Estimate the unknown coefficients of the system below with two-stage least squares:

    y_t1 = α_0 + α_1·y_t2 + α_2·x_t1 + ε_t1,
    y_t2 = β_0 + β_1·y_t1 + β_2·x_t2 + ε_t2.

The necessary observations are in Table 22.

Table 22: Time series data for Example 8 (columns y_t1, y_t2, x_t1, x_t2)
Source: own

Solution: This is the same set of equations as in the previous example; therefore the equations are exactly identified, and we may proceed with the method. In the first step, we perform the regression

    y_t2 = π_20 + π_21·x_t1 + π_22·x_t2 + v_t2

with the ordinary least-squares method, i.e. we regress y_t2 on all the predetermined variables of the system. This gives the first-step estimates, and the corresponding fitted values ŷ_t2 are in Table 23.

Table 23: Fitted values ŷ_t2 from the first-step regression

In the second step, the ordinary least squares are applied to the equation

    y_t1 = α_0 + α_1·ŷ_t2 + α_2·x_t1 + ε_t1,

which gives the estimates of α_0, α_1 and α_2. Similarly for the second equation: in the first step, the ordinary least squares are applied to the regression

    y_t1 = π_10 + π_11·x_t1 + π_12·x_t2 + v_t1,

which results in the fitted values ŷ_t1 listed in Table 24.

Table 24: Fitted values ŷ_t1 from the first-step regression

The second-step application of the ordinary least squares to the equation

    y_t2 = β_0 + β_1·ŷ_t1 + β_2·x_t2 + ε_t2

provides the estimates of β_0, β_1 and β_2. To sum up, the two-equation system is estimated by inserting the resulting coefficient estimates into the two structural equations.
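The two steps of Example 8 translate directly into code. The following sketch runs two-stage least squares for the same type of two-equation system on simulated data; the values are hypothetical assumptions, not the observations of Table 22.

    # Two-stage least squares for the system of Example 8, on simulated data.
    import numpy as np

    def ols(X, y):
        return np.linalg.solve(X.T @ X, X.T @ y)

    rng = np.random.default_rng(7)
    n = 200
    a0, a1, a2 = 2.0, 0.4, 1.5        # y1 = a0 + a1*y2 + a2*x1 + e1
    b0, b1, b2 = 1.0, 0.3, 2.5        # y2 = b0 + b1*y1 + b2*x2 + e2
    x1, x2 = rng.normal(size=n), rng.normal(size=n)
    e1, e2 = rng.normal(size=n), rng.normal(size=n)
    det = 1 - a1 * b1
    y1 = (a0 + a1*b0 + a2*x1 + a1*b2*x2 + e1 + a1*e2) / det
    y2 = (b0 + b1*a0 + b1*a2*x1 + b2*x2 + e2 + b1*e1) / det

    Z = np.column_stack([np.ones(n), x1, x2])   # all predetermined variables

    # First step: fitted values of the explanatory endogenous variables.
    y2_hat = Z @ ols(Z, y2)
    y1_hat = Z @ ols(Z, y1)

    # Second step: OLS with the fitted values replacing the endogenous regressors.
    coef_eq1 = ols(np.column_stack([np.ones(n), y2_hat, x1]), y1)   # a0, a1, a2
    coef_eq2 = ols(np.column_stack([np.ones(n), y1_hat, x2]), y2)   # b0, b1, b2
    print(coef_eq1, coef_eq2)

For an exactly identified equation, the two-stage least squares and the indirect least squares give identical estimates, so this sketch and the one in section 3.2.1 produce the same numbers on the same data.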

Summary of terms:
- Simultaneous equations
- Structural and reduced form of sets of equations
- Endogenous, exogenous variable; explained, explanatory, predetermined variables
- Identification: exactly identified, underidentified, overidentified systems
- Order and rank condition for identification
- Indirect least squares
- Two-stage least squares

Questions

1. Estimate the following model with the two-stage least squares method:

    y_t1 = β_0 + β_1·x_t1 + β_2·y_{t-1,1} + β_3·y_t2 + ε_t1,
    y_t2 = γ_0 + γ_1·x_t2 + γ_2·y_t1 + γ_3·y_{t-1,2} + ε_t2,

using the data:

    (two data tables with columns t, y_t1, y_t2, x_t1, x_t2)

2. Estimate the system from problem 1 with the indirect least squares and compare the result with the two-stage least squares.

3. Explain the principle of the two-stage least squares.

4. Explain the principle of the indirect least squares.

5. Formulate the order and rank condition for an equation.

6. Explain what a predetermined variable is, and explain the difference between exogenous and endogenous variables.

Answers to questions

1. First-step result (each endogenous variable is regressed on all the predetermined variables of the system, i.e. on the absolute term, x_t1, x_t2, y_{t-1,1} and y_{t-1,2}):

    ŷ_t1 = b_10 + b_11·x_t1 + 0.14·x_t2 + b_13·y_{t-1,1} + 0.37·y_{t-1,2},
    ŷ_t2 = b_20 + b_21·x_t1 + 0.18·x_t2 + 0.6·y_{t-1,1} + 0.67·y_{t-1,2},

where the b's denote the remaining estimated first-step coefficients.

The second-step regressions are:

    y_t1 = β_0 + β_1·x_t1 + β_2·y_{t-1,1} + β_3·ŷ_t2 + ε_t1,
    y_t2 = γ_0 + γ_1·x_t2 + γ_2·ŷ_t1 + γ_3·y_{t-1,2} + ε_t2.

The resulting estimates include 5 for β_0, 0.1 for β_1, 0.5 for β_3 and 7.8 for γ_2; the remaining coefficients are estimated from the same two regressions. The final estimates (for the structural-form system) are exactly these second-step coefficients.

2. The final estimates will be the same as in 1).

3. See section 3.2.2 again.

4. See section 3.2.1 again.

5. See section 3.1 again.

6. See the beginning of Chapter 3 again.

TABLES

Tables for the Durbin-Watson test: significance level = 1%, d_L = lower critical value, d_U = upper critical value, n = data sample size, k = number of regressors excluding the absolute term (taken from [3]).


Tables for the Durbin-Watson test: significance level = 5%, d_L = lower critical value, d_U = upper critical value, n = data sample size, k = number of regressors excluding the absolute term (taken from [3]).

References

[1] KOUTSOYIANNIS, A. Theory of Econometrics. Palgrave Macmillan, 2001, 681 p.
[2] AMEMIYA, T. Advanced Econometrics. Harvard University Press, 1985, 521 p.
[3] GREENE, William H. Econometric Analysis. Prentice Hall, 2012, 1198 p.
[4] HUŠEK, Roman. Základy ekonometrické analýzy. VŠE Praha, 1996, 225 p.
[5] BROCKWELL, P. J., DAVIS, R. A. Time Series: Theory and Methods. Springer Science & Business Media, 2009, 577 p.

Additional resources

[1] WOOLDRIDGE, J. M. Introductory Econometrics. Macmillan Publishing, 2009, 868 p.
[2] KENNEDY, P. A Guide to Econometrics. MIT Press, 2003, 623 p.
[3] HAYASHI, F. Econometrics. Princeton University Press, 2000, 712 p.
