Econometric Modelling


David F. Hendry, Nuffield College, Oxford University. July 8, 2000

Abstract

The theory of reduction explains the origins of empirical models, by delineating all the steps involved in mapping from the actual data generation process (DGP) in the economy (far too complicated and high-dimensional ever to be completely modelled) to an empirical model thereof. Each reduction step involves a potential loss of information from: aggregating, marginalizing, conditioning, approximating, and truncating, leading to a local DGP which is the actual generating process in the space of variables under analysis. Tests of losses from many of the reduction steps are feasible. Models that show no losses are deemed congruent; those that explain rival models are called encompassing. The main reductions correspond to well-established econometric concepts (causality, exogeneity, invariance, innovations, etc.), which are the null hypotheses of the mis-specification tests, so the theory has considerable excess content. General-to-specific (Gets) modelling seeks to mimic reduction by commencing from a general congruent specification that is simplified to a minimal representation consistent with the desired criteria and the data evidence (essentially represented by the local DGP). However, in small data samples, model selection is difficult. We reconsider model selection from a computer-automation perspective, focusing on general-to-specific reductions, embodied in PcGets, an Ox package for implementing this modelling strategy for linear, dynamic regression models. We present an econometric theory that explains the remarkable properties of PcGets. Starting from a general congruent model, standard testing procedures eliminate statistically-insignificant variables, with diagnostic tests checking the validity of reductions, ensuring a congruent final selection. Path searches in PcGets terminate when no variable meets the pre-set criteria, or any diagnostic test becomes significant. Non-rejected models are tested by encompassing: if several are acceptable, the reduction recommences from their union; if they re-appear, the search is terminated using the Schwarz criterion. Since model selection with diagnostic testing has eluded theoretical analysis, we study modelling strategies by simulation. The Monte Carlo experiments show that PcGets recovers the DGP specification from a general model with size and power close to commencing from the DGP itself, so model selection can be relatively non-distortionary even when the mechanism is unknown. Empirical illustrations for consumers' expenditure and money demand will be shown live. Next, we discuss sample-selection effects on forecast failure, with a Monte Carlo study of their impact. This leads to a discussion of the role of selection when testing theories, and the problems inherent in conventional approaches. Finally, we show that selecting policy-analysis models by forecast accuracy is not generally appropriate. We anticipate that Gets will perform well in selecting models for policy.

Financial support from the UK Economic and Social Research Council under grant L , Modelling Nonstationary Economic Time Series, R , and Forecasting and Policy in the Evolving Macro-economy, L , is gratefully acknowledged. The research is based on joint work with Hans-Martin Krolzig of Oxford University.

Contents

1 Introduction
2 Theory of reduction
  2.1 Empirical models
  2.2 DGP
  2.3 Data transformations and aggregation
  2.4 Parameters of interest
  2.5 Data partition
  2.6 Marginalization
  2.7 Sequential factorization
      Sequential factorization of W_T
      Marginalizing with respect to V_{t-1}
  2.8 Mapping to I(0)
  2.9 Conditional factorization
  2.10 Constancy
  2.11 Lag truncation
  2.12 Functional form
  2.13 The derived model
  2.14 Dominance
  2.15 Econometric concepts as measures of no information loss
  2.16 Implicit model design
  2.17 Explicit model design
  2.18 A taxonomy of evaluation information
3 General-to-specific modelling
  3.1 Pre-search reductions
  3.2 Additional paths
  3.3 Encompassing
  3.4 Information criteria
  3.5 Sub-sample reliability
  3.6 Significant mis-specification tests
4 The econometrics of model selection
  4.1 Search costs
  4.2 Selection probabilities
  4.3 Deletion probabilities
  4.4 Path selection probabilities
  4.5 Improved inference procedures
5 PcGets
  5.1 The multi-path reduction process of PcGets
  5.2 Settings in PcGets
  5.3 Limits to PcGets
      Collinearity
      Integrated variables
6 Some Monte Carlo results
  6.1 Aim of the Monte Carlo
  6.2 Design of the Monte Carlo
  6.3 Evaluation of the Monte Carlo
      Diagnostic tests
      Size and power of variable selection
      Test size analysis
7 Empirical illustrations
  7.1 DHSY
  7.2 UK money demand
8 Model selection in forecasting, testing, and policy analysis
  8.1 Model selection for forecasting
      Sources of forecast errors
      Sample selection experiments
  8.2 Model selection for theory testing
  8.3 Model selection for policy analysis
  8.4 Congruent modelling
9 Conclusions
Appendix: encompassing
References

1 Introduction

The economy is a complicated, dynamic, non-linear, simultaneous, high-dimensional, and evolving entity; social systems alter over time; laws change; and technological innovations occur. Time-series data samples are short, highly aggregated, heterogeneous, non-stationary, time-dependent and interdependent. Economic magnitudes are inaccurately measured, subject to revision, and important variables are often not observable. Economic theories are highly abstract and simplified, with suspect aggregation assumptions; they change over time, and often rival, conflicting explanations co-exist. In the face of this welter of problems, econometric modelling of economic time series seeks to discover sustainable and interpretable relationships between observed economic variables.

However, the situation is not as bleak as it may seem, provided some general scientific notions are understood. The first key is that knowledge accumulation is progressive: one does not need to know all the answers at the start (otherwise, no science could have advanced). Although the best empirical model at any point will later be supplanted, it can provide a springboard for further discovery. Thus, model selection problems (e.g., data mining) are not a serious concern: this is established below by the actual behaviour of model-selection algorithms. The second key is that determining inconsistencies between the implications of any conjectured model and the observed data is easy. Indeed, the ease of rejection worries some economists about econometric models, yet it is a powerful advantage. Conversely, constructive progress is difficult, because we do not know what we do not know, so cannot know how to find out. The dichotomy between construction and destruction is an old one in the philosophy of science: critically evaluating empirical evidence is a destructive use of econometrics, but can establish a legitimate basis for models.

To understand modelling, one must begin by assuming a probability structure and conjecturing the data generation process. However, the relevant probability basis is unclear, since the economic mechanism is unknown. Consequently, one must proceed iteratively: conjecture the process, develop the associated probability theory, use that for modelling, and revise the starting point when the results prove inconsistent with it. This can be seen in the gradual progress from stationarity assumptions, through integrated-cointegrated systems, to general non-stationary, mixing processes: further developments will undoubtedly occur, leading to a more useful probability basis for empirical modelling.

These notes first review the theory of reduction in section 2 to explain the origins of empirical models, then discuss some methodological issues that concern many economists. Despite the controversy surrounding econometric methodology, the LSE approach (see Hendry, 1993, for an overview) has emerged as a leading approach to empirical modelling. One of its main tenets is the concept of general-to-specific modelling (Gets): starting from a general dynamic statistical model, which captures the essential characteristics of the underlying data set, standard testing procedures are used to reduce its complexity by eliminating statistically-insignificant variables, checking the validity of the reductions at every stage to ensure the congruence of the selected model. Section 3 discusses Gets, and relates it to the empirical analogue of reduction.
Recently, econometric model selection has been automated in a program called PcGets, an Ox package (see Doornik, 1999, and Hendry and Krolzig, 1999a) designed for Gets modelling, currently focusing on reduction approaches for linear, dynamic, regression models. The development of PcGets has been stimulated by Hoover and Perez (1999), who sought to evaluate the performance of Gets. To implement a general-to-specific approach in a computer algorithm, all decisions must be mechanized. In doing so, Hoover and Perez made some important advances in practical modelling, and our approach builds on these by introducing further improvements. Given an initial general model, many reduction paths could be considered, and different selection strategies adopted for each path. Some of

these searches may lead to different terminal specifications, between which a choice must be made. Consequently, the reduction process is inherently iterative. Should multiple congruent contenders eventuate after a reduction round, encompassing can be used to test between them, with only the surviving (usually non-nested) specifications retained. If multiple models still remain after this 'testimation' process, a new general model is formed from their union, and the simplification process re-applied. Should that union repeat, a final selection is made using information criteria; otherwise a unique congruent and encompassing reduction has been located. Automating Gets throws further light on several methodological issues, and prompts some new ideas, which are discussed in section 4. While the joint issue of variable selection and diagnostic testing using multiple criteria has eluded most attempts at theoretical analysis, computer automation of the model-selection process allows us to evaluate econometric model-selection strategies by simulation. Section 6 presents the results of some Monte Carlo experiments to investigate whether the model-selection process works well or fails badly; their implications for the calibration of PcGets are also analyzed. The empirical illustrations presented in section 7 demonstrate the usefulness of PcGets for applied econometric research. Section 8 then investigates model selection in forecasting, testing, and policy analysis, and shows the drawbacks of some widely-used approaches.

2 Theory of reduction

First we define the notion of an empirical model, then explain the origins of such models by the theory of reduction.

2.1 Empirical models

In an experiment, the output is caused by the inputs and can be treated as if it were a mechanism:

    y_t = f(z_t) + ν_t
    [output] [input] [perturbation]                                   (1)

where y_t is the observed outcome of the experiment when z_t is the experimental input, f(·) is the mapping from input to output, and ν_t is a small, random perturbation which varies between experiments conducted at the same values of z. Given the same inputs {z_t}, repeating the experiment generates essentially the same outputs. In an econometric model, however:

    y_t = g(z_t) + ε_t
    [observed] [explanation] [remainder]                              (2)

y_t can always be decomposed into two components, namely g(z_t) (the part explained) and ε_t (the part unexplained). Such a partition is feasible even when y_t does not depend on g(z_t). In econometrics:

    ε_t = y_t - g(z_t).                                               (3)

Thus, models can be designed by selection of z_t. Design criteria must be analyzed, and lead to the notion of a congruent model: one that matches the data evidence on the measured attributes. Successive congruent models should be able to explain previous ones, which is the concept of encompassing, and thereby progress can be achieved.
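To make the contrast between (1) and (2) concrete, the following small simulation (an illustrative Python sketch of my own, not part of the original notes; all names and numbers are assumptions) shows that the decomposition in (3) can always be constructed, even when the chosen z_t played no role in generating y_t, which is precisely why congruence and encompassing, rather than fit alone, are needed as design criteria.

    # Illustrative sketch (assumed setup): the decomposition y_t = g(z_t) + e_t
    # in (2)-(3) exists by construction even when y_t does not depend on z_t.
    import numpy as np

    rng = np.random.default_rng(0)
    T = 200
    z = rng.normal(size=T)          # candidate "explanation", irrelevant by construction
    y = 1.0 + rng.normal(size=T)    # outcome generated without any role for z

    # Least-squares "explained" part g(z_t) = a + b*z_t and its derived remainder
    X = np.column_stack([np.ones(T), z])
    coef = np.linalg.lstsq(X, y, rcond=None)[0]
    e = y - X @ coef                # e_t = y_t - g(z_t): a designed, not autonomous, error

    print("fitted (a, b):", coef.round(3))
    print("share of variance 'explained':", round(1 - e.var() / y.var(), 4))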

2.2 DGP

Let {u_t} denote a stochastic process where u_t is a vector of n random variables. Consider the sample U_T^1 = (u_1 ... u_T), where U_t^1 = (u_1 ... u_t). Denote the initial conditions by U_0 = (... u_{-r} ... u_{-1} u_0), and let U_t = (U_0 : U_t^1). The density function of U_T^1 conditional on U_0 is given by D_U(U_T^1 | U_0, ψ), where D_U(·) is represented parametrically by a k-dimensional vector of parameters ψ = (ψ_1 ... ψ_k) with parameter space Ψ ⊆ R^k. All elements of ψ need not be the same at each t, and some of the {ψ_i} may reflect transient effects or regime shifts. The data generation process (DGP) of {u_t} is written as:

    D_U(U_T^1 | U_0, ψ)  with  ψ ∈ Ψ ⊆ R^k.                          (4)

The complete sample {u_t, t = 1, ..., T} is generated from D_U(·) by a population parameter value ψ_p. The sample joint data density D_U(U_T^1 | U_0, ψ) is called the Haavelmo distribution (see e.g., Spanos, 1989). The complete set of random variables relevant to the economy under investigation over t = 1, ..., T is denoted {u*_t}, where * denotes a perfectly measured variable, and U*_T = (u*_1, ..., u*_T), defined on the probability space (Ω, F, P). The DGP induces U*_T = (u*_1, ..., u*_T), but U*_T is unmanageably large. Operational models are defined by a sequence of data reductions, organized into eleven stages.

2.3 Data transformations and aggregation

One-one mapping of U*_T to a new data set W_T: U*_T ↔ W_T. The DGP of U*_T, and so of W_T, is characterized by the joint density:

    D_U(U*_T | U_0, ψ_T) = D_W(W_T | W_0, φ_T)                        (5)

where ψ_T ∈ Ψ and φ_T ∈ Φ, making parameter change explicit. The transformation from U* to W affects the parameter space, so Ψ is transformed into Φ.

2.4 Parameters of interest

µ ∈ M: identifiable, and invariant to an interesting class of interventions.

2.5 Data partition

Partition W_T into the two sets:

    W_T = (X_T : V_T)                                                 (6)

where the X_T matrix is T × n_1. Everything about µ must be learnt from analyzing X_T alone, so that V_T is not essential to inference about µ.

2.6 Marginalization

    D_W(W_T | W_0, φ_T) = D_{V|X}(V_T | X_T, W_0, Λ_{a,T}) D_X(X_T | W_0, Λ_{b,T}).    (7)

Eliminate V_T by discarding the conditional density D_{V|X}(V_T | X_T, W_0, Λ_{a,T}) in (7), while retaining the marginal density D_X(X_T | W_0, Λ_{b,T}). µ must be a function of Λ_{b,T} alone, given by µ = f(Λ_{b,T}). A cut is required, so that (Λ_{a,T} : Λ_{b,T}) ∈ Λ_a × Λ_b.
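The requirement that µ be recoverable from Λ_{b,T} alone, together with the cut, is not automatic: discarding a correlated variable that matters changes the retained parameters. The sketch below (my own numerical illustration in Python; the coefficients and names are assumptions, not taken from the lecture) shows the marginal coefficient on x differing from the corresponding parameter of the joint analysis once v is dropped.

    # Sketch (assumed numbers): marginalizing with respect to a relevant,
    # correlated variable v alters the parameters of the retained model for x.
    import numpy as np

    rng = np.random.default_rng(1)
    T = 20000
    v = rng.normal(size=T)
    x = 0.8 * v + rng.normal(size=T)                 # x and v are correlated
    y = 1.0 * x + 0.5 * v + rng.normal(size=T)       # both matter for y

    b_joint = np.linalg.lstsq(np.column_stack([x, v]), y, rcond=None)[0]
    b_marg = np.linalg.lstsq(x[:, None], y, rcond=None)[0]

    print("joint analysis (x, v):", b_joint.round(2))    # close to (1.0, 0.5)
    print("marginal analysis (x):", b_marg.round(2))     # x's coefficient absorbs part of v's role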

2.7 Sequential factorization

To create the innovation process, sequentially factorize X_T as:

    D_X(X_T | W_0, Λ_{b,T}) = ∏_{t=1}^{T} D_x(x_t | X_{t-1}, W_0, λ_{b,t}).            (8)

The mean-innovation error process is ε_t = x_t - E[x_t | X_{t-1}].

Sequential factorization of W_T. Alternatively:

    D_W(W_T | W_0, φ_T) = ∏_{t=1}^{T} D_w(w_t | W_{t-1}, δ_t).                         (9)

The RHS innovation process is η_t = w_t - E[w_t | W_{t-1}].

Marginalizing with respect to V_{t-1}. Since W_{t-1} = {V_{t-1}, X_{t-1}, W_0}:

    D_w(w_t | W_{t-1}, δ_t) = D_{v|x}(v_t | x_t, W_{t-1}, δ_{a,t}) D_x(x_t | V_{t-1}, X_{t-1}, W_0, δ_{b,t}).    (10)

µ must be obtained from {δ_{b,t}} alone. Marginalize with respect to V_{t-1}:

    D_x(x_t | V_{t-1}, X_{t-1}, W_0, δ_{b,t}) = D_x(x_t | X_{t-1}, W_0, δ*_{b,t}).     (11)

There is no loss of information if and only if δ_{b,t} = δ*_{b,t} ∀t, so that the conditional, sequential distribution of {x_t} does not depend on V_{t-1} (Granger non-causality).

2.8 Mapping to I(0)

Needed to ensure conventional inference is valid, though many inferences will be valid even if this reduction is not enforced. Cointegration would need to be treated in a separate set of lectures.

2.9 Conditional factorization

Factorize the density of x_t into sets of n_1 and n_2 variables, where n_1 + n_2 = n:

    x_t' = (y_t' : z_t'),                                                              (12)

where the y_t are endogenous and the z_t are non-modelled:

    D_x(x_t | X_{t-1}, W_0, λ_{b,t}) = D_{y|z}(y_t | z_t, X_{t-1}, W_0, θ_{a,t}) D_z(z_t | X_{t-1}, W_0, θ_{b,t}).    (13)

z_t is weakly exogenous for µ if (i) µ = f(θ_{a,t}) alone; and (ii) (θ_{a,t}, θ_{b,t}) ∈ Θ_a × Θ_b.

2.10 Constancy

Complete parameter constancy is:

    θ_{a,t} = θ_a  ∀t                                                                  (14)

where θ_a ∈ Θ_a, so that µ is a function of θ_a: µ = f(θ_a). Then:

    ∏_{t=1}^{T} D_{y|z}(y_t | z_t, X_{t-1}, W_0, θ_a)                                  (15)

with θ_a ∈ Θ_a.
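The Granger non-causality condition in (11) is directly testable: lags of v_t should add no explanatory power for x_t given its own history. A minimal sketch of such a check on simulated data (my own illustration; the lag length, data and helper function are assumptions, not PcGets output) is:

    # Sketch of a Granger non-causality check corresponding to (10)-(11);
    # simulated data, lag length and helper names are illustrative assumptions.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(2)
    T, p = 300, 2                                # sample size and lag length
    v = rng.normal(size=T)
    x = np.zeros(T)
    for t in range(1, T):
        x[t] = 0.6 * x[t - 1] + rng.normal()     # v plays no role in generating x

    def lags(s, p):
        # lags 1..p of series s, aligned with observations p, ..., T-1
        return np.column_stack([s[p - j:len(s) - j] for j in range(1, p + 1)])

    y = x[p:]
    X0 = np.column_stack([np.ones(T - p), lags(x, p)])   # restricted: own history only
    X1 = np.column_stack([X0, lags(v, p)])               # unrestricted: adds lags of v

    def rss(X):
        return np.sum((y - X @ np.linalg.lstsq(X, y, rcond=None)[0]) ** 2)

    F = ((rss(X0) - rss(X1)) / p) / (rss(X1) / (len(y) - X1.shape[1]))
    print("F =", round(F, 2), " p-value =", round(stats.f.sf(F, p, len(y) - X1.shape[1]), 2))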

2.11 Lag truncation

Fix the extent of the history of X_{t-1} in (15) at s earlier periods:

    D_{y|z}(y_t | z_t, X_{t-1}, W_0, θ_a) = D_{y|z}(y_t | z_t, X_{t-s}^{t-1}, W_0, δ).    (16)

2.12 Functional form

Map y_t into y*_t = h(y_t) and z_t into z*_t = g(z_t), and denote the resulting data by X*. Assume that y*_t and z*_t simultaneously make D_{y|z}(·) approximately normal and homoscedastic, denoted N_{n_1}[η_t, Υ]:

    D_{y|z}(y_t | z_t, X_{t-s}^{t-1}, W_0, δ) = D_{y*|z*}(y*_t | z*_t, X*_{t-s}^{t-1}, W_0, γ).    (17)

2.13 The derived model

    A(L) h(y)_t = B(L) g(z)_t + ε_t                                                       (18)

where ε_t is approximately distributed as N_{n_1}[0, Σ_ε], and A(L) and B(L) are polynomial matrices (i.e., matrices whose elements are polynomials) of order s in the lag operator L. ε_t is a derived, and not an autonomous, process, defined by:

    ε_t = A(L) h(y)_t - B(L) g(z)_t.                                                      (19)

The reduction to the generic econometric equation involves all the stages of aggregation, marginalization, conditioning etc., transforming the parameters from ψ, which determines the stochastic features of the data, to the coefficients of the empirical model.

2.14 Dominance

Consider two distinct scalar empirical models denoted M_1 and M_2, with mean-innovation processes (MIPs) {ν_t} and {ε_t} relative to their own information sets, where ν_t and ε_t have constant, finite variances σ_ν^2 and σ_ε^2 respectively. Then M_1 variance dominates M_2 if σ_ν^2 < σ_ε^2, denoted by M_1 ≻ M_2. Variance dominance is transitive, since if M_1 ≻ M_2 and M_2 ≻ M_3 then M_1 ≻ M_3; and anti-symmetric, since if M_1 ≻ M_2 then it cannot be true that M_2 ≻ M_1. A model without a MIP error can be variance dominated by a model with a MIP on a common data set. The DGP cannot be variance dominated in the population by any models thereof (see e.g. Theil, 1971, p. 543). Let U_t denote the universe of information for the DGP and let X_t be the subset, with associated innovation sequences {ν_{u,t}} and {ν_{x,t}}. Then, as {X_t} ⊆ {U_t}, E[ν_{u,t} | X_t] = 0, whereas E[ν_{x,t} | U_t] need not be zero. A model with an innovation error cannot be variance dominated by a model which uses only a subset of the same information. If ε_t = x_t - E[x_t | X_{t-1}], then σ_ε^2 is no larger than the variance of any other empirical-model error defined by ξ_t = x_t - G[x_t | X_{t-1}], whatever the choice of G[·]: the conditional expectation is the minimum mean-square error predictor. These implications favour general rather than simple empirical models, given any choice of information set, and suggest modelling the conditional expectation. A model which nests all contending explanations as special cases must variance dominate in its class. Let model M_j be characterized by parameter vector ψ_j with κ_j elements; then, as in Hendry and Richard (1982): M_1 is parsimoniously undominated in the class {M_i} if ∀i, κ_1 ≤ κ_i and no M_i ≻ M_1. Model selection procedures (such as AIC or the Schwarz criterion: see Judge, Griffiths, Hill, Lütkepohl and Lee (1985)) seek parsimoniously undominated models, but do not check for congruence.
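The claim that the conditional expectation is the minimum mean-square error predictor, which underpins variance dominance, is easy to verify numerically. The following sketch (my own illustration; the AR(1) design and the alternative predictor are assumptions) compares the residual variance of E[x_t | x_{t-1}] with that of an arbitrary G[x_t | x_{t-1}].

    # Sketch (assumed AR(1) design): the conditional expectation yields the
    # smallest error variance among predictors based on the same information.
    import numpy as np

    rng = np.random.default_rng(3)
    T, rho = 5000, 0.7
    x = np.zeros(T)
    for t in range(1, T):
        x[t] = rho * x[t - 1] + rng.normal()

    e_mip = x[1:] - rho * x[:-1]            # error of E[x_t | x_{t-1}] under the DGP
    e_alt = x[1:] - (0.3 * x[:-1] + 0.2)    # error of an arbitrary choice of G[.]

    print("variance of MIP error:        ", round(e_mip.var(), 3))
    print("variance of alternative error:", round(e_alt.var(), 3))   # larger, up to sampling noise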

2.15 Econometric concepts as measures of no information loss

[1] Aggregation entails no loss of information on marginalizing with respect to disaggregates when the retained information comprises a set of sufficient statistics for the parameters of interest µ.
[2] Transformations per se do not entail any associated reduction, but directly introduce the concept of parameters of interest, and indirectly the notions that parameters should be invariant and identifiable.
[3] Data partition is a preliminary, although the decision about which variables to include and which to omit is perhaps the most fundamental determinant of the success or otherwise of empirical modelling.
[4] Marginalizing with respect to v_t is without loss provided the remaining data are sufficient for µ, whereas marginalizing without loss with respect to V_{t-1} entails both Granger non-causality for x_t and a cut in the parameters.
[5] Sequential factorization involves no loss if the derived error process is an innovation relative to the history of the random variables, and, via the notion of common factors, reveals that autoregressive errors are a restriction and not a generalization.
[6] Integrated data systems can be reduced to I(0) by suitable combinations of cointegration and differencing, allowing conventional inference procedures to be applied to more parsimonious relationships.
[7] Conditional factorization reductions, which eliminate marginal processes, lead to no loss of information relative to the joint analysis when the conditioning variables are weakly exogenous for the parameters of interest.
[8] Parameter constancy implicitly relates to invariance, as constancy across interventions which affect the marginal processes.
[9] Lag truncation involves no loss if the error process remains an innovation despite excluding some of the past of relevant variables.
[10] Functional form approximations need involve no reduction (e.g. logs of log-normally distributed variables): when the two densities in (17) are equal.
[11] The derived model, as a reduction of the DGP, is nested within that DGP and its properties are explained by the reduction process: knowledge of the DGP entails knowledge of all reductions thereof. When knowledge of one model entails knowledge of another, the first is said to encompass the second.

2.16 Implicit model design

This corresponds to the symptomatology approach in econometrics: testing for problems (autocorrelation, heteroscedasticity, omitted variables, multicollinearity, non-constant parameters etc.), and correcting these.

2.17 Explicit model design

Mimic reduction theory in practical research to minimize the losses due to the reductions selected: this leads to Gets modelling.

2.18 A taxonomy of evaluation information

Partition the data X_T^1 used in modelling into the three information sets:

    X_T^1 = (X_{t-1}^1 : x_t : X_{t+1}^T)                                              (20)

[a] past data;
[b] present data;
[c] future data;

[d] theory information, which often is the source of parameters of interest, and is a creative stimulus in economics;
[e] measurement information, including price index theory, constructed identities such as consumption equals income minus savings, data accuracy and so on; and:
[f] data of rival models, which could be analyzed into past, present and future in turn.

The six main criteria which result for selecting an empirical model are:
[a] homoscedastic innovation errors;
[b] weakly exogenous conditioning variables for the parameters of interest;
[c] constant, invariant parameters of interest;
[d] theory-consistent, identifiable structures;
[e] data-admissible formulations on accurate observations; and
[f] encompassing of rival models.
Models which satisfy the first five information sets are said to be congruent: an encompassing congruent model satisfies all six criteria.

3 General-to-specific modelling

The practical embodiment of reduction is general-to-specific (Gets) modelling. The DGP is replaced by the concept of the local DGP (LDGP), namely the joint distribution of the subset of variables under analysis. Then a general unrestricted model (GUM) is formulated to provide a congruent approximation to the LDGP, given the theoretical and previous empirical background. The empirical analysis commences from this general specification after testing for mis-specifications, and, if none are apparent, it is simplified to a parsimonious, congruent representation, each simplification step being checked by diagnostic testing. Simplification can be done in many ways: although the goodness of a model is intrinsic to it, and not a property of the selection route, poor routes seem unlikely to deliver useful models. Even so, some economists worry about the impact of selection rules on the properties of the resulting models, and insist on the use of a priori specifications: but these require knowledge of the answer before we start, so deny empirical modelling any useful role, and in practice that approach has rarely contributed.

Few studies have investigated how well general-to-specific modelling does. However, Hoover and Perez (1999) offer important evidence in a major Monte Carlo study, reconsidering the Lovell (1983) experiments. They place 20 macro variables in a databank; generate one (y) as a function of 0 to 5 others; regress y on all 20 plus all lags thereof; then let their algorithm simplify that GUM till it finds a congruent (encompassing) irreducible result. They check up to 10 different paths, testing for mis-specification, collect the results from each, then select one choice from the remainder. By following many paths, the algorithm is protected against chance false routes, and delivers an undominated congruent model. Nevertheless, Hendry and Krolzig (1999b) improve on their algorithm in several important respects, and this section now describes these improvements.

3.1 Pre-search reductions

First, groups of variables are tested in the order of their absolute t-values, commencing with a block where all the p-values exceed 0.9, and continuing down towards the pre-assigned selection criterion, when deletion must become inadmissible. A less-stringent significance level is used at this step, usually 10%, since the insignificant variables are deleted permanently. If no test is significant, the F-test on all variables in the GUM has in effect been calculated, establishing that there is nothing to model.
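A stripped-down version of this pre-search step can be written in a few lines. The sketch below (an illustrative Python simplification under assumed data; it is not the PcGets/Ox implementation and omits diagnostic tracking) orders regressors by their absolute t-values in the GUM and deletes the largest low-|t| block whose joint F-test is insignificant at a loose pre-search level.

    # Illustrative pre-search block reduction (simplified; not the PcGets algorithm):
    # order regressors by |t| in the GUM and drop the largest block of small-|t|
    # variables whose joint deletion F-test is insignificant at level gamma.
    import numpy as np
    from scipy import stats

    def ols(y, X):
        b = np.linalg.lstsq(X, y, rcond=None)[0]
        e = y - X @ b
        s2 = e @ e / (len(y) - X.shape[1])
        se = np.sqrt(s2 * np.diag(np.linalg.inv(X.T @ X)))
        return b, se, e @ e

    def presearch(y, X, gamma=0.10):
        T, k = X.shape
        b, se, rss_full = ols(y, X)
        order = np.argsort(np.abs(b / se))              # least significant first
        for m in range(k - 1, 0, -1):                   # try the largest block first
            drop = set(order[:m].tolist())
            keep = [j for j in range(k) if j not in drop]
            _, _, rss_r = ols(y, X[:, keep])
            F = ((rss_r - rss_full) / m) / (rss_full / (T - k))
            if stats.f.sf(F, m, T - k) > gamma:         # joint deletion not rejected
                return keep
        return list(range(k))

    # Example GUM: intercept plus 20 regressors, of which only 3 are relevant
    rng = np.random.default_rng(4)
    T = 140
    Z = rng.normal(size=(T, 20))
    y = Z[:, :3] @ np.array([0.5, 0.4, 0.3]) + rng.normal(size=T)
    print("retained columns:", presearch(y, np.column_stack([np.ones(T), Z])))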

3.2 Additional paths

Blocks of variables constitute feasible search paths, in addition to individual coefficients: these resemble the block F-tests in the preceding sub-section, but are conducted along search paths. All paths that commence with an insignificant t-deletion are also explored.

3.3 Encompassing

Encompassing tests select between the candidate congruent models at the end of the path searches. Each contender is tested against their union, dropping those which are dominated by, and do not dominate, another contender. If a unique model results, select that; otherwise, if some are rejected, form the union of the remaining models, and repeat this round till no encompassing reductions result. That union then constitutes a new starting point, and the complete path-search algorithm repeats till the union is unchanged between successive rounds.

3.4 Information criteria

When a union coincides with the original GUM, or with a previous union, so that no further feasible reductions can be found, PcGets selects a model by an information criterion. The preferred final-selection rule presently is the Schwarz criterion, or BIC, defined as:

    SC = -2 log L/T + p log(T)/T,

where L is the maximized likelihood, p is the number of parameters and T is the sample size. For T = 140 and p = 40, minimum SC corresponds approximately to retaining the marginal regressor satisfying |t| ≥ 1.9.

3.5 Sub-sample reliability

For the finally-selected model, sub-sample reliability is evaluated by the Hoover-Perez overlapping split-sample test. PcGets concludes that some variables are definitely excluded, some definitely included, and some have an uncertain role, varying from a reliability of 25% (included in the final model, but insignificant overall and in both sub-samples), through to 75% (significant overall and in one sub-sample, or in both sub-samples).

3.6 Significant mis-specification tests

If the initial mis-specification tests are significant at the pre-specified level, we raise the required significance level, terminating search paths only when that higher level is violated. Empirical investigators would re-specify the GUM on rejection. To see why Gets does well, we develop the analytics for several of its procedures.

4 The econometrics of model selection

The key issue for any model-selection procedure is the cost of search, since there are always bound to be mistakes in statistical inference: specifically, how bad is it to search across many alternatives? The conventional statistical analysis of repeated testing provides a pessimistic background: every test has a non-zero null rejection frequency (or size, if independent of nuisance parameters), and so type I errors

accumulate. Setting a small size for every test can induce low power to detect the influences that really matter. Critics of general-to-specific methods have pointed to a number of potential difficulties, including the problems of lack of identification, measurement without theory, data mining, pre-test biases, ignoring selection effects, repeated testing, and the potential path dependence of any selection: see, inter alia, Faust and Whiteman (1997), Koopmans (1947), Lovell (1983), Judge and Bock (1978), Leamer (1978), Hendry, Leamer and Poirier (1990), and Pagan (1987). The following discussion draws on Hendry (2000a).

Koopmans's critique followed up the earlier attack by Keynes (1939, 1940) on Tinbergen (1940a, 1940b), and set the scene for doubting all econometric analyses that failed to commence from pre-specified models. Lovell's study of trying to select a small relation (zero to five regressors) hidden in a large database (40 variables) found a low success rate, thereby suggesting that search procedures had high costs, and supporting an adverse view of data-based model selection. The third criticism concerned applying significance tests to select variables, arguing that the resulting estimator was biased in general by being a weighted average of zero (when the variable was excluded) and an unbiased coefficient (on inclusion). The fourth concerned biases in reported coefficient standard errors from treating the selected model as if there was no uncertainty in the choice. The next argued that the probability of retaining variables that should not enter a relationship would be high, because a multitude of tests on irrelevant variables must deliver some significant outcomes. The sixth suggested that how a model was selected affected its 'credibility': at its extreme, we find the claim in Leamer (1983) that 'the mapping is the message', emphasizing the selection process over the properties of the final choice. In the face of this barrage of criticism, many economists came to doubt the value of empirical evidence, even to the extent of referring to it as 'a scientific illusion' (Summers, 1991).

The upshot of these attacks on empirical research was that almost all econometric studies had to commence from pre-specified models (or pretend they did). Summers (1991) failed to notice that this was the source of his claimed 'scientific illusion': econometric evidence had become theory dependent, with little value added, and a strong propensity to be discarded when fashions in theory changed. Much empirical evidence only depends on low-level theories, which are part of the background knowledge base not subject to scrutiny in the current analysis, so a data-based approach to studying the economy is feasible. Since theory dependence has at least as many drawbacks as sample dependence, data modelling procedures are essential: see Hendry (1995a). Indeed, all of these criticisms are refutable, as we now show.

First, identification has three attributes, as discussed in Hendry (1997), namely uniqueness, satisfying the required interpretation, and correspondence to the desired entity. A non-unique result is clearly not identified, so the first attribute is necessary, but insufficient, since uniqueness can be achieved by arbitrary restrictions (criticized by Sims, 1980, inter alia). There can exist a unique combination of several relationships which is incorrectly interpreted as one of those equations: e.g., a reduced form that has a positive price effect, wrongly interpreted as a supply relation.
Finally, a unique, interpretable model of (say) a money-demand relation may in fact correspond to a Central Bank's supply schedule, and this too is sometimes called a failure to identify the demand relation. Because economies are highly interdependent, simultaneity was long believed to be a serious problem, but higher frequencies of observation have attenuated this problem. Anyway, simultaneity is not invariant under linear transformations (although linear systems are), so it can be avoided by eschewing contemporaneous regressors until weak exogeneity is established. Conditioning ensures a unique outcome, although it cannot guarantee that the resulting model corresponds to the underlying reality.

Next, Keynes appears to have believed that statistical work in economics is impossible without

knowledge of everything in advance. But if partial explanations are devoid of use, and empirically we could discover nothing not already known, then no science could have progressed. That is clearly refuted by the historical record. The fallacy in Keynes's argument is that, since theoretical models are incomplete and incorrect, an econometrics that is forced to use such theories as the only permissible starting point for data analysis can contribute little useful knowledge, except perhaps rejecting the theories. When invariant features of reality exist, progressive research can discover them in part without prior knowledge of the whole: see Hendry (1995b). A similar analysis applies to the attack in Koopmans on the study by Burns and Mitchell: he relies on the (unstated) assumption that only one sort of economic theory is applicable, that it is correct, and that it is immutable (see Hendry and Morgan, 1995).

Data mining is revealed when conflicting evidence exists or when rival models cannot be encompassed; and if they can, then an undominated model results despite the inappropriate procedure. Thus, stringent critical evaluation renders the data-mining criticism otiose. Gilbert (1986) suggests separating output into two groups: the first contains only redundant results (those parsimoniously encompassed by the finally-selected model), and the second contains all other findings. If the second group is not null, then there has been data mining. On such a characterization, Gets cannot involve data mining, despite being heavily based on the data.

When the LDGP is known a priori from economic theory, but an investigator did not know that the resulting model was in fact true, so sought to test conventional null hypotheses on its coefficients, then inferential mistakes will occur in general. These will vary as a function of the characteristics of the LDGP, and of the particular data sample drawn, but for many parameter values the selected model will differ from the LDGP, and hence have biased coefficients. This is the pre-test problem, and is quite distinct from the costs of searching across a general set of specifications for a congruent representation of the LDGP.

If a wide variety of models would be reported when applying any given selection procedure to different samples from a common DGP, then the results using a single sample apparently understate the true uncertainty. Coefficient standard errors only reflect sampling variation conditional on a fixed specification, with no additional terms from changes in that specification (see e.g., Chatfield, 1995). Thus, reported empirical estimates must be judged conditional on the resulting equation being a good approximation to the LDGP. Undominated (i.e., encompassing) congruent models have a strong claim to provide such an approximation, and, conditional on that, their reported uncertainty is a good measure of the uncertainty inherent in such a specification for the relevant LDGP.

The theory of repeated testing is easily understood: the probability p_α that none of n independent tests of correct null hypotheses rejects at the 100α% level is p_α = (1 - α)^n. When 40 such tests are conducted at α = 0.05, p_0.05 ≈ 0.13, whereas p_0.01 ≈ 0.67. However, it is difficult to obtain spurious t-test values much in excess of three despite repeated testing: as Sargan (1981) pointed out, the t-distribution is thin tailed, so even the 1% critical value is less than three for 50 degrees of freedom. Unfortunately, stringent criteria for avoiding rejections when the null is true lower the power of rejection when it is false.
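The quoted repeated-testing probabilities are straightforward to reproduce; the short sketch below (illustrative Python, with the significance levels assumed to be 5% and 1%) evaluates p_α = (1 - α)^n for n = 40, and the overall false-rejection frequency of a battery of four 5% diagnostic tests discussed next.

    # Repeated-testing arithmetic: p_alpha = (1 - alpha)^n for n independent
    # tests of correct nulls (the alpha values of 5% and 1% are assumed).
    n = 40
    for alpha in (0.05, 0.01):
        print(f"alpha = {alpha:.2f}: P(no rejection in {n} tests) = {(1 - alpha) ** n:.2f}")

    # A battery of four independent diagnostic tests, each at 5%
    print("overall false-rejection frequency:", round(1 - 0.95 ** 4, 3))   # about 0.19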
The logic of repeated testing is accurate as a description of the statistical properties of mis-specification testing: conducting four independent diagnostic tests at 5% will lead to about 19% false rejections. Nevertheless, even in that context there are possible solutions, such as using a single combined test, which can substantially lower the size without too great a power loss (see e.g., Godfrey and Veale, 1999). It is less clear that the analysis is a valid characterization of selection procedures in general when more than one path is searched, since there is then error correction for wrong reductions. In fact, the serious practical difficulty is not one of avoiding

spuriously significant regressors because of repeated testing when many hypotheses are tested; it is retaining all the variables that genuinely matter. Path dependence arises when the results obtained in a modelling exercise depend on the simplification sequence adopted. Since the quality of a model is intrinsic to it, and progressive research induces a sequence of mutually-encompassing congruent models, proponents of Gets consider that the path adopted is unlikely to matter. As Hendry and Mizon (1990) expressed the matter: 'the model is the message'. Nevertheless, it must be true that some simplifications lead to poorer representations than others. One aspect of the value-added of the approach discussed below is that it ensures a unique outcome, so the path does not matter.

We conclude that each of these criticisms of Gets can be refuted. Indeed, White (1990) showed that, with sufficiently-rigorous testing, the selected model will converge to the DGP. Thus, any overfitting and mis-specification problems are primarily finite sample. Moreover, Mayo (1981) emphasized the importance of diagnostic test information being effectively independent of the sufficient statistics from which parameter estimates are derived. Hoover and Perez (1999) show how much better Gets is than any method Lovell considered, suggesting that modelling per se need not be bad. Indeed, overall, the size of their selection procedure is close to that expected, and the power is reasonable. Moreover, re-running their experiments using our version (PcGets) delivered substantively better outcomes (see Hendry and Krolzig, 1999b). Thus, the case against model selection is far from proved.

4.1 Search costs

Let p_i^dgp denote the probability of retaining the i-th variable out of k when commencing from the DGP specification and applying the relevant selection test at the same significance level as the search procedure. Then 1 - p_i^dgp is the expected cost of inference. For irrelevant variables, p_i^dgp = 0, so that the whole cost for those is attributed to search. Let p_i^gum denote the probability of retaining the i-th variable when commencing from the GUM, and applying the same selection test and significance level. Then the search costs are p_i^dgp - p_i^gum. False rejection frequencies of the null can be lowered by tightening the significance levels of the selection tests, but only at the cost of also reducing power. However, it is feasible to lower the former and raise the latter simultaneously by an improved search algorithm, subject to the bound of attaining the same performance as knowing the DGP from the outset.

To keep search costs low, any model-selection process must satisfy a number of requirements. First, it must start from a congruent statistical model to ensure that selection inferences are reliable: consequently, it must test for model mis-specification initially, and such tests must be well calibrated (nominal size close to actual). Secondly, it must avoid getting stuck in search paths that initially inadvertently delete relevant variables, thereby retaining many other variables as proxies: consequently, it must search many paths. Thirdly, it must check that eliminating variables does not induce diagnostic tests to become significant during searches: consequently, model mis-specification tests must be computed at every stage. Fourthly, it must ensure that any candidate model parsimoniously encompasses the GUM, so that no loss of information has occurred.
Fifthly, it must have a high probability of retaining relevant variables: consequently, a loose significance level and powerful selection tests are required. Sixthly, it must have a low probability of retaining variables that are actually irrelevant: this clashes in part with the fifth objective, and consequently requires an alternative use of the available information. Finally, it must have powerful procedures to select between the candidate models, and any models derived from them, to end with a good model choice, namely one for which:

    L = Σ_{i=1}^{k} | p_i^dgp - p_i^gum |

is close to zero.

4.2 Selection probabilities

When searching a large database for the DGP, an investigator could well retain the relevant regressors much less often than when the correct specification is known, in addition to retaining irrelevant variables in the finally-selected model. We first examine the problem of retaining significant variables commencing from the DGP, then turn to any additional power losses resulting from search. For a regression coefficient β_i, hypothesis tests of the null H_0: β_i = 0 will reject with a probability dependent on the non-centrality parameter of the test. We consider the slightly more general setting where t-tests are used to check an hypothesis, denoted t(n, ψ) for n degrees of freedom, where ψ is the non-centrality parameter, equal to zero under the null. For a critical value c_α, P(|t| ≥ c_α | H_0) = α, where H_0 implies ψ = 0. The following table records some approximate power calculations when one coefficient null hypothesis is tested and when four are tested, in each case precisely once.

[Table: t-test powers, with columns for ψ, n, α, P(|t| ≥ c_α) for a single test, and the probability that all four of four such independent tests reject; the numerical entries were not preserved in this transcription.]

Thus, there is little hope of retaining variables with ψ = 1, and only about a 50% chance of retaining a single variable with a theoretical t of 2 when the critical value is also 2, falling further for a critical value of 2.6. When ψ = 3, the power of detection is sharply higher, but still leads to more than 35% mis-classifications. Finally, when ψ = 4, one such variable will almost always be retained. However, the final column shows that the probability of retaining all four relevant variables with the given non-centrality is essentially negligible even when they are independent, except in the last few cases. Mixed cases (with different values of ψ) can be calculated by multiplying the probabilities in the fourth column (e.g., for ψ = 2, 3, 4, 6 at α = 0.01). Such combined probabilities are highly non-linear in ψ, since one is almost certain to retain all four when ψ = 6, even at a 1% significance level. The important conclusion is that, despite knowing the DGP, low signal-to-noise variables will rarely be retained using t-tests when there is any need to test the null; and if there are many relevant variables, all of them are unlikely to be retained even when they have quite large non-centralities.

4.3 Deletion probabilities

The most extreme case where low deletion probabilities might entail high search costs is when many variables are included but none actually matters. PcGets systematically checks the reducibility of the GUM by testing simplifications up to the empty model. A one-off F-test F_G of the GUM against the null model using critical value c_γ would have size P(F_G ≥ c_γ) = γ under the null if it were the only test implemented. Consequently, path searches would only commence 100γ% of the time, and some of these could also terminate at the null model. Let there be k regressors in the GUM, of which n are retained

when t-test selection is used, should the null model be rejected. In general, when there are no relevant variables, the probability of retaining no variables using t-tests with critical value c_α is:

    P(|t_i| < c_α, i = 1, ..., k) = (1 - α)^k.                                        (21)

Combining (21) with the F_G-test, the null model will be selected with approximate probability:

    p_G = (1 - γ) + γ (1 - α)^k,                                                      (22)

where the second term approximates the probability of F_G rejecting yet no regressors being retained (conditioning on F_G ≥ c_γ cannot decrease the probability of at least one rejection, so (22) is an upper bound). Since γ is set at quite a high value, such as 0.20, whereas α = 0.05 is more usual, F_G ≥ c_0.20 can occur without any |t_i| ≥ c_0.05. Evaluating (22) for γ = 0.20, α = 0.05 and k = 20 yields p_G ≈ 0.87; whereas the re-run of the Hoover-Perez experiments with k = 40 reported by Hendry and Krolzig (1999b) using γ = 0.01 yielded 97.2% in the Monte Carlo, as against a theory prediction from (22) of 99%. Alternatively, when γ = 0.1 and α = 0.01, (22) has an upper bound of 96.7%, falling to 91.3% for α = 0.05. Thus, it is relatively easy to obtain a high probability of locating the null model, even when 40 irrelevant variables are included, using relatively tight significance levels, or a reasonable probability for looser significance levels.

4.4 Path selection probabilities

We now calculate how many spurious regressors will be retained in path searches. The probability distribution of one or more null coefficients being significant in pure t-test selection at significance level α is given by the k + 1 terms of the binomial expansion of (α + (1 - α))^k. The following table illustrates by enumeration for k = 3:

    event                                    probability       number retained
    P(|t_i| < c_α, i = 1, ..., 3)            (1 - α)^3          0
    P(|t_i| ≥ c_α, |t_j| < c_α, j ≠ i)       3α(1 - α)^2        1
    P(|t_i| < c_α, |t_j| ≥ c_α, j ≠ i)       3(1 - α)α^2        2
    P(|t_i| ≥ c_α, i = 1, ..., 3)            α^3                3

Thus, for k = 3, the average number of variables retained is:

    n = 3α^3 + 2 × 3(1 - α)α^2 + 3α(1 - α)^2 = 3α = kα.

The result n = kα is general. When α = 0.05 and k = 40, n equals 2, falling to 0.4 for α = 0.01: so even if only t-tests are used, few spurious variables will be retained. Combining the probability of a non-null model with the number of variables selected when the GUM F-test rejects gives p = γα (where p is the probability that any given variable will be retained), which does not depend on k. For γ = 0.1 and α = 0.01, we have p = 0.001. Even for γ = 0.25 and α = 0.05, p = 0.0125, before search paths and diagnostic testing are included in the algorithm. The actual behaviour of PcGets is much more complicated than this, but can deliver a small overall size. Following the event F_G ≥ c_γ when γ = 0.1 (so the null is incorrectly rejected 10% of the time), and approximating by one variable retained when

that occurs, the average non-deletion probability (i.e., the probability that any given variable will be retained) is p_r = γn/k = 0.25%, as against the reported value of 0.19% found by Hendry and Krolzig (1999b). These are very small retention rates of spuriously-significant variables. Thus, in contrast to the relatively high costs of inference discussed in the previous section, those of search arising from retaining additional irrelevant variables are almost negligible. For a reasonable GUM with (say) 40 variables where 25 are irrelevant, even without the pre-selection and multiple path searches of PcGets, and using just t-tests at 5%, roughly one spuriously significant variable will be retained by chance. Against that, from the previous section, there is at most a 50% chance of retaining each of the variables that have non-centralities around 2, and little chance of keeping them all: the difficult problem is retention of relevance, not elimination of irrelevance. The only two solutions are better inference procedures, or looser critical values; we will consider them both.

4.5 Improved inference procedures

An inference procedure involves a sequence of steps. As a simple example, consider a procedure comprising two F-tests: the first is conducted at the γ = 50% level, the second at δ = 5%. The variables to be tested are first ordered by their t-values in the GUM, such that t_1^2 ≤ t_2^2 ≤ ... ≤ t_k^2, and the first F-test adds in variables from the smallest observed t-values till a rejection would occur, with either F_1 > c_γ or an individual |t| > c_α (say). All those variables except the last are then deleted from the model, and a second F-test is conducted of the joint significance of all the remaining variables. If that rejects, so F_2 > c_δ, all the remaining variables are retained; otherwise, all are eliminated. We will now analyze the probability properties of this two-step test when all k regressors are orthogonal, for a regression model estimated from T observations.

Once m variables are included in the first step, non-rejection requires that (a) the diagnostics are insignificant; (b) the first m - 1 variables did not induce rejection; (c) |t_m| < c_α; and (d):

    F_1(m, T - k) = (1/m) Σ_{i=1}^{m} t_i^2 ≤ c_γ.                                    (23)

Clearly, any t_i^2 < 1 reduces the mean F statistic, and since P(|t_i| < 1) ≈ 0.68, when k = 40 approximately 28 variables fall in that group; and P(|t_i| ≥ 1.65) ≈ 0.1, so only 4 variables should chance to have a larger |t_i| value on average. In the conventional setting where α = 0.05, with P(|t_i| < 2) ≈ 0.95, only 2 variables will chance to have larger t-values, whereas slightly more than half will have t_i^2 of 0.5 or smaller. Since P(F(20, 100) < 1 | H_0) ≈ 0.5, a first step with γ = 0.5 should eliminate all variables with t_i^2 ≤ 1, and some larger t-values as well, hence the need to check that |t_m| < c_α (below we explain why collinearity between variables that matter and those that do not should not jeopardize this step). A crude approximation to the likely value of (23) under H_0 is to treat all t-values within blocks as having a value equal to the block mid-point. We use the five ranges t_i^2 < 0.5, 0.5-1, 1-1.65^2, 1.65^2-4, and greater than 4, together with the expected numbers falling in each of the first four blocks, which yields F(38, 100) ≈ 0.84, noting that P(F(38, 100) < 0.84 | H_0) ≈ 0.72 (setting all t-values equal to the upper bound of each block yields an illustrative upper bound of about 1.3 for F). Thus, surprisingly large values of γ, such as 0.75, can be selected for this step yet have a high probability of eliminating almost all the irrelevant variables.
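The probability calculations underlying sections 4.2 to 4.4 can all be reproduced with a few lines of code. The sketch below (illustrative Python; the residual degrees of freedom are assumed to be T - k = 100, and the grids of non-centralities and significance levels are my choices) computes the non-central-t powers behind the selection-probability table, the null-model probability in (22), and the average number kα of spuriously retained regressors.

    # Sketch reproducing the probability calculations of sections 4.2-4.4
    # (degrees of freedom, non-centralities and levels are assumptions).
    from scipy import stats

    df = 100                                    # roughly T - k for T = 140, k = 40
    for alpha in (0.05, 0.01):
        c = stats.t.ppf(1 - alpha / 2, df)      # two-sided critical value
        print(f"alpha = {alpha}:")
        for psi in (1, 2, 3, 4, 6):
            power = stats.nct.sf(c, df, psi) + stats.nct.cdf(-c, df, psi)
            print(f"  psi = {psi}: P(|t| >= c) = {power:.2f},  all four retained = {power ** 4:.3f}")

    # Equation (22): probability of selecting the null model
    p_G = lambda gamma, alpha, k: (1 - gamma) + gamma * (1 - alpha) ** k
    print("p_G(gamma=0.20, alpha=0.05, k=20) =", round(p_G(0.20, 0.05, 20), 2))   # about 0.87
    print("p_G(gamma=0.01, alpha=0.05, k=40) =", round(p_G(0.01, 0.05, 40), 2))   # about 0.99

    # Average number of spuriously retained regressors under pure t-test selection
    k = 40
    for alpha in (0.05, 0.01):
        print(f"k * alpha at alpha = {alpha}:", k * alpha)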
Indeed, using γ = 0.75 entails c_γ ≈ 0.75 when m = 20, since:

    P(F(20, 100) ≥ 0.75 | H_0) ≈ 0.75,

REED TUTORIALS (Pty) LTD ECS3706 EXAM PACK

REED TUTORIALS (Pty) LTD ECS3706 EXAM PACK REED TUTORIALS (Pty) LTD ECS3706 EXAM PACK 1 ECONOMETRICS STUDY PACK MAY/JUNE 2016 Question 1 (a) (i) Describing economic reality (ii) Testing hypothesis about economic theory (iii) Forecasting future

More information

Covers Chapter 10-12, some of 16, some of 18 in Wooldridge. Regression Analysis with Time Series Data

Covers Chapter 10-12, some of 16, some of 18 in Wooldridge. Regression Analysis with Time Series Data Covers Chapter 10-12, some of 16, some of 18 in Wooldridge Regression Analysis with Time Series Data Obviously time series data different from cross section in terms of source of variation in x and y temporal

More information

Testing for Unit Roots with Cointegrated Data

Testing for Unit Roots with Cointegrated Data Discussion Paper No. 2015-57 August 19, 2015 http://www.economics-ejournal.org/economics/discussionpapers/2015-57 Testing for Unit Roots with Cointegrated Data W. Robert Reed Abstract This paper demonstrates

More information

The Linear Regression Model

The Linear Regression Model The Linear Regression Model Carlo Favero Favero () The Linear Regression Model 1 / 67 OLS To illustrate how estimation can be performed to derive conditional expectations, consider the following general

More information

A Non-Parametric Approach of Heteroskedasticity Robust Estimation of Vector-Autoregressive (VAR) Models

A Non-Parametric Approach of Heteroskedasticity Robust Estimation of Vector-Autoregressive (VAR) Models Journal of Finance and Investment Analysis, vol.1, no.1, 2012, 55-67 ISSN: 2241-0988 (print version), 2241-0996 (online) International Scientific Press, 2012 A Non-Parametric Approach of Heteroskedasticity

More information

Testing Restrictions and Comparing Models

Testing Restrictions and Comparing Models Econ. 513, Time Series Econometrics Fall 00 Chris Sims Testing Restrictions and Comparing Models 1. THE PROBLEM We consider here the problem of comparing two parametric models for the data X, defined by

More information

Empirical Model Discovery

Empirical Model Discovery Automatic Methods for Empirical Model Discovery p.1/151 Empirical Model Discovery David F. Hendry Department of Economics, Oxford University Statistical Science meets Philosophy of Science Conference LSE,

More information

ECOM 009 Macroeconomics B. Lecture 3

ECOM 009 Macroeconomics B. Lecture 3 ECOM 009 Macroeconomics B Lecture 3 Giulio Fella c Giulio Fella, 2014 ECOM 009 Macroeconomics B - Lecture 3 84/197 Predictions of the PICH 1. Marginal propensity to consume out of wealth windfalls 0.03.

More information

Inflation Revisited: New Evidence from Modified Unit Root Tests

Inflation Revisited: New Evidence from Modified Unit Root Tests 1 Inflation Revisited: New Evidence from Modified Unit Root Tests Walter Enders and Yu Liu * University of Alabama in Tuscaloosa and University of Texas at El Paso Abstract: We propose a simple modification

More information

DEPARTMENT OF ECONOMICS DISCUSSION PAPER SERIES

DEPARTMENT OF ECONOMICS DISCUSSION PAPER SERIES ISSN 1471-0498 DEPARTMENT OF ECONOMICS DISCUSSION PAPER SERIES STEP-INDICATOR SATURATION Jurgen A. Doornik, David F. Hendry and Felix Pretis Number 658 June 2013 Manor Road Building, Manor Road, Oxford

More information

Econometría 2: Análisis de series de Tiempo

Econometría 2: Análisis de series de Tiempo Econometría 2: Análisis de series de Tiempo Karoll GOMEZ kgomezp@unal.edu.co http://karollgomez.wordpress.com Segundo semestre 2016 IX. Vector Time Series Models VARMA Models A. 1. Motivation: The vector

More information

Evaluating Automatic Model Selection

Evaluating Automatic Model Selection Evaluating Automatic Model Selection Jennifer L. Castle, Jurgen A. Doornik and David F. Hendry Department of Economics, University of Oxford, UK Abstract We outline a range of criteria for evaluating model

More information

MULTIPLE TIME SERIES MODELS

MULTIPLE TIME SERIES MODELS MULTIPLE TIME SERIES MODELS Patrick T. Brandt University of Texas at Dallas John T. Williams University of California, Riverside 1. INTRODUCTION TO MULTIPLE TIME SERIES MODELS Many social science data

More information

Econ 423 Lecture Notes: Additional Topics in Time Series 1

Econ 423 Lecture Notes: Additional Topics in Time Series 1 Econ 423 Lecture Notes: Additional Topics in Time Series 1 John C. Chao April 25, 2017 1 These notes are based in large part on Chapter 16 of Stock and Watson (2011). They are for instructional purposes

More information

Economics 308: Econometrics Professor Moody

Economics 308: Econometrics Professor Moody Economics 308: Econometrics Professor Moody References on reserve: Text Moody, Basic Econometrics with Stata (BES) Pindyck and Rubinfeld, Econometric Models and Economic Forecasts (PR) Wooldridge, Jeffrey

More information

Problems in model averaging with dummy variables

Problems in model averaging with dummy variables Problems in model averaging with dummy variables David F. Hendry and J. James Reade Economics Department, Oxford University Model Evaluation in Macroeconomics Workshop, University of Oslo 6th May 2005

More information

MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS

MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS MULTIPLE REGRESSION AND ISSUES IN REGRESSION ANALYSIS Page 1 MSR = Mean Regression Sum of Squares MSE = Mean Squared Error RSS = Regression Sum of Squares SSE = Sum of Squared Errors/Residuals α = Level

More information

This is a repository copy of The Error Correction Model as a Test for Cointegration.

This is a repository copy of The Error Correction Model as a Test for Cointegration. This is a repository copy of The Error Correction Model as a Test for Cointegration. White Rose Research Online URL for this paper: http://eprints.whiterose.ac.uk/9886/ Monograph: Kanioura, A. and Turner,

More information

1 Motivation for Instrumental Variable (IV) Regression

1 Motivation for Instrumental Variable (IV) Regression ECON 370: IV & 2SLS 1 Instrumental Variables Estimation and Two Stage Least Squares Econometric Methods, ECON 370 Let s get back to the thiking in terms of cross sectional (or pooled cross sectional) data

More information

Econometrics Summary Algebraic and Statistical Preliminaries

Econometrics Summary Algebraic and Statistical Preliminaries Econometrics Summary Algebraic and Statistical Preliminaries Elasticity: The point elasticity of Y with respect to L is given by α = ( Y/ L)/(Y/L). The arc elasticity is given by ( Y/ L)/(Y/L), when L

More information

EC408 Topics in Applied Econometrics. B Fingleton, Dept of Economics, Strathclyde University

EC408 Topics in Applied Econometrics. B Fingleton, Dept of Economics, Strathclyde University EC408 Topics in Applied Econometrics B Fingleton, Dept of Economics, Strathclyde University Applied Econometrics What is spurious regression? How do we check for stochastic trends? Cointegration and Error

More information

Applied Econometrics (MSc.) Lecture 3 Instrumental Variables

Applied Econometrics (MSc.) Lecture 3 Instrumental Variables Applied Econometrics (MSc.) Lecture 3 Instrumental Variables Estimation - Theory Department of Economics University of Gothenburg December 4, 2014 1/28 Why IV estimation? So far, in OLS, we assumed independence.

More information

Lecture 5: Unit Roots, Cointegration and Error Correction Models The Spurious Regression Problem

Lecture 5: Unit Roots, Cointegration and Error Correction Models The Spurious Regression Problem Lecture 5: Unit Roots, Cointegration and Error Correction Models The Spurious Regression Problem Prof. Massimo Guidolin 20192 Financial Econometrics Winter/Spring 2018 Overview Stochastic vs. deterministic

More information

405 ECONOMETRICS Chapter # 11: MULTICOLLINEARITY: WHAT HAPPENS IF THE REGRESSORS ARE CORRELATED? Domodar N. Gujarati

405 ECONOMETRICS Chapter # 11: MULTICOLLINEARITY: WHAT HAPPENS IF THE REGRESSORS ARE CORRELATED? Domodar N. Gujarati 405 ECONOMETRICS Chapter # 11: MULTICOLLINEARITY: WHAT HAPPENS IF THE REGRESSORS ARE CORRELATED? Domodar N. Gujarati Prof. M. El-Sakka Dept of Economics Kuwait University In this chapter we take a critical

More information

Volume 30, Issue 1. The relationship between the F-test and the Schwarz criterion: Implications for Granger-causality tests

Volume 30, Issue 1. The relationship between the F-test and the Schwarz criterion: Implications for Granger-causality tests Volume 30, Issue 1 The relationship between the F-test and the Schwarz criterion: Implications for Granger-causality tests Erdal Atukeren ETH Zurich - KOF Swiss Economic Institute Abstract In applied research,

More information

Oil price and macroeconomy in Russia. Abstract

Oil price and macroeconomy in Russia. Abstract Oil price and macroeconomy in Russia Katsuya Ito Fukuoka University Abstract In this note, using the VEC model we attempt to empirically investigate the effects of oil price and monetary shocks on the

More information

Christopher Dougherty London School of Economics and Political Science

Christopher Dougherty London School of Economics and Political Science Introduction to Econometrics FIFTH EDITION Christopher Dougherty London School of Economics and Political Science OXFORD UNIVERSITY PRESS Contents INTRODU CTION 1 Why study econometrics? 1 Aim of this

More information

Stationarity and cointegration tests: Comparison of Engle - Granger and Johansen methodologies

Stationarity and cointegration tests: Comparison of Engle - Granger and Johansen methodologies MPRA Munich Personal RePEc Archive Stationarity and cointegration tests: Comparison of Engle - Granger and Johansen methodologies Faik Bilgili Erciyes University, Faculty of Economics and Administrative

More information

Title. Description. var intro Introduction to vector autoregressive models

Title. Description. var intro Introduction to vector autoregressive models Title var intro Introduction to vector autoregressive models Description Stata has a suite of commands for fitting, forecasting, interpreting, and performing inference on vector autoregressive (VAR) models

More information

A General Overview of Parametric Estimation and Inference Techniques.

A General Overview of Parametric Estimation and Inference Techniques. A General Overview of Parametric Estimation and Inference Techniques. Moulinath Banerjee University of Michigan September 11, 2012 The object of statistical inference is to glean information about an underlying

More information

The Identification of ARIMA Models

The Identification of ARIMA Models APPENDIX 4 The Identification of ARIMA Models As we have established in a previous lecture, there is a one-to-one correspondence between the parameters of an ARMA(p, q) model, including the variance of

More information

Forecast comparison of principal component regression and principal covariate regression

Forecast comparison of principal component regression and principal covariate regression Forecast comparison of principal component regression and principal covariate regression Christiaan Heij, Patrick J.F. Groenen, Dick J. van Dijk Econometric Institute, Erasmus University Rotterdam Econometric

More information

Econometric Methods. Prediction / Violation of A-Assumptions. Burcu Erdogan. Universität Trier WS 2011/2012

Econometric Methods. Prediction / Violation of A-Assumptions. Burcu Erdogan. Universität Trier WS 2011/2012 Econometric Methods Prediction / Violation of A-Assumptions Burcu Erdogan Universität Trier WS 2011/2012 (Universität Trier) Econometric Methods 30.11.2011 1 / 42 Moving on to... 1 Prediction 2 Violation

More information

We Ran One Regression

We Ran One Regression We Ran One Regression David F. Hendry and Hans-Martin Krolzig Department of Economics, Oxford University. March 10, 2004 Abstract The recent controversy over model selection in the context of growth regressions

More information

A Guide to Modern Econometric:

A Guide to Modern Econometric: A Guide to Modern Econometric: 4th edition Marno Verbeek Rotterdam School of Management, Erasmus University, Rotterdam B 379887 )WILEY A John Wiley & Sons, Ltd., Publication Contents Preface xiii 1 Introduction

More information

TAKEHOME FINAL EXAM e iω e 2iω e iω e 2iω

TAKEHOME FINAL EXAM e iω e 2iω e iω e 2iω ECO 513 Spring 2015 TAKEHOME FINAL EXAM (1) Suppose the univariate stochastic process y is ARMA(2,2) of the following form: y t = 1.6974y t 1.9604y t 2 + ε t 1.6628ε t 1 +.9216ε t 2, (1) where ε is i.i.d.

More information

Wooldridge, Introductory Econometrics, 3d ed. Chapter 9: More on specification and data problems

Wooldridge, Introductory Econometrics, 3d ed. Chapter 9: More on specification and data problems Wooldridge, Introductory Econometrics, 3d ed. Chapter 9: More on specification and data problems Functional form misspecification We may have a model that is correctly specified, in terms of including

More information

LECTURE 10. Introduction to Econometrics. Multicollinearity & Heteroskedasticity

LECTURE 10. Introduction to Econometrics. Multicollinearity & Heteroskedasticity LECTURE 10 Introduction to Econometrics Multicollinearity & Heteroskedasticity November 22, 2016 1 / 23 ON PREVIOUS LECTURES We discussed the specification of a regression equation Specification consists

More information

The regression model with one stochastic regressor (part II)

The regression model with one stochastic regressor (part II) The regression model with one stochastic regressor (part II) 3150/4150 Lecture 7 Ragnar Nymoen 6 Feb 2012 We will finish Lecture topic 4: The regression model with stochastic regressor We will first look

More information

Lecture 6: Dynamic Models

Lecture 6: Dynamic Models Lecture 6: Dynamic Models R.G. Pierse 1 Introduction Up until now we have maintained the assumption that X values are fixed in repeated sampling (A4) In this lecture we look at dynamic models, where the

More information

Applied Microeconometrics (L5): Panel Data-Basics

Applied Microeconometrics (L5): Panel Data-Basics Applied Microeconometrics (L5): Panel Data-Basics Nicholas Giannakopoulos University of Patras Department of Economics ngias@upatras.gr November 10, 2015 Nicholas Giannakopoulos (UPatras) MSc Applied Economics

More information

Forecasting Levels of log Variables in Vector Autoregressions

Forecasting Levels of log Variables in Vector Autoregressions September 24, 200 Forecasting Levels of log Variables in Vector Autoregressions Gunnar Bårdsen Department of Economics, Dragvoll, NTNU, N-749 Trondheim, NORWAY email: gunnar.bardsen@svt.ntnu.no Helmut

More information

Hypothesis testing Goodness of fit Multicollinearity Prediction. Applied Statistics. Lecturer: Serena Arima

Hypothesis testing Goodness of fit Multicollinearity Prediction. Applied Statistics. Lecturer: Serena Arima Applied Statistics Lecturer: Serena Arima Hypothesis testing for the linear model Under the Gauss-Markov assumptions and the normality of the error terms, we saw that β N(β, σ 2 (X X ) 1 ) and hence s

More information

An Introduction to Path Analysis

An Introduction to Path Analysis An Introduction to Path Analysis PRE 905: Multivariate Analysis Lecture 10: April 15, 2014 PRE 905: Lecture 10 Path Analysis Today s Lecture Path analysis starting with multivariate regression then arriving

More information

4.8 Instrumental Variables

4.8 Instrumental Variables 4.8. INSTRUMENTAL VARIABLES 35 4.8 Instrumental Variables A major complication that is emphasized in microeconometrics is the possibility of inconsistent parameter estimation due to endogenous regressors.

More information

Tests of the Present-Value Model of the Current Account: A Note

Tests of the Present-Value Model of the Current Account: A Note Tests of the Present-Value Model of the Current Account: A Note Hafedh Bouakez Takashi Kano March 5, 2007 Abstract Using a Monte Carlo approach, we evaluate the small-sample properties of four different

More information

DATABASE AND METHODOLOGY

DATABASE AND METHODOLOGY CHAPTER 3 DATABASE AND METHODOLOGY In the present chapter, sources of database used and methodology applied for the empirical analysis has been presented The whole chapter has been divided into three sections

More information

Section 2 NABE ASTEF 65

Section 2 NABE ASTEF 65 Section 2 NABE ASTEF 65 Econometric (Structural) Models 66 67 The Multiple Regression Model 68 69 Assumptions 70 Components of Model Endogenous variables -- Dependent variables, values of which are determined

More information

THE LONG-RUN DETERMINANTS OF MONEY DEMAND IN SLOVAKIA MARTIN LUKÁČIK - ADRIANA LUKÁČIKOVÁ - KAROL SZOMOLÁNYI

THE LONG-RUN DETERMINANTS OF MONEY DEMAND IN SLOVAKIA MARTIN LUKÁČIK - ADRIANA LUKÁČIKOVÁ - KAROL SZOMOLÁNYI 92 Multiple Criteria Decision Making XIII THE LONG-RUN DETERMINANTS OF MONEY DEMAND IN SLOVAKIA MARTIN LUKÁČIK - ADRIANA LUKÁČIKOVÁ - KAROL SZOMOLÁNYI Abstract: The paper verifies the long-run determinants

More information

ECON 4160, Lecture 11 and 12

ECON 4160, Lecture 11 and 12 ECON 4160, 2016. Lecture 11 and 12 Co-integration Ragnar Nymoen Department of Economics 9 November 2017 1 / 43 Introduction I So far we have considered: Stationary VAR ( no unit roots ) Standard inference

More information

A Robust Approach to Estimating Production Functions: Replication of the ACF procedure

A Robust Approach to Estimating Production Functions: Replication of the ACF procedure A Robust Approach to Estimating Production Functions: Replication of the ACF procedure Kyoo il Kim Michigan State University Yao Luo University of Toronto Yingjun Su IESR, Jinan University August 2018

More information

Inference with few assumptions: Wasserman s example

Inference with few assumptions: Wasserman s example Inference with few assumptions: Wasserman s example Christopher A. Sims Princeton University sims@princeton.edu October 27, 2007 Types of assumption-free inference A simple procedure or set of statistics

More information

Measurement Independence, Parameter Independence and Non-locality

Measurement Independence, Parameter Independence and Non-locality Measurement Independence, Parameter Independence and Non-locality Iñaki San Pedro Department of Logic and Philosophy of Science University of the Basque Country, UPV/EHU inaki.sanpedro@ehu.es Abstract

More information

Topic 4: Model Specifications

Topic 4: Model Specifications Topic 4: Model Specifications Advanced Econometrics (I) Dong Chen School of Economics, Peking University 1 Functional Forms 1.1 Redefining Variables Change the unit of measurement of the variables will

More information

Lecture 2: Univariate Time Series

Lecture 2: Univariate Time Series Lecture 2: Univariate Time Series Analysis: Conditional and Unconditional Densities, Stationarity, ARMA Processes Prof. Massimo Guidolin 20192 Financial Econometrics Spring/Winter 2017 Overview Motivation:

More information

Are US Output Expectations Unbiased? A Cointegrated VAR Analysis in Real Time

Are US Output Expectations Unbiased? A Cointegrated VAR Analysis in Real Time Are US Output Expectations Unbiased? A Cointegrated VAR Analysis in Real Time by Dimitrios Papaikonomou a and Jacinta Pires b, a Ministry of Finance, Greece b Christ Church, University of Oxford, UK Abstract

More information

Lecture Notes 1: Decisions and Data. In these notes, I describe some basic ideas in decision theory. theory is constructed from

Lecture Notes 1: Decisions and Data. In these notes, I describe some basic ideas in decision theory. theory is constructed from Topics in Data Analysis Steven N. Durlauf University of Wisconsin Lecture Notes : Decisions and Data In these notes, I describe some basic ideas in decision theory. theory is constructed from The Data:

More information

Empirical Economic Research, Part II

Empirical Economic Research, Part II Based on the text book by Ramanathan: Introductory Econometrics Robert M. Kunst robert.kunst@univie.ac.at University of Vienna and Institute for Advanced Studies Vienna December 7, 2011 Outline Introduction

More information

Lectures 5 & 6: Hypothesis Testing

Lectures 5 & 6: Hypothesis Testing Lectures 5 & 6: Hypothesis Testing in which you learn to apply the concept of statistical significance to OLS estimates, learn the concept of t values, how to use them in regression work and come across

More information

Interpreting Regression Results

Interpreting Regression Results Interpreting Regression Results Carlo Favero Favero () Interpreting Regression Results 1 / 42 Interpreting Regression Results Interpreting regression results is not a simple exercise. We propose to split

More information

Bootstrapping the Grainger Causality Test With Integrated Data

Bootstrapping the Grainger Causality Test With Integrated Data Bootstrapping the Grainger Causality Test With Integrated Data Richard Ti n University of Reading July 26, 2006 Abstract A Monte-carlo experiment is conducted to investigate the small sample performance

More information

ECON3327: Financial Econometrics, Spring 2016

ECON3327: Financial Econometrics, Spring 2016 ECON3327: Financial Econometrics, Spring 2016 Wooldridge, Introductory Econometrics (5th ed, 2012) Chapter 11: OLS with time series data Stationary and weakly dependent time series The notion of a stationary

More information

Exogeneity and Causality

Exogeneity and Causality Università di Pavia Exogeneity and Causality Eduardo Rossi University of Pavia Factorization of the density DGP: D t (x t χ t 1, d t ; Ψ) x t represent all the variables in the economy. The econometric

More information

Glossary. Appendix G AAG-SAM APP G

Glossary. Appendix G AAG-SAM APP G Appendix G Glossary Glossary 159 G.1 This glossary summarizes definitions of the terms related to audit sampling used in this guide. It does not contain definitions of common audit terms. Related terms

More information

Testing methodology. It often the case that we try to determine the form of the model on the basis of data

Testing methodology. It often the case that we try to determine the form of the model on the basis of data Testing methodology It often the case that we try to determine the form of the model on the basis of data The simplest case: we try to determine the set of explanatory variables in the model Testing for

More information

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Longitudinal Data?

When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Longitudinal Data? When Should We Use Linear Fixed Effects Regression Models for Causal Inference with Longitudinal Data? Kosuke Imai Princeton University Asian Political Methodology Conference University of Sydney Joint

More information

ARDL Cointegration Tests for Beginner

ARDL Cointegration Tests for Beginner ARDL Cointegration Tests for Beginner Tuck Cheong TANG Department of Economics, Faculty of Economics & Administration University of Malaya Email: tangtuckcheong@um.edu.my DURATION: 3 HOURS On completing

More information

Obtaining Critical Values for Test of Markov Regime Switching

Obtaining Critical Values for Test of Markov Regime Switching University of California, Santa Barbara From the SelectedWorks of Douglas G. Steigerwald November 1, 01 Obtaining Critical Values for Test of Markov Regime Switching Douglas G Steigerwald, University of

More information

Chapter Three. Hypothesis Testing

Chapter Three. Hypothesis Testing 3.1 Introduction The final phase of analyzing data is to make a decision concerning a set of choices or options. Should I invest in stocks or bonds? Should a new product be marketed? Are my products being

More information

Variable Selection in Predictive Regressions

Variable Selection in Predictive Regressions Variable Selection in Predictive Regressions Alessandro Stringhi Advanced Financial Econometrics III Winter/Spring 2018 Overview This chapter considers linear models for explaining a scalar variable when

More information