Monte Carlo Simulations and the PcNaive Software

Econometrics 2, Heino Bohn Nielsen

Monte Carlo Simulations

MC simulations were introduced in Econometrics 1. They formalize the thought experiment underlying the data sampling: mimic the data generation and study the behaviour.

In this course we will frequently use MC simulations. They are a standard tool in econometrics.

Underlying the econometric results is a layer of difficult statistical theory:
(1) Many asymptotic results are technically demanding, and sometimes also difficult to understand firmly. Use MC simulations to obtain intuition.
(2) The finite sample properties are often analytically intractable. Use MC simulations to analyze finite sample properties.

Outline of the Lecture

(1) The basic idea in Monte Carlo simulations.
(2) Example 1: Sample mean (OLS) of IID normals.
(3) Example 2: Illustration of a Central Limit Theorem.
(4) Introduction to PcNaive.
(5) Example 3: Consistency and unbiasedness of OLS in a cross-sectional regression. General-to-specific or specific-to-general?

The Monte Carlo Idea

The basic idea of the Monte Carlo method: replace a difficult deterministic problem with a stochastic problem that has the same solution. If we can solve the stochastic problem by simulation, labour-intensive work can be replaced by cheap, capital-intensive simulations.

How can we be sure that the deterministic and the stochastic problem have the same solution? The general answer is the law of large numbers (LLN).

As an example, consider a stochastic variable $x \sim f(x)$. Calculating the mean is a (potentially difficult) deterministic problem:
\[ E[x] = \int x f(x)\,dx. \]
If we can draw realizations $x_1, x_2, \ldots, x_m, \ldots, x_M$ from $f(x)$, we can use
\[ M^{-1} \sum_{m=1}^{M} x_m \to E[x] \quad \text{for } M \to \infty. \]
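
The averaging step can be sketched in a few lines of Python. This is only an illustration under assumptions that are not in the slides: the density $f(x)$ is taken to be the exponential distribution with rate 1 (so that $E[x] = 1$ is known), and the names `mc_mean` and `draw_exponential` are hypothetical helpers.

```python
import random
import statistics


def mc_mean(draw, M, seed=0):
    """Estimate E[x] by averaging M independent draws from f(x)."""
    rng = random.Random(seed)
    return statistics.fmean(draw(rng) for _ in range(M))


def draw_exponential(rng):
    """Assumed example density: exponential with rate 1, so E[x] = 1."""
    return rng.expovariate(1.0)


# The average should settle near 1 as M grows (LLN).
for M in (10, 1_000, 100_000):
    print(M, mc_mean(draw_exponential, M))
```

Any other distribution with a sampler could be passed in place of `draw_exponential`; the averaging step does not change.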

Example

Consider a regression
\[ y_t = x_t'\beta + \epsilon_t, \quad t = 1, 2, \ldots, T, \qquad (*) \]
where $x_t$ and $\epsilon_t$ have some specified properties, and the OLS estimator
\[ \hat\beta = \left( \sum_{t=1}^{T} x_t x_t' \right)^{-1} \left( \sum_{t=1}^{T} x_t y_t \right). \]
We are often interested in $E[\hat\beta]$ to check for bias. This is difficult in most situations. But if we could draw realizations of $\hat\beta$, then we could estimate $E[\hat\beta]$.

MC simulation:
(1) Construct $M$ artificial data sets from the model $(*)$.
(2) Find the estimate, $\hat\beta_m$, for each data set, $m = 1, 2, \ldots, M$.
Then from the LLN:
\[ M^{-1} \sum_{m=1}^{M} \hat\beta_m \to E[\hat\beta] \quad \text{for } M \to \infty. \]
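
The two simulation steps might look as follows for the simplest case of a single regressor without intercept. The slide leaves the properties of $x_t$ and $\epsilon_t$ unspecified; the sketch assumes both are standard normal, and `ols_slope` and `simulate_beta_hats` are hypothetical names.

```python
import random
import statistics


def ols_slope(x, y):
    """OLS without intercept for a scalar regressor: sum(x*y) / sum(x*x)."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


def simulate_beta_hats(beta, T, M, seed=0):
    """Step (1): build M artificial data sets; step (2): estimate beta on each."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(M):
        x = [rng.gauss(0.0, 1.0) for _ in range(T)]        # assumed regressor distribution
        y = [xi * beta + rng.gauss(0.0, 1.0) for xi in x]  # assumed error distribution
        estimates.append(ols_slope(x, y))
    return estimates


beta_hats = simulate_beta_hats(beta=2.0, T=25, M=2000)
print("Monte Carlo estimate of E[beta_hat]:", statistics.fmean(beta_hats))
```

With these assumptions OLS is unbiased, so the printed average should lie close to the chosen `beta`.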

Note of Caution

The Monte Carlo method is a useful tool in econometrics. BUT:
(1) Simulations do not replace theory. Simulations can illustrate but not prove theorems.
(2) Simulation results are not general. Results are specific to the chosen setup; we have to specify the model completely.
(3) Simulations work like good examples.
In this course we hope to give you a Monte Carlo intuition.

Example 1: Mean of IID Normals

Consider a model where we know the finite sample properties:
\[ y_t = \mu + \epsilon_t, \quad \epsilon_t \sim N(0, \eta^2), \quad t = 1, 2, \ldots, T. \qquad (**) \]
The OLS estimator $\hat\mu$ of $\mu$ is the sample mean
\[ \hat\mu = T^{-1} \sum_{t=1}^{T} y_t. \]
Note that $\hat\mu$ is consistent, unbiased and (exactly) normally distributed:
\[ \hat\mu \sim N(\mu, T^{-1}\eta^2). \]
The standard deviation of the estimate, in PcNaive called the estimated standard error, can be calculated as
\[ \mathrm{ESE}(\hat\mu) = \sqrt{T^{-1}\hat\eta^2} = \sqrt{T^{-2} \sum_{t=1}^{T} (y_t - \hat\mu)^2}. \]
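
The stated mean and variance of the sample mean follow in two lines. This is a short worked derivation, using only that the errors are independent with mean zero and variance $\eta^2$:
\[ E[\hat\mu] = T^{-1} \sum_{t=1}^{T} E[y_t] = T^{-1} \cdot T\mu = \mu, \]
\[ V[\hat\mu] = T^{-2} \sum_{t=1}^{T} V[\epsilon_t] = T^{-2} \cdot T\eta^2 = T^{-1}\eta^2. \]
Exact normality holds because $\hat\mu$ is a linear combination of independent normal variables.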

Ex. 1 (cont.): Illustration by Simulation

We can illustrate the results if we can generate data from the model $y_t = \mu + \epsilon_t$. We need:
(1) A fully specified Data Generating Process (DGP), e.g.
\[ y_t = \mu + \epsilon_t, \quad \epsilon_t \sim N(0, \eta^2), \quad t = 1, 2, \ldots, T, \qquad (\#) \]
with $\mu = 5$ and $\eta^2 = 1$. This also requires an algorithm for drawing random numbers from the normal distribution and a specified sample length, e.g. $T = 50$ or $T \in \{10, 20, 30, \ldots, 100\}$.
(2) An estimation model for $y_t$ and an estimator. Consider OLS in
\[ y_t = \beta + u_t. \qquad (\#\#) \]
Note that the DGP $(\#)$ and the statistical model $(\#\#)$ need not coincide; but here they do.

Ex. 1 (cont.): Four Realizations

Suppose we draw $\epsilon_1, \ldots, \epsilon_{50}$ from $N(0, 1)$ and construct a data set, $y_1, \ldots, y_{50}$. We then apply OLS to the regression model
\[ y_t = \beta + u_t \]
to obtain the sample mean and the standard deviation in one realization:
\[ \hat\beta = 4.98013, \quad \mathrm{ESE}(\hat\beta) = 0.1477. \]
We can look at more realizations:

Realization, m    beta_hat_m    ESE(beta_hat_m)
1                 4.98013       0.1477
2                 5.04104       0.1320
3                 4.99815       0.1479
4                 4.82347       0.1504
Mean              4.96070       0.1445
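
A single realization can be reproduced in spirit with a short Python sketch. It assumes the DGP values $\mu = 5$, $\eta^2 = 1$, $T = 50$ from the previous slide and reads the ESE formula as $\sqrt{T^{-2}\sum_t (y_t - \hat\mu)^2}$. The seed is arbitrary, so the numbers will not match the table, and the function names are hypothetical.

```python
import math
import random


def sample_mean_and_ese(y):
    """Return the sample mean and ESE = sqrt(T^-2 * sum((y_t - mean)^2))."""
    T = len(y)
    mean = sum(y) / T
    ese = math.sqrt(sum((v - mean) ** 2 for v in y) / T**2)
    return mean, ese


def draw_data(rng, mu=5.0, eta=1.0, T=50):
    """One artificial data set from y_t = mu + eps_t, eps_t ~ N(0, eta^2)."""
    return [mu + rng.gauss(0.0, eta) for _ in range(T)]


rng = random.Random(1)  # arbitrary seed; output differs from the slide's numbers
for m in range(1, 5):
    beta_hat, ese = sample_mean_and_ese(draw_data(rng))
    print(m, round(beta_hat, 5), round(ese, 4))
```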

Four Realizations

[Figure: four panels showing the simulated series. First realization, mean = 4.98013; second realization, mean = 5.04104; third realization, mean = 4.99815; fourth realization, mean = 4.82347.]

Ex. 1 (cont.): Formalization

Now suppose we generate data from $(\#)$ $M$ times:
\[ y_1^m, \ldots, y_{50}^m, \quad m = 1, 2, \ldots, M. \]
For each $m$ we obtain a sample mean $\hat\beta_m$. We look at the mean estimate and the Monte Carlo standard deviation:
\[ \mathrm{MEAN}(\hat\beta) = M^{-1} \sum_{m=1}^{M} \hat\beta_m, \]
\[ \mathrm{MCSD}(\hat\beta) = \sqrt{M^{-1} \sum_{m=1}^{M} \left( \hat\beta_m - \mathrm{MEAN}(\hat\beta) \right)^2}. \]
For large $M$ we expect $\mathrm{MEAN}(\hat\beta)$ to be close to the true $\mu$ (LLN). The bias is defined as
\[ \mathrm{BIAS} = \mathrm{MEAN}(\hat\beta) - \mu. \]
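
These three summary measures translate directly into code. The sketch below assumes the DGP values $\mu = 5$, $\eta^2 = 1$, $T = 50$ used earlier in the example and uses the divisor $M$ in MCSD; `monte_carlo_summary` and `simulate_sample_means` are hypothetical names.

```python
import math
import random


def monte_carlo_summary(estimates, true_value):
    """MEAN, MCSD (divisor M) and BIAS of a list of M estimates."""
    M = len(estimates)
    mean = sum(estimates) / M
    mcsd = math.sqrt(sum((b - mean) ** 2 for b in estimates) / M)
    return {"MEAN": mean, "MCSD": mcsd, "BIAS": mean - true_value}


def simulate_sample_means(mu=5.0, eta=1.0, T=50, M=5000, seed=0):
    """M replications of the sample mean of T draws from N(mu, eta^2)."""
    rng = random.Random(seed)
    return [sum(mu + rng.gauss(0.0, eta) for _ in range(T)) / T for _ in range(M)]


print(monte_carlo_summary(simulate_sample_means(), true_value=5.0))
```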

Ex. 1 (cont.): Measures of Uncertainty

Note that $\mathrm{MEAN}(\hat\beta)$ is also an estimator (a stochastic variable). The standard deviation of $\mathrm{MEAN}(\hat\beta)$ is the Monte Carlo standard error
\[ \mathrm{MCSE}(\hat\beta) = M^{-1/2}\, \mathrm{MCSD}(\hat\beta). \]
Note the difference:
(1) $\mathrm{MCSD}(\hat\beta)$ measures the uncertainty of $\hat\beta$ (comparable to $\mathrm{ESE}(\hat\beta_m)$): the variation between replications.
(2) $\mathrm{MCSE}(\hat\beta)$ measures the uncertainty of $\mathrm{MEAN}(\hat\beta)$ in the simulation: the variation between experiments.
$\mathrm{MCSE}(\hat\beta) \to 0$ for $M \to \infty$.
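
That MCSE shrinks with the number of replications can be checked numerically. This is a minimal sketch, again assuming the DGP values $\mu = 5$, $\eta^2 = 1$, $T = 50$; the helper name `mcse_of_sample_mean` is hypothetical.

```python
import math
import random


def mcse_of_sample_mean(M, mu=5.0, eta=1.0, T=50, seed=0):
    """MCSE = M^(-1/2) * MCSD for the sample-mean estimator over M replications."""
    rng = random.Random(seed)
    estimates = [sum(mu + rng.gauss(0.0, eta) for _ in range(T)) / T for _ in range(M)]
    mean = sum(estimates) / M
    mcsd = math.sqrt(sum((b - mean) ** 2 for b in estimates) / M)
    return mcsd / math.sqrt(M)


# MCSD stays roughly constant in M, so MCSE falls at the rate 1/sqrt(M).
for M in (100, 1000, 10000):
    print(M, mcse_of_sample_mean(M))
```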

Ex. 1 (cont.): Results

Consider the results for $T = 50$, $M = 5000$:

m       beta_hat_m    ESE(beta_hat_m)
1       4.98013       0.1477
2       5.04104       0.1320
...
5000    4.92140       0.1254

MEAN(beta_hat) = 4.9985
MEAN(ESE)      = 0.14083
MCSD(beta_hat) = 0.14061
MCSE(beta_hat) = 0.0019886
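
The reported numbers can be cross-checked by hand. The arithmetic below assumes that the experiment uses the DGP values $\mu = 5$ and $\eta^2 = 1$ given earlier in the example:
\[ \mathrm{MCSE}(\hat\beta) = \frac{\mathrm{MCSD}(\hat\beta)}{\sqrt{M}} = \frac{0.14061}{\sqrt{5000}} \approx \frac{0.14061}{70.71} \approx 0.00199, \]
\[ \sqrt{V[\hat\mu]} = \sqrt{\eta^2 / T} = \sqrt{1/50} \approx 0.1414, \]
\[ \mathrm{BIAS} = 4.9985 - 5 = -0.0015. \]
Both MCSD and MEAN(ESE) are close to the theoretical standard deviation, and the estimated bias is smaller in absolute value than one MCSE.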

Ex. 1 (cont.): Results for Different T

[Figure: four panels. Density of the estimates for T=5, T=10 and T=50, and a panel titled "Estimates, different T" with a horizontal axis running from 0 to 250.]

Example 2: A Central Limit Theorem (CLT)

Recall the idea of a CLT (Lindeberg-Levy): Let $z_1, \ldots, z_T$ be IID with $E[z_t] = \mu$ and $V[z_t] = \sigma^2$. Then
\[ \frac{1}{\sqrt{T}} \sum_{t=1}^{T} \frac{z_t - \mu}{\sigma} \overset{d}{\to} N(0, 1) \quad \text{for } T \to \infty. \]
This can be extended to
(1) heterogeneous processes,
(2) (limited) time dependence.
We will illustrate this for two examples:
(1) Uniform distribution.
(2) Exponential distribution.

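Ex. 2 (cont.): Uniform Distribution

Consider as an example
\[ z_t \sim \mathrm{Uniform}(0, 1), \quad t = 1, 2, \ldots, T. \]
It holds that
\[ E[z_t] = \frac{1}{2}, \qquad V[z_t] = \frac{(1 - 0)^2}{12} = \frac{1}{12}. \]
We look at the estimated distribution of
\[ \frac{1}{\sqrt{T}} \sum_{t=1}^{T} \frac{z_t - \mu}{\sigma} = \frac{1}{\sqrt{T}} \sum_{t=1}^{T} \sqrt{12} \left( z_t - \frac{1}{2} \right), \]
based on $M = 20000$ replications.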
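
The experiment can be imitated with a short Python sketch. Instead of plotting densities it prints, for each $T$, the mean, the standard deviation and the share of draws within $\pm 1.96$, which would be about 0.95 under $N(0,1)$. The function names and the choice of summary statistics are illustrative assumptions.

```python
import math
import random
import statistics


def standardized_sum_uniform(rng, T):
    """T^(-1/2) * sum over t of sqrt(12) * (z_t - 1/2), with z_t ~ Uniform(0, 1)."""
    return sum(math.sqrt(12.0) * (rng.random() - 0.5) for _ in range(T)) / math.sqrt(T)


def simulate_uniform_clt(T, M=20000, seed=0):
    """M replications of the standardized sum for sample length T."""
    rng = random.Random(seed)
    return [standardized_sum_uniform(rng, T) for _ in range(M)]


for T in (1, 2, 5, 10):
    draws = simulate_uniform_clt(T)
    share = sum(abs(d) <= 1.96 for d in draws) / len(draws)
    print(T, round(statistics.fmean(draws), 3), round(statistics.pstdev(draws), 3), round(share, 3))
```

Mean and standard deviation are near 0 and 1 for every $T$ by construction; only the shape of the distribution changes with $T$.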

Ex. 2 (cont.): Uniform Distribution

[Figure: estimated densities of the standardized sum in four panels, T=1, T=2, T=5 and T=10.]

Ex. 2 (cont.): Exponential Distribution

Consider as a second example
\[ z_t \sim \mathrm{Exp}(1), \quad t = 1, 2, \ldots, T. \]
It holds that
\[ E[z_t] = 1, \qquad V[z_t] = 1^2 = 1. \]
We look at the estimated distribution of
\[ \frac{1}{\sqrt{T}} \sum_{t=1}^{T} \frac{z_t - \mu}{\sigma} = \frac{1}{\sqrt{T}} \sum_{t=1}^{T} (z_t - 1), \]
based on $M = 20000$ replications.
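
Because the exponential distribution is skewed, one simple way to follow the convergence is to track the sample skewness of the standardized sum, which is zero for a normal distribution. The sketch below does this; using skewness as the yardstick is an illustrative choice, not something taken from the slides, and the function names are hypothetical.

```python
import math
import random


def simulate_exponential_clt(T, M=20000, seed=0):
    """M draws of T^(-1/2) * sum over t of (z_t - 1), with z_t ~ Exp(1)."""
    rng = random.Random(seed)
    return [sum(rng.expovariate(1.0) - 1.0 for _ in range(T)) / math.sqrt(T) for _ in range(M)]


def skewness(values):
    """Sample skewness: third central moment divided by the cubed standard deviation."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


for T in (1, 2, 5, 50):
    print(T, round(skewness(simulate_exponential_clt(T)), 3))
```

For IID Exp(1) draws the population skewness of the standardized sum is $2/\sqrt{T}$, so the printed values should fall towards zero as $T$ grows.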

Ex. 2 (cont.): Exponential Distribution

[Figure: estimated densities of the standardized sum in four panels, T=1, T=2, T=5 and T=50.]

PcNaive

PcNaive is a menu-driven module in GiveWin. Technically, PcNaive generates Ox code, which is then executed by Ox. Output is returned in GiveWin.

Outline:
(1) Set up the DGP: AR(1), Static, or PcNaive General.
(2) Specify the estimation model.
(3) Choose estimators and test statistics to analyze.
(4) Set specifications: M, T, etc.
(5) Select output to generate.
(6) Save and run.

Example 3: PcNaive Static DGP

DGP:
\[ y_t = \alpha_1 x_{1t} + \alpha_2 x_{2t} + \epsilon_t, \quad \epsilon_t \sim N(0, 1), \]
\[ \begin{pmatrix} x_{1t} \\ x_{2t} \end{pmatrix} \sim N\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 & c \\ c & 1 \end{pmatrix} \right), \]
for $t = 1, 2, \ldots, T$, where $c$ is the correlation.

Estimation model: Apply OLS to the linear regression model
\[ y_t = \beta_0 + \beta_1 x_{1t} + \beta_2 x_{2t} + u_t. \]

Example:
(1) Unbiasedness and consistency of OLS in this setting.
(2) Effect of including a redundant regressor.
(3) Effect of excluding a relevant regressor.
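
Outside PcNaive, the same experiment can be sketched in plain Python. This is not the Ox code PcNaive generates. The parameter values ($\alpha_1 = 1$, $\alpha_2 \in \{0, 1\}$, $c = 0.5$, $T = 50$, $M = 2000$) and all function names are assumptions made for illustration. The correlated regressors are built as $x_{2t} = c\,x_{1t} + \sqrt{1 - c^2}\,z_t$ with $z_t$ an independent standard normal.

```python
import math
import random


def draw_static_dgp(rng, T, a1, a2, c):
    """One data set: (x1, x2) bivariate normal with unit variances and correlation c."""
    x1, x2, y = [], [], []
    for _ in range(T):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        u, v = z1, c * z1 + math.sqrt(1.0 - c * c) * z2
        x1.append(u)
        x2.append(v)
        y.append(a1 * u + a2 * v + rng.gauss(0.0, 1.0))
    return x1, x2, y


def demean(values):
    """Subtract the sample mean; this accounts for the intercept in OLS."""
    m = sum(values) / len(values)
    return [v - m for v in values]


def ols_two_regressors(x1, x2, y):
    """Slopes (b1, b2) from OLS of y on a constant, x1 and x2 (2x2 normal equations)."""
    d1, d2, dy = demean(x1), demean(x2), demean(y)
    s11 = sum(a * a for a in d1)
    s22 = sum(b * b for b in d2)
    s12 = sum(a * b for a, b in zip(d1, d2))
    s1y = sum(a * w for a, w in zip(d1, dy))
    s2y = sum(b * w for b, w in zip(d2, dy))
    det = s11 * s22 - s12 * s12
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


def ols_one_regressor(x1, y):
    """Slope b1 from OLS of y on a constant and x1 only."""
    d1, dy = demean(x1), demean(y)
    return sum(a * w for a, w in zip(d1, dy)) / sum(a * a for a in d1)


def mean_b1(a1, a2, c, include_x2, T=50, M=2000, seed=0):
    """Monte Carlo mean of the x1 coefficient over M replications."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(M):
        x1, x2, y = draw_static_dgp(rng, T, a1, a2, c)
        total += ols_two_regressors(x1, x2, y)[0] if include_x2 else ols_one_regressor(x1, y)
    return total / M


print("correct model:      ", mean_b1(1.0, 1.0, 0.5, include_x2=True))
print("redundant x2 (a2=0):", mean_b1(1.0, 0.0, 0.5, include_x2=True))
print("x2 wrongly excluded:", mean_b1(1.0, 1.0, 0.5, include_x2=False))
```

In this setup the first two averages should lie near $\alpha_1 = 1$, while excluding the relevant regressor shifts the average towards $\alpha_1 + \alpha_2 c = 1.5$.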