Link to Paper. The latest iteration can be found at:



2 BKW diagnostics in GRETL: Interpretation and Performance
Oklahoma State University
5th Gretl Conference, 2017

3 Table of Contents
1 Introduction
2 Example, Simulation, Analysis
3 Example, Simulation, Analysis
4

4 Consequences of Weak Identification
- Parameters are estimated imprecisely.
- The estimated covariance matrix is singular (or nearly so) or not positive definite.
- Numerical algorithms need excessive numbers of iterations or don't converge.
- The distribution of the estimator is distorted (e.g., the IV estimator with weak instruments).
Use the BKW diagnostics to diagnose identification problems in nonlinear models.

5 BKW Table

Condition    Variance Proportions of OLS
Index        var(b_1)   var(b_2)   ...   var(b_k)
η_1          φ_11       φ_12       ...   φ_1k
η_2          φ_21       φ_22       ...   φ_2k
...          ...        ...        ...   ...
η_k          φ_k1       φ_k2       ...   φ_kk

Table: Matrix of Variance Proportions

In linear models, the table summarizes much of what we can learn about collinearity in data.

6 Condition Index
The BKW diagnostics use the orthonormal eigenvectors, $V$, and the eigenvalues, $\lambda_1, \dots, \lambda_k$, from the SVD $\mathrm{Cov}(\hat\beta)^{-1} = U \Lambda V^T$, where the columns of $\mathrm{Cov}(\hat\beta)$ have been scaled to unit length. The condition index is
$$\eta_l = \left( \frac{\lambda_1}{\lambda_l} \right)^{1/2},$$
where $\eta_1 = 1 < \eta_2 < \dots < \eta_k$.

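7 Variance Decomposition
Define $\phi_{jk} = v_{jk}^2 / \lambda_k$, and let $\phi_j$ be the variance of $b_j$ apart from the error variance, $\sigma^2$:
$$\mathrm{Var}(b_j) = \sigma^2 \phi_j = \sigma^2 \left( \frac{v_{j1}^2}{\lambda_1} + \frac{v_{j2}^2}{\lambda_2} + \dots + \frac{v_{jk}^2}{\lambda_k} \right).$$
The proportion of the variance of $b_j$ associated with the $k$th eigenvalue, $\lambda_k$, is $\phi_{jk} / \phi_j$.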
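The condition indexes and variance proportions can be sketched numerically. The following is a minimal NumPy illustration, not GRETL's implementation: the function name `bkw_table` is hypothetical, and the scaling step is an assumption (the inverse covariance is rescaled to have a unit diagonal, which mimics unit-length columns in a linear model). Rows of the returned table correspond to condition indexes and columns to parameters.

```python
import numpy as np

def bkw_table(cov):
    """Condition indexes and variance proportions from a covariance matrix."""
    cov = np.asarray(cov, dtype=float)
    q = np.linalg.inv(cov)                 # inverse covariance (information-like matrix)
    s = 1.0 / np.sqrt(np.diag(q))
    q = q * np.outer(s, s)                 # assumed scaling: unit diagonal
    lam, v = np.linalg.eigh(q)             # eigenvalues in ascending order
    order = np.argsort(lam)[::-1]          # reorder so lambda_1 is the largest
    lam, v = lam[order], v[:, order]
    cond = np.sqrt(lam[0] / lam)           # eta_l = (lambda_1 / lambda_l)^(1/2)
    phi = v**2 / lam                       # phi[j, k] = v_jk^2 / lambda_k
    props = phi / phi.sum(axis=1, keepdims=True)   # phi_jk / phi_j
    return cond, props.T                   # row l: eta_l, column j: parameter j

# Illustrative data: the third column is almost a copy of the second.
x1 = np.linspace(0.0, 1.0, 50)
x2 = x1 + 0.01 * np.sin(np.arange(50))
X = np.column_stack([np.ones(50), x1, x2])
cond, table = bkw_table(np.linalg.inv(X.T @ X))
print(np.round(cond, 1))
print(np.round(table, 3))
```

Each column of the table sums to one, so a column whose mass sits in the row of the largest condition index points to a parameter whose variance is dominated by the near dependency.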

8 Recipe
Check the condition numbers. Anything larger than 30 indicates a severe problem.
1. One condition number is large: read the variance decompositions across its row of the table. Multiple $\phi_{jk}$ > 50% indicate weakness.
2. Two or more large condition numbers: sum the variance decompositions, $\sum_j \phi_{jk}$, across the rows $j$ with large condition numbers and repeat.
Check to see if any parameters are well identified. For a single large condition number, a variance proportion < 50% is good. For multiple large $\eta$, apply the same check to the summed $\phi_{jk}$.
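The recipe above can be sketched as a small helper. This is only an illustration under stated assumptions: the function name `weak_parameters`, the parameter names, and the two example tables are hypothetical; only the cutoffs (30 for the condition number, 50% for the variance proportion) come from the slide.

```python
import numpy as np

def weak_parameters(cond, table, names, cond_cut=30.0, share_cut=0.5):
    """Flag parameters whose summed variance proportion over the rows
    with large condition numbers exceeds share_cut."""
    cond = np.asarray(cond, dtype=float)
    table = np.asarray(table, dtype=float)
    large = cond > cond_cut                # rows with large condition numbers
    if not large.any():
        return []                          # nothing to diagnose
    summed = table[large].sum(axis=0)      # sum proportions across the large rows
    return [n for n, s in zip(names, summed) if s > share_cut]

names = ["b1", "b2", "b3"]

# Hypothetical table with one large condition number (45).
one_large = weak_parameters(
    [1.0, 5.0, 45.0],
    [[0.90, 0.05, 0.02],
     [0.08, 0.05, 0.03],
     [0.02, 0.90, 0.95]],
    names,
)

# Hypothetical table with two large condition numbers (40 and 60).
two_large = weak_parameters(
    [1.0, 40.0, 60.0],
    [[0.90, 0.10, 0.05],
     [0.05, 0.40, 0.45],
     [0.05, 0.50, 0.50]],
    names,
)
print(one_large, two_large)
```

In the second table no single row puts more than 50% on any parameter, yet the summed proportions for `b2` and `b3` do, which is why the recipe sums before comparing.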

9 The BKW diagnostics can be used with any estimator whose covariance can be estimated. The signal-to-noise (s/n) diagnostics can be used for estimators that are normally, or approximately normally, distributed.
1 2 Endogenous

10 Following (Greene, 2012, p ) the ordered probit model is built around a latent variable, $y^*$, that depends on a linear index, $x^T \beta$:
$$y^* = x^T \beta + e.$$
The latent variable $y^*$ is unobserved. Instead we observe integers, $y$, such that
$$y = \begin{cases} 0 & \text{if } y^* \le 0 \\ 1 & \text{if } 0 < y^* \le \tau_1 \\ 2 & \text{if } \tau_1 < y^* \le \tau_2 \\ \vdots & \\ J & \text{if } \tau_{J-1} < y^* \end{cases}$$

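11 The $\beta$ and the $\tau$s are unknown parameters. The error $e$ is assumed to be normally distributed across observations. Since the overall variance is not identified by the model, it is conventional to let $\sigma = 1$ in order to identify $\beta$. Then,
$$\begin{aligned}
\mathrm{Prob}(y = 0 \mid x) &= \Phi(-x^T \beta) \\
\mathrm{Prob}(y = 1 \mid x) &= \Phi(\tau_1 - x^T \beta) - \Phi(-x^T \beta) \\
\mathrm{Prob}(y = 2 \mid x) &= \Phi(\tau_2 - x^T \beta) - \Phi(\tau_1 - x^T \beta) \\
&\;\;\vdots \\
\mathrm{Prob}(y = J \mid x) &= 1 - \Phi(\tau_{J-1} - x^T \beta)
\end{aligned}$$
with $0 < \tau_1 < \tau_2 < \dots < \tau_{J-1}$, where $\Phi(t)$ is the standard normal cdf evaluated at $t$.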
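The category probabilities above can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the function names and the example values of the index and cutoffs are hypothetical.

```python
from math import erf, sqrt

def norm_cdf(t):
    """Standard normal cdf, Phi(t)."""
    return 0.5 * (1.0 + erf(t / sqrt(2.0)))

def ordered_probit_probs(xb, taus):
    """Return [Prob(y=0|x), ..., Prob(y=J|x)] for index xb = x'beta and
    cutoffs 0 < tau_1 < ... < tau_{J-1}; the first threshold is fixed at 0."""
    cuts = [0.0] + list(taus)
    cdf = [norm_cdf(c - xb) for c in cuts]
    probs = [cdf[0]]                                            # Prob(y = 0)
    probs += [cdf[j] - cdf[j - 1] for j in range(1, len(cdf))]  # interior cells
    probs.append(1.0 - cdf[-1])                                 # Prob(y = J)
    return probs

# Hypothetical example: index of 0 and cutoffs tau = (1, 2), so J = 3.
probs = ordered_probit_probs(0.0, [1.0, 2.0])
print([round(p, 4) for p in probs])
```

When two cutoffs are close together the corresponding cell probability is small, so few observations fall in that category.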

12 Example
Model 2. Dependent variable: kidsl6. Standard errors based on Hessian.
Columns: Coefficient, Std. Error, z, p-value. Rows: educ, exper, age, cut1, cut2, cut3. (Numerical entries not preserved.)

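13 BKW table from GRETL
The vif command in GRETL produces the BKW table. Left-hand columns of the variance proportions table: lambda, cond, educ, exper, age. (Numerical entries not preserved.)

14 BKW table from GRETL, cont.
Right-hand columns of the variance proportions table from the vif command: cut1, cut2, cut3, shown against the same lambda and cond columns. (Numerical entries not preserved.)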

15 Design
There are three regressors in the model,
$$X = \{\text{const}, \text{age}, \text{educ}, \text{exper}\}. \quad (1)$$
$X$ is decomposed using the SVD, $X = U D V^T$, where $D$ is a diagonal matrix containing the eigenvalues (singular values) of the original data. These are replaced by the desired ones, $\Lambda$, as in $X = U \Lambda V^T$.

16 Collinear Regressors
Five sets of eigenvalues for the SVD are considered. (2)

Eigenvalues                    Collinearity
Λ = {1, 1, 1, 1}               None
Λ = {10, 7, 4, 1}              Moderate
Λ = {10, 1, 1, 0.1}            Moderate
Λ = {10, 10, 0.1, 0.1}         Moderately Severe
Λ = {10, 0.1, 0.05, 0.05}      Severe
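The design step can be sketched with NumPy: decompose a data matrix, swap in the desired diagonal, and rebuild. The function name `impose_singular_values` and the randomly generated stand-in data are assumptions for illustration; the original study uses real regressors.

```python
import numpy as np

def impose_singular_values(x, target):
    """Rebuild x = U D V' with the diagonal D replaced by the target values."""
    u, _, vt = np.linalg.svd(x, full_matrices=False)
    return u @ np.diag(target) @ vt

# Stand-in data (hypothetical): a constant plus three random regressors.
rng = np.random.default_rng(0)
x = np.column_stack([np.ones(200), rng.normal(size=(200, 3))])

target = np.array([10.0, 1.0, 1.0, 0.1])   # one of the designs listed above
x_new = impose_singular_values(x, target)

recovered = np.linalg.svd(x_new, compute_uv=False)
print(np.round(recovered, 6))
print(round(float(np.linalg.cond(x_new)), 3))
```

The ratio of the largest to the smallest imposed value is the condition number of the rebuilt matrix, 10 / 0.1 = 100 for this design.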

17 Setting cutoff points, $\tau_j$
The cutoff parameters are set by taking the mean of $X\beta$ and adding $c$ such that $P(X < c) = p$, with the parameter $p$ chosen from (3):

Cumulative Prob          Bin Size
p = {0.1, 0.5, 0.9}      Wide
p = {0.1, 0.3, 0.5}      Moderate
p = {0.1, 0.2, 0.3}      Narrow

The regression parameters are set to $\beta$ = { 0.1, 0.05, 0.02}, which mimic the values of the parameters when the original data are used.
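A minimal sketch of the cutoff rule, using only the Python standard library. It assumes, since the slide does not say, that $c$ is a standard normal quantile; the function name `cutoffs` and the example index values are hypothetical.

```python
from statistics import NormalDist, fmean

def cutoffs(index_values, probs):
    """Mean of the index plus a quantile c for each cumulative probability p.
    Assumption: c is the standard normal quantile, P(Z < c) = p."""
    center = fmean(index_values)
    return [center + NormalDist().inv_cdf(p) for p in probs]

xb = [0.2, 0.4, 0.6]                      # hypothetical values of X * beta
wide = cutoffs(xb, [0.1, 0.5, 0.9])
narrow = cutoffs(xb, [0.1, 0.2, 0.3])
print([round(t, 3) for t in wide])
print([round(t, 3) for t in narrow])
```

Under this assumption the narrow design packs all cutoffs into the lower tail, so adjacent cutoffs sit closer together than in the wide design.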

18 Table: Average condition numbers for the various collinearity designs. Wide bins, p = {0.1, 0.5, 0.9}.
Columns (collinearity among variables): {1, 1, 1, 1}, {10, 1, 1, 0.1}, {10, 10, 0.1, 0.1}, {10, 0.1, 0.05, 0.05}. Rows: condition numbers η_1 through η_6. (Numerical entries not preserved.)

19 Monte Carlo Summary Statistics
Table: Orthogonal Regressors: Λ = {1, 1, 1, 1}, with Wide Bins: p = {0.1, 0.5, 0.9}. Results based on 100 Monte Carlo samples.
Panels: True, Mean, Std Err and t-ratio for β_1, β_2, β_3, τ_1, τ_2, τ_3; averages of the variance proportion table (η_l by parameter); standard errors of the variance proportion table. (Numerical entries not preserved.)

20 Monte Carlo Summary Statistics
Table: Moderate Collinearity: Λ = {10, 1, 1, 0.1}, with Wide Bins: p = {0.1, 0.5, 0.9}. Results based on 100 Monte Carlo samples.
Panels: True, Mean, Std Err and t-ratio for β_1, β_2, β_3, τ_1, τ_2, τ_3; averages of the variance proportion table (η_l by parameter); standard errors of the variance proportion table. (Numerical entries not preserved.)

21 Monte Carlo Summary Statistics
Table: Severe Collinearity: Λ = {10, 10, 0.1, 0.1}, with Wide Bins: p = {0.1, 0.5, 0.9}. Results based on 100 Monte Carlo samples.
Panels: True, Mean, Std Err and t-ratio for β_1, β_2, β_3, τ_1, τ_2, τ_3; averages of the variance proportion table (η_l by parameter); standard errors of the variance proportion table. (Numerical entries not preserved.)

22 Monte Carlo Summary Statistics
Table: Severe Collinearity: Λ = {10, 0.1, 0.05, 0.05}, with Wide Bins: p = {0.1, 0.5, 0.9}. Results based on 100 Monte Carlo samples.
Panels: True, Mean, Std Err and t-ratio for β_1, β_2, β_3, τ_1, τ_2, τ_3; averages of the variance proportion table (η_l by parameter); standard errors of the variance proportion table. (Numerical entries not preserved.)

23 General Thoughts on Estimation
- Bias is low.
- Orthogonality of the regressors does not imply the absence of weak identification.
- Variance proportions are measured fairly precisely.

24 Table: Various bin sizes for mildly collinear data, Λ = {10, 7, 4, 1}. There is only one relatively large condition number in this scenario.
Panels: Wide, Narrow, Really narrow. Each panel reports Std Err for β_1, β_2, β_3, τ_1, τ_2, τ_3, followed by η_l and the variance proportions φ_l1, ..., φ_l6. (Numerical entries not preserved.)

25
1. Identification depends on collinear variables as well as functional form and parameters.
2. The BKW diagnostics are useful in diagnosing problems, both as collinearity worsens and as bin size shrinks.
3. The proportion of the variability in the BKW diagnostics that is due to parameter estimation is minimal.

26 Selection Equation
$$z_i^* = w_i^T \gamma + u_i, \quad i = 1, \dots, N \quad (4)$$
The continuous latent variable $z_i^*$ is not directly observed. Instead we only know
$$z_i = \begin{cases} 1 & \text{if } z_i^* > 0 \\ 0 & \text{otherwise} \end{cases}$$

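27 Regression Equation
$$y_i = x_i^T \beta + e_i, \quad i = 1, \dots, n, \quad N > n \quad (5)$$
The random disturbances of the two equations are normally distributed and correlated,
$$\begin{bmatrix} u_i \\ e_i \end{bmatrix} \sim N\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 & \rho\sigma_e \\ \rho\sigma_e & \sigma_e^2 \end{bmatrix} \right) \quad (6)$$
Notice that the selection equation has more observations than the regression equation.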

28 Correlation between $u_i$ and $e_i$ alters the conditional mean of the linear regression (5):
$$E[y_i \mid z_i^* > 0] = x_i^T \beta + \rho \sigma_e \lambda_i,$$
where $\lambda_i = \phi(w_i^T \gamma) / \Phi(w_i^T \gamma)$ is the inverse Mills ratio, and $\phi(\cdot)$ and $\Phi(\cdot)$ are the standard normal pdf and cdf, respectively. Also, it is conventional to define $\beta_\lambda = \rho \sigma_e$.
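The conditional mean above is easy to evaluate directly. The following is a small sketch using only the Python standard library; the function names and the example values are hypothetical.

```python
from math import erf, exp, pi, sqrt

def norm_pdf(t):
    """Standard normal pdf."""
    return exp(-0.5 * t * t) / sqrt(2.0 * pi)

def norm_cdf(t):
    """Standard normal cdf."""
    return 0.5 * (1.0 + erf(t / sqrt(2.0)))

def inverse_mills(wg):
    """lambda_i = pdf(w'gamma) / cdf(w'gamma)."""
    return norm_pdf(wg) / norm_cdf(wg)

def selected_mean(xb, wg, rho, sigma_e):
    """E[y | selected] = x'beta + rho * sigma_e * lambda."""
    return xb + rho * sigma_e * inverse_mills(wg)

# Hypothetical values: x'beta = 1, w'gamma = 0.
no_selection = selected_mean(1.0, 0.0, rho=0.0, sigma_e=2.0)
with_selection = selected_mean(1.0, 0.0, rho=0.5, sigma_e=2.0)
print(no_selection, round(with_selection, 4))
```

With $\rho = 0$ the correction term vanishes and the selected-sample mean equals $x^T \beta$; the size of the shift otherwise is governed by $\beta_\lambda = \rho \sigma_e$.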

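29 The log-likelihood (Greene, 2012, p. 878) is
$$\ln L = \sum_{z_i = 1} \ln \left[ \frac{\exp\left( -(1/2) e_i^2 / \sigma_e^2 \right)}{\sigma_e \sqrt{2\pi}} \, \Phi\left( \frac{\rho e_i / \sigma_e + w_i^T \gamma}{\sqrt{1 - \rho^2}} \right) \right] + \sum_{z_i = 0} \ln \left[ 1 - \Phi(w_i^T \gamma) \right],$$
where $e_i = y_i - x_i^T \beta$ and $i = 1, 2, \dots, n$.
Note: Fewer observations are available for the regression part of this likelihood. That affects identification in an important way.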
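A direct transcription of this log-likelihood into Python may help when checking each term. This is a sketch, not GRETL's estimator: the function name `selection_loglik` and the toy inputs are hypothetical, and the indexes $x_i^T \beta$ and $w_i^T \gamma$ are assumed to be precomputed.

```python
from math import erf, log, pi, sqrt

def norm_cdf(t):
    """Standard normal cdf."""
    return 0.5 * (1.0 + erf(t / sqrt(2.0)))

def selection_loglik(y, xb, wg, z, rho, sigma_e):
    """Sum the selected (z = 1) and censored (z = 0) contributions.
    y[i] is ignored when z[i] == 0."""
    total = 0.0
    for yi, xbi, wgi, zi in zip(y, xb, wg, z):
        if zi == 1:
            e = yi - xbi                                   # e_i = y_i - x_i'beta
            log_density = -0.5 * (e / sigma_e) ** 2 - log(sigma_e * sqrt(2.0 * pi))
            arg = (rho * e / sigma_e + wgi) / sqrt(1.0 - rho ** 2)
            total += log_density + log(norm_cdf(arg))
        else:
            total += log(1.0 - norm_cdf(wgi))              # censored observation
    return total

# Toy check: one selected and one censored observation, all indexes zero.
value = selection_loglik([0.0, None], [0.0, 0.0], [0.0, 0.0], [1, 0],
                         rho=0.0, sigma_e=1.0)
print(round(value, 6))
```

Only the selected observations contribute information about $\beta$, $\sigma_e$ and $\rho$, while every observation contributes to $\gamma$.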

30 Dependent variable: l_wage, Selection variable: lfp
Columns: Coefficient, Std. Error, z, p-value. Regression equation rows: const, educ, exper, lambda. Selection equation rows: const, educ, exper, kidsl6. (Numerical entries not preserved.)
Total observations: 753
Censored observations: 325 (43.2%)

31 Regression part of the BKW table
Variance proportions table with columns: cond, const, educ, exper, lambda. (Numerical entries not preserved.)

32 BKW table, cont. Selection
Variance proportions table with columns: cond, const, educ, exper, kidsl6. (Numerical entries not preserved.)

33 Collinear variables
$X = \{\text{const}, \text{educ}, \text{exper}, \text{kidsl6}\}$ is decomposed using the SVD, $X = U D V^T$, and the diagonal matrix containing the eigenvalues, $D$, is replaced using the desired ones, $X = U \Lambda V^T$.
Since $n < N$, the sample collinearity in the regression and selection equations will be different, and the regressors are not necessarily orthogonal even in the orthogonal design.

34 Four sets of eigenvalues are considered as replacements for $D$ in the SVD.

Eigenvalues                    Collinearity
Λ = {1, 1, 1, 1}               None
Λ = {10, 1, 1, 0.1}            Moderate
Λ = {10, 10, 0.1, 0.1}         High
Λ = {10, 0.1, 0.1, 0.05}       Severe

35 Censoring and Variance
The degree of selectivity is chosen from $\rho = \{0, 0.5, 0.99\}$ and the error standard deviation from $\sigma_e = \{1, 2\}$.
Doubling $\sigma_e$ doubles $\beta_\lambda$ as well as its standard error. It also doubles the measured standard errors of the regression equation, while having no effect on the selection equation.

36 Choosing regressors: same or different subsets
Two sets of regressors are chosen based on the collinearity-induced $X$ obtained above. These are
Same: $x_i = (1, \text{educ}_i, \text{exper}_i)$ and $w_i = (1, \text{educ}_i, \text{exper}_i)$
Different: $x_i = (1, \text{educ}_i, \text{exper}_i)$ and $w_i = (1, \text{educ}_i, \text{kidsl6}_i)$

37 Notes and issues
- 400 simulated samples per design.
- $\beta$ estimated using the original data and rounded.
- $\gamma$ also estimated using the original data and rounded.
- $\beta_\lambda = \rho \sigma_e$.
- $n < N$, which affects collinearity within each sample.

38 Table: five panels, each headed by a value of ρ and reporting True, Mean and Std Err for β_1, β_2, β_3, β_λ, γ_1, γ_2, γ_3, followed by η_l and the variance proportions φ_l1, ..., φ_l7. (Numerical entries not preserved.)

39 Variable choice. Orthogonal variables.
Three panels, each with columns λ and cond, regression columns cconst, ceduc, cexper, β_λ, and selection columns cconst, ceduc, cexper, ckidsl6. (Numerical entries not preserved.)

40 Variable choice. Orthogonal variables, cont.
Two panels with the same regression and selection columns. (Numerical entries not preserved.)
Note: cexper in the selection equation helps to identify cexper in the regression.

41 Monte Carlo Summary Statistics
Table: Orthogonal Regressors: Λ = {1, 1, 1, 1}, with moderate selectivity: ρ = 0.5, σ_e = 1 and x_i2 ≠ w_i2. Failure proportion = 0.
Panels: True, Mean, Std Err and t-ratio for β_1, β_2, β_3, β_λ, γ_1, γ_2, γ_3; averages of the variance proportion table (η_l by parameter); standard errors of the variance proportion table. (Numerical entries not preserved.)

42 Table: Severe Collinearity: Λ = {10, 0.1, 0.1, 0.05}, with moderate selectivity: ρ = 0.5, σ_e = 1 and x_i2 ≠ w_i2. Failure proportion = 0.02.
Panels: True, Mean, Std Err and t-ratio for β_1, β_2, β_3, β_λ, γ_1, γ_2, γ_3; averages of the variance proportion table (η_l by parameter); standard errors of the variance proportion table. (Numerical entries not preserved.)

43 Monte Carlo Summary Statistics
Table: Orthogonal Regressors: Λ = {1, 1, 1, 1}, with high selectivity: ρ = 0.99, σ_e = 1 and x_i2 ≠ w_i2. Failure proportion = 0.005.
Panels: True, Mean, Std Err and t-ratio for β_1, β_2, β_3, β_λ, γ_1, γ_2, γ_3; averages of the variance proportion table (η_l by parameter); standard errors of the variance proportion table. (Numerical entries not preserved.)

44 Monte Carlo Summary Statistics
Table: Severe Collinearity: Λ = {10, 0.1, 0.1, 0.05}, with high selectivity: ρ = 0.99, σ_e = 1 and x_i2 ≠ w_i2. Failure proportion = 0.008.
Panels: True, Mean, Std Err and t-ratio for β_1, β_2, β_3, β_λ, γ_1, γ_2, γ_3; averages of the variance proportion table (η_l by parameter); standard errors of the variance proportion table. (Numerical entries not preserved.)

45
- The MLE is badly biased for $\beta_\lambda$ when $w = x$. The bias is smaller when $w \ne x$.
- As sample endogeneity gets worse ($\rho \to 1$), the identification of $\beta_\lambda$ improves. Overall identification improves (a little).
- The BKW diagnostics are helpful in detecting issues with collinear variables and with the identification of the selectivity parameter.

46 The BKW diagnostics appear to work quite nicely and as expected in the nonlinear settings examined here. Not only can collinearity be detected, but issues with parameters and functional form can also be revealed. Thanks Allin for adding them to GRETL. :-)

47 Link to Paper
For more information, results, and code see:

48 Greene, William H. (2012), Econometric Analysis, 7th edn, Prentice-Hall, Upper Saddle River, NJ.
