ECONOMETRICS FIELD EXAM
Michigan State University
May 11, 2007

Instructions: Answer all four (4) questions. Point totals for each question are given in parentheses; there are 100 points possible. Within a question, each part receives equal weight. You may use a calculator, but only for computations, not for storage or retrieval of information. You must show all your working to get credit for your solutions. Be sure to show your work or provide sufficient justification for your answers. You may use your notes and books.

1. (25 points) Assume that data $(y_t, x_t)'$, $t = 1, \ldots, T$, are stationary, ergodic and generated by
[equation missing in source]
where the conditional distribution of $u_t$ given $x_t$ is $u_t \mid x_t \sim N(0, \sigma_t^2)$, with $x_t \sim N(0, v)$ where $v$ is a parameter, and $E[u_t u_s \mid x_t, x_s] = 0$ for $t \neq s$. Explain how to find estimates and their standard errors (construct robust standard errors when possible) for all parameters when

a. The entire $\sigma_t^2$ as a function of $x_t$ is fully known. Here we want to find estimates and standard errors for $v$, $\alpha$ and $\beta$.

b. The values of $\sigma_t^2$ at $t = 1, \ldots, T$ are known. Here we want to find estimates and standard errors for $v$, $\alpha$ and $\beta$.

c. It is known that $\sigma_t^2 = (\theta + \delta x_t)^2$, but the parameters $\theta$ and $\delta$ are unknown. Here we want to find estimates and standard errors for $v$, $\alpha$, $\beta$, $\theta$ and $\delta$.

d. It is known that $\sigma_t^2 = \theta + \delta u_{t-1}^2$, but the parameters $\theta$ and $\delta$ are unknown. Here we want to find estimates and standard errors for $v$, $\alpha$, $\beta$, $\theta$ and $\delta$.

e. It is only known that $\sigma_t^2$ is stationary. Here we want to find estimates and standard errors for $v$, $\alpha$ and $\beta$.

2. (25 points) Provide an answer for each of the following six questions. You must support any "agree/disagree" answer with a careful explanation.

a. (2 points) Agree or disagree with the following statement using a short answer: "Let us consider the case of an overidentified model with moment condition
[equation missing in source]
where $g_i(\beta) = g(y_i, z_i, x_i, \beta)$, $y = x'\beta + e$ and $n$ indicates the sample size. The multinomial distribution which places probability $p_i$ at each observation $(y_i, z_i, x_i)'$ will satisfy this condition if and only if
[equation missing in source]
Then, the empirical likelihood (EL) estimate of $\beta$ maximizes
[equation missing in source]
where $\lambda(\beta)$ is a Lagrange multiplier."

b. (3 points) Consider the two linear simultaneous equations (i.e. $G = 2$) with two exogenous variables ($K = 2$)
[equations missing in source]
where $u = (u_1, u_2)'$ and $E[uu'] = \Sigma$ [matrix definitions missing in source]. According to each of the following restrictions on the system parameters, state if you agree or disagree with the following statements.

(i) A Seemingly Unrelated Regressions (SUR) model sets $\gamma_{12} = \gamma_{21} = 0$, and then the first equation is just identified.

(ii) If $\delta_{12} = \delta_{22} = 0$, then the first equation is not identified.

(iii) If $\delta_{12} = \delta_{21} = 0$, then the first equation is just identified if $\delta_{22} \neq 0$.

c. (5 points) Consider the following statement: "With time series data a spurious regression can occur when one random walk process is regressed on an intercept and another random walk process and the two random walk processes are independent of each other. However, if the two time series are trend stationary, then the spurious regression problem disappears and the plim of the OLS slope estimate is zero when the series are independent of each other." Do you agree or disagree with all or part of this statement? Please give the details and rationale for your answer.
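
The first half of the statement in part c (independent random walks producing apparently significant slopes) can be illustrated by simulation. The following is a minimal sketch, not part of the exam: the function names `ols_slope_tstat` and `rejection_rate`, the sample size, the number of replications and the seed are all assumptions chosen for illustration. It uses conventional (non-robust) OLS standard errors and a nominal 5% two-sided test.

```python
import numpy as np


def ols_slope_tstat(y, x):
    """OLS of y on an intercept and x; returns (slope, conventional t statistic)."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - 2)      # homoskedastic variance estimate
    cov = sigma2 * np.linalg.inv(X.T @ X)      # usual OLS covariance matrix
    return beta[1], beta[1] / np.sqrt(cov[1, 1])


def rejection_rate(T=200, reps=500, seed=0, random_walk=True):
    """Share of replications in which |t| > 1.96 for two independent series.

    random_walk=True : both series are independent random walks.
    random_walk=False: both series are independent white noise (benchmark).
    """
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(reps):
        ey = rng.standard_normal(T)
        ex = rng.standard_normal(T)
        y = np.cumsum(ey) if random_walk else ey
        x = np.cumsum(ex) if random_walk else ex
        _, t = ols_slope_tstat(y, x)
        rejections += int(abs(t) > 1.96)
    return rejections / reps


if __name__ == "__main__":
    print("random walks:", rejection_rate(random_walk=True))
    print("white noise :", rejection_rate(random_walk=False))
```

With independent white noise the rejection frequency should sit near the nominal level, while with independent random walks it should be far above it. The sketch does not cover the trend-stationary case raised in the second half of the statement.
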

d. (5 points) Discuss practical issues involved in obtaining standard errors that are robust to serial correlation (Newey-West standard errors) in a time series regression. Discuss the adjustments, if any, needed to make serial-correlation-robust standard errors valid when there is also heteroskedasticity in the model.

e. (5 points) Suppose you want to determine the effect of participating in a job training program (as indicated by $jtrain$, a dummy variable) on subsequent employment probability; let $employ$ be the binary variable equal to one if the person is employed after the program. Using a random sample of 621 people, the OLS regression of $employ_i$ on $jtrain_i$, $x_i$, where $x_i$ is a vector of controls, yields a coefficient on $jtrain_i$ of .072 (standard error = .029). The probit coefficient on $jtrain$ is larger, .232 (standard error = .091). Does this finding prove that probit provides a better estimate of the effect of job training than the linear probability model?

f. (5 points) In a balanced panel data setup with $T = 5$, suppose you estimate a linear, unobserved effects model by fixed effects and first differencing. Evaluate the following statement: "A large difference in the FE and FD estimates is likely due to serial correlation in the idiosyncratic errors."

3. (25 points) Consider the linear regression model
$Y = X\beta + u$
where $Y$ is an $n \times 1$ matrix of data, $X$ is an $n \times (k+1)$ matrix of data and $\beta$ is a $(k+1) \times 1$ vector of parameters. The first column of $X$ contains the intercept regressor. Assume that the data represent a random sample from the population, assume $\mathrm{rank}(X) = k + 1$ and assume $\mathrm{Var}(u \mid X) = \sigma^2 I_n$ where $I_n$ is an $(n \times n)$ identity matrix.

a. Consider the estimator of $\beta$ given by $\tilde\beta = C'Y$ where $C$ is an $n \times (k+1)$ matrix that does not depend on the $Y$ data. State sufficient assumptions about $C$ so that $\tilde\beta$ is an unbiased estimator. Let $\hat\beta$ be the ordinary least squares (OLS) estimator of $\beta$. State sufficient assumptions that make $\hat\beta$ an unbiased estimator.

b. Suppose your assumptions in part (a) are true.
Derive a formula for
[expression missing in source]
that is a function of the variance-covariance matrices $\mathrm{Var}(\tilde\beta) = E[(\tilde\beta - \beta)(\tilde\beta - \beta)']$. Treat the variances and covariances as conditional on $X$ and $C$ and assume $\mathrm{Var}(u \mid X, C) = \sigma^2 I_n$.

c. Suppose your assumptions in part (a) are true. Prove that, conditional on $X$ and $C$, the matrix
[expression missing in source]
is positive semi-definite.

d. Suppose that $E(u \mid X) \neq 0$ (and is not a vector of constants). Prove that $\hat\beta$ is biased and compute an approximation of the bias (assume that any laws of large numbers or central limit theorems you need hold). Suppose there is an $n \times (k+1)$ matrix of data, $Z$, such that $E(u \mid Z) = 0$ and $\mathrm{rank}(Z'X) = k + 1$. Let $C = Z(X'Z)^{-1}$. Is $\tilde\beta$ an unbiased estimator for this particular $C$ matrix? If yes, provide the derivation. If not, explain why not.

e. What does the Gauss-Markov theorem have to say about $\mathrm{Var}(\tilde\beta)$ compared to $\mathrm{Var}(\hat\beta)$ (conditional on $X$ and $C$) for the situation described in part (d)? Would your answer change if you could assume $E(u \mid X) = 0$? Please provide details.

4. (25 points) For a random draw $i$ from the cross section, let $\{(x_{it}, y_{it}) : t = 1, \ldots, T\}$ be the data over $T$ time periods, where the covariates $x_{it}$ all vary over time. The response variable, $y_{it}$, has range $0 \leq y_{it} \leq 1$; that is, $y_{it}$ is a "fractional response." You think that unobserved heterogeneity, $c_i$, is likely to be correlated with $\{x_{it} : t = 1, \ldots, T\}$.

a. Suppose you specify a linear, unobserved effects model
[equation missing in source]
where $x_i = (x_{i1}, \ldots, x_{iT})$. What are the strengths and weaknesses of such a model for fractional responses? How would you estimate $\beta$ and its asymptotic variance?

b. Suppose that $y_{it}$ takes on the values zero and one with positive probability but is continuous on $(0, 1)$. A sensible model can be written in latent variable form: $y_{it}^* = x_{it}\beta + c_i + u_{it}$ and
$y_{it} = 0$ if $y_{it}^* \leq 0$
$y_{it} = y_{it}^*$ if $y_{it}^* > 0$ and $y_{it}^* < 1$
$y_{it} = 1$ if $y_{it}^* \geq 1$.
If $u_{it} \mid x_i, c_i \sim \mathrm{Normal}(0, \sigma_u^2)$, find $P(y_{it} = 0 \mid x_i, c_i)$ and $P(y_{it} = 1 \mid x_i, c_i)$.

c. Model the relationship between $c_i$ and $x_i$ using the Chamberlain-Mundlak device:
[equation missing in source]
Under the normality assumption from part b, find the density of $y_{it}$ given $x_i$.

d. Under the assumptions of part c, which parameters are identified? How would you estimate those parameters and how would you test that $c_i$ is independent of $x_i$?

e. Suppose that you directly specify the expectation
[equation missing in source]
where $\Lambda(z) = \exp(z)/[1 + \exp(z)]$ is the logistic function. How would you estimate the parameters in this case and perform inference?
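
The logistic function $\Lambda(z)$ in part e maps any real index into the unit interval, which is what makes it a natural mean function for a fractional response. The sketch below, not part of the exam, evaluates $\Lambda(z)$ in a numerically stable way and contrasts it with an unrestricted linear index; the function name `logistic` and the grid of index values are assumptions chosen for illustration.

```python
import numpy as np


def logistic(z):
    """Lambda(z) = exp(z) / (1 + exp(z)), evaluated without overflow.

    Uses e = exp(-|z|), which is always in (0, 1], and picks the
    algebraically equivalent branch for each sign of z.
    """
    z = np.asarray(z, dtype=float)
    e = np.exp(-np.abs(z))
    return np.where(z >= 0, 1.0 / (1.0 + e), e / (1.0 + e))


if __name__ == "__main__":
    index = np.linspace(-3.0, 3.0, 7)   # hypothetical values of the index
    print("index    :", index)          # a linear mean would leave [0, 1]
    print("Lambda(z):", logistic(index))
```

A linear mean equal to the index itself would fall outside $[0, 1]$ for several of these values, whereas $\Lambda(z)$ stays inside the interval by construction.
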