Minimax-Regret Sample Design in Anticipation of Missing Data, With Application to Panel Data. Jeff Dominitz RAND. and


Minimax-Regret Sample Design in Anticipation of Missing Data, With Application to Panel Data

Jeff Dominitz
RAND

and

Charles F. Manski
Department of Economics and Institute for Policy Research, Northwestern University

May 2018

Abstract

Missing data problems are ubiquitous in data collection. In surveys, these problems may arise from unit nonresponse, item nonresponse, and panel attrition. Building on the Dominitz and Manski (2017) study of choice between two or more sampling processes that differ in cost and quality, we study minimax-regret sample design in anticipation of missing data, where the collected data will be used for prediction under square loss of the values of functions of two variables. The analysis imposes no assumptions that restrict unobserved outcomes. Findings are reported for prediction of the values of linear and indicator functions using panel data with attrition. We also consider choice between a panel and repeated cross sections.

We are grateful for the comments of Max Tabord-Meehan.

1. Introduction

Missing data problems are ubiquitous in data collection. In surveys, missing data may arise from unit nonresponse, item nonresponse, and panel attrition. Researchers who want to minimize the mean square error of estimates in surveys with missing data should be concerned with both bias and variance, as recommended in the literature on total survey error. However, statisticians have focused on variance, as explained by Groves and Lyberg (2010): "The total survey error format forces attention to both variance and bias terms ... Most statistical attention to surveys is on the variance terms largely, we suspect, because that is where statistical estimation tools are best found" (p. 868).

Dominitz and Manski (2017) provided tools for making sample design choices that explicitly account for both variance and bias while imposing no assumptions that restrict the unobserved outcomes. The analysis used the Wald framework of statistical decision theory to study choice between two or more sampling processes that differ in the cost of data collection and the quality of the data obtained, where data quality is determined by the response rate. The study focused on minimax-regret sample design for prediction of a real-valued outcome under square loss; that is, design that minimizes maximum mean square error when a reasonable and tractable predictor will be used.

The ideal, but unknown, best predictor in this setting is the population mean outcome. It is computationally challenging to determine the value and maximum regret of the predictor that minimizes maximum regret. Seeking an approach that is both tractable and reasonable, we studied prediction using the midpoint of a sample analog estimate of the identification region for the population mean. This midpoint predictor is easy to compute and its maximum regret has a simple and sensible analytical form. If the identification interval for the population mean were known rather than estimated, then its midpoint would be the minimax-regret prediction, which we find to be another appealing aspect of this midpoint predictor.

We now build on this framework to study minimax-regret sample design in anticipation of missing data, where the collected data will be used for prediction under square loss of functions of two variables.

Relative to our previous study, addressing this expanded prediction problem requires attention to additional dimensions of data cost and quality. We specifically study choice of sample size for a two-period panel with attrition. That is, we consider longitudinal data collection with a 100-percent response rate in period 1 and some nonresponse in period 2. Some findings apply as well to collection of data on two household members and to cross-sectional surveys with item nonresponse.

Section 2 summarizes key elements of and findings from our previous study and calls attention to some complications that must be addressed in the expanded prediction problem. Analysis may be impacted not only by the higher dimension of the data but also by the form of the function whose value is to be predicted. As in our previous study, one must take a stand on how the data will be used before making sample design decisions.

Section 3 studies the maximum regret of sample designs for prediction of two types of functions. The cases we study are prediction of the value of linear and indicator functions. In both cases, we presume knowledge of the response rate and use of a midpoint predictor akin to that posed in the previous study. Again, the attractions of midpoint predictors are that they are easy to compute and have sensible analytical forms for regret. Again, it is computationally challenging to determine the value and maximum regret of the predictor that minimizes maximum regret when the identification region for the population mean must be estimated.

Section 4 compares the maximum regret of predictions with panel data to the maximum regret of predictions with repeated cross-sectional (RCS) data. We find that RCS data collection often yields smaller maximum regret than a panel with equivalent sample size and cost when interest centers on prediction of linear functions. A particularly striking result is that RCS yields smaller maximum regret than a panel with complete response. The reason is that RCS draws an independent random sample each period, but panel observations may be correlated across periods. However, collection of panel data is typically more informative than RCS when the problem is to predict the value of an indicator function. Section 5 discusses extensions of the analysis.

2. Best Prediction under Square Loss of Functions of Two Variables

Consider best prediction under square loss of a bounded real function $f(y_1, y_2)$, where $(y_1, y_2)$ take values in a bounded interval in $R^2$, normalized to be the unit square $[0, 1] \times [0, 1]$. Let $P(y_1, y_2)$ be the probability distribution of $(y_1, y_2)$ in a population that is a continuum. Then the best predictor is $E[f(y_1, y_2)]$. The regret of a predictor based on sample data is its mean square error. The subscripts 1 and 2 may refer to time periods in a panel study, a husband and a wife in a study of households, or two different variables associated with each individual in a cross-section.

Suppose a random sample is drawn from $P(y_1, y_2)$, but there may be missing data. Let $z_t = 1$ if $y_t$ is observed and $z_t = 0$ if $y_t$ is missing, for t = 1, 2. In a cross-sectional survey, unit nonresponse means that $z_1 = z_2 = 0$, whereas item nonresponse means that either $(z_1 = 0, z_2 = 1)$ or $(z_1 = 1, z_2 = 0)$. In a panel with full response in the first period but some attrition in the second, attrition means that $(z_1 = 1, z_2 = 0)$. We focus on this case. Thus, we assume $P(z_1 = 1) = 1$, but permit $P(z_2 = 1) < 1$. We assume knowledge of the response rate in the second period but no knowledge of the composition of nonresponse.

Although we focus on collection of panel data, our analysis applies as well to household surveys in which a researcher always interviews a specified spouse, but interviews only a subset of the other spouses. It also applies to surveys of individuals in which all sample members respond to one question but only a subset responds to another question. For example, the first question may ask about a non-sensitive matter such as age or education, while the second asks about a sensitive matter such as income or drug use.

In general, survey response may vary with the process used to collect data. For example, a person may agree to provide data in an internet survey but not to be interviewed face-to-face. To make the dependence of response on the survey process explicit, we could denote the process by q and the missing-data indicators by $(z_{q1}, z_{q2})$. For simplicity, we keep the q notation implicit in most of the paper.

2.1 Previous Findings

Dominitz and Manski (2017) studied minimax-regret sample design for best prediction under square loss of a function of one variable, when a high-cost/high-quality sampling process accurately measures the outcome of each sample member and a low-cost/low-quality sampling process has nonresponse. The analysis assumed knowledge of the response rate, but it imposed no restrictions on the values of the data missing due to nonresponse.

Using the present notation, one chooses between two processes for measuring $y_1$. The high-cost process has $P(z_1 = 1) = 1$, whereas the low-cost process has known $P(z_1 = 1) < 1$. Therefore, the high-cost process point-identifies $E(y_1)$, whereas the low-cost process partially identifies $E(y_1)$.

With a predetermined budget, the minimax-regret choice between these two sampling processes is easy to determine when it is assumed that specific reasonable predictors will be used. We assumed that a sample-average predictor will be calculated based on the high-cost data and a midpoint predictor will be calculated based on the low-cost data. The midpoint predictor is the middle of a sample analog estimate of the identification region for the population mean. Among other findings, we showed that the maximum regret of the midpoint predictor is smaller than that of sample-average predictors that have been commonly used by researchers who face missing data problems. The latter predictors use ignorability assumptions to impute missing values or to motivate discarding these sample members.

The analysis generalizes to designs that combine low-cost and high-cost sampling processes, under the assumption that the observed outcomes will be pooled.
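To make the one-variable midpoint predictor concrete, the following is a minimal sketch, assuming outcomes are normalized to [0, 1], nonresponse is coded as `None`, and the response rate is known. The function name and data layout are hypothetical and are not taken from Dominitz and Manski (2017). The sketch forms the sample analog of the identification interval for the population mean, which has width equal to the nonresponse rate, and returns its midpoint.

```python
from typing import Optional, Sequence, Tuple


def midpoint_predictor(
    outcomes: Sequence[Optional[float]], response_rate: float
) -> Tuple[float, float, float]:
    """Sample-analog bound on a population mean with missing data.

    outcomes: values in [0, 1]; None marks a nonrespondent (hypothetical coding).
    response_rate: the known P(z = 1).
    Returns (lower, upper, midpoint) of the estimated identification interval.
    """
    observed = [y for y in outcomes if y is not None]
    if not observed and response_rate > 0:
        raise ValueError("no observed outcomes despite a positive response rate")
    mean_observed = sum(observed) / len(observed) if observed else 0.0
    # Missing outcomes may lie anywhere in [0, 1], so the interval has
    # width equal to the nonresponse rate.
    lower = mean_observed * response_rate
    upper = lower + (1.0 - response_rate)
    return lower, upper, (lower + upper) / 2.0


lower, upper, mid = midpoint_predictor([1.0, 0.0, None, 1.0], response_rate=0.75)
print(lower, upper, mid)
```

The midpoint assigns the value ½ to every missing outcome, which is what distinguishes it from predictors that impute or discard nonrespondents under ignorability assumptions.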
Further, when the budget is not predetermined, the analysis shows how to choose a budget sufficient to achieve an ε-optimal design; that is, a budget sufficient to make maximum regret less than a specified ε > 0.

2.2 Application to Functions of Two Variables

Predicting the value of a function of two variables requires attention to additional dimensions of cost and quality, as well as to the form of the function. The general approach to analyzing the problem,

however, is unchanged. First, we determine the identification region for the best predictor under the assumed data generating process. Second, we define for any sample size a midpoint predictor based on a sample analog of this identification region. Finally, we solve for the maximum regret of this predictor in the case of indicator functions and an informative upper bound on maximum regret in the case of linear functions.

When $f(y_1, y_2)$ is a general function, the analysis of Dominitz and Manski (2017) yields an outer bound on $E[f(y_1, y_2)]$; that is, a bound that holds but need not be sharp. The midpoint predictor studied there is applicable, but it may not be the best possible. Suppose that one knows the rate of complete response; that is, $P(z_1 = z_2 = 1)$. The outer bound is obtained by (a) noting that the value of $f(y_1, y_2)$ is observed if both $y_1$ and $y_2$ are observed and (b) considering the value of $f(y_1, y_2)$ to be missing otherwise. The bound obtained in this manner may not be sharp because observation of either but not both of $y_1$ and $y_2$ may constrain the value of $f(y_1, y_2)$. Section 3 studies two classes of functions in which this occurs.

3. Linear and Indicator Functions of Two Variables

The main new contributions of this paper are to study prediction of the values of two classes of functions whose structure is such that observation of $y_1$ alone may be informative about the value of $f(y_1, y_2)$. We consider prediction of the value of linear functions in Section 3.1 and indicator functions in Section 3.2. We assume complete response in period 1 and that one knows the response rate in period 2. We suppose throughout that the midpoint of a sample analog of the identification region for $E[f(y_1, y_2)]$ is used to predict the function value.

To motivate the classes of functions we analyze, consider studies of employment dynamics, such as those that have utilized the Panel Study of Income Dynamics for the past 50 years. Let $y_1$ and $y_2$ denote the fraction of the year that a person works in years 1 and 2. Interest may center on employment change

$(y_2 - y_1)$ or average employment $(y_1 + y_2)/2$. Section 3.1 covers such linear functions. Alternatively, one may want to predict the occurrence of an event, such as employment growth. Then the objective is prediction of the value of the indicator function $1[y_2 > y_1]$. Section 3.2 covers prediction of indicator functions.

3.1 Linear Functions

Let $f(y_1, y_2) = a + b y_1 + c y_2$, for known values of (a, b, c). For ease of exposition, we consider functions where $b \ge 0$ and $c \ge 0$. The analysis may be extended to other cases. The best predictor is $E[f(y_1, y_2)] = a + bE(y_1) + cE(y_2)$. The identification region for $E[f(y_1, y_2)]$ with panel data is the interval

(1)  $[\,a + bE(y_1) + cE(y_2 \mid z_2 = 1)P(z_2 = 1),\; a + bE(y_1) + cE(y_2 \mid z_2 = 1)P(z_2 = 1) + cP(z_2 = 0)\,]$.

This interval may be derived by applying the Law of Iterated Expectations. The lower bound obtains when $P(y_2 \mid z_2 = 0)$ is degenerate at the value 0, the lower limit of the support of $P(y_2)$. The upper bound obtains when $P(y_2 \mid z_2 = 0)$ is degenerate at the value 1, the upper limit of the support of $P(y_2)$. The width of this interval is $cP(z_2 = 0)$.

3.1.1 Panel Data Midpoint Predictor

Let $m_t$ be the sample average of the observed values in period t; that is, $m_1 = \frac{1}{N_1}\sum_{i=1}^{N_1} y_{1i}$ and $m_2 = \frac{1}{N_2}\sum_{i: z_{2i} = 1} y_{2i}$, where $N_2$ is the number of sample members who respond in period 2. A sample-analog midpoint predictor is

(2)  $a + b m_1 + c\,[\,m_2 P(z_2 = 1) + \tfrac{1}{2} P(z_2 = 0)\,]$.
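The panel-data midpoint predictor for a linear function can be sketched in a few lines. This is an illustrative sketch only: it assumes the predictor takes the form a + b·m1 + c·[m2·P(z2 = 1) + ½·P(z2 = 0)] with b, c ≥ 0, outcomes in [0, 1], full response in period 1, and attrition coded as `None` in period 2. The function and argument names are hypothetical.

```python
from typing import Optional, Sequence, Tuple


def linear_midpoint_predictor(
    y1: Sequence[float],
    y2: Sequence[Optional[float]],
    a: float,
    b: float,
    c: float,
    response_rate: float,
) -> Tuple[float, float, float]:
    """Sample-analog bound and midpoint for E[a + b*y1 + c*y2], with b, c >= 0.

    y1: period-1 outcomes for the full initial sample (no missing values).
    y2: period-2 outcomes, with None marking attrition (hypothetical coding).
    response_rate: the known period-2 response rate P(z2 = 1).
    Returns (lower, upper, midpoint).
    """
    m1 = sum(y1) / len(y1)
    observed2 = [v for v in y2 if v is not None]
    m2 = sum(observed2) / len(observed2) if observed2 else 0.0
    # Lower bound: every missing y2 equals 0. Upper bound: every missing y2 equals 1.
    lower = a + b * m1 + c * m2 * response_rate
    width = c * (1.0 - response_rate)
    return lower, lower + width, lower + width / 2.0


# Average employment (y1 + y2) / 2 with half of the panel lost to attrition.
bounds = linear_midpoint_predictor(
    y1=[0.5, 1.0, 0.0, 0.5],
    y2=[1.0, None, 0.5, None],
    a=0.0, b=0.5, c=0.5,
    response_rate=0.5,
)
print(bounds)
```

Note that the width of the interval depends only on c and the attrition rate, so period-1 data narrow nothing here; they only determine where the interval sits.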

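The regret of the predictor is its mean square error, the sum of squared bias and variance. To find the squared bias of (2) with known period-2 response rate $P(z_2 = 1)$, use the Law of Iterated Expectations to write the best predictor as $a + bE(y_1) + c\,[\,E(y_2 \mid z_2 = 1)P(z_2 = 1) + E(y_2 \mid z_2 = 0)P(z_2 = 0)\,]$. Under random sampling, $E(m_1) = E(y_1)$ and $E(m_2) = E(y_2 \mid z_2 = 1)$. Bias arises from deviation between $E(y_2 \mid z_2 = 0)P(z_2 = 0)$ and the midpoint predictor's assigned value of $\tfrac{1}{2} P(z_2 = 0)$. Squared bias is therefore $c^2\,[\,E(y_2 \mid z_2 = 0) - \tfrac{1}{2}\,]^2 P(z_2 = 0)^2$.

The variance of the predictor is

(3)  $b^2 V(m_1) + c^2 P(z_2 = 1)^2 V(m_2) + 2bc\,P(z_2 = 1)\,\mathrm{Cov}(m_1, m_2)$.

Random sampling implies that $V(m_1) = V(y_1)/N_1$. We show in the Appendix expressions for $V(m_2)$ and $\mathrm{Cov}(m_1, m_2)$, labeled (4a) and (4b). Substituting these into (3) yields expression (5) for the variance of the predictor.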

To maximize regret, one can consider the variance and bias terms separately. This is so because bias depends only on $E(y_2 \mid z_2 = 0)$, which can vary independently of all the quantities that determine variance. Squared bias is maximized if $P(y_2 \mid z_2 = 0)$ is degenerate at 0 or 1, in which case maximum squared bias is $c^2 P(z_2 = 0)^2/4$.

Maximum variance across all states of nature has a simple form in the polar cases of no response and complete response in period 2. When $P(z_2 = 1) = 0$, the variance of the predictor reduces to $b^2 V(y_1)/N_1$. $V(y_1)$ is maximized at ¼ when $P(y_1)$ is Bernoulli with mean ½. Hence, maximum variance is $b^2/(4N_1)$. When $P(z_2 = 1) = 1$, the variance of the predictor reduces to

$\frac{1}{N_1}\,[\,b^2 V(y_1) + c^2 V(y_2 \mid z_2 = 1) + 2bc\,\mathrm{Cov}(y_1, y_2 \mid z_2 = 1)\,]$.

The variance and covariance terms can each take a maximum value of ¼ when $(y_1, y_2)$ take values in $[0, 1] \times [0, 1]$. That is,

(a) $\max V(y_1) = \tfrac{1}{4}$,
(b) $\max V(y_2 \mid z_2 = 1) = \tfrac{1}{4}$,
(c) $\max \mathrm{Cov}(y_1, y_2 \mid z_2 = 1) = \tfrac{1}{4}$.

The maxima in (b) and (c) are both achieved when $P(y_1, y_2 \mid z_2 = 1)$ is bivariate Bernoulli with mean (½, ½) and covariance ¼. Finally, with $P(y_1, y_2 \mid z_2 = 1)$ bivariate Bernoulli with mean (½, ½), the maximum in (a) is achieved when $P(y_1 \mid z_2 = 0)$ is also Bernoulli with mean ½.

It appears difficult to determine maximum variance across all states of nature in non-polar cases. However, we can determine the maximum of an informative upper bound on the variance of the midpoint predictor. The Appendix derives such an upper bound, given in (6). Using the same argument as for the case with $P(z_2 = 1) = 1$, we obtain the maximum of this upper bound on the variance. Putting the maximum of the upper bound on variance and maximum squared bias together gives the upper bound (7) on the maximum regret of the panel data midpoint predictor.

Inspection of (7) reveals that the upper bound on maximum regret is decreasing in the initial sample size $N_1$, holding the response rate fixed. This holds because the upper bound on maximum variance declines while maximum squared bias is unchanged.

Observe that the upper bound (7) reduces to $(b + c)^2/(4N_1)$ in the polar case where $P(z_2 = 1) = 1$. In this case, the upper bound is achieved when $P(y_1, y_2 \mid z_2 = 1)$ is bivariate Bernoulli with mean (½, ½) and covariance ¼. In the polar case of $P(z_2 = 1) = 0$, (7) reduces to $b^2/(4N_1) + c^2/4$. In this case, the upper bound is achieved when $P(y_1)$ is Bernoulli with mean ½ and $P(y_2 \mid z_2 = 0)$ is degenerate at 0 or 1.
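The two polar cases can be checked numerically. The sketch below is illustrative and rests on assumptions spelled out here: with complete response, the predictor's variance is [b²·V(y1) + c²·V(y2) + 2bc·Cov(y1, y2)]/N1 and bias is zero; with no period-2 response, variance is b²·V(y1)/N1 and maximum squared bias is c²/4. Plugging in the worst case named in the text (variances and covariance all equal to ¼) gives the polar-case maxima. Function names are hypothetical.

```python
import math


def variance_full_response(b: float, c: float, n1: int,
                           var1: float, var2: float, cov12: float) -> float:
    """Variance of a + b*m1 + c*m2 when every sample member responds in period 2."""
    return (b * b * var1 + c * c * var2 + 2.0 * b * c * cov12) / n1


def max_regret_full_response(b: float, c: float, n1: int) -> float:
    """Worst case with complete response: variances and covariance all equal 1/4."""
    return (b + c) ** 2 / (4.0 * n1)


def max_regret_no_response(b: float, c: float, n1: int) -> float:
    """Worst case with no period-2 response: variance b^2/(4 N1) plus squared bias c^2/4."""
    return b * b / (4.0 * n1) + c * c / 4.0


# Perfectly correlated Bernoulli(1/2) outcomes attain the full-response maximum.
worst = variance_full_response(b=1.0, c=1.0, n1=100, var1=0.25, var2=0.25, cov12=0.25)
print(worst, max_regret_full_response(1.0, 1.0, 100))
print(max_regret_no_response(1.0, 1.0, 100))
```

The second number illustrates why sample size cannot rescue a design with heavy attrition: the c²/4 bias term does not shrink as N1 grows.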

3.2 Indicator Functions

Suppose now that the objective is best prediction of the event that $(y_1, y_2)$ take values in some set $A \subset [0, 1] \times [0, 1]$. Then $f(y_1, y_2)$ is the indicator function $1[(y_1, y_2) \in A]$ and the best predictor is $P[(y_1, y_2) \in A]$. Examples include the event that $(y_1, y_2)$ take specified values, $y_1 = j$ and $y_2 = k$; the event of growth, $y_2 > y_1$; and the event that the average outcome exceeds a threshold, $(y_1 + y_2)/2 > \gamma$ for some $\gamma \in (0, 1)$.

For ease of exposition, we focus on settings in which observation of $y_1$ may imply that $(y_1, y_2) \notin A$ but cannot imply that $(y_1, y_2) \in A$. This holds, for example, when $A = \{(y_1, y_2): y_1 = j, y_2 = k\}$. Then $y_1 \ne j$ implies that $(y_1, y_2) \notin A$, but $y_1 = j$ does not imply that $(y_1, y_2) \in A$. It also holds when $A = \{(y_1, y_2): (y_1 + y_2)/2 > \gamma\}$ and γ > ½. Then $y_1 < 2\gamma - 1$ implies that $(y_1 + y_2)/2 < \gamma$, but $y_1 > 2\gamma - 1$ does not imply that $(y_1 + y_2)/2 > \gamma$. The analysis may be extended to settings where observation of $y_1$ may imply that $(y_1, y_2) \in A$.

To obtain the identification region for $P[(y_1, y_2) \in A]$, define the binary random variable u as follows: u = 1 if $z_2 = 0$ and the observed value of $y_1$ implies $(y_1, y_2) \notin A$; u = 0 otherwise. The identification region for $P[(y_1, y_2) \in A]$ is the interval

(8)  $[\,P[(y_1, y_2) \in A, z_2 = 1],\; P[(y_1, y_2) \in A, z_2 = 1] + P(z_2 = 0, u = 0)\,]$.

This interval may be derived by applying the Law of Total Probability and recalling that, when $z_2 = 0$, the event $(y_1, y_2) \in A$ can occur only when u = 0.

3.2.1 Panel Data Midpoint Predictor

A midpoint predictor based on (8) is the midpoint of its sample analog, namely

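(9)  $\frac{1}{N_1}\sum_{i=1}^{N_1} \{\,1[(y_{1i}, y_{2i}) \in A, z_{2i} = 1] + \tfrac{1}{2}\,1[z_{2i} = 0, u_i = 0]\,\}$.

To solve for the maximum regret of this predictor, we first derive its squared bias and variance. We then maximize the sum. To shorten the notation, we define two Bernoulli random variables $v_i = 1[(y_{1i}, y_{2i}) \in A, z_{2i} = 1]$ and $w_i = 1[z_{2i} = 0, u_i = 0]$. Then we rewrite (9) as

(9′)  $\frac{1}{N_1}\sum_{i=1}^{N_1} (v_i + \tfrac{1}{2} w_i)$.

Squared Bias

To find the squared bias, use the Law of Total Probability to write the best predictor

(10)  $P[(y_1, y_2) \in A] = P[(y_1, y_2) \in A, z_2 = 1] + P[(y_1, y_2) \in A, z_2 = 0]$.

Under random sampling, $E(v_i) = P[(y_1, y_2) \in A, z_2 = 1]$ and $E(w_i) = P(z_2 = 0, u = 0)$. Therefore, bias arises from deviation between $P[(y_1, y_2) \in A, z_2 = 0]$ and $P(z_2 = 0, u = 0)/2$. The event $[(y_1, y_2) \in A, z_2 = 0]$ implies that u = 0. Hence, $P[(y_1, y_2) \in A, z_2 = 0] = P[(y_1, y_2) \in A \mid z_2 = 0, u = 0]\,P(z_2 = 0, u = 0)$ and squared bias may be expressed as follows:

(11)  $\{\,P[(y_1, y_2) \in A \mid z_2 = 0, u = 0] - 1/2\,\}^2\, P(z_2 = 0, u = 0)^2$.

Variance

To find the variance, note that, under random sampling, the midpoint predictor (9) is a linear function of the bivariate Bernoulli random variable $(v_i, w_i)$, whose realizations are independent and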

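identically distributed across individuals i. Write $v_i = 1[(y_{1i}, y_{2i}) \in A, z_{2i} = 1]$ and $w_i = 1[z_{2i} = 0, u_i = 0]$ for the two components, and let $p_{jk} = P(v_i = j, w_i = k)$ for $j \in \{0, 1\}$ and $k \in \{0, 1\}$. Note that $E(v_i) = p_{10} + p_{11}$, $E(w_i) = p_{01} + p_{11}$, $V(v_i) = E(v_i)[1 - E(v_i)]$, and $V(w_i) = E(w_i)[1 - E(w_i)]$. Analysis of bivariate Bernoulli random variables in Dai et al. (2013), equation (2.12), shows that $\mathrm{Cov}(v_i, w_i) = p_{11}p_{00} - p_{10}p_{01}$. We also know that $P(v_i = 1) \le P(z_2 = 1)$, $P(w_i = 1) \le P(z_2 = 0)$, and $P(v_i = 1, w_i = 1) = 0$. It now follows that $p_{11} = 0$, so that $V(v_i) = p_{10}(1 - p_{10})$, $V(w_i) = p_{01}(1 - p_{01})$, and $\mathrm{Cov}(v_i, w_i) = -p_{10}p_{01}$. Thus, the variance of the midpoint predictor (9) can be written as follows:

(12)  $\frac{1}{N_1}\,[\,p_{10}(1 - p_{10}) + \tfrac{1}{4}\,p_{01}(1 - p_{01}) - p_{10}p_{01}\,]$.

Maximum Regret

Summing (11) and (12), the regret of the midpoint predictor is

(13)  $\frac{1}{N_1}\,[\,p_{10}(1 - p_{10}) + \tfrac{1}{4}\,p_{01}(1 - p_{01}) - p_{10}p_{01}\,] + \{\,P[(y_1, y_2) \in A \mid z_2 = 0, u = 0] - 1/2\,\}^2\, p_{01}^2$.

Note that $p_{01} = P(z_2 = 0, u = 0)$ is found in both the variance and the squared bias components of regret. Hence, in contrast to the case with linear functions, the maximum regret of the predictor of an indicator function cannot be determined by separately maximizing variance and squared bias.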

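Setting $P[(y_1, y_2) \in A \mid z_2 = 0, u = 0]$ = 0 or 1 maximizes squared bias for any feasible value of $p_{01}$, and does not affect variance.[1] Hence, maximum regret for a given value of $(p_{10}, p_{01})$ is

(14)  $\frac{1}{N_1}\,[\,p_{10}(1 - p_{10}) + \tfrac{1}{4}\,p_{01}(1 - p_{01}) - p_{10}p_{01}\,] + \tfrac{1}{4}\,p_{01}^2$.

The problem is to maximize (14) over the feasible range $0 \le p_{10} \le P(z_2 = 1)$ and $0 \le p_{01} \le P(z_2 = 0)$. Fix $p_{01}$ at any feasible value and differentiate (14) with respect to $p_{10}$. The derivative $(1 - 2p_{10} - p_{01})/N_1$ is decreasing in $p_{10}$. Hence, the maximum occurs at the interior solution $p_{10} = (1 - p_{01})/2$ if this is a feasible value of $p_{10}$ and at the boundary $p_{10} = P(z_2 = 1)$ otherwise. Considering first the interior solution for $p_{10}$, plug $p_{10} = (1 - p_{01})/2$ into (14) and solve the concentrated optimization problem

(15)  $\max_{p_{01}}\; \frac{1 - p_{01}}{4N_1} + \frac{p_{01}^2}{4}$  s.t. $p_{01} \ge 0$, $p_{01} \le P(z_2 = 0)$.

The derivative $-1/(4N_1) + p_{01}/2$ is increasing in $p_{01}$. Therefore, with an interior solution for $p_{10}$, regret is maximized at the boundary where either $p_{01} = 0$ or $p_{01} = P(z_2 = 0)$, in which case $p_{10} = \tfrac{1}{2}$ or $p_{10} = P(z_2 = 1)/2$, respectively. Inspection of the possible boundary solutions shows that, when $P(z_2 = 1) \le 1 - 1/N_1$, maximum regret occurs where $p_{01} = P(z_2 = 0)$ and $p_{10} = P(z_2 = 1)/2$. It follows that the maximum regret of the midpoint predictor (9) in this typical setting is

(16)  $\frac{P(z_2 = 1)}{4N_1} + \frac{P(z_2 = 0)^2}{4}$.

[1] Note that $P[(y_1, y_2) \in A \mid z_2 = 0, u = 0]$ is only defined if $P(z_2 = 0, u = 0) > 0$. However, if $P(z_2 = 0, u = 0) = 0$, then there is no missing data problem.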
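The maximization over (p10, p01) can be checked by brute force. The sketch below is an illustration under stated assumptions: regret for a given state is taken to be [p10(1 − p10) + ¼·p01(1 − p01) − p10·p01]/N1 + ¼·p01², the feasible set is 0 ≤ p10 ≤ P(z2 = 1) and 0 ≤ p01 ≤ P(z2 = 0), and the candidate closed form is P(z2 = 1)/(4·N1) + P(z2 = 0)²/4. The function names and the `steps` grid resolution are hypothetical.

```python
def regret_given_state(p10: float, p01: float, n1: int) -> float:
    """Variance of the midpoint predictor plus its maximum squared bias."""
    variance = (p10 * (1.0 - p10) + 0.25 * p01 * (1.0 - p01) - p10 * p01) / n1
    max_squared_bias = 0.25 * p01 ** 2
    return variance + max_squared_bias


def max_regret_closed_form(n1: int, response_rate: float) -> float:
    """Candidate closed form, assumed valid when response_rate <= 1 - 1/n1."""
    return response_rate / (4.0 * n1) + (1.0 - response_rate) ** 2 / 4.0


def max_regret_grid(n1: int, response_rate: float, steps: int = 400) -> float:
    """Brute-force maximum of regret over a grid on the feasible rectangle."""
    best = 0.0
    for i in range(steps + 1):
        p10 = response_rate * i / steps
        for j in range(steps + 1):
            p01 = (1.0 - response_rate) * j / steps
            best = max(best, regret_given_state(p10, p01, n1))
    return best


print(max_regret_grid(50, 0.8), max_regret_closed_form(50, 0.8))
```

Because the grid contains the claimed maximizer (p10 = P(z2 = 1)/2, p01 = P(z2 = 0)) whenever `steps` is even, agreement between the two printed values is a direct check on the boundary argument rather than an approximation to it.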

Inspection of (16) shows that maximum regret is decreasing in the initial sample size $N_1$, holding the response rate fixed. Differentiation of (16) with respect to the response rate shows that, holding the sample size fixed, maximum regret is decreasing in the response rate when $P(z_2 = 1) < 1 - 1/(2N_1)$.

3.2.2 Outer-Bound Midpoint Predictor

To apply the Dominitz and Manski (2017) midpoint predictor to indicator functions of two variables, once again consider the value of $f(y_1, y_2)$ to be missing when $y_2$ is missing. Then the identification region for $E[f(y_1, y_2)]$ is the interval

(17)  $[\,P[(y_1, y_2) \in A, z_2 = 1],\; P[(y_1, y_2) \in A, z_2 = 1] + P(z_2 = 0)\,]$.

A midpoint predictor based on this outer bound on $E[f(y_1, y_2)]$ is

(18)  $\frac{1}{N_1}\sum_{i=1}^{N_1} \{\,1[(y_{1i}, y_{2i}) \in A, z_{2i} = 1] + \tfrac{1}{2}\,1[z_{2i} = 0]\,\}$.

Note that (18) differs from (9) only in the arguments of the second indicator function; that is, $1[z_{2i} = 0]$ versus $1[z_{2i} = 0, u_i = 0]$. Maximum regret of midpoint predictor (18) is identical to maximum regret of midpoint predictor (9), because maximum regret of (9) arises where $P(z_2 = 0, u = 0) = P(z_2 = 0)$ and, therefore, $1[z_{2i} = 0, u_i = 0] = 1[z_{2i} = 0]$ for all i.

Equivalence of the two midpoint predictors with respect to maximum regret does not imply that the two are equivalent in all states. In fact, midpoint predictor (9) dominates (18). That is, its regret is less than that of (18) in states where first-period data may be informative and equals that of (18) in the "worst-case" states where the first-period data are always uninformative. The latter are states in which $z_2 = 0$ always implies that u = 0, so observation of the period-1 outcome is not informative about the value of $E[f(y_1, y_2)]$.
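The difference between the two indicator-function predictors is easiest to see on a small sample. The following sketch assumes the sharp predictor assigns ½ only to attriters whose period-1 outcome does not rule out the event, while the outer-bound predictor assigns ½ to every attriter. The function name, the `None` coding for attrition, and the two callables are hypothetical; the example event is average employment above γ = 0.75, for which a period-1 outcome below 2γ − 1 = 0.5 rules the event out.

```python
from typing import Callable, Optional, Sequence, Tuple


def indicator_midpoint_predictors(
    y1: Sequence[float],
    y2: Sequence[Optional[float]],
    in_event: Callable[[float, float], bool],
    ruled_out_by_y1: Callable[[float], bool],
) -> Tuple[float, float]:
    """Return (sharp, outer) midpoint predictions of P[(y1, y2) in A].

    in_event: True when the fully observed pair lies in A.
    ruled_out_by_y1: True when y1 alone implies the pair is not in A (u = 1).
    """
    sharp_total = 0.0
    outer_total = 0.0
    for first, second in zip(y1, y2):
        if second is not None:
            hit = 1.0 if in_event(first, second) else 0.0
            sharp_total += hit
            outer_total += hit
        else:
            outer_total += 0.5  # every attriter is treated as uninformative
            if not ruled_out_by_y1(first):
                sharp_total += 0.5  # only attriters with u = 0 get the midpoint value
    n1 = len(y1)
    return sharp_total / n1, outer_total / n1


gamma = 0.75
sharp, outer = indicator_midpoint_predictors(
    y1=[1.0, 0.25, 0.75, 0.0],
    y2=[1.0, None, None, 0.5],
    in_event=lambda a, b: (a + b) / 2.0 > gamma,
    ruled_out_by_y1=lambda a: a < 2.0 * gamma - 1.0,
)
print(sharp, outer)
```

The two predictions coincide whenever no attriter's period-1 outcome rules out the event, which is the worst-case configuration described above.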

Maximum regret occurs in states when the first-period data are always uninformative. Hence, the maximum regret of (9) equals the maximum regret of (18).

3.3 Choice of Sample Design

Suppose that a set Q of sampling processes is feasible. Each $q \in Q$ has a cost $\pi_q$ per initial sample member and a vector of response rates $P(z_{q1} = i, z_{q2} = j)$, where i and j equal 0 or 1. We assume for simplicity that the cost per sample member does not depend on whether a person responds in period 2. Also for simplicity, we consider choice between two designs, the lower-cost design having higher attrition.

Let $N_{1L}$ and $N_{1H}$ be the two initial sample sizes. The low-cost/low-quality design L has total cost $\pi_L N_{1L}$ and response rate $P(z_{L2} = 1)$ in period 2, whereas the high-cost/high-quality design H has total cost $\pi_H N_{1H}$ and response rate $P(z_{H2} = 1)$ in period 2, with $0 < \pi_L < \pi_H$ and $P(z_{L2} = 1) < P(z_{H2} = 1) < 1$.

In principle, one may have additional information about the sample design beyond cost and attrition rate that would impact the calculation of maximum regret and choice among designs. For example, the composition of those who choose to respond or not to respond in period 2 could be known to vary across designs even if the designs have identical response rates. Thus, it may be that $P(z_{L2} = 1) = P(z_{H2} = 1)$ yet $P(y_1, y_2 \mid z_{L2} = 1) \ne P(y_1, y_2 \mid z_{H2} = 1)$. We assume that no such information is available prior to data collection.

3.3.1 Allocation of a Predetermined Budget

Suppose that the objective is to predict the value of a linear or indicator function, using the relevant midpoint predictor. Suppose that the planner has a predetermined budget B and must choose between one of the two designs. The feasible sample sizes are $N_{1L} = \mathrm{INT}(B/\pi_L)$ for low-cost sampling and $N_{1H} = \mathrm{INT}(B/\pi_H)$ for high-cost sampling. We henceforth ignore for simplicity the fact that sample sizes must be

integers and take the feasible low-cost sample size to be $N_{1L} = B/\pi_L$ and the feasible high-cost sample size to be $N_{1H} = B/\pi_H$.

Consider first best prediction of the value of the linear function $f(y_1, y_2) = a + b y_1 + c y_2$, for known values of (a, b, c). Using the midpoint predictor (2), the feasible low-cost and high-cost designs yield the upper bounds on maximum regret obtained by evaluating (7) at $(N_{1L}, P(z_{L2} = 1))$ and at $(N_{1H}, P(z_{H2} = 1))$, respectively. Hence, the low-cost design has a smaller upper bound on maximum regret when the budget is less than a certain threshold and the high-cost design does otherwise. The threshold budget is given by (19).

Consider now best prediction of the value of the indicator function $1[(y_1, y_2) \in A]$. Using midpoint predictor (9), the feasible low-cost and high-cost designs yield maximum regret $\frac{\pi_L P(z_{L2} = 1)}{4B} + \frac{P(z_{L2} = 0)^2}{4}$ and $\frac{\pi_H P(z_{H2} = 1)}{4B} + \frac{P(z_{H2} = 0)^2}{4}$, respectively. The low-cost design has smaller maximum regret when the budget is less than a certain threshold and the high-cost design is better otherwise. The threshold budget is

(20)  $\frac{\pi_H P(z_{H2} = 1) - \pi_L P(z_{L2} = 1)}{P(z_{L2} = 0)^2 - P(z_{H2} = 0)^2}$.

3.3.2 Choice of Budget to Achieve ε-optimal Prediction

Now suppose that budget is a choice variable. In principle, the planner should perform a benefit-cost analysis. Devoting a larger budget to data collection improves prediction of outcomes but diverts resources from other uses. The planner must resolve this tension.

Adapting arguments in Manski and Tetenov (2016) regarding sample size selection to enable ε-optimal treatment decisions, Dominitz and Manski (2017) consider choice of a design so that the maximum MSE of the midpoint predictor is no larger than a specified ε > 0. If the objective is prediction of the value of linear functions, the analysis above shows that a budget of size B suffices to achieve this objective if

(21)  the smaller of the two upper bounds (7), evaluated at $(B/\pi_L, P(z_{L2} = 1))$ and at $(B/\pi_H, P(z_{H2} = 1))$, is no larger than ε.

Similarly, for indicator functions, a budget of size B suffices to achieve this objective if

(22)  $\min\{\,\frac{\pi_L P(z_{L2} = 1)}{4B} + \frac{P(z_{L2} = 0)^2}{4},\; \frac{\pi_H P(z_{H2} = 1)}{4B} + \frac{P(z_{H2} = 0)^2}{4}\,\} \le \varepsilon$.

These budget sizes suffice for ε-optimality but may not be necessary. The smallest budgets that enable ε-optimal prediction occur when one uses MMR predictors rather than the tractable midpoint predictors studied in Sections 3.1 and 3.2. In the absence of knowledge of the MMR predictors, we can provide sufficient budget sizes but not necessary ones.

4. Panel Data versus Repeated Cross Sections

This section uses our analysis to guide sample design choice between panel data and repeated cross-sectional (RCS) data, with complete response in each cross-section. Continuing the two-period framework utilized above, RCS data are generated by a sampling process in which two independent random samples are drawn, with $(z_1 = 1, z_2 = 0)$ in one sample and $(z_1 = 0, z_2 = 1)$ in the other; thus, $P(z_1 = 1, z_2 = 1) = 0$. With repeated observations on individual sample members, panel data with full response point-identify the

joint distribution $P(y_1, y_2)$, whereas RCS data point-identify only the period-specific marginal distributions $P(y_1)$ and $P(y_2)$.

Much attention has been paid to estimation of dynamic models that are point-identified by RCS data; see, for example, the review in Verbeek (2005). In early work on this topic, Deaton (1985) restricted attention to linear models that may include an additive fixed effect to be differenced out. Then the outcome of interest is the linear function $a + b y_1 + c y_2$, with $b = -c$. Moffitt (1993) extended the approach to some nonlinear models, focusing on binary choice models where the outcome of interest in each period is an indicator function. Moffitt emphasized that estimation of dynamic models with RCS data is made difficult by the general lack of information on lagged dependent and independent variables and the consequent unobservability of the intertemporal covariances needed to identify and estimate dynamic models (Moffitt, 1993, p. 99).

This line of research, which replaces individual observations with cohort means and uses additional assumptions to identify the models, is often motivated by cases in which panel data are not available. However, it has also been noted that panel data are often inferior to the available cross-sections in some respects (Moffitt, 1993, p. 100), such as smaller sample sizes in each time period, lower rates of response arising from attrition, and large and persistent errors of measurement (Deaton, 1985, p. 110).

Rather than try to identify conditions under which existing panel data should be preferred to existing RCS data or vice versa, here we consider how one should design longitudinal data collection before commencing it. The answer to this question depends crucially on how the data will be used.

4.1 Linear Functions

Under random sampling with no missing data, RCS data point-identify the expectations of linear functions of $y_1$ and $y_2$. As above, let $f(y_1, y_2) = a + b y_1 + c y_2$, and let $m_t$ be the sample average of the $N_t$ observed values in period t. The RCS sample-average predictor is $a + b m_1 + c m_2$. Observe that

$E(m_1) = E(y_1)$, $E(m_2) = E(y_2)$, $V(m_1) = V(y_1)/N_1$, $V(m_2) = V(y_2)/N_2$, and $\mathrm{Cov}(m_1, m_2) = 0$. Squared bias equals 0 and variance is maximized if $P(y_1, y_2)$ is bivariate Bernoulli with mean (½, ½). Maximum regret is

(23)  $\frac{b^2}{4N_1} + \frac{c^2}{4N_2}$.

We may compare (23) with the maximum regret of the panel-data midpoint predictor. To make the comparison precise, let $N_1$ be the period-1 sample size for both RCS and panel data collection. Let $N_2 = N_1$ be the period-2 sample size in each case as well. The period-2 sample is a new random sample in the RCS case and is the sample of period-2 responders in the panel case with no attrition. Let both designs have the same cost per observation. Thus, we compare designs that yield the same numbers of observations in each period and have the same cost, but differ in the composition of the period-2 observations.

As previously noted, the maximum regret of the panel data midpoint predictor is $(b + c)^2/(4N_1)$ when $P(z_2 = 1) = 1$. Comparison with (23), when $N_2 = N_1$, reveals that the maximum regret of the panel data predictor exceeds that of the RCS predictor by $bc/(2N_1)$. This difference is attributable to the potential covariation between the period-1 and period-2 sample averages in a panel.

This finding effectively turns a common argument in favor of a panel over RCS on its head. Unlike RCS, a panel yields information on the joint distribution of outcomes across periods. But this information has no value when the objective is to predict a linear function under square loss. Moreover, the possibility of covariation of outcomes across periods increases the maximum variance of the panel data predictor relative to the RCS predictor, which draws an independent random sample each period.

The comparison is not as straightforward when there is attrition, because we only have an analytical upper bound on the maximum regret of the panel data midpoint predictor. Nonresponse in period 2 may reduce the impact of covariation of outcomes across periods on the maximum variance of the panel data

midpoint predictor relative to the RCS predictor, but nonresponse also increases the predictor's maximum squared bias. In contrast, the RCS predictor is unbiased under the maintained assumptions.

4.2 Indicator Functions

Suppose now that the objective is best prediction of the event that $(y_1, y_2)$ take specified values. The best predictor is $P(y_1 = j, y_2 = k)$, for values j, k in [0, 1]. With panel data, we found in (8) the identification region to be the following interval of width $P(z_2 = 0, u = 0)$:

$[\,P(y_1 = j, y_2 = k, z_2 = 1),\; P(y_1 = j, y_2 = k, z_2 = 1) + P(z_2 = 0, u = 0)\,]$.

With RCS data, the Fréchet bound on a joint probability using knowledge of the marginals gives the identification region as the interval

(24)  $[\,\max\{0,\; P(y_1 = j) + P(y_2 = k) - 1\},\; \min\{P(y_1 = j),\; P(y_2 = k)\}\,]$.

This interval has maximum width ½, which obtains when $P(y_1 = j) = \tfrac{1}{2}$ and $P(y_2 = k) = \tfrac{1}{2}$.

Suppose one were to know the marginal probabilities $P(y_1 = j)$ and $P(y_2 = k)$. Then the minimax-regret predictor would be the midpoint of (24). Without additional prior information, the maximum regret of this midpoint predictor equals its maximum squared bias of 1/16, which obtains when $P(y_1 = j) = \tfrac{1}{2}$ and $P(y_2 = k) = \tfrac{1}{2}$.

Suppose instead that one uses information from a finite sample to estimate the marginal probabilities. Maximum regret of a predictor using finite-sample information on the probabilities cannot be less than maximum regret of the minimax-regret predictor using knowledge of the probabilities. Therefore, maximum regret of a sample-analog RCS midpoint predictor must be no less than 1/16.
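The comparison between RCS and panel data for this event can be sketched numerically. The code below is illustrative only. It computes the Fréchet interval for a joint probability from two marginals, the squared bias of its midpoint in the worst case (the true value at an endpoint), and, as an assumption carried over from Section 3.2, a panel maximum regret of the form P(z2 = 1)/(4·N1) + P(z2 = 0)²/4. Function names are hypothetical.

```python
import math
from typing import Optional, Tuple


def frechet_interval(p1: float, p2: float) -> Tuple[float, float]:
    """Sharp bounds on a joint probability given the two marginal probabilities."""
    return max(0.0, p1 + p2 - 1.0), min(p1, p2)


def rcs_midpoint_max_squared_bias(p1: float, p2: float) -> float:
    """Worst-case squared bias of the interval midpoint: (half the width) squared."""
    lower, upper = frechet_interval(p1, p2)
    return ((upper - lower) / 2.0) ** 2


def panel_max_regret(n1: int, response_rate: float) -> float:
    """Assumed maximum regret of the panel midpoint predictor for indicator functions."""
    return response_rate / (4.0 * n1) + (1.0 - response_rate) ** 2 / 4.0


def threshold_sample_size(response_rate: float) -> Optional[int]:
    """Smallest N1 at which panel maximum regret falls strictly below 1/16.

    Returns None when the response rate is at most 1/2, since the squared-bias
    term alone is then at least 1/16.
    """
    if response_rate <= 0.5:
        return None
    ratio = response_rate / (0.25 - (1.0 - response_rate) ** 2)
    return math.floor(ratio) + 1


print(frechet_interval(0.5, 0.5), rcs_midpoint_max_squared_bias(0.5, 0.5))
print(threshold_sample_size(0.8), panel_max_regret(4, 0.8))
```

Under these assumptions the threshold is small whenever attrition is moderate, which illustrates why a panel tends to be more informative than RCS for this kind of indicator function.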

Recall that maximum regret of the panel data midpoint predictor is $\frac{P(z_2 = 1)}{4N_1} + \frac{P(z_2 = 0)^2}{4}$. Thus, the lower bound on the maximum regret of a finite-sample RCS midpoint predictor exceeds the maximum regret of the panel data midpoint predictor when $\frac{P(z_2 = 1)}{4N_1} + \frac{P(z_2 = 0)^2}{4} < \frac{1}{16}$. When $P(z_2 = 1) > \tfrac{1}{2}$, there exists a threshold sample size such that this inequality holds for all samples larger than the threshold.

RCS data may be even less informative when predictors of other indicator functions discussed in Section 3 are of interest. Consider the event $[y_2 > y_1]$, whose best predictor is $P(y_2 > y_1)$. It is possible that the marginal distributions $P(y_1)$ and $P(y_2)$ identified by RCS data are compatible with $P(y_2 > y_1)$ both arbitrarily close to 0 and arbitrarily close to 1, as would be the case when $(y_1, y_2)$ are continuously distributed on [0, 1] with $P(y_1) = P(y_2)$.

Suppose one were to know the distributions $P(y_1)$ and $P(y_2)$. Then the minimax-regret predictor would be the midpoint of the identification region. Without additional information, this identification region is the open unit interval. The minimax-regret prediction is therefore ½, and maximum regret of this midpoint predictor equals its maximum squared bias of ¼. Thus, the RCS data are potentially uninformative and a finite-sample RCS midpoint predictor must have maximum regret no less than ¼, whereas maximum regret of the panel data midpoint predictor is again $\frac{P(z_2 = 1)}{4N_1} + \frac{P(z_2 = 0)^2}{4}$.

5. Conclusion

This paper continues our effort to encourage increased use of statistical decision theory to inform the design of data collection when data quality is a decision variable. Building on our previous study, we demonstrate how the framework may be applied to more complex design problems.

A notable general finding is that, when collecting panel data with attrition, prediction of the value

23 22 of a function of two variables is more subtle than prediction of a function of one variable. The reason is that observation of the outcome in the first period may constrain the value of the function when the outcome in the second period is not observed. The nature of the constraint, if any, depends on the form of the function being predicted. Juxtaposition of linear and indicator functions demonstrates this relationship well. The form of the function being predicted also is important when considering choice between collection of panel data and RCS. In the absence of restrictions on the joint distribution of outcomes, RCS data are well-suited for prediction of linear functions but not for prediction of nonlinear function such as indicator functions. When predicting linear functions, choice between panel data and RCS may depend on the relative magnitudes of recruitment and retention costs, as well as the relationship between retention costs and the attrition rate. The framework adopted in the study should be useful for addressing these matters and related questions. For instance, what is the optimal length of time between interview waves, when, all else equal, an increase in period length should increase retention costs and/or attrition? To what extent can a rotating panel be used to optimally combine the best elements of RCS and panel? Finally, how can retrospective questions in RCS be used in addition to or in lieu of a (rotating) panel and how does this answer depend on the length of time between interviews given the relationship between the length of this time span and retrospective reporting errors? We should re-emphasize that our analysis assumes knowledge of response rates but uses no information on how the sample design affects the composition of respondents. Such information may affect minimax-regret choice among designs. 
In particular, one may be able to lower maximum regret by combining complementary designs that tend to attract different segments of the population of interest. Appendix Derivation of V(m 2 )

Recall that. Random sampling implies that. Observe that 1 1 and 1 1. Thus, = Hence,

Derivation of $C(m_1, m_2)$

Recall that,. Random sampling implies that, 1 1, 1,,,,. Observe that 1 1 and 1 1. Thus,, = It follows that,

Derivation of upper bound on the variance of the midpoint predictor

The variance of the predictor is 2 P

Outcomes are normalized to lie in the unit interval. Therefore, we can derive an upper bound on 1 1 1, as follows: = V V

We can also derive an upper bound on 1 1. By the Law of Iterated Expectations, 1 1 = , , , 1 0

Inserting these upper bounds gives this upper bound on the variance of the midpoint predictor: 2 P

References

Dai, B., S. Ding, and G. Wahba (2013), "Multivariate Bernoulli Distribution," Bernoulli, 19.

Deaton, A. (1985), "Panel Data from Time Series of Cross Sections," Journal of Econometrics, 30.

Dominitz, J., and C. Manski (2017), "More Data or Better Data? A Statistical Decision Problem," Review of Economic Studies, 84.

Groves, R. (2006), "Nonresponse Rates and Nonresponse Bias in Household Surveys," Public Opinion Quarterly, 70.

Groves, R. and L. Lyberg (2010), "Total Survey Error: Past, Present, and Future," Public Opinion Quarterly, 74.

Manski, C. and A. Tetenov (2016), "Sufficient Trial Size to Inform Clinical Practice," Proceedings of the National Academy of Sciences, 113.

Moffitt, R. (1993), "Identification and Estimation of Dynamic Models with a Time Series of Repeated Cross-Sections," Journal of Econometrics, 59.

Verbeek, M. (2008), "Pseudo-Panels and Repeated Cross-Sections," in L. Matyas and P. Sevestre (eds.), The Econometrics of Panel Data, Berlin: Springer-Verlag.


More information

Identification with Latent Choice Sets: The Case of the Head Start Impact Study

Identification with Latent Choice Sets: The Case of the Head Start Impact Study Identification with Latent Choice Sets: The Case of the Head Start Impact Study Vishal Kamat University of Chicago 22 September 2018 Quantity of interest Common goal is to analyze ATE of program participation:

More information

Chapter Three. Hypothesis Testing

Chapter Three. Hypothesis Testing 3.1 Introduction The final phase of analyzing data is to make a decision concerning a set of choices or options. Should I invest in stocks or bonds? Should a new product be marketed? Are my products being

More information

Multiple Imputation for Missing Data in Repeated Measurements Using MCMC and Copulas

Multiple Imputation for Missing Data in Repeated Measurements Using MCMC and Copulas Multiple Imputation for Missing Data in epeated Measurements Using MCMC and Copulas Lily Ingsrisawang and Duangporn Potawee Abstract This paper presents two imputation methods: Marov Chain Monte Carlo

More information

Bias Variance Trade-off

Bias Variance Trade-off Bias Variance Trade-off The mean squared error of an estimator MSE(ˆθ) = E([ˆθ θ] 2 ) Can be re-expressed MSE(ˆθ) = Var(ˆθ) + (B(ˆθ) 2 ) MSE = VAR + BIAS 2 Proof MSE(ˆθ) = E((ˆθ θ) 2 ) = E(([ˆθ E(ˆθ)]

More information

Selection on Observables: Propensity Score Matching.

Selection on Observables: Propensity Score Matching. Selection on Observables: Propensity Score Matching. Department of Economics and Management Irene Brunetti ireneb@ec.unipi.it 24/10/2017 I. Brunetti Labour Economics in an European Perspective 24/10/2017

More information

Chapter 6. Maximum Likelihood Analysis of Dynamic Stochastic General Equilibrium (DSGE) Models

Chapter 6. Maximum Likelihood Analysis of Dynamic Stochastic General Equilibrium (DSGE) Models Chapter 6. Maximum Likelihood Analysis of Dynamic Stochastic General Equilibrium (DSGE) Models Fall 22 Contents Introduction 2. An illustrative example........................... 2.2 Discussion...................................

More information

Mathematical Foundations -1- Constrained Optimization. Constrained Optimization. An intuitive approach 2. First Order Conditions (FOC) 7

Mathematical Foundations -1- Constrained Optimization. Constrained Optimization. An intuitive approach 2. First Order Conditions (FOC) 7 Mathematical Foundations -- Constrained Optimization Constrained Optimization An intuitive approach First Order Conditions (FOC) 7 Constraint qualifications 9 Formal statement of the FOC for a maximum

More information

An Asymptotically Optimal Algorithm for the Max k-armed Bandit Problem

An Asymptotically Optimal Algorithm for the Max k-armed Bandit Problem An Asymptotically Optimal Algorithm for the Max k-armed Bandit Problem Matthew J. Streeter February 27, 2006 CMU-CS-06-110 Stephen F. Smith School of Computer Science Carnegie Mellon University Pittsburgh,

More information

Bayesian Inference for DSGE Models. Lawrence J. Christiano

Bayesian Inference for DSGE Models. Lawrence J. Christiano Bayesian Inference for DSGE Models Lawrence J. Christiano Outline State space-observer form. convenient for model estimation and many other things. Preliminaries. Probabilities. Maximum Likelihood. Bayesian

More information

Statistical Distribution Assumptions of General Linear Models

Statistical Distribution Assumptions of General Linear Models Statistical Distribution Assumptions of General Linear Models Applied Multilevel Models for Cross Sectional Data Lecture 4 ICPSR Summer Workshop University of Colorado Boulder Lecture 4: Statistical Distributions

More information

How to Use the Internet for Election Surveys

How to Use the Internet for Election Surveys How to Use the Internet for Election Surveys Simon Jackman and Douglas Rivers Stanford University and Polimetrix, Inc. May 9, 2008 Theory and Practice Practice Theory Works Doesn t work Works Great! Black

More information

Fixed Effects, Invariance, and Spatial Variation in Intergenerational Mobility

Fixed Effects, Invariance, and Spatial Variation in Intergenerational Mobility American Economic Review: Papers & Proceedings 2016, 106(5): 400 404 http://dx.doi.org/10.1257/aer.p20161082 Fixed Effects, Invariance, and Spatial Variation in Intergenerational Mobility By Gary Chamberlain*

More information

A Course in Applied Econometrics Lecture 4: Linear Panel Data Models, II. Jeff Wooldridge IRP Lectures, UW Madison, August 2008

A Course in Applied Econometrics Lecture 4: Linear Panel Data Models, II. Jeff Wooldridge IRP Lectures, UW Madison, August 2008 A Course in Applied Econometrics Lecture 4: Linear Panel Data Models, II Jeff Wooldridge IRP Lectures, UW Madison, August 2008 5. Estimating Production Functions Using Proxy Variables 6. Pseudo Panels

More information

Unsupervised Learning with Permuted Data

Unsupervised Learning with Permuted Data Unsupervised Learning with Permuted Data Sergey Kirshner skirshne@ics.uci.edu Sridevi Parise sparise@ics.uci.edu Padhraic Smyth smyth@ics.uci.edu School of Information and Computer Science, University

More information

Least Squares Estimation of a Panel Data Model with Multifactor Error Structure and Endogenous Covariates

Least Squares Estimation of a Panel Data Model with Multifactor Error Structure and Endogenous Covariates Least Squares Estimation of a Panel Data Model with Multifactor Error Structure and Endogenous Covariates Matthew Harding and Carlos Lamarche January 12, 2011 Abstract We propose a method for estimating

More information

Bayesian Statistical Methods. Jeff Gill. Department of Political Science, University of Florida

Bayesian Statistical Methods. Jeff Gill. Department of Political Science, University of Florida Bayesian Statistical Methods Jeff Gill Department of Political Science, University of Florida 234 Anderson Hall, PO Box 117325, Gainesville, FL 32611-7325 Voice: 352-392-0262x272, Fax: 352-392-8127, Email:

More information

Probability theory basics

Probability theory basics Probability theory basics Michael Franke Basics of probability theory: axiomatic definition, interpretation, joint distributions, marginalization, conditional probability & Bayes rule. Random variables:

More information