Are electric toothbrushes effective? An example of structural modelling.

Martin Browning
Department of Economics, University of Oxford
Revised, February

These notes are solely for the use of students on the Structural Modelling course. Please do not distribute them or quote them without permission.

1 Background.

Very broadly, there are four classes of explanation for a correlation between two variables $x$ and $y$:

1. Chance. Any two vectors of the same length will almost surely have a non-zero correlation.
2. There is a third factor, $z$, that causes both $x$ and $y$. A classic example is if $x$ and $y$ are trending time series variables. In that case, $z$ is time.
3. $x$ causes $y$.
4. $y$ causes $x$.

These are not mutually exclusive. For example, we might believe that both $z$ and $x$ could cause $y$. In that case, when testing for whether $x$ actually causes $y$, we would want to control for $z$.

A famous example of this is the link between smoking ($x$) and lung cancer ($y$). A strong correlation was found in epidemiological studies in the early 1950s. These showed that patients with lung cancer were much more likely to be heavy smokers than the general population. The correlation was very strong so that chance could be ruled out. The immediate suspicion was that smoking was causing cancer ($x$ causes $y$). The tobacco companies responded by, amongst other things, employing the greatest statistician of the 20th century, R. A. Fisher, to point out the weaknesses in this causal inference. One of his arguments was that, perhaps, cancer causes smoking. It was easy to rule this out on the grounds that most people started smoking in their teens but the onset of cancer almost certainly had to be later than this. Thus a temporal sequence argument was used to rule out explanation 4.

Fisher's most telling argument was that perhaps some people had a genetic pre-disposition towards smoking and lung cancer. Thus the third factor $z$ in explanation 2 is genetic makeup. This was much more difficult to rule out in observational studies. One piece of evidence against this explanation was that pipe smokers were less likely to contract lung cancer than cigarette smokers who had the same dosage. This started to make a genetic cause less credible but, effectively, the genetic link was only almost universally rejected when randomised controlled experiments on beagles showed that dogs who systematically had their lungs filled with tobacco smoke were more likely to contract cancer.

2 A model of brushing.

We shall use a transparent example to illustrate structural modelling. Soon after electric toothbrushes were introduced there was some evidence that people who used electric toothbrushes (ET) had healthier teeth than those who used a regular toothbrush (RT). There are many possible reasons for this correlation. To rule out explanation 1 above we would run a test of whether the ET effect was significantly different from zero. An example of explanation 2 above would be if high income leads people to buy electric toothbrushes more and also leads them to spend more on dentistry. This is an example of selection on an observable (assuming we can observe income). It is relatively easy to take account of selection on observables.
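To illustrate what taking account of selection on an observable can look like, here is a minimal simulation sketch. Everything in it is an assumption made for illustration: the data are simulated, income is standardised, richer people are more likely to buy an ET and also have healthier teeth for other reasons, and the true ET effect is set to zero. A regression of health on ET use alone then picks up the income effect; adding income as a control removes it.

```python
import numpy as np

def ols(y, X):
    """Least-squares coefficients from regressing y on the columns of X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

rng = np.random.default_rng(0)
n = 50_000

# Hypothetical data-generating process (all numbers are illustrative).
income = rng.normal(size=n)                                # observed confounder
uses_et = (income + rng.normal(size=n) > 0).astype(float)  # richer -> more likely ET
TRUE_ET_EFFECT = 0.0                                       # ET does nothing here
health = 0.5 * income + TRUE_ET_EFFECT * uses_et + rng.normal(size=n)

const = np.ones(n)
# Coefficient on ET use without and with the income control.
naive = ols(health, np.column_stack([const, uses_et]))[1]
controlled = ols(health, np.column_stack([const, uses_et, income]))[1]

print(f"ET coefficient, no controls:     {naive:.3f}")
print(f"ET coefficient, income included: {controlled:.3f}")
```

The first coefficient is clearly positive even though the ET has no effect in this simulated world; the second is close to zero. The same trick is not available when the confounder is unobserved.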
Much more difficult to deal with is selection on an unobservable. This arises in our context if, for example, people who care more about their teeth bought an ET thinking that it was better for their teeth. But these people will also have had better dental health habits (avoiding sweets, flossing, brushing after every meal etc.). So even if the ET is no better than the RT, we would observe better health amongst those who use an ET.

Another possible explanation for the toothbrush correlation is causal: ETs really do make brushing more effective. Note that both selection on an unobservable and causality could be operating at the same time, in which case the coefficient on using an ET overstates the causal impact. With non-experimental data the way we would proceed is to try and find some observable variable that influences the choice of ET/RT but does not impact directly on the healthiness of teeth (an instrument for using an ET). Such an instrument is quite difficult to find.

The effectiveness of ETs is a really big issue in dentistry so the next step was to conduct controlled experiments in which some people were randomly assigned an ET (the treatment group) and others an RT (the control group). The broad conclusion from this was that, on average, those who had been assigned an ET had healthier teeth at the end of the experimental period than those who were assigned an RT.

2.1 The model.

The experimental result is mildly interesting but it raises more questions than it answers. To see why, consider a simple model in which a person cares about the health of their teeth, denoted $h$, and the time taken brushing, $t$.[1] One of the most fruitful aspects of modelling in microeconomics is to first consider constraints and preferences separately and then to bring them together to give observable behaviour.[2]

2.1.1 Constraints.

We first specify constraints, in this case two production functions. Using a regular toothbrush, health is produced by a production function $h = f(t)$. Using an electric toothbrush, the production function is $h = g(t)$. We shall say that an ET is effective if $g(t) > f(t)$ for all $t > 0$. An ET is ineffective if $g(t) = f(t)$ for all $t$. Figure 1 illustrates a case in which an ET is more effective than an RT.[3] Most people would interpret the experimental finding above as proving that ETs are effective. As we shall see, this is not necessarily the case.

[1] In practice we should consider different types of healthiness and different decisions that impact on dental health (such as sugar intake). But this simple model will serve to illustrate all the points of interest.
[2] Many researchers in other social sciences think that this is nonsense. They believe that what is available conditions what you want and that we cannot consider preferences and constraints separately. This would invalidate most economic models.
[3] Note that we allow that too much brushing can be a bad thing. Also, the mode is irrelevant if the agent never brushes.

2.1.2 Preferences.

Let $d$ be a dummy variable that is 1 if the person uses an ET (and zero otherwise). Preferences are represented by the utility function $u(h, t, d)$. We assume that $u(\cdot)$ is increasing in $h$ and decreasing in $t$. The dependence on the mode, $d$, is to capture pure preference effects for using an ET. One possible reason for this is that using an ET requires less effort or, maybe, an ET is considered too noisy for the early morning. An important consideration for the mode is whether it affects the trade-off between time and health. If the utility function can be written as:

$u(h, t, d) = v(\phi(h, t), d)$  (1)

then we say that preferences over $(h, t)$ are separable from $d$.[4] That is, the marginal rate of substitution between time and health is independent of $d$:

$\frac{u_t}{u_h} = \frac{v_\phi \, \phi_t}{v_\phi \, \phi_h} = \frac{\phi_t}{\phi_h}$  (2)

where the final expression on the right hand side does not depend on $d$. In figure 2 we show preferences which are non-separable so that the mode does change the slope of the indifference curves (that is, indifference curves can cross). In the case illustrated, using an ET makes time brushing less onerous. To see this, note that at the point where the two indifference curves cross a person using an RT would need a bigger increase in $h$ to compensate her for an increase in $t$ than if using an ET.

2.2 Choices.

Having established what the agent likes and what the agent can do, we can put them together to model their choice. For modes 0 (RT) and 1 (ET) respectively we have:

$\hat{u}_0 = \max_t \{u(f(t), t, 0)\}$  (RT)
$\hat{u}_1 = \max_t \{u(g(t), t, 1)\}$  (ET)

[4] We have already made an implicit separability assumption that preferences over $(h, t, d)$ are independent of the thousands of other things the agent cares about.

The corresponding choices are denoted $(\hat{h}_0, \hat{t}_0)$ and $(\hat{h}_1, \hat{t}_1)$. If we were considering non-experimental ("observational") data in which people choose whether to use RT or ET we would need to take into account these utility levels and the different prices of the two modes ($p_0$ and $p_1$ for RT and ET respectively, with $p_1 > p_0$). This would require us to make allowance for differences in the marginal utility of money, $\lambda$. Thus we have that the agent chooses to use the ET if and only if:

$\hat{u}_1 - \lambda p_1 > \hat{u}_0 - \lambda p_0$  (3)

This is an example of cost-benefit analysis: there are two choices and the one chosen has the highest benefit minus cost. This analysis would give the result discussed above in which richer people (who have a low $\lambda$) choose the ET option. In practice it is very difficult to control for the marginal utility of money (and other factors that might influence choice). One way around this is to conduct a controlled experiment in which people are assigned RT or ET.

2.3 Interpreting the experimental finding.

The purpose of this subsection is to show that we have to be very careful how we interpret experimental effects. In an experimental setting, we take the choice of mode as external to the agent so that a control individual (with $d = 0$) sets $(h, t) = (\hat{h}_0, \hat{t}_0)$ and a treated person (with $d = 1$) sets $(h, t) = (\hat{h}_1, \hat{t}_1)$. Suppose the experimental finding tells us that the average $\hat{h}_1$ is statistically significantly higher than the average $\hat{h}_0$.[5] Without supplementary assumptions this does not tell us anything about whether or not ET is effective. To see this, consider two cases.

The first case is illustrated in figure 3. Here the ET is effective (the ET production curve is above the RT curve) but people's choice of time spent brushing undoes the effect. As shown, agents take all of the increased productivity to reduce their time brushing and they keep health constant.[6] This would show up as a zero experimental effect:

$\hat{h}_1 = g(\hat{t}_1) = f(\hat{t}_0) = \hat{h}_0$  (4)

[5] A simple way to do this is to regress the observed health variable on the treatment dummy and to test for whether the coefficient on the latter is significantly greater than zero.
[6] Indifference curves that shift horizontally correspond to quasi-linear preferences.

Thus the experiment shows no effect even though an ET is effective.

The converse case is illustrated in figure 4. Here ET is ineffective ($g(t) = f(t)$) but agents find the ET requires less effort and they increase $t$. Thus we have:

$\hat{h}_1 = g(\hat{t}_1) > f(\hat{t}_0) = \hat{h}_0$  (5)

Thus the experiment shows a positive effect even though ET is ineffective. Assuming that preferences over $(h, t)$ are separable from $d$ rules this case out.

This shows that a positive experimental effect is neither necessary nor sufficient for the hypothesis that ETs are effective. In this case, the problem is that people know whether they are a treatment or a control and they may change their behaviour accordingly. One solution to this problem would be to assign the time spent brushing. For example, everyone could be required to spend exactly 10 minutes per day brushing their teeth. Then we could just compare the post-experiment health of the two groups and determine whether ETs are effective at $t = 10$. This is obviously completely impractical.[7] Instead we might collect information on the time spent brushing and use this to control for changes in $t$ consequent on the assigned mode, RT or ET.

Here we have taken the ideal case. For example, we are assuming that assignment is perfect in the sense that someone assigned an ET always uses it and similarly for RT. In practice experimental subjects do not always comply with the experimental assignment. This leads to additional complications for interpreting experimental outcomes. The situation is worse for quasi-experimental studies (or "natural experiments") in which there may even be doubt about the randomness of the assignment to treatment or control as well as the degree of compliance.

To judge whether ETs are effective we need to build a structural model which uses information on the time spent brushing. Given $N$ people in total, the data are the values of $(h_i, t_i, d_i)$ for $i = 1, 2, \ldots, N$.

[7] In a medical randomised controlled trial a patient does not know whether they are receiving the treatment or a placebo and consequently cannot react to the assignment. In the toothbrush example it would be difficult to hide the mode of toothbrush from the agent!
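The two cases can be mimicked in a small simulation. This is only a sketch under assumed functional forms: log health is taken to be linear in the treatment dummy and in log brushing time, and brushing time follows the closed-form rule that the notes derive later for a special case of the parametric model. The parameter names (`effect`, `taste_shift`) and all numerical values are hypothetical. The code compares the raw treatment-control difference in mean log health with the coefficient on the treatment dummy once log brushing time is included.

```python
import numpy as np

def logistic(x):
    return 1.0 / (1.0 + np.exp(-x))

def simulate(effect, taste_shift, n=20_000, seed=0):
    """Return (raw difference in mean log health, ET coefficient controlling for log time).

    effect      : true production effect of the ET on log health (assumed)
    taste_shift : how the assigned mode shifts the weight placed on health (assumed)
    """
    rng = np.random.default_rng(seed)
    time_elasticity, max_minutes = 0.5, 20.0       # illustrative values
    d = rng.integers(0, 2, size=n).astype(float)   # random assignment to ET
    eps = rng.normal(0.0, 0.3, size=n)             # production heterogeneity
    eta = rng.normal(0.0, 0.5, size=n)             # taste heterogeneity
    share = logistic(taste_shift * d + eta)        # weight on health in utility
    # Closed-form optimal brushing time (special case with no closed-form caveats).
    t = max_minutes * share * time_elasticity / (1.0 + share * (time_elasticity - 1.0))
    ln_h = effect * d + time_elasticity * np.log(t) + eps
    raw = ln_h[d == 1].mean() - ln_h[d == 0].mean()
    X = np.column_stack([np.ones(n), d, np.log(t)])
    controlled = np.linalg.lstsq(X, ln_h, rcond=None)[0][1]
    return raw, controlled

# Ineffective ET, but the ET makes brushing less onerous, so the treated brush longer.
raw_a, ctrl_a = simulate(effect=0.0, taste_shift=1.0)
# Effective ET, but the treated cut back on brushing time.
raw_b, ctrl_b = simulate(effect=0.2, taste_shift=-1.0)

print(f"ineffective ET: raw difference {raw_a:+.3f}, controlling for time {ctrl_a:+.3f}")
print(f"effective ET:   raw difference {raw_b:+.3f}, controlling for time {ctrl_b:+.3f}")
```

In this sketch the raw experimental difference has the wrong sign in both cases, while controlling for time recovers the assumed production effect. The latter works here only because the two heterogeneity terms are drawn independently, an assumption the notes go on to scrutinise.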

3 Taking the theory to the data.

3.1 A regression approach.

We shall now consider how to take the theory to the data. We first consider choosing a functional form that gives a conventional looking linear regression. We take the following functional form for the health production function:

$h = t^{\beta_2} \exp(\beta_0 + \beta_1 d + \varepsilon)$  (6)

where $\varepsilon$ is a stochastic health production shifter. This represents unobserved heterogeneity. The variable $\varepsilon$ will vary across people according to the acidity of their saliva and their diet and also their behaviour (how often they visit the dentist). This functional form has some desirable properties; for example it is increasing and concave in $t$ if $\beta_2 \in (0, 1)$. It also implies that the choice of mode does not affect health for those who never brush their teeth ($t = 0$). The functional form also has some undesirable properties. For example, it does not allow for any decreasing segment. More importantly, there is no heterogeneity in health amongst people who never brush their teeth. Finally, the effect of the mode (here given by $\beta_1$) is assumed to be the same for everyone. We could probably do better but at a cost of introducing more parameters. Choosing a flexible functional form that parsimoniously captures all desirable features in terms of theory and that fits the data is a very important element in parametric structural modelling.

An ET is effective if and only if $\beta_1 > 0$. Thus $\beta_1$ is the parameter of interest. There are different ways to quantify effectiveness. For example, we could take the difference in health between those who use an ET and those who do not: $h(d=1) - h(d=0)$. This is heterogeneous in the sense that:

$h(d=1) - h(d=0) = \{t^{\beta_2} \exp(\beta_0 + \varepsilon)\}\,(\exp(\beta_1) - 1)$  (7)

depends on the heterogeneity parameter $\varepsilon$. An alternative definition of effectiveness is

$\frac{h(d=1)}{h(d=0)} = \exp(\beta_1)$  (8)

which is homogeneous (that is, independent of $\varepsilon$).

We shall assume that everyone spends some time brushing their teeth ($t > 0$) and take logs of (6) to give the simple linear-in-log form:

$\ln h = \beta_0 + \beta_1 d + \beta_2 \ln t + \varepsilon$  (9)

Given observations on $(h, t, d)$ we could run a regression on the full sample of $\ln h$ on $(d, \ln t)$ and find OLS estimates for the parameter of interest.[8] Often someone presenting regression results such as this will say they are "controlling for" the time spent brushing. The OLS coefficient estimates are consistent only if $d$ and $\ln t$ are uncorrelated with unobservable health heterogeneity, $\varepsilon$. That is, $E(\varepsilon \mid d, \ln t) = 0$. By design, the random experimental assignment implies that $d$ is uncorrelated with $\varepsilon$. This is the great value of an experiment - we do not need to model the selection of mode as in (3).[9]

To determine whether it is plausible to assume $\ln t$ is uncorrelated with $\varepsilon$ we need to re-consider the choice of the time spent brushing. We need only consider the RT group (the same analysis applies for the ET group). Writing the RT production function with its shifter as $f(t, \varepsilon)$, we have the following program:[10]

$\max_t \; u(f(t, \varepsilon), t, 0)$  (10)

The first order conditions are (denoting partial derivatives by subscripts):

$u_h(\hat{h}, \hat{t}, 0)\, f_t(\hat{t}, \varepsilon) + u_t(\hat{h}, \hat{t}, 0) = 0$  (11)

Generally the optimal time spent brushing, $\hat{t}$, is a function of $\varepsilon$. To determine if $\hat{t}$ depends on $\varepsilon$ we calculate the partial derivative with respect to $\varepsilon$. After some manipulations we find that:

$\frac{\partial \hat{t}}{\partial \varepsilon}\left[u_{hh} f_t^2 + 2 u_{ht} f_t + u_h f_{tt} + u_{tt}\right] + \left[u_{hh} f_t f_\varepsilon + u_h f_{t\varepsilon} + u_{ht} f_\varepsilon\right] = 0$  (12)

where the first term in brackets is the second order term for the maximisation in (10), which is non-zero at an interior maximum.

[8] A more sophisticated approach would be to model the dependence of $h$ on $t$ nonparametrically for each mode, RT and ET. Then test whether the two nonparametric curves are the same. This does not solve the problems we are about to discover. No amount of statistical pyrotechnics can solve an identification problem.
[9] This ignores the important issue of compliance. Some of the treated may use an RT and some of the controls may use an ET even though they have been asked to use the assigned mode.
[10] It is important to note exactly what we are assuming here. In particular, we assume that people know the true production functions for the two modes and use this in their decision making. Since the whole point of the research in this area is to establish whether ETs are effective, this is a very strong assumption!
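The manipulations behind this comparative static can be sketched as follows. The notation is an assumption made for the sketch: $u(h, t, d)$ is the utility function, $f(t, \varepsilon)$ is the RT production function with its unobserved shifter $\varepsilon$, and $F$ is simply a label for the left hand side of the first order condition.

```latex
% First order condition for the RT group, viewed as a function of (t, \varepsilon):
F(t, \varepsilon) \equiv u_h\bigl(f(t,\varepsilon), t, 0\bigr)\, f_t(t,\varepsilon)
                       + u_t\bigl(f(t,\varepsilon), t, 0\bigr) = 0 .

% Differentiate the identity F(\hat{t}(\varepsilon), \varepsilon) = 0 with respect to \varepsilon:
\frac{\partial \hat{t}}{\partial \varepsilon}\, F_t + F_\varepsilon = 0 ,

% where, by the chain rule,
F_t = u_{hh} f_t^2 + 2 u_{ht} f_t + u_h f_{tt} + u_{tt} ,
\qquad
F_\varepsilon = u_{hh} f_t f_\varepsilon + u_h f_{t\varepsilon} + u_{ht} f_\varepsilon .

% F_t is the second order term of the maximisation, so F_t < 0 at an interior maximum and
\frac{\partial \hat{t}}{\partial \varepsilon} = -\frac{F_\varepsilon}{F_t} ,
\qquad\text{so}\qquad
\frac{\partial \hat{t}}{\partial \varepsilon} = 0 \iff F_\varepsilon = 0 .
```

The upshot is that brushing time is unrelated to the production shifter only when the three-term expression $F_\varepsilon$ vanishes, which mixes properties of tastes and of technology.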

If $\partial \hat{t} / \partial \varepsilon = 0$ then the second term in brackets must also be zero, so that:

$u_{hh} f_t f_\varepsilon + u_h f_{t\varepsilon} + u_{ht} f_\varepsilon = 0$  (13)

If this holds then OLS gives a consistent estimate of the effect of using an electric toothbrush. It is difficult (impossible?) to interpret this condition but it looks unlikely that it will hold given that it depends on both technology (the $f(\cdot)$ function) and tastes (the $u(\cdot)$ function). Just to check how restrictive it is, we now consider a parametric form for the utility function and use the parametric form (6).

3.2 A parametric model.

Let $T$ be the maximum number of minutes per day that anyone would brush (for example, $T = 20$). Denote the logistic function:

$\Lambda(x) = \frac{\exp(x)}{1 + \exp(x)}$  (14)

so that $\Lambda(x)$ is always between zero and unity. We shall take the following functional form for the utility function:

$u(h, t, d) = \Lambda(\gamma_0 + \gamma_1 d + \eta)\,\ln(\alpha + h) + \bigl(1 - \Lambda(\gamma_0 + \gamma_1 d + \eta)\bigr)\ln(T - t)$  (15)

so that time brushing is a bad and health is a good. If the function value $\Lambda(\gamma_0 + \gamma_1 d + \eta)$ is high then the person cares a lot about the health of their teeth (relative to the time spent brushing). Preferences over $(h, t)$ are separable from the mode if $\gamma_1 = 0$. The variable $\eta$ captures unobserved heterogeneity in tastes for healthy teeth; this may include concerns about how your teeth look.

Maximising $u(h, t, d)$ subject to (6) gives the following first order condition for the optimal $t$, denoted $\hat{t}$:

$\frac{\Lambda(\gamma_0 + \gamma_1 d + \eta)\,\beta_2\,\hat{t}^{(\beta_2 - 1)} \exp(\beta_0 + \beta_1 d + \varepsilon)}{\alpha + \hat{t}^{\beta_2} \exp(\beta_0 + \beta_1 d + \varepsilon)} - \frac{1 - \Lambda(\gamma_0 + \gamma_1 d + \eta)}{T - \hat{t}} = 0$  (16)

This does not have a closed form expression in $\hat{t}$ but it is easy to solve it numerically for given values of $(d, \varepsilon, \eta)$. The important point here is that the optimal value $\hat{t}$ will generally depend on the unobserved production heterogeneity $\varepsilon$. That is, $\ln t$ in equation (9) is endogenous. If you look carefully, however, we can rule this out by setting $\alpha = 0$.[11] If we do that then we can find a closed form expression for the optimal $t$:

$\hat{t} = \frac{\Lambda(\gamma_0 + \gamma_1 d + \eta)\,\beta_2\,T}{1 + \Lambda(\gamma_0 + \gamma_1 d + \eta)(\beta_2 - 1)}$  (17)

The most important implication of this is that the OLS estimates of (9) are now consistent estimates of the parameters of interest, so long as $\varepsilon$ and $\eta$ are uncorrelated. The qualification matters: if $\eta$ is correlated with $\varepsilon$ then $\hat{t}$ depends indirectly on $\varepsilon$ even though the latter does not appear in the right hand expression explicitly. The two terms $\varepsilon$ and $\eta$ being uncorrelated is a strong assumption since $\varepsilon$ contains other things than $d$ and $t$ that determine the health of teeth and $\eta$ captures how much the agent cares about their teeth relative to the time spent brushing. One important thing to note is that separability of $(h, t)$ from $d$ in the utility function ($\gamma_1 = 0$) is irrelevant for the exogeneity of $\ln t$ in equation (9).

If we are willing to assume that $\varepsilon$ and $\eta$ are uncorrelated, why not also assume $\alpha = 0$? It is difficult to decide whether this is a strong assumption. Within our model, it implies that if a person never brushed their teeth ($t = 0$) then they would have $h = 0$ and a utility level of $-\infty$. This is not necessarily a disaster for the model since this sub-utility function is embedded within a wider utility function that may give low weight to a utility of $-\infty$ for this particular outcome.[12]

This analysis shows that once we take a structural model seriously, it is difficult to justify that OLS estimates of equation (9) will deliver reliable estimates of whether ETs are effective. The obvious approach is to try and find an instrument for $\ln t$ in equation (9). One candidate is some measure of how busy the person is; this impacts on the time spent brushing but not on health directly. Examples would include the hours of work and whether there are young children in the household. Suppose we have such a variable, denoted $z$. Assuming $z$ is uncorrelated with $(\varepsilon, \eta)$, we can consistently estimate the parameters of (9) by instrumental variable estimation, using $(d, z)$ as instruments.
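The contrast between OLS and instrumental variable estimation can be illustrated with simulated data. This is a sketch under stated assumptions rather than an implementation of the notes' estimator: brushing time follows the closed-form rule (17), log health follows the linear-in-logs form (9), the hypothetical instrument `busy` enters the taste index linearly, the two heterogeneity terms are drawn with correlation 0.6, and all parameter values are invented. Two-stage least squares is coded by hand with a linear first stage.

```python
import numpy as np

def logistic(x):
    return 1.0 / (1.0 + np.exp(-x))

def ols(y, X):
    """Least-squares coefficients from regressing y on the columns of X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def two_stage_least_squares(y, exog, endog, instrument):
    """2SLS: project the endogenous regressor on (exog, instrument), then regress y."""
    Z = np.column_stack([exog, instrument])
    fitted = Z @ ols(endog, Z)                      # first stage
    return ols(y, np.column_stack([exog, fitted]))  # second stage

rng = np.random.default_rng(1)
n = 50_000
B0, B1, B2, T_MAX = 0.0, 0.2, 0.5, 20.0   # production parameters (illustrative)
G1, G2 = 0.5, 1.0                         # taste-index parameters (illustrative)

d = rng.integers(0, 2, size=n).astype(float)    # random assignment to ET
busy = rng.normal(size=n)                       # hypothetical instrument
eta = rng.normal(size=n)                        # taste heterogeneity
eps = 0.3 * eta + rng.normal(0.0, 0.4, size=n)  # production shifter, correlated with eta

share = logistic(G1 * d - G2 * busy + eta)      # weight on health in utility
t = T_MAX * share * B2 / (1.0 + share * (B2 - 1.0))
ln_t = np.log(t)
ln_h = B0 + B1 * d + B2 * ln_t + eps

exog = np.column_stack([np.ones(n), d])
b_ols = ols(ln_h, np.column_stack([exog, ln_t]))
b_iv = two_stage_least_squares(ln_h, exog, ln_t, busy)

print("true (ET effect, time elasticity):", (B1, B2))
print("OLS estimates:", np.round(b_ols[1:], 3))
print("IV estimates: ", np.round(b_iv[1:], 3))
```

Because people who care more about their teeth both brush longer and have better unobserved health inputs in this simulated world, OLS overstates the return to brushing time and, since the treated also brush longer, distorts the ET coefficient. The instrumented estimates are close to the assumed values.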
The point of all this analysis is that the simple difference between dental health for those who are assigned an ET rather than an RT is not a reliable guide to whether ETs are effective. Instead, our structural model showed that we have to control for the time spent brushing and find a determinant of this that is uncorrelated with the unobservables. The latter have a precise definition within our structural model. Having developed a structural model, we can exploit the instrument to gain insight into the other parts of the model.

[11] It is tedious to check, but this is equivalent to condition (13) for the general case.
[12] For example, we might have an overall utility function that looks like: $U = W(\exp(u(h, t, d)), \text{everything else})$.

3.3 A structural model.

3.3.1 A nonparametric structural model.

One approach is to set up a structural model. When doing this, the ideal would be to allow for full generality in tastes; that is, in the utility function $u(h, t, d, \eta)$, which now also depends on the taste shifter $\eta$. We would also like to simply estimate the health production functions, $f(t, \varepsilon)$ and $g(t, \varepsilon)$. Thus we only assume that the functions exist, are smooth and satisfy certain monotonicity and concavity assumptions. For example, we would assume that $u(\cdot)$ is strictly increasing and concave in $h$, strictly increasing in $\eta$, and strictly decreasing and concave in $t$ for fixed values of the other variables. We may also require that tastes over $(h, t)$ are strictly increasing in $\eta$ for fixed $(h, t, d)$; that is:

$\frac{\partial}{\partial \eta}\left(\frac{u_h(h, t, d, \eta)}{-u_t(h, t, d, \eta)}\right) > 0$  (18)

Similarly we might impose that $f(\cdot)$ and $g(\cdot)$ are strictly increasing and concave in $t$ and strictly increasing in $\varepsilon$. We would also need to allow for nonparametric distributions for the heterogeneity terms $\varepsilon$ and $\eta$. Again, we may impose conditions on the probability distribution functions such as continuity ("no mass points") and full support. Ideally the data then determine the form of all of the unknowns.

If we go that route, we have to show nonparametric identification of the unknown structure; that is, the forms of $u(h, t, d, \eta)$, $f(t, \varepsilon)$ and $g(t, \varepsilon)$ and the joint distribution of $\varepsilon$ and $\eta$. This identification analysis would assume that we have sufficiently rich data that we can estimate consistently the joint distribution of $(h, t, d)$. To show identification we need to show that if two different choices of the structure give the same joint distribution for $(h, t, d)$ then they are the same structure. If we found that the elements of the structure were not nonparametrically identified, a sophisticated response would be to conduct a partial identification analysis and to determine the bounds on the effectiveness of an ET.

3.3.2 A fully parametric structural model.

A structural approach that is the complete opposite of the nonparametric structural approach is to start with a simple fully parametric model. In fact, we have almost done that in the last section. Thus (9) and (16) give equations for the determination of the observables $(h, t)$ given the observables $(d, z)$ and the unobservables $(\varepsilon, \eta)$. All that remains is to specify distributions for the latter. For example we might assume:

$\begin{pmatrix} \beta_0 + \varepsilon \\ \gamma_0 + \eta \end{pmatrix} \sim N\left( \begin{pmatrix} \beta_0 \\ \gamma_0 \end{pmatrix}, \begin{pmatrix} \sigma_\varepsilon^2 & \sigma_{\varepsilon\eta} \\ \sigma_{\varepsilon\eta} & \sigma_\eta^2 \end{pmatrix} \right)$  (19)

This still requires an identification analysis to show that the parameter of interest, $\beta_1$, is identified.[13] If we can point identify the parameter of interest, $\beta_1$ (the remaining coefficients and the variance parameters in (19) are nuisance parameters), then we could estimate by maximum likelihood estimation. Then we would test $\beta_1 = 0$ against the alternative, $\beta_1 > 0$. If additional observables such as age, gender and education are available, we could allow that $\beta_0$ and $\gamma_0$ (the means for the heterogeneity) depend on them. In practice, deriving the likelihood function for the Normal model may be tricky and we may have to resort to simulation based estimation methods such as indirect inference ("simulated minimum distance").

3.3.3 Goodness of fit.

If we do adopt a fully parametric model then we are open to the criticism that our results depend too much on the functional form assumptions. For example, suppose we took some other distribution for the heterogeneity than the bivariate Normal? Or a different functional form for the production function? There are two ways to overcome this.

The first is to adopt flexible functional forms for the various functions. For example, for the heterogeneity we could take a mixture of three Normals for $\varepsilon$ and $\eta$. A mixture of three Normals can approximate closely almost any continuous distribution. The drawback from doing this is that the mixture distribution has 17 parameters rather than the 5 parameters in (19). Similarly, we could take much more complicated production and utility functions than (6) and (15). If we did this we would have to conduct an identification analysis to show that the effectiveness of ET is point identified. This approach involves a move toward the nonparametric model discussed above.

An alternative approach is to start with simple functional forms (such as those from the previous subsection) and estimate a small number of parameters. Then apply a wide range of goodness of fit tests. For example, one would want to test whether the estimated parameters adequately fit the first four moments (means, variances, skewness and kurtosis) of the distributions of $h$ and $t$ for each mode. Note that one of these moments is the quasi-experimental effect: the difference between the mean health outcomes for the two groups (RT and ET). One would also want to check the fit for the dependence of these two variables (at a minimum, their correlation). More ambitiously, can we match the quantiles and dependencies between the $h$ and $t$ variables? If our estimated model fits these moments well then we have some confidence in our estimates. This is a progressive strategy in the sense that failures to fit in certain directions usually indicate in which way we need to generalise the model.

We keep moving back and forth between theory, identification, estimation and goodness of fit until we have a model that captures all of the features of the observed joint distribution of $h$, $t$ and $d$. I would then take the final model as an adequate structural model for determining whether electric toothbrushes are effective. Note, however, that if the object of interest is not nonparametrically identified, then there will be other structural models that give a different estimate and also fit the data as well as our preferred model. This is a serious drawback - showing nonparametric identification remains the gold standard.

[13] It is important to remember that we can have parametric models in which some parameters are point identified and others are not.
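The moment comparison described in the goodness of fit discussion can be sketched in a few lines. This is an illustrative helper, not a formal test: the function names are hypothetical, skewness and kurtosis are the standardised third and fourth central moments, and "simulated" stands for data generated from whatever estimated model is being checked. A formal goodness of fit test would also need the sampling variability of each gap.

```python
import numpy as np

def first_four_moments(x):
    """Mean, variance, skewness and kurtosis (population versions) of a sample."""
    x = np.asarray(x, dtype=float)
    mean = x.mean()
    centred = x - mean
    var = np.mean(centred ** 2)
    skew = np.mean(centred ** 3) / var ** 1.5
    kurt = np.mean(centred ** 4) / var ** 2
    return np.array([mean, var, skew, kurt])

def moment_gaps(observed, simulated, mode_observed, mode_simulated):
    """Observed-minus-simulated moments of one variable, separately for RT (0) and ET (1)."""
    gaps = {}
    for mode in (0, 1):
        obs = first_four_moments(observed[mode_observed == mode])
        sim = first_four_moments(simulated[mode_simulated == mode])
        gaps[mode] = obs - sim
    return gaps
```

Calling `moment_gaps` once for health and once for brushing time gives the sixteen gaps mentioned in the text; large gaps in a particular moment point to the direction in which the model needs generalising.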

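Figure 1: Production functions for ET and RT. [Plot of health, h, against time brushing, t, with two curves: ET, h = g(t), and RT, h = f(t).]

Figure 2: Indifference curves for ET and RT. [Plot of health, h, against time brushing, t, with two indifference curves: ET, d = 1, and RT, d = 0.]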

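Figure 3: Effective ET and separable preferences. [Plot of health against time brushing with separate ET and RT production curves; marked points h(ET) = h(RT) on the health axis and t(ET), t(RT) on the time axis.]

Figure 4: Ineffective ET and non-separable preferences. [Plot of health against time brushing with a single production curve, RT = ET; marked points h(ET), h(RT) on the health axis and t(RT), t(ET) on the time axis.]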


An overview of applied econometrics An overview of applied econometrics Jo Thori Lind September 4, 2011 1 Introduction This note is intended as a brief overview of what is necessary to read and understand journal articles with empirical

More information

Dynamic Macroeconomic Theory Notes. David L. Kelly. Department of Economics University of Miami Box Coral Gables, FL

Dynamic Macroeconomic Theory Notes. David L. Kelly. Department of Economics University of Miami Box Coral Gables, FL Dynamic Macroeconomic Theory Notes David L. Kelly Department of Economics University of Miami Box 248126 Coral Gables, FL 33134 dkelly@miami.edu Current Version: Fall 2013/Spring 2013 I Introduction A

More information

An Introduction to Mplus and Path Analysis

An Introduction to Mplus and Path Analysis An Introduction to Mplus and Path Analysis PSYC 943: Fundamentals of Multivariate Modeling Lecture 10: October 30, 2013 PSYC 943: Lecture 10 Today s Lecture Path analysis starting with multivariate regression

More information

Interpreting and using heterogeneous choice & generalized ordered logit models

Interpreting and using heterogeneous choice & generalized ordered logit models Interpreting and using heterogeneous choice & generalized ordered logit models Richard Williams Department of Sociology University of Notre Dame July 2006 http://www.nd.edu/~rwilliam/ The gologit/gologit2

More information

Poisson regression: Further topics

Poisson regression: Further topics Poisson regression: Further topics April 21 Overdispersion One of the defining characteristics of Poisson regression is its lack of a scale parameter: E(Y ) = Var(Y ), and no parameter is available to

More information

Introduction to Econometrics

Introduction to Econometrics Introduction to Econometrics STAT-S-301 Experiments and Quasi-Experiments (2016/2017) Lecturer: Yves Dominicy Teaching Assistant: Elise Petit 1 Why study experiments? Ideal randomized controlled experiments

More information

Experiments and Quasi-Experiments

Experiments and Quasi-Experiments Experiments and Quasi-Experiments (SW Chapter 13) Outline 1. Potential Outcomes, Causal Effects, and Idealized Experiments 2. Threats to Validity of Experiments 3. Application: The Tennessee STAR Experiment

More information

Business Statistics. Lecture 9: Simple Regression

Business Statistics. Lecture 9: Simple Regression Business Statistics Lecture 9: Simple Regression 1 On to Model Building! Up to now, class was about descriptive and inferential statistics Numerical and graphical summaries of data Confidence intervals

More information

Do not copy, post, or distribute

Do not copy, post, or distribute 14 CORRELATION ANALYSIS AND LINEAR REGRESSION Assessing the Covariability of Two Quantitative Properties 14.0 LEARNING OBJECTIVES In this chapter, we discuss two related techniques for assessing a possible

More information

Econometric Modelling Prof. Rudra P. Pradhan Department of Management Indian Institute of Technology, Kharagpur

Econometric Modelling Prof. Rudra P. Pradhan Department of Management Indian Institute of Technology, Kharagpur Econometric Modelling Prof. Rudra P. Pradhan Department of Management Indian Institute of Technology, Kharagpur Module No. # 01 Lecture No. # 28 LOGIT and PROBIT Model Good afternoon, this is doctor Pradhan

More information

Chapter 26: Comparing Counts (Chi Square)

Chapter 26: Comparing Counts (Chi Square) Chapter 6: Comparing Counts (Chi Square) We ve seen that you can turn a qualitative variable into a quantitative one (by counting the number of successes and failures), but that s a compromise it forces

More information

Descriptive Statistics (And a little bit on rounding and significant digits)

Descriptive Statistics (And a little bit on rounding and significant digits) Descriptive Statistics (And a little bit on rounding and significant digits) Now that we know what our data look like, we d like to be able to describe it numerically. In other words, how can we represent

More information

An Introduction to Path Analysis

An Introduction to Path Analysis An Introduction to Path Analysis PRE 905: Multivariate Analysis Lecture 10: April 15, 2014 PRE 905: Lecture 10 Path Analysis Today s Lecture Path analysis starting with multivariate regression then arriving

More information

1 Motivation for Instrumental Variable (IV) Regression

1 Motivation for Instrumental Variable (IV) Regression ECON 370: IV & 2SLS 1 Instrumental Variables Estimation and Two Stage Least Squares Econometric Methods, ECON 370 Let s get back to the thiking in terms of cross sectional (or pooled cross sectional) data

More information

Final Exam. Economics 835: Econometrics. Fall 2010

Final Exam. Economics 835: Econometrics. Fall 2010 Final Exam Economics 835: Econometrics Fall 2010 Please answer the question I ask - no more and no less - and remember that the correct answer is often short and simple. 1 Some short questions a) For each

More information

Econ 2148, fall 2017 Instrumental variables I, origins and binary treatment case

Econ 2148, fall 2017 Instrumental variables I, origins and binary treatment case Econ 2148, fall 2017 Instrumental variables I, origins and binary treatment case Maximilian Kasy Department of Economics, Harvard University 1 / 40 Agenda instrumental variables part I Origins of instrumental

More information

Identification and Estimation Using Heteroscedasticity Without Instruments: The Binary Endogenous Regressor Case

Identification and Estimation Using Heteroscedasticity Without Instruments: The Binary Endogenous Regressor Case Identification and Estimation Using Heteroscedasticity Without Instruments: The Binary Endogenous Regressor Case Arthur Lewbel Boston College December 2016 Abstract Lewbel (2012) provides an estimator

More information

Chapter 6. Logistic Regression. 6.1 A linear model for the log odds

Chapter 6. Logistic Regression. 6.1 A linear model for the log odds Chapter 6 Logistic Regression In logistic regression, there is a categorical response variables, often coded 1=Yes and 0=No. Many important phenomena fit this framework. The patient survives the operation,

More information

Business Statistics. Lecture 5: Confidence Intervals

Business Statistics. Lecture 5: Confidence Intervals Business Statistics Lecture 5: Confidence Intervals Goals for this Lecture Confidence intervals The t distribution 2 Welcome to Interval Estimation! Moments Mean 815.0340 Std Dev 0.8923 Std Error Mean

More information

Simple Regression Model. January 24, 2011

Simple Regression Model. January 24, 2011 Simple Regression Model January 24, 2011 Outline Descriptive Analysis Causal Estimation Forecasting Regression Model We are actually going to derive the linear regression model in 3 very different ways

More information

A Course in Applied Econometrics Lecture 14: Control Functions and Related Methods. Jeff Wooldridge IRP Lectures, UW Madison, August 2008

A Course in Applied Econometrics Lecture 14: Control Functions and Related Methods. Jeff Wooldridge IRP Lectures, UW Madison, August 2008 A Course in Applied Econometrics Lecture 14: Control Functions and Related Methods Jeff Wooldridge IRP Lectures, UW Madison, August 2008 1. Linear-in-Parameters Models: IV versus Control Functions 2. Correlated

More information

Module 03 Lecture 14 Inferential Statistics ANOVA and TOI

Module 03 Lecture 14 Inferential Statistics ANOVA and TOI Introduction of Data Analytics Prof. Nandan Sudarsanam and Prof. B Ravindran Department of Management Studies and Department of Computer Science and Engineering Indian Institute of Technology, Madras Module

More information

LECTURE 5. Introduction to Econometrics. Hypothesis testing

LECTURE 5. Introduction to Econometrics. Hypothesis testing LECTURE 5 Introduction to Econometrics Hypothesis testing October 18, 2016 1 / 26 ON TODAY S LECTURE We are going to discuss how hypotheses about coefficients can be tested in regression models We will

More information

The Generalized Roy Model and Treatment Effects

The Generalized Roy Model and Treatment Effects The Generalized Roy Model and Treatment Effects Christopher Taber University of Wisconsin November 10, 2016 Introduction From Imbens and Angrist we showed that if one runs IV, we get estimates of the Local

More information

Lecture 5: ANOVA and Correlation

Lecture 5: ANOVA and Correlation Lecture 5: ANOVA and Correlation Ani Manichaikul amanicha@jhsph.edu 23 April 2007 1 / 62 Comparing Multiple Groups Continous data: comparing means Analysis of variance Binary data: comparing proportions

More information

Area I: Contract Theory Question (Econ 206)

Area I: Contract Theory Question (Econ 206) Theory Field Exam Summer 2011 Instructions You must complete two of the four areas (the areas being (I) contract theory, (II) game theory A, (III) game theory B, and (IV) psychology & economics). Be sure

More information

EMERGING MARKETS - Lecture 2: Methodology refresher

EMERGING MARKETS - Lecture 2: Methodology refresher EMERGING MARKETS - Lecture 2: Methodology refresher Maria Perrotta April 4, 2013 SITE http://www.hhs.se/site/pages/default.aspx My contact: maria.perrotta@hhs.se Aim of this class There are many different

More information

STOCKHOLM UNIVERSITY Department of Economics Course name: Empirical Methods Course code: EC40 Examiner: Lena Nekby Number of credits: 7,5 credits Date of exam: Friday, June 5, 009 Examination time: 3 hours

More information

WISE MA/PhD Programs Econometrics Instructor: Brett Graham Spring Semester, Academic Year Exam Version: A

WISE MA/PhD Programs Econometrics Instructor: Brett Graham Spring Semester, Academic Year Exam Version: A WISE MA/PhD Programs Econometrics Instructor: Brett Graham Spring Semester, 2015-16 Academic Year Exam Version: A INSTRUCTIONS TO STUDENTS 1 The time allowed for this examination paper is 2 hours. 2 This

More information

Midterm 2 - Solutions

Midterm 2 - Solutions Ecn 102 - Analysis of Economic Data University of California - Davis February 23, 2010 Instructor: John Parman Midterm 2 - Solutions You have until 10:20am to complete this exam. Please remember to put

More information

The cover page of the Encyclopedia of Health Economics (2014) Introduction to Econometric Application in Health Economics

The cover page of the Encyclopedia of Health Economics (2014) Introduction to Econometric Application in Health Economics PHPM110062 Teaching Demo The cover page of the Encyclopedia of Health Economics (2014) Introduction to Econometric Application in Health Economics Instructor: Mengcen Qian School of Public Health What

More information

Note that we are looking at the true mean, μ, not y. The problem for us is that we need to find the endpoints of our interval (a, b).

Note that we are looking at the true mean, μ, not y. The problem for us is that we need to find the endpoints of our interval (a, b). Confidence Intervals 1) What are confidence intervals? Simply, an interval for which we have a certain confidence. For example, we are 90% certain that an interval contains the true value of something

More information

Chapter 1 Statistical Inference

Chapter 1 Statistical Inference Chapter 1 Statistical Inference causal inference To infer causality, you need a randomized experiment (or a huge observational study and lots of outside information). inference to populations Generalizations

More information

WISE International Masters

WISE International Masters WISE International Masters ECONOMETRICS Instructor: Brett Graham INSTRUCTIONS TO STUDENTS 1 The time allowed for this examination paper is 2 hours. 2 This examination paper contains 32 questions. You are

More information

Note that we are looking at the true mean, μ, not y. The problem for us is that we need to find the endpoints of our interval (a, b).

Note that we are looking at the true mean, μ, not y. The problem for us is that we need to find the endpoints of our interval (a, b). Confidence Intervals 1) What are confidence intervals? Simply, an interval for which we have a certain confidence. For example, we are 90% certain that an interval contains the true value of something

More information

Lecture Notes 22 Causal Inference

Lecture Notes 22 Causal Inference Lecture Notes 22 Causal Inference Prediction and causation are very different. Typical questions are: Prediction: Predict after observing = x Causation: Predict after setting = x. Causation involves predicting

More information

PRICES VERSUS PREFERENCES: TASTE CHANGE AND TOBACCO CONSUMPTION

PRICES VERSUS PREFERENCES: TASTE CHANGE AND TOBACCO CONSUMPTION PRICES VERSUS PREFERENCES: TASTE CHANGE AND TOBACCO CONSUMPTION AEA SESSION: REVEALED PREFERENCE THEORY AND APPLICATIONS: RECENT DEVELOPMENTS Abigail Adams (IFS & Oxford) Martin Browning (Oxford & IFS)

More information

Simple Regression Model (Assumptions)

Simple Regression Model (Assumptions) Simple Regression Model (Assumptions) Lecture 18 Reading: Sections 18.1, 18., Logarithms in Regression Analysis with Asiaphoria, 19.6 19.8 (Optional: Normal probability plot pp. 607-8) 1 Height son, inches

More information

Two-sample Categorical data: Testing

Two-sample Categorical data: Testing Two-sample Categorical data: Testing Patrick Breheny April 1 Patrick Breheny Introduction to Biostatistics (171:161) 1/28 Separate vs. paired samples Despite the fact that paired samples usually offer

More information

Introduction to Regression Analysis. Dr. Devlina Chatterjee 11 th August, 2017

Introduction to Regression Analysis. Dr. Devlina Chatterjee 11 th August, 2017 Introduction to Regression Analysis Dr. Devlina Chatterjee 11 th August, 2017 What is regression analysis? Regression analysis is a statistical technique for studying linear relationships. One dependent

More information

Lecture Notes 12 Advanced Topics Econ 20150, Principles of Statistics Kevin R Foster, CCNY Spring 2012

Lecture Notes 12 Advanced Topics Econ 20150, Principles of Statistics Kevin R Foster, CCNY Spring 2012 Lecture Notes 2 Advanced Topics Econ 2050, Principles of Statistics Kevin R Foster, CCNY Spring 202 Endogenous Independent Variables are Invalid Need to have X causing Y not vice-versa or both! NEVER regress

More information

Semester 2, 2015/2016

Semester 2, 2015/2016 ECN 3202 APPLIED ECONOMETRICS 2. Simple linear regression B Mr. Sydney Armstrong Lecturer 1 The University of Guyana 1 Semester 2, 2015/2016 PREDICTION The true value of y when x takes some particular

More information

INTRODUCTION TO ANALYSIS OF VARIANCE

INTRODUCTION TO ANALYSIS OF VARIANCE CHAPTER 22 INTRODUCTION TO ANALYSIS OF VARIANCE Chapter 18 on inferences about population means illustrated two hypothesis testing situations: for one population mean and for the difference between two

More information

Development. ECON 8830 Anant Nyshadham

Development. ECON 8830 Anant Nyshadham Development ECON 8830 Anant Nyshadham Projections & Regressions Linear Projections If we have many potentially related (jointly distributed) variables Outcome of interest Y Explanatory variable of interest

More information

Job Training Partnership Act (JTPA)

Job Training Partnership Act (JTPA) Causal inference Part I.b: randomized experiments, matching and regression (this lecture starts with other slides on randomized experiments) Frank Venmans Example of a randomized experiment: Job Training

More information

Econometrics in a nutshell: Variation and Identification Linear Regression Model in STATA. Research Methods. Carlos Noton.

Econometrics in a nutshell: Variation and Identification Linear Regression Model in STATA. Research Methods. Carlos Noton. 1/17 Research Methods Carlos Noton Term 2-2012 Outline 2/17 1 Econometrics in a nutshell: Variation and Identification 2 Main Assumptions 3/17 Dependent variable or outcome Y is the result of two forces:

More information

Comparing the effects of two treatments on two ordinal outcome variables

Comparing the effects of two treatments on two ordinal outcome variables Working Papers in Statistics No 2015:16 Department of Statistics School of Economics and Management Lund University Comparing the effects of two treatments on two ordinal outcome variables VIBEKE HORSTMANN,

More information

LECTURE 15: SIMPLE LINEAR REGRESSION I

LECTURE 15: SIMPLE LINEAR REGRESSION I David Youngberg BSAD 20 Montgomery College LECTURE 5: SIMPLE LINEAR REGRESSION I I. From Correlation to Regression a. Recall last class when we discussed two basic types of correlation (positive and negative).

More information

1 Basic Analysis of Forward-Looking Decision Making

1 Basic Analysis of Forward-Looking Decision Making 1 Basic Analysis of Forward-Looking Decision Making Individuals and families make the key decisions that determine the future of the economy. The decisions involve balancing current sacrifice against future

More information

Function Approximation

Function Approximation 1 Function Approximation This is page i Printer: Opaque this 1.1 Introduction In this chapter we discuss approximating functional forms. Both in econometric and in numerical problems, the need for an approximating

More information

Path Analysis. PRE 906: Structural Equation Modeling Lecture #5 February 18, PRE 906, SEM: Lecture 5 - Path Analysis

Path Analysis. PRE 906: Structural Equation Modeling Lecture #5 February 18, PRE 906, SEM: Lecture 5 - Path Analysis Path Analysis PRE 906: Structural Equation Modeling Lecture #5 February 18, 2015 PRE 906, SEM: Lecture 5 - Path Analysis Key Questions for Today s Lecture What distinguishes path models from multivariate

More information

Introduction to causal identification. Nidhiya Menon IGC Summer School, New Delhi, July 2015

Introduction to causal identification. Nidhiya Menon IGC Summer School, New Delhi, July 2015 Introduction to causal identification Nidhiya Menon IGC Summer School, New Delhi, July 2015 Outline 1. Micro-empirical methods 2. Rubin causal model 3. More on Instrumental Variables (IV) Estimating causal

More information

Slope Fields: Graphing Solutions Without the Solutions

Slope Fields: Graphing Solutions Without the Solutions 8 Slope Fields: Graphing Solutions Without the Solutions Up to now, our efforts have been directed mainly towards finding formulas or equations describing solutions to given differential equations. Then,

More information

Panel Data. March 2, () Applied Economoetrics: Topic 6 March 2, / 43

Panel Data. March 2, () Applied Economoetrics: Topic 6 March 2, / 43 Panel Data March 2, 212 () Applied Economoetrics: Topic March 2, 212 1 / 43 Overview Many economic applications involve panel data. Panel data has both cross-sectional and time series aspects. Regression

More information

IV Estimation and its Limitations: Weak Instruments and Weakly Endogeneous Regressors

IV Estimation and its Limitations: Weak Instruments and Weakly Endogeneous Regressors IV Estimation and its Limitations: Weak Instruments and Weakly Endogeneous Regressors Laura Mayoral IAE, Barcelona GSE and University of Gothenburg Gothenburg, May 2015 Roadmap Deviations from the standard

More information

Flexible Estimation of Treatment Effect Parameters

Flexible Estimation of Treatment Effect Parameters Flexible Estimation of Treatment Effect Parameters Thomas MaCurdy a and Xiaohong Chen b and Han Hong c Introduction Many empirical studies of program evaluations are complicated by the presence of both

More information

Hypothesis Testing. Part I. James J. Heckman University of Chicago. Econ 312 This draft, April 20, 2006

Hypothesis Testing. Part I. James J. Heckman University of Chicago. Econ 312 This draft, April 20, 2006 Hypothesis Testing Part I James J. Heckman University of Chicago Econ 312 This draft, April 20, 2006 1 1 A Brief Review of Hypothesis Testing and Its Uses values and pure significance tests (R.A. Fisher)

More information

PSC 504: Instrumental Variables

PSC 504: Instrumental Variables PSC 504: Instrumental Variables Matthew Blackwell 3/28/2013 Instrumental Variables and Structural Equation Modeling Setup e basic idea behind instrumental variables is that we have a treatment with unmeasured

More information

Sensitivity checks for the local average treatment effect

Sensitivity checks for the local average treatment effect Sensitivity checks for the local average treatment effect Martin Huber March 13, 2014 University of St. Gallen, Dept. of Economics Abstract: The nonparametric identification of the local average treatment

More information

Estimating the Dynamic Effects of a Job Training Program with M. Program with Multiple Alternatives

Estimating the Dynamic Effects of a Job Training Program with M. Program with Multiple Alternatives Estimating the Dynamic Effects of a Job Training Program with Multiple Alternatives Kai Liu 1, Antonio Dalla-Zuanna 2 1 University of Cambridge 2 Norwegian School of Economics June 19, 2018 Introduction

More information

Chapter 24. Comparing Means

Chapter 24. Comparing Means Chapter 4 Comparing Means!1 /34 Homework p579, 5, 7, 8, 10, 11, 17, 31, 3! /34 !3 /34 Objective Students test null and alternate hypothesis about two!4 /34 Plot the Data The intuitive display for comparing

More information

Many natural processes can be fit to a Poisson distribution

Many natural processes can be fit to a Poisson distribution BE.104 Spring Biostatistics: Poisson Analyses and Power J. L. Sherley Outline 1) Poisson analyses 2) Power What is a Poisson process? Rare events Values are observational (yes or no) Random distributed

More information

CHOOSING THE RIGHT SAMPLING TECHNIQUE FOR YOUR RESEARCH. Awanis Ku Ishak, PhD SBM

CHOOSING THE RIGHT SAMPLING TECHNIQUE FOR YOUR RESEARCH. Awanis Ku Ishak, PhD SBM CHOOSING THE RIGHT SAMPLING TECHNIQUE FOR YOUR RESEARCH Awanis Ku Ishak, PhD SBM Sampling The process of selecting a number of individuals for a study in such a way that the individuals represent the larger

More information

Principles Underlying Evaluation Estimators

Principles Underlying Evaluation Estimators The Principles Underlying Evaluation Estimators James J. University of Chicago Econ 350, Winter 2019 The Basic Principles Underlying the Identification of the Main Econometric Evaluation Estimators Two

More information

Vocabulary: Samples and Populations

Vocabulary: Samples and Populations Vocabulary: Samples and Populations Concept Different types of data Categorical data results when the question asked in a survey or sample can be answered with a nonnumerical answer. For example if we

More information

Econometrics with Observational Data. Introduction and Identification Todd Wagner February 1, 2017

Econometrics with Observational Data. Introduction and Identification Todd Wagner February 1, 2017 Econometrics with Observational Data Introduction and Identification Todd Wagner February 1, 2017 Goals for Course To enable researchers to conduct careful quantitative analyses with existing VA (and non-va)

More information

CRP 272 Introduction To Regression Analysis

CRP 272 Introduction To Regression Analysis CRP 272 Introduction To Regression Analysis 30 Relationships Among Two Variables: Interpretations One variable is used to explain another variable X Variable Independent Variable Explaining Variable Exogenous

More information

AP Statistics. Chapter 6 Scatterplots, Association, and Correlation

AP Statistics. Chapter 6 Scatterplots, Association, and Correlation AP Statistics Chapter 6 Scatterplots, Association, and Correlation Objectives: Scatterplots Association Outliers Response Variable Explanatory Variable Correlation Correlation Coefficient Lurking Variables

More information

Lecture 1. Behavioral Models Multinomial Logit: Power and limitations. Cinzia Cirillo

Lecture 1. Behavioral Models Multinomial Logit: Power and limitations. Cinzia Cirillo Lecture 1 Behavioral Models Multinomial Logit: Power and limitations Cinzia Cirillo 1 Overview 1. Choice Probabilities 2. Power and Limitations of Logit 1. Taste variation 2. Substitution patterns 3. Repeated

More information

Contingency Tables. Contingency tables are used when we want to looking at two (or more) factors. Each factor might have two more or levels.

Contingency Tables. Contingency tables are used when we want to looking at two (or more) factors. Each factor might have two more or levels. Contingency Tables Definition & Examples. Contingency tables are used when we want to looking at two (or more) factors. Each factor might have two more or levels. (Using more than two factors gets complicated,

More information

Chapter 10 Nonlinear Models

Chapter 10 Nonlinear Models Chapter 10 Nonlinear Models Nonlinear models can be classified into two categories. In the first category are models that are nonlinear in the variables, but still linear in terms of the unknown parameters.

More information

Lecture 5: Sampling Methods

Lecture 5: Sampling Methods Lecture 5: Sampling Methods What is sampling? Is the process of selecting part of a larger group of participants with the intent of generalizing the results from the smaller group, called the sample, to

More information

Linear Regression with Multiple Regressors

Linear Regression with Multiple Regressors Linear Regression with Multiple Regressors (SW Chapter 6) Outline 1. Omitted variable bias 2. Causality and regression analysis 3. Multiple regression and OLS 4. Measures of fit 5. Sampling distribution

More information

Data Analysis and Statistical Methods Statistics 651

Data Analysis and Statistical Methods Statistics 651 Data Analysis and Statistical Methods Statistics 651 http://www.stat.tamu.edu/~suhasini/teaching.html Lecture 31 (MWF) Review of test for independence and starting with linear regression Suhasini Subba

More information

Next, we discuss econometric methods that can be used to estimate panel data models.

Next, we discuss econometric methods that can be used to estimate panel data models. 1 Motivation Next, we discuss econometric methods that can be used to estimate panel data models. Panel data is a repeated observation of the same cross section Panel data is highly desirable when it is

More information

Regression Discontinuity Designs.

Regression Discontinuity Designs. Regression Discontinuity Designs. Department of Economics and Management Irene Brunetti ireneb@ec.unipi.it 31/10/2017 I. Brunetti Labour Economics in an European Perspective 31/10/2017 1 / 36 Introduction

More information

Harvard University. Rigorous Research in Engineering Education

Harvard University. Rigorous Research in Engineering Education Statistical Inference Kari Lock Harvard University Department of Statistics Rigorous Research in Engineering Education 12/3/09 Statistical Inference You have a sample and want to use the data collected

More information

Regression Discontinuity

Regression Discontinuity Regression Discontinuity Christopher Taber Department of Economics University of Wisconsin-Madison October 24, 2017 I will describe the basic ideas of RD, but ignore many of the details Good references

More information