SP Experimental Designs - Theoretical Background and Case Study


1 SP Experimental Designs - Theoretical Background and Case Study
Basil Schmid, IVT, ETH Zurich
Measurement and Modeling, FS2016

2 Outline
1. Introduction
2. Orthogonal and fractional factorial designs
3. Efficient designs
4. Pivot designs
5. Testing a design: A case study
6. Conclusions

3 Introduction
- Explain how the variation of certain attributes affects the outcome of interest (causal relationship), applying a statistically efficient and effective framework (maximum amount of information with minimum amount of resources)
- Kuhfeld (1994): "The best approach to design creation is to use the computer as a tool along with traditional design skills, not as a substitute for thinking about the problem"

4 A brief history
- While serving as surgeon on HMS Salisbury, James Lind carried out a systematic clinical trial comparing patients with scurvy (a vitamin C deficiency disease)
  - Entry requirements to reduce exogenous variation
  - 12 seamen were assigned to 6 treatment groups, each receiving a different diet over a 2-week period
- Other examples: Agriculture, marketing, economics
- Sir Ronald Fisher (1935): "Experiments are experience carefully planned in advance, and designed to form a secure basis of new knowledge"
- Manipulation/variation of (existing) attributes; formation of attribute levels; observation/measurement of outcomes

5 Experimental design
- In contrast to revealed preference (RP) data, stated preference (SP) data are generated by some systematic and planned design process
- SP data may provide insights into a hypothetical market for which no RP data are available
- Formulation of statistical hypotheses to be tested
- Specification of the number of experimental units (observations) required and the population from which they will be sampled
- Specification of the randomization procedure for assigning the experimental units to the attribute levels: Sources of variation among the units are distributed over the entire experiment
- Determination of the statistical analysis that will be performed (discrete choice, multivariate regression, ...)

6 Orthogonal designs
- $x \perp y$: Two attribute vectors $x$ and $y$ are said to be (strictly) orthogonal if their inner product is zero:
  $\mathrm{cov}(x, y) = E[(x - E(x))(y - E(y))] = 0$
- Correlations between attributes are zero and attribute levels appear equally often in combination with all other attribute levels (balance), so the effects of interest can be estimated efficiently and stochastically independently
- Full factorial orthogonal design with 2 attributes $x$ and $y$ with 3 levels each ($3^2$ possible combinations; orthogonally coded). Table columns: Choice set, $x$, $y$
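
The orthogonality and balance properties above can be checked numerically. The following is a minimal sketch, assuming the usual orthogonal coding (-1, 0, 1) for three levels; the helper names `mean` and `cov` are introduced here for illustration only.

```python
from itertools import product

def mean(v):
    return sum(v) / len(v)

def cov(x, y):
    """Population covariance of two equally long attribute vectors."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)

levels = (-1, 0, 1)                        # assumed orthogonal coding of 3 levels
design = list(product(levels, repeat=2))   # full factorial: 3^2 = 9 choice sets
x = [row[0] for row in design]
y = [row[1] for row in design]

for s, (xi, yi) in enumerate(design, start=1):
    print(s, xi, yi)                       # columns: choice set, x, y
print("cov(x, y) =", cov(x, y))
```

Every level combination appears exactly once, which is what makes the design balanced and the covariance zero.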

7 Fractional factorial designs
- Full factorial design: Experiment size explodes with an increasing number of attributes and levels. E.g. 10 attributes with 3 levels: $3^{10}$ possible attribute combinations (= degrees of freedom)
- Full factorial designs are, by definition, perfectly orthogonal in all main effects and higher-order interactions
- Use an optimal subset of a full factorial
- Orthogonality can be maintained under the assumption that some effects (often higher-order interactions) are zero
- However, interactions might be highly correlated with main effects:
  $U_i = \alpha X_{tt,i} + \beta X_{tc,i} + \gamma X_{tt,i} X_{tc,i} + \varepsilon_i$ (1)
  $\partial U_i / \partial X_{tc,i} = \beta + \gamma X_{tt,i}$ (2)

8 Fractional factorial designs
- Assume a full factorial design with 10 attributes with 3 levels each ($3^{10}$ combinations):
  - To estimate all 10 main effects, one needs at least 20 choice sets (10*(3-1) degrees of freedom)
  - Hence, 45 two-way interactions ((10-1)*10/2) as well as many higher-order interactions, and their degrees of freedom, are ignored
- Practical considerations:
  - Main effects typically account for 70-90% of explained variance, two-way interactions for 5-15%
  - Limit the number of attribute levels: Often 2-5 levels
  - Limit the number of attributes: Often 6-16 attributes
  - Only allow some two-way interactions to be different from 0 (e.g. travel time x travel cost)
  - Block design: Divide the fractional factorial into groups with the same number of choice sets in a statistically efficient way
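
The counting argument on this slide can be reproduced in a few lines. This is only an arithmetic sketch; the variable names are chosen here for illustration.

```python
from math import comb

n_attributes, n_levels = 10, 3

full_factorial_size = n_levels ** n_attributes      # 3^10 attribute combinations
main_effect_df = n_attributes * (n_levels - 1)      # 10 * (3 - 1)
two_way_interactions = comb(n_attributes, 2)        # (10 - 1) * 10 / 2

print(full_factorial_size)    # 59049
print(main_effect_df)         # 20
print(two_way_interactions)   # 45
```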

9 Block-designs
- Typically, a respondent receives between 6 and 15 choice sets (response burden and cognitive fatigue)
- Even fractional designs often include more choice sets than the researcher wants to assign to each respondent
- Correlation between blocks and attributes should be minimized. Otherwise, one respondent may get a block containing, e.g., only high travel times
- Common mistake: Assign the first x choice sets to block b
- Orthogonal blocking: Block number is uncorrelated with attributes
- Good news: Most software automatically assigns choice sets to each block specified by the researcher
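
The difference between the "common mistake" and orthogonal blocking can be illustrated on a small example. The sketch below assumes a 3 x 3 full factorial in sorted order and uses a Latin-square style rule, block = (x + y) mod 3, as one possible orthogonal assignment; this rule is an illustrative choice, not necessarily what design software does internally.

```python
from itertools import product

def corr(u, v):
    """Pearson correlation of two equally long sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / (suu * svv) ** 0.5

design = list(product((-1, 0, 1), repeat=2))   # sorted by x, then y
x = [row[0] for row in design]
y = [row[1] for row in design]

# Common mistake: the first 3 choice sets form block 0, the next 3 block 1, ...
naive_block = [i // 3 for i in range(len(design))]
# Latin-square style assignment: every block sees every level of x and y once
latin_block = [(a + b) % 3 for a, b in design]

print("naive:", corr(naive_block, x), corr(naive_block, y))
print("latin:", corr(latin_block, x), corr(latin_block, y))
```

With sequential assignment the block number is perfectly correlated with x, so one block contains only the lowest level of x; the Latin-square assignment leaves the block number uncorrelated with both attributes.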

10 Some important definitions
- Unlabeled experiment: A choice experiment where alternatives have no intrinsic meaning (e.g. route 1 vs. route 2)
- Labeled experiment: A choice experiment where the alternatives are labeled. Model parameters can be estimated for each alternative independently (e.g. car vs. train vs. bus)
- Generic effect: The same model parameter for all alternatives in the utility function (e.g. travel cost)
- Alternative-specific effect: Different model parameters for each alternative in the utility function (e.g. travel time car vs. travel time bus vs. travel time train)
- Own vs. cross effect: If cross effects are present, the IID error assumption is violated

11 Example of unlabeled experiment (figure)

12 Example of labeled experiment (figure)

13 Orthogonal fraction of full factorial (example)
- 4 attributes with 3 levels, 3 unlabeled choice alternatives
- Minimum of 8 choice sets to estimate all 4 (generic) main effects
- Table columns: Set, TT1, TC1, AC1, QU1, TT2, TC2, AC2, QU2, TT3, TC3, AC3, QU3
- Table legend: redundant alternatives, weakly dominant alternatives, dominant alternatives

14 Problems with orthogonal designs
Reasons for moving away from orthogonal designs (OD):
- For some problems, an OD does not exist (e.g. for a limited number of choice sets predefined by the researcher); in general, ODs require a larger sample size and lead to larger choice sets
- Behaviorally plausible choice scenarios: ODs may include dominant/weakly dominant/redundant choice sets, which yield no information gain
- When working with preference constraints, orthogonality cannot be maintained
- Need for more sophisticated approaches: Efficient experimental designs
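
Screening a design for such uninformative choice sets is easy to automate. The sketch below is one possible check under stated assumptions: all attributes are generic and "lower is better" (as for travel time and cost), and the three labels are defined as identical (redundant), better or equal on every attribute (weakly dominant), and strictly better on every attribute (dominant). The function names are hypothetical.

```python
from itertools import permutations

def classify_pair(a, b):
    """Relation of alternative a to alternative b; lower values are assumed better."""
    if a == b:
        return "redundant"
    if all(x < y for x, y in zip(a, b)):
        return "dominant"
    if all(x <= y for x, y in zip(a, b)):
        return "weakly dominant"
    return "trade-off"

def classify_choice_set(alternatives):
    """Report the most severe relation found between any two alternatives."""
    severity = ["dominant", "weakly dominant", "redundant", "trade-off"]
    found = {classify_pair(a, b) for a, b in permutations(alternatives, 2)}
    return next(label for label in severity if label in found)

# Each alternative is (travel time in min., travel cost in CHF); values are made up
print(classify_choice_set([(10, 5), (20, 8)]))   # dominant
print(classify_choice_set([(10, 5), (10, 8)]))   # weakly dominant
print(classify_choice_set([(10, 5), (10, 5)]))   # redundant
print(classify_choice_set([(10, 8), (20, 5)]))   # trade-off
```

Only choice sets classified as "trade-off" force respondents to weigh one attribute against another.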

15 Efficient designs: Some basic concepts
- Efficiency: For given design requirements (violating strict orthogonality), minimize the variances of the parameter estimates, which are taken from the variance-covariance matrix of a design
- D-Efficient GLM designs: No prior information about the parameter values (signs, magnitude). Efficiency converges towards orthogonality
- D-Efficient MNL designs: Efficiency measures depend on the unknown parameter values one wants to estimate. In many cases, one has some sound knowledge about the sign and relative values of the parameters of the design attributes (e.g. travel cost and travel time both have a negative effect on utility, leading to a positive value of time)

16 Example 1
- Orthogonal design with travel time and travel cost (2 alternatives, 3 levels): Quadrants 1 and 3 dominate quadrants 2 and 4

17 Example 2
- Efficient design with travel time and travel cost (2 alternatives, 3 levels): Elimination of dominant alternatives
- Figure: Cost_MIV - Cost_PT plotted against Time_MIV - Time_PT, with WLS predictions

18 Efficient designs: Some basic concepts
- Main question: How can the researcher make use of prior information in order to increase the efficiency (minimize standard errors of the attribute parameters, i.e. more robust results) and reduce the sample size requirements?
- Example 1: Orthogonal designs make no use of prior information, so the time and cost attributes are uncorrelated
- Example 2: An efficient design with no dominant alternatives automatically leads to a negative correlation between time and cost, which forces respondents to trade off and increases the amount of preference information for a given sample size
- D-Efficient MNL approach: Use expected parameter distributions with $\mu_k$ and $\sigma_k$ to calculate the optimal design

19 D-Efficient GLM designs
- Find a design matrix $Z$, with rows selected from a $Q \times k$ matrix $X$ where $n \leq Q$, that is optimal in some sense. $Z$ is an $n \times k$ matrix, where $k$ is the number of parameters and $n$ is the number of choice sets in the actual experiment
- Row-based Fedorov algorithm (R package AlgDesign): Selection from a predefined candidate set (after exclusion of dominant/redundant alternatives, etc.)
- Optimization criterion: Maximize the k-th root of the determinant of the normalized dispersion matrix $M \propto \Omega^{-1}$
- Assumption: Observations are independent and error terms are normally distributed
  $\max_Z \; D\text{-Efficiency} = \det(Z'Z)^{1/k} / n$ (3)
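
To make the criterion concrete, the sketch below evaluates $\det(Z'Z)^{1/k} / n$ for subsets of a small candidate set. It is not the Fedorov exchange algorithm: for a 9-row candidate set an exhaustive search over all 4-row subsets is feasible and stands in for it. The candidate set (intercept plus two orthogonally coded attributes) and all function names are assumptions made for illustration.

```python
from itertools import combinations, product

def det(m):
    """Determinant via Gaussian elimination with partial pivoting."""
    a = [list(row) for row in m]
    n, d = len(a), 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        if abs(a[p][c]) < 1e-12:
            return 0.0                      # singular information matrix
        if p != c:
            a[c], a[p] = a[p], a[c]
            d = -d
        d *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for j in range(c, n):
                a[r][j] -= f * a[c][j]
    return d

def d_efficiency(z):
    """det(Z'Z)^(1/k) / n for a design given as a sequence of rows."""
    n, k = len(z), len(z[0])
    ztz = [[sum(row[i] * row[j] for row in z) for j in range(k)] for i in range(k)]
    return max(det(ztz), 0.0) ** (1.0 / k) / n

# Candidate set X: intercept, x, y with Q = 9 rows and k = 3 parameters
candidates = [(1.0, x, y) for x, y in product((-1.0, 0.0, 1.0), repeat=2)]
n_sets = 4
best = max(combinations(candidates, n_sets), key=d_efficiency)

print("best 4-row design:", best)
print("D-efficiency of best design:", d_efficiency(best))
print("D-efficiency of full candidate set:", d_efficiency(candidates))
```

For this linear main-effects model the search picks the four corner points, which is the orthogonal 2 x 2 design; this matches the remark that GLM efficiency converges towards orthogonality.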

20 D-Efficient MNL designs
- The asymptotic variance-covariance (AVC) matrix for discrete choice models depends on the true parameter values
- Starting point: Need to make assumptions about the model, utility functions and parameter values
- Design matrices $Z$ are created using a column-based swapping algorithm: Selection of attribute levels over all choice situations for each attribute
- Optimization criterion: Minimize the k-th root of the determinant of the AVC matrix $\Omega$
  $\min_Z \; D\text{-Error} = \det(\Omega(Z, \beta))^{1/k}$ (4)
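
The dependence of the D-error on the priors can be seen in a small numerical sketch. It assumes an MNL model with two generic parameters (time, cost), uses the standard MNL Fisher information for a single respondent as the inverse of the AVC matrix, and compares two hand-made designs; the attribute values, priors and the function name `mnl_d_error` are hypothetical.

```python
import math

def mnl_d_error(design, beta):
    """D-error det(AVC)^(1/k) of an MNL design with k = 2 generic parameters."""
    info = [[0.0, 0.0], [0.0, 0.0]]
    for alternatives in design:
        v = [beta[0] * a[0] + beta[1] * a[1] for a in alternatives]
        vmax = max(v)
        expv = [math.exp(u - vmax) for u in v]
        total = sum(expv)
        p = [e / total for e in expv]                     # MNL choice probabilities
        xbar = [sum(pj * a[i] for pj, a in zip(p, alternatives)) for i in range(2)]
        for pj, a in zip(p, alternatives):
            d = (a[0] - xbar[0], a[1] - xbar[1])
            for i in range(2):
                for j in range(2):
                    info[i][j] += pj * d[i] * d[j]        # Fisher information
    det_info = info[0][0] * info[1][1] - info[0][1] * info[1][0]
    return det_info ** (-1.0 / 2)                          # AVC = info^-1, k = 2

# Each choice set: two alternatives described by (time in min., cost in CHF)
tradeoff = [[(20, 6), (30, 3)], [(30, 3), (20, 6)],
            [(20, 4), (40, 2)], [(40, 2), (20, 4)]]       # faster but more expensive
dominated = [[(20, 3), (30, 6)], [(30, 6), (20, 3)],
             [(20, 2), (40, 4)], [(40, 4), (20, 2)]]      # faster and cheaper

zero_prior = (0.0, 0.0)
prior = (-0.05, -0.2)                                     # hypothetical priors

print(mnl_d_error(tradeoff, zero_prior), mnl_d_error(dominated, zero_prior))
print(mnl_d_error(tradeoff, prior), mnl_d_error(dominated, prior))
```

Under zero priors the two designs have the same D-error, so the criterion cannot tell them apart; with negative priors on time and cost, the design that forces trade-offs has the lower D-error.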

21 Some remarks on D-Efficient designs
- A large number of different algorithms and optimization criteria exist (focus on D-Efficiency as the most common approach in the literature)
- Eliminating undesirable choice sets has to be done manually by using preference constraints
- GLM designs: Can be created in the open-source software R. Robust towards misspecification of priors and often as efficient as MNL designs
- MNL designs: Created in the commercial software NGENE. Easier to implement, more assistance and possibilities. Priors usually come from the literature, intuition and pre-test studies. Misspecification can be minimized by assuming a random distribution of priors (Bayesian approach)

22 An example of a design strategy
- 9 attributes with 3 levels ($3^9$ full factorial), 2 labeled alternatives, 32 choice sets with 4 blocks, estimation of all linear main effects, quadratic effects and 6 selected two-way interactions
- Polynomial and interaction effects have to be specified in the utility function of a design
- No weakly dominant alternatives (i.e. all attribute values of one alternative in choice set $s$ are better or equal: $a_1 \geq a_2$ or $a_1 \leq a_2$)
- No alternatives whose travel time is strongly dominant relative to travel cost, or vice versa (a constraint on $a_{1,cost}$ relative to $a_{2,cost}$ combined with one on $a_{1,time}$ relative to $a_{2,time}$)
- Weak priors to determine the direction of expected effects

23 Efficient design (example)
- 4 attributes with 3 levels, 3 unlabeled choice alternatives
- Minimum of 8 choice sets to estimate all 4 (generic) main effects
- Weak priors, exclusion of all dominant choice sets
- Free choice about the number of choice sets (# choice sets > df)
- Table columns: Set, TT1, TC1, AC1, QU1, TT2, TC2, AC2, QU2, TT3, TC3, AC3, QU3
- No more dominant/weakly dominant/redundant choice sets

24 Some general remarks
- Experimental design creation is a research topic on its own (Rose and Bliemer, 2009; Quan et al., 2011)
- If priors are misspecified, one might run into trouble. Be careful when using priors!
- Use attributes, values and trade-off variations that are plausible
- Make sure that there are some overlapping values of generic attributes between alternatives (pivot designs)
- Order effects: Randomize the order of alternatives across respondents in the questionnaire
- Carefully introduce respondents to the (hypothetical) scenario and explain the attributes you are presenting to them

25 Pivot designs
- It is preferable to base variations around values for observed behavior (state of the art in transportation research):
  - Calculate the design with placeholder values (e.g. 1, 2, 3) and replace them by relative changes (e.g. 0.7, 1.1, 1.5)
  - Combination of RP data with variations given by the design: respondents can better identify with the presented choice scenarios; much more variation in the attribute levels
  - Possible to include one reference alternative in the choice sets (e.g. bike travel time, whose value is not varied)
- Problems:
  - If reference values are (highly) dominant, the respondents will more likely choose the respective alternative (only in labeled experiments)
  - Correlation between attributes; skewness
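
The placeholder replacement described above amounts to a lookup followed by a multiplication. In the sketch below the multipliers 0.7, 1.1 and 1.5 are the ones named on the slide, while the three design rows, the reference values and the function name `pivot` are made up for illustration.

```python
# Design computed with placeholder levels 1, 2, 3 (hypothetical rows)
placeholder_design = [
    {"time": 1, "cost": 3},
    {"time": 2, "cost": 1},
    {"time": 3, "cost": 2},
]
relative_change = {1: 0.7, 2: 1.1, 3: 1.5}   # level -> multiplier of the reference value

def pivot(design, reference):
    """Replace placeholder levels by respondent-specific attribute values."""
    return [
        {attr: round(reference[attr] * relative_change[level], 2)
         for attr, level in row.items()}
        for row in design
    ]

# Hypothetical RP reference values of one respondent: 30 min. and 4 CHF
reference = {"time": 30.0, "cost": 4.0}
for row in pivot(placeholder_design, reference):
    print(row)
```

Because every respondent has different reference values, the same placeholder design produces much more variation in the presented attribute levels than fixed levels would.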

26 Pivot designs: Trade-off distribution
- Figure: Example where MIV is often cheaper and faster than PT, so a modification of the reference values is needed!

27 Testing a design: A case study
Once you have your design, you should test how well it performs in estimating the coefficients of interest, based on a simulation of a more or less hypothetical population:
- Define priors for the attribute weights of the utility function based on recent similar studies
- Simulate the error structure (GEV) for the utility function, taking into account the panel structure of the designs
- Calculate individual utilities and determine the chosen alternatives for each simulated subject
- Estimate the parameters for the simulated data and compare the results with the a-priori assumptions

28 Testing a design: Pivot approach
- Experimental design: 9 attributes with 3 levels ($3^9$ full factorial), 2 labeled alternatives, 32 choice sets (8 per subject)
- Reference values taken from a Swiss mode choice experiment:
  - Total travel time: For PT alternative = travel time without access and egress time; for MIV alternative = travel time + parking search time
  - Total travel cost: For PT alternative = ticket price; for MIV alternative = fuel cost + parking cost
  - Number of transfers: PT alternative only
- Effect-coded attributes (units):
  - Travel time (MIV and PT) [%]
  - Travel cost (MIV and PT) [%]
  - Delay prob. (MIV and PT) [%]
  - Walking / waiting time (MIV and PT) [min.]
  - Number of transfers (PT) [#]

29 Testing a design: A-priori Coefficients
- Prior values for the individual weights of attribute $k$, $\beta_{ik} \sim N(\mu_k, \sigma_k)$, and alternative-specific constants are simulated based on results obtained from the linear model in the BMVI Zeitkostenstudie (Axhausen et al., 2014)
- For each simulated individual $i$, the same $\beta_{ik}$ is used over all 8 choice sets, representing the panel structure of the experiment
- Coefficients (table columns: Coefficient, Mean, SD, Type):
  - ASC_MIV: alternative-specific
  - $\beta_{timeMIV}$: alternative-specific
  - $\beta_{timePT}$: alternative-specific
  - $\beta_{cost}$: generic
  - $\beta_{delayMIV}$: alternative-specific
  - $\beta_{delayPT}$: alternative-specific
  - $\beta_{walk}$: generic
  - $\beta_{transfersPT}$: alternative-specific
- VOT MIV: 15.0 CHF/h; VOT PT: 14.2 CHF/h
- Number of simulated coefficient vectors $\beta_{ik}$: 400

30 Testing a design: Utility Function
- The random utility model (RUM) framework assumes that in each choice set $s$, individual $i$ perceives utility $U_{ijs}$ for each alternative $j$ among the full set of alternatives $J$ (MIV and PT), given the attributes $X_{ijs}$, and chooses the one that maximizes utility. $U_{ijs}$ has an observed component $V_{ijs}$ and an unobserved component $\varepsilon_{ijs}$:
  $U_{ijs} = V_{ijs} + \varepsilon_{ijs}$ (5)
  where
  $\varepsilon_{ijs} \sim GEV(0, 1, 0)$ (6)
  and
  $V_{ijs} = \sum_{k=1}^{K} \beta_{ik} X_{ijsk}$ (7)

31 Testing a design: Choice simulation
- The chosen alternatives $choice_{is}$ are calculated as follows:
  $choice_{is} = \begin{cases} \text{MIV} & \text{if } U_{is,MIV} > U_{is,PT} \\ \text{PT} & \text{else} \end{cases}$ (8)
- Snippet of a simulated discrete choice data set, with columns: id, set, alt, choice, block, time (min.), cost (CHF), walk (min.), delay (min.), transfers (#); rows alternate between the MIV and PT alternatives
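
The simulation steps of equations (5) to (8) can be sketched as follows. The code assumes that GEV(0, 1, 0) denotes the standard Gumbel distribution, draws one coefficient vector per respondent to mimic the panel structure, and uses only two generic attributes (time, cost); the prior means, standard deviations and choice sets are hypothetical and not the values used in the case study.

```python
import math
import random

def gumbel(rng):
    """Standard Gumbel draw, i.e. GEV(0, 1, 0), by inverse-CDF sampling."""
    u = rng.random()
    while u == 0.0:                      # avoid log(0)
        u = rng.random()
    return -math.log(-math.log(u))

def simulate_respondent(rng, mu, sigma, choice_sets):
    """Return the simulated choices of one respondent over all choice sets."""
    # One beta vector per respondent, reused in every choice set (panel structure)
    beta = [rng.gauss(m, s) for m, s in zip(mu, sigma)]
    choices = []
    for x_miv, x_pt in choice_sets:
        u_miv = sum(b * x for b, x in zip(beta, x_miv)) + gumbel(rng)   # eq. (5), (7)
        u_pt = sum(b * x for b, x in zip(beta, x_pt)) + gumbel(rng)
        choices.append("MIV" if u_miv > u_pt else "PT")                 # eq. (8)
    return choices

rng = random.Random(2016)
mu, sigma = (-0.05, -0.20), (0.01, 0.04)        # hypothetical priors for time, cost
choice_sets = [((20, 6), (30, 3)), ((30, 3), (20, 6)),
               ((20, 4), (40, 2)), ((40, 2), (20, 4)),
               ((25, 5), (35, 3)), ((35, 3), (25, 5)),
               ((15, 7), (30, 4)), ((30, 4), (15, 7))]   # (MIV, PT) as (min., CHF)

for person in range(3):
    print(person, simulate_respondent(rng, mu, sigma, choice_sets))
```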

32 Testing a design: Estimation
- For given randomly drawn subsets of RP data, simulated $\beta_{ik}$ coefficient vectors and simulated error terms $\varepsilon_{ijs}$, the models are estimated for 3 different designs. This process is repeated 2000 times to get insights into the distributions of coefficients (robustness), variances (precision) and values of time
- The between-design differences of $E(\beta_k)$ and $E(SE_k)$ with respect to the a-priori parameters are small
- Results table: $E(\beta_k)$ and $E(SE_k)$ for each design approach (GLM; MNL: $\beta_0$; MNL: $\beta_k$), with rows ASC_MIV *, $\beta_{timeMIV}$, $\beta_{timePT}$, $\beta_{cost}$, $\beta_{delayMIV}$ *, $\beta_{delayPT}$ *, $\beta_{walk}$ *, $\beta_{transfersPT}$, VOT MIV, VOT PT
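
A single replication of such a simulate-and-estimate loop can be sketched in plain Python. The example is deliberately simplified and rests on several assumptions: a binary MIV-vs-PT logit with two generic coefficients and no heterogeneity, attribute differences drawn uniformly instead of taken from a design, and true values chosen here so that the implied value of time is 15 CHF/h. Drawing the choice from the logistic probability is equivalent to comparing two utilities with independent Gumbel errors.

```python
import math
import random

def simulate(rng, beta, n):
    """Binary choices (1 = MIV); attributes are MIV-minus-PT differences."""
    data = []
    for _ in range(n):
        d = (rng.uniform(-20, 20), rng.uniform(-5, 5))      # (min., CHF)
        v = beta[0] * d[0] + beta[1] * d[1]
        p_miv = 1.0 / (1.0 + math.exp(-v))
        data.append((d, 1 if rng.random() < p_miv else 0))
    return data

def fit_binary_logit(data, iterations=25):
    """Maximum likelihood by Newton-Raphson; returns estimates and standard errors."""
    b = [0.0, 0.0]
    for _ in range(iterations):
        g = [0.0, 0.0]                                      # gradient
        h = [[0.0, 0.0], [0.0, 0.0]]                        # information matrix
        for d, y in data:
            p = 1.0 / (1.0 + math.exp(-(b[0] * d[0] + b[1] * d[1])))
            for i in range(2):
                g[i] += (y - p) * d[i]
                for j in range(2):
                    h[i][j] += p * (1.0 - p) * d[i] * d[j]
        det = h[0][0] * h[1][1] - h[0][1] * h[1][0]
        b = [b[0] + (h[1][1] * g[0] - h[0][1] * g[1]) / det,
             b[1] + (h[0][0] * g[1] - h[1][0] * g[0]) / det]
    # Standard errors from the information matrix of the last iteration
    se = [math.sqrt(h[1][1] / det), math.sqrt(h[0][0] / det)]
    return b, se

rng = random.Random(42)
true_beta = (-0.05, -0.20)                                  # hypothetical: per min., per CHF
b_hat, se_hat = fit_binary_logit(simulate(rng, true_beta, 5000))
vot = b_hat[0] / b_hat[1] * 60.0                            # CHF per hour

print("estimates:", b_hat)
print("standard errors:", se_hat)
print("value of time [CHF/h]:", vot)
```

Repeating this with fresh draws many times, once per candidate design, gives the distributions of coefficients, standard errors and values of time that the slide compares against the a-priori values.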

33 Conclusions
- No substantial differences between the different design approaches: Designs are robust and reproduce the a-priori values well
- From a behavioral perspective, one should always exclude dominant and weakly dominant alternatives!
- Personal suggestion: Create an efficient design by...
  - carefully thinking about your research question and aims
  - assigning about 8 choice sets to each respondent and using a block design (total # of choice sets: 1.5 x df)
  - using the MNL approach with zero (or weak) priors, excluding undesired choice sets by manually setting preference conditions
  - updating your design after a pre-test study


More information

Topic 12. The Split-plot Design and its Relatives (continued) Repeated Measures

Topic 12. The Split-plot Design and its Relatives (continued) Repeated Measures 12.1 Topic 12. The Split-plot Design and its Relatives (continued) Repeated Measures 12.9 Repeated measures analysis Sometimes researchers make multiple measurements on the same experimental unit. We have

More information

Applied Statistics and Econometrics

Applied Statistics and Econometrics Applied Statistics and Econometrics Lecture 6 Saul Lach September 2017 Saul Lach () Applied Statistics and Econometrics September 2017 1 / 53 Outline of Lecture 6 1 Omitted variable bias (SW 6.1) 2 Multiple

More information

6.867 Machine Learning

6.867 Machine Learning 6.867 Machine Learning Problem set 1 Solutions Thursday, September 19 What and how to turn in? Turn in short written answers to the questions explicitly stated, and when requested to explain or prove.

More information

(6, 4) Is there arbitrage in this market? If so, find all arbitrages. If not, find all pricing kernels.

(6, 4) Is there arbitrage in this market? If so, find all arbitrages. If not, find all pricing kernels. Advanced Financial Models Example sheet - Michaelmas 208 Michael Tehranchi Problem. Consider a two-asset model with prices given by (P, P 2 ) (3, 9) /4 (4, 6) (6, 8) /4 /2 (6, 4) Is there arbitrage in

More information

Heteroskedasticity. Part VII. Heteroskedasticity

Heteroskedasticity. Part VII. Heteroskedasticity Part VII Heteroskedasticity As of Oct 15, 2015 1 Heteroskedasticity Consequences Heteroskedasticity-robust inference Testing for Heteroskedasticity Weighted Least Squares (WLS) Feasible generalized Least

More information

Generalized Linear Models (GLZ)

Generalized Linear Models (GLZ) Generalized Linear Models (GLZ) Generalized Linear Models (GLZ) are an extension of the linear modeling process that allows models to be fit to data that follow probability distributions other than the

More information

Advanced Econometrics I

Advanced Econometrics I Lecture Notes Autumn 2010 Dr. Getinet Haile, University of Mannheim 1. Introduction Introduction & CLRM, Autumn Term 2010 1 What is econometrics? Econometrics = economic statistics economic theory mathematics

More information

Model Estimation Example

Model Estimation Example Ronald H. Heck 1 EDEP 606: Multivariate Methods (S2013) April 7, 2013 Model Estimation Example As we have moved through the course this semester, we have encountered the concept of model estimation. Discussions

More information

Matrices and Vectors

Matrices and Vectors Matrices and Vectors James K. Peterson Department of Biological Sciences and Department of Mathematical Sciences Clemson University November 11, 2013 Outline 1 Matrices and Vectors 2 Vector Details 3 Matrix

More information

Bayesian optimal designs for discrete choice experiments with partial profiles

Bayesian optimal designs for discrete choice experiments with partial profiles Bayesian optimal designs for discrete choice experiments with partial profiles Roselinde Kessels Bradley Jones Peter Goos Roselinde Kessels is a post-doctoral researcher in econometrics at Universiteit

More information

Applied Quantitative Methods II

Applied Quantitative Methods II Applied Quantitative Methods II Lecture 10: Panel Data Klára Kaĺıšková Klára Kaĺıšková AQM II - Lecture 10 VŠE, SS 2016/17 1 / 38 Outline 1 Introduction 2 Pooled OLS 3 First differences 4 Fixed effects

More information

1 Outline. 1. Motivation. 2. SUR model. 3. Simultaneous equations. 4. Estimation

1 Outline. 1. Motivation. 2. SUR model. 3. Simultaneous equations. 4. Estimation 1 Outline. 1. Motivation 2. SUR model 3. Simultaneous equations 4. Estimation 2 Motivation. In this chapter, we will study simultaneous systems of econometric equations. Systems of simultaneous equations

More information

e author and the promoter give permission to consult this master dissertation and to copy it or parts of it for personal use. Each other use falls

e author and the promoter give permission to consult this master dissertation and to copy it or parts of it for personal use. Each other use falls e author and the promoter give permission to consult this master dissertation and to copy it or parts of it for personal use. Each other use falls under the restrictions of the copyright, in particular

More information

Hypothesis testing Goodness of fit Multicollinearity Prediction. Applied Statistics. Lecturer: Serena Arima

Hypothesis testing Goodness of fit Multicollinearity Prediction. Applied Statistics. Lecturer: Serena Arima Applied Statistics Lecturer: Serena Arima Hypothesis testing for the linear model Under the Gauss-Markov assumptions and the normality of the error terms, we saw that β N(β, σ 2 (X X ) 1 ) and hence s

More information

Econometrics (60 points) as the multivariate regression of Y on X 1 and X 2? [6 points]

Econometrics (60 points) as the multivariate regression of Y on X 1 and X 2? [6 points] Econometrics (60 points) Question 7: Short Answers (30 points) Answer parts 1-6 with a brief explanation. 1. Suppose the model of interest is Y i = 0 + 1 X 1i + 2 X 2i + u i, where E(u X)=0 and E(u 2 X)=

More information

Calculating indicators with PythonBiogeme

Calculating indicators with PythonBiogeme Calculating indicators with PythonBiogeme Michel Bierlaire May 17, 2017 Report TRANSP-OR 170517 Transport and Mobility Laboratory School of Architecture, Civil and Environmental Engineering Ecole Polytechnique

More information

Course: ESO-209 Home Work: 1 Instructor: Debasis Kundu

Course: ESO-209 Home Work: 1 Instructor: Debasis Kundu Home Work: 1 1. Describe the sample space when a coin is tossed (a) once, (b) three times, (c) n times, (d) an infinite number of times. 2. A coin is tossed until for the first time the same result appear

More information

The legacy of Sir Ronald A. Fisher. Fisher s three fundamental principles: local control, replication, and randomization.

The legacy of Sir Ronald A. Fisher. Fisher s three fundamental principles: local control, replication, and randomization. 1 Chapter 1: Research Design Principles The legacy of Sir Ronald A. Fisher. Fisher s three fundamental principles: local control, replication, and randomization. 2 Chapter 2: Completely Randomized Design

More information

Data Mining. Dimensionality reduction. Hamid Beigy. Sharif University of Technology. Fall 1395

Data Mining. Dimensionality reduction. Hamid Beigy. Sharif University of Technology. Fall 1395 Data Mining Dimensionality reduction Hamid Beigy Sharif University of Technology Fall 1395 Hamid Beigy (Sharif University of Technology) Data Mining Fall 1395 1 / 42 Outline 1 Introduction 2 Feature selection

More information

Final Exam. Problem Score Problem Score 1 /10 8 /10 2 /10 9 /10 3 /10 10 /10 4 /10 11 /10 5 /10 12 /10 6 /10 13 /10 7 /10 Total /130

Final Exam. Problem Score Problem Score 1 /10 8 /10 2 /10 9 /10 3 /10 10 /10 4 /10 11 /10 5 /10 12 /10 6 /10 13 /10 7 /10 Total /130 EE103/CME103: Introduction to Matrix Methods December 9 2015 S. Boyd Final Exam You may not use any books, notes, or computer programs (e.g., Julia). Throughout this exam we use standard mathematical notation;

More information

Statistical Inference of Covariate-Adjusted Randomized Experiments

Statistical Inference of Covariate-Adjusted Randomized Experiments 1 Statistical Inference of Covariate-Adjusted Randomized Experiments Feifang Hu Department of Statistics George Washington University Joint research with Wei Ma, Yichen Qin and Yang Li Email: feifang@gwu.edu

More information

Statistical Methods. Missing Data snijders/sm.htm. Tom A.B. Snijders. November, University of Oxford 1 / 23

Statistical Methods. Missing Data  snijders/sm.htm. Tom A.B. Snijders. November, University of Oxford 1 / 23 1 / 23 Statistical Methods Missing Data http://www.stats.ox.ac.uk/ snijders/sm.htm Tom A.B. Snijders University of Oxford November, 2011 2 / 23 Literature: Joseph L. Schafer and John W. Graham, Missing

More information

Extending causal inferences from a randomized trial to a target population

Extending causal inferences from a randomized trial to a target population Extending causal inferences from a randomized trial to a target population Issa Dahabreh Center for Evidence Synthesis in Health, Brown University issa dahabreh@brown.edu January 16, 2019 Issa Dahabreh

More information

X t = a t + r t, (7.1)

X t = a t + r t, (7.1) Chapter 7 State Space Models 71 Introduction State Space models, developed over the past 10 20 years, are alternative models for time series They include both the ARIMA models of Chapters 3 6 and the Classical

More information

Advanced topics from statistics

Advanced topics from statistics Advanced topics from statistics Anders Ringgaard Kristensen Advanced Herd Management Slide 1 Outline Covariance and correlation Random vectors and multivariate distributions The multinomial distribution

More information

An Overview of Choice Models

An Overview of Choice Models An Overview of Choice Models Dilan Görür Gatsby Computational Neuroscience Unit University College London May 08, 2009 Machine Learning II 1 / 31 Outline 1 Overview Terminology and Notation Economic vs

More information

Chapter 3 ANALYSIS OF RESPONSE PROFILES

Chapter 3 ANALYSIS OF RESPONSE PROFILES Chapter 3 ANALYSIS OF RESPONSE PROFILES 78 31 Introduction In this chapter we present a method for analysing longitudinal data that imposes minimal structure or restrictions on the mean responses over

More information

Statistics for Managers Using Microsoft Excel Chapter 9 Two Sample Tests With Numerical Data

Statistics for Managers Using Microsoft Excel Chapter 9 Two Sample Tests With Numerical Data Statistics for Managers Using Microsoft Excel Chapter 9 Two Sample Tests With Numerical Data 999 Prentice-Hall, Inc. Chap. 9 - Chapter Topics Comparing Two Independent Samples: Z Test for the Difference

More information

Lecture 11. Multivariate Normal theory

Lecture 11. Multivariate Normal theory 10. Lecture 11. Multivariate Normal theory Lecture 11. Multivariate Normal theory 1 (1 1) 11. Multivariate Normal theory 11.1. Properties of means and covariances of vectors Properties of means and covariances

More information

ECON3327: Financial Econometrics, Spring 2016

ECON3327: Financial Econometrics, Spring 2016 ECON3327: Financial Econometrics, Spring 2016 Wooldridge, Introductory Econometrics (5th ed, 2012) Chapter 11: OLS with time series data Stationary and weakly dependent time series The notion of a stationary

More information

BIOL Biometry LAB 6 - SINGLE FACTOR ANOVA and MULTIPLE COMPARISON PROCEDURES

BIOL Biometry LAB 6 - SINGLE FACTOR ANOVA and MULTIPLE COMPARISON PROCEDURES BIOL 458 - Biometry LAB 6 - SINGLE FACTOR ANOVA and MULTIPLE COMPARISON PROCEDURES PART 1: INTRODUCTION TO ANOVA Purpose of ANOVA Analysis of Variance (ANOVA) is an extremely useful statistical method

More information

Logistic Regression: Regression with a Binary Dependent Variable

Logistic Regression: Regression with a Binary Dependent Variable Logistic Regression: Regression with a Binary Dependent Variable LEARNING OBJECTIVES Upon completing this chapter, you should be able to do the following: State the circumstances under which logistic regression

More information

Linear Classification. CSE 6363 Machine Learning Vassilis Athitsos Computer Science and Engineering Department University of Texas at Arlington

Linear Classification. CSE 6363 Machine Learning Vassilis Athitsos Computer Science and Engineering Department University of Texas at Arlington Linear Classification CSE 6363 Machine Learning Vassilis Athitsos Computer Science and Engineering Department University of Texas at Arlington 1 Example of Linear Classification Red points: patterns belonging

More information

Machine Learning for OR & FE

Machine Learning for OR & FE Machine Learning for OR & FE Regression II: Regularization and Shrinkage Methods Martin Haugh Department of Industrial Engineering and Operations Research Columbia University Email: martin.b.haugh@gmail.com

More information

Estimating and Testing the US Model 8.1 Introduction

Estimating and Testing the US Model 8.1 Introduction 8 Estimating and Testing the US Model 8.1 Introduction The previous chapter discussed techniques for estimating and testing complete models, and this chapter applies these techniques to the US model. For

More information

Using Estimating Equations for Spatially Correlated A

Using Estimating Equations for Spatially Correlated A Using Estimating Equations for Spatially Correlated Areal Data December 8, 2009 Introduction GEEs Spatial Estimating Equations Implementation Simulation Conclusion Typical Problem Assess the relationship

More information

Structural Reliability

Structural Reliability Structural Reliability Thuong Van DANG May 28, 2018 1 / 41 2 / 41 Introduction to Structural Reliability Concept of Limit State and Reliability Review of Probability Theory First Order Second Moment Method

More information

Sliced Inverse Regression

Sliced Inverse Regression Sliced Inverse Regression Ge Zhao gzz13@psu.edu Department of Statistics The Pennsylvania State University Outline Background of Sliced Inverse Regression (SIR) Dimension Reduction Definition of SIR Inversed

More information

Simulating Uniform- and Triangular- Based Double Power Method Distributions

Simulating Uniform- and Triangular- Based Double Power Method Distributions Journal of Statistical and Econometric Methods, vol.6, no.1, 2017, 1-44 ISSN: 1792-6602 (print), 1792-6939 (online) Scienpress Ltd, 2017 Simulating Uniform- and Triangular- Based Double Power Method Distributions

More information

Statistical Distribution Assumptions of General Linear Models

Statistical Distribution Assumptions of General Linear Models Statistical Distribution Assumptions of General Linear Models Applied Multilevel Models for Cross Sectional Data Lecture 4 ICPSR Summer Workshop University of Colorado Boulder Lecture 4: Statistical Distributions

More information

Statistical Inference: Estimation and Confidence Intervals Hypothesis Testing

Statistical Inference: Estimation and Confidence Intervals Hypothesis Testing Statistical Inference: Estimation and Confidence Intervals Hypothesis Testing 1 In most statistics problems, we assume that the data have been generated from some unknown probability distribution. We desire

More information

UNIVERSITY OF TORONTO Faculty of Arts and Science

UNIVERSITY OF TORONTO Faculty of Arts and Science UNIVERSITY OF TORONTO Faculty of Arts and Science December 2013 Final Examination STA442H1F/2101HF Methods of Applied Statistics Jerry Brunner Duration - 3 hours Aids: Calculator Model(s): Any calculator

More information

Linear Regression with Time Series Data

Linear Regression with Time Series Data Econometrics 2 Linear Regression with Time Series Data Heino Bohn Nielsen 1of21 Outline (1) The linear regression model, identification and estimation. (2) Assumptions and results: (a) Consistency. (b)

More information

Multiple Regression Analysis. Part III. Multiple Regression Analysis

Multiple Regression Analysis. Part III. Multiple Regression Analysis Part III Multiple Regression Analysis As of Sep 26, 2017 1 Multiple Regression Analysis Estimation Matrix form Goodness-of-Fit R-square Adjusted R-square Expected values of the OLS estimators Irrelevant

More information

Least Mean Squares Regression. Machine Learning Fall 2018

Least Mean Squares Regression. Machine Learning Fall 2018 Least Mean Squares Regression Machine Learning Fall 2018 1 Where are we? Least Squares Method for regression Examples The LMS objective Gradient descent Incremental/stochastic gradient descent Exercises

More information

Introduction to Simple Linear Regression

Introduction to Simple Linear Regression Introduction to Simple Linear Regression Yang Feng http://www.stat.columbia.edu/~yangfeng Yang Feng (Columbia University) Introduction to Simple Linear Regression 1 / 68 About me Faculty in the Department

More information

I T L S. INSTITUTE of TRANSPORT and LOGISTICS STUDIES. Sample optimality in the design of stated choice experiments

I T L S. INSTITUTE of TRANSPORT and LOGISTICS STUDIES. Sample optimality in the design of stated choice experiments I T L S WORKING PAPER ITLS-WP-05-3 Sample optimality in the design of stated choice experiments By John M Rose & Michiel CJ Bliemer July 005 Faculty of Civil Engineering and Geosciences Delft University

More information

Optimal Designs for 2 k Experiments with Binary Response

Optimal Designs for 2 k Experiments with Binary Response 1 / 57 Optimal Designs for 2 k Experiments with Binary Response Dibyen Majumdar Mathematics, Statistics, and Computer Science College of Liberal Arts and Sciences University of Illinois at Chicago Joint

More information

THE MULTIVARIATE LINEAR REGRESSION MODEL

THE MULTIVARIATE LINEAR REGRESSION MODEL THE MULTIVARIATE LINEAR REGRESSION MODEL Why multiple regression analysis? Model with more than 1 independent variable: y 0 1x1 2x2 u It allows : -Controlling for other factors, and get a ceteris paribus

More information

Section 3: Simple Linear Regression

Section 3: Simple Linear Regression Section 3: Simple Linear Regression Carlos M. Carvalho The University of Texas at Austin McCombs School of Business http://faculty.mccombs.utexas.edu/carlos.carvalho/teaching/ 1 Regression: General Introduction

More information

Probability Theory for Machine Learning. Chris Cremer September 2015

Probability Theory for Machine Learning. Chris Cremer September 2015 Probability Theory for Machine Learning Chris Cremer September 2015 Outline Motivation Probability Definitions and Rules Probability Distributions MLE for Gaussian Parameter Estimation MLE and Least Squares

More information

Part I Behavioral Models

Part I Behavioral Models Part I Behavioral Models 2 Properties of Discrete Choice Models 2.1 Overview This chapter describes the features that are common to all discrete choice models. We start by discussing the choice set, which

More information

Multi-Robotic Systems

Multi-Robotic Systems CHAPTER 9 Multi-Robotic Systems The topic of multi-robotic systems is quite popular now. It is believed that such systems can have the following benefits: Improved performance ( winning by numbers ) Distributed

More information