SP Experimental Designs - Theoretical Background and Case Study
Basil Schmid, IVT, ETH Zurich
Measurement and Modeling, FS2016

Outline
1. Introduction
2. Orthogonal and fractional factorial designs
3. Efficient designs
4. Pivot designs
5. Testing a design: A case study
6. Conclusions

Introduction
- Explain how the variation of certain attributes affects the outcome of interest (causal relationship), applying a statistically efficient and effective framework (maximum amount of information with a minimum amount of resources)
- Kuhfeld (1994): The best approach to design creation is to use the computer as a tool along with traditional design skills, not as a substitute for thinking about the problem

A brief history
- While serving as surgeon on HMS Salisbury, James Lind carried out a systematic clinical trial on patients with scurvy (a disease caused by a lack of vitamin C)
- Entry requirements to reduce exogenous variation
- 12 seamen were assigned to 6 treatment groups, each receiving a different diet over a 2-week period
- Other examples: agriculture, marketing, economics
- Sir Ronald Fisher (1935): Experiments are experience carefully planned in advance, and designed to form a secure basis of new knowledge
- Manipulation/variation of (existing) attributes, formation of attribute levels, observation/measurement of outcomes

Experimental design
- In contrast to revealed preference (RP) data, stated preference (SP) data are generated by a systematic and planned design process
- SP data may provide insights into a hypothetical market for which no RP data are available
- Formulation of statistical hypotheses to be tested
- Specification of the number of experimental units (observations) required and the population from which they will be sampled
- Specification of the randomization procedure for assigning the experimental units to the attribute levels: sources of variation among the units are distributed over the entire experiment
- Determination of the statistical analysis that will be performed (discrete choice, multivariate regression, ...)

Orthogonal designs
- x ⊥ y: Two attribute vectors x and y are said to be (strictly) orthogonal if their inner product is zero:
  $\operatorname{cov}(x, y) = E\big[(x - E(x))\,(y - E(y))\big] = 0$
- Correlations between attributes are zero and attribute levels appear equally often in combination with all other attribute levels (balance), so the effects of interest can be estimated efficiently and independently of each other
- Full factorial orthogonal design with 2 attributes x and y with 3 levels each ($3^2 = 9$ possible combinations; orthogonally coded)
- Table columns: choice set, x, y (cell values are not preserved in this transcript)
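
The full factorial described above can be sketched in a few lines of Python. This is only an illustration: the coding of the three levels as -1, 0, 1 is an assumption (the slide's table values are not available), and the helper names `full_factorial` and `covariance` are hypothetical.

```python
from itertools import product

# Assumed orthogonal coding of the three levels (not taken from the slide).
LEVELS = (-1, 0, 1)

def full_factorial(n_attributes, levels=LEVELS):
    """Return every combination of attribute levels as a list of tuples."""
    return list(product(levels, repeat=n_attributes))

def covariance(x, y):
    """Population covariance of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n

design = full_factorial(2)          # 3^2 = 9 choice sets
x = [row[0] for row in design]
y = [row[1] for row in design]

for set_id, (xi, yi) in enumerate(design, start=1):
    print(set_id, xi, yi)
print("cov(x, y) =", covariance(x, y))
```

Every level of x appears once with every level of y, which is why the covariance vanishes.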

Fractional factorial designs
- Full factorial design: Experiment size explodes with an increasing number of attributes and levels, e.g. 10 attributes with 3 levels give $3^{10}$ possible attribute combinations (= degrees of freedom)
- Full factorial designs are, by definition, perfectly orthogonal in all main effects and higher-order interactions
- Use an optimal subset of a full factorial
- Orthogonality can be maintained under the assumption that some effects (often higher-order interactions) are zero
- However, interactions might be highly correlated with main effects:
  $U_i = \alpha X_{tt,i} + \beta X_{tc,i} + \gamma X_{tt,i} X_{tc,i} + \varepsilon_i$ (1)
  $\partial U_i / \partial X_{tc,i} = \beta + \gamma X_{tt,i}$ (2)

Fractional factorial designs
- Assume a full factorial design with 10 attributes with 3 levels each ($3^{10}$ combinations):
- To estimate all 10 main effects, one needs at least 20 choice sets (10*(3-1) degrees of freedom)
- Hence, 45 two-way interactions ((10-1)*10/2) as well as many higher-order interactions (… degrees of freedom) are ignored
- Practical considerations:
- Main effects typically account for 70-90% of explained variance, two-way interactions for 5-15%
- Limit the number of attribute levels: often 2-5 levels
- Limit the number of attributes: often 6-16 attributes
- Only allow some two-way interactions to be different from 0 (e.g. travel time x travel cost)
- Block design: divide the fractional factorial into groups with the same number of choice sets in a statistically efficient way
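
The counts quoted above follow from simple arithmetic, shown here as a small Python check. The variable names are illustrative only.

```python
from math import comb

n_attributes, n_levels = 10, 3

# Size of the full factorial: every level of every attribute combined.
full_factorial_size = n_levels ** n_attributes

# Each attribute with 3 levels needs (3 - 1) degrees of freedom for its main effect.
main_effect_df = n_attributes * (n_levels - 1)

# Number of distinct attribute pairs, i.e. two-way interactions: (10 - 1) * 10 / 2.
two_way_interactions = comb(n_attributes, 2)

print(full_factorial_size, main_effect_df, two_way_interactions)
```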

Block designs
- Typically, a respondent receives between 6 and 15 choice sets (response burden and cognitive fatigue)
- Even fractional designs often include more choice sets than the researcher wants to assign to each respondent
- Correlation between blocks and attributes should be minimized. Otherwise, a respondent may receive a block that contains, e.g., only high travel times
- Common mistake: Assign the first x choice sets to block b
- Orthogonal blocking: Block number is uncorrelated with attributes
- Good news: Most software automatically assigns choice sets to the number of blocks specified by the researcher
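
A minimal sketch of why sequential blocking is a mistake, assuming a sorted 3 x 3 full factorial with levels coded -1, 0, 1 and three blocks. The Latin-square style assignment below is one simple way to obtain blocks that are uncorrelated with the attributes; it is not necessarily what design software does internally.

```python
from itertools import product

levels = (-1, 0, 1)
design = list(product(levels, repeat=2))   # 9 choice sets, sorted by x, then y

def correlation(u, v):
    """Pearson correlation of two equally long sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    var_u = sum((a - mean_u) ** 2 for a in u)
    var_v = sum((b - mean_v) ** 2 for b in v)
    return cov / (var_u * var_v) ** 0.5

x = [row[0] for row in design]
y = [row[1] for row in design]

# Common mistake: the first three choice sets go to block 0, the next three to block 1, ...
sequential_blocks = [i // 3 for i in range(len(design))]

# Latin-square style assignment: each block sees every level of x and of y once.
latin_blocks = [(levels.index(a) + levels.index(b)) % 3 for a, b in design]

print("sequential:", correlation(sequential_blocks, x), correlation(sequential_blocks, y))
print("latin:     ", correlation(latin_blocks, x), correlation(latin_blocks, y))
```

With sequential assignment the block number is perfectly correlated with x, so one respondent would see only the lowest level of x and another only the highest.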

Some important definitions
- Unlabeled experiment: A choice experiment where alternatives have no intrinsic meaning (e.g. route 1 vs. route 2)
- Labeled experiment: A choice experiment where the alternatives are labeled. Model parameters can be estimated for each alternative independently (e.g. car vs. train vs. bus)
- Generic effect: The same model parameter for all alternatives in the utility function (e.g. travel cost)
- Alternative-specific effect: Different model parameters for each alternative in the utility function (e.g. travel time car vs. travel time bus vs. travel time train)
- Own vs. cross effect: If cross effects are present, the IID error assumption is violated

[Figure: Example of an unlabeled experiment]

[Figure: Example of a labeled experiment]

Orthogonal fraction of full factorial (example)
- 4 attributes with 3 levels, 3 unlabeled choice alternatives
- Possible attribute level combinations: …
- Minimum of 8 choice sets to estimate all 4 (generic) main effects
- Smallest orthogonal fraction = … choice sets
- Table columns: Set, TT1, TC1, AC1, QU1, TT2, TC2, AC2, QU2, TT3, TC3, AC3, QU3 (cell values are not preserved in this transcript)
- Legend: redundant alternatives, weakly dominant alternatives, dominant alternatives

Problems with orthogonal designs
- Reasons for moving away from orthogonal designs (OD):
- For some problems, an OD does not exist (e.g. for a limited number of choice sets predefined by the researcher); in general, ODs require a larger sample size and lead to larger choice sets
- Behaviorally plausible choice scenarios: ODs may include dominant/weakly dominant/redundant choice sets, which yield no information gain
- When working with preference constraints, orthogonality cannot be maintained
- Need for more sophisticated approaches: efficient experimental designs

Efficient designs: Some basic concepts
- Efficiency: For given design requirements (violating strict orthogonality), minimize the variances of the parameter estimates, which are taken from the variance-covariance matrix of a design
- D-efficient GLM designs: No prior information about the parameter values (signs, magnitude). Efficiency converges towards orthogonality
- D-efficient MNL designs: Efficiency measures depend on the unknown parameter values one wants to estimate
- In many cases, one has some sound knowledge about the sign and relative values of the design attributes (e.g. travel cost and travel time both have a negative effect on utility, leading to a positive value of time)

Example 1
- Orthogonal design with travel time and travel cost (2 alternatives, 3 levels): Quadrants 1 and 3 dominate quadrants 2 and 4

Example 2
- Efficient design with travel time and travel cost (2 alternatives, 3 levels): Elimination of dominant alternatives
- [Figure: Cost_MIV - Cost_PT against Time_MIV - Time_PT, with WLS predictions]

Efficient designs: Some basic concepts
- Main question: How can the researcher make use of prior information in order to increase the efficiency (minimize standard errors of the attribute coefficients, i.e. more robust results) and reduce the sample size requirements?
- Example 1: Orthogonal designs make no use of prior information; time and cost attributes are uncorrelated
- Example 2: An efficient design with no dominant alternatives automatically leads to a negative correlation between time and cost. This forces respondents to trade off and increases the amount of preference information for a given sample size
- D-efficient MNL approach: Use expected parameter distributions with $\mu_k$ and $\sigma_k$ to calculate the optimal design

D-efficient GLM designs
- Find a design matrix Z, with rows selected from a Q x k matrix X where $n \le Q$, that is optimal in some sense. Z is an n x k matrix, where k is the number of parameters and n is the number of choice sets in the actual experiment
- Row-based Fedorov algorithm (R package AlgDesign): Selection from a predefined candidate set (after exclusion of dominant/redundant alternatives, etc.)
- Optimization criterion: Maximize the k-th root of the determinant of the normalized dispersion matrix $M \propto \Omega^{-1}$
- Assumption: Observations are independent and error terms are normally distributed
  $\max_{Z}\; D\text{-Efficiency} = \det\!\left(\frac{Z'Z}{n}\right)^{1/k}$ (3)
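
As a rough illustration of criterion (3), the following Python sketch evaluates the D-efficiency of two small candidate designs with an intercept and two attributes coded -1, 0, 1. It only computes the criterion; it is not the Fedorov exchange algorithm and does not use AlgDesign. The function names and the two example designs are assumptions made for this sketch.

```python
def determinant(matrix):
    """Determinant via Gaussian elimination with partial pivoting."""
    a = [[float(v) for v in row] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0                      # singular matrix
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det                      # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det

def d_efficiency(design_rows):
    """k-th root of det(Z'Z / n) for an n x k design matrix Z."""
    n, k = len(design_rows), len(design_rows[0])
    information = [
        [sum(row[i] * row[j] for row in design_rows) / n for j in range(k)]
        for i in range(k)
    ]
    return max(determinant(information), 0.0) ** (1.0 / k)

# Columns: intercept, x, y (hypothetical designs for illustration).
corner_design = [(1, x, y) for x in (-1, 1) for y in (-1, 1)]
correlated_design = [(1, -1, -1), (1, 0, 0), (1, 1, 1), (1, 1, 0)]

print("corner design:    ", d_efficiency(corner_design))
print("correlated design:", d_efficiency(correlated_design))
```

The design in which x and y move together scores clearly lower, which is the behavior a row-exchange search would exploit when picking rows from the candidate set.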

D-efficient MNL designs
- The asymptotic variance-covariance (AVC) matrix for discrete choice models depends on the true parameter values
- Starting point: Need to make assumptions about the model, utility functions and parameter values
- Design matrices Z are created using a column-based swapping algorithm: Selection of attribute levels over all choice situations for each attribute
- Optimization criterion: Minimize the k-th root of the determinant of the AVC matrix $\Omega$
  $\min_{Z}\; D\text{-Error} = \det\big(\Omega(Z, \beta)\big)^{1/k}$ (4)
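
To make criterion (4) concrete, here is a small Python sketch that computes the D-error of an MNL design with two generic attributes (time and cost). It assumes the standard MNL Fisher information for a single respondent, so that $\Omega$ is its inverse and $\det(\Omega)^{1/k} = \det(I)^{-1/k}$. The prior values and both example designs are hypothetical, and the sketch does not reproduce any particular software.

```python
from math import exp

def mnl_probabilities(alternatives, beta):
    """Choice probabilities of a multinomial logit for one choice set."""
    utilities = [sum(b * x for b, x in zip(beta, alt)) for alt in alternatives]
    max_u = max(utilities)                       # subtract max for numerical stability
    weights = [exp(u - max_u) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]

def d_error_two_params(design, beta):
    """D-error for k = 2 generic parameters.

    design: list of choice sets, each a list of (time, cost) tuples.
    """
    info = [[0.0, 0.0], [0.0, 0.0]]
    for alternatives in design:
        probs = mnl_probabilities(alternatives, beta)
        mean = [sum(p * alt[i] for p, alt in zip(probs, alternatives)) for i in range(2)]
        for p, alt in zip(probs, alternatives):
            dev = [alt[i] - mean[i] for i in range(2)]
            for i in range(2):
                for j in range(2):
                    info[i][j] += p * dev[i] * dev[j]
    det_info = info[0][0] * info[1][1] - info[0][1] * info[1][0]
    if det_info <= 0.0:
        return float("inf")                      # parameters not identified
    return det_info ** (-1.0 / 2.0)              # det(Omega)^(1/k) with k = 2

# Hypothetical priors: utility per minute and per CHF.
prior = (-0.1, -0.5)

# Alternative 2 is fixed at (30 min, 4 CHF); alternative 1 varies.
trade_off_design = [[(20, 6), (30, 4)], [(40, 3), (30, 4)], [(25, 7), (30, 4)]]
dominated_design = [[(20, 2), (30, 4)], [(40, 5), (30, 4)], [(25, 1), (30, 4)]]

print("trade-off design:", d_error_two_params(trade_off_design, prior))
print("dominated design:", d_error_two_params(dominated_design, prior))
```

Under these priors the design with dominant alternatives has the larger D-error, because its choice probabilities are close to 0 or 1 and each choice carries less information.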

Some remarks on D-efficient designs
- A large number of different algorithms and optimization criteria exist (focus on D-efficiency as the most common approach in the literature)
- Eliminating undesirable choice sets has to be done manually by using preference constraints
- GLM designs: Can be created in the open-source software R. Robust towards misspecification of priors and often as efficient as MNL designs
- MNL designs: Created in the commercial software NGENE. Easier to implement, more assistance and possibilities. Priors usually come from the literature, intuition and pre-test studies. Misspecification can be minimized by assuming a random distribution of priors (Bayesian approach)

An example of a design strategy
- 9 attributes with 3 levels ($3^9$ full factorial), 2 labeled alternatives, 32 choice sets with 4 blocks, estimation of all linear main effects, quadratic effects and 6 selected two-way interactions (… degrees of freedom)
- Polynomial and interaction effects have to be specified in the utility function of a design
- No weakly dominant alternatives (i.e. all attribute values of one alternative in choice set s are strictly better or equal: $a_1 \ge a_2$ or $a_1 \le a_2$)
- No strongly dominant travel time relative to travel cost alternatives or vice versa (i.e. $a_{1,cost} < a_{2,cost}$ and $a_{1,time} < a_{2,time}$, or vice versa)
- Weak priors to determine the direction of expected effects
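
The dominance conditions above can be expressed as a simple filter. The sketch below assumes that every attribute is a "cost" for which lower values are better (e.g. travel time and travel cost); the function names are hypothetical.

```python
def weakly_dominates(a, b):
    """True if alternative a is at least as good as b on every attribute.

    Assumes lower values are better for all attributes.
    """
    return all(x <= y for x, y in zip(a, b))

def is_undesirable(choice_set):
    """True if any alternative weakly dominates another one in the choice set.

    Identical alternatives (redundant choice sets) are flagged as well.
    """
    return any(
        weakly_dominates(a, b)
        for i, a in enumerate(choice_set)
        for j, b in enumerate(choice_set)
        if i != j
    )

# Alternatives as (travel time in min, travel cost in CHF); values are made up.
trade_off_set = [(20, 5), (30, 3)]     # faster but more expensive vs. slower but cheaper
dominated_set = [(20, 3), (30, 5)]     # first alternative is faster and cheaper
redundant_set = [(20, 3), (20, 3)]     # identical alternatives

candidates = [trade_off_set, dominated_set, redundant_set]
kept = [cs for cs in candidates if not is_undesirable(cs)]
print(kept)
```

Such a filter would typically be applied to the candidate set before the design search, or expressed as preference constraints in the design software.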

Efficient design (example)
- 4 attributes with 3 levels, 3 unlabeled choice alternatives
- Possible attribute level combinations: …
- Minimum of 8 choice sets to estimate all 4 (generic) main effects
- Weak priors, exclusion of all dominant choice sets
- Free choice about the number of choice sets (number of choice sets > df)
- Table columns: Set, TT1, TC1, AC1, QU1, TT2, TC2, AC2, QU2, TT3, TC3, AC3, QU3 (cell values are not preserved in this transcript)
- Result: no more dominant/weakly dominant/redundant choice sets

Some general remarks
- Experimental design creation is a research topic on its own (Rose and Bliemer, 2009; Quan et al., 2011)
- If priors are misspecified, one might run into trouble. Be careful when using priors!
- Use attributes, values and trade-off variations that are plausible
- Make sure that there are some overlapping values of generic attributes between alternatives (pivot designs)
- Order effects: Randomize the order of alternatives across respondents in the questionnaire
- Carefully introduce respondents to the (hypothetical) scenario and explain the attributes you are presenting to them

Pivot designs
- It is preferable to base variations around values for observed behavior (state of the art in transportation research):
- Calculate the design with placeholder values (e.g. 1, 2, 3) and replace them by relative changes (e.g. 0.7, 1.1, 1.5)
- Combination of RP data with variations given by the design: respondents can better identify with the presented choice scenarios; much more variation in the attribute levels
- Possible to include one reference alternative in the choice sets (e.g. bike travel time, whose value is not varied)
- Problems:
- If reference values are (highly) dominant, the respondents will more likely choose the respective alternative (only in labeled experiments)
- Correlation between attributes; skewness
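
The placeholder replacement can be sketched as follows. The mapping 1, 2, 3 to 0.7, 1.1, 1.5 is the example given above; the reference trip (30 min, 4 CHF) and the function name `pivot_choice_set` are assumptions for illustration.

```python
# Placeholder level -> relative change applied to the respondent's reference value.
RELATIVE_CHANGES = {1: 0.7, 2: 1.1, 3: 1.5}

def pivot_choice_set(placeholder_levels, reference_values, changes=RELATIVE_CHANGES):
    """Turn design placeholders into attribute values shown to a respondent."""
    return {
        attribute: round(reference_values[attribute] * changes[level], 2)
        for attribute, level in placeholder_levels.items()
    }

# Hypothetical reported (RP) trip of one respondent.
reference_trip = {"time_min": 30.0, "cost_chf": 4.0}

# One alternative of one choice set as produced by the design (placeholder levels).
design_row = {"time_min": 1, "cost_chf": 3}

print(pivot_choice_set(design_row, reference_trip))
```

Because every respondent has a different reference trip, the same design row produces different attribute values across respondents, which is the source of the additional variation mentioned above.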

Pivot designs: Trade-off distribution
- Example where MIV is often cheaper and faster than PT: modification of reference values needed!

Testing a design: A case study
- Once you have your design, you should test its performance in estimating the coefficients of interest, based on a simulation of a more or less hypothetical population
- Define priors for the attribute weights of the utility function based on recent similar studies
- Simulate the error structure (GEV) for the utility function, taking into account the panel structure of the designs
- Calculate individual utilities and determine the chosen alternatives for each simulated subject
- Estimate the parameters for the simulated data and compare the results with the a-priori assumptions

Testing a design: Pivot approach
- Experimental design: 9 attributes with 3 levels ($3^9$ full factorial), 2 labeled alternatives, 32 choice sets (8 per subject)
- Reference values taken from a Swiss mode choice experiment:
- Total travel time: for the PT alternative = travel time without access and egress time; for the MIV alternative = travel time + parking search time
- Total travel cost: for the PT alternative = ticket price; for the MIV alternative = fuel cost + parking cost
- Number of transfers: PT alternative only
- Attributes and units of their effect codes (code values are not preserved in this transcript):
  Travel time (MIV and PT) [%]
  Travel cost (MIV and PT) [%]
  Delay prob. (MIV and PT) [%]
  Walking / waiting time (MIV and PT) [min.]
  Number of transfers (PT) [#]

Testing a design: A-priori coefficients
- Prior values for the individual weights of attribute k, $\beta_{ik} \sim N(\mu_k, \sigma_k)$, and alternative-specific constants are simulated based on results obtained from the linear model in the BMVI Zeitkostenstudie (Axhausen et al., 2014)
- For each simulated individual i, the same $\beta_{ik}$ is used over all 8 choice sets, representing the panel structure of the experiment
- Coefficients (mean and SD values are not preserved in this transcript):
  Coefficient            Type
  ASC_MIV                Alternative-specific
  β_time,MIV             Alternative-specific
  β_time,PT              Alternative-specific
  β_cost                 Generic
  β_delay,MIV            Alternative-specific
  β_delay,PT             Alternative-specific
  β_walk                 Generic
  β_transfers,PT         Alternative-specific
- VOT MIV: 15.0 CHF/h; VOT PT: 14.2 CHF/h
- Number of simulated coefficient vectors $\beta_{ik}$: 400
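
Drawing the individual coefficients and deriving a value of time can be sketched as below. The means and standard deviations are hypothetical placeholders (the slide's values are not available); they are chosen so that the mean ratio reproduces a VOT of 15 CHF/h, under the assumption that time is measured in minutes and cost in CHF.

```python
import random

# Hypothetical priors (mean, sd); NOT the values used in the case study.
PRIORS = {
    "time_miv": (-0.05, 0.01),   # utility per minute
    "cost": (-0.20, 0.04),       # utility per CHF
}

def draw_coefficients(priors, n_individuals, seed=0):
    """Draw one coefficient vector per simulated individual from N(mu_k, sigma_k)."""
    rng = random.Random(seed)
    return [
        {name: rng.gauss(mu, sd) for name, (mu, sd) in priors.items()}
        for _ in range(n_individuals)
    ]

def value_of_time(beta_time, beta_cost):
    """Value of time in CHF per hour for time in minutes and cost in CHF."""
    return 60.0 * beta_time / beta_cost

# The same vector would be reused for all 8 choice sets of an individual (panel structure).
coefficients = draw_coefficients(PRIORS, n_individuals=400)

print(len(coefficients), "coefficient vectors")
print("VOT at the prior means:", value_of_time(-0.05, -0.20), "CHF/h")
```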

Testing a design: Utility function
- The random utility model (RUM) framework assumes that in each choice set s, individual i perceives utility $U_{ijs}$ for each alternative j among the full set of alternatives J (MIV and PT), given the attributes $X_{ijs}$, and chooses the one that maximizes utility. $U_{ijs}$ has an observed component $V_{ijs}$ and an unobserved component $\varepsilon_{ijs}$:
  $U_{ijs} = V_{ijs} + \varepsilon_{ijs}$ (5)
  where
  $\varepsilon_{ijs} \sim \mathrm{GEV}(0, 1, 0)$ (6)
  and
  $V_{ijs} = \sum_{k=1}^{K} \beta_{ik} X_{ijsk}$ (7)
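
Testing a design: Choice simulation
- The chosen alternative $\text{choice}_{is}$ is determined as follows:
  $\text{choice}_{is} = \begin{cases} \text{MIV} & \text{if } U_{is,\text{MIV}} > U_{is,\text{PT}} \\ \text{PT} & \text{otherwise} \end{cases}$ (8)
- Snippet of a simulated discrete choice data set, with one row per alternative (MIV, PT) and the columns id, set, alt, choice, block, time (min.), cost (CHF), walk (min.), delay (min.), transfers (#) (cell values are not preserved in this transcript)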
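
A minimal sketch of the choice rule, assuming linear-in-parameters utilities and standard Gumbel errors (the GEV distribution with location 0, scale 1 and shape 0). The coefficient values and the single choice set are hypothetical.

```python
import math
import random

def gumbel_draw(rng):
    """Standard Gumbel draw via the inverse CDF."""
    u = rng.random()
    while u == 0.0:                 # avoid log(0)
        u = rng.random()
    return -math.log(-math.log(u))

def systematic_utility(beta, attributes):
    """Observed utility: sum over attributes of beta_k * x_k."""
    return sum(beta[name] * value for name, value in attributes.items())

def simulate_choice(beta, choice_set, rng):
    """Return 'MIV' if its total utility exceeds that of PT, else 'PT'."""
    utilities = {
        alt: systematic_utility(beta, attrs) + gumbel_draw(rng)
        for alt, attrs in choice_set.items()
    }
    return "MIV" if utilities["MIV"] > utilities["PT"] else "PT"

# Hypothetical coefficients and one hypothetical choice set.
beta = {"time": -0.05, "cost": -0.20}
choice_set = {
    "MIV": {"time": 20.0, "cost": 5.0},
    "PT": {"time": 60.0, "cost": 8.0},
}

rng = random.Random(42)
choices = [simulate_choice(beta, choice_set, rng) for _ in range(2000)]
share_miv = choices.count("MIV") / len(choices)
print("simulated MIV share:", share_miv)
```

With Gumbel errors the simulated share should be close to the logit probability $1 / (1 + e^{-(V_{MIV} - V_{PT})})$, which gives a quick sanity check of the simulation.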

Testing a design: Estimation
- For given randomly drawn subsets of RP data, simulated $\beta_{ik}$ coefficient vectors and simulated error terms $\varepsilon_{ijs}$, the models are estimated for 3 different designs. This process is repeated 2000 times to get insights into the distributions of coefficients (robustness), variances (precision) and values of time
- The between-design differences of $E(\beta_k)$ and $E(SE_k)$ with respect to the a-priori parameters are small
- Table layout (cell values are not preserved in this transcript): for each design approach (GLM; MNL with $\beta = 0$; MNL with $\beta_k$) the columns $E(\beta_k)$ and $E(SE_k)$; rows: ASC_MIV *, β_time,MIV, β_time,PT, β_cost, β_delay,MIV *, β_delay,PT *, β_walk *, β_transfers,PT, VOT MIV, VOT PT

Conclusions
- No substantial differences between the different design approaches: Designs are robust and reproduce the a-priori values well
- From a behavioral perspective, one should always exclude dominant and weakly dominant alternatives!
- Personal suggestion: Create an efficient design by...
  carefully thinking about your research question and aims
  assigning about 8 choice sets to each respondent and using a block design (total number of choice sets: 1.5 x df)
  using the MNL approach with zero (or weak) priors, excluding undesired choice sets by manually setting preference conditions
  updating your design after a pre-test study
e author and the promoter give permission to consult this master dissertation and to copy it or parts of it for personal use. Each other use falls under the restrictions of the copyright, in particular
More informationHypothesis testing Goodness of fit Multicollinearity Prediction. Applied Statistics. Lecturer: Serena Arima
Applied Statistics Lecturer: Serena Arima Hypothesis testing for the linear model Under the Gauss-Markov assumptions and the normality of the error terms, we saw that β N(β, σ 2 (X X ) 1 ) and hence s
More informationEconometrics (60 points) as the multivariate regression of Y on X 1 and X 2? [6 points]
Econometrics (60 points) Question 7: Short Answers (30 points) Answer parts 1-6 with a brief explanation. 1. Suppose the model of interest is Y i = 0 + 1 X 1i + 2 X 2i + u i, where E(u X)=0 and E(u 2 X)=
More informationCalculating indicators with PythonBiogeme
Calculating indicators with PythonBiogeme Michel Bierlaire May 17, 2017 Report TRANSP-OR 170517 Transport and Mobility Laboratory School of Architecture, Civil and Environmental Engineering Ecole Polytechnique
More informationCourse: ESO-209 Home Work: 1 Instructor: Debasis Kundu
Home Work: 1 1. Describe the sample space when a coin is tossed (a) once, (b) three times, (c) n times, (d) an infinite number of times. 2. A coin is tossed until for the first time the same result appear
More informationThe legacy of Sir Ronald A. Fisher. Fisher s three fundamental principles: local control, replication, and randomization.
1 Chapter 1: Research Design Principles The legacy of Sir Ronald A. Fisher. Fisher s three fundamental principles: local control, replication, and randomization. 2 Chapter 2: Completely Randomized Design
More informationData Mining. Dimensionality reduction. Hamid Beigy. Sharif University of Technology. Fall 1395
Data Mining Dimensionality reduction Hamid Beigy Sharif University of Technology Fall 1395 Hamid Beigy (Sharif University of Technology) Data Mining Fall 1395 1 / 42 Outline 1 Introduction 2 Feature selection
More informationFinal Exam. Problem Score Problem Score 1 /10 8 /10 2 /10 9 /10 3 /10 10 /10 4 /10 11 /10 5 /10 12 /10 6 /10 13 /10 7 /10 Total /130
EE103/CME103: Introduction to Matrix Methods December 9 2015 S. Boyd Final Exam You may not use any books, notes, or computer programs (e.g., Julia). Throughout this exam we use standard mathematical notation;
More informationStatistical Inference of Covariate-Adjusted Randomized Experiments
1 Statistical Inference of Covariate-Adjusted Randomized Experiments Feifang Hu Department of Statistics George Washington University Joint research with Wei Ma, Yichen Qin and Yang Li Email: feifang@gwu.edu
More informationStatistical Methods. Missing Data snijders/sm.htm. Tom A.B. Snijders. November, University of Oxford 1 / 23
1 / 23 Statistical Methods Missing Data http://www.stats.ox.ac.uk/ snijders/sm.htm Tom A.B. Snijders University of Oxford November, 2011 2 / 23 Literature: Joseph L. Schafer and John W. Graham, Missing
More informationExtending causal inferences from a randomized trial to a target population
Extending causal inferences from a randomized trial to a target population Issa Dahabreh Center for Evidence Synthesis in Health, Brown University issa dahabreh@brown.edu January 16, 2019 Issa Dahabreh
More informationX t = a t + r t, (7.1)
Chapter 7 State Space Models 71 Introduction State Space models, developed over the past 10 20 years, are alternative models for time series They include both the ARIMA models of Chapters 3 6 and the Classical
More informationAdvanced topics from statistics
Advanced topics from statistics Anders Ringgaard Kristensen Advanced Herd Management Slide 1 Outline Covariance and correlation Random vectors and multivariate distributions The multinomial distribution
More informationAn Overview of Choice Models
An Overview of Choice Models Dilan Görür Gatsby Computational Neuroscience Unit University College London May 08, 2009 Machine Learning II 1 / 31 Outline 1 Overview Terminology and Notation Economic vs
More informationChapter 3 ANALYSIS OF RESPONSE PROFILES
Chapter 3 ANALYSIS OF RESPONSE PROFILES 78 31 Introduction In this chapter we present a method for analysing longitudinal data that imposes minimal structure or restrictions on the mean responses over
More informationStatistics for Managers Using Microsoft Excel Chapter 9 Two Sample Tests With Numerical Data
Statistics for Managers Using Microsoft Excel Chapter 9 Two Sample Tests With Numerical Data 999 Prentice-Hall, Inc. Chap. 9 - Chapter Topics Comparing Two Independent Samples: Z Test for the Difference
More informationLecture 11. Multivariate Normal theory
10. Lecture 11. Multivariate Normal theory Lecture 11. Multivariate Normal theory 1 (1 1) 11. Multivariate Normal theory 11.1. Properties of means and covariances of vectors Properties of means and covariances
More informationECON3327: Financial Econometrics, Spring 2016
ECON3327: Financial Econometrics, Spring 2016 Wooldridge, Introductory Econometrics (5th ed, 2012) Chapter 11: OLS with time series data Stationary and weakly dependent time series The notion of a stationary
More informationBIOL Biometry LAB 6 - SINGLE FACTOR ANOVA and MULTIPLE COMPARISON PROCEDURES
BIOL 458 - Biometry LAB 6 - SINGLE FACTOR ANOVA and MULTIPLE COMPARISON PROCEDURES PART 1: INTRODUCTION TO ANOVA Purpose of ANOVA Analysis of Variance (ANOVA) is an extremely useful statistical method
More informationLogistic Regression: Regression with a Binary Dependent Variable
Logistic Regression: Regression with a Binary Dependent Variable LEARNING OBJECTIVES Upon completing this chapter, you should be able to do the following: State the circumstances under which logistic regression
More informationLinear Classification. CSE 6363 Machine Learning Vassilis Athitsos Computer Science and Engineering Department University of Texas at Arlington
Linear Classification CSE 6363 Machine Learning Vassilis Athitsos Computer Science and Engineering Department University of Texas at Arlington 1 Example of Linear Classification Red points: patterns belonging
More informationMachine Learning for OR & FE
Machine Learning for OR & FE Regression II: Regularization and Shrinkage Methods Martin Haugh Department of Industrial Engineering and Operations Research Columbia University Email: martin.b.haugh@gmail.com
More informationEstimating and Testing the US Model 8.1 Introduction
8 Estimating and Testing the US Model 8.1 Introduction The previous chapter discussed techniques for estimating and testing complete models, and this chapter applies these techniques to the US model. For
More informationUsing Estimating Equations for Spatially Correlated A
Using Estimating Equations for Spatially Correlated Areal Data December 8, 2009 Introduction GEEs Spatial Estimating Equations Implementation Simulation Conclusion Typical Problem Assess the relationship
More informationStructural Reliability
Structural Reliability Thuong Van DANG May 28, 2018 1 / 41 2 / 41 Introduction to Structural Reliability Concept of Limit State and Reliability Review of Probability Theory First Order Second Moment Method
More informationSliced Inverse Regression
Sliced Inverse Regression Ge Zhao gzz13@psu.edu Department of Statistics The Pennsylvania State University Outline Background of Sliced Inverse Regression (SIR) Dimension Reduction Definition of SIR Inversed
More informationSimulating Uniform- and Triangular- Based Double Power Method Distributions
Journal of Statistical and Econometric Methods, vol.6, no.1, 2017, 1-44 ISSN: 1792-6602 (print), 1792-6939 (online) Scienpress Ltd, 2017 Simulating Uniform- and Triangular- Based Double Power Method Distributions
More informationStatistical Distribution Assumptions of General Linear Models
Statistical Distribution Assumptions of General Linear Models Applied Multilevel Models for Cross Sectional Data Lecture 4 ICPSR Summer Workshop University of Colorado Boulder Lecture 4: Statistical Distributions
More informationStatistical Inference: Estimation and Confidence Intervals Hypothesis Testing
Statistical Inference: Estimation and Confidence Intervals Hypothesis Testing 1 In most statistics problems, we assume that the data have been generated from some unknown probability distribution. We desire
More informationUNIVERSITY OF TORONTO Faculty of Arts and Science
UNIVERSITY OF TORONTO Faculty of Arts and Science December 2013 Final Examination STA442H1F/2101HF Methods of Applied Statistics Jerry Brunner Duration - 3 hours Aids: Calculator Model(s): Any calculator
More informationLinear Regression with Time Series Data
Econometrics 2 Linear Regression with Time Series Data Heino Bohn Nielsen 1of21 Outline (1) The linear regression model, identification and estimation. (2) Assumptions and results: (a) Consistency. (b)
More informationMultiple Regression Analysis. Part III. Multiple Regression Analysis
Part III Multiple Regression Analysis As of Sep 26, 2017 1 Multiple Regression Analysis Estimation Matrix form Goodness-of-Fit R-square Adjusted R-square Expected values of the OLS estimators Irrelevant
More informationLeast Mean Squares Regression. Machine Learning Fall 2018
Least Mean Squares Regression Machine Learning Fall 2018 1 Where are we? Least Squares Method for regression Examples The LMS objective Gradient descent Incremental/stochastic gradient descent Exercises
More informationIntroduction to Simple Linear Regression
Introduction to Simple Linear Regression Yang Feng http://www.stat.columbia.edu/~yangfeng Yang Feng (Columbia University) Introduction to Simple Linear Regression 1 / 68 About me Faculty in the Department
More informationI T L S. INSTITUTE of TRANSPORT and LOGISTICS STUDIES. Sample optimality in the design of stated choice experiments
I T L S WORKING PAPER ITLS-WP-05-3 Sample optimality in the design of stated choice experiments By John M Rose & Michiel CJ Bliemer July 005 Faculty of Civil Engineering and Geosciences Delft University
More informationOptimal Designs for 2 k Experiments with Binary Response
1 / 57 Optimal Designs for 2 k Experiments with Binary Response Dibyen Majumdar Mathematics, Statistics, and Computer Science College of Liberal Arts and Sciences University of Illinois at Chicago Joint
More informationTHE MULTIVARIATE LINEAR REGRESSION MODEL
THE MULTIVARIATE LINEAR REGRESSION MODEL Why multiple regression analysis? Model with more than 1 independent variable: y 0 1x1 2x2 u It allows : -Controlling for other factors, and get a ceteris paribus
More informationSection 3: Simple Linear Regression
Section 3: Simple Linear Regression Carlos M. Carvalho The University of Texas at Austin McCombs School of Business http://faculty.mccombs.utexas.edu/carlos.carvalho/teaching/ 1 Regression: General Introduction
More informationProbability Theory for Machine Learning. Chris Cremer September 2015
Probability Theory for Machine Learning Chris Cremer September 2015 Outline Motivation Probability Definitions and Rules Probability Distributions MLE for Gaussian Parameter Estimation MLE and Least Squares
More informationPart I Behavioral Models
Part I Behavioral Models 2 Properties of Discrete Choice Models 2.1 Overview This chapter describes the features that are common to all discrete choice models. We start by discussing the choice set, which
More informationMulti-Robotic Systems
CHAPTER 9 Multi-Robotic Systems The topic of multi-robotic systems is quite popular now. It is believed that such systems can have the following benefits: Improved performance ( winning by numbers ) Distributed
More information