Train the model with a subset of the data. Test the model on the remaining data (the validation set). What data to choose for training vs. test?

2 Train the model with a subset of the data. Test the model on the remaining data (the validation set). What data to choose for training vs. test? With time-series data, it is natural to hold out the last year (or time period) of the data, to simulate predicting the future based on all past data. In most settings, however, we'll randomly select our training/test sets.
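The two ways of choosing a test set described above can be sketched as follows. This is a minimal illustration in plain Python rather than the R used in the course; the helper names `random_split` and `time_split` and the toy (year, value) rows are assumptions made for the example, not part of the slides.

```python
import random

def random_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of rows and split it into (train, test)."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def time_split(rows, time_key, n_holdout_periods=1):
    """Hold out the most recent period(s) as the test set."""
    periods = sorted({time_key(r) for r in rows})
    held_out = set(periods[-n_holdout_periods:])
    train = [r for r in rows if time_key(r) not in held_out]
    test = [r for r in rows if time_key(r) in held_out]
    return train, test

# Toy data (made up): 4 rows per year for the years 2015 to 2019.
rows = [(2015 + i // 4, float(i)) for i in range(20)]

train_r, test_r = random_split(rows, test_fraction=0.25, seed=1)
train_t, test_t = time_split(rows, time_key=lambda r: r[0])

print(len(train_r), len(test_r))          # sizes of the random split
print(sorted({r[0] for r in test_t}))     # years held out by the time split
```

The time-based split never lets the model see the held-out period, which is what makes it a simulation of forecasting; the random split makes no such guarantee.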

3 Cross-validation is a class of methods that perform many training/test splits and average over all the runs. In 5-fold cross-validation, for example, the data gives 5 test sets and therefore 5 estimates of MSE. The 5-fold CV estimate is obtained by averaging these values.

4 Split the data up into K folds. Iteratively leave fold k out of the training data and use it to test. The more folds, the smaller each testing set is (more training data), but the more times we need to run the estimation procedure. A rule of thumb of 5 to 10 folds is often used in practice. This can be done with a simple for loop in R. For generalized linear models, the cv.glm() function (from the boot package) can be used to perform k-fold cross-validation. For example, one can loop over 10 possible polynomial orders and compute the 10-fold cross-validated error at each step.
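The loop over polynomial orders can be sketched as below. The slides use R and cv.glm(); this is a stand-in written in plain Python so that it is self-contained, and it is not the code from the original slide. The helpers `polyfit`, `predict` and `kfold_mse`, the synthetic noisy-quadratic data, and the choice of solving the normal equations directly are all assumptions made for illustration.

```python
import random

def polyfit(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations.

    Returns coef such that coef[k] multiplies x**k. Adequate for small
    degrees on inputs scaled to roughly [-1, 1].
    """
    m = degree + 1
    # Augmented normal-equation matrix [X'X | X'y].
    A = [[sum(x ** (i + j) for x in xs) for j in range(m)]
         + [sum(y * x ** i for x, y in zip(xs, ys))]
         for i in range(m)]
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(col + 1, m):
            f = A[r][col] / A[col][col]
            for c in range(col, m + 1):
                A[r][c] -= f * A[col][c]
    # Back substitution.
    coef = [0.0] * m
    for i in reversed(range(m)):
        rest = sum(A[i][j] * coef[j] for j in range(i + 1, m))
        coef[i] = (A[i][m] - rest) / A[i][i]
    return coef

def predict(coef, x):
    return sum(c * x ** k for k, c in enumerate(coef))

def kfold_mse(xs, ys, degree, n_folds=10, seed=0):
    """Return (mean of the fold MSEs, list of fold MSEs)."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[f::n_folds] for f in range(n_folds)]
    fold_mses = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in idx if i not in held_out]
        coef = polyfit([xs[i] for i in train_idx],
                       [ys[i] for i in train_idx], degree)
        sq_errs = [(ys[i] - predict(coef, xs[i])) ** 2 for i in test_idx]
        fold_mses.append(sum(sq_errs) / len(sq_errs))
    return sum(fold_mses) / n_folds, fold_mses

# Synthetic data (an assumption for illustration): a noisy quadratic on [-1, 1].
rng = random.Random(42)
n = 60
xs = [-1 + 2 * i / (n - 1) for i in range(n)]
ys = [1 + 2 * x - 3 * x * x + rng.gauss(0, 0.1) for x in xs]

# Loop over 10 polynomial orders, computing the 10-fold CV error for each.
cv_error = {d: kfold_mse(xs, ys, d)[0] for d in range(1, 11)}
best_degree = min(cv_error, key=cv_error.get)
print(best_degree, round(cv_error[best_degree], 4))
```

Each observation is used for testing exactly once, so the averaged error uses all of the data while never scoring a model on points it was fitted to.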

5 The validation set methodology works when we specify a class of models. Examples: linear models with p-order polynomials; a set of related models such as LDA, QDA and logit; non-parametric models with a smoothing parameter. Once I have a class of models, I loop over the class to try each one and compute a validation set error on each pass. K-fold CV methods help ensure my estimates will be stable.
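As a sketch of looping over a class of models, the example below takes the third kind of class listed above, a non-parametric model with a smoothing parameter. It uses k-nearest-neighbour regression as the model and the number of neighbours as the smoothing parameter; that choice, the helper names, the candidate values and the synthetic data are assumptions for illustration only.

```python
import random

def knn_predict(train, x, k):
    """Average the y-values of the k training points nearest to x."""
    nearest = sorted(train, key=lambda pt: abs(pt[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)

def validation_mse(train, valid, k):
    sq_errs = [(y - knn_predict(train, x, k)) ** 2 for x, y in valid]
    return sum(sq_errs) / len(sq_errs)

# Synthetic data (made up): y = x^2 plus noise.
rng = random.Random(0)
data = []
for _ in range(200):
    x = rng.uniform(-1, 1)
    data.append((x, x * x + rng.gauss(0, 0.1)))
train, valid = data[:150], data[150:]

# Loop over the class of models: one model per smoothing parameter.
candidates = [1, 3, 5, 9, 15, 25, 51, 101]
errors = {k: validation_mse(train, valid, k) for k in candidates}
best_k = min(errors, key=errors.get)
print(best_k, round(errors[best_k], 4))
```

A single validation set makes `best_k` depend on which 50 points happened to be held out; replacing `validation_mse` with a K-fold average is what stabilizes the choice.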

9 An alternative to the approach of fitting many models and picking the best is to fit a single model using all the predictors and throw out the less useful ones in one step. This could also help pare down my feature set into a more manageable one before trying out fancy models and deeper analysis. It helps guard against overfitting and can give higher model interpretability. Regularization helps achieve these aims!

10 Suppose we have p features and want to fit the following estimating equation: $y_i = \alpha + \sum_{j=1}^{p} \beta_j x_{ij} + \varepsilon_i$. OLS will find the $\hat{\beta}$s that minimize the sum of squared residuals in the training set. We have learned that this will tend to overfit. Another way of putting this: we end up with models that are too complex ("squiggly"). They have too much variance.

11 Regularization is a method to reduce variance in our model by imposing a penalty for having large $\hat{\beta}$s. We can then tune this penalty to minimize MSE in a validation set. To do so, we minimize:
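The objective itself did not survive in this transcript. Since the next slide describes an absolute-value penalty scaled by $\lambda$, it is presumably the lasso objective. A sketch in the notation of the estimating equation on slide 10 follows; the number of observations $N$ and the exact scaling of the squared-error term are assumptions, since the original slide is not available.

```latex
\min_{\alpha,\,\beta_1,\dots,\beta_p}\;
  \sum_{i=1}^{N} \Bigl( y_i - \alpha - \sum_{j=1}^{p} \beta_j x_{ij} \Bigr)^{2}
  \;+\; \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert
```

Setting $\lambda = 0$ recovers OLS; larger values of $\lambda$ shrink the coefficients more strongly.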

12 The size of the penalty term $\lambda$ determines how aggressively we lower the coefficients. Since the penalty function is the absolute value, we will often set $\hat{\beta}$s exactly equal to zero. The idea is that for a feature k that does not improve fit by very much, it's not worth suffering the penalty, $\lambda |\beta_k|$, of adding it into the model.
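Why an absolute-value penalty produces exact zeros can be seen in a simplified one-coefficient case. The sketch below assumes the objective $\tfrac{1}{2}(b - b^{OLS})^2 + \lambda |b|$ (a stylized version of the penalized problem, not the slides' own formulation), whose minimizer is the soft-thresholding rule. The function name `soft_threshold` and the OLS estimates are hypothetical.

```python
def soft_threshold(b_ols, lam):
    """Minimizer of 0.5 * (b - b_ols)**2 + lam * abs(b) over b."""
    if b_ols > lam:
        return b_ols - lam
    if b_ols < -lam:
        return b_ols + lam
    return 0.0  # the fit gain is too small to justify the penalty

# Hypothetical OLS estimates for four features.
ols_coefs = [2.5, -0.3, 0.05, -1.2]

for lam in (0.0, 0.1, 0.5, 1.5):
    shrunk = [soft_threshold(b, lam) for b in ols_coefs]
    n_zero = sum(1 for b in shrunk if b == 0.0)
    print(lam, shrunk, "zeros:", n_zero)
```

Coefficients whose unpenalized estimate is smaller in magnitude than `lam` are dropped entirely, while the others are pulled toward zero by `lam`. This is the sense in which a weak feature is "not worth the penalty".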

17 Binary outcomes: equal to 1 or 0. Examples: a coin is heads or tails, a person is guilty or innocent, a borrower defaults or not on a loan, etc. Continuous outcomes: can take any real value. Categorical outcomes: can take one of N qualitative values. Example: mapping symptoms to a well-defined disease. Traditional regression is designed for continuous outcomes. We use classification for binary and categorical outcomes.

18 A mapping of categories to numbers comes with strong implications: the numeric codes impose an ordering and a spacing on the categories, and a different coding would imply different differences across qualitative conditions.

19 Outcome: 0 or 1. Given features X, we want to say what the probability is that the outcome equals 1. We can write this as $P(\text{outcome} = 1 \mid X)$. Our model will output a probability even though real outcomes will always be 0 or 1.
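One common way to turn features into such a probability is the logistic (logit) model, which slide 5 lists among the candidate models. The sketch below assumes that model; the slides at this point do not say which one they use. The intercept, coefficients, feature values and the 0.5 cutoff are hypothetical.

```python
import math

def predict_proba(x, intercept, coefs):
    """P(outcome = 1 | x) under a logistic model with the given coefficients."""
    score = intercept + sum(b * xi for b, xi in zip(coefs, x))
    # Numerically stable logistic function.
    if score >= 0:
        return 1.0 / (1.0 + math.exp(-score))
    e = math.exp(score)
    return e / (1.0 + e)

# Hypothetical fitted coefficients for two features.
intercept, coefs = -1.0, [0.8, -0.5]

for x in ([0.0, 0.0], [3.0, 1.0], [-2.0, 4.0]):
    p = predict_proba(x, intercept, coefs)
    label = 1 if p >= 0.5 else 0   # one possible cutoff, not the only choice
    print(x, round(p, 3), label)
```

The model's output is always strictly between 0 and 1; a 0/1 prediction only appears once a cutoff is applied to it.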

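26 Inverse Elasticity Rule. Profit maximization sets MR = MC. The price-cost margin (Lerner index) equals 1 over the elasticity:
$$0 = \frac{\partial}{\partial q}\bigl[q\,p(q) - c(q)\bigr] = p(q) + q\,p'(q) - c'(q) \quad\Longrightarrow\quad \frac{p - c'(q)}{p} = -\frac{q\,p'(q)}{p} = \frac{1}{\varepsilon},$$
where $\varepsilon = -\frac{p}{q}\frac{dq}{dp}$ is the elasticity of demand (with the minus sign embedded). Price minus marginal cost, divided by price, is referred to as the gross margin.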

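27 Inverse Elasticity Rule: $\frac{p - c'(q)}{p} = -\frac{q\,p'(q)}{p} = \frac{1}{\varepsilon}$, where $\varepsilon$ is the elasticity of demand. Suppose MC = 0. Then $\frac{p - 0}{p} = 1 = \frac{1}{\varepsilon}$, so quantity is chosen so that elasticity is 1. Intuition: if marginal costs are zero, then optimize for revenue. Total revenue grows until elasticity = 1. If MC > 0, one will stop before reaching elasticity = 1.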
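The rule can be checked numerically. The sketch below assumes a linear demand curve q = a - b*p and a constant marginal cost c; neither assumption comes from the slides, and the parameter values are made up. It computes the profit-maximizing price and compares the Lerner index with 1 over the elasticity.

```python
def monopoly_price(a, b, c):
    """Profit-maximizing price for demand q = a - b*p and constant marginal cost c.

    From d/dp[(p - c) * (a - b*p)] = a - 2*b*p + b*c = 0.
    """
    return (a / b + c) / 2.0

def elasticity(a, b, p):
    """Elasticity of demand -(p/q) * dq/dp, with the minus sign embedded."""
    q = a - b * p
    return b * p / q

a, b = 100.0, 10.0   # hypothetical demand parameters
for c in (0.0, 2.0):
    p = monopoly_price(a, b, c)
    eps = elasticity(a, b, p)
    lerner = (p - c) / p
    print(c, p, eps, lerner, 1.0 / eps)
```

With c = 0 the optimum sits where the elasticity is 1 (the revenue-maximizing point); with c > 0 the chosen price is higher, on the elastic part of the demand curve.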

28 Inverse Elasticity Rule 2. Suppose we sell n goods indexed $i = 1, \dots, n$, with demands $x_i(p)$. Profit is $\pi = \sum_{i=1}^{n} (p_i - mc(i))\, x_i(p)$. If we assume constant marginal cost, this simplification is an example of selling the same good in multiple markets or to multiple customer types. Cross-price elasticity: $e_{ij} = \frac{p_j}{x_i}\frac{dx_i}{dp_j}$. Note no minus sign. Positive: substitutes; negative: complements.

29 Representative Consumer Assumption. If there is a representative consumer maximizing utility, $\max_x\, u(x) - p \cdot x$, then $u'(x) = p$ and, from the total derivative of the FOC, $u''(x)\,dx = dp$. Thus there are symmetric cross-derivatives, $\frac{\partial x_i}{\partial p_j} = \frac{\partial x_j}{\partial p_i}$ (recall this rule from multivariate calculus). This rule need not hold in practice, but it is a commonly made assumption.

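30 In Matrix Notation. Price-cost margin: the first-order condition for $p_i$, using the symmetric cross-derivatives, is
$$0 = x_i + \sum_{j=1}^{n} (p_j - mc(j)) \frac{\partial x_j}{\partial p_i} = x_i + \sum_{j=1}^{n} (p_j - mc(j)) \frac{\partial x_i}{\partial p_j} = x_i \Bigl(1 + \sum_{j=1}^{n} e_{ij}\, \frac{p_j - mc(j)}{p_j}\Bigr),$$
where $e_{ij} = \frac{p_j}{x_i}\frac{\partial x_i}{\partial p_j}$. With $L_i = \frac{p_i - mc(i)}{p_i}$ and $E = (e_{ij})$, this is $0 = \mathbf{1} + E L$, and thus $L = -E^{-1}\mathbf{1}$.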

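31 Two Good Formula. Applying the rule for inverting a 2x2 matrix, $L = -E^{-1}\mathbf{1}$ yields
$$\begin{pmatrix} L_1 \\ L_2 \end{pmatrix} = -\frac{1}{e_{11}e_{22} - e_{12}e_{21}} \begin{pmatrix} e_{22} & -e_{12} \\ -e_{21} & e_{11} \end{pmatrix} \begin{pmatrix} 1 \\ 1 \end{pmatrix}.$$
Divide top and bottom by $e_{22}$, then factor out $e_{11}$ in the denominator:
$$L_1 = -\frac{e_{22} - e_{12}}{e_{11}e_{22} - e_{12}e_{21}} = -\frac{1 - \frac{e_{12}}{e_{22}}}{e_{11} - \frac{e_{12}e_{21}}{e_{22}}} = -\frac{1 - \frac{e_{12}}{e_{22}}}{e_{11}\Bigl(1 - \frac{e_{12}e_{21}}{e_{22}e_{11}}\Bigr)} = -\frac{1}{e_{11}} \cdot \frac{1 - \frac{e_{12}}{e_{22}}}{1 - \frac{e_{12}e_{21}}{e_{22}e_{11}}}.$$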

32 Two Good Formula. $L = -E^{-1}\mathbf{1}$ yields $L_1 = -\frac{1}{e_{11}} \cdot \frac{1 - \frac{e_{12}}{e_{22}}}{1 - \frac{e_{12}e_{21}}{e_{22}e_{11}}}$. The ratio $\frac{e_{12}e_{21}}{e_{22}e_{11}}$ will be between 0 and 1 because $e_{12}e_{21} < e_{22}e_{11}$. This is because cross-price elasticities have to be smaller in magnitude than the relevant own-price elasticities.

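33 Two Good Formula. $L = -E^{-1}\mathbf{1}$ yields $L_1 = -\frac{1}{e_{11}} \cdot \frac{1 - \frac{e_{12}}{e_{22}}}{1 - \frac{e_{12}e_{21}}{e_{22}e_{11}}}$ (the remainder of this slide's equation is cut off in the transcript).

34 Two Good Formula for Substitutes. $L = -E^{-1}\mathbf{1}$ yields $L_1 = -\frac{1}{e_{11}} \cdot \frac{1 - \frac{e_{12}}{e_{22}}}{1 - \frac{e_{12}e_{21}}{e_{22}e_{11}}}$ (the remainder of this slide's equation is cut off in the transcript).

35 Two Good Formula for Substitutes. $L = -E^{-1}\mathbf{1}$ yields $L_1 = -\frac{1}{e_{11}} \cdot \frac{1 - \frac{e_{12}}{e_{22}}}{1 - \frac{e_{12}e_{21}}{e_{22}e_{11}}}$.

36 Two Good Formula for Complements. $L = -E^{-1}\mathbf{1}$ yields $L_1 = -\frac{1}{e_{11}} \cdot \frac{1 - \frac{e_{12}}{e_{22}}}{1 - \frac{e_{12}e_{21}}{e_{22}e_{11}}}$ (the equation is cut off in the transcript).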

37 Two Good Formula Review. $L = -E^{-1}\mathbf{1}$ yields $L_1 = -\frac{1}{e_{11}} \cdot \frac{1 - \frac{e_{12}}{e_{22}}}{1 - \frac{e_{12}e_{21}}{e_{22}e_{11}}}$. If $e_{12} > 0$, the goods are substitutes: a price decrease on product 2 decreases sales of product 1 (they go in the same direction). If $e_{12} < 0$, the goods are complements: a price decrease on product 2 increases sales of product 1 (they go in opposite directions). $e_{11}$ and $e_{22}$ will be negative due to the law of demand (note that before, in the single-good rule, we embedded the negative sign).
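A small numerical check of the two-good formula may help with the signs. The sketch below computes $L = -E^{-1}\mathbf{1}$ directly from the 2x2 inverse and compares $L_1$ with the factored form, for independent goods, substitutes and complements. The elasticity values and function names are hypothetical, chosen only so that own-price elasticities are negative and the cross terms are small relative to them.

```python
def lerner_two_goods(e11, e12, e21, e22):
    """Return (L1, L2) = -E^{-1} 1 for the 2x2 elasticity matrix E."""
    det = e11 * e22 - e12 * e21
    l1 = -(e22 - e12) / det
    l2 = -(e11 - e21) / det
    return l1, l2

def l1_factored(e11, e12, e21, e22):
    """L1 written as -1/e11 times the cross-elasticity adjustment."""
    return (-1.0 / e11) * (1 - e12 / e22) / (1 - (e12 * e21) / (e22 * e11))

cases = {
    "independent": (-2.0, 0.0, 0.0, -3.0),
    "substitutes": (-2.0, 0.5, 0.4, -3.0),
    "complements": (-2.0, -0.5, -0.4, -3.0),
}
results = {}
for name, e in cases.items():
    l1, l2 = lerner_two_goods(*e)
    results[name] = l1
    print(name, round(l1, 4), round(l2, 4), round(l1_factored(*e), 4))
```

Relative to the single-good margin $-1/e_{11}$, the margin on good 1 is higher when the goods are substitutes and lower when they are complements, since a price cut on good 1 then takes sales from, or adds sales to, good 2.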

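38 [Figure: consumer valuations $v_1$ (horizontal axis) and $v_2$ (vertical axis), with prices $p_1$, $p_2$ and bundle price $p_B$ marking the regions Buy Nothing, Buy Good 1, Buy Good 2 and Buy Both.] Reducing the bundle price gives additional sales of both goods with a single price cut.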

Final Overview. Introduction to ML. Marek Petrik 4/25/2017

Final Overview. Introduction to ML. Marek Petrik 4/25/2017 Final Overview Introduction to ML Marek Petrik 4/25/2017 This Course: Introduction to Machine Learning Build a foundation for practice and research in ML Basic machine learning concepts: max likelihood,

More information

Machine Learning Linear Classification. Prof. Matteo Matteucci

Machine Learning Linear Classification. Prof. Matteo Matteucci Machine Learning Linear Classification Prof. Matteo Matteucci Recall from the first lecture 2 X R p Regression Y R Continuous Output X R p Y {Ω 0, Ω 1,, Ω K } Classification Discrete Output X R p Y (X)

More information

Applied Machine Learning Annalisa Marsico

Applied Machine Learning Annalisa Marsico Applied Machine Learning Annalisa Marsico OWL RNA Bionformatics group Max Planck Institute for Molecular Genetics Free University of Berlin 22 April, SoSe 2015 Goals Feature Selection rather than Feature

More information

Bresnahan, JIE 87: Competition and Collusion in the American Automobile Industry: 1955 Price War

Bresnahan, JIE 87: Competition and Collusion in the American Automobile Industry: 1955 Price War Bresnahan, JIE 87: Competition and Collusion in the American Automobile Industry: 1955 Price War Spring 009 Main question: In 1955 quantities of autos sold were higher while prices were lower, relative

More information

Introduction to Statistical modeling: handout for Math 489/583

Introduction to Statistical modeling: handout for Math 489/583 Introduction to Statistical modeling: handout for Math 489/583 Statistical modeling occurs when we are trying to model some data using statistical tools. From the start, we recognize that no model is perfect

More information

Answer Key: Problem Set 1

Answer Key: Problem Set 1 Answer Key: Problem Set 1 Econ 409 018 Fall Question 1 a The profit function (revenue minus total cost) is π(q) = P (q)q cq The first order condition with respect to (henceforth wrt) q is P (q )q + P (q

More information

Linear Regression (continued)

Linear Regression (continued) Linear Regression (continued) Professor Ameet Talwalkar Professor Ameet Talwalkar CS260 Machine Learning Algorithms February 6, 2017 1 / 39 Outline 1 Administration 2 Review of last lecture 3 Linear regression

More information

Machine Learning for OR & FE

Machine Learning for OR & FE Machine Learning for OR & FE Regression II: Regularization and Shrinkage Methods Martin Haugh Department of Industrial Engineering and Operations Research Columbia University Email: martin.b.haugh@gmail.com

More information

Chapter 3 Multiple Regression Complete Example

Chapter 3 Multiple Regression Complete Example Department of Quantitative Methods & Information Systems ECON 504 Chapter 3 Multiple Regression Complete Example Spring 2013 Dr. Mohammad Zainal Review Goals After completing this lecture, you should be

More information

ECON 594: Lecture #6

ECON 594: Lecture #6 ECON 594: Lecture #6 Thomas Lemieux Vancouver School of Economics, UBC May 2018 1 Limited dependent variables: introduction Up to now, we have been implicitly assuming that the dependent variable, y, was

More information

The Monopolist. The Pure Monopolist with symmetric D matrix

The Monopolist. The Pure Monopolist with symmetric D matrix University of California, Davis Department of Agricultural and Resource Economics ARE 252 Optimization with Economic Applications Lecture Notes 5 Quirino Paris The Monopolist.................................................................

More information

CPSC 340: Machine Learning and Data Mining

CPSC 340: Machine Learning and Data Mining CPSC 340: Machine Learning and Data Mining Linear Classifiers: predictions Original version of these slides by Mark Schmidt, with modifications by Mike Gelbart. 1 Admin Assignment 4: Due Friday of next

More information

Microeconomic Theory -1- Introduction

Microeconomic Theory -1- Introduction Microeconomic Theory -- Introduction. Introduction. Profit maximizing firm with monopoly power 6 3. General results on maximizing with two variables 8 4. Model of a private ownership economy 5. Consumer

More information

Mathematical Foundations -1- Constrained Optimization. Constrained Optimization. An intuitive approach 2. First Order Conditions (FOC) 7

Mathematical Foundations -1- Constrained Optimization. Constrained Optimization. An intuitive approach 2. First Order Conditions (FOC) 7 Mathematical Foundations -- Constrained Optimization Constrained Optimization An intuitive approach First Order Conditions (FOC) 7 Constraint qualifications 9 Formal statement of the FOC for a maximum

More information

Advanced Microeconomic Analysis, Lecture 6

Advanced Microeconomic Analysis, Lecture 6 Advanced Microeconomic Analysis, Lecture 6 Prof. Ronaldo CARPIO April 10, 017 Administrative Stuff Homework # is due at the end of class. I will post the solutions on the website later today. The midterm

More information

Linear Models for Regression CS534

Linear Models for Regression CS534 Linear Models for Regression CS534 Prediction Problems Predict housing price based on House size, lot size, Location, # of rooms Predict stock price based on Price history of the past month Predict the

More information

Bertrand Model of Price Competition. Advanced Microeconomic Theory 1

Bertrand Model of Price Competition. Advanced Microeconomic Theory 1 Bertrand Model of Price Competition Advanced Microeconomic Theory 1 ҧ Bertrand Model of Price Competition Consider: An industry with two firms, 1 and 2, selling a homogeneous product Firms face market

More information

Math for Machine Learning Open Doors to Data Science and Artificial Intelligence. Richard Han

Math for Machine Learning Open Doors to Data Science and Artificial Intelligence. Richard Han Math for Machine Learning Open Doors to Data Science and Artificial Intelligence Richard Han Copyright 05 Richard Han All rights reserved. CONTENTS PREFACE... - INTRODUCTION... LINEAR REGRESSION... 4 LINEAR

More information

Answer Key: Problem Set 3

Answer Key: Problem Set 3 Answer Key: Problem Set Econ 409 018 Fall Question 1 a This is a standard monopoly problem; using MR = a 4Q, let MR = MC and solve: Q M = a c 4, P M = a + c, πm = (a c) 8 The Lerner index is then L M P

More information

COMPUTATIONAL INTELLIGENCE (INTRODUCTION TO MACHINE LEARNING) SS16

COMPUTATIONAL INTELLIGENCE (INTRODUCTION TO MACHINE LEARNING) SS16 COMPUTATIONAL INTELLIGENCE (INTRODUCTION TO MACHINE LEARNING) SS6 Lecture 3: Classification with Logistic Regression Advanced optimization techniques Underfitting & Overfitting Model selection (Training-

More information

Reminders. Thought questions should be submitted on eclass. Please list the section related to the thought question

Reminders. Thought questions should be submitted on eclass. Please list the section related to the thought question Linear regression Reminders Thought questions should be submitted on eclass Please list the section related to the thought question If it is a more general, open-ended question not exactly related to a

More information

Lecture 4 Discriminant Analysis, k-nearest Neighbors

Lecture 4 Discriminant Analysis, k-nearest Neighbors Lecture 4 Discriminant Analysis, k-nearest Neighbors Fredrik Lindsten Division of Systems and Control Department of Information Technology Uppsala University. Email: fredrik.lindsten@it.uu.se fredrik.lindsten@it.uu.se

More information

f( x) f( y). Functions which are not one-to-one are often called many-to-one. Take the domain and the range to both be all the real numbers:

f( x) f( y). Functions which are not one-to-one are often called many-to-one. Take the domain and the range to both be all the real numbers: I. UNIVARIATE CALCULUS Given two sets X and Y, a function is a rule that associates each member of X with exactly one member of Y. That is, some x goes in, and some y comes out. These notations are used

More information

Adding Production to the Theory

Adding Production to the Theory Adding Production to the Theory We begin by considering the simplest situation that includes production: two goods, both of which have consumption value, but one of which can be transformed into the other.

More information

Statistical aspects of prediction models with high-dimensional data

Statistical aspects of prediction models with high-dimensional data Statistical aspects of prediction models with high-dimensional data Anne Laure Boulesteix Institut für Medizinische Informationsverarbeitung, Biometrie und Epidemiologie February 15th, 2017 Typeset by

More information

Contents Lecture 4. Lecture 4 Linear Discriminant Analysis. Summary of Lecture 3 (II/II) Summary of Lecture 3 (I/II)

Contents Lecture 4. Lecture 4 Linear Discriminant Analysis. Summary of Lecture 3 (II/II) Summary of Lecture 3 (I/II) Contents Lecture Lecture Linear Discriminant Analysis Fredrik Lindsten Division of Systems and Control Department of Information Technology Uppsala University Email: fredriklindsten@ituuse Summary of lecture

More information

Overfitting, Bias / Variance Analysis

Overfitting, Bias / Variance Analysis Overfitting, Bias / Variance Analysis Professor Ameet Talwalkar Professor Ameet Talwalkar CS260 Machine Learning Algorithms February 8, 207 / 40 Outline Administration 2 Review of last lecture 3 Basic

More information

Linear Models for Regression CS534

Linear Models for Regression CS534 Linear Models for Regression CS534 Example Regression Problems Predict housing price based on House size, lot size, Location, # of rooms Predict stock price based on Price history of the past month Predict

More information

MATH 236 ELAC FALL 2017 CA 9 NAME: SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question.

MATH 236 ELAC FALL 2017 CA 9 NAME: SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. MATH 236 ELAC FALL 207 CA 9 NAME: SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. ) 27 p 3 27 p 3 ) 2) If 9 t 3 4t 9-2t = 3, find t. 2) Solve the equation.

More information

Empirical Industrial Organization (ECO 310) University of Toronto. Department of Economics Fall Instructor: Victor Aguirregabiria

Empirical Industrial Organization (ECO 310) University of Toronto. Department of Economics Fall Instructor: Victor Aguirregabiria Empirical Industrial Organization (ECO 30) University of Toronto. Department of Economics Fall 208. Instructor: Victor Aguirregabiria FINAL EXAM Tuesday, December 8th, 208. From 7pm to 9pm (2 hours) Exam

More information

Linear Models: Comparing Variables. Stony Brook University CSE545, Fall 2017

Linear Models: Comparing Variables. Stony Brook University CSE545, Fall 2017 Linear Models: Comparing Variables Stony Brook University CSE545, Fall 2017 Statistical Preliminaries Random Variables Random Variables X: A mapping from Ω to ℝ that describes the question we care about

More information

COMS 4771 Regression. Nakul Verma

COMS 4771 Regression. Nakul Verma COMS 4771 Regression Nakul Verma Last time Support Vector Machines Maximum Margin formulation Constrained Optimization Lagrange Duality Theory Convex Optimization SVM dual and Interpretation How get the

More information

Example 2: The demand function for a popular make of 12-speed bicycle is given by

Example 2: The demand function for a popular make of 12-speed bicycle is given by Sometimes, the unit price will not be given. Instead, product will be sold at market price, and you ll be given both supply and demand equations. In this case, we can find the equilibrium point (Section

More information

Regularization. CSCE 970 Lecture 3: Regularization. Stephen Scott and Vinod Variyam. Introduction. Outline

Regularization. CSCE 970 Lecture 3: Regularization. Stephen Scott and Vinod Variyam. Introduction. Outline Other Measures 1 / 52 sscott@cse.unl.edu learning can generally be distilled to an optimization problem Choose a classifier (function, hypothesis) from a set of functions that minimizes an objective function

More information

EC611--Managerial Economics

EC611--Managerial Economics EC611--Managerial Economics Optimization Techniques and New Management Tools Dr. Savvas C Savvides, European University Cyprus Models and Data Model a framework based on simplifying assumptions it helps

More information

Mark your answers ON THE EXAM ITSELF. If you are not sure of your answer you may wish to provide a brief explanation.

Mark your answers ON THE EXAM ITSELF. If you are not sure of your answer you may wish to provide a brief explanation. CS 189 Spring 2015 Introduction to Machine Learning Midterm You have 80 minutes for the exam. The exam is closed book, closed notes except your one-page crib sheet. No calculators or electronic items.

More information

CSE 417T: Introduction to Machine Learning. Final Review. Henry Chai 12/4/18

CSE 417T: Introduction to Machine Learning. Final Review. Henry Chai 12/4/18 CSE 417T: Introduction to Machine Learning Final Review Henry Chai 12/4/18 Overfitting Overfitting is fitting the training data more than is warranted Fitting noise rather than signal 2 Estimating! "#$

More information

Linear Model Selection and Regularization

Linear Model Selection and Regularization Linear Model Selection and Regularization Recall the linear model Y = β 0 + β 1 X 1 + + β p X p + ɛ. In the lectures that follow, we consider some approaches for extending the linear model framework. In

More information

Theory of Value Fall 2017

Theory of Value Fall 2017 Division of the Humanities and Social Sciences Ec 121a KC Border Theory of Value Fall 2017 Lecture 2: Profit Maximization 2.1 Market equilibria as maximizers For a smooth function f, if its derivative

More information

Classification: Linear Discriminant Analysis

Classification: Linear Discriminant Analysis Classification: Linear Discriminant Analysis Discriminant analysis uses sample information about individuals that are known to belong to one of several populations for the purposes of classification. Based

More information

Review of probabilities

Review of probabilities CS 1675 Introduction to Machine Learning Lecture 5 Density estimation Milos Hauskrecht milos@pitt.edu 5329 Sennott Square Review of probabilities 1 robability theory Studies and describes random processes

More information

Fundamentals of Machine Learning. Mohammad Emtiyaz Khan EPFL Aug 25, 2015

Fundamentals of Machine Learning. Mohammad Emtiyaz Khan EPFL Aug 25, 2015 Fundamentals of Machine Learning Mohammad Emtiyaz Khan EPFL Aug 25, 25 Mohammad Emtiyaz Khan 24 Contents List of concepts 2 Course Goals 3 2 Regression 4 3 Model: Linear Regression 7 4 Cost Function: MSE

More information

CSC321 Lecture 2: Linear Regression

CSC321 Lecture 2: Linear Regression CSC32 Lecture 2: Linear Regression Roger Grosse Roger Grosse CSC32 Lecture 2: Linear Regression / 26 Overview First learning algorithm of the course: linear regression Task: predict scalar-valued targets,

More information

Holdout and Cross-Validation Methods Overfitting Avoidance

Holdout and Cross-Validation Methods Overfitting Avoidance Holdout and Cross-Validation Methods Overfitting Avoidance Decision Trees Reduce error pruning Cost-complexity pruning Neural Networks Early stopping Adjusting Regularizers via Cross-Validation Nearest

More information

Bivariate Relationships Between Variables

Bivariate Relationships Between Variables Bivariate Relationships Between Variables BUS 735: Business Decision Making and Research 1 Goals Specific goals: Detect relationships between variables. Be able to prescribe appropriate statistical methods

More information

Econ Slides from Lecture 10

Econ Slides from Lecture 10 Econ 205 Sobel Econ 205 - Slides from Lecture 10 Joel Sobel September 2, 2010 Example Find the tangent plane to {x x 1 x 2 x 2 3 = 6} R3 at x = (2, 5, 2). If you let f (x) = x 1 x 2 x3 2, then this is

More information

COMS 4771 Introduction to Machine Learning. James McInerney Adapted from slides by Nakul Verma

COMS 4771 Introduction to Machine Learning. James McInerney Adapted from slides by Nakul Verma COMS 4771 Introduction to Machine Learning James McInerney Adapted from slides by Nakul Verma Announcements HW1: Please submit as a group Watch out for zero variance features (Q5) HW2 will be released

More information

School of Business. Blank Page

School of Business. Blank Page Maxima and Minima 9 This unit is designed to introduce the learners to the basic concepts associated with Optimization. The readers will learn about different types of functions that are closely related

More information

Second Order Derivatives. Background to Topic 6 Maximisation and Minimisation

Second Order Derivatives. Background to Topic 6 Maximisation and Minimisation Second Order Derivatives Course Manual Background to Topic 6 Maximisation and Minimisation Jacques (4 th Edition): Chapter 4.6 & 4.7 Y Y=a+bX a X Y= f (X) = a + bx First Derivative dy/dx = f = b constant

More information

Lecture 14: Shrinkage

Lecture 14: Shrinkage Lecture 14: Shrinkage Reading: Section 6.2 STATS 202: Data mining and analysis October 27, 2017 1 / 19 Shrinkage methods The idea is to perform a linear regression, while regularizing or shrinking the

More information

Machine Learning CSE546 Carlos Guestrin University of Washington. September 30, What about continuous variables?

Machine Learning CSE546 Carlos Guestrin University of Washington. September 30, What about continuous variables? Linear Regression Machine Learning CSE546 Carlos Guestrin University of Washington September 30, 2014 1 What about continuous variables? n Billionaire says: If I am measuring a continuous variable, what

More information

Decision trees COMS 4771

Decision trees COMS 4771 Decision trees COMS 4771 1. Prediction functions (again) Learning prediction functions IID model for supervised learning: (X 1, Y 1),..., (X n, Y n), (X, Y ) are iid random pairs (i.e., labeled examples).

More information

Direct Learning: Linear Regression. Donglin Zeng, Department of Biostatistics, University of North Carolina

Direct Learning: Linear Regression. Donglin Zeng, Department of Biostatistics, University of North Carolina Direct Learning: Linear Regression Parametric learning We consider the core function in the prediction rule to be a parametric function. The most commonly used function is a linear function: squared loss:

More information

MS&E 226: Small Data

MS&E 226: Small Data MS&E 226: Small Data Lecture 6: Model complexity scores (v3) Ramesh Johari ramesh.johari@stanford.edu Fall 2015 1 / 34 Estimating prediction error 2 / 34 Estimating prediction error We saw how we can estimate

More information

Part I: Exercise of Monopoly Power. Chapter 1: Monopoly. Two assumptions: A1. Quality of goods is known by consumers; A2. No price discrimination.

Part I: Exercise of Monopoly Power. Chapter 1: Monopoly. Two assumptions: A1. Quality of goods is known by consumers; A2. No price discrimination. Part I: Exercise of Monopoly Power Chapter 1: Monopoly Two assumptions: A1. Quality of goods is known by consumers; A2. No price discrimination. Best known monopoly distortion: p>mc DWL (section 1). Other

More information

Machine Learning CSE546 Carlos Guestrin University of Washington. September 30, 2013

Machine Learning CSE546 Carlos Guestrin University of Washington. September 30, 2013 Bayesian Methods Machine Learning CSE546 Carlos Guestrin University of Washington September 30, 2013 1 What about prior n Billionaire says: Wait, I know that the thumbtack is close to 50-50. What can you

More information

Linear Regression. Machine Learning CSE546 Kevin Jamieson University of Washington. Oct 5, Kevin Jamieson 1

Linear Regression. Machine Learning CSE546 Kevin Jamieson University of Washington. Oct 5, Kevin Jamieson 1 Linear Regression Machine Learning CSE546 Kevin Jamieson University of Washington Oct 5, 2017 1 The regression problem Given past sales data on zillow.com, predict: y = House sale price from x = {# sq.

More information

Microeconomics II Lecture 4. Marshallian and Hicksian demands for goods with an endowment (Labour supply)

Microeconomics II Lecture 4. Marshallian and Hicksian demands for goods with an endowment (Labour supply) Leonardo Felli 30 October, 2002 Microeconomics II Lecture 4 Marshallian and Hicksian demands for goods with an endowment (Labour supply) Define M = m + p ω to be the endowment of the consumer. The Marshallian

More information

Economics 101. Lecture 2 - The Walrasian Model and Consumer Choice

Economics 101. Lecture 2 - The Walrasian Model and Consumer Choice Economics 101 Lecture 2 - The Walrasian Model and Consumer Choice 1 Uncle Léon The canonical model of exchange in economics is sometimes referred to as the Walrasian Model, after the early economist Léon

More information

Math Spring 2017 Mathematical Models in Economics

Math Spring 2017 Mathematical Models in Economics 2017 - Steven Tschantz Math 3660 - Spring 2017 Mathematical Models in Economics Steven Tschantz 2/7/17 Profit maximizing firms Logit demand model Problem Determine the impact of a merger between two of

More information

Machine Learning. Gaussian Mixture Models. Zhiyao Duan & Bryan Pardo, Machine Learning: EECS 349 Fall

Machine Learning. Gaussian Mixture Models. Zhiyao Duan & Bryan Pardo, Machine Learning: EECS 349 Fall Machine Learning Gaussian Mixture Models Zhiyao Duan & Bryan Pardo, Machine Learning: EECS 349 Fall 2012 1 The Generative Model POV We think of the data as being generated from some process. We assume

More information

PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 2: PROBABILITY DISTRIBUTIONS

PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 2: PROBABILITY DISTRIBUTIONS PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 2: PROBABILITY DISTRIBUTIONS Parametric Distributions Basic building blocks: Need to determine given Representation: or? Recall Curve Fitting Binary Variables

More information

BAYESIAN DECISION THEORY

BAYESIAN DECISION THEORY Last updated: September 17, 2012 BAYESIAN DECISION THEORY Problems 2 The following problems from the textbook are relevant: 2.1 2.9, 2.11, 2.17 For this week, please at least solve Problem 2.3. We will

More information

Last Revised: :19: (Fri, 12 Jan 2007)(Revision:

Last Revised: :19: (Fri, 12 Jan 2007)(Revision: 0-0 1 Demand Lecture Last Revised: 2007-01-12 16:19:03-0800 (Fri, 12 Jan 2007)(Revision: 67) a demand correspondence is a special kind of choice correspondence where the set of alternatives is X = { x

More information

Linear classifiers: Logistic regression

Linear classifiers: Logistic regression Linear classifiers: Logistic regression STAT/CSE 416: Machine Learning Emily Fox University of Washington April 19, 2018 How confident is your prediction? The sushi & everything else were awesome! The

More information
