Train the model with a subset of the data. Test the model on the remaining data (the validation set). What data to choose for training vs. test?
Transcription
2 Train the model with a subset of the data. Test the model on the remaining data (the validation set). What data to choose for training vs. test? In a time-series setting, it is natural to hold out the last year (or time period) of the data, to simulate predicting the future based on all past data. In most settings, however, we'll randomly select our training/test sets.
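A minimal R sketch of a random training/test split and a validation-set MSE; the data frame df and outcome y are hypothetical placeholders, not objects from the slides:

    set.seed(1)                                     # make the random split reproducible
    n <- nrow(df)                                   # df: hypothetical data frame with outcome y
    train_idx <- sample(n, size = floor(0.8 * n))   # hold out 20% as the validation set
    train <- df[train_idx, ]
    test  <- df[-train_idx, ]
    fit  <- lm(y ~ ., data = train)                 # fit on the training data only
    pred <- predict(fit, newdata = test)
    mean((test$y - pred)^2)                         # validation-set estimate of the MSE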
3 Cross-validation is a class of methods that perform many training/test splits and average over all the runs. Here is a simple example of 5-fold cross-validation: it gives 5 test sets and hence 5 estimates of the MSE. The 5-fold CV estimate is obtained by averaging these values.
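In symbols, writing MSE_k for the test error computed on fold k, the K-fold CV estimate is

    \mathrm{CV}_{(K)} = \frac{1}{K} \sum_{k=1}^{K} \mathrm{MSE}_k,

with K = 5 in this example.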
4 Split the data up into K folds. Iteratively leave fold k out of the training data and use it to test. The more folds, the smaller each test set is (and the more training data), but the more times we need to run the estimation procedure. Rules of thumb such as 5 or 10 folds are often used in practice. This can be done with a simple for loop in R. For generalized linear models, the cv.glm() function can be used to perform K-fold cross-validation. For example, this code loops over 10 possible polynomial orders and computes the 10-fold cross-validated error at each step (see the sketch below).
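A sketch of that loop using cv.glm() from the boot package; the dataset and formula (Auto and mpg ~ poly(horsepower, d) from the ISLR package) are illustrative stand-ins, since the slides do not show which data were used:

    library(boot)     # provides cv.glm()
    library(ISLR)     # provides the Auto data, used here only as a stand-in
    cv.error <- rep(0, 10)
    for (d in 1:10) {
      fit <- glm(mpg ~ poly(horsepower, d), data = Auto)   # polynomial of order d
      cv.error[d] <- cv.glm(Auto, fit, K = 10)$delta[1]    # 10-fold CV estimate of test error
    }
    which.min(cv.error)   # polynomial order with the lowest cross-validated error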
5 The validation set methodology works when we specify a class of models. Examples: linear models with p-order polynomials; a set of related models such as LDA, QDA, and logit; non-parametric models with a smoothing parameter. Once I have a class of models, I loop over the class to try each one and compute a validation set error on each pass. K-fold CV helps ensure my estimates will be stable.
9 An alternative to the approach of fitting many models and picking the best is to fit a single model using all the predictors and throw out the less useful ones in one step. This could also help pare down my feature set into a more manageable one before trying out fancier models and deeper analysis. It helps guard against overfitting and gives higher model interpretability. Regularization helps achieve these aims!
10 Suppose we have p features and want to fit the following estimating equation: y_i = \alpha + \sum_{j=1}^{p} \beta_j x_{ij} + \varepsilon_i. OLS will find the β̂'s that minimize least squares in the training set. We have learned that this will tend to overfit. Another way of putting this: we end up with overly complex ("squiggly") models. They have too much variance.
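To see the overfitting pattern numerically, one can reuse the train/test split from the earlier sketch and compare training and test MSE as the polynomial order grows (the variables x and y are again hypothetical); training error keeps falling while test error eventually turns back up:

    train_mse <- test_mse <- rep(NA, 10)
    for (d in 1:10) {
      fit <- lm(y ~ poly(x, d), data = train)          # increasingly flexible ("squiggly") fits
      train_mse[d] <- mean((train$y - predict(fit))^2)
      test_mse[d]  <- mean((test$y - predict(fit, newdata = test))^2)
    }
    # train_mse decreases with d; test_mse typically rises again once the model overfits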
11 The lasso is a method to reduce variance in our model by imposing a penalty for having large β̂'s. We can then tune this penalty to minimize MSE in a validation set. To do so, we minimize: \sum_{i=1}^{n} \Big( y_i - \alpha - \sum_{j=1}^{p} \beta_j x_{ij} \Big)^2 + \lambda \sum_{j=1}^{p} |\beta_j|.
12 The size of the penalty term λ determines how aggressively we shrink the coefficients. Since the penalty function is the absolute value, we will often set β̂'s exactly equal to zero. The idea is that for a feature k that does not improve the fit by very much, it is not worth suffering the penalty, \lambda |\beta_k|, of adding it to the model.
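The slides do not name an implementation for this penalized fit; one common choice in R is the glmnet package. A minimal sketch of tuning λ by cross-validation (df and y are again hypothetical):

    library(glmnet)
    x <- model.matrix(y ~ ., data = df)[, -1]   # predictor matrix, dropping the intercept column
    y <- df$y
    cv.fit <- cv.glmnet(x, y, alpha = 1)        # alpha = 1 gives the absolute-value (lasso) penalty
    cv.fit$lambda.min                           # lambda with the smallest cross-validated error
    coef(cv.fit, s = "lambda.min")              # note: many coefficients are set exactly to zero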
17 Binary outcomes: equal to 1 or 0. Examples: a coin is heads or tails, a person is guilty or innocent, a loan defaults or not, etc. Continuous outcomes: can take any real value. Categorical outcomes: can take one of N qualitative values. Example: mapping symptoms to a well-defined disease. Traditional regression is designed for continuous outcomes. We use classification for binary and categorical outcomes.
18 A mapping of categories to numbers comes with strong implications: it would impose an ordering on the categories and imply particular differences across the qualitative conditions.
19 Outcome: 0 or 1. Given features X, we want to say what the probability is that the outcome equals 1. We can write this as P(outcome = 1 | X). Our model will output a probability even though the realized outcomes will always be 0 or 1.
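A minimal R sketch of one standard choice for such a probability model, logistic regression via glm(); the data frame df with a 0/1 outcome y and features x1, x2 is hypothetical:

    fit <- glm(y ~ x1 + x2, data = df, family = binomial)   # logistic regression
    probs <- predict(fit, type = "response")                 # estimated P(outcome = 1 | X)
    preds <- ifelse(probs > 0.5, 1, 0)                       # threshold probabilities into 0/1 predictions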
26 Inverse Elasticity Rule. The monopolist maximizes profit \pi(q) = q\,p(q) - c(q); the profit-maximizing FOC (MR = MC) is p(q) + q\,p'(q) = c'(q). Rearranging gives the price-cost margin (Lerner index) equal to one over the elasticity: \frac{p - c'(q)}{p} = -\frac{q\,p'(q)}{p} = \frac{1}{\epsilon}. Price minus marginal cost, divided by price, is referred to as the gross margin.
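A quick numerical check of the rule: if the elasticity is \epsilon = 2, the optimal margin is \frac{p - c'(q)}{p} = \frac{1}{2}, so price is twice marginal cost; with a more elastic demand of \epsilon = 4, the margin falls to 25%.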
27 Inverse Elasticity Rule. \frac{p - c'(q)}{p} = \frac{1}{\epsilon}. Suppose MC = 0. Then quantity is chosen so that the elasticity equals 1. Intuition: if marginal costs are zero, then optimize for revenue; total revenue grows until the elasticity reaches 1. If MC > 0, one will stop before reaching an elasticity of 1.
28 Inverse Elasticity Rule 2. Suppose we sell n goods indexed i = 1, ..., n, with demands x_i(p). Profit is \sum_{i=1}^{n} \big(p_i - mc_i\big)\, x_i(p). If we assume constant marginal cost, this setup also covers selling the same good in multiple markets or to multiple customer types. Cross-price elasticity: \epsilon_{ij} = \frac{p_j}{x_i}\frac{\partial x_i}{\partial p_j}. Note there is no minus sign: positive means substitutes; negative means complements.
29 Representative Consumer Assumption. If there is a representative consumer maximizing utility, \max_x u(x) - p \cdot x, the FOC is \nabla u(x) = p. Taking the total derivative of the FOC, D^2u(x)\,Dx(p) = I, so Dx(p) = \big(D^2u(x)\big)^{-1}, which is symmetric because the Hessian is symmetric. Thus there are symmetric cross-derivatives: \frac{\partial x_i}{\partial p_j} = \frac{\partial x_j}{\partial p_i}. Recall this rule from multivariate calculus. It need not hold in practice, but it is a commonly made assumption.
30 In Matrix Notation. Price-cost margins: the FOC for each good i is x_i(p) + \sum_{j=1}^{n} \big(p_j - mc_j\big)\frac{\partial x_j}{\partial p_i} = 0. Dividing through by x_i and writing things in terms of the Lerner indices L_i = \frac{p_i - mc_i}{p_i} and the elasticity matrix E, this stacks into 0 = 1 + E L, and thus L = -E^{-1} 1.
31 Two Good Formula. Rule for inverting a 2×2 matrix: for E = \begin{pmatrix} e_{11} & e_{12} \\ e_{21} & e_{22} \end{pmatrix}, E^{-1} = \frac{1}{e_{11}e_{22} - e_{12}e_{21}} \begin{pmatrix} e_{22} & -e_{12} \\ -e_{21} & e_{11} \end{pmatrix}. Applying L = -E^{-1} 1 (with the negative sign embedded in the own-price terms here, as noted in the review on slide 37): L_1 = \frac{e_{22} - e_{12}}{e_{11}e_{22} - e_{12}e_{21}} = \frac{1 - e_{12}/e_{22}}{e_{11} - e_{12}e_{21}/e_{22}} (divide top and bottom by e_{22}) = \frac{1}{e_{11}} \cdot \frac{1 - e_{12}/e_{22}}{1 - \frac{e_{12}e_{21}}{e_{22}e_{11}}} (factor out e_{11}).
32 Two Good Formula. L = -E^{-1} 1 yields L_1 = \frac{1}{e_{11}} \cdot \frac{1 - e_{12}/e_{22}}{1 - \frac{e_{12}e_{21}}{e_{22}e_{11}}}. The term \frac{e_{12}e_{21}}{e_{22}e_{11}} will be between 0 and 1 because e_{12}e_{21} < e_{22}e_{11}: cross-price elasticities have to be smaller than the relevant own-price elasticities.
33 Two Good Formula. L = -E^{-1} 1 yields L_1 = \frac{1}{e_{11}} \cdot \frac{1 - e_{12}/e_{22}}{1 - \frac{e_{12}e_{21}}{e_{22}e_{11}}}. When the cross-price terms vanish (e_{12} = e_{21} = 0), the correction factor equals one and this reduces to the single-good rule L_1 = \frac{1}{e_{11}}.
34 Two Good Formula for Substitutes. L = -E^{-1} 1 yields L_1 = \frac{1}{e_{11}} \cdot \frac{1 - e_{12}/e_{22}}{1 - \frac{e_{12}e_{21}}{e_{22}e_{11}}}. With substitutes (e_{12} > 0), the correction pushes the margin up: the multiproduct monopolist internalizes that a higher p_1 shifts some sales to good 2, so L_1 exceeds what the single-good inverse elasticity rule would give.
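A numerical illustration of the substitutes case, with hypothetical elasticities e_{11} = e_{22} = -3 and e_{12} = e_{21} = 1:

    E = \begin{pmatrix} -3 & 1 \\ 1 & -3 \end{pmatrix}, \quad E^{-1} = \frac{1}{8}\begin{pmatrix} -3 & -1 \\ -1 & -3 \end{pmatrix}, \quad L = -E^{-1}\mathbf{1} = \begin{pmatrix} 1/2 \\ 1/2 \end{pmatrix}.

Each Lerner index is 1/2, above the single-good benchmark 1/|e_{11}| = 1/3, consistent with the claim that substitutes raise the multiproduct margin.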
36 Two Good Formula for Complements. L = -E^{-1} 1 yields L_1 = \frac{1}{e_{11}} \cdot \frac{1 - e_{12}/e_{22}}{1 - \frac{e_{12}e_{21}}{e_{22}e_{11}}}. With complements (e_{12} < 0), the correction works in the opposite direction: a higher p_1 also reduces sales of good 2, so L_1 falls below what the single-good inverse elasticity rule would give.
37 Two Good Formula Review. L = -E^{-1} 1 yields L_1 = \frac{1}{e_{11}} \cdot \frac{1 - e_{12}/e_{22}}{1 - \frac{e_{12}e_{21}}{e_{22}e_{11}}}. If e_{12} > 0, the goods are substitutes: a price decrease on product 2 decreases sales of product 1 (they go in the same direction). If e_{12} < 0, the goods are complements: a price decrease on product 2 increases sales of product 1 (they go in opposite directions). e_{11} and e_{22} will be negative due to the law of demand (note that earlier we embedded the negative sign).
38 [Figure: bundling diagram in valuation space. Axes are the consumers' valuations v_1 and v_2; thresholds at p_1, p_2, and the bundle price p_B divide the space into regions Buy Nothing, Buy Good 1, Buy Good 2, and Buy Both.] Reducing the bundle price gives the additional sales of both goods with a single price cut.