Data Mining. 3.6 Regression Analysis. Fall 2008. Instructor: Dr. Masoud Yaghini. Numeric Prediction


Data Mining 3.6 Regression Analysis Fall 2008 Instructor: Dr. Masoud Yaghini

Outline Introduction Straight-Line Linear Regression Multiple Linear Regression Other Regression Models References

Introduction

Introduction
Numeric prediction is similar to classification:
- construct a model
- use the model to predict a continuous or ordered value for a given input
Prediction differs from classification:
- classification predicts categorical class labels
- prediction models continuous-valued functions
The major method for prediction is regression, which models the relationship between one or more independent (predictor) variables and a dependent (response) variable.

Introduction
In the context of data mining:
- The predictor variables are the attributes of interest describing the tuple, and their values are known.
- The response variable is what we want to predict.
Many texts use the terms regression and numeric prediction interchangeably. Some classification techniques can be adapted for prediction, e.g. backpropagation, support vector machines, and k-nearest-neighbor classifiers.

Introduction
Regression analysis is a good choice when all of the predictor variables are continuous valued as well. Regression analysis methods:
- Linear regression
  - Straight-line linear regression
  - Multiple linear regression
- Nonlinear regression
- Generalized linear models
  - Poisson regression
  - Logistic regression
- Log-linear models
- Regression trees and model trees

Straight-Line Linear Regression

Linear Regression
Straight-line linear regression involves a response variable y and a single predictor variable x:
y = w0 + w1 x
where w0 (the y-intercept) and w1 (the slope) are the regression coefficients.

Linear Regression
Method of least squares: estimates the best-fitting straight line as the one that minimizes the error between the actual data and the estimate of the line.
- D: a training set of |D| data points of the form (x1, y1), (x2, y2), ..., (x|D|, y|D|), where the xi are values of the predictor variable and the yi are values of the response variable
- x̄: the mean value of x1, x2, ..., x|D|
- ȳ: the mean value of y1, y2, ..., y|D|
The regression coefficients are then estimated by:
w1 = Σi (xi − x̄)(yi − ȳ) / Σi (xi − x̄)²
w0 = ȳ − w1 x̄
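The least squares estimates above can be sketched in a few lines of Python. The dataset here is made up for illustration and is not the salary table used in the lecture:

```python
# Minimal sketch of the method of least squares for straight-line
# regression, y = w0 + w1*x, using the mean-deviation formulas.
def least_squares_fit(xs, ys):
    """Return (w0, w1) minimizing the squared error of y = w0 + w1*x."""
    n = len(xs)
    x_bar = sum(xs) / n          # mean of the predictor values
    y_bar = sum(ys) / n          # mean of the response values
    w1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) \
         / sum((x - x_bar) ** 2 for x in xs)
    w0 = y_bar - w1 * x_bar      # the fitted line passes through (x_bar, y_bar)
    return w0, w1

# Points lying exactly on y = 2 + 3x recover the coefficients:
w0, w1 = least_squares_fit([1, 2, 3, 4], [5, 8, 11, 14])
print(w0, w1)  # → 2.0 3.0
```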

Example: Salary problem The table shows a set of paired data where x is the number of years of work experience of a college graduate and y is the corresponding salary of the graduate.

Linear Regression The 2-D data can be graphed on a scatter plot. The plot suggests a linear relationship between the two variables, x and y.

Example: Salary data
Given the above data, we compute x̄ and ȳ, substitute them into the least squares formulas to obtain w1 and w0, and thus get the equation of the estimated least squares line.

Example: Salary data
Linear Regression: y = 3.5x + 23.2
(Scatter plot of Salary vs. Years of experience, with the fitted regression line.)

Multiple Linear Regression

Multiple Linear Regression
Multiple linear regression involves more than one predictor variable. The training data is of the form (X1, y1), (X2, y2), ..., (X|D|, y|D|), where the Xi are n-dimensional training tuples with associated response values yi. An example of a multiple linear regression model based on two predictor attributes, x1 and x2:
y = w0 + w1 x1 + w2 x2
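As a sketch of fitting the two-predictor model above by least squares, assuming NumPy and a small invented dataset rather than the lecture's CPU table:

```python
import numpy as np

# Toy multiple linear regression: each row of X is one tuple (x1, x2),
# and y is generated exactly from y = 1 + 2*x1 + 3*x2 so the fit is exact.
X = np.array([[1.0, 2.0],
              [2.0, 1.0],
              [3.0, 4.0],
              [4.0, 3.0]])
y = 1.0 + 2.0 * X[:, 0] + 3.0 * X[:, 1]

# Prepend a column of 1s so the intercept w0 is estimated with the slopes.
A = np.column_stack([np.ones(len(X)), X])
w, *_ = np.linalg.lstsq(A, y, rcond=None)
print(w)  # ≈ [1. 2. 3.]
```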

Example: CPU performance data

Multiple Linear Regression
- Various statistical measures exist for determining how well the proposed model can predict y (described later).
- The greater the number of predictor attributes, the slower the performance.
- Before applying regression analysis, it is common to perform attribute subset selection to eliminate attributes that are unlikely to be good predictors of y.
- In general, regression analysis is accurate for prediction, except when the data contain outliers.

Other Regression Models

Nonlinear Regression
Sometimes we can get a more accurate model using a nonlinear model, for example the polynomial:
y = w0 + w1 x + w2 x² + w3 x³
Some nonlinear models can be transformed into a linear regression model. For example, the function above can be converted to linear form by defining the new variables x2 = x² and x3 = x³:
y = w0 + w1 x + w2 x2 + w3 x3
which can then be solved by the method of least squares.
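The variable-transformation trick above can be sketched as follows, assuming NumPy and made-up cubic coefficients: the new columns x² and x³ turn the cubic model into an ordinary linear fit.

```python
import numpy as np

# Toy data generated exactly from y = 1 - 2x + 0.5x^2 + 0.25x^3.
x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0, 3.0])
y = 1.0 - 2.0 * x + 0.5 * x**2 + 0.25 * x**3

# Introduce the new variables x2 = x**2 and x3 = x**3, then fit a
# *linear* regression on the columns [1, x, x2, x3] by least squares.
A = np.column_stack([np.ones_like(x), x, x**2, x**3])
w, *_ = np.linalg.lstsq(A, y, rcond=None)
print(w)  # ≈ [1. -2. 0.5 0.25]
```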

Generalized linear models
The generalized linear model is the foundation on which linear regression can be applied to the modeling of categorical response variables. Common types of generalized linear models include:
- Logistic regression: models the probability of some event occurring as a linear function of a set of predictor variables.
- Poisson regression: models count data that exhibit a Poisson distribution.
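The logistic regression idea can be sketched in plain Python. This is a toy illustration under stated assumptions (a tiny made-up binary dataset, plain gradient ascent on the log-likelihood), not any particular library's fitting procedure:

```python
import math

def sigmoid(z):
    # Logistic function: maps the linear score w0 + w1*x to a probability.
    return 1.0 / (1.0 + math.exp(-z))

# Invented data: the event (y = 1) becomes likely as x grows.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [0, 0, 0, 1, 1, 1]

w0, w1, lr = 0.0, 0.0, 0.5
for _ in range(2000):  # gradient ascent on the Bernoulli log-likelihood
    g0 = sum(y - sigmoid(w0 + w1 * x) for x, y in zip(xs, ys))
    g1 = sum((y - sigmoid(w0 + w1 * x)) * x for x, y in zip(xs, ys))
    w0 += lr * g0 / len(xs)
    w1 += lr * g1 / len(xs)

# Low predicted probability at x = 0, high at x = 5:
print(sigmoid(w0 + w1 * 0.0) < 0.5, sigmoid(w0 + w1 * 5.0) > 0.5)  # → True True
```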

Log-linear models
In the log-linear method, all attributes must be categorical; continuous-valued attributes must first be discretized.

Regression Trees and Model Trees
Trees can be used to predict continuous values rather than class labels. Regression and model trees tend to be more accurate than linear regression when the data are not represented well by a simple linear model.

Regression trees
- Regression tree: a decision tree where each leaf predicts a numeric quantity.
- Proposed in the CART system (Breiman et al. 1984); CART: Classification And Regression Trees.
- The predicted value is the average value of the training instances that reach the leaf.
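The leaf-averaging idea can be sketched for a single split on one attribute: choose the threshold that most reduces squared error, then predict the mean response in each leaf. This is a one-level toy sketch of the regression-tree principle, not the full CART algorithm:

```python
# One-level regression "stump": try each candidate threshold, keep the
# one with the smallest total squared error, and predict leaf averages.
def best_split(xs, ys):
    def sse(vals):
        if not vals:
            return 0.0
        m = sum(vals) / len(vals)
        return sum((v - m) ** 2 for v in vals)

    best = None
    for t in sorted(set(xs))[1:]:  # candidate thresholds between values
        left = [y for x, y in zip(xs, ys) if x < t]
        right = [y for x, y in zip(xs, ys) if x >= t]
        err = sse(left) + sse(right)
        if best is None or err < best[0]:
            best = (err, t,
                    sum(left) / len(left),    # left-leaf prediction: mean
                    sum(right) / len(right))  # right-leaf prediction: mean
    return best[1:]  # (threshold, left_mean, right_mean)

t, lo, hi = best_split([1, 2, 3, 10, 11, 12], [5, 6, 7, 40, 41, 42])
print(t, lo, hi)  # → 10 6.0 41.0
```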

Example: CPU performance problem Regression tree for the CPU data

Example: CPU performance problem
We calculate the average of the absolute values of the errors between the predicted and the actual CPU performance measures. It turns out to be significantly smaller for the tree than for the regression equation.
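The evaluation measure described above, the mean of the absolute errors, is a one-liner. The arrays below are invented for illustration, not the CPU data:

```python
# Mean absolute error between actual and predicted values.
def mean_absolute_error(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

print(mean_absolute_error([10, 20, 30], [12, 18, 33]))  # ≈ 2.33
```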

Model tree
- Model tree: each leaf holds a regression model, a multivariate linear equation for the predicted attribute.
- Proposed by Quinlan (1992).
- A more general case than the regression tree.

Example: CPU performance problem Model tree for the CPU data

References

References
- J. Han and M. Kamber, Data Mining: Concepts and Techniques, Elsevier Inc., 2006. (Chapter 6)
- I. H. Witten and E. Frank, Data Mining: Practical Machine Learning Tools and Techniques, 2nd Edition, Elsevier Inc., 2005. (Chapter 6)

The end