Psychology 282 Lecture #24 Outline Regression Diagnostics: Outliers
- Steven Daniel
- 6 years ago
In an earlier lecture we studied the statistical assumptions underlying the regression model, including the following points:
- Formal statement of assumptions.
- Consequences of violations.
- Methods for detecting violations.
- Remedies for violations.

Methods for detecting violations of assumptions are often referred to as diagnostics in that they are used to diagnose or reveal problems in the data. We now extend our use of diagnostics to address two other issues of potential importance in almost any application of multiple regression:
- Outliers
- Multicollinearity

Each of these phenomena can have a major impact on regression results and interpretation. Any high-quality application of multiple linear regression (MLR) should include a direct and careful investigation of both issues to determine if any serious problem is present and, if so, how it should be remedied.
Outliers

Outliers are atypical data points that do not fit with the rest of the data. An outlier may arise due to some sort of contamination or error, or may be a valid but very extreme observation. (More on this later.)

Outliers may have a dramatic impact on the results of regression analyses, potentially having a major impact on effect sizes and regression coefficients. Outliers may cause a weak (or zero) linear relationship to appear to be a strong linear relationship, or may have the opposite effect by masking a strong linear relationship. Outliers tend to have a stronger effect when n is small than when n is large.

Detection of outliers

In applications of MLR it is useful to be able to detect and identify outliers. As we shall see later, once an outlier is detected the investigator must carefully attempt to determine the source or cause of the atypical observation and then consider how to remedy the situation. But first we will consider methods for detecting outliers.

Outlier detection methods involve the use of statistics that are obtained for each case (or observation). Three types of detection measures are commonly used:
- Leverage: Extremity of each case on the IVs.
- Discrepancy: Extremity of each case on the DV.
- Influence: Influence of each case on regression results.

These statistics are commonly available in commercial statistical software. The general approach is to obtain such measures for each case and then determine which cases, if any, exhibit sufficiently extreme values to be considered outliers.

Leverage: Measures of leverage assess the extremity, or atypicality, of each case on the IVs. Extreme cases have the potential to have great influence on results of regression analyses.
When there is only one IV, the usual measure of leverage is given by:

$$h_{ii} = \frac{1}{n} + \frac{(X_i - \bar{X})^2}{\sum x^2}$$

where $\bar{X}$ is the mean of X and $\sum x^2$ is the sum of squared deviations of X from its mean. Cases near the mean of X produce low values of $h_{ii}$, whereas cases further from the mean produce larger values.

This measure of leverage can be extended to the case of k IVs, but the formula requires matrix representation and is omitted here. In this more general context, observations near the centroid (joint mean) of the distribution of the IVs yield low values of $h_{ii}$, and cases further from the centroid yield larger values.

Once we obtain the $h_{ii}$ values, one for each of the n cases, we need to examine them to identify extreme values. There are two common approaches:

1) Index plot: This is a plot where the horizontal axis represents case number, running from 1 to n, and the vertical axis represents the leverage measure. See examples in the text. Visual inspection of the plot can reveal cases with extreme leverage values. Index plots are most useful when n is not too large.
2) Cutoff values: There are rule-of-thumb cutoffs that are more useful when n is large. Common values include $h_{ii} > 2(k+1)/n$ or $h_{ii} > 3(k+1)/n$. A general principle is that we do not want to identify a large number of cases as outliers, so the use of more extreme cutoff values will result in fewer cases being identified.

This process helps us to identify observations that are highly discrepant on the IVs and which thus have a large potential influence on results of the regression analysis.

Discrepancy: Measures of discrepancy assess extremity on the DV in the context of the regression model. A simple measure of extremity would be the raw regression residual for each case:

$$Y_i - \hat{Y}_i$$

However, it must be kept in mind that an extreme observation will influence the regression line in such a way as to make the corresponding residual smaller for that observation. A better measure of discrepancy for case i would be the value of the residual that would be obtained if that case were not included in the regression model.
This value is designated

$$d_i = Y_i - \hat{Y}_{i(i)}$$

where $\hat{Y}_{i(i)}$ is the predicted value of Y that would be obtained for case i using a regression equation derived from the sample excluding case i. Observations exhibiting a large value of $d_i$ are cases that are deviant in terms of their residuals when the regression equation is derived based on the rest of the sample.

To put these values on a standardized scale we define

$$t_i = \frac{d_i}{SE_{d_i}}$$

These values are called Studentized residuals. We then wish to identify extreme values by using either an index plot (when n is small to moderate) or cutoff values. Since these residuals approximately follow a t-distribution, common cutoffs are ±2 in small to moderate samples, and ±3 or ±4 in large samples.

By this process we can identify observations that are highly discrepant on the DV in the context of the regression model.
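As a rough illustration of the leverage and discrepancy measures described above, the following sketch computes $h_{ii}$ and the Studentized residuals $t_i$ for a single IV by literally refitting the line with each case left out. The data, the function names, and the choice of the $2(k+1)/n$ and ±2 cutoffs are assumptions made for the example; statistical software normally obtains the same quantities without refitting.

```python
import math

def fit_line(x, y):
    """Ordinary least-squares intercept and slope for a single IV."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    ss_x = sum((xi - mean_x) ** 2 for xi in x)
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / ss_x
    return mean_y - slope * mean_x, slope

def leverage(x):
    """h_ii = 1/n + (X_i - mean)^2 / (sum of squared deviations); one IV only."""
    n = len(x)
    mean_x = sum(x) / n
    ss_x = sum((xi - mean_x) ** 2 for xi in x)
    return [1 / n + (xi - mean_x) ** 2 / ss_x for xi in x]

def studentized_deleted_residuals(x, y):
    """t_i = d_i / SE(d_i), where the line is refitted with case i left out."""
    t_values = []
    for i in range(len(x)):
        x_rest, y_rest = x[:i] + x[i + 1:], y[:i] + y[i + 1:]
        m = len(x_rest)
        intercept, slope = fit_line(x_rest, y_rest)
        mean_x = sum(x_rest) / m
        ss_x = sum((xi - mean_x) ** 2 for xi in x_rest)
        ms_residual = sum((yi - (intercept + slope * xi)) ** 2
                          for xi, yi in zip(x_rest, y_rest)) / (m - 2)
        d_i = y[i] - (intercept + slope * x[i])
        # Standard error of a prediction error for a case outside the fitting sample.
        se_d = math.sqrt(ms_residual * (1 + 1 / m + (x[i] - mean_x) ** 2 / ss_x))
        t_values.append(d_i / se_d)
    return t_values

# Hypothetical data: case 8 is extreme on X but follows the trend;
# case 5 is typical on X but far from the trend on Y.
x = [1, 2, 3, 4, 5, 6, 7, 20]
y = [2.1, 3.9, 6.2, 7.8, 15.0, 12.1, 13.9, 40.1]
n, k = len(x), 1

h = leverage(x)
t = studentized_deleted_residuals(x, y)

# Case numbers are 1-based, as in an index plot.
high_leverage = [i + 1 for i, h_i in enumerate(h) if h_i > 2 * (k + 1) / n]
discrepant = [i + 1 for i, t_i in enumerate(t) if abs(t_i) > 2]
print("high leverage cases:", high_leverage)
print("discrepant cases:", discrepant)
```

The two lists need not overlap: in this made-up sample the case that is extreme on the IV is not the case that is extreme on the DV, which is why leverage and discrepancy are examined separately.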
Influence: Influence statistics measure the influence of each case on the regression model. These measures combine information represented by the notions of leverage and discrepancy. There are two kinds of influence measures: global measures, and measures of influence on a specific regression coefficient.

Global influence: A global influence measure assesses the change in the predicted Y values as a function of whether an observation, i, is included in the sample or not. For each case we obtain a measure called DFFITS:

$$DFFITS_i = \frac{\hat{Y}_i - \hat{Y}_{i(i)}}{\sqrt{MS_{residual(i)}\, h_{ii}}}$$

The numerator represents the difference in predicted Y values obtained when case i is included vs. excluded from the analysis, and the denominator, which uses the residual mean square from the sample excluding case i, serves to standardize these values.

This index of influence is closely related to another commonly used index called Cook's distance:

$$\text{Cook's distance}_i \approx \frac{DFFITS_i^2}{k+1}$$
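The DFFITS formula above can be sketched for a single IV as follows. This is only an illustration under stated assumptions: the data and function names are made up, each case is deleted and the line refitted by brute force, and the flagging threshold of 1 in absolute value is a common small-sample rule of thumb chosen for the example.

```python
import math

def fit_line(x, y):
    """Ordinary least-squares intercept and slope for a single IV."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    ss_x = sum((xi - mean_x) ** 2 for xi in x)
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / ss_x
    return mean_y - slope * mean_x, slope

def dffits(x, y):
    """DFFITS_i = (Yhat_i - Yhat_i(i)) / sqrt(MS_residual(i) * h_ii)."""
    n = len(x)
    mean_x = sum(x) / n
    ss_x = sum((xi - mean_x) ** 2 for xi in x)
    intercept, slope = fit_line(x, y)
    values = []
    for i in range(n):
        x_rest, y_rest = x[:i] + x[i + 1:], y[:i] + y[i + 1:]
        intercept_i, slope_i = fit_line(x_rest, y_rest)
        # Residual mean square from the sample that excludes case i.
        ms_residual_i = sum((yi - (intercept_i + slope_i * xi)) ** 2
                            for xi, yi in zip(x_rest, y_rest)) / (len(x_rest) - 2)
        h_ii = 1 / n + (x[i] - mean_x) ** 2 / ss_x
        y_hat = intercept + slope * x[i]              # case i included
        y_hat_deleted = intercept_i + slope_i * x[i]  # case i excluded
        values.append((y_hat - y_hat_deleted) / math.sqrt(ms_residual_i * h_ii))
    return values

# Hypothetical data: case 8 is extreme on X and also far below the trend on Y.
x = [1, 2, 3, 4, 5, 6, 7, 15]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9, 14.2, 12.0]

scores = dffits(x, y)
influential = [i + 1 for i, s in enumerate(scores) if abs(s) > 1]
print("cases with |DFFITS| > 1:", influential)
```

A negative DFFITS value means that including the case pulls its own predicted value downward relative to the fit obtained without it.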
Once we obtain one of these measures for each case we again seek to identify extreme values using an index plot, when n is not too large, or cutoffs. Commonly used cutoffs are absolute values > 1 when n is small to moderate, or absolute values > $2\sqrt{(k+1)/n}$ when n is large.

This process helps us to identify observations that have a relatively large global influence on the results of the regression analysis.

Influence on specific regression coefficients: In some studies we may have a primary interest in estimating and interpreting the value of regression coefficients associated with specific IVs. In such situations we would be interested in whether those particular coefficients might be highly influenced by outliers. Such influences can be assessed using an index called DFBETA. For a given IV, j, we can obtain a DFBETA value for each case, i:

$$DFBETA_{ij} = \frac{B_j - B_{j(i)}}{SE_{B_{j(i)}}}$$

where $B_j$ is the regression coefficient obtained from the full sample, and $B_{j(i)}$ is the coefficient obtained when case i is excluded from the sample.
This value represents the influence of observation i on the regression coefficient $B_j$. Once we obtain one of these measures for each case we again seek to identify extreme values using an index plot, when n is not too large, or cutoffs. Commonly used cutoffs are ±1 for small to moderate n, and values such as $\pm 2/\sqrt{n}$ when n is large.

This approach allows us to identify observations that have a large influence on the value of $B_j$. In practice we can obtain DFBETA values associated with each $B_j$ if so desired.

General approach

Given a sample of n observations on k IVs and one DV, we can obtain any or all of these diagnostic measures. For each measure we can identify extreme observations using index plots and/or conventional cutoff values. Index plots are more useful for small to moderate n, while cutoffs are useful for larger n.
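The DFBETA measure and the cutoff-based screening just described can be sketched for the slope of a one-IV regression. The data and names below are hypothetical, and the denominator is taken here to be the usual standard error of the slope estimated from the sample with case i removed; that reading of $SE_{B_{j(i)}}$ is an assumption, and software may compute the standardizing term somewhat differently.

```python
import math

def fit_line(x, y):
    """Return (intercept, slope, standard error of the slope) for one IV."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    ss_x = sum((xi - mean_x) ** 2 for xi in x)
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / ss_x
    intercept = mean_y - slope * mean_x
    ms_residual = sum((yi - (intercept + slope * xi)) ** 2
                      for xi, yi in zip(x, y)) / (n - 2)
    return intercept, slope, math.sqrt(ms_residual / ss_x)

def dfbeta_slope(x, y):
    """DFBETA for the slope: (B - B_(i)) / SE of B_(i), one value per case."""
    _, slope_full, _ = fit_line(x, y)
    values = []
    for i in range(len(x)):
        x_rest, y_rest = x[:i] + x[i + 1:], y[:i] + y[i + 1:]
        _, slope_i, se_slope_i = fit_line(x_rest, y_rest)
        values.append((slope_full - slope_i) / se_slope_i)
    return values

# Hypothetical data: case 8 is extreme on X and far below the trend on Y.
x = [1, 2, 3, 4, 5, 6, 7, 15]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9, 14.2, 12.0]

dfbeta = dfbeta_slope(x, y)
# Small-sample rule of thumb from the text: flag |DFBETA| > 1.
flagged = [i + 1 for i, value in enumerate(dfbeta) if abs(value) > 1]
print("cases with |DFBETA| > 1 for the slope:", flagged)
```

With several IVs the same loop would be repeated for each coefficient of interest, giving one column of DFBETA values per $B_j$.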
In any event, we do not want this process to result in identification of large numbers of outliers. Any outliers that are identified must be evaluated, and it is impractical to evaluate large numbers of such cases.

Of the various diagnostic statistics defined above, measures of influence are probably the most useful and important. These statistics combine information represented by measures of leverage and discrepancy and indicate the actual influence of each case on results of regression analyses. Cases that exhibit extreme values of leverage or discrepancy but do not exhibit substantial influence are probably not problematic. For cases identified as having high influence, measures of leverage and discrepancy can provide more detail about the specific nature of the extremity or atypicality of those cases.

What to do when outliers are identified

When outliers are clearly identified it is useful and potentially of great importance to attempt to determine their source or cause.
Two primary causes:
- Contamination or error: Some sort of error occurred in the measurement or data recording process. In such cases, the error may be fixed or, if that is not possible, the outliers may be deleted from the sample.
- Valid but rare cases: The outliers are valid observations but are extreme in some way relative to the rest of the sample.

Determining what to do with outliers of the second type can be problematic. There is a tension between two goals:

1) Retaining and seeking to account for all valid data.

2) Obtaining results that represent the general effects present in the population, are not overly influenced by individual cases, and generalize and cross-validate well.

Both objectives are legitimate and important. As a result, outliers should not be deleted casually.
An attempt to determine the nature of the outliers can provide important information and insights. For example:
- Outliers may be observations from a different population than the one of interest.
- Outliers may arise due to unexpected or unrecognized effects or phenomena.
- Outliers may reveal misspecification of the regression model (e.g., nonlinear vs. linear).

It should also be noted that failure to delete outliers may be ethically problematic. It can and does happen that a finding of a statistically significant effect occurs because of the effects of a small number of outliers, and that if those outliers are removed the effect vanishes. One could argue that it would be unethical to report such a significant effect without noting its dependence on the presence of a few extreme observations.

If a decision is made to delete outliers from the data set so as to eliminate their influence, then the following points should be kept in mind:
1) It is essential that the investigator report this decision along with the number and nature of the outliers that have been deleted.

2) The deletion of outliers produces, in effect, a new sample. In that new sample, values of all of the diagnostic statistics defined above would be different than they were prior to the deletion of the outliers. It is advisable to re-compute the diagnostic measures to determine whether any observations would now be identified as outliers, although they were not identified as outliers in the full sample.

Finally, note that there exist robust regression methods that are designed to be less sensitive to effects of outliers. These methods do not use ordinary least squares estimation as in the standard MLR methods. See text for discussion and references.
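Points 1) and 2) above suggest a simple routine after any deletion: report results with and without the deleted cases, and re-run the diagnostics on the reduced sample. The sketch below does this for one IV using leverage as the re-computed diagnostic; the data, the decision to delete case 8, and the $2(k+1)/n$ cutoff are all assumptions made for illustration.

```python
import math

def leverage(x):
    """h_ii = 1/n + (X_i - mean)^2 / (sum of squared deviations); one IV only."""
    n = len(x)
    mean_x = sum(x) / n
    ss_x = sum((xi - mean_x) ** 2 for xi in x)
    return [1 / n + (xi - mean_x) ** 2 / ss_x for xi in x]

def slope_and_r(x, y):
    """Least-squares slope and Pearson correlation for one IV."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    ss_x = sum((xi - mean_x) ** 2 for xi in x)
    ss_y = sum((yi - mean_y) ** 2 for yi in y)
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return s_xy / ss_x, s_xy / math.sqrt(ss_x * ss_y)

def high_leverage_cases(x, k=1, multiplier=2):
    """1-based positions (within the sample passed in) exceeding the cutoff."""
    n = len(x)
    return [i + 1 for i, h_i in enumerate(leverage(x))
            if h_i > multiplier * (k + 1) / n]

# Hypothetical data: case 8 is extreme on X and far below the trend on Y.
x = [1, 2, 3, 4, 5, 6, 7, 15]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9, 14.2, 12.0]

deleted = [8]  # hypothetical decision; it should be reported with the results
x_kept = [xi for case, xi in enumerate(x, start=1) if case not in deleted]
y_kept = [yi for case, yi in enumerate(y, start=1) if case not in deleted]

slope_full, r_full = slope_and_r(x, y)
slope_kept, r_kept = slope_and_r(x_kept, y_kept)
flags_full = high_leverage_cases(x)
flags_kept = high_leverage_cases(x_kept)  # diagnostics re-computed on the new sample

print(f"full sample:    slope={slope_full:.3f}, r={r_full:.3f}, flagged={flags_full}")
print(f"after deletion: slope={slope_kept:.3f}, r={r_kept:.3f}, flagged={flags_kept}")
```

In this made-up sample the single extreme case masks a strong linear relationship, and no further case is flagged once it is removed; with real data the second pass may well flag new cases, which is the reason for re-computing.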
More informationEmpirical Methods for Corporate Finance. Identification
mprcal Methods for Corporate Fnance Identfcaton Causalt Ultmate goal of emprcal research n fnance s to establsh a causal relatonshp between varables.g. What s the mpact of tangblt on leverage?.g. What
More information4 Analysis of Variance (ANOVA) 5 ANOVA. 5.1 Introduction. 5.2 Fixed Effects ANOVA
4 Analyss of Varance (ANOVA) 5 ANOVA 51 Introducton ANOVA ANOVA s a way to estmate and test the means of multple populatons We wll start wth one-way ANOVA If the populatons ncluded n the study are selected
More informationx yi In chapter 14, we want to perform inference (i.e. calculate confidence intervals and perform tests of significance) in this setting.
The Practce of Statstcs, nd ed. Chapter 14 Inference for Regresson Introducton In chapter 3 we used a least-squares regresson lne (LSRL) to represent a lnear relatonshp etween two quanttatve explanator
More informationDiscussion of Extensions of the Gauss-Markov Theorem to the Case of Stochastic Regression Coefficients Ed Stanek
Dscusson of Extensons of the Gauss-arkov Theorem to the Case of Stochastc Regresson Coeffcents Ed Stanek Introducton Pfeffermann (984 dscusses extensons to the Gauss-arkov Theorem n settngs where regresson
More informationOnline Appendix to: Axiomatization and measurement of Quasi-hyperbolic Discounting
Onlne Appendx to: Axomatzaton and measurement of Quas-hyperbolc Dscountng José Lus Montel Olea Tomasz Strzaleck 1 Sample Selecton As dscussed before our ntal sample conssts of two groups of subjects. Group
More informationSIMPLE LINEAR REGRESSION and CORRELATION
Expermental Desgn and Statstcal Methods Workshop SIMPLE LINEAR REGRESSION and CORRELATION Jesús Pedrafta Arlla jesus.pedrafta@uab.cat Departament de Cènca Anmal dels Alments Items Correlaton: degree of
More information1. Inference on Regression Parameters a. Finding Mean, s.d and covariance amongst estimates. 2. Confidence Intervals and Working Hotelling Bands
Content. Inference on Regresson Parameters a. Fndng Mean, s.d and covarance amongst estmates.. Confdence Intervals and Workng Hotellng Bands 3. Cochran s Theorem 4. General Lnear Testng 5. Measures of
More informationChapter 3. Two-Variable Regression Model: The Problem of Estimation
Chapter 3. Two-Varable Regresson Model: The Problem of Estmaton Ordnary Least Squares Method (OLS) Recall that, PRF: Y = β 1 + β X + u Thus, snce PRF s not drectly observable, t s estmated by SRF; that
More informationRESIDUALS AND INFLUENCE IN NONLINEAR REGRESSION FOR REPEATED MEASUREMENT DATA
Operatons Research and Applcatons : An Internatonal Journal (ORAJ), Vol.4, No.3/4, November 17 RESIDUALS AND INFLUENCE IN NONLINEAR REGRESSION FOR REPEAED MEASUREMEN DAA Munsr Al, Yu Feng, Al choo, Zamr
More informationThis column is a continuation of our previous column
Comparson of Goodness of Ft Statstcs for Lnear Regresson, Part II The authors contnue ther dscusson of the correlaton coeffcent n developng a calbraton for quanttatve analyss. Jerome Workman Jr. and Howard
More informationA LINEAR PROGRAM TO COMPARE MULTIPLE GROSS CREDIT LOSS FORECASTS. Dr. Derald E. Wentzien, Wesley College, (302) ,
A LINEAR PROGRAM TO COMPARE MULTIPLE GROSS CREDIT LOSS FORECASTS Dr. Derald E. Wentzen, Wesley College, (302) 736-2574, wentzde@wesley.edu ABSTRACT A lnear programmng model s developed and used to compare
More informationInterval Estimation in the Classical Normal Linear Regression Model. 1. Introduction
ECONOMICS 35* -- NOTE 7 ECON 35* -- NOTE 7 Interval Estmaton n the Classcal Normal Lnear Regresson Model Ths note outlnes the basc elements of nterval estmaton n the Classcal Normal Lnear Regresson Model
More informationIndeterminate pin-jointed frames (trusses)
Indetermnate pn-jonted frames (trusses) Calculaton of member forces usng force method I. Statcal determnacy. The degree of freedom of any truss can be derved as: w= k d a =, where k s the number of all
More informationLecture 9: Linear regression: centering, hypothesis testing, multiple covariates, and confounding
Lecture 9: Lnear regresson: centerng, hypothess testng, multple covarates, and confoundng Sandy Eckel seckel@jhsph.edu 6 May 008 Recall: man dea of lnear regresson Lnear regresson can be used to study
More informationComposite Hypotheses testing
Composte ypotheses testng In many hypothess testng problems there are many possble dstrbutons that can occur under each of the hypotheses. The output of the source s a set of parameters (ponts n a parameter
More informationLecture 9: Linear regression: centering, hypothesis testing, multiple covariates, and confounding
Recall: man dea of lnear regresson Lecture 9: Lnear regresson: centerng, hypothess testng, multple covarates, and confoundng Sandy Eckel seckel@jhsph.edu 6 May 8 Lnear regresson can be used to study an
More informationTransfer Functions. Convenient representation of a linear, dynamic model. A transfer function (TF) relates one input and one output: ( ) system
Transfer Functons Convenent representaton of a lnear, dynamc model. A transfer functon (TF) relates one nput and one output: x t X s y t system Y s The followng termnology s used: x y nput output forcng
More informationStatistics II Final Exam 26/6/18
Statstcs II Fnal Exam 26/6/18 Academc Year 2017/18 Solutons Exam duraton: 2 h 30 mn 1. (3 ponts) A town hall s conductng a study to determne the amount of leftover food produced by the restaurants n the
More informationLab 2e Thermal System Response and Effective Heat Transfer Coefficient
58:080 Expermental Engneerng 1 OBJECTIVE Lab 2e Thermal System Response and Effectve Heat Transfer Coeffcent Warnng: though the experment has educatonal objectves (to learn about bolng heat transfer, etc.),
More information