Indices of Distances: Characteristics and Detection of Abnormal Points


International Journal of Mathematics and Computer Science, 8(2013), no. 2

Hicham Y. Abdallah
Department of Applied Mathematics, Faculty of Science-1, Lebanese University, Hadath, Lebanon
habdalah@ul.edu.lb

(Received September 5, 2012, Accepted November 1, 2013)

Abstract

Obtaining a robust regression requires the detection of abnormal points. In this article we give a solution to this problem based on Cook's distance, DFFITS, and DFBETAS, among others. We then compare the bounds on those distances that are used to detect abnormal points.

Key words and phrases: Outliers, influential points, distances.
AMS (MOS) Subject Classifications: 62J07, 62J12.

1 Introduction

Statistics is the science whose object is to collect, process and analyze data from the observation of random phenomena, that is, phenomena in which chance intervenes. Data analysis is used to describe these phenomena and to make predictions and decisions about them. Several statistical methods are available for such studies, but the most widely used is regression. Despite its effectiveness, the presence of influential points and outliers makes regression less robust and degrades its results. Our objective is to address this problem by showing the difference between influential points and outliers. We begin our study with a definition of influential points and outliers, and then we discuss the different methods for detecting abnormal values. In addition, we propose a theoretical comparison between the different indices in order to find the most effective index for detecting abnormal points.

2 Influential Points and Outliers

The study of the residuals $y_i - \hat{y}_i$, where $\hat{y}_i$ is the prediction of $y_i$, can identify outlying observations, or observations that play an important role in determining the regression. The two types of abnormal points are outliers and influential points. The first correspond to observations outside the norm, while the second are points that weigh (unrealistically) on the estimates: if they were removed, the results would be significantly different. There are several methods for detecting these values, according to their type. Once these observations have been identified, it may be better to remove them or to use other, more robust criteria.

1 - Influential points: An influential point weighs heavily in the regression; that is to say, the results are quite different depending on whether or not the point is taken into account in the regression. The problem of influential values arises especially in business surveys, which collect economic variables whose distributions are highly asymmetric. Influential values are problematic because they generally lead to unstable estimators. In other words, including or excluding an influential sample value usually has a significant impact on the volatility of an estimator. It is possible to reduce their impact through an appropriate sampling plan. However, it is generally not possible to eliminate the problem of influential values completely at each step of the plan. As a result, it is important to develop estimation methods that are robust in the presence of influential values.

2 - Outliers: Before presenting concepts related to outliers, it is necessary to define them more precisely, as many authors have attempted to describe them and the definitions have changed over time. Among these authors, Grubbs defined these values as follows: an outlier is an observation that appears to deviate markedly from all other members of the sample in which it appears.

In 1994, Barnett and Lewis defined an outlier in a data set as an observation (or set of observations) which appears to be inconsistent with the other data. That is, outliers correspond to observations outside the norm of the population studied. These points can distort the results of the regression.

3 The methods for detection of abnormal values

1 - Detection of outliers: The indices that help us detect outliers are:

a - Standardized residual $e_i'$: This residual compares $y_i$ with $\hat{y}_i$, the fit being computed in the presence of the ith observation. An abnormally large standardized residual is considered suspect. More precisely, if the model is correct, 95% of the observations satisfy

$$|e_i'| \le t(0.025;\, n-p), \qquad \text{where} \qquad e_i' = \frac{e_i}{s\sqrt{1-h_{ii}}}.$$

The observations that qualify as outliers belong to the critical region $|e_i'| > t(0.025;\, n-p)$.

b - Studentized residual $e_i^*$: This residual compares $y_i$ with $\hat{y}_i$ computed without the ith observation. Here too, an abnormally large studentized residual is considered suspect. For such observations we have $|e_i^*| > t(0.025;\, n-p-1)$. In practice, we use the following formulas:

$$e_i^* = \frac{e_{(-i)}}{s_{(-i)}\sqrt{1-h_{ii}}}, \qquad e_i = y_i - \hat{y}_i, \qquad h_{ii} = x_i (X'X)^{-1} x_i',$$

where $s$ is the standard deviation, $x_i$ is the ith row of $X$, $e_{(-i)}$ is the residual without the ith observation, and $s_{(-i)}$ is the standard deviation without the ith observation.
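The two residuals defined above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the data are synthetic (not the student data of the paper), the function name `residual_diagnostics` is hypothetical, and the studentized residual is computed in its commonly used form, with the ordinary residual $e_i$ in the numerator and $s_{(-i)}$ obtained from the usual deletion identity rather than by refitting.

```python
import numpy as np

def residual_diagnostics(X, y):
    """Leverages, standardized and studentized residuals of an OLS fit.

    X holds the p explanatory variables (no intercept column); an intercept
    is added here, so the model has p + 1 coefficients.
    """
    n, p = X.shape
    Xd = np.column_stack([np.ones(n), X])            # design matrix with intercept
    xtx_inv = np.linalg.inv(Xd.T @ Xd)
    beta = xtx_inv @ Xd.T @ y                        # least-squares coefficients
    e = y - Xd @ beta                                # ordinary residuals e_i
    h = np.einsum("ij,jk,ik->i", Xd, xtx_inv, Xd)    # h_ii = x_i (X'X)^-1 x_i'
    df = n - p - 1                                   # residual degrees of freedom
    s2 = e @ e / df                                  # s^2 from all observations
    standardized = e / np.sqrt(s2 * (1.0 - h))
    # s(-i)^2 from the deletion identity, so no refit is needed
    s2_del = (df * s2 - e**2 / (1.0 - h)) / (df - 1)
    studentized = e / np.sqrt(s2_del * (1.0 - h))
    return h, standardized, studentized

# Synthetic data (assumption: purely illustrative)
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(30, 2))
y_demo = 1.0 + X_demo @ np.array([2.0, -1.0]) + rng.normal(scale=0.5, size=30)
y_demo[4] += 5.0                                     # plant one gross outlier

h, r_std, r_stu = residual_diagnostics(X_demo, y_demo)
suspects = np.flatnonzero(np.abs(r_stu) > 2.0)       # rule of thumb: |residual| > 2
print(suspects)
```

The cut-off of 2 used here is the rule of thumb from the summary table of the paper; the exact Student quantile could be substituted when a statistics library is available.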

The previous two criteria help detect potentially influential observations through the size of their residuals. This information is synthesized in criteria that directly assess the influence of an observation on certain parameters. All these indicators compare a parameter estimated without the ith observation with the same parameter estimated from all the observations.

c - Treatment of outliers: We have already mentioned the main methods of outlier detection; it remains to deal with the detected points in order to obtain a well-specified regression model. Several treatments are possible:

Reject: eliminate the extreme values identified by the discordancy test.
Incorporate: change the distribution model in order to incorporate the outliers.
Identify: keep the outliers, since they may represent particularly important features.
Accommodate: adopt statistical methods that minimize the impact of outliers on the statistical analysis.

2 - Detection of influential points: The indices that help us detect influential points are as follows:

a - Leverage point: In regression analysis, a leverage point is an observation i that significantly affects the estimators because its values on the explanatory variables differ markedly from the rest of the data. The leverage indicates the distance between observation i and the center of gravity of the cloud of points. The leverage $h_{ii}$ of an observation is read from the main diagonal of the hat matrix, and it measures the influence of the ith observation on its own prediction. In practice, an observation i is considered a leverage point if

$$h_{ii} > \frac{2(p+1)}{n}.$$

Note also that an observation with $h_{ii}$ approaching 1 has a very high leverage.

b - Cook's distance: The Cook's distance of an observation measures the influence of this observation on the whole set of predictions of a model. One calculates a distance between the vector $\hat{\beta}$ of regression coefficients and the vector $\hat{\beta}_{(-i)}$ obtained by repeating the regression without observation i.

When Cook's distance is normalized, a value greater than 1 is considered suspect. However, a limit of $\frac{4}{n-p-1}$ is often better, since a cut-off of 1 can let influential values through.
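As a rough illustration of the leverage rule and of Cook's distance as described above, the following sketch refits the regression without each observation in turn and measures how far the coefficient vector moves. The data are synthetic and the function name `leverage_and_cook` is hypothetical; the distance is scaled by $X'X$, the estimated error variance and $p+1$, which is the standard normalization and an assumption about the intended formula.

```python
import numpy as np

def leverage_and_cook(X, y):
    """Hat-matrix diagonal and Cook's distance by explicit leave-one-out refits."""
    n, p = X.shape
    Xd = np.column_stack([np.ones(n), X])            # design matrix with intercept
    xtx = Xd.T @ Xd
    h = np.einsum("ij,jk,ik->i", Xd, np.linalg.inv(xtx), Xd)
    beta = np.linalg.solve(xtx, Xd.T @ y)            # fit with all observations
    e = y - Xd @ beta
    s2 = e @ e / (n - p - 1)                         # estimated error variance
    cook = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i                     # drop observation i
        beta_i = np.linalg.lstsq(Xd[keep], y[keep], rcond=None)[0]
        diff = beta - beta_i
        cook[i] = diff @ xtx @ diff / (s2 * (p + 1))
    return h, cook

# Synthetic data (assumption: purely illustrative)
rng = np.random.default_rng(1)
X_demo = rng.normal(size=(30, 2))
y_demo = 1.0 + X_demo @ np.array([2.0, -1.0]) + rng.normal(scale=0.5, size=30)
X_demo[7] = [6.0, 6.0]                               # far from the cloud: high leverage
y_demo[7] += 4.0                                     # and off the regression plane

n, p = X_demo.shape
h, cook = leverage_and_cook(X_demo, y_demo)
high_leverage = np.flatnonzero(h > 2 * (p + 1) / n)  # rule h_ii > 2(p+1)/n
influential = np.flatnonzero(cook > 4 / (n - p - 1)) # rule D_i > 4/(n-p-1)
print(high_leverage, influential)
```

The explicit loop mirrors the definition; for large samples a closed form in terms of the residual and the leverage avoids the n refits.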

In practice, we have the following two equivalent formulas:

$$D_i = \frac{\sum_{j=1}^{n} \left(\hat{y}_j - \hat{y}_j(-i)\right)^2}{\hat{\sigma}_\varepsilon^2\,(p+1)}, \qquad D_i = \frac{(\hat{\beta} - \hat{\beta}_{(-i)})'\,(X'X)\,(\hat{\beta} - \hat{\beta}_{(-i)})}{\hat{\sigma}_\varepsilon^2\,(p+1)},$$

where $\hat{\sigma}_\varepsilon^2$ is the estimated error variance.

c - DFBETAS: The purpose of this index is to measure the influence of a point on each estimated coefficient. It is normalized so as to be comparable from one variable to another. A suspect observation is such that

$$|\mathrm{DFBETAS}| > \frac{2}{\sqrt{n}}.$$

Note: If there are many variables, we first look for the globally influential observations (Cook) and then, for each such observation, the variable(s) causing that influence (DFBETAS). In practice,

$$(\mathrm{DFBETAS})_{j,i} = \frac{\hat{\beta}_j - \hat{\beta}_j(-i)}{s_{(-i)}\sqrt{(X'X)^{-1}_{j,j}}},$$

where $(X'X)^{-1}_{j,j}$ is the jth entry of the diagonal of $(X'X)^{-1}$.

3 - Other criteria:

a - DFFITS: The DFFITS of an observation measures the influence of this observation on the prediction of its own value by the model. It is based on the difference between the fitted value for observation i and the value of y predicted for i by the model estimated without observation i. We consider an observation influential when

$$|\mathrm{DFFITS}_i| > 2\sqrt{\frac{p+1}{n}}, \qquad \text{where} \qquad \mathrm{DFFITS}_i = \frac{\hat{y}_i - \hat{y}_i(-i)}{s_{(-i)}\sqrt{h_{ii}}} = e_i^*\sqrt{\frac{h_{ii}}{1-h_{ii}}}.$$

b - COVRATIO: The COVRATIO measures differences in the precision of the estimators, that is to say, in the generalized variance of the estimators given by

$$\mathrm{Var}(\hat{\beta}) = \det\left(s^2 (X'X)^{-1}\right).$$

$\mathrm{COVRATIO}_i$ is the ratio of this generalized variance computed without observation i to the one computed with all the observations. The presence of observation i improves the accuracy, in the sense that it reduces the variance of the estimators, if COVRATIO > 1. Conversely, COVRATIO < 1 indicates that the presence of observation i degrades the variance. The most common detection rule is

$$|\mathrm{COVRATIO}_i - 1| > \frac{3(p+1)}{n}.$$

Theoretical comparison of distances

Based on the critical regions of these different distances, we can choose the one that gives the best detection result.

Rule: The method with the smallest critical value is the most accurate.

We therefore compare these critical values.

For the leverage point and COVRATIO: the critical value for the leverage point is $\frac{2(p+1)}{n}$ and that of COVRATIO is $\frac{3(p+1)}{n} + 1$; then $\frac{2(p+1)}{n} < \frac{3(p+1)}{n} + 1$, and so the leverage point is more accurate than COVRATIO.

For DFBETAS and DFFITS: the critical value for DFBETAS is $\frac{2}{\sqrt{n}}$ and that for DFFITS is $2\sqrt{\frac{p+1}{n}}$; then $\frac{2}{\sqrt{n}} < 2\sqrt{\frac{p+1}{n}}$, and therefore DFBETAS is more accurate than DFFITS.

For Cook's distance and the leverage point: the critical value for the leverage point is $\frac{2(p+1)}{n}$ and that of Cook's distance is $\frac{4}{n-p-1}$, and so $\frac{4}{n-p-1} < \frac{2(p+1)}{n}$ (this holds for $p \ge 2$ and n large enough).

For Cook's distance and DFBETAS: the critical value for DFBETAS is $\frac{2}{\sqrt{n}}$ and that of Cook's distance is $\frac{4}{n-p-1}$, and so $\frac{4}{n-p-1} < \frac{2}{\sqrt{n}}$ for n large enough.

Comparing the critical values of the criteria mentioned above, $\frac{4}{n-p-1} < 1 < 2$, $\frac{2}{\sqrt{n}} < 1 < 2$ and $\frac{2(p+1)}{n} < 2$ for $p+1 < n$, while the COVRATIO bound $\frac{3(p+1)}{n} + 1$ exceeds 1. This shows that DCOOK is the best distance for detecting suspect points.
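The deletion-based indices of this section (DFBETAS, DFFITS and COVRATIO) can be sketched directly from their definitions by refitting the model without each observation. The example below is a minimal sketch on synthetic data; the function name `deletion_diagnostics` and the returned dictionary keys are hypothetical, and COVRATIO is taken as the ratio of the determinants of $s^2 (X'X)^{-1}$ without and with observation i, which is the usual definition and an assumption here.

```python
import numpy as np

def deletion_diagnostics(X, y):
    """DFFITS, DFBETAS and COVRATIO from explicit leave-one-out refits."""
    n, p = X.shape
    k = p + 1                                        # number of coefficients
    Xd = np.column_stack([np.ones(n), X])            # design matrix with intercept
    xtx_inv = np.linalg.inv(Xd.T @ Xd)
    beta = xtx_inv @ Xd.T @ y
    e = y - Xd @ beta
    s2 = e @ e / (n - k)                             # s^2 with all observations
    h = np.einsum("ij,jk,ik->i", Xd, xtx_inv, Xd)    # leverages h_ii
    s2_del = np.empty(n)
    dffits = np.empty(n)
    dfbetas = np.empty((n, k))
    covratio = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i                     # drop observation i
        Xi, yi = Xd[keep], y[keep]
        xtx_inv_i = np.linalg.inv(Xi.T @ Xi)
        beta_i = xtx_inv_i @ Xi.T @ yi
        ei = yi - Xi @ beta_i
        s2_del[i] = ei @ ei / (n - 1 - k)            # s(-i)^2
        s_i = np.sqrt(s2_del[i])
        # change in the fitted value of observation i, scaled by s(-i) sqrt(h_ii)
        dffits[i] = Xd[i] @ (beta - beta_i) / (s_i * np.sqrt(h[i]))
        # change in each coefficient, scaled by its standard error using s(-i)
        dfbetas[i] = (beta - beta_i) / (s_i * np.sqrt(np.diag(xtx_inv)))
        # generalized variance without i divided by the one with all observations
        covratio[i] = np.linalg.det(s2_del[i] * xtx_inv_i) / np.linalg.det(s2 * xtx_inv)
    return {"h": h, "e": e, "s2": s2, "s2_del": s2_del,
            "dffits": dffits, "dfbetas": dfbetas, "covratio": covratio}

# Synthetic data (assumption: purely illustrative)
rng = np.random.default_rng(2)
X_demo = rng.normal(size=(30, 2))
y_demo = 1.0 + X_demo @ np.array([2.0, -1.0]) + rng.normal(scale=0.5, size=30)

n, p = X_demo.shape
out = deletion_diagnostics(X_demo, y_demo)
flag_dffits = np.abs(out["dffits"]) > 2 * np.sqrt((p + 1) / n)
flag_dfbetas = (np.abs(out["dfbetas"]) > 2 / np.sqrt(n)).any(axis=1)
flag_covratio = np.abs(out["covratio"] - 1) > 3 * (p + 1) / n
print(np.flatnonzero(flag_dffits), np.flatnonzero(flag_dfbetas), np.flatnonzero(flag_covratio))
```

Refitting n times keeps the code close to the definitions; the same quantities can be obtained without refits from the residuals and leverages of the full fit.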

4 Relationships between the distances

The purpose of this section is to explain the relationships among the different distances.

Leverage and Cook's distance

The leverage measures, horizontally, the difference between the observed point and the mean $\bar{X}$ of the explanatory variable. It depends only on the values of that variable. Cook's distance, in contrast, measures the overall importance of the horizontal and vertical gaps together. In general, a point can have a large residual without being very influential if its leverage is not very high. Similarly, a point may have a large leverage without being particularly influential if its residual is small. A point is therefore influential in the sense of Cook's distance only if both its residual and its leverage are large.

DFFITS and Cook's distance

Cook's distance and DFFITS both depend on the leverage, and Cook's distance can be written as a function of the leverage and the studentized residual, and even of DFFITS. This shows that the observations with high leverage tend to have the highest values of DCook and DFFITS and therefore have a great influence on the predictions of the model.

COVRATIO, DFFITS and DFBETAS

Comparing the detection rule of COVRATIO with those of DFFITS and DFBETAS, we note that all three cut-offs depend on the sample size n; the DFBETAS cut-off $\frac{2}{\sqrt{n}}$ is the only one that does not involve p.

DCook and DFBETAS

If there are many variables, we first look at the globally influential observations (DCook) and then, for these observations, at the variable(s) causing that influence (DFBETAS).

5 Table summary

The following table summarizes the detection rules for abnormal points.

Index | Result | Purpose
--- | --- | ---
Standardized residuals | $|e_i'| > 2$: the ith residual is significantly different from 0 | Detection of a large residual, thus of an atypical observation
Studentized residuals | $|e_i^*| > 2$: the ith observation requires an investigation | Detection of a large residual, thus of an atypical observation
Leverage point | $h_{ii} > \frac{2(p+1)}{n}$ | Measure the influence of observation i on the estimators
Cook's distance | CookD > 1 indicates an abnormal effect | Measure the effect of the removal of observation i on the predicted values
DFBETAS | $|\mathrm{DFBETAS}| > \frac{2}{\sqrt{n}}$ | Measure the influence of a point on the estimated coefficient
COVRATIO | $|\mathrm{COVRATIO} - 1| > \frac{3(p+1)}{n}$ | Measure the effect of the ith observation on the accuracy
DFFITS | $|\mathrm{DFFITS}| > 2\sqrt{\frac{p+1}{n}}$ | Measure the influence of observation i on the prediction of its own value

6 Practical Application

To illustrate our study, we use a real sample of 100 students of the Faculty of Science of the Lebanese University, taking as variables the average scores in the Master's and in the first, second and third years. In order to detect the outliers in this example, we first perform a regression that explains the Master's marks by two explanatory variables, namely the grades in the first year and those in the second and third years together; as a second step, we determine the critical regions of the indices cited in the study:

[Table: data for the 100 students. Columns: Student number; Major; Master's average (AM); Average of second and third years (A23); Average of first year (A1). The numeric entries did not survive transcription; the majors listed include biology, biochemistry, molecular chemistry, electronics, mathematics and environmental sciences.]
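The critical values of the indices depend only on the number of observations n and the number of explanatory variables p, so they can be worked out before any diagnostic is computed. The short sketch below evaluates the cut-offs from the summary table for n = 100 and p = 2, the dimensions of this example; the function name `critical_values` and the dictionary keys are hypothetical.

```python
import math

def critical_values(n, p):
    """Rule-of-thumb cut-offs for the indices, for n observations and p variables."""
    return {
        "residual": 2.0,                             # |standardized or studentized residual|
        "leverage": 2 * (p + 1) / n,                 # h_ii
        "cook": 4 / (n - p - 1),                     # Cook's distance
        "dfbetas": 2 / math.sqrt(n),                 # |DFBETAS|
        "dffits": 2 * math.sqrt((p + 1) / n),        # |DFFITS|
        "covratio": 3 * (p + 1) / n,                 # |COVRATIO - 1|
    }

cv = critical_values(n=100, p=2)
for name, value in cv.items():
    print(f"{name:9s} {value:.3f}")
```

With these dimensions the leverage cut-off is 0.06, the Cook's distance cut-off is about 0.041 and the DFBETAS cut-off is 0.2; the COVRATIO formula evaluates to 0.09.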

7 Interpretation

The proposed study has two variables (p = 2) and 100 observations. The fitted model has the form $AM = \hat{\beta}_0 + \hat{\beta}_1\,A23 + \hat{\beta}_2\,A1$ [numerical coefficients missing]. At the chosen threshold, using the SAS software, we obtained the following results.

Using the studentized residual and the standardized residual, each observation with a value greater than 2 is an abnormal finding. Examining the values of the residuals shows that observation 12, whose averages are (A1 = 57.67, A23 = 71.88, AM = 81.57), with a studentized residual of [value missing], is the first atypical value; observation 78 (A1 = 60.42, A23 = 71.80, AM = 64.37), with STUDENT RESIDUAL and RSTUDENT values of 2035 and [value missing], is the second atypical value.

Using the leverage point, each observation with a leverage greater than $\frac{2(p+1)}{n} = \frac{2(2+1)}{100} = 0.06$ is an unusual observation. The software gives us the diagonal values of the hat matrix. We take a few examples: observation 6 (A1 = 79.92, A23 = 79.26, AM = 81.18) has leverage h = 0.0611; observation 9 (A1 = 69.50, A23 = 79.20, AM = 80.40) has h = 0.0801; observation 48 (A1 = 82.67, A23 = 83.88, AM = 81.05) has h = 0.0966; and so on.

Using Cook's distance, each observation with a distance greater than $\frac{4}{n-p-1} = \frac{4}{100-2-1} = 0.041$ is an unusual observation. Examining the COOK column, we found that observation 56, whose averages are (A1 = 77.67, A23 = 77.71, AM = 60.62), with D = 0.056, observation 63 (A1 = 82.08, A23 = 83.12, AM = 87.97), with D = 0.080, and observation 84 (A1 = 67.75, A23 = 61.10, AM = 57.30), with D = 0.045, are the outliers.

The DFBETAS indicate that, for each variable, an observation with a value greater than $\frac{2}{\sqrt{n}} = \frac{2}{\sqrt{100}} = 0.2$ is an unusual value. For the variable A23, the unusual observations are 5 (A23 = 64.55), 7 (63.13), 12 (71.88), 63 (83.12), 66 (70.55), 84 (61.10), 87 (58.47) and 90 (71.22); for the variable A1, they are 5 (A1 = 67.00), 12 (57.67), 25 (50.00), 45 (78.25), 66 (50.50), 84 (67.75) and 90 (53.92).

Using COVRATIO, each observation with $|\mathrm{COVRATIO} - 1| > \frac{3(p+1)}{n}$ is an unusual observation; the source reports this bound as 0.07, although $\frac{3(2+1)}{100} = 0.09$. It follows that observations 6 (A1 = 79.92, A23 = 79.26, AM = 81.18), 8 (A1 = 76.50, A23 = 77.54, AM = 76.00), 9 (A1 = 69.50, A23 = 79.20, AM = 80.40), 12 (A1 = 57.67, A23 = 71.88, AM = 81.57), 19 (A1 = 84.67, A23 = 78.58, AM = 77.13) and 20 (A1 = 56.33, A23 = 74.92, AM = 77.62) are outliers.

Finally, we note that the outliers differ from one distance to another; this is due to the differences between the distances. In our project, comparing the outliers obtained using the different distances shows that DCook is best. The results given by DCOOK are as follows:

- for individual 56, the averages in the first three years (A1 = 77.67, A23 = 77.71) are higher than the Master's average (AM = 60.62);
- for individual 63, the averages in the first three years (A1 = 82.08, A23 = 83.12) are lower than the Master's average (AM = 87.97);
- for individual 84, the averages in the first three years (A1 = 67.75, A23 = 61.10) are higher than the Master's average (AM = 57.30);
- for individual 87, the averages in the first three years (A1 = 50.75, A23 = 58.47) are lower than the Master's average (AM = 68.42).

Thus, these results are consistent with our study: these points are atypical because, for them, the explanatory variables and the variable to be explained vary in opposite directions.

8 Conclusion

The observations contained in databases must always be validated, because the appearance of outliers is inevitable given the quality of the data processed and the various sources of error during acquisition. To ensure high-quality information, a search for outliers must be carried out before the databases are used. This article has helped us to differentiate an outlier from an influential point. We also studied several methods for the detection of outliers and showed that the index best suited to the detection of abnormal points is Cook's distance. Simulations on large files are the subject of current research.

Properties and Hypothesis Testing

Properties and Hypothesis Testing Chapter 3 Properties ad Hypothesis Testig 3.1 Types of data The regressio techiques developed i previous chapters ca be applied to three differet kids of data. 1. Cross-sectioal data. 2. Time series data.

More information

LINEAR REGRESSION ANALYSIS. MODULE IX Lecture Multicollinearity

LINEAR REGRESSION ANALYSIS. MODULE IX Lecture Multicollinearity LINEAR REGRESSION ANALYSIS MODULE IX Lecture - 9 Multicolliearity Dr Shalabh Departmet of Mathematics ad Statistics Idia Istitute of Techology Kapur Multicolliearity diagostics A importat questio that

More information

A statistical method to determine sample size to estimate characteristic value of soil parameters

A statistical method to determine sample size to estimate characteristic value of soil parameters A statistical method to determie sample size to estimate characteristic value of soil parameters Y. Hojo, B. Setiawa 2 ad M. Suzuki 3 Abstract Sample size is a importat factor to be cosidered i determiig

More information

Response Variable denoted by y it is the variable that is to be predicted measure of the outcome of an experiment also called the dependent variable

Response Variable denoted by y it is the variable that is to be predicted measure of the outcome of an experiment also called the dependent variable Statistics Chapter 4 Correlatio ad Regressio If we have two (or more) variables we are usually iterested i the relatioship betwee the variables. Associatio betwee Variables Two variables are associated

More information

ECON 3150/4150, Spring term Lecture 3

ECON 3150/4150, Spring term Lecture 3 Itroductio Fidig the best fit by regressio Residuals ad R-sq Regressio ad causality Summary ad ext step ECON 3150/4150, Sprig term 2014. Lecture 3 Ragar Nymoe Uiversity of Oslo 21 Jauary 2014 1 / 30 Itroductio

More information

1 Inferential Methods for Correlation and Regression Analysis

1 Inferential Methods for Correlation and Regression Analysis 1 Iferetial Methods for Correlatio ad Regressio Aalysis I the chapter o Correlatio ad Regressio Aalysis tools for describig bivariate cotiuous data were itroduced. The sample Pearso Correlatio Coefficiet

More information

Sampling Error. Chapter 6 Student Lecture Notes 6-1. Business Statistics: A Decision-Making Approach, 6e. Chapter Goals

Sampling Error. Chapter 6 Student Lecture Notes 6-1. Business Statistics: A Decision-Making Approach, 6e. Chapter Goals Chapter 6 Studet Lecture Notes 6-1 Busiess Statistics: A Decisio-Makig Approach 6 th Editio Chapter 6 Itroductio to Samplig Distributios Chap 6-1 Chapter Goals After completig this chapter, you should

More information

t distribution [34] : used to test a mean against an hypothesized value (H 0 : µ = µ 0 ) or the difference

t distribution [34] : used to test a mean against an hypothesized value (H 0 : µ = µ 0 ) or the difference EXST30 Backgroud material Page From the textbook The Statistical Sleuth Mea [0]: I your text the word mea deotes a populatio mea (µ) while the work average deotes a sample average ( ). Variace [0]: The

More information

Investigating the Significance of a Correlation Coefficient using Jackknife Estimates

Investigating the Significance of a Correlation Coefficient using Jackknife Estimates Iteratioal Joural of Scieces: Basic ad Applied Research (IJSBAR) ISSN 2307-4531 (Prit & Olie) http://gssrr.org/idex.php?joural=jouralofbasicadapplied ---------------------------------------------------------------------------------------------------------------------------

More information

62. Power series Definition 16. (Power series) Given a sequence {c n }, the series. c n x n = c 0 + c 1 x + c 2 x 2 + c 3 x 3 +

62. Power series Definition 16. (Power series) Given a sequence {c n }, the series. c n x n = c 0 + c 1 x + c 2 x 2 + c 3 x 3 + 62. Power series Defiitio 16. (Power series) Give a sequece {c }, the series c x = c 0 + c 1 x + c 2 x 2 + c 3 x 3 + is called a power series i the variable x. The umbers c are called the coefficiets of

More information

Chapter If n is odd, the median is the exact middle number If n is even, the median is the average of the two middle numbers

Chapter If n is odd, the median is the exact middle number If n is even, the median is the average of the two middle numbers Chapter 4 4-1 orth Seattle Commuity College BUS10 Busiess Statistics Chapter 4 Descriptive Statistics Summary Defiitios Cetral tedecy: The extet to which the data values group aroud a cetral value. Variatio:

More information

Lecture 15: Learning Theory: Concentration Inequalities

Lecture 15: Learning Theory: Concentration Inequalities STAT 425: Itroductio to Noparametric Statistics Witer 208 Lecture 5: Learig Theory: Cocetratio Iequalities Istructor: Ye-Chi Che 5. Itroductio Recall that i the lecture o classificatio, we have see that

More information

Algebra of Least Squares

Algebra of Least Squares October 19, 2018 Algebra of Least Squares Geometry of Least Squares Recall that out data is like a table [Y X] where Y collects observatios o the depedet variable Y ad X collects observatios o the k-dimesioal

More information

Final Examination Solutions 17/6/2010

Final Examination Solutions 17/6/2010 The Islamic Uiversity of Gaza Faculty of Commerce epartmet of Ecoomics ad Political Scieces A Itroductio to Statistics Course (ECOE 30) Sprig Semester 009-00 Fial Eamiatio Solutios 7/6/00 Name: I: Istructor:

More information

Statistical Fundamentals and Control Charts

Statistical Fundamentals and Control Charts Statistical Fudametals ad Cotrol Charts 1. Statistical Process Cotrol Basics Chace causes of variatio uavoidable causes of variatios Assigable causes of variatio large variatios related to machies, materials,

More information

Statistics 511 Additional Materials

Statistics 511 Additional Materials Cofidece Itervals o mu Statistics 511 Additioal Materials This topic officially moves us from probability to statistics. We begi to discuss makig ifereces about the populatio. Oe way to differetiate probability

More information

ANALYSIS OF EXPERIMENTAL ERRORS

ANALYSIS OF EXPERIMENTAL ERRORS ANALYSIS OF EXPERIMENTAL ERRORS All physical measuremets ecoutered i the verificatio of physics theories ad cocepts are subject to ucertaities that deped o the measurig istrumets used ad the coditios uder

More information

If, for instance, we were required to test whether the population mean μ could be equal to a certain value μ

If, for instance, we were required to test whether the population mean μ could be equal to a certain value μ STATISTICAL INFERENCE INTRODUCTION Statistical iferece is that brach of Statistics i which oe typically makes a statemet about a populatio based upo the results of a sample. I oesample testig, we essetially

More information

Chapter 2 Descriptive Statistics

Chapter 2 Descriptive Statistics Chapter 2 Descriptive Statistics Statistics Most commoly, statistics refers to umerical data. Statistics may also refer to the process of collectig, orgaizig, presetig, aalyzig ad iterpretig umerical data

More information

The target reliability and design working life

The target reliability and design working life Safety ad Security Egieerig IV 161 The target reliability ad desig workig life M. Holický Kloker Istitute, CTU i Prague, Czech Republic Abstract Desig workig life ad target reliability levels recommeded

More information

MATH 320: Probability and Statistics 9. Estimation and Testing of Parameters. Readings: Pruim, Chapter 4

MATH 320: Probability and Statistics 9. Estimation and Testing of Parameters. Readings: Pruim, Chapter 4 MATH 30: Probability ad Statistics 9. Estimatio ad Testig of Parameters Estimatio ad Testig of Parameters We have bee dealig situatios i which we have full kowledge of the distributio of a radom variable.

More information

Linear regression. Daniel Hsu (COMS 4771) (y i x T i β)2 2πσ. 2 2σ 2. 1 n. (x T i β y i ) 2. 1 ˆβ arg min. β R n d

Linear regression. Daniel Hsu (COMS 4771) (y i x T i β)2 2πσ. 2 2σ 2. 1 n. (x T i β y i ) 2. 1 ˆβ arg min. β R n d Liear regressio Daiel Hsu (COMS 477) Maximum likelihood estimatio Oe of the simplest liear regressio models is the followig: (X, Y ),..., (X, Y ), (X, Y ) are iid radom pairs takig values i R d R, ad Y

More information

Optimally Sparse SVMs

Optimally Sparse SVMs A. Proof of Lemma 3. We here prove a lower boud o the umber of support vectors to achieve geeralizatio bouds of the form which we cosider. Importatly, this result holds ot oly for liear classifiers, but

More information

PSYCHOLOGICAL RESEARCH (PYC 304-C) Lecture 9

PSYCHOLOGICAL RESEARCH (PYC 304-C) Lecture 9 Hypothesis testig PSYCHOLOGICAL RESEARCH (PYC 34-C Lecture 9 Statistical iferece is that brach of Statistics i which oe typically makes a statemet about a populatio based upo the results of a sample. I

More information

Random Variables, Sampling and Estimation

Random Variables, Sampling and Estimation Chapter 1 Radom Variables, Samplig ad Estimatio 1.1 Itroductio This chapter will cover the most importat basic statistical theory you eed i order to uderstad the ecoometric material that will be comig

More information

Stat 139 Homework 7 Solutions, Fall 2015

Stat 139 Homework 7 Solutions, Fall 2015 Stat 139 Homework 7 Solutios, Fall 2015 Problem 1. I class we leared that the classical simple liear regressio model assumes the followig distributio of resposes: Y i = β 0 + β 1 X i + ɛ i, i = 1,...,,

More information

Overview. p 2. Chapter 9. Pooled Estimate of. q = 1 p. Notation for Two Proportions. Inferences about Two Proportions. Assumptions

Overview. p 2. Chapter 9. Pooled Estimate of. q = 1 p. Notation for Two Proportions. Inferences about Two Proportions. Assumptions Chapter 9 Slide Ifereces from Two Samples 9- Overview 9- Ifereces about Two Proportios 9- Ifereces about Two Meas: Idepedet Samples 9-4 Ifereces about Matched Pairs 9-5 Comparig Variatio i Two Samples

More information

Linear Regression Demystified

Linear Regression Demystified Liear Regressio Demystified Liear regressio is a importat subject i statistics. I elemetary statistics courses, formulae related to liear regressio are ofte stated without derivatio. This ote iteds to

More information

Sample Size Determination (Two or More Samples)

Sample Size Determination (Two or More Samples) Sample Sie Determiatio (Two or More Samples) STATGRAPHICS Rev. 963 Summary... Data Iput... Aalysis Summary... 5 Power Curve... 5 Calculatios... 6 Summary This procedure determies a suitable sample sie

More information

10. Comparative Tests among Spatial Regression Models. Here we revisit the example in Section 8.1 of estimating the mean of a normal random

10. Comparative Tests among Spatial Regression Models. Here we revisit the example in Section 8.1 of estimating the mean of a normal random Part III. Areal Data Aalysis 0. Comparative Tests amog Spatial Regressio Models While the otio of relative likelihood values for differet models is somewhat difficult to iterpret directly (as metioed above),

More information

Expectation and Variance of a random variable

Expectation and Variance of a random variable Chapter 11 Expectatio ad Variace of a radom variable The aim of this lecture is to defie ad itroduce mathematical Expectatio ad variace of a fuctio of discrete & cotiuous radom variables ad the distributio

More information

This is an introductory course in Analysis of Variance and Design of Experiments.

This is an introductory course in Analysis of Variance and Design of Experiments. 1 Notes for M 384E, Wedesday, Jauary 21, 2009 (Please ote: I will ot pass out hard-copy class otes i future classes. If there are writte class otes, they will be posted o the web by the ight before class

More information
