Overdispersion study of poisson and zero-inflated poisson regression for some characteristics of the data on lamda, n, p

Size: px
Start display at page:

Download "Overdispersion study of poisson and zero-inflated poisson regression for some characteristics of the data on lamda, n, p"

Transcription

1 Iteratioal Joural of Advaces i Itelliget Iformatics ISSN: Overdispersio study of poisso ad zero-iflated poisso regressio for some characteristics of the data o lamda,, p Lili Puspita Rahayu a,1,*, Kusma Sadik b,2, Idahwati b,3 a Miistry of Educatio ad Culture of The Republic of Idoesia, Idoesia b Departmet of Statistics, Bogor Agriculture Uiversity, Idoesia 1 lilipuspitarahayu@gmail.com *; 2 kusmasadik@gmail.com; 3 idah.stk@gmail.com * correspodig author ARTICLE INFO AB ST R ACT Article history: Received November 19, 2016 Revised November 25, 2016 Accepted November 25, 2016 Keywords: Overdispersio Poisso Zero iflated Poisso Regressio Simulatio data Poisso distributio is oe of discrete distributio that is ofte used i modelig of rare evets. The data obtaied i form of couts with oegative itegers. Oe of aalysis that is used i modelig cout data is Poisso regressio. Deviatio of assumptio that ofte occurs i the Poisso regressio is overdispersio. Cause of overdispersio is a excess zero probability o the respose variable. Solvig model that be used to overcome of overdispersio is zero-iflated Poisso (ZIP) regressio. The research aimed to develop a study of overdispersio for Poisso ad ZIP regressio o some characteristics of the data. Overdispersio o some characteristics of the data that were studied i this research are simulated by combiig the parameter of Poisso distributio (), zero probability (p), ad sample size () o the respose variable the comparig the Poisso ad ZIP regressio models. Overdispersio study o data simulatio showed that the larger,, ad p, the better is the model of ZIP tha Poisso regressio. The results of this simulatio are also stregtheed by the exploratio of Pearso residual i Poisso ad ZIP regressio Copyright 2016 Iteratioal Joural of Advaces i Itelliget Iformatics. All rights reserved. I. Itroductio Poisso distributio is oe of the discrete distributio that is ofte used for modelig o rare occasios. The data obtaied i the form of couts with o-egative itegers. Oe form of aalysis used to model cout data is Poisso regressio. Poisso regressio aalysis showed a relatioship betwee the explaatory variables with the respose variable that spread Poisso. Poisso regressio has equidispersio assumptios, a coditio i which the mea ad variace of the respose variable equal value. Deviatio assumptios that ofte evets o Poisso regressio is overdispersio/ uderdispersio. Overdispersio is the variace greater tha the mea, while the value uderdispersio is the variace smaller tha mea value o respose variable. Applicatio about uderdispersio o Poisso regressio is rare evetrig, it is because there is o low variace value of the respose variable o real data [1]. Problem ofte ecoutered i Poisso regressio was overdispersio. This coditio is caused by the explaatory variable that ca t be explaied i the model, so it is possible the high variability of the respose variable caused by other variables. Cause of overdispersio that ofte evet i Poisso regressio is zero probability that excess o the respose variable. Oe result is the stadard deviatio of parameter estimate to be uderestimate ad the sigificace of the explaatory variables to be overstate, resultig ivalid coclusios [2]. The hadlig model ca be used to overcome overdipersio due to zero probability excess o the respose variable Poisso regressio such as models of hurdle Poisso regressio, zero-iflated Poisso (ZIP) regressio, ad Semiparametric hurdle Poisso regressio [3]. Hadlig model that will be used i this paper is the model of ZIP regressio, because it is more coveiet tha models of hurdle Poisso regressio ad Semiparametric hurdle Poisso. Superiority of ZIP regressio is very easily applied to several fields [4] such as agriculture, aimal husbadry, biostatistics, ad idustry. DOI: W : E : ifo@ijai.org

2 ISSN: Iteratioal Joural of Advaces i Itelliget Iformatics 141 I additio, estimate of the parameter o ZIP regressio model ca be iterpretatio easily, ad ca explai the reaso of the mea smaller i the respose variable. Research that has bee doe before, startig the ZIP regressio model developed as a solutio overdispersio hadlig of Poisso regressio models usig simulatio studies that Xie et. al. [5] usig a type II error i the simulatio with combiig the parameter of Poisso distributio, zero probability for sample size o the respose variable. Numa [6] developed a Wald test for compariso of Poisso ad ZIP regressio models. Wald test developmet performed simulatios with determiatio of zero probability o the respose variable based o the value parameter of the Poisso distributio. Developmet of the research that has bee doe previously, the researcher wated to study develops the overdispersio o some characteristics of the data. Overdispersio which will be examied i this study are simulated by combiig the value of the parameter of the Poisso distributio, zero probability, ad sample size o the respose variable. Furthermore, comparig Poisso ad ZIP regressio models based o exploratio of the respose variable, ad the evaluatio of the predictio models. Ay simulatio o characteristics data are expected to determie the cause of overdispersio o respose variable. II. Poissio Regressio Hardi ad Hilbe [7] stated that the Poisso regressio model provides a stadard framework for aalysis of data cout. Poisso regressio is a form of geeral liear model. Let y i, i=1,2,, represets cout of those rare occasios i the period with value of the parameter Poisso distributio lambda ( i). y i is a Poisso radom variable that spread by the mass fuctio probability of the followig f(y i ; i ) = e y i i i, y i = 0, 1, 2, y i! with assumig of Poisso regressio is E(Y i ) = Var(Y i ) = i If Poisso regressio is used for the coditio overdispersio, the the result is ot exact because the value of mea ad variace Poisso regressio cotai dispersio Var (Y) = τ E (Y) = τ with τ is ratio of dispersio. Whe there was overdispersio o Poisso regressio the value of τ is more tha oe ad costat. Dispersio is a measure of variat of a group of data to the mea data. Small dispersio values showed a homogeeous variety i the data, while the big dispersio values idicate heterogeeity i data. Method to estimate parameters of Poisso regressio coefficiets are maximum likelihood method. Suppose X is explaatory variable that size matrix x (p+1). Radom variables y i ad i is a row vector of X, will be liked with the log lik fuctio. l i i Xb i exp Xb The model i (5) is a Poisso regressio model with parameter b is the coefficiet estimate. III. Zero Iflated Poisso Regressio Jasakul ad Hide [1] state that if the Y i are idepedet radom variables that have a ZIP distributio, the the value of zero is assumed to arise from the same two steps. The first step

3 142 Iteratioal Joural of Advaces i Itelliget Iformatics ISSN: evets o probability that oly produces zero observatios deoted by pi. The secod step evets o the probability that result of data cout spread Poisso with parameter is deoted by (1- pi).i geeral, the zero value of the first step is called structural zeros, ad the zero value of the secod step is called samplig zeros. Variables Yi ~ ZIP i, pi have assumptios o the ZIP regressio 1 E Y p ad Var Y i i i i p 1 pi i 2 i i i variables Y i has overdispersio p i>0 if, the overdispersio will reduce to Poisso models whe p i=0. Value p i>0 explais that there is a icrease value of zero o the respose variable. Method to estimate parameters of ZIP regressio coefficiets is maximum likelihood method. Loglikelihood fuctio for observatios y 1,,y o ZIP regressio model is used to simplify the calculatio to get parameter estimate coefficiet. Maximizig of log-likelihood fuctio will give the same result as maximizig the likelihood fuctio. ZIP regressio models of divided ito two compoets, amely discrete data models for ad zero-iflatio models for p. If the explaatory variables used i l ad logit model o the same value, the the ZIP regressio model is l Xb p l Xg 1 p with X is the matrix of explaatory variables, while b ad g is the vector of parameter estimate of ZIP regressio coefficiets, each sized (q+1)x1 ad (r+1)x1 i equatio (8). Explaatory variables used i the model l be the same or differet from the explaatory variables used i the logit model. Maximum likelihood estimatio for b ad g are obtaied by usig the expectatio maximizatio (EM) algorithm which provides a simple way, so that it ca be applied to stadard software to match the geeral liear model. IV. Desig of Simulatio The data used i this study is the simulatio data. Simulatio data was geerated based o the characteristics of the data. Characteristics of the data i the form of lambda () startig from = 0.6, 0.8, 1, 6, 8, 10, ad 20, the zero probability (p) are p=0.1, 0.3, 0.5, ad 0.7, ad sample size () are =,, ad. The data geerated are useful to obtai parameter estimators of Poisso ad ZIP regressio. The coefficiet of regressio parameters were determied are 0=3, ad 1=0.01. Variables were determied to make the Poisso ad ZIP regressio models are explaatory variable (X), the respose variable (Y). Variable X cosists of variable X which is a ormal radom variable spreads (μ,1). Variable X is assumed as a fixed variable. Geeratig variables X ad Y o simulatio study carried out by stages, amely: 1. Geeratig variable Y based o the value of,, p have bee determied. 2. Geeratig variable X is with the first loop, are: Separate variables Y become Y zero ad Y ot zero. Trasform variable X with formula x i= (l (y i) - 0)/ 1, which is y i from variable Y ot zero.

4 ISSN: Iteratioal Joural of Advaces i Itelliget Iformatics 143 Iitialize the result of trasformatio variable X as X ot zero. 3. Geeratig variable X o variable Y ot zero ad Y zero with the secod loop are If the variable Y is zero, the the variable x i is gotte by samplig with replacemet o variable X ot zero. If the variable Y is ot zero, the the variables x i is geerated from Normal distributio with mea of trasformatio result from 2(ii) ad variace is 1 with sample size =1. Simulatio data o the variables X ad Y geerated by the software program R ver ad will be repeated r= replicatios. There are 84 simulatio coditios used i this study. The accuracy of parameter estimators i the Poisso ad ZIP regressio models ca be see from the relative absolute bias (RAB) ad relative root mea square error (RRMSE) [8]. Furthermore, the accuracy of the estimate y i Poisso ad ZIP regressio models ca be see from Pearso residual (PR) ad sum of absolute Pearso residual (SAPR) [9]. The equatio of value RAB, RRMSE, PR, ad SAPR are defied respectively i (9), (10), (11), ad (12). - RAB = i i=1 x% (9) θ )2 RRMSE = ( i i=1 % (10) PR i = j=1 (y ij y ij ) Var(Y) the SAPR = i=1 PR i (12) where r is the umber of simulatio replicatios, i is the parameter estimators of Poisso ad ZIP regressio i th, is the actual parameter. The, y ij is the respose variable o repeat to-i ad observatios to- j, ad y ij is the estimate y of the Poisso ad ZIP regressio model o repeat to-i ad observatios to- j, ad Var (Y) is the estimate variace of Poisso ad ZIP regressio. The, the smaller values of RAB, RRMSE, ad SAPR o regressio model ca be said to be gettig better. (11) V. Results Simulatio study cosisted of 84 cases of simulatio which is a characteristic of data from the combiatio of,, ad p. Simulatios were performed to evaluate the results of estimatig the parameters of the Poisso ad ZIP regressio usig percetage of ARB, RRMSE, ad average of SAPR. The value obtaied from the simulatio was repeated times. The evaluatio results of the simulatio data to be clarified with the results of exploratio ad testig of the variable Y. A. Exploratio ad testig at the variable Y Characteristics of data simulatio agaits,, ad p is tested idicates that the rise of the value of p effect o. Value of 0.6 with p that is tested 0.3, the the variables Y produces rage p of 0.3 to 0.5. This is because the that has small value still has p from Poisso distributio relatively large. This statemet ca be explaied by the cumulative Poisso distributio table. Value is tested to 0.6, 0.8, 1, 6, ad 8 still have zero probability of a Poisso distributio, while for the other is tested, amely 10 ad 20 already have ot zero probability of a Poisso distributio. Exploratio variable Y to,, ad p is tested idicatio excess zero probability, so that the ecessary tests o variables Y. The test is i the form of scores test ad chi-square test were able to geeralize the geeral coclusios o the results of exploratio at variables Y. The coditio of excess zero probability at the variable Y due to overdispersio. Fly ad Fracis [10] state that whe the score test results value of zero excess o a variable, the chaces are the variables do ot spread Poisso distributio, but has ZIP distributio. The score test results with α of 0.05 o simulatio data at variable Y agaist a combiatio of,, p are show i Table 1. Scores test idicate that the larger,, ad p is tested, the the larger percetage of excess zero at the variable Y.

5 144 Iteratioal Joural of Advaces i Itelliget Iformatics ISSN: Table 1. Percetage of score test o combiatio of,, p (%) Furthermore, the results of chi-square test with α of 0.05 for the Poisso ad ZIP distributio agaist the combiatio of,, p are show i Table 2. The chi-square test for Poisso distributio shows that the larger,, ad p is tested, the percetage Poisso distributio will be smaller at the variable Y. The results of the score test ad the chi-square test for Poisso distributio is iversely proportioal to the larger,, ad p is tested. Chi-square test for ZIP distributio idicates that ZIP regressio able to overcome overdispersio due to excess p at the variable Y. This coditio is idicated by the larger value of, the the percetage Poisso distributio reaches 0%, while the percetage of ZIP distributio i the rage of 60% to 80%. Table 2. Percetage of chi-square test o combiatio of,, p (%) Poisso ZIP Poisso ZIP Poisso ZIP Poisso ZIP

6 ISSN: Iteratioal Joural of Advaces i Itelliget Iformatics 145 B. The testig overdipserio o Poisso ad ZIP regressio Results of exploratio ad testig of the simulatio variables Y is a step that must be checked before performig Poisso regressio aalysis. Whe the variable Y has the characteristics of the data that led to overdispersio due to excessi zero value, the ZIP regressio became oe of the completio of the Poisso regressio. Overdipersio coditios o ay combiatio of,, ad p are tested i Poisso ad ZIP regressio ca be traced from the dispersio ratio (τ) ad the Pearso chi-square test at 5% sigificace level. Ratio of τ shows the value of a statistical result Pearso chi-square test of the degrees of freedom (-k). The value of degrees of freedom Poisso ad ZIP regressio is differet, because the Poisso regressio usig k=2, are the parameter estimate b 0 ad b 1. ZIP regressio usig k=4 is based o discrete models for ad zero-iflatio modesl for p are g 0 ad g 1, the l 0 ad l 1. The average ratio of τ obtaied from times of repetitio of the combiatio of,, ad p is tested i Table 3. The overdispersio idicated by the ratio τ is greater tha oe. The ratio τ o Poisso regressio will be compared with ZIP regressio. The ratio τ most at risk are at p=0.7 i Poisso regressio with the value is the larger. The ratio τ of the Poisso regressio showed that the larger ad p, the the ratio τ more tha oe i every is tested. Poisso regressio suffered overdispersio by the larger, p, ad is tested. The ratio τ of ZIP regressio has a value of less tha oe i every, p, ad is tested, so ZIP regressio able to overcome overdispersio caused excess zero probability at the variable Y. Table 3. Dispersio ratio o Poisso ad ZIP regressio Poisso ZIP Poisso ZIP Poisso ZIP Poisso ZIP * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * Overdispersio The percetage of overdispersio have the similar result with the ratio of τ. Pearso chi-square test that sigificat shows overdispersio. Value of overdispersio cotaied i Table 4 show that i every, p, ad is tested to get overdispersio value reachig 0% for ZIP regressio. These results idicate that ZIP regressio is better to hadle overdispersio caused excess zero probability o the variable Y. The percetage of overdispersio o Poisso regressio showed that the larger ad p, the the greater overdispersio i each is tested that idicated by the value reached %. These results are cosistet with the score test ad the chi-square test that idicates the larger,, ad p is tested, the the greater zero probability that appears ad the less spread Poisso at the variable Y.

7 146 Iteratioal Joural of Advaces i Itelliget Iformatics ISSN: Table 4. Percetage of overdispersio o Poisso ad ZIP regressio Poisso ZIP Poisso ZIP Poisso ZIP Poisso ZIP * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * 0.0 * Overdipersio C. Evaluatio of the estimatio o Poisso ad ZIP regressio model The goodess of fit o Poisso ad ZIP regressio model showed that the larger,, ad p is tested, the ZIP regressio is better tha Poisso regressio. Furthermore, the compariso of Poisso ad ZIP regressio models is take based o the evaluatio parameter estimate ad y. The average of ARB for each combiatio of,, ad p are tested o estimate θ 1 i Table 5. The average of ARB o ZIP regressio i Table 5 produces a value is smaller as elargemet,, ad p is tested agaist estimate θ 1. The value of ARB o Poisso regressio has a miimum value that is cotaied i = 6. The average of ARB o estimate θ 1 idicates that Poisso regressio is better tha ZIP regressio o the value of are 0.6, 0.8, ad 1 i each of p ad is tested. Value of are 6, 8, 10, ad 20 i each of the p ad is tested showed that ZIP regressio is better tha Poisso regressio. Table 5. Average of ARB agaits estimate θ 1 o Poisso ad ZIP regressio Poisso ZIP Poisso ZIP Poisso ZIP Poisso ZIP * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * Average of ARB is smaller The result of RRMSE is similar with the ARB o estimate θ 1 i Table 6. MSE cotais two compoets, amely variace estimate (accuracy) ad the bias (accuracy) [11]. The estimatio with

8 ISSN: Iteratioal Joural of Advaces i Itelliget Iformatics 147 good character of MSE is that ca cotrols the variace ad bias. The big value of RRMSE show the big variace estimate, so the risk to the estimatio results, the accuracy estimatio is lower. Table 6. Average of RRMSE agaits estimate θ 1 o Poisso ad ZIP regressio Poisso ZIP Poisso ZIP Poisso ZIP Poisso ZIP * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * Average of RRMSE is smaller The average of SAPR o the estimate y is gotte by Poisso ad ZIP regressio models. The estimate y i ZIP regressio model usig two models are discrete model for ad zero-iflatio model for p. I Table 7 shows that the larger,, ad p is tested, the the larger the value of SAPR. ZIP Regressio have the average of SAPR that is smaller tha Poisso regressio i every,, ad p is tested. Table 7. Average of SAPR agaits estimate y o Poisso ad ZIP regressio Poisso ZIP Poisso ZIP Poisso ZIP Poisso ZIP * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * Average of SAPR is smaller

9 148 Iteratioal Joural of Advaces i Itelliget Iformatics ISSN: A good estimator should have a small bias ad variace. Poisso ad ZIP regressio compared based o the ability to cotrol bias ad variace estimate to get the high accuracy estimatio. Bias ad variace of Poisso ad ZIP regressio parameter estimate show o average of ARB ad RRMSE. The, to determie the predictability of Poisso ad ZIP regressio models show o average of SAPR o the value estimate y. Value of ARB ad RRMSE o estimate θ 1 i Poisso ad ZIP regressio idicates that the larger,, ad p is tested, the ZIP regressio is better tha Poisso regressio. The average of SAPR estimate y from the Poisso ad ZIP regressio model showed that ZIP regressio is better tha Poisso regressio i every,, ad p is tested. ZIP regressio is better tha Poisso regressio i solvig the overdispersio due to the may of excess zero value at the variable Y. The estimatio evaluatio results of Poisso ad ZIP regressio model i accordace with the results of the testig overdispersio, exploratio ad testig of the variable Y. VI. Coclusio Overdispersio study of the simulatio data from combiatio of,, p that is tested idicates that the greater,, ad p, the the scores test get excess the zero probability are gettig bigger ad chisquared test get percetage of Poisso distributio is gettig smaller. The compariso showed that ZIP regressio is better tha Poisso regressio based o the percetage of overdispersio ad ratio of dispersio, the value of ARB ad RRMSE o estimate θ 1, ad the average of SAPR o estimate y, alog with the growig,, ad p that is tested. VII. Ope Problem The desig of simulatio performed i this study started from geerate variable Y ad the variable X is based o the determiatio of parameters, so there is the relatioship betwee the two variables. I a subsequet study to do differet phases of desig simulatio ad there is a residual effect. Refereces [1] N. Jasakul ad J. P. Hide, Score tests for zero-iflated Poisso models, Comput. Stat. Data Aal., vol. 40, o. 1, pp , [2] N. Ismail ad A. A. Jemai, Hadlig overdispersio with egative biomial ad geeralized Poisso regressio models, i Casualty Actuarial Society Forum, 2007, pp [3] M. Ridout, C. G. B. Demétrio, ad J. Hide, Models for cout data with may zeros, i Proceedigs of the XIXth iteratioal biometric coferece, 1998, vol. 19, pp [4] D. Lambert, Zero-iflated Poisso regressio, with a applicatio to defects i maufacturig, Techometrics, vol. 34, o. 1, pp. 1 14, [5] M. Xie, B. He, ad T. N. Goh, Zero-iflated Poisso model i statistical process cotrol, Comput. Stat. Data Aal., vol. 38, o. 2, pp , [6] S. Numa, Aalysis of extra zero couts usig zero-iflated Poisso models, Price of Sogkla Uiversity, [7] J. W. Hardi, J. M. Hilbe, ad J. Hilbe, Geeralized liear models ad extesios. Stata press, [8] R. Savic ad M. Lavielle, Performace i populatio models for cout data, part II: a ew SAEM algorithm, J. Pharmacokiet. Pharmacody., vol. 36, o. 4, pp , [9] A. C. Camero ad P. K. Trivedi, Regressio aalysis of cout data, vol. 53. Cambridge uiversity press, [10] M. Fly ad L. A. Fracis, More flexible GLMs zero-iflated models ad hybrid models, Casualty Actuar. Soc, vol. 2009, pp , [11] G. Casella ad R. L. Berger, Statistical iferece, vol. 2. Duxbury Pacific Grove, CA, 2002.

Describing the Relation between Two Variables

Describing the Relation between Two Variables Copyright 010 Pearso Educatio, Ic. Tables ad Formulas for Sulliva, Statistics: Iformed Decisios Usig Data 010 Pearso Educatio, Ic Chapter Orgaizig ad Summarizig Data Relative frequecy = frequecy sum of

More information

Properties and Hypothesis Testing

Properties and Hypothesis Testing Chapter 3 Properties ad Hypothesis Testig 3.1 Types of data The regressio techiques developed i previous chapters ca be applied to three differet kids of data. 1. Cross-sectioal data. 2. Time series data.

More information

Chapter 13, Part A Analysis of Variance and Experimental Design

Chapter 13, Part A Analysis of Variance and Experimental Design Slides Prepared by JOHN S. LOUCKS St. Edward s Uiversity Slide 1 Chapter 13, Part A Aalysis of Variace ad Eperimetal Desig Itroductio to Aalysis of Variace Aalysis of Variace: Testig for the Equality of

More information

The Sampling Distribution of the Maximum. Likelihood Estimators for the Parameters of. Beta-Binomial Distribution

The Sampling Distribution of the Maximum. Likelihood Estimators for the Parameters of. Beta-Binomial Distribution Iteratioal Mathematical Forum, Vol. 8, 2013, o. 26, 1263-1277 HIKARI Ltd, www.m-hikari.com http://d.doi.org/10.12988/imf.2013.3475 The Samplig Distributio of the Maimum Likelihood Estimators for the Parameters

More information

MOST PEOPLE WOULD RATHER LIVE WITH A PROBLEM THEY CAN'T SOLVE, THAN ACCEPT A SOLUTION THEY CAN'T UNDERSTAND.

MOST PEOPLE WOULD RATHER LIVE WITH A PROBLEM THEY CAN'T SOLVE, THAN ACCEPT A SOLUTION THEY CAN'T UNDERSTAND. XI-1 (1074) MOST PEOPLE WOULD RATHER LIVE WITH A PROBLEM THEY CAN'T SOLVE, THAN ACCEPT A SOLUTION THEY CAN'T UNDERSTAND. R. E. D. WOOLSEY AND H. S. SWANSON XI-2 (1075) STATISTICAL DECISION MAKING Advaced

More information

1 Inferential Methods for Correlation and Regression Analysis

1 Inferential Methods for Correlation and Regression Analysis 1 Iferetial Methods for Correlatio ad Regressio Aalysis I the chapter o Correlatio ad Regressio Aalysis tools for describig bivariate cotiuous data were itroduced. The sample Pearso Correlatio Coefficiet

More information

Chapter 22. Comparing Two Proportions. Copyright 2010, 2007, 2004 Pearson Education, Inc.

Chapter 22. Comparing Two Proportions. Copyright 2010, 2007, 2004 Pearson Education, Inc. Chapter 22 Comparig Two Proportios Copyright 2010, 2007, 2004 Pearso Educatio, Ic. Comparig Two Proportios Read the first two paragraphs of pg 504. Comparisos betwee two percetages are much more commo

More information

Investigating the Significance of a Correlation Coefficient using Jackknife Estimates

Investigating the Significance of a Correlation Coefficient using Jackknife Estimates Iteratioal Joural of Scieces: Basic ad Applied Research (IJSBAR) ISSN 2307-4531 (Prit & Olie) http://gssrr.org/idex.php?joural=jouralofbasicadapplied ---------------------------------------------------------------------------------------------------------------------------

More information

Chapter 22. Comparing Two Proportions. Copyright 2010 Pearson Education, Inc.

Chapter 22. Comparing Two Proportions. Copyright 2010 Pearson Education, Inc. Chapter 22 Comparig Two Proportios Copyright 2010 Pearso Educatio, Ic. Comparig Two Proportios Comparisos betwee two percetages are much more commo tha questios about isolated percetages. Ad they are more

More information

Goodness-Of-Fit For The Generalized Exponential Distribution. Abstract

Goodness-Of-Fit For The Generalized Exponential Distribution. Abstract Goodess-Of-Fit For The Geeralized Expoetial Distributio By Amal S. Hassa stitute of Statistical Studies & Research Cairo Uiversity Abstract Recetly a ew distributio called geeralized expoetial or expoetiated

More information

Lecture 7: Properties of Random Samples

Lecture 7: Properties of Random Samples Lecture 7: Properties of Radom Samples 1 Cotiued From Last Class Theorem 1.1. Let X 1, X,...X be a radom sample from a populatio with mea µ ad variace σ

More information

MATH 320: Probability and Statistics 9. Estimation and Testing of Parameters. Readings: Pruim, Chapter 4

MATH 320: Probability and Statistics 9. Estimation and Testing of Parameters. Readings: Pruim, Chapter 4 MATH 30: Probability ad Statistics 9. Estimatio ad Testig of Parameters Estimatio ad Testig of Parameters We have bee dealig situatios i which we have full kowledge of the distributio of a radom variable.

More information

STA Learning Objectives. Population Proportions. Module 10 Comparing Two Proportions. Upon completing this module, you should be able to:

STA Learning Objectives. Population Proportions. Module 10 Comparing Two Proportions. Upon completing this module, you should be able to: STA 2023 Module 10 Comparig Two Proportios Learig Objectives Upo completig this module, you should be able to: 1. Perform large-sample ifereces (hypothesis test ad cofidece itervals) to compare two populatio

More information

Goodness-of-Fit Tests and Categorical Data Analysis (Devore Chapter Fourteen)

Goodness-of-Fit Tests and Categorical Data Analysis (Devore Chapter Fourteen) Goodess-of-Fit Tests ad Categorical Data Aalysis (Devore Chapter Fourtee) MATH-252-01: Probability ad Statistics II Sprig 2019 Cotets 1 Chi-Squared Tests with Kow Probabilities 1 1.1 Chi-Squared Testig................

More information

Sample Size Determination (Two or More Samples)

Sample Size Determination (Two or More Samples) Sample Sie Determiatio (Two or More Samples) STATGRAPHICS Rev. 963 Summary... Data Iput... Aalysis Summary... 5 Power Curve... 5 Calculatios... 6 Summary This procedure determies a suitable sample sie

More information

A statistical method to determine sample size to estimate characteristic value of soil parameters

A statistical method to determine sample size to estimate characteristic value of soil parameters A statistical method to determie sample size to estimate characteristic value of soil parameters Y. Hojo, B. Setiawa 2 ad M. Suzuki 3 Abstract Sample size is a importat factor to be cosidered i determiig

More information

CHAPTER 4 BIVARIATE DISTRIBUTION EXTENSION

CHAPTER 4 BIVARIATE DISTRIBUTION EXTENSION CHAPTER 4 BIVARIATE DISTRIBUTION EXTENSION 4. Itroductio Numerous bivariate discrete distributios have bee defied ad studied (see Mardia, 97 ad Kocherlakota ad Kocherlakota, 99) based o various methods

More information

A proposed discrete distribution for the statistical modeling of

A proposed discrete distribution for the statistical modeling of It. Statistical Ist.: Proc. 58th World Statistical Cogress, 0, Dubli (Sessio CPS047) p.5059 A proposed discrete distributio for the statistical modelig of Likert data Kidd, Marti Cetre for Statistical

More information

3/3/2014. CDS M Phil Econometrics. Types of Relationships. Types of Relationships. Types of Relationships. Vijayamohanan Pillai N.

3/3/2014. CDS M Phil Econometrics. Types of Relationships. Types of Relationships. Types of Relationships. Vijayamohanan Pillai N. 3/3/04 CDS M Phil Old Least Squares (OLS) Vijayamohaa Pillai N CDS M Phil Vijayamoha CDS M Phil Vijayamoha Types of Relatioships Oly oe idepedet variable, Relatioship betwee ad is Liear relatioships Curviliear

More information

Probability and statistics: basic terms

Probability and statistics: basic terms Probability ad statistics: basic terms M. Veeraraghava August 203 A radom variable is a rule that assigs a umerical value to each possible outcome of a experimet. Outcomes of a experimet form the sample

More information

Mathematical Modeling of Optimum 3 Step Stress Accelerated Life Testing for Generalized Pareto Distribution

Mathematical Modeling of Optimum 3 Step Stress Accelerated Life Testing for Generalized Pareto Distribution America Joural of Theoretical ad Applied Statistics 05; 4(: 6-69 Published olie May 8, 05 (http://www.sciecepublishiggroup.com/j/ajtas doi: 0.648/j.ajtas.05040. ISSN: 6-8999 (Prit; ISSN: 6-9006 (Olie Mathematical

More information

G. R. Pasha Department of Statistics Bahauddin Zakariya University Multan, Pakistan

G. R. Pasha Department of Statistics Bahauddin Zakariya University Multan, Pakistan Deviatio of the Variaces of Classical Estimators ad Negative Iteger Momet Estimator from Miimum Variace Boud with Referece to Maxwell Distributio G. R. Pasha Departmet of Statistics Bahauddi Zakariya Uiversity

More information

STA6938-Logistic Regression Model

STA6938-Logistic Regression Model Dr. Yig Zhag STA6938-Logistic Regressio Model Topic -Simple (Uivariate) Logistic Regressio Model Outlies:. Itroductio. A Example-Does the liear regressio model always work? 3. Maximum Likelihood Curve

More information

Bayesian and E- Bayesian Method of Estimation of Parameter of Rayleigh Distribution- A Bayesian Approach under Linex Loss Function

Bayesian and E- Bayesian Method of Estimation of Parameter of Rayleigh Distribution- A Bayesian Approach under Linex Loss Function Iteratioal Joural of Statistics ad Systems ISSN 973-2675 Volume 12, Number 4 (217), pp. 791-796 Research Idia Publicatios http://www.ripublicatio.com Bayesia ad E- Bayesia Method of Estimatio of Parameter

More information

Algebra of Least Squares

Algebra of Least Squares October 19, 2018 Algebra of Least Squares Geometry of Least Squares Recall that out data is like a table [Y X] where Y collects observatios o the depedet variable Y ad X collects observatios o the k-dimesioal

More information

Simulation. Two Rule For Inverting A Distribution Function

Simulation. Two Rule For Inverting A Distribution Function Simulatio Two Rule For Ivertig A Distributio Fuctio Rule 1. If F(x) = u is costat o a iterval [x 1, x 2 ), the the uiform value u is mapped oto x 2 through the iversio process. Rule 2. If there is a jump

More information

A Relationship Between the One-Way MANOVA Test Statistic and the Hotelling Lawley Trace Test Statistic

A Relationship Between the One-Way MANOVA Test Statistic and the Hotelling Lawley Trace Test Statistic http://ijspccseetorg Iteratioal Joural of Statistics ad Probability Vol 7, No 6; 2018 A Relatioship Betwee the Oe-Way MANOVA Test Statistic ad the Hotellig Lawley Trace Test Statistic Hasthika S Rupasighe

More information

Estimation of Population Mean Using Co-Efficient of Variation and Median of an Auxiliary Variable

Estimation of Population Mean Using Co-Efficient of Variation and Median of an Auxiliary Variable Iteratioal Joural of Probability ad Statistics 01, 1(4: 111-118 DOI: 10.593/j.ijps.010104.04 Estimatio of Populatio Mea Usig Co-Efficiet of Variatio ad Media of a Auxiliary Variable J. Subramai *, G. Kumarapadiya

More information

Since X n /n P p, we know that X n (n. Xn (n X n ) Using the asymptotic result above to obtain an approximation for fixed n, we obtain

Since X n /n P p, we know that X n (n. Xn (n X n ) Using the asymptotic result above to obtain an approximation for fixed n, we obtain Assigmet 9 Exercise 5.5 Let X biomial, p, where p 0, 1 is ukow. Obtai cofidece itervals for p i two differet ways: a Sice X / p d N0, p1 p], the variace of the limitig distributio depeds oly o p. Use the

More information

Access to the published version may require journal subscription. Published with permission from: Elsevier.

Access to the published version may require journal subscription. Published with permission from: Elsevier. This is a author produced versio of a paper published i Statistics ad Probability Letters. This paper has bee peer-reviewed, it does ot iclude the joural pagiatio. Citatio for the published paper: Forkma,

More information

t distribution [34] : used to test a mean against an hypothesized value (H 0 : µ = µ 0 ) or the difference

t distribution [34] : used to test a mean against an hypothesized value (H 0 : µ = µ 0 ) or the difference EXST30 Backgroud material Page From the textbook The Statistical Sleuth Mea [0]: I your text the word mea deotes a populatio mea (µ) while the work average deotes a sample average ( ). Variace [0]: The

More information

Linear Regression Models

Linear Regression Models Liear Regressio Models Dr. Joh Mellor-Crummey Departmet of Computer Sciece Rice Uiversity johmc@cs.rice.edu COMP 528 Lecture 9 15 February 2005 Goals for Today Uderstad how to Use scatter diagrams to ispect

More information

CEE 522 Autumn Uncertainty Concepts for Geotechnical Engineering

CEE 522 Autumn Uncertainty Concepts for Geotechnical Engineering CEE 5 Autum 005 Ucertaity Cocepts for Geotechical Egieerig Basic Termiology Set A set is a collectio of (mutually exclusive) objects or evets. The sample space is the (collectively exhaustive) collectio

More information

10. Comparative Tests among Spatial Regression Models. Here we revisit the example in Section 8.1 of estimating the mean of a normal random

10. Comparative Tests among Spatial Regression Models. Here we revisit the example in Section 8.1 of estimating the mean of a normal random Part III. Areal Data Aalysis 0. Comparative Tests amog Spatial Regressio Models While the otio of relative likelihood values for differet models is somewhat difficult to iterpret directly (as metioed above),

More information

PSYCHOLOGICAL RESEARCH (PYC 304-C) Lecture 9

PSYCHOLOGICAL RESEARCH (PYC 304-C) Lecture 9 Hypothesis testig PSYCHOLOGICAL RESEARCH (PYC 34-C Lecture 9 Statistical iferece is that brach of Statistics i which oe typically makes a statemet about a populatio based upo the results of a sample. I

More information

TABLES AND FORMULAS FOR MOORE Basic Practice of Statistics

TABLES AND FORMULAS FOR MOORE Basic Practice of Statistics TABLES AND FORMULAS FOR MOORE Basic Practice of Statistics Explorig Data: Distributios Look for overall patter (shape, ceter, spread) ad deviatios (outliers). Mea (use a calculator): x = x 1 + x 2 + +

More information

NCSS Statistical Software. Tolerance Intervals

NCSS Statistical Software. Tolerance Intervals Chapter 585 Itroductio This procedure calculates oe-, ad two-, sided tolerace itervals based o either a distributio-free (oparametric) method or a method based o a ormality assumptio (parametric). A two-sided

More information

11 Correlation and Regression

11 Correlation and Regression 11 Correlatio Regressio 11.1 Multivariate Data Ofte we look at data where several variables are recorded for the same idividuals or samplig uits. For example, at a coastal weather statio, we might record

More information

Lecture 6 Chi Square Distribution (χ 2 ) and Least Squares Fitting

Lecture 6 Chi Square Distribution (χ 2 ) and Least Squares Fitting Lecture 6 Chi Square Distributio (χ ) ad Least Squares Fittig Chi Square Distributio (χ ) Suppose: We have a set of measuremets {x 1, x, x }. We kow the true value of each x i (x t1, x t, x t ). We would

More information

Goodness-Of-Fit For The Generalized Exponential Distribution. Abstract

Goodness-Of-Fit For The Generalized Exponential Distribution. Abstract Goodess-Of-Fit For The Geeralized Expoetial Distributio By Amal S. Hassa stitute of Statistical Studies & Research Cairo Uiversity Abstract Recetly a ew distributio called geeralized expoetial or expoetiated

More information

Assessment and Modeling of Forests. FR 4218 Spring Assignment 1 Solutions

Assessment and Modeling of Forests. FR 4218 Spring Assignment 1 Solutions Assessmet ad Modelig of Forests FR 48 Sprig Assigmet Solutios. The first part of the questio asked that you calculate the average, stadard deviatio, coefficiet of variatio, ad 9% cofidece iterval of the

More information

1 Models for Matched Pairs

1 Models for Matched Pairs 1 Models for Matched Pairs Matched pairs occur whe we aalyse samples such that for each measuremet i oe of the samples there is a measuremet i the other sample that directly relates to the measuremet i

More information

Random Variables, Sampling and Estimation

Random Variables, Sampling and Estimation Chapter 1 Radom Variables, Samplig ad Estimatio 1.1 Itroductio This chapter will cover the most importat basic statistical theory you eed i order to uderstad the ecoometric material that will be comig

More information

Statistical Inference (Chapter 10) Statistical inference = learn about a population based on the information provided by a sample.

Statistical Inference (Chapter 10) Statistical inference = learn about a population based on the information provided by a sample. Statistical Iferece (Chapter 10) Statistical iferece = lear about a populatio based o the iformatio provided by a sample. Populatio: The set of all values of a radom variable X of iterest. Characterized

More information

TABLES AND FORMULAS FOR MOORE Basic Practice of Statistics

TABLES AND FORMULAS FOR MOORE Basic Practice of Statistics TABLES AND FORMULAS FOR MOORE Basic Practice of Statistics Explorig Data: Distributios Look for overall patter (shape, ceter, spread) ad deviatios (outliers). Mea (use a calculator): x = x 1 + x 2 + +

More information

Extreme Value Charts and Analysis of Means (ANOM) Based on the Log Logistic Distribution

Extreme Value Charts and Analysis of Means (ANOM) Based on the Log Logistic Distribution Joural of Moder Applied Statistical Methods Volume 11 Issue Article 0 11-1-01 Extreme Value Charts ad Aalysis of Meas (ANOM) Based o the Log Logistic istributio B. Sriivasa Rao R.V.R & J.C. College of

More information

Double Stage Shrinkage Estimator of Two Parameters. Generalized Exponential Distribution

Double Stage Shrinkage Estimator of Two Parameters. Generalized Exponential Distribution Iteratioal Mathematical Forum, Vol., 3, o. 3, 3-53 HIKARI Ltd, www.m-hikari.com http://dx.doi.org/.9/imf.3.335 Double Stage Shrikage Estimator of Two Parameters Geeralized Expoetial Distributio Alaa M.

More information

Lecture 6 Chi Square Distribution (χ 2 ) and Least Squares Fitting

Lecture 6 Chi Square Distribution (χ 2 ) and Least Squares Fitting Lecture 6 Chi Square Distributio (χ ) ad Least Squares Fittig Chi Square Distributio (χ ) Suppose: We have a set of measuremets {x 1, x, x }. We kow the true value of each x i (x t1, x t, x t ). We would

More information

Econ 325 Notes on Point Estimator and Confidence Interval 1 By Hiro Kasahara

Econ 325 Notes on Point Estimator and Confidence Interval 1 By Hiro Kasahara Poit Estimator Eco 325 Notes o Poit Estimator ad Cofidece Iterval 1 By Hiro Kasahara Parameter, Estimator, ad Estimate The ormal probability desity fuctio is fully characterized by two costats: populatio

More information

Geometry of LS. LECTURE 3 GEOMETRY OF LS, PROPERTIES OF σ 2, PARTITIONED REGRESSION, GOODNESS OF FIT

Geometry of LS. LECTURE 3 GEOMETRY OF LS, PROPERTIES OF σ 2, PARTITIONED REGRESSION, GOODNESS OF FIT OCTOBER 7, 2016 LECTURE 3 GEOMETRY OF LS, PROPERTIES OF σ 2, PARTITIONED REGRESSION, GOODNESS OF FIT Geometry of LS We ca thik of y ad the colums of X as members of the -dimesioal Euclidea space R Oe ca

More information

Topic 9: Sampling Distributions of Estimators

Topic 9: Sampling Distributions of Estimators Topic 9: Samplig Distributios of Estimators Course 003, 2016 Page 0 Samplig distributios of estimators Sice our estimators are statistics (particular fuctios of radom variables), their distributio ca be

More information

Topic 9: Sampling Distributions of Estimators

Topic 9: Sampling Distributions of Estimators Topic 9: Samplig Distributios of Estimators Course 003, 2018 Page 0 Samplig distributios of estimators Sice our estimators are statistics (particular fuctios of radom variables), their distributio ca be

More information

Overview. p 2. Chapter 9. Pooled Estimate of. q = 1 p. Notation for Two Proportions. Inferences about Two Proportions. Assumptions

Overview. p 2. Chapter 9. Pooled Estimate of. q = 1 p. Notation for Two Proportions. Inferences about Two Proportions. Assumptions Chapter 9 Slide Ifereces from Two Samples 9- Overview 9- Ifereces about Two Proportios 9- Ifereces about Two Meas: Idepedet Samples 9-4 Ifereces about Matched Pairs 9-5 Comparig Variatio i Two Samples

More information

ECONOMETRIC THEORY. MODULE XIII Lecture - 34 Asymptotic Theory and Stochastic Regressors

ECONOMETRIC THEORY. MODULE XIII Lecture - 34 Asymptotic Theory and Stochastic Regressors ECONOMETRIC THEORY MODULE XIII Lecture - 34 Asymptotic Theory ad Stochastic Regressors Dr. Shalabh Departmet of Mathematics ad Statistics Idia Istitute of Techology Kapur Asymptotic theory The asymptotic

More information

Important Formulas. Expectation: E (X) = Σ [X P(X)] = n p q σ = n p q. P(X) = n! X1! X 2! X 3! X k! p X. Chapter 6 The Normal Distribution.

Important Formulas. Expectation: E (X) = Σ [X P(X)] = n p q σ = n p q. P(X) = n! X1! X 2! X 3! X k! p X. Chapter 6 The Normal Distribution. Importat Formulas Chapter 3 Data Descriptio Mea for idividual data: X = _ ΣX Mea for grouped data: X= _ Σf X m Stadard deviatio for a sample: _ s = Σ(X _ X ) or s = 1 (Σ X ) (Σ X ) ( 1) Stadard deviatio

More information

(all terms are scalars).the minimization is clearer in sum notation:

(all terms are scalars).the minimization is clearer in sum notation: 7 Multiple liear regressio: with predictors) Depedet data set: y i i = 1, oe predictad, predictors x i,k i = 1,, k = 1, ' The forecast equatio is ŷ i = b + Use matrix otatio: k =1 b k x ik Y = y 1 y 1

More information

Lecture 3. Properties of Summary Statistics: Sampling Distribution

Lecture 3. Properties of Summary Statistics: Sampling Distribution Lecture 3 Properties of Summary Statistics: Samplig Distributio Mai Theme How ca we use math to justify that our umerical summaries from the sample are good summaries of the populatio? Lecture Summary

More information

If, for instance, we were required to test whether the population mean μ could be equal to a certain value μ

If, for instance, we were required to test whether the population mean μ could be equal to a certain value μ STATISTICAL INFERENCE INTRODUCTION Statistical iferece is that brach of Statistics i which oe typically makes a statemet about a populatio based upo the results of a sample. I oesample testig, we essetially

More information

Topic 9: Sampling Distributions of Estimators

Topic 9: Sampling Distributions of Estimators Topic 9: Samplig Distributios of Estimators Course 003, 2018 Page 0 Samplig distributios of estimators Sice our estimators are statistics (particular fuctios of radom variables), their distributio ca be

More information

Modified Ratio Estimators Using Known Median and Co-Efficent of Kurtosis

Modified Ratio Estimators Using Known Median and Co-Efficent of Kurtosis America Joural of Mathematics ad Statistics 01, (4): 95-100 DOI: 10.593/j.ajms.01004.05 Modified Ratio s Usig Kow Media ad Co-Efficet of Kurtosis J.Subramai *, G.Kumarapadiya Departmet of Statistics, Podicherry

More information

Correlation Regression

Correlation Regression Correlatio Regressio While correlatio methods measure the stregth of a liear relatioship betwee two variables, we might wish to go a little further: How much does oe variable chage for a give chage i aother

More information

Stat 139 Homework 7 Solutions, Fall 2015

Stat 139 Homework 7 Solutions, Fall 2015 Stat 139 Homework 7 Solutios, Fall 2015 Problem 1. I class we leared that the classical simple liear regressio model assumes the followig distributio of resposes: Y i = β 0 + β 1 X i + ɛ i, i = 1,...,,

More information

GG313 GEOLOGICAL DATA ANALYSIS

GG313 GEOLOGICAL DATA ANALYSIS GG313 GEOLOGICAL DATA ANALYSIS 1 Testig Hypothesis GG313 GEOLOGICAL DATA ANALYSIS LECTURE NOTES PAUL WESSEL SECTION TESTING OF HYPOTHESES Much of statistics is cocered with testig hypothesis agaist data

More information

1 Review of Probability & Statistics

1 Review of Probability & Statistics 1 Review of Probability & Statistics a. I a group of 000 people, it has bee reported that there are: 61 smokers 670 over 5 960 people who imbibe (drik alcohol) 86 smokers who imbibe 90 imbibers over 5

More information

MOMENT-METHOD ESTIMATION BASED ON CENSORED SAMPLE

MOMENT-METHOD ESTIMATION BASED ON CENSORED SAMPLE Vol. 8 o. Joural of Systems Sciece ad Complexity Apr., 5 MOMET-METHOD ESTIMATIO BASED O CESORED SAMPLE I Zhogxi Departmet of Mathematics, East Chia Uiversity of Sciece ad Techology, Shaghai 37, Chia. Email:

More information

Comparing Two Populations. Topic 15 - Two Sample Inference I. Comparing Two Means. Comparing Two Pop Means. Background Reading

Comparing Two Populations. Topic 15 - Two Sample Inference I. Comparing Two Means. Comparing Two Pop Means. Background Reading Topic 15 - Two Sample Iferece I STAT 511 Professor Bruce Craig Comparig Two Populatios Research ofte ivolves the compariso of two or more samples from differet populatios Graphical summaries provide visual

More information

Control Charts for Mean for Non-Normally Correlated Data

Control Charts for Mean for Non-Normally Correlated Data Joural of Moder Applied Statistical Methods Volume 16 Issue 1 Article 5 5-1-017 Cotrol Charts for Mea for No-Normally Correlated Data J. R. Sigh Vikram Uiversity, Ujjai, Idia Ab Latif Dar School of Studies

More information

Lesson 11: Simple Linear Regression

Lesson 11: Simple Linear Regression Lesso 11: Simple Liear Regressio Ka-fu WONG December 2, 2004 I previous lessos, we have covered maily about the estimatio of populatio mea (or expected value) ad its iferece. Sometimes we are iterested

More information

STATISTICAL PROPERTIES OF LEAST SQUARES ESTIMATORS. Comments:

STATISTICAL PROPERTIES OF LEAST SQUARES ESTIMATORS. Comments: Recall: STATISTICAL PROPERTIES OF LEAST SQUARES ESTIMATORS Commets:. So far we have estimates of the parameters! 0 ad!, but have o idea how good these estimates are. Assumptio: E(Y x)! 0 +! x (liear coditioal

More information

Statistics Lecture 27. Final review. Administrative Notes. Outline. Experiments. Sampling and Surveys. Administrative Notes

Statistics Lecture 27. Final review. Administrative Notes. Outline. Experiments. Sampling and Surveys. Administrative Notes Admiistrative Notes s - Lecture 7 Fial review Fial Exam is Tuesday, May 0th (3-5pm Covers Chapters -8 ad 0 i textbook Brig ID cards to fial! Allowed: Calculators, double-sided 8.5 x cheat sheet Exam Rooms:

More information

November 2002 Course 4 solutions

November 2002 Course 4 solutions November Course 4 solutios Questio # Aswer: B φ ρ = = 5. φ φ ρ = φ + =. φ Solvig simultaeously gives: φ = 8. φ = 6. Questio # Aswer: C g = [(.45)] = [5.4] = 5; h= 5.4 5 =.4. ˆ π =.6 x +.4 x =.6(36) +.4(4)

More information

Comparison of Minimum Initial Capital with Investment and Non-investment Discrete Time Surplus Processes

Comparison of Minimum Initial Capital with Investment and Non-investment Discrete Time Surplus Processes The 22 d Aual Meetig i Mathematics (AMM 207) Departmet of Mathematics, Faculty of Sciece Chiag Mai Uiversity, Chiag Mai, Thailad Compariso of Miimum Iitial Capital with Ivestmet ad -ivestmet Discrete Time

More information

Pearson Edexcel Level 3 Advanced Subsidiary and Advanced GCE in Statistics

Pearson Edexcel Level 3 Advanced Subsidiary and Advanced GCE in Statistics Pearso Edecel Level 3 Advaced Subsidiary ad Advaced GCE i Statistics Statistical formulae ad tables For first certificatio from Jue 018 for: Advaced Subsidiary GCE i Statistics (8ST0) For first certificatio

More information

S Y Y = ΣY 2 n. Using the above expressions, the correlation coefficient is. r = SXX S Y Y

S Y Y = ΣY 2 n. Using the above expressions, the correlation coefficient is. r = SXX S Y Y 1 Sociology 405/805 Revised February 4, 004 Summary of Formulae for Bivariate Regressio ad Correlatio Let X be a idepedet variable ad Y a depedet variable, with observatios for each of the values of these

More information

Estimation for Complete Data

Estimation for Complete Data Estimatio for Complete Data complete data: there is o loss of iformatio durig study. complete idividual complete data= grouped data A complete idividual data is the oe i which the complete iformatio of

More information

Section 14. Simple linear regression.

Section 14. Simple linear regression. Sectio 14 Simple liear regressio. Let us look at the cigarette dataset from [1] (available to dowload from joural s website) ad []. The cigarette dataset cotais measuremets of tar, icotie, weight ad carbo

More information

New Correlation for Calculating Critical Pressure of Petroleum Fractions

New Correlation for Calculating Critical Pressure of Petroleum Fractions IARJSET ISSN (Olie) 2393-8021 ISSN (Prit) 2394-1588 Iteratioal Advaced Research Joural i Sciece, Egieerig ad Techology New Correlatio for Calculatig Critical Pressure of Petroleum Fractios Sayed Gomaa,

More information

Stat 319 Theory of Statistics (2) Exercises

Stat 319 Theory of Statistics (2) Exercises Kig Saud Uiversity College of Sciece Statistics ad Operatios Research Departmet Stat 39 Theory of Statistics () Exercises Refereces:. Itroductio to Mathematical Statistics, Sixth Editio, by R. Hogg, J.

More information

Statistical Hypothesis Testing. STAT 536: Genetic Statistics. Statistical Hypothesis Testing - Terminology. Hardy-Weinberg Disequilibrium

Statistical Hypothesis Testing. STAT 536: Genetic Statistics. Statistical Hypothesis Testing - Terminology. Hardy-Weinberg Disequilibrium Statistical Hypothesis Testig STAT 536: Geetic Statistics Kari S. Dorma Departmet of Statistics Iowa State Uiversity September 7, 006 Idetify a hypothesis, a idea you wat to test for its applicability

More information

Section 9.2. Tests About a Population Proportion 12/17/2014. Carrying Out a Significance Test H A N T. Parameters & Hypothesis

Section 9.2. Tests About a Population Proportion 12/17/2014. Carrying Out a Significance Test H A N T. Parameters & Hypothesis Sectio 9.2 Tests About a Populatio Proportio P H A N T O M S Parameters Hypothesis Assess Coditios Name the Test Test Statistic (Calculate) Obtai P value Make a decisio State coclusio Sectio 9.2 Tests

More information

Modeling and Estimation of a Bivariate Pareto Distribution using the Principle of Maximum Entropy

Modeling and Estimation of a Bivariate Pareto Distribution using the Principle of Maximum Entropy Sri Laka Joural of Applied Statistics, Vol (5-3) Modelig ad Estimatio of a Bivariate Pareto Distributio usig the Priciple of Maximum Etropy Jagathath Krisha K.M. * Ecoomics Research Divisio, CSIR-Cetral

More information

R. van Zyl 1, A.J. van der Merwe 2. Quintiles International, University of the Free State

R. van Zyl 1, A.J. van der Merwe 2. Quintiles International, University of the Free State Bayesia Cotrol Charts for the Two-parameter Expoetial Distributio if the Locatio Parameter Ca Take o Ay Value Betwee Mius Iity ad Plus Iity R. va Zyl, A.J. va der Merwe 2 Quitiles Iteratioal, ruaavz@gmail.com

More information

Table 12.1: Contingency table. Feature b. 1 N 11 N 12 N 1b 2 N 21 N 22 N 2b. ... a N a1 N a2 N ab

Table 12.1: Contingency table. Feature b. 1 N 11 N 12 N 1b 2 N 21 N 22 N 2b. ... a N a1 N a2 N ab Sectio 12 Tests of idepedece ad homogeeity I this lecture we will cosider a situatio whe our observatios are classified by two differet features ad we would like to test if these features are idepedet

More information

Math 152. Rumbos Fall Solutions to Review Problems for Exam #2. Number of Heads Frequency

Math 152. Rumbos Fall Solutions to Review Problems for Exam #2. Number of Heads Frequency Math 152. Rumbos Fall 2009 1 Solutios to Review Problems for Exam #2 1. I the book Experimetatio ad Measuremet, by W. J. Youde ad published by the by the Natioal Sciece Teachers Associatio i 1962, the

More information

CS284A: Representations and Algorithms in Molecular Biology

CS284A: Representations and Algorithms in Molecular Biology CS284A: Represetatios ad Algorithms i Molecular Biology Scribe Notes o Lectures 3 & 4: Motif Discovery via Eumeratio & Motif Represetatio Usig Positio Weight Matrix Joshua Gervi Based o presetatios by

More information

Discrete Mathematics for CS Spring 2008 David Wagner Note 22

Discrete Mathematics for CS Spring 2008 David Wagner Note 22 CS 70 Discrete Mathematics for CS Sprig 2008 David Wager Note 22 I.I.D. Radom Variables Estimatig the bias of a coi Questio: We wat to estimate the proportio p of Democrats i the US populatio, by takig

More information

Simple Linear Regression

Simple Linear Regression Simple Liear Regressio 1. Model ad Parameter Estimatio (a) Suppose our data cosist of a collectio of pairs (x i, y i ), where x i is a observed value of variable X ad y i is the correspodig observatio

More information

Linear regression. Daniel Hsu (COMS 4771) (y i x T i β)2 2πσ. 2 2σ 2. 1 n. (x T i β y i ) 2. 1 ˆβ arg min. β R n d

Linear regression. Daniel Hsu (COMS 4771) (y i x T i β)2 2πσ. 2 2σ 2. 1 n. (x T i β y i ) 2. 1 ˆβ arg min. β R n d Liear regressio Daiel Hsu (COMS 477) Maximum likelihood estimatio Oe of the simplest liear regressio models is the followig: (X, Y ),..., (X, Y ), (X, Y ) are iid radom pairs takig values i R d R, ad Y

More information

IE 230 Probability & Statistics in Engineering I. Closed book and notes. No calculators. 120 minutes.

IE 230 Probability & Statistics in Engineering I. Closed book and notes. No calculators. 120 minutes. Closed book ad otes. No calculators. 120 miutes. Cover page, five pages of exam, ad tables for discrete ad cotiuous distributios. Score X i =1 X i / S X 2 i =1 (X i X ) 2 / ( 1) = [i =1 X i 2 X 2 ] / (

More information

Lecture 22: Review for Exam 2. 1 Basic Model Assumptions (without Gaussian Noise)

Lecture 22: Review for Exam 2. 1 Basic Model Assumptions (without Gaussian Noise) Lecture 22: Review for Exam 2 Basic Model Assumptios (without Gaussia Noise) We model oe cotiuous respose variable Y, as a liear fuctio of p umerical predictors, plus oise: Y = β 0 + β X +... β p X p +

More information

Regression, Inference, and Model Building

Regression, Inference, and Model Building Regressio, Iferece, ad Model Buildig Scatter Plots ad Correlatio Correlatio coefficiet, r -1 r 1 If r is positive, the the scatter plot has a positive slope ad variables are said to have a positive relatioship

More information

A Note on Box-Cox Quantile Regression Estimation of the Parameters of the Generalized Pareto Distribution

A Note on Box-Cox Quantile Regression Estimation of the Parameters of the Generalized Pareto Distribution A Note o Box-Cox Quatile Regressio Estimatio of the Parameters of the Geeralized Pareto Distributio JM va Zyl Abstract: Makig use of the quatile equatio, Box-Cox regressio ad Laplace distributed disturbaces,

More information

ADVANCED SOFTWARE ENGINEERING

ADVANCED SOFTWARE ENGINEERING ADVANCED SOFTWARE ENGINEERING COMP 3705 Exercise Usage-based Testig ad Reliability Versio 1.0-040406 Departmet of Computer Ssciece Sada Narayaappa, Aeliese Adrews Versio 1.1-050405 Departmet of Commuicatio

More information

DS 100: Principles and Techniques of Data Science Date: April 13, Discussion #10

DS 100: Principles and Techniques of Data Science Date: April 13, Discussion #10 DS 00: Priciples ad Techiques of Data Sciece Date: April 3, 208 Name: Hypothesis Testig Discussio #0. Defie these terms below as they relate to hypothesis testig. a) Data Geeratio Model: Solutio: A set

More information

Estimation of Gumbel Parameters under Ranked Set Sampling

Estimation of Gumbel Parameters under Ranked Set Sampling Joural of Moder Applied Statistical Methods Volume 13 Issue 2 Article 11-2014 Estimatio of Gumbel Parameters uder Raked Set Samplig Omar M. Yousef Al Balqa' Applied Uiversity, Zarqa, Jorda, abuyaza_o@yahoo.com

More information

NANYANG TECHNOLOGICAL UNIVERSITY SYLLABUS FOR ENTRANCE EXAMINATION FOR INTERNATIONAL STUDENTS AO-LEVEL MATHEMATICS

NANYANG TECHNOLOGICAL UNIVERSITY SYLLABUS FOR ENTRANCE EXAMINATION FOR INTERNATIONAL STUDENTS AO-LEVEL MATHEMATICS NANYANG TECHNOLOGICAL UNIVERSITY SYLLABUS FOR ENTRANCE EXAMINATION FOR INTERNATIONAL STUDENTS AO-LEVEL MATHEMATICS STRUCTURE OF EXAMINATION PAPER. There will be oe 2-hour paper cosistig of 4 questios.

More information

Final Review. Fall 2013 Prof. Yao Xie, H. Milton Stewart School of Industrial Systems & Engineering Georgia Tech

Final Review. Fall 2013 Prof. Yao Xie, H. Milton Stewart School of Industrial Systems & Engineering Georgia Tech Fial Review Fall 2013 Prof. Yao Xie, yao.xie@isye.gatech.edu H. Milto Stewart School of Idustrial Systems & Egieerig Georgia Tech 1 Radom samplig model radom samples populatio radom samples: x 1,..., x

More information

Parameter, Statistic and Random Samples

Parameter, Statistic and Random Samples Parameter, Statistic ad Radom Samples A parameter is a umber that describes the populatio. It is a fixed umber, but i practice we do ot kow its value. A statistic is a fuctio of the sample data, i.e.,

More information

A goodness-of-fit test based on the empirical characteristic function and a comparison of tests for normality

A goodness-of-fit test based on the empirical characteristic function and a comparison of tests for normality A goodess-of-fit test based o the empirical characteristic fuctio ad a compariso of tests for ormality J. Marti va Zyl Departmet of Mathematical Statistics ad Actuarial Sciece, Uiversity of the Free State,

More information

Maximum likelihood estimation from record-breaking data for the generalized Pareto distribution

Maximum likelihood estimation from record-breaking data for the generalized Pareto distribution METRON - Iteratioal Joural of Statistics 004, vol. LXII,. 3, pp. 377-389 NAGI S. ABD-EL-HAKIM KHALAF S. SULTAN Maximum likelihood estimatio from record-breakig data for the geeralized Pareto distributio

More information