ESTIMATION OF MISCLASSIFICATION ERROR USING BAYESIAN CLASSIFIERS
|
|
- Grant Singleton
- 6 years ago
- Views:
Transcription
1 Producto Systems ad Iformato Egeerg Volume 5 (2009), pp ESTIMATION OF MISCLASSIFICATION ERROR USING BAYESIAN CLASSIFIERS PÉTER BARABÁS Uversty of Msolc, Hugary Departmet of Iformato Techology barabas@t.u-msolc.hu LÁSZLÓ KOVÁCS Uversty of Msolc, Hugary Departmet of Iformato Techology ovacs@t.u-msolc.hu [Receved Jauary 2009 ad accepted Aprl 2009] Abstract. Bayesa classfers provde relatvely good performace compared wth other more complex algorthms. Msclassfcato rato s very low for traed samples, but the case of outlers the msclassfcato error may crease sgfcatly. The usage of summato hac method Bayesa classfcato algorthm ca reduce the msclassfcatos rate for utraed samples. The goal of ths paper s to aalyze the applcablty of summato hac Bayesa classfers geeral. Keywords: Bayesa classfer, summato hac, polyomal dstrbuto, msclassfcato error. Itroducto The Bayesa classfcato method s a geeratve statstcal classfer. Studes comparg classfcato algorthms have foud that the smple or Nave Bayesa classfer provdes relatvely good performace compared wth other more complex algorthms. Accuracy of classfcato s a very mportat property of a classfer, a measure of whch ca be separated to two parts: a measure of accuracy case of traed samples ad a measure of accuracy case of utraed samples. Nave Bayesa classfcato s geerally very accurate the frst case sce all testg samples are traed before ad have o outlers; the secod case the effcecy s worse due to outlers. I [], the role of outlers s examed classfcato methods, the Nave Bayesa classfcato s reactve to outlers, ad they ca cause msclassfcato. Usage of summato hac ca reduce the effect of outlers. The goal of our research s to aalyze the geeralzato capablty of Bayesa classfcato usg summato hac. I the secod part a short summary about Nave Bayesa classfcato s gve. I the thrd part the cocept of summato hac s troduced ad examed. I the fourth part the classfcato methods are
2 42 P. BARABÁS AND L. KOVÁCS aalyzed cosderg the msclassfcato error. Fally, the test results ad coclusos have bee summarzed the last secto. It s assumed that the obects to be classfed are descrbed by -dmesoal patter vectors x = (x,,x ) R. The dmesos correspod to the attrbutes of the obects. Every patter vector s assocated wth a class label c, where the total umber of classes s m. The class label c deotes that the obect belogs to the -th class. Thus, a classfer ca be regarded as a fucto g x ) : R { c,..., c }. (.) ( m The optmal classfcato fucto s amed at mmzg the msclassfcato rs [2]. The rs value R depeds o the probablty of the dfferet classes ad o the msclassfcato cost of the classes: R ( g( x ) x) = b( g( x) c ) c x), (.2) c where c x) deotes the codtoal probablty of c for the patter vector x ad b(c c ) deotes the cost value of decdg favor of c stead of the correct class c. The cost fucto b has usually the followg smplfed form: 0, f c = c b ( c c ) = (.3), f c c. Usg ths d of fucto b, the msclassfcato error value ca be gve by R( g( x ) x) = g x c ( ) x). (.4) The optmal classfcato fucto mmzes the value R(g(x) x). As thus f c c c x ) =, (.5) P ( g( x) x) max, (.6) the the R(g(x) x) has a mmal value. The decso rule whch mmzes the average rs s the Bayes rule whch assgs the x patter vector to the class that has the greatest probablty for x[3].
3 ESTIMATION OF MISCLASSIFICATION ERROR USING BAYES CLASSIFICATION Bayes classfcato A Bayesa classfer s based o Bayes theorem whch relates to the codtoal ad margal probabltes of two radom evets. Let A ad B deote evets. Codtoal probablty A B) s the probablty of evet A, gve the occurrece of evet B. Margal probablty s the ucodtoal probablty A) of evet A, regardless of whether evet B does or does ot occur. The smplfed verso of Bayesa theorem ca be wrtte for evet A ad B as follows: B A) A) P ( A B) =. (2.) B) If s the complemetary evet of A, called ot A. Let A, A 2, A 3, be a partto of the evet space. The geeral form of the theorem s gve as: B A ) A ) A B) =. B A ) A ) (2.2) Let C = {c } deote the set of classes. The observable propertes of the obects are descrbed by vector x. A obect wth propertes x has to be classfed to the class for whch the c x) probablty s maxmal. O the bass of Bayes theorem: x c ) c ) c x) =. (2.3) x) Sce x) s the same for all we have to maxmze oly the expresso x c )c ). The value c ) s gve a pror or ca be apprecated wth relatve frequeces from the samples. Accordg to the assumpto of Nave Bayes classfcato the attrbutes a gve class are codtoally depedet of every other attrbute. So the ot probablty model ca be expressed as,..., x ) = c ) x c ) = c, x. (2.4) Usg the above equato the probablty of class c for a obect featured by vector x s equal to
4 44 P. BARABÁS AND L. KOVÁCS c c ) x c = x) =. x) ) (2.5) For the case where c* x) s maxmal the correspodg class label [5] s: c * = c C { c )} = argmaxc C c) = x c) argmax x. (2.6) If a gve class ad feature ever occur together the trag set, the the relatve frequecy wll be zero. Thus, the total probablty s also set to zero. Oe of the smplest solutos of ths problem s to add to all occurreces of the gve attrbute. I case of a large umber of samples the dstorto of probabltes s margal ad the formato loss through the zero tag ca be elmated successfully. Ths techque s called Laplace estmato [4]. A more refed soluto s to add p stead of to the relatve frequeces, where p s the relatve frequecy of th attrbute value the global teachg set, ot oly the set belogg to class c. 3. Summato hac Outlers the classfcato ca dcate faulty data whch cause msclassfcato. The use of summato hac s a optoal method to reduce the msclassfcato error. Summato hac s a ad-hoc replacemet of a product by a sum a probablstc expresso []. Ths hac s usually explaed as a devce to cope wth outlers, wth o formal dervato. Ths ote shows that the hac does mae sese probablstcally, ad ca be best thought of as replacg a outler-sestve lelhood wth a outler-tolerat oe. Let us defe a vector x wth compoets x,x 2,,x ad a class c. I Bayes classfcato where the vector values are codtoally depedet: x c) = x c). (3.) = I ths case the probablty s sestve to outlers dvdual dmesos so f ay x c) value s equal to 0, the product wll be zero. Usg summato hac we get the followg:
5 ESTIMATION OF MISCLASSIFICATION ERROR USING BAYES CLASSIFICATION 45 x c) x c). (3.2) = I ths case the result wll be zero f ad oly f all p(x c) values are equal to 0. Usg (2.9) ad (3.2) the computg of wer class s based upo the followg formula: c * argmax x x c), (3.3) = = c C { c )} argmax c C c) Applyg summato hac the error of classfcato ca be reduced. I every equato above the frequecy probabltes are replaced wth ther approxmated values, where P e e ( ) = lm, (3.4) t ad t s the total umber of trals ad e s the umber of trals where evet e occurred. If the umber of test evets approaches fty, the relatve frequecy value wll coverge to the probablty value. I may classfcato tass; a small umber of samples s gve [6], the umber of tests s low, so a larger approxmato error wll arse the calculatos. We ca wrte the probablty as follows: x = v c x = x = v c ) + Δ, (3.5) where meas the error of approxmato. The cumulated classfcato error case of summato hac ca be computed by the summato of the error elemets. Ths error value dffers from the classfcato error for the product of probabltes as t s calculated by the followg form: = ( P ( x = v c ) + Δ ) ( x = v c )). (3.6) =
6 46 P. BARABÁS AND L. KOVÁCS 4. Aalyss of approxmato error The ma cause of msclassfcato s the error of the approxmated probablty values show formula (3.4). To calculate the error value, the followg model s appled. Let {c} be the set of classes, ad {a } the set of attrbutes where a attrbute may be of vector value. A test case s descrbed by a (a,c) par where c deotes the class related to the a attrbute. The uow probablty that a belogs to c s deoted by p. The relatve frequecy of the evet that a belogs to c s deoted by g. I the calculatos p are approxmated by g. The classfcato of the attrbute ca be regarded as a stochastc evet, where p, g ) deotes the probablty that g wll be used the calculatos stead of g. Let X(x ) be a -dmesoal stochastc varable, where x deotes the umber of attrbutes classfed as c. X has a polyomal dstrbuto: where N! 2 P ( x =, x2 = 2,..., x = ) = P P2... P, (4.)!!...! 2 N =, P =. (4.2) = = A gve g(, 2,, r ) frequecy value has dfferet P probabltes for the dfferet p(p,p 2,,p r ) probablty tuples. The p(p,p 2,,p r ) wth maxmal P value s assumed to be the real probablty value tuple. As the maxmum lelhood approxmato of the probablty s the frequecy value, the relatve frequeces are the best approxmatos of real probabltes: P. (4.3) N The probablty of other p vectors ca also be calculated wth ths formula. For the case = 2 the resultg P dstrbuto fucto s show Fg.. I the ext step, the approxmato error of product P s calculated. It s clear that the larger the dfferece betwee p ad g, the hgher the error value s. O the other had, the lower the dfferece betwee p ad g, the hgher probablty of ths par s. I the vestgato, the average error value s calculated the followg way:
7 ESTIMATION OF MISCLASSIFICATION ERROR USING BAYES CLASSIFICATION 47 ε ( g ) = P ( p, g) ε ( p, g), p (4.4) where ε deotes the error value, where ε(p,g): the error value of matchg p wth g, p,g): the probablty of matchg p wth g ad ε(g): the average error related to frequecy vector g., Fgure. Probablty fucto for the case =2 I the test case, the error formula for p ad the mea value of error ca be computed as follows: ε ( p, g) = p ( p ) ( ), (4.5) N N Fg. 2 shows the error fucto for the test bomal case. The umber of attempts s 00 where the umber of attrbutes belogg to class c s 30. I the Fgure, ca be see the mmum error s case of p=0.3. Sce the fucto s symmetrc, aother mmum pot ca be foud at p=0.7.
8 48 P. BARABÁS AND L. KOVÁCS, Fgure 2. Error fucto for the case p=0.3 I Nave Bayesa classfer the accuracy depeds strogly o the umber of attempts. The larger the test pool, the better the accuracy s. I Fg. 3 the mea value error fucto ca be see for dfferet N values. The results show that for a small umber of N values the use of summato hac ca mprove the accuracy but for a larger test pool the Nave Bayesa classfer s the domat oe. 5. Test results I frst tests [7] the referece pots were geerated wth uform dstrbuto space. The wer was the Nave Bayesa classfer teachg ad testg phase equally. The teachg accuracy had values from 80% to 00% depedg o evromet parameters. Usg summato hac ths accuracy decreased by about 0%. The testg accuracy s far lower, t s betwee 40 ad 70 percet case of Nave Bayesa classfer ad lower usg summato hac. The relatvely large rage of result values ca be explaed by the overtrag of the model whch ca be cotrolled by the correct choce of evromet parameters.
9 ESTIMATION OF MISCLASSIFICATION ERROR USING BAYES CLASSIFICATION 49 I later tests the referece pots were geerated sparsely, so the space has a small rego wth a relatvely large umber of referece pots ad outsde ths rego there are oly a few referece pots., Fgure 3. Mea value error fucto for dfferet (, N) values I the case of ths dstrbuto, the accuracy of classfers has chaged. The teachg accuracy of Nave Bayesa classfer remaed hghly smlar to other cases ad the use of summato hac brought up the accuracy to the Nave Bayesa. I the testg phase the experece shows that some cases the summato hac soluto ca mprove the effcecy of classfcato ad may cases t exceeds the Nave Bayesa. It cofrms the assumptos that the usage of summato hac Bayesa classfcato ca crease accuracy whe the samples cota a great umber of utraed attrbute values. The accuracy of classfcato depeds o may parameters of the evromet. Oe of the most mportat factors s the maxmum attrbute value parameter. Fg. 4 shows the accuracy fuctos for the followg maxmum attrbute parameter values: 20 (NB20,SH20), 00 (NB00,SH00) ad 500 (NB500,SH500). The otato NB s for Nave Bayesa algorthm ad SH for the modfed Bayesa algorthm. The accuracy of both algorthms has creased wth creasg the sze of the trag set.
10 50 P. BARABÁS AND L. KOVÁCS accuracy (%) sze of trag set NB20 SH20 NB00 SH00 NB500 SH500 Fgure 4. Relatve accuracy of algorthms accordg to umber of teachg samples 6. Coclusos Summato hac s a alteratve for the Nave Bayesa classfer wth larger probablty approxmato errors. Tag a decso tree as a referece classfer, we have compared the Nave Bayesa classfer wth the Bayesa classfer usg summato hac. The test results show that both methods ca yeld the same accuracy as the decso tree method has the case of large trag sets. REFERENCES [] THOMAS P. MINKA: The summato hac as a outler model, techcal ote, August 22, 2003 [2] HOLSTROM L, KOISTIEN P, LAAKSONEN J., OJA E: Neural ad Statstcal Classfers - Taxoomy ad Two Case Studes, IEEE Tras. O Neural Networs, Vol 8, No, 997. [3] KOVÁCS L., TERSTYÁNSZKI G.: Improved Classfcato Algorthm for the Couter Propagato Networ, Proceedgs of IJCNN 2000, Como, Italy. [4] JOAQUIM P. MARQUES DE SÁ: Appled Statstcs Usg SPSS, Statstca, Matlab ad R, Sprger, 2007, pp [5] FUCHUN PENG, DALE SHUURMANS, SHAOJUN WANG: Augmetg Nave Bayes Classfers wth Statstcal Laguage Models, Iformato Retreval, 7, Kluwer Academc Publshers, 2004, Netherlads, pp [6] ROBERT P.W. DUNN: Small sample sze geeralzato, 9 th Scadava Coferece of Image Aalyss, Jue 6-9, 995, Uppsala, Swede [7] BARABÁS P., KOVÁCS L.: Usablty of summato hac Bayes Classfcato, 9 th Iteratoal Symposum of Hugara Researchers o Computatoal Itellgece ad Iformatcs, November 6-8, 2008, Budapest, Hugary
Estimation of Stress- Strength Reliability model using finite mixture of exponential distributions
Iteratoal Joural of Computatoal Egeerg Research Vol, 0 Issue, Estmato of Stress- Stregth Relablty model usg fte mxture of expoetal dstrbutos K.Sadhya, T.S.Umamaheswar Departmet of Mathematcs, Lal Bhadur
More informationIntroduction to local (nonparametric) density estimation. methods
Itroducto to local (oparametrc) desty estmato methods A slecture by Yu Lu for ECE 66 Sprg 014 1. Itroducto Ths slecture troduces two local desty estmato methods whch are Parze desty estmato ad k-earest
More informationESS Line Fitting
ESS 5 014 17. Le Fttg A very commo problem data aalyss s lookg for relatoshpetwee dfferet parameters ad fttg les or surfaces to data. The smplest example s fttg a straght le ad we wll dscuss that here
More informationFunctions of Random Variables
Fuctos of Radom Varables Chapter Fve Fuctos of Radom Varables 5. Itroducto A geeral egeerg aalyss model s show Fg. 5.. The model output (respose) cotas the performaces of a system or product, such as weght,
More informationBounds on the expected entropy and KL-divergence of sampled multinomial distributions. Brandon C. Roy
Bouds o the expected etropy ad KL-dvergece of sampled multomal dstrbutos Brado C. Roy bcroy@meda.mt.edu Orgal: May 18, 2011 Revsed: Jue 6, 2011 Abstract Iformato theoretc quattes calculated from a sampled
More informationChapter 4 (Part 1): Non-Parametric Classification (Sections ) Pattern Classification 4.3) Announcements
Aoucemets No-Parametrc Desty Estmato Techques HW assged Most of ths lecture was o the blacboard. These sldes cover the same materal as preseted DHS Bometrcs CSE 90-a Lecture 7 CSE90a Fall 06 CSE90a Fall
More informationSummary of the lecture in Biostatistics
Summary of the lecture Bostatstcs Probablty Desty Fucto For a cotuos radom varable, a probablty desty fucto s a fucto such that: 0 dx a b) b a dx A probablty desty fucto provdes a smple descrpto of the
More informationTESTS BASED ON MAXIMUM LIKELIHOOD
ESE 5 Toy E. Smth. The Basc Example. TESTS BASED ON MAXIMUM LIKELIHOOD To llustrate the propertes of maxmum lkelhood estmates ad tests, we cosder the smplest possble case of estmatg the mea of the ormal
More informationPoint Estimation: definition of estimators
Pot Estmato: defto of estmators Pot estmator: ay fucto W (X,..., X ) of a data sample. The exercse of pot estmato s to use partcular fuctos of the data order to estmate certa ukow populato parameters.
More informationCHAPTER VI Statistical Analysis of Experimental Data
Chapter VI Statstcal Aalyss of Expermetal Data CHAPTER VI Statstcal Aalyss of Expermetal Data Measuremets do ot lead to a uque value. Ths s a result of the multtude of errors (maly radom errors) that ca
More informationBayes (Naïve or not) Classifiers: Generative Approach
Logstc regresso Bayes (Naïve or ot) Classfers: Geeratve Approach What do we mea by Geeratve approach: Lear p(y), p(x y) ad the apply bayes rule to compute p(y x) for makg predctos Ths s essetally makg
More informationSTK4011 and STK9011 Autumn 2016
STK4 ad STK9 Autum 6 Pot estmato Covers (most of the followg materal from chapter 7: Secto 7.: pages 3-3 Secto 7..: pages 3-33 Secto 7..: pages 35-3 Secto 7..3: pages 34-35 Secto 7.3.: pages 33-33 Secto
More informationSolving Constrained Flow-Shop Scheduling. Problems with Three Machines
It J Cotemp Math Sceces, Vol 5, 2010, o 19, 921-929 Solvg Costraed Flow-Shop Schedulg Problems wth Three Maches P Pada ad P Rajedra Departmet of Mathematcs, School of Advaced Sceces, VIT Uversty, Vellore-632
More information{ }{ ( )} (, ) = ( ) ( ) ( ) Chapter 14 Exercises in Sampling Theory. Exercise 1 (Simple random sampling): Solution:
Chapter 4 Exercses Samplg Theory Exercse (Smple radom samplg: Let there be two correlated radom varables X ad A sample of sze s draw from a populato by smple radom samplg wthout replacemet The observed
More informationThe Mathematical Appendix
The Mathematcal Appedx Defto A: If ( Λ, Ω, where ( λ λ λ whch the probablty dstrbutos,,..., Defto A. uppose that ( Λ,,..., s a expermet type, the σ-algebra o λ λ λ are defed s deoted by ( (,,...,, σ Ω.
More informationBayes Estimator for Exponential Distribution with Extension of Jeffery Prior Information
Malaysa Joural of Mathematcal Sceces (): 97- (9) Bayes Estmator for Expoetal Dstrbuto wth Exteso of Jeffery Pror Iformato Hadeel Salm Al-Kutub ad Noor Akma Ibrahm Isttute for Mathematcal Research, Uverst
More informationLecture Notes Types of economic variables
Lecture Notes 3 1. Types of ecoomc varables () Cotuous varable takes o a cotuum the sample space, such as all pots o a le or all real umbers Example: GDP, Polluto cocetrato, etc. () Dscrete varables fte
More informationENGI 3423 Simple Linear Regression Page 12-01
ENGI 343 mple Lear Regresso Page - mple Lear Regresso ometmes a expermet s set up where the expermeter has cotrol over the values of oe or more varables X ad measures the resultg values of aother varable
More information6.867 Machine Learning
6.867 Mache Learg Problem set Due Frday, September 9, rectato Please address all questos ad commets about ths problem set to 6.867-staff@a.mt.edu. You do ot eed to use MATLAB for ths problem set though
More informationNon-uniform Turán-type problems
Joural of Combatoral Theory, Seres A 111 2005 106 110 wwwelsevercomlocatecta No-uform Turá-type problems DhruvMubay 1, Y Zhao 2 Departmet of Mathematcs, Statstcs, ad Computer Scece, Uversty of Illos at
More informationTHE ROYAL STATISTICAL SOCIETY GRADUATE DIPLOMA
THE ROYAL STATISTICAL SOCIETY EXAMINATIONS SOLUTIONS GRADUATE DIPLOMA PAPER II STATISTICAL THEORY & METHODS The Socety provdes these solutos to assst caddates preparg for the examatos future years ad for
More informationUnimodality Tests for Global Optimization of Single Variable Functions Using Statistical Methods
Malaysa Umodalty Joural Tests of Mathematcal for Global Optmzato Sceces (): of 05 Sgle - 5 Varable (007) Fuctos Usg Statstcal Methods Umodalty Tests for Global Optmzato of Sgle Varable Fuctos Usg Statstcal
More informationAn Introduction to. Support Vector Machine
A Itroducto to Support Vector Mache Support Vector Mache (SVM) A classfer derved from statstcal learg theory by Vapk, et al. 99 SVM became famous whe, usg mages as put, t gave accuracy comparable to eural-etwork
More informationA New Measure of Probabilistic Entropy. and its Properties
Appled Mathematcal Sceces, Vol. 4, 200, o. 28, 387-394 A New Measure of Probablstc Etropy ad ts Propertes Rajeesh Kumar Departmet of Mathematcs Kurukshetra Uversty Kurukshetra, Ida rajeesh_kuk@redffmal.com
More informationUNIVERSITY OF OSLO DEPARTMENT OF ECONOMICS
UNIVERSITY OF OSLO DEPARTMENT OF ECONOMICS Exam: ECON430 Statstcs Date of exam: Frday, December 8, 07 Grades are gve: Jauary 4, 08 Tme for exam: 0900 am 00 oo The problem set covers 5 pages Resources allowed:
More informationLecture 3. Sampling, sampling distributions, and parameter estimation
Lecture 3 Samplg, samplg dstrbutos, ad parameter estmato Samplg Defto Populato s defed as the collecto of all the possble observatos of terest. The collecto of observatos we take from the populato s called
More informationSimple Linear Regression
Statstcal Methods I (EST 75) Page 139 Smple Lear Regresso Smple regresso applcatos are used to ft a model descrbg a lear relatoshp betwee two varables. The aspects of least squares regresso ad correlato
More informationModule 7: Probability and Statistics
Lecture 4: Goodess of ft tests. Itroducto Module 7: Probablty ad Statstcs I the prevous two lectures, the cocepts, steps ad applcatos of Hypotheses testg were dscussed. Hypotheses testg may be used to
More informationClass 13,14 June 17, 19, 2015
Class 3,4 Jue 7, 9, 05 Pla for Class3,4:. Samplg dstrbuto of sample mea. The Cetral Lmt Theorem (CLT). Cofdece terval for ukow mea.. Samplg Dstrbuto for Sample mea. Methods used are based o CLT ( Cetral
More informationAnalysis of Lagrange Interpolation Formula
P IJISET - Iteratoal Joural of Iovatve Scece, Egeerg & Techology, Vol. Issue, December 4. www.jset.com ISS 348 7968 Aalyss of Lagrage Iterpolato Formula Vjay Dahya PDepartmet of MathematcsMaharaja Surajmal
More informationEntropy ISSN by MDPI
Etropy 2003, 5, 233-238 Etropy ISSN 1099-4300 2003 by MDPI www.mdp.org/etropy O the Measure Etropy of Addtve Cellular Automata Hasa Aı Arts ad Sceces Faculty, Departmet of Mathematcs, Harra Uversty; 63100,
More informationOrdinary Least Squares Regression. Simple Regression. Algebra and Assumptions.
Ordary Least Squares egresso. Smple egresso. Algebra ad Assumptos. I ths part of the course we are gog to study a techque for aalysg the lear relatoshp betwee two varables Y ad X. We have pars of observatos
More informationUnsupervised Learning and Other Neural Networks
CSE 53 Soft Computg NOT PART OF THE FINAL Usupervsed Learg ad Other Neural Networs Itroducto Mture Destes ad Idetfablty ML Estmates Applcato to Normal Mtures Other Neural Networs Itroducto Prevously, all
More informationρ < 1 be five real numbers. The
Lecture o BST 63: Statstcal Theory I Ku Zhag, /0/006 Revew for the prevous lecture Deftos: covarace, correlato Examples: How to calculate covarace ad correlato Theorems: propertes of correlato ad covarace
More informationGenerating Multivariate Nonnormal Distribution Random Numbers Based on Copula Function
7659, Eglad, UK Joural of Iformato ad Computg Scece Vol. 2, No. 3, 2007, pp. 9-96 Geeratg Multvarate Noormal Dstrbuto Radom Numbers Based o Copula Fucto Xaopg Hu +, Jam He ad Hogsheg Ly School of Ecoomcs
More informationChapter 14 Logistic Regression Models
Chapter 4 Logstc Regresso Models I the lear regresso model X β + ε, there are two types of varables explaatory varables X, X,, X k ad study varable y These varables ca be measured o a cotuous scale as
More informationTHE ROYAL STATISTICAL SOCIETY 2016 EXAMINATIONS SOLUTIONS HIGHER CERTIFICATE MODULE 5
THE ROYAL STATISTICAL SOCIETY 06 EAMINATIONS SOLUTIONS HIGHER CERTIFICATE MODULE 5 The Socety s provdg these solutos to assst cadtes preparg for the examatos 07. The solutos are teded as learg ads ad should
More informationBayes Interval Estimation for binomial proportion and difference of two binomial proportions with Simulation Study
IJIEST Iteratoal Joural of Iovatve Scece, Egeerg & Techology, Vol. Issue 5, July 04. Bayes Iterval Estmato for bomal proporto ad dfferece of two bomal proportos wth Smulato Study Masoud Gaj, Solmaz hlmad
More informationDepartment of Agricultural Economics. PhD Qualifier Examination. August 2011
Departmet of Agrcultural Ecoomcs PhD Qualfer Examato August 0 Istructos: The exam cossts of sx questos You must aswer all questos If you eed a assumpto to complete a questo, state the assumpto clearly
More informationChapter 3 Sampling For Proportions and Percentages
Chapter 3 Samplg For Proportos ad Percetages I may stuatos, the characterstc uder study o whch the observatos are collected are qualtatve ature For example, the resposes of customers may marketg surveys
More informationSome Notes on the Probability Space of Statistical Surveys
Metodološk zvezk, Vol. 7, No., 200, 7-2 ome Notes o the Probablty pace of tatstcal urveys George Petrakos Abstract Ths paper troduces a formal presetato of samplg process usg prcples ad cocepts from Probablty
More informationBeam Warming Second-Order Upwind Method
Beam Warmg Secod-Order Upwd Method Petr Valeta Jauary 6, 015 Ths documet s a part of the assessmet work for the subject 1DRP Dfferetal Equatos o Computer lectured o FNSPE CTU Prague. Abstract Ths documet
More informationBAYESIAN INFERENCES FOR TWO PARAMETER WEIBULL DISTRIBUTION
Iteratoal Joural of Mathematcs ad Statstcs Studes Vol.4, No.3, pp.5-39, Jue 06 Publshed by Europea Cetre for Research Trag ad Developmet UK (www.eajourals.org BAYESIAN INFERENCES FOR TWO PARAMETER WEIBULL
More informationChapter 8. Inferences about More Than Two Population Central Values
Chapter 8. Ifereces about More Tha Two Populato Cetral Values Case tudy: Effect of Tmg of the Treatmet of Port-We tas wth Lasers ) To vestgate whether treatmet at a youg age would yeld better results tha
More information2.28 The Wall Street Journal is probably referring to the average number of cubes used per glass measured for some population that they have chosen.
.5 x 54.5 a. x 7. 786 7 b. The raked observatos are: 7.4, 7.5, 7.7, 7.8, 7.9, 8.0, 8.. Sce the sample sze 7 s odd, the meda s the (+)/ 4 th raked observato, or meda 7.8 c. The cosumer would more lkely
More informationMedian as a Weighted Arithmetic Mean of All Sample Observations
Meda as a Weghted Arthmetc Mea of All Sample Observatos SK Mshra Dept. of Ecoomcs NEHU, Shllog (Ida). Itroducto: Iumerably may textbooks Statstcs explctly meto that oe of the weakesses (or propertes) of
More informationOverview. Basic concepts of Bayesian learning. Most probable model given data Coin tosses Linear regression Logistic regression
Overvew Basc cocepts of Bayesa learg Most probable model gve data Co tosses Lear regresso Logstc regresso Bayesa predctos Co tosses Lear regresso 30 Recap: regresso problems Iput to learg problem: trag
More informationX X X E[ ] E X E X. is the ()m n where the ( i,)th. j element is the mean of the ( i,)th., then
Secto 5 Vectors of Radom Varables Whe workg wth several radom varables,,..., to arrage them vector form x, t s ofte coveet We ca the make use of matrx algebra to help us orgaze ad mapulate large umbers
More informationSimple Linear Regression
Correlato ad Smple Lear Regresso Berl Che Departmet of Computer Scece & Iformato Egeerg Natoal Tawa Normal Uversty Referece:. W. Navd. Statstcs for Egeerg ad Scetsts. Chapter 7 (7.-7.3) & Teachg Materal
More informationLECTURE 2: Linear and quadratic classifiers
LECURE : Lear ad quadratc classfers g Part : Bayesa Decso heory he Lkelhood Rato est Maxmum A Posteror ad Maxmum Lkelhood Dscrmat fuctos g Part : Quadratc classfers Bayes classfers for ormally dstrbuted
More informationMAX-MIN AND MIN-MAX VALUES OF VARIOUS MEASURES OF FUZZY DIVERGENCE
merca Jr of Mathematcs ad Sceces Vol, No,(Jauary 0) Copyrght Md Reader Publcatos wwwjouralshubcom MX-MIN ND MIN-MX VLUES OF VRIOUS MESURES OF FUZZY DIVERGENCE RKTul Departmet of Mathematcs SSM College
More informationLecture 8: Linear Regression
Lecture 8: Lear egresso May 4, GENOME 56, Sprg Goals Develop basc cocepts of lear regresso from a probablstc framework Estmatg parameters ad hypothess testg wth lear models Lear regresso Su I Lee, CSE
More informationLecture 3 Probability review (cont d)
STATS 00: Itroducto to Statstcal Iferece Autum 06 Lecture 3 Probablty revew (cot d) 3. Jot dstrbutos If radom varables X,..., X k are depedet, the ther dstrbuto may be specfed by specfyg the dvdual dstrbuto
More informationChapter 5 Properties of a Random Sample
Lecture 6 o BST 63: Statstcal Theory I Ku Zhag, /0/008 Revew for the prevous lecture Cocepts: t-dstrbuto, F-dstrbuto Theorems: Dstrbutos of sample mea ad sample varace, relatoshp betwee sample mea ad sample
More information(b) By independence, the probability that the string 1011 is received correctly is
Soluto to Problem 1.31. (a) Let A be the evet that a 0 s trasmtted. Usg the total probablty theorem, the desred probablty s P(A)(1 ɛ ( 0)+ 1 P(A) ) (1 ɛ 1)=p(1 ɛ 0)+(1 p)(1 ɛ 1). (b) By depedece, the probablty
More informationSPECIAL CONSIDERATIONS FOR VOLUMETRIC Z-TEST FOR PROPORTIONS
SPECIAL CONSIDERAIONS FOR VOLUMERIC Z-ES FOR PROPORIONS Oe s stctve reacto to the questo of whether two percetages are sgfcatly dfferet from each other s to treat them as f they were proportos whch the
More information( ) = ( ) ( ) Chapter 13 Asymptotic Theory and Stochastic Regressors. Stochastic regressors model
Chapter 3 Asmptotc Theor ad Stochastc Regressors The ature of eplaator varable s assumed to be o-stochastc or fed repeated samples a regresso aalss Such a assumpto s approprate for those epermets whch
More informationEconometric Methods. Review of Estimation
Ecoometrc Methods Revew of Estmato Estmatg the populato mea Radom samplg Pot ad terval estmators Lear estmators Ubased estmators Lear Ubased Estmators (LUEs) Effcecy (mmum varace) ad Best Lear Ubased Estmators
More informationMULTIDIMENSIONAL HETEROGENEOUS VARIABLE PREDICTION BASED ON EXPERTS STATEMENTS. Gennadiy Lbov, Maxim Gerasimov
Iteratoal Boo Seres "Iformato Scece ad Computg" 97 MULTIIMNSIONAL HTROGNOUS VARIABL PRICTION BAS ON PRTS STATMNTS Geady Lbov Maxm Gerasmov Abstract: I the wors [ ] we proposed a approach of formg a cosesus
More informationENGI 4421 Joint Probability Distributions Page Joint Probability Distributions [Navidi sections 2.5 and 2.6; Devore sections
ENGI 441 Jot Probablty Dstrbutos Page 7-01 Jot Probablty Dstrbutos [Navd sectos.5 ad.6; Devore sectos 5.1-5.] The jot probablty mass fucto of two dscrete radom quattes, s, P ad p x y x y The margal probablty
More informationConfidence Intervals for Double Exponential Distribution: A Simulation Approach
World Academy of Scece, Egeerg ad Techology Iteratoal Joural of Physcal ad Mathematcal Sceces Vol:6, No:, 0 Cofdece Itervals for Double Expoetal Dstrbuto: A Smulato Approach M. Alrasheed * Iteratoal Scece
More informationA tighter lower bound on the circuit size of the hardest Boolean functions
Electroc Colloquum o Computatoal Complexty, Report No. 86 2011) A tghter lower boud o the crcut sze of the hardest Boolea fuctos Masak Yamamoto Abstract I [IPL2005], Fradse ad Mlterse mproved bouds o the
More informationChapter 13, Part A Analysis of Variance and Experimental Design. Introduction to Analysis of Variance. Introduction to Analysis of Variance
Chapter, Part A Aalyss of Varace ad Epermetal Desg Itroducto to Aalyss of Varace Aalyss of Varace: Testg for the Equalty of Populato Meas Multple Comparso Procedures Itroducto to Aalyss of Varace Aalyss
More informationOn generalized fuzzy mean code word lengths. Department of Mathematics, Jaypee University of Engineering and Technology, Guna, Madhya Pradesh, India
merca Joural of ppled Mathematcs 04; (4): 7-34 Publshed ole ugust 30, 04 (http://www.scecepublshggroup.com//aam) do: 0.648/.aam.04004.3 ISSN: 330-0043 (Prt); ISSN: 330-006X (Ole) O geeralzed fuzzy mea
More informationUNIVERSITY OF OSLO DEPARTMENT OF ECONOMICS
UNIVERSITY OF OSLO DEPARTMENT OF ECONOMICS Postpoed exam: ECON430 Statstcs Date of exam: Jauary 0, 0 Tme for exam: 09:00 a.m. :00 oo The problem set covers 5 pages Resources allowed: All wrtte ad prted
More informationBayesian Classification. CS690L Data Mining: Classification(2) Bayesian Theorem: Basics. Bayesian Theorem. Training dataset. Naïve Bayes Classifier
Baa Classfcato CS6L Data Mg: Classfcato() Referece: J. Ha ad M. Kamber, Data Mg: Cocepts ad Techques robablstc learg: Calculate explct probabltes for hypothess, amog the most practcal approaches to certa
More informationLecture 07: Poles and Zeros
Lecture 07: Poles ad Zeros Defto of poles ad zeros The trasfer fucto provdes a bass for determg mportat system respose characterstcs wthout solvg the complete dfferetal equato. As defed, the trasfer fucto
More information12.2 Estimating Model parameters Assumptions: ox and y are related according to the simple linear regression model
1. Estmatg Model parameters Assumptos: ox ad y are related accordg to the smple lear regresso model (The lear regresso model s the model that says that x ad y are related a lear fasho, but the observed
More informationMOLECULAR VIBRATIONS
MOLECULAR VIBRATIONS Here we wsh to vestgate molecular vbratos ad draw a smlarty betwee the theory of molecular vbratos ad Hückel theory. 1. Smple Harmoc Oscllator Recall that the eergy of a oe-dmesoal
More informationThird handout: On the Gini Index
Thrd hadout: O the dex Corrado, a tala statstca, proposed (, 9, 96) to measure absolute equalt va the mea dfferece whch s defed as ( / ) where refers to the total umber of dvduals socet. Assume that. The
More informationLecture 7. Confidence Intervals and Hypothesis Tests in the Simple CLR Model
Lecture 7. Cofdece Itervals ad Hypothess Tests the Smple CLR Model I lecture 6 we troduced the Classcal Lear Regresso (CLR) model that s the radom expermet of whch the data Y,,, K, are the outcomes. The
More informationMean is only appropriate for interval or ratio scales, not ordinal or nominal.
Mea Same as ordary average Sum all the data values ad dvde by the sample sze. x = ( x + x +... + x Usg summato otato, we wrte ths as x = x = x = = ) x Mea s oly approprate for terval or rato scales, ot
More informationCubic Nonpolynomial Spline Approach to the Solution of a Second Order Two-Point Boundary Value Problem
Joural of Amerca Scece ;6( Cubc Nopolyomal Sple Approach to the Soluto of a Secod Order Two-Pot Boudary Value Problem W.K. Zahra, F.A. Abd El-Salam, A.A. El-Sabbagh ad Z.A. ZAk * Departmet of Egeerg athematcs
More informationPermutation Tests for More Than Two Samples
Permutato Tests for ore Tha Two Samples Ferry Butar Butar, Ph.D. Abstract A F statstc s a classcal test for the aalyss of varace where the uderlyg dstrbuto s a ormal. For uspecfed dstrbutos, the permutato
More informationA Robust Total Least Mean Square Algorithm For Nonlinear Adaptive Filter
A Robust otal east Mea Square Algorthm For Nolear Adaptve Flter Ruxua We School of Electroc ad Iformato Egeerg X'a Jaotog Uversty X'a 70049, P.R. Cha rxwe@chare.com Chogzhao Ha, azhe u School of Electroc
More informationPTAS for Bin-Packing
CS 663: Patter Matchg Algorthms Scrbe: Che Jag /9/00. Itroducto PTAS for B-Packg The B-Packg problem s NP-hard. If we use approxmato algorthms, the B-Packg problem could be solved polyomal tme. For example,
More informationMEASURES OF DISPERSION
MEASURES OF DISPERSION Measure of Cetral Tedecy: Measures of Cetral Tedecy ad Dsperso ) Mathematcal Average: a) Arthmetc mea (A.M.) b) Geometrc mea (G.M.) c) Harmoc mea (H.M.) ) Averages of Posto: a) Meda
More informationPROJECTION PROBLEM FOR REGULAR POLYGONS
Joural of Mathematcal Sceces: Advaces ad Applcatos Volume, Number, 008, Pages 95-50 PROJECTION PROBLEM FOR REGULAR POLYGONS College of Scece Bejg Forestry Uversty Bejg 0008 P. R. Cha e-mal: sl@bjfu.edu.c
More informationDerivation of 3-Point Block Method Formula for Solving First Order Stiff Ordinary Differential Equations
Dervato of -Pot Block Method Formula for Solvg Frst Order Stff Ordary Dfferetal Equatos Kharul Hamd Kharul Auar, Kharl Iskadar Othma, Zara Bb Ibrahm Abstract Dervato of pot block method formula wth costat
More informationBayesian Inferences for Two Parameter Weibull Distribution Kipkoech W. Cheruiyot 1, Abel Ouko 2, Emily Kirimi 3
IOSR Joural of Mathematcs IOSR-JM e-issn: 78-578, p-issn: 9-765X. Volume, Issue Ver. II Ja - Feb. 05, PP 4- www.osrjourals.org Bayesa Ifereces for Two Parameter Webull Dstrbuto Kpkoech W. Cheruyot, Abel
More informationPart 4b Asymptotic Results for MRR2 using PRESS. Recall that the PRESS statistic is a special type of cross validation procedure (see Allen (1971))
art 4b Asymptotc Results for MRR usg RESS Recall that the RESS statstc s a specal type of cross valdato procedure (see Alle (97)) partcular to the regresso problem ad volves fdg Y $,, the estmate at the
More information2006 Jamie Trahan, Autar Kaw, Kevin Martin University of South Florida United States of America
SOLUTION OF SYSTEMS OF SIMULTANEOUS LINEAR EQUATIONS Gauss-Sedel Method 006 Jame Traha, Autar Kaw, Kev Mart Uversty of South Florda Uted States of Amerca kaw@eg.usf.edu Itroducto Ths worksheet demostrates
More informationSupervised learning: Linear regression Logistic regression
CS 57 Itroducto to AI Lecture 4 Supervsed learg: Lear regresso Logstc regresso Mlos Hauskrecht mlos@cs.ptt.edu 539 Seott Square CS 57 Itro to AI Data: D { D D.. D D Supervsed learg d a set of eamples s
More informationLecture 1. (Part II) The number of ways of partitioning n distinct objects into k distinct groups containing n 1,
Lecture (Part II) Materals Covered Ths Lecture: Chapter 2 (2.6 --- 2.0) The umber of ways of parttog dstct obects to dstct groups cotag, 2,, obects, respectvely, where each obect appears exactly oe group
More informationBootstrap Method for Testing of Equality of Several Coefficients of Variation
Cloud Publcatos Iteratoal Joural of Advaced Mathematcs ad Statstcs Volume, pp. -6, Artcle ID Sc- Research Artcle Ope Access Bootstrap Method for Testg of Equalty of Several Coeffcets of Varato Dr. Navee
More informationA New Family of Transformations for Lifetime Data
Proceedgs of the World Cogress o Egeerg 4 Vol I, WCE 4, July - 4, 4, Lodo, U.K. A New Famly of Trasformatos for Lfetme Data Lakhaa Watthaacheewakul Abstract A famly of trasformatos s the oe of several
More informationECONOMETRIC THEORY. MODULE VIII Lecture - 26 Heteroskedasticity
ECONOMETRIC THEORY MODULE VIII Lecture - 6 Heteroskedastcty Dr. Shalabh Departmet of Mathematcs ad Statstcs Ida Isttute of Techology Kapur . Breusch Paga test Ths test ca be appled whe the replcated data
More informationSTA 108 Applied Linear Models: Regression Analysis Spring Solution for Homework #1
STA 08 Appled Lear Models: Regresso Aalyss Sprg 0 Soluto for Homework #. Let Y the dollar cost per year, X the umber of vsts per year. The the mathematcal relato betwee X ad Y s: Y 300 + X. Ths s a fuctoal
More informationDescriptive Statistics
Page Techcal Math II Descrptve Statstcs Descrptve Statstcs Descrptve statstcs s the body of methods used to represet ad summarze sets of data. A descrpto of how a set of measuremets (for eample, people
More informationObjectives of Multiple Regression
Obectves of Multple Regresso Establsh the lear equato that best predcts values of a depedet varable Y usg more tha oe eplaator varable from a large set of potetal predctors {,,... k }. Fd that subset of
More informationMultivariate Transformation of Variables and Maximum Likelihood Estimation
Marquette Uversty Multvarate Trasformato of Varables ad Maxmum Lkelhood Estmato Dael B. Rowe, Ph.D. Assocate Professor Departmet of Mathematcs, Statstcs, ad Computer Scece Copyrght 03 by Marquette Uversty
More informationhp calculators HP 30S Statistics Averages and Standard Deviations Average and Standard Deviation Practice Finding Averages and Standard Deviations
HP 30S Statstcs Averages ad Stadard Devatos Average ad Stadard Devato Practce Fdg Averages ad Stadard Devatos HP 30S Statstcs Averages ad Stadard Devatos Average ad stadard devato The HP 30S provdes several
More informationThe number of observed cases The number of parameters. ith case of the dichotomous dependent variable. the ith case of the jth parameter
LOGISTIC REGRESSION Notato Model Logstc regresso regresses a dchotomous depedet varable o a set of depedet varables. Several methods are mplemeted for selectg the depedet varables. The followg otato s
More informationSpecial Instructions / Useful Data
JAM 6 Set of all real umbers P A..d. B, p Posso Specal Istructos / Useful Data x,, :,,, x x Probablty of a evet A Idepedetly ad detcally dstrbuted Bomal dstrbuto wth parameters ad p Posso dstrbuto wth
More informationLecture 9: Tolerant Testing
Lecture 9: Tolerat Testg Dael Kae Scrbe: Sakeerth Rao Aprl 4, 07 Abstract I ths lecture we prove a quas lear lower boud o the umber of samples eeded to do tolerat testg for L dstace. Tolerat Testg We have
More informationAssignment 5/MATH 247/Winter Due: Friday, February 19 in class (!) (answers will be posted right after class)
Assgmet 5/MATH 7/Wter 00 Due: Frday, February 9 class (!) (aswers wll be posted rght after class) As usual, there are peces of text, before the questos [], [], themselves. Recall: For the quadratc form
More informationVOL. 3, NO. 11, November 2013 ISSN ARPN Journal of Science and Technology All rights reserved.
VOL., NO., November 0 ISSN 5-77 ARPN Joural of Scece ad Techology 0-0. All rghts reserved. http://www.ejouralofscece.org Usg Square-Root Iverted Gamma Dstrbuto as Pror to Draw Iferece o the Raylegh Dstrbuto
More informationModule 7. Lecture 7: Statistical parameter estimation
Lecture 7: Statstcal parameter estmato Parameter Estmato Methods of Parameter Estmato 1) Method of Matchg Pots ) Method of Momets 3) Mamum Lkelhood method Populato Parameter Sample Parameter Ubased estmato
More informationOn Fuzzy Arithmetic, Possibility Theory and Theory of Evidence
O Fuzzy rthmetc, Possblty Theory ad Theory of Evdece suco P. Cucala, Jose Vllar Isttute of Research Techology Uversdad Potfca Comllas C/ Sata Cruz de Marceado 6 8 Madrd. Spa bstract Ths paper explores
More informationOutline. Point Pattern Analysis Part I. Revisit IRP/CSR
Pot Patter Aalyss Part I Outle Revst IRP/CSR, frst- ad secod order effects What s pot patter aalyss (PPA)? Desty-based pot patter measures Dstace-based pot patter measures Revst IRP/CSR Equal probablty:
More information