Gene Selection for Colon Cancer Classification using Bayesian Model Averaging of Linear and Quadratic Discriminants
|
|
- Reginald Mathews
- 5 years ago
- Views:
Transcription
1 Gene Selection for Colon Cancer Classification using Bayesian Model Averaging of Linear and Quadratic Discriinants Oyebayo Ridwan Olaniran*, Mohd Asrul Affendi Abdullah Departent of Matheatics and Statistics, Faculty of Applied Sciences and Technology, Universiti Tun Hussein Onn Malaysia, Pagoh Educational Hub, Pagoh, Johor, Malaysia. Received 30 Septeber 2017; accepted 30 Noveber 2017; available online 28 Deceber 2017 Abstract: Recent findings reveal that various cancer types can be diagnosed using non-clinical approach which involves onitoring of the biological saples using their genes expression profiles. Two of the widely used ethods are Linear and Quadratic discriinant analyses. In this paper, the behaviours of Linear and Quadratic Discriinants Analyses (LDA and QDA) were observed within the fraework of Bayesian odel averaging. We applied Bayesian odel averaging to tackle odel uncertain proble inherent in discriinant analysis. We calibrated the developed classifier on published real life icroarray colon cancer data that contained 2000 gene expression profiles easured on 62 biological saples that coprised 40 tuorous tissue saples and 22 noral tissue saples. The data were also pre-processed using logarithic transforation to base 10 and zero ean unit variance noralization as often done with the dataset. In addition, coparison with Rando Forest (RF), Gradient Boosting Machine (GBM) and Bayesian Additive Regression Trees (BART) was also achieved. Various perforance results established the supreacy of the proposed ethod. Keyword: Cancer; Linear and Quadratic discriinant; Bayesian; Model Uncertainty; Model averaging. 1. Introduction Recent findings reveal that various cancer types can be diagnosed using non-clinical approach which involves onitoring of the biological saples using their genes expression profiles. However, this advanceent is possible due to the enhanceent of icroarray technology which ade it possible to observe gene expression levels of several gene chips concurrently [1, 2]. Several authors have discussed the health benefits of the non-clinical diagnosis breakthrough, but the ajor proble that still exists is how to adequately identify the few subsets of thousands genes whose inforation can be used to reliably classify the RNA saples into their respective biological groups. In addition, it has been observed that adequacy of any ethod strongly depends on the health probles [3]. Several types of achine learning algoriths have been proposed to perfor the task of non-clinical diagnosis RNA (essenger Ribonucleic acid). RNA saples are usually collected one several biological features (genes). The resulting data structure are of the for n p, where the nuber of patients n is far less than the nuber of biological features. This scenario is often tered as High-diensional data [4]. Highdiensionality poses serious proble in statistical analysis and in-fact when building achine learning algoriths. This is because ost statistical ethods require the nuber of patients (equations) to be ore than the nuber of attributes (paraeters) for a unique solution to exist. Bayesian procedures are the eerging solution to ost applications of statistics in the recent tie. In fact, it has the least error rate in theory [5,6]. LDA and QDA are often regarded as the Bayesian classifier [2, 7] because they are otivated by Bayes theore. Two of the iportant assuptions of LDA and QDA is norality assuption of the feature space (x) and also orthogonality of feature space [4]. The efficiency or accuracy of LDA and QDA classifier strongly relies on these two assuptions. The classifiers becoe unstable when the assuptions are not et. High-diensional data usually violates these assuptions as in the case of the data used in this research. The data do not satisfy the norality assuption as well as the diensionality proble often lead to ulticollinearity. In the light of this, robust *Corresponding author: rid4stat@yahoo.co 2017 UTHM Publisher. All right reserved. penerbit.uth.edu.y/ojs/index.php/jst
2 ethods like Rando Forests [8] stochastic gradient boosting [9], Bayesian additive regression trees [10], Bayesian additive regression trees using Bayesian odel averaging [6]. have been developed. The ethods are efficient but are coputationally expensive than the siple LDA and QDA. Therefore, in this paper, we developed enseble of LDA and QDA using Bayesian odel averaging approach in order to increase the efficiency of LDA and QDA when analysing high-diensional data. 2. Bayes Classifier The foreost Bayesian classification ethods are linear and quadratic discriinant analysis [11]. Specifically, Bayes classifier is defined by f C (x) over a rando vector X and rando vector Y where f C (x) Pr(X = x Y = c) (1) denoting the density function of X for an observation that coes fro the Cth class. In other words, f C (x) is relatively large if there is a high probability that an observation in the Cth class has X x, and f C (x) is sall if it is very unlikely that an observation in the Cth class has X x. Then Bayes theore states that; Pr(X = x Y = c) = π Cf C (x) C l=1 π l f l (x) (2) If we assue x is norally distributed and estiate its associated location and scale paraeter by axiizing the likelihood iplies we are constructing a Linear Discriinant Analysis (LDA) or Quadratic Discriinant Analysis (QDA). The classifier constructed is LDA if we assuing equality of class variance if otherwise its QDA. Forally, the discriinant δ c (x) for a class c is define as; δ c (x) = x T Σ 1 µ C 1 2 μ C T Σ 1 µ C + logπ c (3) where µ C is the location ean for class c, and Σ is covariance atrix of p predictors for class c. (2) is often referred to as LDA since we assue class variance are the sae across predictors. If otherwise we can obtain QDA as: The estiates of the unknown paraeters µ 1,..., µ C, π 1,..., π C, and Σ are usually estiated via Maxiu Likelihood (MLE, [11]) as; μ C = 1 n c n c x i i:y i =C C Σ = n c 1 n C Σ c c=1 The estiates obtained are then plugged into (3) and (4) to deterine the LDA and QDA classifier. 3. Bayesian Model Averaging of LDA and QDA Given a classification rule δ(x) as earlier derived, and its associated prior probability P(δ), then we can define a posterior density P(δ k Y) = L(δ k (x))p(δ k ) k=1 L(δ k (x))p(δ k ) (5) as density of all possible classifiers. Usually, what deterines classifier k is the subset of feature x in the atrix. For each k, there exist r subset of x space. Following the earlier work of Clyde on Bayesian odel averaging in [12]. It was derived that under regularized prior P(δ) can be approxiated with the posterior P(δ k Y) as; P(δ k Y) = exp( 0.5BIC(δ k))p(δ k ) P(δ k Y) = where; k=1 exp( 0.5BIC(δ k ))P(δ k ) exp( 0.5BIC(δ k)) k=1 exp( 0.5BIC(δ k )) L(δ k (x)) = exp( 0.5BIC(δ k )) BIC(δ k ) = 2log[L(δ k (x))] + plog(n) (6) (7) After posterior estiation, we can then estiate paraeter of interest by; P[ Y] = P(δ k Y)P( δ k, Y) k=1 δ c (x) = x T Σ 1 µ C 1 2 μ C T Σ c 1 µ C + logπ c (4) 141
3 Specifically, for LDA and QDA, the posterior class probabilities are: For LDA; P[X = x Y = c] = P(LDA Y)P(X = x LDA, Y) k=1 (8) For QDA; P[X = x Y = c] = P(QDA Y)P(X = x QDA, Y) k=1 (9) Equation (8) and (9) correspond to proposed ethods used in this research. 4. Data Calibration The data eployed for this study were obtained fro icroarray Princeton repository ( data/index.htl.) on colon cancer. The data contained 2000 gene expression profiles easured on 62 biological saples that coprised 40 tuorous tissue saples and 22 noral tissue saples. Details of the icroarray experient that produces the data can be found in the works of Alon works in [13]. The data were also pre-processed using logarithic transforation to base 10 and zero ean unit variance noralization as often done with the dataset. We copared the perforance of the ethods BMA-LDA and BMA-QDA with LDA, QDA, RF, BART and GBM using the class specific and overall perforance etrics. The etrics used include overall accuracy, balance accuracy, sensitivity, specificity, negative predictive value, positive predictive value, false positive and false negative. The etrics are coputed based on the confusion atrix between 10 folds cross validated test saples and the actual class. The confusion atrix is presented in Table 1 as observed in [14]; where TN represents True Negative, FP is the False Positive, FN represents False Negative and TP is the True Positive. Also, N* is the total predicted negative and P* represents total predicted positive. Siilarly, N is the total actual negative while P is the total actual positive. T represents the total nuber of observation equivalent to; T = TN + FP + TP + FN (10) Here negative eans noral cells while positive eans tuour cells. Accuracy (ACC): %ACC = 100 ( TN+TP ) T Sensitivity:%Sensitivity = 100 ( TP P ) Specificity:%Specificity = 100 ( TN N ) Balance Accuracy (BACC): %Sensitivity + %Specificity %BACC = 2 Positive Predictive Value (PPV): %PPV = 100 ( TP P ) Positive Predictive Value (PPV): %NPV = 100 ( TN N ) False Positive Rate (FPR): %FPR = 100 %Specificity False Negative Rate (FNR): %FNR = 100 %Sensitivity Misclassification Error Rate (MER): %MER = 100 %ACC Table 1 Confusion atrix True Class Predicted Class Total 0 TN FP N 1 FN TP P Total N* P* T 0: Noral, 1: Tuour 142
4 5. Results and Discussion Table 2 and Table 3 show the results based on v folds cross validation where v = 10. The overall perforance easure using accuracy revealed that the ost accurate classifier for diagnosing colon cancer is either BMA-LDA or BMA-QDA. To control class ibalance, we coputed the balance accuracy which also reveals that BMA-LDA or BMA-QDA are the ost accurate classifiers. Table 2 Perforance easures in (%) for BMA-LDA, BMA-QDA, LDA and QDA based on average of 10 folds cross validation. Methods Metrics BMA-LDA BMA-QDA LDA QDA sens specs PPV NPV FP FN er acc BACC Table 3 Perforance easures in (%) for RF, BART and GBM based on average of 10 folds cross validation. Methods Metrics RF BART GBM sens specs PPV NPV FP FN er acc BACC The class specific etrics is very iportant when diagnosing cancer. The procedure ust be highly sensitive for it to be able to detect its presence as early as possible. Sensitivity result in Table 1 shows that the ost sensitive classifiers are BMA-LDA and BMA-QDA with about 97% sensitivity. 6. Conclusion In this paper, we have presented Bayesian odel averaging approach of LDA and QDA for non-clinical diagnosis of colon cancer. The results using the real life dataset indicated that BMA-LDA and BMA-QDA are the best in ters of overall diagnostic accuracy as well as class specific accuracy. Acknowledgeent This work was supported by Universiti Tun Hussein Onn, Malaysia (grant nuber Vot, U607). References [1] Yahya W. B., Olaniran O. R. and Ige, S. O. (2014). On Bayesian Conjugate Noral Linear Regression and Ordinary Least Square Regression Methods: A Monte Carlo Study in Ilorin Journal of Science, Vol. 1 No. 1 pp [2] Olaniran, O.R., Olaniran, S. F., Yahya, W. B., Banjoko, A. W., Garba, M. K., Ausa, L. B. and Gatta, N. F.(2016). Iproved Bayesian Feature Selection and Classification Methods Using Bootstrap Prior Techniques in Anale. Seria Inforatică. Vol. 14 No. 2 pp [3] Si, Adelene YL;Minary, Peter; Levitt, Michael (2012). "Modeling nucleic acids" in Current Opinion in Structural Biology Nucleic acids/sequences and topology. Vol. 22 No. 3 pp [4] Hastie, T., Jaes, G., Witten, D., & Tibshirani, R. (2013). An Introduction to statistical learning. Springer, New York. [5] Lesaffre E. and Lawson, A. B. (2013). Bayesian Biostatistics. John Wiley & Sons, Ltd, New Jersey. [6] Hernández, B., Raftery, A. E., Pennington, S. R., & Parnell, A. C. (2015). Bayesian Additive Regression Trees using Bayesian Model Averaging. 143
5 [7] Olaniran, O. R., & Yahya, W. B. (2017). Bayesian Hypothesis Testing of Two Noral Saples using Bootstrap Prior Technique in Journal of Modern Applied Statistical Methods, Vol. 16 No. 2 pp [8] Breian, L. (2001). Rando forests. Machine Learning, Vol. 45, pp [9] Friedan, J. H. (2002). Greedy function approxiation: A gradient boosting achine in Annals Statistics., Vol. 29 No. 5 pp [10] Chipan, H.A. George, E. I. and McCulloch R. E. (2010). BART: Bayesian Additive Regression Trees in Annals Applied Statistics, Vol. 4, pp [11] Hastie T, Tibshirani R, Friedan J (2011). The Eleents of Statistical Learning: Prediction, Inference and Data Mining. 2nd edition. Springer- Verlag, New York. [12] Clyde, M. (1999). Bayesian Model Averaging and Model Search Strategies (with discussion). in Bayesian Statistics 6, eds. J. M. Bernardo, J. O. Berger, A. P. Dawid, and A. F. M. Sith, 157{185. Oxford: Oxford University Press. [13] Alon, U., Barkai, N., Notteran, D.A., Gish, K., Ybarra, S., Mack, D., Levine, A.J.(1999). Broad Patterns of Gene Expression Revealed by Clustering Analysis of Tuor and Noral Colon Tissues Probed by Oligonucleotide Arrays. in Proceeding of National Acadeia of Science, Vol. 96 pp [14] Banjoko, A. W., Yahya, W. B., Garba, M. K., Olaniran, O. R., Dauda, K. A., & Olorede, K. O. (2015). Efficient Support Vector Machine Classification of Diffuse Large B-Cell Lyphoa and Follicular Lyphoa MRNA Tissue Saples. Annals. Coputer Science Series, Vol. 13 No. 2 pp
Pattern Recognition and Machine Learning. Learning and Evaluation for Pattern Recognition
Pattern Recognition and Machine Learning Jaes L. Crowley ENSIMAG 3 - MMIS Fall Seester 2017 Lesson 1 4 October 2017 Outline Learning and Evaluation for Pattern Recognition Notation...2 1. The Pattern Recognition
More informationBoosting with log-loss
Boosting with log-loss Marco Cusuano-Towner Septeber 2, 202 The proble Suppose we have data exaples {x i, y i ) i =... } for a two-class proble with y i {, }. Let F x) be the predictor function with the
More informationIntelligent Systems: Reasoning and Recognition. Perceptrons and Support Vector Machines
Intelligent Systes: Reasoning and Recognition Jaes L. Crowley osig 1 Winter Seester 2018 Lesson 6 27 February 2018 Outline Perceptrons and Support Vector achines Notation...2 Linear odels...3 Lines, Planes
More informationEstimating Parameters for a Gaussian pdf
Pattern Recognition and achine Learning Jaes L. Crowley ENSIAG 3 IS First Seester 00/0 Lesson 5 7 Noveber 00 Contents Estiating Paraeters for a Gaussian pdf Notation... The Pattern Recognition Proble...3
More informationCombining Classifiers
Cobining Classifiers Generic ethods of generating and cobining ultiple classifiers Bagging Boosting References: Duda, Hart & Stork, pg 475-480. Hastie, Tibsharini, Friedan, pg 246-256 and Chapter 10. http://www.boosting.org/
More informationMachine Learning Basics: Estimators, Bias and Variance
Machine Learning Basics: Estiators, Bias and Variance Sargur N. srihari@cedar.buffalo.edu This is part of lecture slides on Deep Learning: http://www.cedar.buffalo.edu/~srihari/cse676 1 Topics in Basics
More informationProbability Distributions
Probability Distributions In Chapter, we ephasized the central role played by probability theory in the solution of pattern recognition probles. We turn now to an exploration of soe particular exaples
More informationKeywords: Estimator, Bias, Mean-squared error, normality, generalized Pareto distribution
Testing approxiate norality of an estiator using the estiated MSE and bias with an application to the shape paraeter of the generalized Pareto distribution J. Martin van Zyl Abstract In this work the norality
More informationLecture 12: Ensemble Methods. Introduction. Weighted Majority. Mixture of Experts/Committee. Σ k α k =1. Isabelle Guyon
Lecture 2: Enseble Methods Isabelle Guyon guyoni@inf.ethz.ch Introduction Book Chapter 7 Weighted Majority Mixture of Experts/Coittee Assue K experts f, f 2, f K (base learners) x f (x) Each expert akes
More informationExtension of CSRSM for the Parametric Study of the Face Stability of Pressurized Tunnels
Extension of CSRSM for the Paraetric Study of the Face Stability of Pressurized Tunnels Guilhe Mollon 1, Daniel Dias 2, and Abdul-Haid Soubra 3, M.ASCE 1 LGCIE, INSA Lyon, Université de Lyon, Doaine scientifique
More informationNon-Parametric Non-Line-of-Sight Identification 1
Non-Paraetric Non-Line-of-Sight Identification Sinan Gezici, Hisashi Kobayashi and H. Vincent Poor Departent of Electrical Engineering School of Engineering and Applied Science Princeton University, Princeton,
More informationA Note on the Applied Use of MDL Approximations
A Note on the Applied Use of MDL Approxiations Daniel J. Navarro Departent of Psychology Ohio State University Abstract An applied proble is discussed in which two nested psychological odels of retention
More informationCS Lecture 13. More Maximum Likelihood
CS 6347 Lecture 13 More Maxiu Likelihood Recap Last tie: Introduction to axiu likelihood estiation MLE for Bayesian networks Optial CPTs correspond to epirical counts Today: MLE for CRFs 2 Maxiu Likelihood
More informationA Simple Regression Problem
A Siple Regression Proble R. M. Castro March 23, 2 In this brief note a siple regression proble will be introduced, illustrating clearly the bias-variance tradeoff. Let Y i f(x i ) + W i, i,..., n, where
More informationEnsemble Based on Data Envelopment Analysis
Enseble Based on Data Envelopent Analysis So Young Sohn & Hong Choi Departent of Coputer Science & Industrial Systes Engineering, Yonsei University, Seoul, Korea Tel) 82-2-223-404, Fax) 82-2- 364-7807
More informationarxiv: v1 [stat.ot] 7 Jul 2010
Hotelling s test for highly correlated data P. Bubeliny e-ail: bubeliny@karlin.ff.cuni.cz Charles University, Faculty of Matheatics and Physics, KPMS, Sokolovska 83, Prague, Czech Republic, 8675. arxiv:007.094v
More informationKernel Methods and Support Vector Machines
Intelligent Systes: Reasoning and Recognition Jaes L. Crowley ENSIAG 2 / osig 1 Second Seester 2012/2013 Lesson 20 2 ay 2013 Kernel ethods and Support Vector achines Contents Kernel Functions...2 Quadratic
More informationSupport Vector Machine Classification of Uncertain and Imbalanced data using Robust Optimization
Recent Researches in Coputer Science Support Vector Machine Classification of Uncertain and Ibalanced data using Robust Optiization RAGHAV PAT, THEODORE B. TRAFALIS, KASH BARKER School of Industrial Engineering
More informationBlock designs and statistics
Bloc designs and statistics Notes for Math 447 May 3, 2011 The ain paraeters of a bloc design are nuber of varieties v, bloc size, nuber of blocs b. A design is built on a set of v eleents. Each eleent
More informationBayes Decision Rule and Naïve Bayes Classifier
Bayes Decision Rule and Naïve Bayes Classifier Le Song Machine Learning I CSE 6740, Fall 2013 Gaussian Mixture odel A density odel p(x) ay be ulti-odal: odel it as a ixture of uni-odal distributions (e.g.
More informationInspection; structural health monitoring; reliability; Bayesian analysis; updating; decision analysis; value of information
Cite as: Straub D. (2014). Value of inforation analysis with structural reliability ethods. Structural Safety, 49: 75-86. Value of Inforation Analysis with Structural Reliability Methods Daniel Straub
More informationTesting equality of variances for multiple univariate normal populations
University of Wollongong Research Online Centre for Statistical & Survey Methodology Working Paper Series Faculty of Engineering and Inforation Sciences 0 esting equality of variances for ultiple univariate
More informationE. Alpaydın AERFAISS
E. Alpaydın AERFAISS 00 Introduction Questions: Is the error rate of y classifier less than %? Is k-nn ore accurate than MLP? Does having PCA before iprove accuracy? Which kernel leads to highest accuracy
More informationA Simplified Analytical Approach for Efficiency Evaluation of the Weaving Machines with Automatic Filling Repair
Proceedings of the 6th SEAS International Conference on Siulation, Modelling and Optiization, Lisbon, Portugal, Septeber -4, 006 0 A Siplified Analytical Approach for Efficiency Evaluation of the eaving
More informationare equal to zero, where, q = p 1. For each gene j, the pairwise null and alternative hypotheses are,
Page of 8 Suppleentary Materials: A ultiple testing procedure for ulti-diensional pairwise coparisons with application to gene expression studies Anjana Grandhi, Wenge Guo, Shyaal D. Peddada S Notations
More informationPAC-Bayes Analysis Of Maximum Entropy Learning
PAC-Bayes Analysis Of Maxiu Entropy Learning John Shawe-Taylor and David R. Hardoon Centre for Coputational Statistics and Machine Learning Departent of Coputer Science University College London, UK, WC1E
More informationBiostatistics Department Technical Report
Biostatistics Departent Technical Report BST006-00 Estiation of Prevalence by Pool Screening With Equal Sized Pools and a egative Binoial Sapling Model Charles R. Katholi, Ph.D. Eeritus Professor Departent
More informationAn Improved Particle Filter with Applications in Ballistic Target Tracking
Sensors & ransducers Vol. 72 Issue 6 June 204 pp. 96-20 Sensors & ransducers 204 by IFSA Publishing S. L. http://www.sensorsportal.co An Iproved Particle Filter with Applications in Ballistic arget racing
More informationUsing EM To Estimate A Probablity Density With A Mixture Of Gaussians
Using EM To Estiate A Probablity Density With A Mixture Of Gaussians Aaron A. D Souza adsouza@usc.edu Introduction The proble we are trying to address in this note is siple. Given a set of data points
More informationTracking using CONDENSATION: Conditional Density Propagation
Tracking using CONDENSATION: Conditional Density Propagation Goal Model-based visual tracking in dense clutter at near video frae rates M. Isard and A. Blake, CONDENSATION Conditional density propagation
More information1 Bounding the Margin
COS 511: Theoretical Machine Learning Lecturer: Rob Schapire Lecture #12 Scribe: Jian Min Si March 14, 2013 1 Bounding the Margin We are continuing the proof of a bound on the generalization error of AdaBoost
More informationFeature Extraction Techniques
Feature Extraction Techniques Unsupervised Learning II Feature Extraction Unsupervised ethods can also be used to find features which can be useful for categorization. There are unsupervised ethods that
More informationExperimental Design For Model Discrimination And Precise Parameter Estimation In WDS Analysis
City University of New York (CUNY) CUNY Acadeic Works International Conference on Hydroinforatics 8-1-2014 Experiental Design For Model Discriination And Precise Paraeter Estiation In WDS Analysis Giovanna
More informationBayesian Approach for Fatigue Life Prediction from Field Inspection
Bayesian Approach for Fatigue Life Prediction fro Field Inspection Dawn An and Jooho Choi School of Aerospace & Mechanical Engineering, Korea Aerospace University, Goyang, Seoul, Korea Srira Pattabhiraan
More informationDERIVING PROPER UNIFORM PRIORS FOR REGRESSION COEFFICIENTS
DERIVING PROPER UNIFORM PRIORS FOR REGRESSION COEFFICIENTS N. van Erp and P. van Gelder Structural Hydraulic and Probabilistic Design, TU Delft Delft, The Netherlands Abstract. In probles of odel coparison
More informationE0 370 Statistical Learning Theory Lecture 6 (Aug 30, 2011) Margin Analysis
E0 370 tatistical Learning Theory Lecture 6 (Aug 30, 20) Margin Analysis Lecturer: hivani Agarwal cribe: Narasihan R Introduction In the last few lectures we have seen how to obtain high confidence bounds
More informationUnderstanding Machine Learning Solution Manual
Understanding Machine Learning Solution Manual Written by Alon Gonen Edited by Dana Rubinstein Noveber 17, 2014 2 Gentle Start 1. Given S = ((x i, y i )), define the ultivariate polynoial p S (x) = i []:y
More informationThe proofs of Theorem 1-3 are along the lines of Wied and Galeano (2013).
A Appendix: Proofs The proofs of Theore 1-3 are along the lines of Wied and Galeano (2013) Proof of Theore 1 Let D[d 1, d 2 ] be the space of càdlàg functions on the interval [d 1, d 2 ] equipped with
More informationBootstrapping Dependent Data
Bootstrapping Dependent Data One of the key issues confronting bootstrap resapling approxiations is how to deal with dependent data. Consider a sequence fx t g n t= of dependent rando variables. Clearly
More informationGeneralized Augmentation for Control of the k-familywise Error Rate
International Journal of Statistics in Medical Research, 2012, 1, 113-119 113 Generalized Augentation for Control of the k-failywise Error Rate Alessio Farcoeni* Departent of Public Health and Infectious
More informationModel Fitting. CURM Background Material, Fall 2014 Dr. Doreen De Leon
Model Fitting CURM Background Material, Fall 014 Dr. Doreen De Leon 1 Introduction Given a set of data points, we often want to fit a selected odel or type to the data (e.g., we suspect an exponential
More informationProbabilistic Machine Learning
Probabilistic Machine Learning by Prof. Seungchul Lee isystes Design Lab http://isystes.unist.ac.kr/ UNIST Table of Contents I.. Probabilistic Linear Regression I... Maxiu Likelihood Solution II... Maxiu-a-Posteriori
More informationEstimation of the Mean of the Exponential Distribution Using Maximum Ranked Set Sampling with Unequal Samples
Open Journal of Statistics, 4, 4, 64-649 Published Online Septeber 4 in SciRes http//wwwscirporg/ournal/os http//ddoiorg/436/os4486 Estiation of the Mean of the Eponential Distribution Using Maiu Ranked
More informationPattern Recognition and Machine Learning. Artificial Neural networks
Pattern Recognition and Machine Learning Jaes L. Crowley ENSIMAG 3 - MMIS Fall Seester 2016 Lessons 7 14 Dec 2016 Outline Artificial Neural networks Notation...2 1. Introduction...3... 3 The Artificial
More informationCourse Notes for EE227C (Spring 2018): Convex Optimization and Approximation
Course Notes for EE227C (Spring 2018): Convex Optiization and Approxiation Instructor: Moritz Hardt Eail: hardt+ee227c@berkeley.edu Graduate Instructor: Max Sichowitz Eail: sichow+ee227c@berkeley.edu October
More informationMultivariate Methods. Matlab Example. Principal Components Analysis -- PCA
Multivariate Methos Xiaoun Qi Principal Coponents Analysis -- PCA he PCA etho generates a new set of variables, calle principal coponents Each principal coponent is a linear cobination of the original
More informationProc. of the IEEE/OES Seventh Working Conference on Current Measurement Technology UNCERTAINTIES IN SEASONDE CURRENT VELOCITIES
Proc. of the IEEE/OES Seventh Working Conference on Current Measureent Technology UNCERTAINTIES IN SEASONDE CURRENT VELOCITIES Belinda Lipa Codar Ocean Sensors 15 La Sandra Way, Portola Valley, CA 98 blipa@pogo.co
More informationESTIMATING AND FORMING CONFIDENCE INTERVALS FOR EXTREMA OF RANDOM POLYNOMIALS. A Thesis. Presented to. The Faculty of the Department of Mathematics
ESTIMATING AND FORMING CONFIDENCE INTERVALS FOR EXTREMA OF RANDOM POLYNOMIALS A Thesis Presented to The Faculty of the Departent of Matheatics San Jose State University In Partial Fulfillent of the Requireents
More informationThe Distribution of the Covariance Matrix for a Subset of Elliptical Distributions with Extension to Two Kurtosis Parameters
journal of ultivariate analysis 58, 96106 (1996) article no. 0041 The Distribution of the Covariance Matrix for a Subset of Elliptical Distributions with Extension to Two Kurtosis Paraeters H. S. Steyn
More informationTEST OF HOMOGENEITY OF PARALLEL SAMPLES FROM LOGNORMAL POPULATIONS WITH UNEQUAL VARIANCES
TEST OF HOMOGENEITY OF PARALLEL SAMPLES FROM LOGNORMAL POPULATIONS WITH UNEQUAL VARIANCES S. E. Ahed, R. J. Tokins and A. I. Volodin Departent of Matheatics and Statistics University of Regina Regina,
More informationA MESHSIZE BOOSTING ALGORITHM IN KERNEL DENSITY ESTIMATION
A eshsize boosting algorith in kernel density estiation A MESHSIZE BOOSTING ALGORITHM IN KERNEL DENSITY ESTIMATION C.C. Ishiekwene, S.M. Ogbonwan and J.E. Osewenkhae Departent of Matheatics, University
More informationIntroduction to Machine Learning. Recitation 11
Introduction to Machine Learning Lecturer: Regev Schweiger Recitation Fall Seester Scribe: Regev Schweiger. Kernel Ridge Regression We now take on the task of kernel-izing ridge regression. Let x,...,
More informationSmoothing Framework for Automatic Track Initiation in Clutter
Soothing Fraework for Autoatic Track Initiation in Clutter Rajib Chakravorty Networked Sensor Technology (NeST) Laboratory Faculty of Engineering University of Technology, Sydney Broadway -27,Sydney, NSW
More informationInference in the Presence of Likelihood Monotonicity for Polytomous and Logistic Regression
Advances in Pure Matheatics, 206, 6, 33-34 Published Online April 206 in SciRes. http://www.scirp.org/journal/ap http://dx.doi.org/0.4236/ap.206.65024 Inference in the Presence of Likelihood Monotonicity
More informationBayesian Learning. Chapter 6: Bayesian Learning. Bayes Theorem. Roles for Bayesian Methods. CS 536: Machine Learning Littman (Wu, TA)
Bayesian Learning Chapter 6: Bayesian Learning CS 536: Machine Learning Littan (Wu, TA) [Read Ch. 6, except 6.3] [Suggested exercises: 6.1, 6.2, 6.6] Bayes Theore MAP, ML hypotheses MAP learners Miniu
More informationStatistical Logic Cell Delay Analysis Using a Current-based Model
Statistical Logic Cell Delay Analysis Using a Current-based Model Hanif Fatei Shahin Nazarian Massoud Pedra Dept. of EE-Systes, University of Southern California, Los Angeles, CA 90089 {fatei, shahin,
More information1 Proof of learning bounds
COS 511: Theoretical Machine Learning Lecturer: Rob Schapire Lecture #4 Scribe: Akshay Mittal February 13, 2013 1 Proof of learning bounds For intuition of the following theore, suppose there exists a
More informationA method to determine relative stroke detection efficiencies from multiplicity distributions
A ethod to deterine relative stroke detection eiciencies ro ultiplicity distributions Schulz W. and Cuins K. 2. Austrian Lightning Detection and Inoration Syste (ALDIS), Kahlenberger Str.2A, 90 Vienna,
More informationRecovering Data from Underdetermined Quadratic Measurements (CS 229a Project: Final Writeup)
Recovering Data fro Underdeterined Quadratic Measureents (CS 229a Project: Final Writeup) Mahdi Soltanolkotabi Deceber 16, 2011 1 Introduction Data that arises fro engineering applications often contains
More informationFoundations of Machine Learning Boosting. Mehryar Mohri Courant Institute and Google Research
Foundations of Machine Learning Boosting Mehryar Mohri Courant Institute and Google Research ohri@cis.nyu.edu Weak Learning Definition: concept class C is weakly PAC-learnable if there exists a (weak)
More informationJ11.3 STOCHASTIC EVENT RECONSTRUCTION OF ATMOSPHERIC CONTAMINANT DISPERSION
J11.3 STOCHASTIC EVENT RECONSTRUCTION OF ATMOSPHERIC CONTAMINANT DISPERSION Inanc Senocak 1*, Nicolas W. Hengartner, Margaret B. Short 3, and Brent W. Daniel 1 Boise State University, Boise, ID, Los Alaos
More informationDISSIMILARITY MEASURES FOR ICA-BASED SOURCE NUMBER ESTIMATION. Seungchul Lee 2 2. University of Michigan. Ann Arbor, MI, USA.
Proceedings of the ASME International Manufacturing Science and Engineering Conference MSEC June -8,, Notre Dae, Indiana, USA MSEC-7 DISSIMILARIY MEASURES FOR ICA-BASED SOURCE NUMBER ESIMAION Wei Cheng,
More informationA Smoothed Boosting Algorithm Using Probabilistic Output Codes
A Soothed Boosting Algorith Using Probabilistic Output Codes Rong Jin rongjin@cse.su.edu Dept. of Coputer Science and Engineering, Michigan State University, MI 48824, USA Jian Zhang jian.zhang@cs.cu.edu
More informationSupport recovery in compressed sensing: An estimation theoretic approach
Support recovery in copressed sensing: An estiation theoretic approach Ain Karbasi, Ali Horati, Soheil Mohajer, Martin Vetterli School of Coputer and Counication Sciences École Polytechnique Fédérale de
More informationCOS 424: Interacting with Data. Written Exercises
COS 424: Interacting with Data Hoework #4 Spring 2007 Regression Due: Wednesday, April 18 Written Exercises See the course website for iportant inforation about collaboration and late policies, as well
More informationA remark on a success rate model for DPA and CPA
A reark on a success rate odel for DPA and CPA A. Wieers, BSI Version 0.5 andreas.wieers@bsi.bund.de Septeber 5, 2018 Abstract The success rate is the ost coon evaluation etric for easuring the perforance
More informationSupport Vector Machines. Maximizing the Margin
Support Vector Machines Support vector achines (SVMs) learn a hypothesis: h(x) = b + Σ i= y i α i k(x, x i ) (x, y ),..., (x, y ) are the training exs., y i {, } b is the bias weight. α,..., α are the
More informationSPECTRUM sensing is a core concept of cognitive radio
World Acadey of Science, Engineering and Technology International Journal of Electronics and Counication Engineering Vol:6, o:2, 202 Efficient Detection Using Sequential Probability Ratio Test in Mobile
More informationThis model assumes that the probability of a gap has size i is proportional to 1/i. i.e., i log m e. j=1. E[gap size] = i P r(i) = N f t.
CS 493: Algoriths for Massive Data Sets Feb 2, 2002 Local Models, Bloo Filter Scribe: Qin Lv Local Models In global odels, every inverted file entry is copressed with the sae odel. This work wells when
More informationW-BASED VS LATENT VARIABLES SPATIAL AUTOREGRESSIVE MODELS: EVIDENCE FROM MONTE CARLO SIMULATIONS
W-BASED VS LATENT VARIABLES SPATIAL AUTOREGRESSIVE MODELS: EVIDENCE FROM MONTE CARLO SIMULATIONS. Introduction When it coes to applying econoetric odels to analyze georeferenced data, researchers are well
More informationAN OPTIMAL SHRINKAGE FACTOR IN PREDICTION OF ORDERED RANDOM EFFECTS
Statistica Sinica 6 016, 1709-178 doi:http://dx.doi.org/10.5705/ss.0014.0034 AN OPTIMAL SHRINKAGE FACTOR IN PREDICTION OF ORDERED RANDOM EFFECTS Nilabja Guha 1, Anindya Roy, Yaakov Malinovsky and Gauri
More informationGrafting: Fast, Incremental Feature Selection by Gradient Descent in Function Space
Journal of Machine Learning Research 3 (2003) 1333-1356 Subitted 5/02; Published 3/03 Grafting: Fast, Increental Feature Selection by Gradient Descent in Function Space Sion Perkins Space and Reote Sensing
More informationSharp Time Data Tradeoffs for Linear Inverse Problems
Sharp Tie Data Tradeoffs for Linear Inverse Probles Saet Oyak Benjain Recht Mahdi Soltanolkotabi January 016 Abstract In this paper we characterize sharp tie-data tradeoffs for optiization probles used
More informationPaul M. Goggans Department of Electrical Engineering, University of Mississippi, Anderson Hall, University, Mississippi 38677
Evaluation of decay ties in coupled spaces: Bayesian decay odel selection a),b) Ning Xiang c) National Center for Physical Acoustics and Departent of Electrical Engineering, University of Mississippi,
More informationCorrelated Bayesian Model Fusion: Efficient Performance Modeling of Large-Scale Tunable Analog/RF Integrated Circuits
Correlated Bayesian odel Fusion: Efficient Perforance odeling of Large-Scale unable Analog/RF Integrated Circuits Fa Wang and Xin Li ECE Departent, Carnegie ellon University, Pittsburgh, PA 53 {fwang,
More informationWarning System of Dangerous Chemical Gas in Factory Based on Wireless Sensor Network
565 A publication of CHEMICAL ENGINEERING TRANSACTIONS VOL. 59, 07 Guest Editors: Zhuo Yang, Junie Ba, Jing Pan Copyright 07, AIDIC Servizi S.r.l. ISBN 978-88-95608-49-5; ISSN 83-96 The Italian Association
More informationMeasuring Temperature with a Silicon Diode
Measuring Teperature with a Silicon Diode Due to the high sensitivity, nearly linear response, and easy availability, we will use a 1N4148 diode for the teperature transducer in our easureents 10 Analysis
More informationTesting the lag length of vector autoregressive models: A power comparison between portmanteau and Lagrange multiplier tests
Working Papers 2017-03 Testing the lag length of vector autoregressive odels: A power coparison between portanteau and Lagrange ultiplier tests Raja Ben Hajria National Engineering School, University of
More informationLower Bounds for Quantized Matrix Completion
Lower Bounds for Quantized Matrix Copletion Mary Wootters and Yaniv Plan Departent of Matheatics University of Michigan Ann Arbor, MI Eail: wootters, yplan}@uich.edu Mark A. Davenport School of Elec. &
More informationUsing a De-Convolution Window for Operating Modal Analysis
Using a De-Convolution Window for Operating Modal Analysis Brian Schwarz Vibrant Technology, Inc. Scotts Valley, CA Mark Richardson Vibrant Technology, Inc. Scotts Valley, CA Abstract Operating Modal Analysis
More informationMODELLING OF DYNAMIC MEASUREMENTS FOR UNCERTAINTY ANALYSIS BY MEANS OF DISCRETIZED STATE-SPACE FORMS
XIX IMEKO World Congress Fundaental and Applied Metrology Septeber 6, 2009, Lisbon, Portugal MODELLING OF DYNAMIC MEASUREMENTS FOR UNCERTAINTY ANALYSIS BY MEANS OF DISCRETIZED STATE-SPACE FORMS Klaus Dieter
More informationSequence Analysis, WS 14/15, D. Huson & R. Neher (this part by D. Huson) February 5,
Sequence Analysis, WS 14/15, D. Huson & R. Neher (this part by D. Huson) February 5, 2015 31 11 Motif Finding Sources for this section: Rouchka, 1997, A Brief Overview of Gibbs Sapling. J. Buhler, M. Topa:
More informationThis article appeared in a journal published by Elsevier. The attached copy is furnished to the author for internal non-commercial research and
This article appeared in a ournal published by Elsevier. The attached copy is furnished to the author for internal non-coercial research and education use, including for instruction at the authors institution
More informationA general forulation of the cross-nested logit odel Michel Bierlaire, Dpt of Matheatics, EPFL, Lausanne Phone: Fax:
A general forulation of the cross-nested logit odel Michel Bierlaire, EPFL Conference paper STRC 2001 Session: Choices A general forulation of the cross-nested logit odel Michel Bierlaire, Dpt of Matheatics,
More informationLecture 21. Interior Point Methods Setup and Algorithm
Lecture 21 Interior Point Methods In 1984, Kararkar introduced a new weakly polynoial tie algorith for solving LPs [Kar84a], [Kar84b]. His algorith was theoretically faster than the ellipsoid ethod and
More informationSpine Fin Efficiency A Three Sided Pyramidal Fin of Equilateral Triangular Cross-Sectional Area
Proceedings of the 006 WSEAS/IASME International Conference on Heat and Mass Transfer, Miai, Florida, USA, January 18-0, 006 (pp13-18) Spine Fin Efficiency A Three Sided Pyraidal Fin of Equilateral Triangular
More informationPORTMANTEAU TESTS FOR ARMA MODELS WITH INFINITE VARIANCE
doi:10.1111/j.1467-9892.2007.00572.x PORTMANTEAU TESTS FOR ARMA MODELS WITH INFINITE VARIANCE By J.-W. Lin and A. I. McLeod The University of Western Ontario First Version received Septeber 2006 Abstract.
More informationStochastic Subgradient Methods
Stochastic Subgradient Methods Lingjie Weng Yutian Chen Bren School of Inforation and Coputer Science University of California, Irvine {wengl, yutianc}@ics.uci.edu Abstract Stochastic subgradient ethods
More informationarxiv: v1 [cs.lg] 8 Jan 2019
Data Masking with Privacy Guarantees Anh T. Pha Oregon State University phatheanhbka@gail.co Shalini Ghosh Sasung Research shalini.ghosh@gail.co Vinod Yegneswaran SRI international vinod@csl.sri.co arxiv:90.085v
More informationZISC Neural Network Base Indicator for Classification Complexity Estimation
ZISC Neural Network Base Indicator for Classification Coplexity Estiation Ivan Budnyk, Abdennasser Сhebira and Kurosh Madani Iages, Signals and Intelligent Systes Laboratory (LISSI / EA 3956) PARIS XII
More informationMultiscale Entropy Analysis: A New Method to Detect Determinism in a Time. Series. A. Sarkar and P. Barat. Variable Energy Cyclotron Centre
Multiscale Entropy Analysis: A New Method to Detect Deterinis in a Tie Series A. Sarkar and P. Barat Variable Energy Cyclotron Centre /AF Bidhan Nagar, Kolkata 700064, India PACS nubers: 05.45.Tp, 89.75.-k,
More informationIdentical Maximum Likelihood State Estimation Based on Incremental Finite Mixture Model in PHD Filter
Identical Maxiu Lielihood State Estiation Based on Increental Finite Mixture Model in PHD Filter Gang Wu Eail: xjtuwugang@gail.co Jing Liu Eail: elelj20080730@ail.xjtu.edu.cn Chongzhao Han Eail: czhan@ail.xjtu.edu.cn
More informationA Poisson process reparameterisation for Bayesian inference for extremes
A Poisson process reparaeterisation for Bayesian inference for extrees Paul Sharkey and Jonathan A. Tawn Eail: p.sharkey1@lancs.ac.uk, j.tawn@lancs.ac.uk STOR-i Centre for Doctoral Training, Departent
More informationMSEC MODELING OF DEGRADATION PROCESSES TO OBTAIN AN OPTIMAL SOLUTION FOR MAINTENANCE AND PERFORMANCE
Proceeding of the ASME 9 International Manufacturing Science and Engineering Conference MSEC9 October 4-7, 9, West Lafayette, Indiana, USA MSEC9-8466 MODELING OF DEGRADATION PROCESSES TO OBTAIN AN OPTIMAL
More informationA note on the multiplication of sparse matrices
Cent. Eur. J. Cop. Sci. 41) 2014 1-11 DOI: 10.2478/s13537-014-0201-x Central European Journal of Coputer Science A note on the ultiplication of sparse atrices Research Article Keivan Borna 12, Sohrab Aboozarkhani
More informationSupport Vector Machines MIT Course Notes Cynthia Rudin
Support Vector Machines MIT 5.097 Course Notes Cynthia Rudin Credit: Ng, Hastie, Tibshirani, Friedan Thanks: Şeyda Ertekin Let s start with soe intuition about argins. The argin of an exaple x i = distance
More informationQuantum algorithms (CO 781, Winter 2008) Prof. Andrew Childs, University of Waterloo LECTURE 15: Unstructured search and spatial search
Quantu algoriths (CO 781, Winter 2008) Prof Andrew Childs, University of Waterloo LECTURE 15: Unstructured search and spatial search ow we begin to discuss applications of quantu walks to search algoriths
More informationState Estimation Problem for the Action Potential Modeling in Purkinje Fibers
APCOM & ISCM -4 th Deceber, 203, Singapore State Estiation Proble for the Action Potential Modeling in Purinje Fibers *D. C. Estuano¹, H. R. B.Orlande and M. J.Colaço Federal University of Rio de Janeiro
More informationIN modern society that various systems have become more
Developent of Reliability Function in -Coponent Standby Redundant Syste with Priority Based on Maxiu Entropy Principle Ryosuke Hirata, Ikuo Arizono, Ryosuke Toohiro, Satoshi Oigawa, and Yasuhiko Takeoto
More informationIntelligent Systems: Reasoning and Recognition. Artificial Neural Networks
Intelligent Systes: Reasoning and Recognition Jaes L. Crowley MOSIG M1 Winter Seester 2018 Lesson 7 1 March 2018 Outline Artificial Neural Networks Notation...2 Introduction...3 Key Equations... 3 Artificial
More information