Information-based Feature Selection
Farzan Farnia, Abbas Kazerouni, Afshin Babveyh

1 Introduction

Feature selection is a topic of great interest in applications dealing with high-dimensional datasets. These applications include gene expression array analysis, combinatorial chemistry and text processing of online documents. Using feature selection brings several advantages. First, it leads to lower computational cost and time: less memory is needed to store the data and less processing power is required. Feature selection also helps improve the performance of predictors by avoiding overfitting, and it can capture the underlying connections in the data. Perhaps most importantly, it can break through the barrier of high dimensionality.

To select the most relevant subset of features, we need a mathematical tool to measure dependence among random variables. In this work, we use the concept of mutual information, a well-known dependence measure in information theory. For any pair of discrete random variables X and Y, the mutual information is defined as

I(X; Y) = \sum_{x \in \mathcal{X},\, y \in \mathcal{Y}} p_{X,Y}(x, y) \log \frac{p_{X,Y}(x, y)}{p_X(x)\, p_Y(y)}.  (1)

The paper is organized as follows. In Section 2, the Maximum-Relevance Minimum-Redundancy (MRMR) method is presented along with the Maximum Joint Relevance (MJR) method. In Section 3, we present our method for solving the feature selection problem. Section 4 presents the results of our algorithm tested on the Madelon dataset. Finally, Section 5 concludes.

2 Mutual Information as a Tool for Feature Selection

As discussed earlier, mutual information is a powerful tool for measuring relevance among random variables. Hence, it can serve as a useful mathematical tool for finding and selecting relevant features. In other words, if our goal is to select no more than k features, an optimal approach is to solve

\arg\max_{|S| = k} I(X_S; Y),  (2)

where X_S = \{X_i : i \in S\}. However, as k gets larger, our estimate of the mutual information becomes less accurate.
This is because for large k we do not have enough samples to estimate the mutual information accurately. Hence, the objective function in (2) should be modified so that it becomes estimable from the available samples. In the next sections, we first discuss a past approach to this issue and then propose a new solution that improves upon such approaches.
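To make the plug-in estimation concrete, the empirical estimate of the mutual information in (1) from paired samples of two discrete variables can be sketched as follows (an illustrative sketch of ours, not code from the paper; the function name is hypothetical):

```python
import numpy as np
from collections import Counter

def plugin_mutual_information(x, y):
    """Plug-in estimate of I(X; Y) in (1): replace the true joint and
    marginal probabilities by their empirical frequencies."""
    n = len(x)
    joint = Counter(zip(x, y))   # empirical joint counts, n * p_{X,Y}(x, y)
    px = Counter(x)              # empirical marginal counts, n * p_X(x)
    py = Counter(y)              # empirical marginal counts, n * p_Y(y)
    mi = 0.0
    for (a, b), c in joint.items():
        # (c/n) * log( (c/n) / ((px[a]/n) * (py[b]/n)) )
        mi += (c / n) * np.log(c * n / (px[a] * py[b]))
    return mi
```

For instance, two identical binary sample vectors give an estimate of log 2, while independent ones give 0. As the text notes, this estimator degrades quickly once the number of cells in the joint alphabet approaches the sample size.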
2.1 Max-Relevance Min-Redundancy (MRMR)

As mentioned earlier, we aim to identify the most relevant subset of features whose size is limited to a given budget. Note that this is not the same as picking the k features with the largest individual mutual information with the target Y. In fact, different features may share redundant information about the target. Thus, redundancy is another important factor to be considered in feature selection. To balance the trade-off between relevance and redundancy, the following modified objective function (MRMR) has been suggested in [2]:

\Phi(X_S, Y) = \frac{1}{|S|} \sum_{i \in S} I(X_i; Y) - \frac{1}{|S|^2} \sum_{i,j \in S} I(X_i; X_j).  (3)

Here, the first term measures the average relevance of the features to the target, while the second term measures the average pairwise redundancy among the selected features. Therefore, maximizing \Phi(X_S, Y) identifies a well-characterizing feature subset whose total information about the target is close to that of the optimal feature subset. To maximize this objective, the authors use an incremental approach: the most informative feature is chosen first, and subsequent features are added one at a time by solving, at step m,

\arg\max_{X_j \in X \setminus S_{m-1}} \left[ I(X_j; Y) - \frac{1}{m-1} \sum_{X_i \in S_{m-1}} I(X_j; X_i) \right].  (4)

2.2 Maximum Joint Relevance

Although MRMR is a well-known feature selection method, there are several applications where its test error rate never drops below a large threshold such as 34%, which is quite unsatisfactory. Note that (3) includes only up to pairwise interactions. By considering higher-order interactions, we can select a more informative feature subset, which in turn results in smaller error rates. To this end, the Maximum Joint Relevance (MJR) algorithm changes the incremental rule of (4) to a more sensitive one [3]:

\arg\max_{X_j \in X \setminus S_{m-1}} \sum_{X_i \in S_{m-1}} I(X_j, X_i; Y).  (5)

Nevertheless, we may again face a lack of samples when estimating the second-order mutual information appearing in this formulation.
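As an illustration of the two incremental rules (our own sketch; the helper and function names are hypothetical), one selection step of MRMR per (4) and of MJR per (5) could look like:

```python
import numpy as np
from collections import Counter

def _mi(a, b):
    """Empirical mutual information between two discrete sample sequences."""
    n = len(a)
    joint, pa, pb = Counter(zip(a, b)), Counter(a), Counter(b)
    return sum((c / n) * np.log(c * n / (pa[u] * pb[v]))
               for (u, v), c in joint.items())

def mrmr_step(X, y, selected, candidates):
    """Rule (4): maximize relevance minus average pairwise redundancy
    with the already-selected features."""
    def score(j):
        redundancy = (np.mean([_mi(X[:, j], X[:, i]) for i in selected])
                      if selected else 0.0)
        return _mi(X[:, j], y) - redundancy
    return max(candidates, key=score)

def mjr_step(X, y, selected, candidates):
    """Rule (5): maximize the summed second-order joint relevance
    I(X_j, X_i; Y) over the already-selected features X_i."""
    def score(j):
        if not selected:
            return _mi(X[:, j], y)   # first step falls back to relevance
        return sum(_mi(list(zip(X[:, j], X[:, i])), y) for i in selected)
    return max(candidates, key=score)
```

On an XOR-type target y = x0 xor x1, the pairwise terms in (4) see no relevance at all, while the joint criterion (5) does; this is exactly the higher-order interaction effect described above.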
In fact, a considerable number of the third-order empirical marginals may become very small, so a more accurate estimate of mutual information than the empirical one is required. Therefore, in the next section we propose a new algorithm that estimates mutual information with higher accuracy. As an important advantage, this estimation technique reduces the sample size required to estimate mutual information to within the same accuracy.

3 Adaptive Maximum Joint Relevance

In this section, we propose the Adaptive Maximum Joint Relevance (AMJR) feature selection algorithm to tackle the instability problem of MJR. As in MJR, we use the criterion in (5) to iteratively select the most relevant features. However, we propose a new scheme for estimating the mutual information terms, which stabilizes the algorithm in the small-training-set regime. We build our estimation technique on the functional estimation method proposed in [4]. Specifically, in order to estimate I(X_j, X_i; Y) at each step, we estimate the joint entropies according to the identity

I(X_j, X_i; Y) = H(X_j, X_i) + H(Y) - H(X_j, X_i, Y).  (6)

To describe the estimation method in AMJR, consider, for example, estimating H(X_j, X_i). Following [4], the empirical joint distribution of (X_j, X_i) is first computed as

\hat{P}_{a,b} = \frac{1}{n} \sum_{t=1}^{n} 1\{(X_j, X_i)^{(t)} = (a, b)\},  (7)

where n is the size of the training set and (X_j, X_i)^{(t)} is the joint value in the t-th training example. Note that a and b are assumed to take values in some finite sets A and B, respectively. Now, letting P_{a,b} denote the true joint probability of (X_j, X_i) at the point (a, b), the true joint entropy is

H(X_j, X_i) = -\sum_{a \in A,\, b \in B} P_{a,b} \log P_{a,b}.  (8)

To obtain an estimator \hat{H}(X_j, X_i) of H(X_j, X_i), one naive way is to substitute each P_{a,b} in (8) with its estimate \hat{P}_{a,b}. This plug-in method, which is used in MJR, is in fact the source of the instability in performance, since most of the estimated probabilities are very small. In AMJR, we instead consider two cases for each estimated joint probability:

- If \hat{P}_{a,b} \geq \frac{\log n}{n}, we use it as an estimate of P_{a,b} in (8).
- If \hat{P}_{a,b} < \frac{\log n}{n}, we first fit a polynomial f of order \log n to the function -x \log x on the interval (0, \frac{\log n}{n}). Then, we use f(\hat{P}_{a,b}) as an estimate of -P_{a,b} \log P_{a,b} in (8).

As we will see in Section 4, the approximating polynomial f introduces stability into the algorithm and improves its performance. Consequently, the estimate of H(X_j, X_i) in AMJR is

\hat{H}(X_j, X_i) = \sum_{\hat{P}_{a,b} \geq \frac{\log n}{n}} \left( -\hat{P}_{a,b} \log \hat{P}_{a,b} \right) + \sum_{\hat{P}_{a,b} < \frac{\log n}{n}} f(\hat{P}_{a,b}).  (9)

The estimates \hat{H}(X_j, X_i, Y) and \hat{H}(Y) of H(X_j, X_i, Y) and H(Y) are obtained similarly. Finally, the mutual information is estimated as

\hat{I}(X_j, X_i; Y) = \hat{H}(X_j, X_i) + \hat{H}(Y) - \hat{H}(X_j, X_i, Y).  (10)

4 Numerical Results

In this section we provide numerical results to confirm our theoretical analysis.
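A minimal sketch of the entropy estimate (9), in our own code: the threshold \log n / n and polynomial order \log n follow the description above, while the fitting grid and plain least-squares fit are our own simplifications rather than the minimax-weighted polynomial construction of [4].

```python
import numpy as np

def amjr_entropy(counts, n):
    """Estimate H from symbol counts per (9): plug-in for cells with large
    empirical mass, polynomial approximation of -x log x for small ones."""
    thr = np.log(n) / n                     # threshold log(n)/n
    deg = max(1, int(np.log(n)))            # polynomial order ~ log(n)
    # Fit f(x) ~ -x log x on (0, thr]; fit in the rescaled variable
    # u = x / thr for numerical stability.
    us = np.linspace(1.0 / 200, 1.0, 200)
    xs = us * thr
    poly = np.poly1d(np.polyfit(us, -xs * np.log(xs), deg))
    h = 0.0
    for c in counts:
        p = c / n
        if p >= thr:
            h += -p * np.log(p)             # plug-in term of (9)
        elif p > 0:
            h += poly(p / thr)              # polynomial term of (9)
    return h
```

When every cell is well sampled this reduces to the plug-in estimate; the polynomial branch only changes the contribution of cells whose empirical mass falls below \log n / n.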
We apply different feature selection and classification methods to the Madelon dataset released in the NIPS 2003 feature selection challenge [5]. This dataset consists of 2000 samples, each containing 500 continuous input features and one binary output response. We use 1400 samples (70%) as the training set and the remaining 600 samples (30%) as the test set. To explore the effect of sample size on the different feature selection methods, we quantize each input feature uniformly into 3 or 5 levels, giving two scenarios. In the first one, the input features are quantized separately into three levels, which corresponds to the large-training-set regime
(since each level occurs many times and there are only a few probabilities to estimate). In the second scenario, the input features are quantized separately into 5 levels. The latter scenario corresponds to a small-training-set regime, where there is a large number of probabilities to estimate.

Figure 1: SVM classification error for 3-level quantization of input space.

Figure 1 compares the misclassification error of the MRMR and MJR feature selection algorithms for different numbers of selected features. Here, SVM is used as the classification method and the input space is quantized into 3 levels. Since this scenario corresponds to the large-training-set regime, MJR outperforms MRMR, as depicted in the figure.

Figure 2: SVM classification error for 5-level quantization of input space.

In Figure 2, the SVM misclassification error of MJR and AMJR is compared for different numbers of selected features. Here, the input space is quantized into 5 levels, which corresponds to the small-training-set scenario. As depicted in the figure, MJR performs unstably in this scenario while AMJR shows stable and better performance. This confirms our theoretical analysis of the instability of MJR and shows that our proposed method (AMJR) removes the instability problem almost completely.

The advantage of the proposed AMJR method is further illustrated in Figure 3. In this figure, the SVM misclassification error of the AMJR and MRMR methods is compared for different numbers of selected features. Here, the input space is quantized into 5 levels (small-training-set regime). As depicted in the figure, AMJR substantially outperforms MRMR for any number of
selected features.

Figure 3: SVM classification error for 5-level quantization of input space.

It is worth mentioning that, in addition to SVM, we have also repeated the above experiments with logistic regression and classification trees, and the same relative results were obtained. Since our focus is on comparing the feature selection algorithms (and not the classification methods), and also due to lack of space, the results for these methods are not provided here.

5 Conclusion

Feature selection is an indispensable part of the solution when dealing with high-dimensional datasets. One powerful tool to address this problem is mutual information. A common approach is to use the Maximum Relevance Minimum Redundancy (MRMR) criterion to solve the feature selection problem. In this paper, based on insights from information theory, a new objective function is used. Also, a novel mutual information estimator is employed, enabling us to discretize the data into finer levels. Combining the novel mutual information estimator with the new objective function, an error rate 3 times lower than that of MRMR is demonstrated.

References

[1] T. Cover and J. Thomas. Elements of Information Theory. John Wiley & Sons.
[2] H. Peng, F. Long, and C. Ding. Feature selection based on mutual information criteria of max-dependency, max-relevance, and min-redundancy. IEEE Transactions on Pattern Analysis and Machine Intelligence, 27(8), 2005.
[3] H. Yang and J. Moody. Data Visualization and Feature Selection: New Algorithms for Nongaussian Data. NIPS.
[4] J. Jiao, K. Venkat, Y. Han, and T. Weissman. Minimax Estimation of Functionals of Discrete Distributions. Available on arXiv.
[5] Available online:
More informationChapter 6 Sampling Distributions
Chapter 6 Samplig Distributios 1 I most experimets, we have more tha oe measuremet for ay give variable, each measuremet beig associated with oe radomly selected a member of a populatio. Hece we eed to
More informationLinear Regression Demystified
Liear Regressio Demystified Liear regressio is a importat subject i statistics. I elemetary statistics courses, formulae related to liear regressio are ofte stated without derivatio. This ote iteds to
More informationMixtures of Gaussians and the EM Algorithm
Mixtures of Gaussias ad the EM Algorithm CSE 6363 Machie Learig Vassilis Athitsos Computer Sciece ad Egieerig Departmet Uiversity of Texas at Arligto 1 Gaussias A popular way to estimate probability desity
More informationLinear Classifiers III
Uiversität Potsdam Istitut für Iformatik Lehrstuhl Maschielles Lere Liear Classifiers III Blaie Nelso, Tobias Scheffer Cotets Classificatio Problem Bayesia Classifier Decisio Liear Classifiers, MAP Models
More informationLecture 6 Chi Square Distribution (χ 2 ) and Least Squares Fitting
Lecture 6 Chi Square Distributio (χ ) ad Least Squares Fittig Chi Square Distributio (χ ) Suppose: We have a set of measuremets {x 1, x, x }. We kow the true value of each x i (x t1, x t, x t ). We would
More informationCSE 527, Additional notes on MLE & EM
CSE 57 Lecture Notes: MLE & EM CSE 57, Additioal otes o MLE & EM Based o earlier otes by C. Grat & M. Narasimha Itroductio Last lecture we bega a examiatio of model based clusterig. This lecture will be
More informationAddition: Property Name Property Description Examples. a+b = b+a. a+(b+c) = (a+b)+c
Notes for March 31 Fields: A field is a set of umbers with two (biary) operatios (usually called additio [+] ad multiplicatio [ ]) such that the followig properties hold: Additio: Name Descriptio Commutativity
More informationMachine Learning Theory Tübingen University, WS 2016/2017 Lecture 3
Machie Learig Theory Tübige Uiversity, WS 06/07 Lecture 3 Tolstikhi Ilya Abstract I this lecture we will prove the VC-boud, which provides a high-probability excess risk boud for the ERM algorithm whe
More informationReliability and Queueing
Copyright 999 Uiversity of Califoria Reliability ad Queueig by David G. Messerschmitt Supplemetary sectio for Uderstadig Networked Applicatios: A First Course, Morga Kaufma, 999. Copyright otice: Permissio
More information1 Review of Probability & Statistics
1 Review of Probability & Statistics a. I a group of 000 people, it has bee reported that there are: 61 smokers 670 over 5 960 people who imbibe (drik alcohol) 86 smokers who imbibe 90 imbibers over 5
More informationFIXED POINTS OF n-valued MULTIMAPS OF THE CIRCLE
FIXED POINTS OF -VALUED MULTIMAPS OF THE CIRCLE Robert F. Brow Departmet of Mathematics Uiversity of Califoria Los Ageles, CA 90095-1555 e-mail: rfb@math.ucla.edu November 15, 2005 Abstract A multifuctio
More informationPRACTICE PROBLEMS FOR THE FINAL
PRACTICE PROBLEMS FOR THE FINAL Math 36Q Fall 25 Professor Hoh Below is a list of practice questios for the Fial Exam. I would suggest also goig over the practice problems ad exams for Exam ad Exam 2 to
More informationThe standard deviation of the mean
Physics 6C Fall 20 The stadard deviatio of the mea These otes provide some clarificatio o the distictio betwee the stadard deviatio ad the stadard deviatio of the mea.. The sample mea ad variace Cosider
More informationFACULTY OF MATHEMATICAL STUDIES MATHEMATICS FOR PART I ENGINEERING. Lectures
FACULTY OF MATHEMATICAL STUDIES MATHEMATICS FOR PART I ENGINEERING Lectures MODULE 5 STATISTICS II. Mea ad stadard error of sample data. Biomial distributio. Normal distributio 4. Samplig 5. Cofidece itervals
More informationSequences and Series of Functions
Chapter 6 Sequeces ad Series of Fuctios 6.1. Covergece of a Sequece of Fuctios Poitwise Covergece. Defiitio 6.1. Let, for each N, fuctio f : A R be defied. If, for each x A, the sequece (f (x)) coverges
More information