MDL-Based Unsupervised Attribute Ranking


1 MDL-Based Unsupervised Attribute Ranking. Zdravko Markov, Computer Science Department, Central Connecticut State University, New Britain, CT 06050, USA

2 MDL-Based Unsupervised Attribute Ranking. Outline: Introduction (Attribute Selection); MDL-Based Clustering Model Evaluation; Illustrative Example ("play tennis" data); Attribute Ranking Algorithm; Hierarchical Clustering Algorithm; Experimental Evaluation; Conclusion

3 Attribute Selection. Supervised / unsupervised: find the smallest set of attributes that maximizes predictive accuracy (supervised) or best uncovers interesting natural groupings (clusters) in data according to the chosen criterion (unsupervised). Subset selection / ranking (weighting): subset selection is computationally expensive, with $2^m$ attribute sets for $m$ attributes; ranking assumes that attributes are independent.

4 Supervised Attribute Selection. Wrapper methods create prediction models and use the predictive accuracy of these models to measure the relevance of attributes to the classification task. Filter methods directly measure the ability of the attributes to determine the class labels using statistical correlation, information metrics, probabilistic or other methods. Numerous methods exist in this setting because model evaluation criteria are widely available in supervised learning.

5 Unsupervised Attribute Selection. Wrapper methods evaluate a subset of attributes by the quality of the clustering obtained by using these attributes. Filter methods explore classical statistical methods for dimensionality reduction, like PCA and maximum variance, and information-based or entropy measures. Very few methods exist in this setting, generally because clustering models are difficult to evaluate.

6 Clustering Model Evaluation. Chapter 4: Evaluating Clustering - MDL-Based Model and Feature Evaluation

7 Clustering Model Evaluation. Consider each possible clustering as a hypothesis $H$ that describes (explains) data $D$ in terms of frequent patterns (regularities). Compute the description length of the data $L(D)$, of the hypothesis $L(H)$, and of the data given the hypothesis $L(D \mid H)$. $L(H)$ and $L(D)$ are the minimum number of bits needed to encode (or communicate) $H$ and $D$ respectively. $L(D \mid H)$ represents the number of bits needed to encode $D$ if we know $H$. If we know the pattern of $H$, there is no need to encode all its occurrences in $D$; rather, we may encode only the pattern itself and the differences that identify each individual instance in $D$.

8 Minimum Description Length (MDL) and Information Compression. The more regularity in $D$, the shorter the description length $L(D \mid H)$. We need to balance $L(D \mid H)$ with $L(H)$, because the latter depends on the complexity of the pattern. Thus the best hypothesis should minimize the sum $L(H) + L(D \mid H)$ (MDL principle) or maximize $L(D) - L(H) - L(D \mid H)$ (information compression).
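The two criteria on this slide select the same hypothesis, because $L(D)$ does not depend on $H$. A short worked statement of that step, using only the symbols defined on the slide ($H^{*}$ is a name introduced here for the selected hypothesis):

```latex
\[
H^{*} \;=\; \arg\min_{H}\,\bigl[\,L(H) + L(D \mid H)\,\bigr]
      \;=\; \arg\max_{H}\,\bigl[\,L(D) - L(H) - L(D \mid H)\,\bigr],
\]
\[
\text{since } L(D) \text{ is a constant for fixed data } D .
\]
```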

9 Encoding MDL. Hypotheses and data are uniformly distributed, and the probability of occurrence of an item out of $n$ alternatives is $1/n$. The minimum code length of the message that a particular item has occurred is $-\log_2 (1/n) = \log_2 n$ bits. The number of bits needed to encode the choice of $k$ items out of $n$ possible items is $\log_2 \binom{n}{k}$.
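The two code lengths on this slide can be checked numerically. The following is a minimal sketch; the function names `bits_for_item` and `bits_for_subset` are illustrative and do not come from the slides.

```python
import math


def bits_for_item(n: int) -> float:
    """Bits needed to single out one of n equally likely alternatives:
    -log2(1/n) = log2(n)."""
    return math.log2(n)


def bits_for_subset(n: int, k: int) -> float:
    """Bits needed to single out one k-element subset of n items:
    log2 of the binomial coefficient C(n, k)."""
    return math.log2(math.comb(n, k))


if __name__ == "__main__":
    print(bits_for_item(8))          # one item out of 8 alternatives
    print(bits_for_subset(10, 8))    # a choice of 8 values out of 10
```

Both functions assume the uniform distribution stated on the slide; `math.comb` requires Python 3.8 or later.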

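10 Encoding MDL (attribute-value). Data $D$; an instance $X \in D$ is a set of $m$ attribute values, $|X| = m$. $T$ is the set of all attribute values in $D$: $T = \bigcup_{X \in D} X$, $k = |T|$. Cluster $C_i$ is defined by the set of all attribute values $T_i \subseteq T$ that occur in its members, so $X \subseteq T_i$ for every $X \in C_i$, with $k_i = |T_i|$. Clustering $H = \{C_1, C_2, \dots, C_n\}$ is defined by $\{T_1, T_2, \dots, T_n\}$.
$$L(C_i) = \log_2 \binom{k}{k_i} + \log_2 n$$
$$L(D_i \mid C_i) = |C_i| \times \log_2 \binom{k_i}{m}$$
$$MDL(C_i) = \log_2 \binom{k}{k_i} + \log_2 n + |C_i| \times \log_2 \binom{k_i}{m}$$
$$L(H) = \sum_{i=1}^{n} L(C_i), \qquad L(D \mid H) = \sum_{i=1}^{n} L(D_i \mid C_i), \qquad MDL(H) = \sum_{i=1}^{n} MDL(C_i)$$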

11 Play Tennis Data
ID  outlook   temp  humidity  windy  play
1   sunny     hot   high      false  no
2   sunny     hot   high      true   no
3   overcast  hot   high      false  yes
4   rainy     mild  high      false  yes
5   rainy     cool  normal    false  yes
6   rainy     cool  normal    true   no
7   overcast  cool  normal    true   yes
8   sunny     mild  high      false  no
9   sunny     cool  normal    false  yes
10  rainy     mild  normal    false  yes
11  sunny     mild  normal    true   yes
12  overcast  mild  high      true   yes
13  overcast  hot   normal    false  yes
14  rainy     mild  high      true   no
$C_1 = \{1, 2, 3, 4, 8, 12, 14\}$ (humidity=high), $C_2 = \{5, 6, 7, 9, 10, 11, 13\}$ (humidity=normal).
$T_1$ = {outlook=sunny, outlook=overcast, outlook=rainy, temp=hot, temp=mild, humidity=high, windy=false, windy=true}.
$T_2$ = {outlook=sunny, outlook=overcast, outlook=rainy, temp=hot, temp=mild, temp=cool, humidity=normal, windy=false, windy=true}.

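12 Clustering Play Tennis Data
$$MDL(C_i) = \log_2 \binom{k}{k_i} + \log_2 n + |C_i| \times \log_2 \binom{k_i}{m}$$
$k_1 = |T_1| = 8$, $k_2 = |T_2| = 9$, $k = 10$, $m = 4$, $n = 2$.
$$MDL(C_1) = \log_2 \binom{10}{8} + \log_2 2 + 7 \times \log_2 \binom{8}{4} \approx 49.40$$
$$MDL(C_2) = \log_2 \binom{10}{9} + \log_2 2 + 7 \times \log_2 \binom{9}{4} \approx 53.16$$
$MDL(\{C_1, C_2\}) = MDL(\text{humidity}) \approx 102.56$ bits.
$MDL(\text{temp}) < MDL(\text{humidity}) < MDL(\text{outlook}) < MDL(\text{windy})$. The best attribute is temp.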
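The ranking on this slide can be reproduced with a short script. This is a minimal sketch, assuming the per-cluster formula $MDL(C_i) = \log_2 \binom{k}{k_i} + \log_2 n + |C_i| \log_2 \binom{k_i}{m}$ and the 14-instance play tennis table with the `play` column left out (which gives $k = 10$, $m = 4$ as on the slide). The names `ATTRS`, `DATA`, `items`, `mdl_cluster` and `mdl_attribute` are illustrative, not taken from the author's implementation.

```python
import math

ATTRS = ["outlook", "temp", "humidity", "windy"]
# Play tennis data without the class column "play".
DATA = [
    ("sunny", "hot", "high", "false"),
    ("sunny", "hot", "high", "true"),
    ("overcast", "hot", "high", "false"),
    ("rainy", "mild", "high", "false"),
    ("rainy", "cool", "normal", "false"),
    ("rainy", "cool", "normal", "true"),
    ("overcast", "cool", "normal", "true"),
    ("sunny", "mild", "high", "false"),
    ("sunny", "cool", "normal", "false"),
    ("rainy", "mild", "normal", "false"),
    ("sunny", "mild", "normal", "true"),
    ("overcast", "mild", "high", "true"),
    ("overcast", "hot", "normal", "false"),
    ("rainy", "mild", "high", "true"),
]


def items(row):
    """Represent an instance X as a set of (attribute index, value) pairs."""
    return {(i, v) for i, v in enumerate(row)}


def mdl_cluster(rows, k, m, n):
    """MDL(C_i) = log2 C(k, k_i) + log2 n + |C_i| * log2 C(k_i, m)."""
    k_i = len(set().union(*(items(r) for r in rows)))  # k_i = |T_i|
    return (math.log2(math.comb(k, k_i)) + math.log2(n)
            + len(rows) * math.log2(math.comb(k_i, m)))


def mdl_attribute(rows, a):
    """MDL of the clustering obtained by splitting the data on attribute index a."""
    m = len(rows[0])
    k = len(set().union(*(items(r) for r in rows)))     # k = |T|
    clusters = {}
    for r in rows:
        clusters.setdefault(r[a], []).append(r)
    n = len(clusters)
    return sum(mdl_cluster(c, k, m, n) for c in clusters.values())


scores = {name: mdl_attribute(DATA, i) for i, name in enumerate(ATTRS)}
ranking = sorted(scores, key=scores.get)  # smallest MDL first

if __name__ == "__main__":
    for name in ranking:
        print(f"{name:9s} {scores[name]:.2f}")
```

Under these assumptions the script ranks temp first and windy last, in agreement with the slide's conclusion that temp is the best attribute.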

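13 MDL Ranker. Let attribute $A$ have values $v_1, v_2, \dots, v_p$, inducing the clustering $\{C_1, C_2, \dots, C_p\}$, where $C_i = \{X \mid v_i \in X\}$. For each cluster $C_i$ and each attribute $A_j$, let the value set $V_i^{A_j} = \emptyset$. For each data instance $X = \{x_1, x_2, \dots, x_m\}$: for each attribute $A$: for each value $x_j \in X$: $V_i^{A_j} = V_i^{A_j} \cup \{x_j\}$. Then $k_i = \sum_{j=1}^{m} |V_i^{A_j}|$. Compute $MDL(\{C_1, C_2, \dots, C_p\})$. Incremental (no need to store instances). Time $O(nm^2)$, where $n$ is the number of data instances. Space $O(pm^2)$, where $p$ is the maximum number of attribute values. Evaluates 3204 instances with 395 attributes (trec data) in 3 minutes.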
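The incremental idea on this slide (one pass over the data, keeping only value sets and cluster sizes) might be sketched as follows. This is an assumption-laden illustration, not the author's Java implementation: the function name `mdl_rank`, the tuple encoding of instances, and the dictionaries `cooccur` and `size` are all hypothetical, and the score uses the per-cluster formula $\log_2 \binom{k}{k_i} + \log_2 n + |C_i| \log_2 \binom{k_i}{m}$.

```python
import math
from collections import defaultdict


def mdl_rank(instances):
    """Rank attributes by MDL in a single pass over an iterable of instances.

    Each instance is a tuple of m nominal values. For every (attribute, value)
    pair, which defines one cluster, only the set of co-occurring values and the
    cluster size are kept; the instances themselves are not stored.
    """
    seen = set()                 # T: all (attribute, value) pairs in the data
    cooccur = defaultdict(set)   # cluster key -> T_i, the values seen in that cluster
    size = defaultdict(int)      # cluster key -> |C_i|
    m = 0
    for row in instances:
        row_items = [(j, v) for j, v in enumerate(row)]
        m = len(row_items)
        seen.update(row_items)
        for key in row_items:    # the instance belongs to one cluster per attribute
            cooccur[key].update(row_items)
            size[key] += 1

    k = len(seen)
    n_values = defaultdict(int)  # number of clusters (distinct values) per attribute
    for a, _ in cooccur:
        n_values[a] += 1

    scores = defaultdict(float)
    for (a, v), t_i in cooccur.items():
        k_i = len(t_i)
        scores[a] += (math.log2(math.comb(k, k_i)) + math.log2(n_values[a])
                      + size[(a, v)] * math.log2(math.comb(k_i, m)))
    return sorted(scores.items(), key=lambda kv: kv[1])  # best (smallest) first


if __name__ == "__main__":
    toy = [("a", "x"), ("a", "y"), ("b", "x"), ("b", "x")]
    for attr_index, score in mdl_rank(iter(toy)):  # works on a one-shot iterator
        print(attr_index, round(score, 3))
```

Because the function consumes an iterator, it can be fed from a file reader without loading the data set into memory, which is the point of the "no need to store instances" remark.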

14 Experimental Evaluation Data. Table columns: Data Set, Instances, Attributes, Classes. Data sets: reuters, reuters-3class, reuters-2class, trec, soybean, soybean-small, iris, ionosphere. Java implementations of MDL ranking and clustering available from

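15 Experimental Evaluation Metrics
Average precision:
$$\text{AveragePrecision} = \frac{1}{|D_q|} \sum_{k=1}^{|D|} r_k \times \text{PrecisionAtRank}(k), \qquad \text{PrecisionAtRank}(k) = \frac{1}{k} \sum_{i=1}^{k} r_i$$
$$r_i = \begin{cases} 1 & \text{if } a_i \in D_q \\ 0 & \text{otherwise} \end{cases}$$
Classes-to-clusters accuracy (true cluster membership):
root [5, 9]
  temperature=hot [2, 2]
    outlook=sunny [2] no
    outlook=overcast [2] yes
  temperature=mild [4, 2]
    windy=false [2, 1] yes
    windy=true [2, 1] yes
  temperature=cool [3, 1]
    windy=false [2] yes
    windy=true [1, 1] no
Clusters (leaves): 6. Correctly classified instances: 11 (78%).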
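Both metrics on this slide are simple to compute. The sketch below assumes the usual reading of average precision, in which the precision at each rank holding a relevant attribute is summed and divided by the number of relevant attributes $|D_q|$, and it reads classes-to-clusters accuracy as the share of instances that belong to the majority class of their leaf cluster. The function names, the example relevant set, and the leaf counts (derived from the play tennis table) are illustrative.

```python
def precision_at_rank(relevance, k):
    """Fraction of the top-k ranked attributes that are relevant."""
    return sum(relevance[:k]) / k


def average_precision(ranking, relevant):
    """Average precision of a ranking with respect to the relevant set D_q."""
    relevance = [1 if a in relevant else 0 for a in ranking]  # r_i
    total = sum(r * precision_at_rank(relevance, k)
                for k, r in enumerate(relevance, start=1))
    return total / len(relevant)


def classes_to_clusters(leaf_counts):
    """Return (correct, total), labelling each leaf with its majority class."""
    correct = sum(max(counts) for counts in leaf_counts)
    total = sum(sum(counts) for counts in leaf_counts)
    return correct, total


if __name__ == "__main__":
    # Hypothetical relevant set, chosen only to exercise the formula.
    ranking = ["temp", "humidity", "outlook", "windy"]
    print(average_precision(ranking, {"humidity", "outlook"}))

    # Class counts in the six leaves of the play tennis cluster tree.
    leaves = [[2], [2], [2, 1], [2, 1], [2], [1, 1]]
    correct, total = classes_to_clusters(leaves)
    print(correct, total, f"{100 * correct / total:.1f}%")
```

A ranking that places all relevant attributes first gets an average precision of 1, and the value falls as relevant attributes move down the list.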

16 Average Precision of Attribute Ranking. Table columns: Data set, $|D_q|$, InfoGain, MDL, Error, Entropy. Data sets: reuters, reuters-3class, reuters-2class, trec, soybean, soybean-small, iris, ionosphere. $D_q$: the set of attributes selected by the Wrapper Subset Evaluator with a Naïve Bayes classifier. InfoGain: supervised attribute ranking using the Information Gain Evaluator. Error: unsupervised ranking based on evaluating the quality of clustering by the sum of squared errors. Entropy: unsupervised ranking based on the reduction of the entropy in data when the attribute is removed (Dash and Liu 2000).

17 Classes-to-Clusters Accuracy with Reuters Data. [Figure: two panels, EM and k-means; y-axis: % accuracy; series: MDL ranked, InfoGain ranked.]

18 Classes-to-Clusters Accuracy with Reuters-3class Data. [Figure: two panels, EM and k-means; series: MDL ranked, InfoGain ranked.]

19 Classes-to-Clusters Accuracy with Soybean Data. [Figure: two panels, EM and k-means; series: MDL ranked, InfoGain ranked.]

20 MDL-Based Clustering. Function MDL-Cluster(D):
1. Choose attribute $A = \arg\min_i MDL(A_i)$.
2. Let $A$ take values $v_1, v_2, \dots, v_p$.
3. Split data $D = \bigcup_{i=1}^{n} C_i$, where $C_i = \{X \mid v_i \in X\}$.
4. If $Comp(A) > Comp(C_i)$ then stop. Return $D$.
5. For each $i = 1, \dots, n$: call MDL-Cluster($C_i$).

21 Clustering Reuters-2class Data
root ( ) [608, 39]
  trade=0 ( ) [507, 8]
    rate=0 ( ) [339, 8]
      monei=1 ( ) [48] money
      monei=0 ( ) [9, 8] money
    rate=1 ( ) [68]
      currenc=0 ( ) [00] money
      currenc=1 ( ) [68] money
  trade=1 ( ) [30, 0]
    market=0 ( ) [86, 39]
      countri=1 ( ) [67, 20] trade
      countri=0 ( ) [9, 9] trade
    market=1 ( ) [5, 62]
      bank=0 ( ) [94, ] trade
      bank=1 ( ) [2, 5] money
Clusters (leaves): 8. Correctly classified instances: 838 (90%).
MDL-Cluster Tree:
root ( ) [608, 39]
  trade=0 ( ) [507, 8] money
  trade=1 ( ) [30, 0]
    market=0 ( ) [86, 39]
      countri=1 ( ) [67, 20] trade
      countri=0 ( ) [9, 9] trade
    market=1 ( ) [5, 62]
      bank=0 ( ) [94, ] trade
      bank=1 ( ) [2, 5] money
Clusters (leaves): 5. Correctly classified instances: 838 (90%).

22 Comparing MDL, EM and k-means. Table columns: Data set; EM (Acc. %, No. of Clusters); k-means (Acc. %, No. of Clusters); MDL-Cluster (Acc. %, No. of Clusters). Data sets: reuters, reuters-3class, reuters-2class, trec, soybean, soybean-small, iris, ionosphere.

23 Conclusion. MDL-ranker, without class information, performs close to the InfoGain method, which essentially uses class information. Thus, our approach can improve the performance of clustering algorithms in a purely unsupervised setting. MDL-cluster outperforms EM and k-means on most benchmark data sets. Numeric attributes? Subset evaluation? Non-hierarchical clustering? Thank You!

Boostrapaggregating (Bagging)

Boostrapaggregating (Bagging) Boostrapaggregatng (Baggng) An ensemble meta-algorthm desgned to mprove the stablty and accuracy of machne learnng algorthms Can be used n both regresson and classfcaton Reduces varance and helps to avod

More information

Module 3 LOSSY IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur

Module 3 LOSSY IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur Module 3 LOSSY IMAGE COMPRESSION SYSTEMS Verson ECE IIT, Kharagpur Lesson 6 Theory of Quantzaton Verson ECE IIT, Kharagpur Instructonal Objectves At the end of ths lesson, the students should be able to:

More information

A Robust Method for Calculating the Correlation Coefficient

A Robust Method for Calculating the Correlation Coefficient A Robust Method for Calculatng the Correlaton Coeffcent E.B. Nven and C. V. Deutsch Relatonshps between prmary and secondary data are frequently quantfed usng the correlaton coeffcent; however, the tradtonal

More information

Cluster Validation Determining Number of Clusters. Umut ORHAN, PhD.

Cluster Validation Determining Number of Clusters. Umut ORHAN, PhD. Cluster Analyss Cluster Valdaton Determnng Number of Clusters 1 Cluster Valdaton The procedure of evaluatng the results of a clusterng algorthm s known under the term cluster valdty. How do we evaluate

More information

Chapter 4.5 Association Rules. CSCI 347, Data Mining

Chapter 4.5 Association Rules. CSCI 347, Data Mining Chapter 4.5 Association Rules CSCI 347, Data Mining Mining Association Rules Can be highly computationally complex One method: Determine item sets Build rules from those item sets Vocabulary from before

More information

Statistical pattern recognition

Statistical pattern recognition Statstcal pattern recognton Bayes theorem Problem: decdng f a patent has a partcular condton based on a partcular test However, the test s mperfect Someone wth the condton may go undetected (false negatve

More information

2E Pattern Recognition Solutions to Introduction to Pattern Recognition, Chapter 2: Bayesian pattern classification

2E Pattern Recognition Solutions to Introduction to Pattern Recognition, Chapter 2: Bayesian pattern classification E395 - Pattern Recognton Solutons to Introducton to Pattern Recognton, Chapter : Bayesan pattern classfcaton Preface Ths document s a soluton manual for selected exercses from Introducton to Pattern Recognton

More information

Aggregation of Social Networks by Divisive Clustering Method

Aggregation of Social Networks by Divisive Clustering Method ggregaton of Socal Networks by Dvsve Clusterng Method mne Louat and Yves Lechaveller INRI Pars-Rocquencourt Rocquencourt, France {lzennyr.da_slva, Yves.Lechevaller, Fabrce.Ross}@nra.fr HCSD Beng October

More information

Evaluation for sets of classes

Evaluation for sets of classes Evaluaton for Tet Categorzaton Classfcaton accuracy: usual n ML, the proporton of correct decsons, Not approprate f the populaton rate of the class s low Precson, Recall and F 1 Better measures 21 Evaluaton

More information

Bayesian Learning. Smart Home Health Analytics Spring Nirmalya Roy Department of Information Systems University of Maryland Baltimore County

Bayesian Learning. Smart Home Health Analytics Spring Nirmalya Roy Department of Information Systems University of Maryland Baltimore County Smart Home Health Analytcs Sprng 2018 Bayesan Learnng Nrmalya Roy Department of Informaton Systems Unversty of Maryland Baltmore ounty www.umbc.edu Bayesan Learnng ombnes pror knowledge wth evdence to

More information

Markov Chain Monte Carlo (MCMC), Gibbs Sampling, Metropolis Algorithms, and Simulated Annealing Bioinformatics Course Supplement

Markov Chain Monte Carlo (MCMC), Gibbs Sampling, Metropolis Algorithms, and Simulated Annealing Bioinformatics Course Supplement Markov Chan Monte Carlo MCMC, Gbbs Samplng, Metropols Algorthms, and Smulated Annealng 2001 Bonformatcs Course Supplement SNU Bontellgence Lab http://bsnuackr/ Outlne! Markov Chan Monte Carlo MCMC! Metropols-Hastngs

More information

Lecture Notes on Linear Regression

Lecture Notes on Linear Regression Lecture Notes on Lnear Regresson Feng L fl@sdueducn Shandong Unversty, Chna Lnear Regresson Problem In regresson problem, we am at predct a contnuous target value gven an nput feature vector We assume

More information

Homework Assignment 3 Due in class, Thursday October 15

Homework Assignment 3 Due in class, Thursday October 15 Homework Assgnment 3 Due n class, Thursday October 15 SDS 383C Statstcal Modelng I 1 Rdge regresson and Lasso 1. Get the Prostrate cancer data from http://statweb.stanford.edu/~tbs/elemstatlearn/ datasets/prostate.data.

More information

CIS526: Machine Learning Lecture 3 (Sept 16, 2003) Linear Regression. Preparation help: Xiaoying Huang. x 1 θ 1 output... θ M x M

CIS526: Machine Learning Lecture 3 (Sept 16, 2003) Linear Regression. Preparation help: Xiaoying Huang. x 1 θ 1 output... θ M x M CIS56: achne Learnng Lecture 3 (Sept 6, 003) Preparaton help: Xaoyng Huang Lnear Regresson Lnear regresson can be represented by a functonal form: f(; θ) = θ 0 0 +θ + + θ = θ = 0 ote: 0 s a dummy attrbute

More information

MAXIMUM A POSTERIORI TRANSDUCTION

MAXIMUM A POSTERIORI TRANSDUCTION MAXIMUM A POSTERIORI TRANSDUCTION LI-WEI WANG, JU-FU FENG School of Mathematcal Scences, Peng Unversty, Bejng, 0087, Chna Center for Informaton Scences, Peng Unversty, Bejng, 0087, Chna E-MIAL: {wanglw,

More information

Maximum Likelihood Estimation (MLE)

Maximum Likelihood Estimation (MLE) Maxmum Lkelhood Estmaton (MLE) Ken Kreutz-Delgado (Nuno Vasconcelos) ECE 175A Wnter 01 UCSD Statstcal Learnng Goal: Gven a relatonshp between a feature vector x and a vector y, and d data samples (x,y

More information

A Network Intrusion Detection Method Based on Improved K-means Algorithm

A Network Intrusion Detection Method Based on Improved K-means Algorithm Advanced Scence and Technology Letters, pp.429-433 http://dx.do.org/10.14257/astl.2014.53.89 A Network Intruson Detecton Method Based on Improved K-means Algorthm Meng Gao 1,1, Nhong Wang 1, 1 Informaton

More information

CS 2750 Machine Learning. Lecture 5. Density estimation. CS 2750 Machine Learning. Announcements

CS 2750 Machine Learning. Lecture 5. Density estimation. CS 2750 Machine Learning. Announcements CS 750 Machne Learnng Lecture 5 Densty estmaton Mlos Hauskrecht mlos@cs.ptt.edu 539 Sennott Square CS 750 Machne Learnng Announcements Homework Due on Wednesday before the class Reports: hand n before

More information

A Bayes Algorithm for the Multitask Pattern Recognition Problem Direct Approach

A Bayes Algorithm for the Multitask Pattern Recognition Problem Direct Approach A Bayes Algorthm for the Multtask Pattern Recognton Problem Drect Approach Edward Puchala Wroclaw Unversty of Technology, Char of Systems and Computer etworks, Wybrzeze Wyspanskego 7, 50-370 Wroclaw, Poland

More information

Durban Watson for Testing the Lack-of-Fit of Polynomial Regression Models without Replications

Durban Watson for Testing the Lack-of-Fit of Polynomial Regression Models without Replications Durban Watson for Testng the Lack-of-Ft of Polynomal Regresson Models wthout Replcatons Ruba A. Alyaf, Maha A. Omar, Abdullah A. Al-Shha ralyaf@ksu.edu.sa, maomar@ksu.edu.sa, aalshha@ksu.edu.sa Department

More information

Department of Computer Science Artificial Intelligence Research Laboratory. Iowa State University MACHINE LEARNING

Department of Computer Science Artificial Intelligence Research Laboratory. Iowa State University MACHINE LEARNING MACHINE LEARNING Vasant Honavar Bonformatcs and Computatonal Bology Program Center for Computatonal Intellgence, Learnng, & Dscovery Iowa State Unversty honavar@cs.astate.edu www.cs.astate.edu/~honavar/

More information

Chapter 8 Indicator Variables

Chapter 8 Indicator Variables Chapter 8 Indcator Varables In general, e explanatory varables n any regresson analyss are assumed to be quanttatve n nature. For example, e varables lke temperature, dstance, age etc. are quanttatve n

More information

x = , so that calculated

x = , so that calculated Stat 4, secton Sngle Factor ANOVA notes by Tm Plachowsk n chapter 8 we conducted hypothess tests n whch we compared a sngle sample s mean or proporton to some hypotheszed value Chapter 9 expanded ths to

More information

Ensemble Methods: Boosting

Ensemble Methods: Boosting Ensemble Methods: Boostng Ncholas Ruozz Unversty of Texas at Dallas Based on the sldes of Vbhav Gogate and Rob Schapre Last Tme Varance reducton va baggng Generate new tranng data sets by samplng wth replacement

More information

Classification as a Regression Problem

Classification as a Regression Problem Target varable y C C, C,, ; Classfcaton as a Regresson Problem { }, 3 L C K To treat classfcaton as a regresson problem we should transform the target y nto numercal values; The choce of numercal class

More information

Finite Mixture Models and Expectation Maximization. Most slides are from: Dr. Mario Figueiredo, Dr. Anil Jain and Dr. Rong Jin

Finite Mixture Models and Expectation Maximization. Most slides are from: Dr. Mario Figueiredo, Dr. Anil Jain and Dr. Rong Jin Fnte Mxture Models and Expectaton Maxmzaton Most sldes are from: Dr. Maro Fgueredo, Dr. Anl Jan and Dr. Rong Jn Recall: The Supervsed Learnng Problem Gven a set of n samples X {(x, y )},,,n Chapter 3 of

More information

Multilayer Perceptron (MLP)

Multilayer Perceptron (MLP) Multlayer Perceptron (MLP) Seungjn Cho Department of Computer Scence and Engneerng Pohang Unversty of Scence and Technology 77 Cheongam-ro, Nam-gu, Pohang 37673, Korea seungjn@postech.ac.kr 1 / 20 Outlne

More information

VQ widely used in coding speech, image, and video

VQ widely used in coding speech, image, and video at Scalar quantzers are specal cases of vector quantzers (VQ): they are constraned to look at one sample at a tme (memoryless) VQ does not have such constrant better RD perfomance expected Source codng

More information

Department of Computer Science Artificial Intelligence Research Laboratory. Iowa State University MACHINE LEARNING

Department of Computer Science Artificial Intelligence Research Laboratory. Iowa State University MACHINE LEARNING MACHINE LEANING Vasant Honavar Bonformatcs and Computatonal Bology rogram Center for Computatonal Intellgence, Learnng, & Dscovery Iowa State Unversty honavar@cs.astate.edu www.cs.astate.edu/~honavar/

More information

is the calculated value of the dependent variable at point i. The best parameters have values that minimize the squares of the errors

is the calculated value of the dependent variable at point i. The best parameters have values that minimize the squares of the errors Multple Lnear and Polynomal Regresson wth Statstcal Analyss Gven a set of data of measured (or observed) values of a dependent varable: y versus n ndependent varables x 1, x, x n, multple lnear regresson

More information

Markov Chain Monte Carlo Lecture 6

Markov Chain Monte Carlo Lecture 6 where (x 1,..., x N ) X N, N s called the populaton sze, f(x) f (x) for at least one {1, 2,..., N}, and those dfferent from f(x) are called the tral dstrbutons n terms of mportance samplng. Dfferent ways

More information

8/25/17. Data Modeling. Data Modeling. Data Modeling. Patrice Koehl Department of Biological Sciences National University of Singapore

8/25/17. Data Modeling. Data Modeling. Data Modeling. Patrice Koehl Department of Biological Sciences National University of Singapore 8/5/17 Data Modelng Patrce Koehl Department of Bologcal Scences atonal Unversty of Sngapore http://www.cs.ucdavs.edu/~koehl/teachng/bl59 koehl@cs.ucdavs.edu Data Modelng Ø Data Modelng: least squares Ø

More information

Statistics for Managers Using Microsoft Excel/SPSS Chapter 13 The Simple Linear Regression Model and Correlation

Statistics for Managers Using Microsoft Excel/SPSS Chapter 13 The Simple Linear Regression Model and Correlation Statstcs for Managers Usng Mcrosoft Excel/SPSS Chapter 13 The Smple Lnear Regresson Model and Correlaton 1999 Prentce-Hall, Inc. Chap. 13-1 Chapter Topcs Types of Regresson Models Determnng the Smple Lnear

More information

Exercises of Chapter 2

Exercises of Chapter 2 Exercses of Chapter Chuang-Cheh Ln Department of Computer Scence and Informaton Engneerng, Natonal Chung Cheng Unversty, Mng-Hsung, Chay 61, Tawan. Exercse.6. Suppose that we ndependently roll two standard

More information

Learning Theory: Lecture Notes

Learning Theory: Lecture Notes Learnng Theory: Lecture Notes Lecturer: Kamalka Chaudhur Scrbe: Qush Wang October 27, 2012 1 The Agnostc PAC Model Recall that one of the constrants of the PAC model s that the data dstrbuton has to be

More information

} Often, when learning, we deal with uncertainty:

} Often, when learning, we deal with uncertainty: Uncertanty and Learnng } Often, when learnng, we deal wth uncertanty: } Incomplete data sets, wth mssng nformaton } Nosy data sets, wth unrelable nformaton } Stochastcty: causes and effects related non-determnstcally

More information

Statistics for Economics & Business

Statistics for Economics & Business Statstcs for Economcs & Busness Smple Lnear Regresson Learnng Objectves In ths chapter, you learn: How to use regresson analyss to predct the value of a dependent varable based on an ndependent varable

More information

Finding Dense Subgraphs in G(n, 1/2)

Finding Dense Subgraphs in G(n, 1/2) Fndng Dense Subgraphs n Gn, 1/ Atsh Das Sarma 1, Amt Deshpande, and Rav Kannan 1 Georga Insttute of Technology,atsh@cc.gatech.edu Mcrosoft Research-Bangalore,amtdesh,annan@mcrosoft.com Abstract. Fndng

More information

EM and Structure Learning

EM and Structure Learning EM and Structure Learnng Le Song Machne Learnng II: Advanced Topcs CSE 8803ML, Sprng 2012 Partally observed graphcal models Mxture Models N(μ 1, Σ 1 ) Z X N N(μ 2, Σ 2 ) 2 Gaussan mxture model Consder

More information

P R. Lecture 4. Theory and Applications of Pattern Recognition. Dept. of Electrical and Computer Engineering /

P R. Lecture 4. Theory and Applications of Pattern Recognition. Dept. of Electrical and Computer Engineering / Theory and Applcatons of Pattern Recognton 003, Rob Polkar, Rowan Unversty, Glassboro, NJ Lecture 4 Bayes Classfcaton Rule Dept. of Electrcal and Computer Engneerng 0909.40.0 / 0909.504.04 Theory & Applcatons

More information

Supporting Information

Supporting Information Supportng Informaton The neural network f n Eq. 1 s gven by: f x l = ReLU W atom x l + b atom, 2 where ReLU s the element-wse rectfed lnear unt, 21.e., ReLUx = max0, x, W atom R d d s the weght matrx to

More information

Classification. Representing data: Hypothesis (classifier) Lecture 2, September 14, Reading: Eric CMU,

Classification. Representing data: Hypothesis (classifier) Lecture 2, September 14, Reading: Eric CMU, Machne Learnng 10-701/15-781, 781, Fall 2011 Nonparametrc methods Erc Xng Lecture 2, September 14, 2011 Readng: 1 Classfcaton Representng data: Hypothess (classfer) 2 1 Clusterng 3 Supervsed vs. Unsupervsed

More information

Machine Learning for Language Technology Lecture 8: Decision Trees and k- Nearest Neighbors

Machine Learning for Language Technology Lecture 8: Decision Trees and k- Nearest Neighbors Machne Learnng for Language Technology Lecture 8: Decson Trees and k- Nearest Neghbors Marna San:n Department of Lngus:cs and Phlology Uppsala Unversty, Uppsala, Sweden Autumn 2014 Acknowledgement: Thanks

More information

Problem Set 9 Solutions

Problem Set 9 Solutions Desgn and Analyss of Algorthms May 4, 2015 Massachusetts Insttute of Technology 6.046J/18.410J Profs. Erk Demane, Srn Devadas, and Nancy Lynch Problem Set 9 Solutons Problem Set 9 Solutons Ths problem

More information

Fuzzy Boundaries of Sample Selection Model

Fuzzy Boundaries of Sample Selection Model Proceedngs of the 9th WSES Internatonal Conference on ppled Mathematcs, Istanbul, Turkey, May 7-9, 006 (pp309-34) Fuzzy Boundares of Sample Selecton Model L. MUHMD SFIIH, NTON BDULBSH KMIL, M. T. BU OSMN

More information

CHAPTER IV RESEARCH FINDING AND ANALYSIS

CHAPTER IV RESEARCH FINDING AND ANALYSIS CHAPTER IV REEARCH FINDING AND ANALYI A. Descrpton of Research Fndngs To fnd out the dfference between the students who were taught by usng Mme Game and the students who were not taught by usng Mme Game

More information

Keyword Reduction for Text Categorization using Neighborhood Rough Sets

Keyword Reduction for Text Categorization using Neighborhood Rough Sets IJCSI Internatonal Journal of Computer Scence Issues, Volume 1, Issue 1, No, January 015 ISSN (rnt): 1694-0814 ISSN (Onlne): 1694-0784 www.ijcsi.org Keyword Reducton for Text Categorzaton usng Neghborhood

More information

Lossy Compression. Compromise accuracy of reconstruction for increased compression.

Lossy Compression. Compromise accuracy of reconstruction for increased compression. Lossy Compresson Compromse accuracy of reconstructon for ncreased compresson. The reconstructon s usually vsbly ndstngushable from the orgnal mage. Typcally, one can get up to 0:1 compresson wth almost

More information

10-701/ Machine Learning, Fall 2005 Homework 3

10-701/ Machine Learning, Fall 2005 Homework 3 10-701/15-781 Machne Learnng, Fall 2005 Homework 3 Out: 10/20/05 Due: begnnng of the class 11/01/05 Instructons Contact questons-10701@autonlaborg for queston Problem 1 Regresson and Cross-valdaton [40

More information

Clustering with Gaussian Mixtures

Clustering with Gaussian Mixtures Note to other teachers and users of these sldes. Andrew would be delghted f you found ths source materal useful n gvng your own lectures. Feel free to use these sldes verbatm, or to modfy them to ft your

More information

Kernel Methods and SVMs Extension

Kernel Methods and SVMs Extension Kernel Methods and SVMs Extenson The purpose of ths document s to revew materal covered n Machne Learnng 1 Supervsed Learnng regardng support vector machnes (SVMs). Ths document also provdes a general

More information

Natural Language Processing and Information Retrieval

Natural Language Processing and Information Retrieval Natural Language Processng and Informaton Retreval Support Vector Machnes Alessandro Moschtt Department of nformaton and communcaton technology Unversty of Trento Emal: moschtt@ds.untn.t Summary Support

More information

Decision-making and rationality

Decision-making and rationality Reslence Informatcs for Innovaton Classcal Decson Theory RRC/TMI Kazuo URUTA Decson-makng and ratonalty What s decson-makng? Methodology for makng a choce The qualty of decson-makng determnes success or

More information

Internet Engineering. Jacek Mazurkiewicz, PhD Softcomputing. Part 3: Recurrent Artificial Neural Networks Self-Organising Artificial Neural Networks

Internet Engineering. Jacek Mazurkiewicz, PhD Softcomputing. Part 3: Recurrent Artificial Neural Networks Self-Organising Artificial Neural Networks Internet Engneerng Jacek Mazurkewcz, PhD Softcomputng Part 3: Recurrent Artfcal Neural Networks Self-Organsng Artfcal Neural Networks Recurrent Artfcal Neural Networks Feedback sgnals between neurons Dynamc

More information

EGR 544 Communication Theory

EGR 544 Communication Theory EGR 544 Communcaton Theory. Informaton Sources Z. Alyazcoglu Electrcal and Computer Engneerng Department Cal Poly Pomona Introducton Informaton Source x n Informaton sources Analog sources Dscrete sources

More information

Predictive Analytics : QM901.1x Prof U Dinesh Kumar, IIMB. All Rights Reserved, Indian Institute of Management Bangalore

Predictive Analytics : QM901.1x Prof U Dinesh Kumar, IIMB. All Rights Reserved, Indian Institute of Management Bangalore Sesson Outlne Introducton to classfcaton problems and dscrete choce models. Introducton to Logstcs Regresson. Logstc functon and Logt functon. Maxmum Lkelhood Estmator (MLE) for estmaton of LR parameters.

More information

9.913 Pattern Recognition for Vision. Class IV Part I Bayesian Decision Theory Yuri Ivanov

9.913 Pattern Recognition for Vision. Class IV Part I Bayesian Decision Theory Yuri Ivanov 9.93 Class IV Part I Bayesan Decson Theory Yur Ivanov TOC Roadmap to Machne Learnng Bayesan Decson Makng Mnmum Error Rate Decsons Mnmum Rsk Decsons Mnmax Crteron Operatng Characterstcs Notaton x - scalar

More information

4 Analysis of Variance (ANOVA) 5 ANOVA. 5.1 Introduction. 5.2 Fixed Effects ANOVA

4 Analysis of Variance (ANOVA) 5 ANOVA. 5.1 Introduction. 5.2 Fixed Effects ANOVA 4 Analyss of Varance (ANOVA) 5 ANOVA 51 Introducton ANOVA ANOVA s a way to estmate and test the means of multple populatons We wll start wth one-way ANOVA If the populatons ncluded n the study are selected

More information

Tracking with Kalman Filter

Tracking with Kalman Filter Trackng wth Kalman Flter Scott T. Acton Vrgna Image and Vdeo Analyss (VIVA), Charles L. Brown Department of Electrcal and Computer Engneerng Department of Bomedcal Engneerng Unversty of Vrgna, Charlottesvlle,

More information

Comparison of the Population Variance Estimators. of 2-Parameter Exponential Distribution Based on. Multiple Criteria Decision Making Method

Comparison of the Population Variance Estimators. of 2-Parameter Exponential Distribution Based on. Multiple Criteria Decision Making Method Appled Mathematcal Scences, Vol. 7, 0, no. 47, 07-0 HIARI Ltd, www.m-hkar.com Comparson of the Populaton Varance Estmators of -Parameter Exponental Dstrbuton Based on Multple Crtera Decson Makng Method

More information

Lecture 10 Support Vector Machines II

Lecture 10 Support Vector Machines II Lecture 10 Support Vector Machnes II 22 February 2016 Taylor B. Arnold Yale Statstcs STAT 365/665 1/28 Notes: Problem 3 s posted and due ths upcomng Frday There was an early bug n the fake-test data; fxed

More information

Stanford University CS254: Computational Complexity Notes 7 Luca Trevisan January 29, Notes for Lecture 7
