Data Analysis. Santiago González


1 Santiago González

2 Contents: Introduction; CRISP-DM (1); Tools; Data understanding; Data preparation; Modeling (2): Association rules?, Supervised classification, Clustering; Assessment & Evaluation (1); Examples (2): Neuron Classification, Alzheimer's disease, Medulloblastoma; CliDaPa (1); Special Guest: Prof. Ernestina Menasalvas, Stream Mining

3 Data Mining: Modeling

4 Data Mining Tasks. Prediction methods: use some variables to predict unknown or future values of other variables. Description methods: find human-interpretable patterns that describe the data. From [Fayyad et al.], Advances in Knowledge Discovery and Data Mining, 1996.

5 Data Mining Tasks... Association Rule Discovery [Descriptive] Classification [Predictive] Regression [Predictive] Clustering [Descriptive] Supervised cl. Unsupervised cl.

6 Data Mining Tasks... Association Rule Discovery [Descriptive] Classification [Predictive] Regression [Predictive] Clustering [Descriptive]

7 Association Rule Discovery Given a set of records, each of which contains some number of items from a given collection, produce dependency rules that predict the occurrence of an item based on occurrences of other items.
TID  Items
1    Bread, Coke, Milk
2    Beer, Bread
3    Beer, Coke, Diaper, Milk
4    Beer, Bread, Diaper, Milk
5    Coke, Diaper, Milk
Rules discovered:
{Milk} --> {Coke}
{Diaper, Milk} --> {Beer}

8 Association Rule Discovery Example: let the rule discovered be {Bagels, ...} --> {Potato Chips}. Potato Chips as consequent => can be used to determine what should be done to boost its sales. Bagels in the antecedent => can be used to see which products would be affected if the store discontinues selling bagels. Bagels in antecedent and Potato Chips in consequent => can be used to see what products should be sold with Bagels to promote sale of Potato Chips!

9 Data Mining Tasks... Association Rule Discovery [Descriptive] Classification [Predictive] Regression [Predictive] Clustering [Descriptive]

10 Classification: Definition Given a collection of records (training set), each record contains a set of attributes, one of which is the class (categorical). The class may be binary or not. Find a model for the class attribute as a function of the values of the other attributes. Goal: previously unseen records should be assigned a class as accurately as possible. A test set is used to determine the accuracy of the model. Usually, the given data set is divided into training and test sets, with the training set used to build the model and the test set used to validate it.

11 Classification Example
Training set:
Tid  Refund  Marital Status  Taxable Income  Cheat
1    Yes     Single          125K            No
2    No      Married         100K            No
3    No      Single          70K             No
4    Yes     Married         120K            No
5    No      Divorced        95K             Yes
6    No      Married         60K             No
7    Yes     Divorced        220K            No
8    No      Single          85K             Yes
9    No      Married         75K             No
10   No      Single          90K             Yes
Test set:
Refund  Marital Status  Taxable Income  Cheat
No      Single          75K             ?
Yes     Married         50K             ?
No      Married         150K            ?
Yes     Divorced        90K             ?
No      Single          40K             ?
No      Married         80K             ?
[Diagram: Training Set -> Learn Classifier -> Model; Test Set -> Model]

12 Classifying Galaxies Class: stages of formation (Early, Intermediate, Late). Attributes: image features, characteristics of light waves received, etc. Data size: 72 million stars, 20 million galaxies; Object Catalog: 9 GB; Image Database: 150 GB. Courtesy:

13 Classification

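14 Cross validation With confusion-matrix cells a (true positives), b (false positives), c (false negatives), d (true negatives) and sum = a + b + c + d:
Well classified: (a+d)/sum
Wrongly classified: (c+b)/sum
True positive rate (sensitivity): a/(a+c)
True negative rate (specificity): d/(b+d)
False positive rate: b/(b+d)
False negative rate: c/(a+c)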

15 Classification: example Well classified: Wrongly classified: True positive (sensitivity): True negative (specificity): False positive: False negative:

16 Classification: example Well classified: 4/6 Wrongly classified: 2/6 True positive (sensitivity): 2/3 True negative (specificity): 2/3 False positive: 1/3 False negative: 1/3
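The rates in this example can be checked with a few lines of Python. This is only a sketch: the cell counts a=2, b=1, c=1, d=2 are an assumption inferred from the results on the slide (the confusion matrix itself is not in the transcript), and the function name is hypothetical. The false positive and false negative rates use the standard definitions FP/(FP+TN) and FN/(FN+TP).

```python
from fractions import Fraction


def confusion_metrics(a, b, c, d):
    """Rates from a 2x2 confusion matrix.

    Assumed cell layout: a = true positives, b = false positives,
    c = false negatives, d = true negatives.
    """
    total = a + b + c + d
    return {
        "well_classified": Fraction(a + d, total),
        "wrongly_classified": Fraction(b + c, total),
        "sensitivity": Fraction(a, a + c),          # true positive rate
        "specificity": Fraction(d, b + d),          # true negative rate
        "false_positive_rate": Fraction(b, b + d),
        "false_negative_rate": Fraction(c, a + c),
    }


# Cell counts assumed from the slide's results (4/6 well classified, etc.).
metrics = confusion_metrics(a=2, b=1, c=1, d=2)
for name, value in metrics.items():
    print(f"{name}: {value}")
```

Fractions are printed in lowest terms, so 4/6 appears as 2/3.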

17 Classification

18 KNN Idea: use the information of the k nearest neighbours. We need to calculate the distance between samples in order to know which are nearest (Euclidean, Manhattan, etc.). Prior info: number of neighbours: K; distance function: d(x,y); learning data; testing data.

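19 KNN Euclidean distance: $d(x,y) = \sqrt{\sum_i (x_i - y_i)^2}$. Manhattan distance: $d(x,y) = \sum_i |x_i - y_i|$. The two are quite similar; the difference is that Manhattan distance uses the absolute value instead of the squared value.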

20 KNN Example with K = 3, two attributes and euclidean distance
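The figure for this example is not in the transcript, so the following is a minimal sketch of KNN with K = 3, two attributes and Euclidean distance. The training points, labels and function names are made up for illustration; a Manhattan distance is included so the two can be swapped.

```python
import math
from collections import Counter


def euclidean(x, y):
    # Square root of the sum of squared coordinate differences.
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))


def manhattan(x, y):
    # Sum of absolute coordinate differences.
    return sum(abs(xi - yi) for xi, yi in zip(x, y))


def knn_predict(train, query, k=3, dist=euclidean):
    """Majority vote among the k training samples nearest to `query`."""
    neighbours = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical learning data: (attributes, class).
train = [
    ((1.0, 1.0), "A"), ((1.5, 2.0), "A"), ((2.0, 1.0), "A"),
    ((6.0, 6.0), "B"), ((7.0, 5.5), "B"), ((6.5, 7.0), "B"),
]

prediction = knn_predict(train, (2.0, 2.0), k=3)
print(prediction)
```

Passing `dist=manhattan` changes only how "nearest" is measured; the voting step stays the same.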

21 ID3 Objective: create a decision tree as a method to approximate a target function based on discrete values. Resistant to noise in the data. Able to learn a disjunction of expressions. The result can be expressed as if-then rules. Tries to find the simplest tree that best separates the samples. It is a recursive algorithm. Uses information gain.

22 ID3

23 ID3 The most discriminative feature is the one with the highest information gain:
$$G(C, Attr_1) = E(C) - \sum_i P(Attr_1 = V_i)\, E(Attr_1 = V_i)$$
where
$$E(Attr_1 = V_i) = -\sum_c P(C = c \mid Attr_1 = V_i)\, \log_2 P(C = c \mid Attr_1 = V_i)$$
and $E(C) = -\sum_c P(C = c)\, \log_2 P(C = c)$, with $\log_2 x = \ln x / \ln 2$.

24 ID3: example Is this feature important? Supervised Classification

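25 ID3: example G(AdministrarTratamiento, Gota) = G(AT, G), where AdministrarTratamiento means "administer treatment" and Si means "yes"
$$G(AT, G) = E(AT) - P(G=Si)\, E(G=Si) - P(G=No)\, E(G=No)$$
$$E(G=Si) = -P(AT=Si \mid G=Si) \log_2 P(AT=Si \mid G=Si) - P(AT=No \mid G=Si) \log_2 P(AT=No \mid G=Si) = -\tfrac{3}{7} \log_2 \tfrac{3}{7} - \tfrac{4}{7} \log_2 \tfrac{4}{7} = 0.985$$
$$E(G=No) = -P(AT=Si \mid G=No) \log_2 P(AT=Si \mid G=No) - P(AT=No \mid G=No) \log_2 P(AT=No \mid G=No) = -\tfrac{6}{7} \log_2 \tfrac{6}{7} - \tfrac{1}{7} \log_2 \tfrac{1}{7} = 0.592$$
$$E(AT) = -P(AT=Si) \log_2 P(AT=Si) - P(AT=No) \log_2 P(AT=No) = -\tfrac{9}{14} \log_2 \tfrac{9}{14} - \tfrac{5}{14} \log_2 \tfrac{5}{14} = 0.940$$
$$G(AT, G) = 0.94 - P(G=Si) \times 0.985 - P(G=No) \times 0.592 = 0.94 - \tfrac{7}{14} \times 0.985 - \tfrac{7}{14} \times 0.592 \approx 0.151$$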
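The arithmetic of this example can be reproduced with a short sketch. The function names are hypothetical, and the class counts are read off the fractions above: 9 Si and 5 No overall, split into (3 Si, 4 No) for G=Si and (6 Si, 1 No) for G=No.

```python
import math


def entropy(counts):
    """Entropy in bits of a class distribution given as raw counts."""
    total = sum(counts)
    return -sum((n / total) * math.log2(n / total) for n in counts if n > 0)


def information_gain(class_counts, partitions):
    """Gain = E(C) minus the weighted entropy of each attribute value.

    `partitions` holds one list of class counts per attribute value.
    """
    total = sum(class_counts)
    remainder = sum(sum(p) / total * entropy(p) for p in partitions)
    return entropy(class_counts) - remainder


# Counts are ordered as (AT=Si, AT=No).
e_at = entropy([9, 5])
gain = information_gain([9, 5], [[3, 4], [6, 1]])
print(round(e_at, 3), round(gain, 3))
```

Without rounding the intermediate entropies the gain comes out at about 0.152; the 0.151 on the slide follows from using the rounded values.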

26 ID3: example

27 ID3: example

28 ID3: example

29 Bayes Classifier A probabilistic framework for solving classification problems.
Conditional probability:
$$P(C \mid A) = \frac{P(A, C)}{P(A)}, \qquad P(A \mid C) = \frac{P(A, C)}{P(C)}$$
Bayes theorem:
$$P(C \mid A) = \frac{P(A \mid C)\, P(C)}{P(A)}$$

30 Example of Bayes Theorem Given: a doctor knows that meningitis causes stiff neck 50% of the time; the prior probability of any patient having meningitis is 1/50,000; the prior probability of any patient having stiff neck is 1/20. If a patient has stiff neck, what is the probability he/she has meningitis?
$$P(M \mid S) = \frac{P(S \mid M)\, P(M)}{P(S)} = \frac{0.5 \times 1/50000}{1/20} = 0.0002$$
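As a quick check of this calculation, here is a minimal sketch using exact fractions; the function name `posterior` is hypothetical.

```python
from fractions import Fraction


def posterior(likelihood, prior, evidence):
    """Bayes theorem: P(M | S) = P(S | M) * P(M) / P(S)."""
    return likelihood * prior / evidence


p_m_given_s = posterior(
    likelihood=Fraction(1, 2),    # P(S | M): stiff neck given meningitis
    prior=Fraction(1, 50000),     # P(M)
    evidence=Fraction(1, 20),     # P(S)
)
print(p_m_given_s, float(p_m_given_s))
```

The result is 1/5000, so even with a stiff neck the probability of meningitis stays very small because the prior is so low.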

31 Bayesian Classifiers Consider each attribute and class label as random variables. Given a record with attributes $(A_1, A_2, \ldots, A_n)$, the goal is to predict class $C$. Specifically, we want to find the value of $C$ that maximizes $P(C \mid A_1, A_2, \ldots, A_n)$. Can we estimate $P(C \mid A_1, A_2, \ldots, A_n)$ directly from data?

32 Bayesian Classifiers Approach: compute the posterior probability $P(C \mid A_1, A_2, \ldots, A_n)$ for all values of $C$ using the Bayes theorem:
$$P(C \mid A_1 A_2 \ldots A_n) = \frac{P(A_1 A_2 \ldots A_n \mid C)\, P(C)}{P(A_1 A_2 \ldots A_n)}$$
Choose the value of $C$ that maximizes $P(C \mid A_1, A_2, \ldots, A_n)$. This is equivalent to choosing the value of $C$ that maximizes $P(A_1, A_2, \ldots, A_n \mid C)\, P(C)$. How to estimate $P(A_1, A_2, \ldots, A_n \mid C)$?

33 Naïve Bayes Classifier Assume independence among attributes $A_i$ when the class is given: $P(A_1, A_2, \ldots, A_n \mid C_j) = P(A_1 \mid C_j)\, P(A_2 \mid C_j) \cdots P(A_n \mid C_j)$. We can estimate $P(A_i \mid C_j)$ for all $A_i$ and $C_j$. A new point is classified to $C_j$ if $P(C_j) \prod_i P(A_i \mid C_j)$ is maximal.

34 How to Estimate Probabilities from Data?
Tid  Refund (categorical)  Marital Status (categorical)  Taxable Income (continuous)  Evade (class)
1    Yes                   Single                        125K                         No
2    No                    Married                       100K                         No
3    No                    Single                        70K                          No
4    Yes                   Married                       120K                         No
5    No                    Divorced                      95K                          Yes
6    No                    Married                       60K                          No
7    Yes                   Divorced                      220K                         No
8    No                    Single                        85K                          Yes
9    No                    Married                       75K                          No
10   No                    Single                        90K                          Yes
Class: $P(C) = N_c / N$, e.g., P(No) = 7/10, P(Yes) = 3/10
For discrete attributes: $P(A_i \mid C_k) = |A_{ik}| / N_{c_k}$, where $|A_{ik}|$ is the number of instances having attribute $A_i$ and belonging to class $C_k$
Examples: P(Status=Married | No) = 4/7, P(Refund=Yes | Yes) = 0

35 How to Estimate Probabilities from Data? For continuous attributes:
Discretize the range into bins: one ordinal attribute per bin; violates the independence assumption.
Two-way split: (A < v) or (A > v); choose only one of the two splits as the new attribute.
Probability density estimation: assume the attribute follows a normal distribution; use the data to estimate the parameters of the distribution (e.g., mean and standard deviation); once the probability distribution is known, use it to estimate the conditional probability $P(A_i \mid c)$.

36 How to Estimate Probabilities from Data? (same training table as on the previous slide) Normal distribution, one for each $(A_i, c_j)$ pair:
$$P(A_i \mid c_j) = \frac{1}{\sqrt{2\pi\sigma_{ij}^2}}\, e^{-\frac{(A_i - \mu_{ij})^2}{2\sigma_{ij}^2}}$$
For (Income, Class=No): sample mean = 110, sample variance = 2975
$$P(\text{Income}=120 \mid \text{No}) = \frac{1}{\sqrt{2\pi}\,(54.54)}\, e^{-\frac{(120-110)^2}{2(2975)}} = 0.0072$$

37 Example of Naïve Bayes Classifier
Given a test record: X = (Refund = No, Marital Status = Married, Income = 120K)
Naïve Bayes classifier:
P(Refund=Yes | No) = 3/7, P(Refund=No | No) = 4/7
P(Refund=Yes | Yes) = 0, P(Refund=No | Yes) = 1
P(Marital Status=Single | No) = 2/7, P(Marital Status=Divorced | No) = 1/7, P(Marital Status=Married | No) = 4/7
P(Marital Status=Single | Yes) = 2/3, P(Marital Status=Divorced | Yes) = 1/3, P(Marital Status=Married | Yes) = 0
For taxable income: if class=No, sample mean = 110 and sample variance = 2975; if class=Yes, sample mean = 90 and sample variance = 25
P(X | Class=No) = P(Refund=No | Class=No) × P(Married | Class=No) × P(Income=120K | Class=No) = 4/7 × 4/7 × 0.0072 = 0.0024
P(X | Class=Yes) = P(Refund=No | Class=Yes) × P(Married | Class=Yes) × P(Income=120K | Class=Yes) = 1 × 0 × (1.2 × 10^-9) = 0
Since P(X | No) P(No) > P(X | Yes) P(Yes), we have P(No | X) > P(Yes | X) => Class = No
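The example can be replayed in code. This is a sketch rather than a reference implementation: it hard-codes the ten training records, counts frequencies for the two categorical attributes, and assumes a normal density with the sample mean and sample variance for income. The names `records`, `gaussian` and `score` are hypothetical.

```python
import math
from statistics import mean, variance

# (Refund, Marital Status, Taxable Income in K, Evade) from the training table.
records = [
    ("Yes", "Single", 125, "No"), ("No", "Married", 100, "No"),
    ("No", "Single", 70, "No"), ("Yes", "Married", 120, "No"),
    ("No", "Divorced", 95, "Yes"), ("No", "Married", 60, "No"),
    ("Yes", "Divorced", 220, "No"), ("No", "Single", 85, "Yes"),
    ("No", "Married", 75, "No"), ("No", "Single", 90, "Yes"),
]


def gaussian(x, mu, var):
    """Normal density with mean mu and variance var, evaluated at x."""
    return math.exp(-((x - mu) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def score(refund, status, income, label):
    """Unnormalised posterior P(X | label) * P(label) under naive Bayes."""
    rows = [r for r in records if r[3] == label]
    prior = len(rows) / len(records)
    p_refund = sum(r[0] == refund for r in rows) / len(rows)
    p_status = sum(r[1] == status for r in rows) / len(rows)
    incomes = [r[2] for r in rows]
    # statistics.variance is the sample variance (divides by n - 1).
    p_income = gaussian(income, mean(incomes), variance(incomes))
    return prior * p_refund * p_status * p_income


scores = {label: score("No", "Married", 120, label) for label in ("No", "Yes")}
predicted = max(scores, key=scores.get)
print(scores, predicted)
```

The score for class Yes is exactly zero because no training record with Evade = Yes is married, which is the zero-probability issue that smoothed estimates address.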

38 Naïve Bayes Classifier If one of the conditional probabilities is zero, then the entire expression becomes zero. Probability estimation:
$$\text{Original: } P(A_i \mid C) = \frac{N_{ic}}{N_c}$$
$$\text{Laplace: } P(A_i \mid C) = \frac{N_{ic} + 1}{N_c + c}$$
$$\text{m-estimate: } P(A_i \mid C) = \frac{N_{ic} + m p}{N_c + m}$$
c: number of classes; p: prior probability; m: parameter
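The three estimators can be compared on the zero-count case P(Refund=Yes | Yes) from the earlier tax example, where N_ic = 0 and N_c = 3. This is a small sketch with hypothetical function names; the values c = 2, m = 3 and p = 1/2 are picked purely for illustration.

```python
from fractions import Fraction


def original_estimate(n_ic, n_c):
    return Fraction(n_ic, n_c)


def laplace_estimate(n_ic, n_c, c):
    # c is the number of classes, as defined on the slide.
    return Fraction(n_ic + 1, n_c + c)


def m_estimate(n_ic, n_c, m, p):
    # m is a tuning parameter and p a prior probability.
    return (n_ic + m * p) / (n_c + m)


n_ic, n_c = 0, 3  # Refund=Yes never occurs among the 3 records of class Yes
plain = original_estimate(n_ic, n_c)
laplace = laplace_estimate(n_ic, n_c, c=2)
smoothed = m_estimate(n_ic, n_c, m=3, p=Fraction(1, 2))
print(plain, laplace, smoothed)
```

Both smoothed estimates are non-zero, so a single unseen attribute value no longer forces the whole product to zero.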

39 Example of Naïve Bayes Classifier
Name           Give Birth  Can Fly  Live in Water  Have Legs  Class
human          yes         no       no             yes        mammals
python         no          no       no             no         non-mammals
salmon         no          no       yes            no         non-mammals
whale          yes         no       yes            no         mammals
frog           no          no       sometimes      yes        non-mammals
komodo         no          no       no             yes        non-mammals
bat            yes         yes      no             yes        mammals
pigeon         no          yes      no             yes        non-mammals
cat            yes         no       no             yes        mammals
leopard shark  yes         no       yes            no         non-mammals
turtle         no          no       sometimes      yes        non-mammals
penguin        no          no       sometimes      yes        non-mammals
porcupine      yes         no       no             yes        mammals
eel            no          no       yes            no         non-mammals
salamander     no          no       sometimes      yes        non-mammals
gila monster   no          no       no             yes        non-mammals
platypus       no          no       no             yes        mammals
owl            no          yes      no             yes        non-mammals
dolphin        yes         no       yes            no         mammals
eagle          no          yes      no             yes        non-mammals
Test record:
Give Birth  Can Fly  Live in Water  Have Legs  Class
yes         no       yes            no         ?
A: attributes, M: mammals, N: non-mammals
P(A | M) = 6/7 × 6/7 × 2/7 × 2/7 = 0.06
P(A | N) = 1/13 × 10/13 × 3/13 × 4/13 = 0.0042
P(A | M) P(M) = 0.06 × 7/20 = 0.021
P(A | N) P(N) = 0.0042 × 13/20 = 0.0027
P(A | M) P(M) > P(A | N) P(N) => Mammals

40 Data Mining Tasks... Association Rule Discovery [Descriptive] Classification [Predictive] Regression [Predictive] Clustering [Descriptive]

41 Regression Predict the value of a given continuous-valued variable based on the values of other variables, assuming a linear or nonlinear model of dependency. Extensively studied in statistics and in the neural network field. Examples: predicting sales amounts of a new product based on advertising expenditure; predicting wind velocities as a function of temperature, humidity, air pressure, etc.; time series prediction of stock market indices.

42 Regression

43 Data Mining Tasks... Association Rule Discovery [Descriptive] Classification [Predictive] Regression [Predictive] Clustering [Descriptive]

44 Clustering Definition A clustering is a set of clusters Given a set of data points, each having a set of attributes, and a similarity measure among them, find clusters such that Data points in one cluster are more similar to one another. Data points in separate clusters are less similar to one another. Similarity Measures: Euclidean Distance if attributes are continuous. Other Problem-specific Measures.

45 Illustrating Clustering Intracluster distances are minimized Intercluster distances are maximized Euclidean Distance Based Clustering in 3-D space.

46 Cluster can be Ambiguous How many clusters? Six Clusters Two Clusters Four Clusters

47 Clustering

48 Types of Clusterings Important distinction between hierarchical, partitional and density-based sets of clusters. Partitional clustering (K-Means): a division of data objects into non-overlapping subsets (clusters) such that each data object is in exactly one subset. Hierarchical clustering (Agglomerative): a set of nested clusters organized as a hierarchical tree. Density clustering (DBSCAN): clusters are regarded as regions in the data space in which the objects are dense, and which are separated by regions of low object density (noise).

49 Partitional Clustering Original Points A Partitional Clustering

50 K-Means Partitional clustering approach Each cluster is associated with a centroid (center point) Each point is assigned to the cluster with the closest centroid Number of clusters, K, must be specified The basic algorithm is very simple

51 K-Means Initial centroids are often chosen randomly, so the clusters produced vary from one run to another. The centroid is (typically) the mean of the points in the cluster. Closeness is usually measured by Euclidean distance, cosine similarity, correlation, etc. K-means will converge for the common similarity measures mentioned above. Most of the convergence happens in the first few iterations. Often the stopping condition is changed to "until relatively few points change clusters".
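The basic algorithm described on these two slides can be sketched in a few lines. This is a minimal illustration, not a production implementation: it uses Euclidean distance, stops when no point changes cluster, and takes an optional `init` argument (an assumption added here) so that a run can be made reproducible. The sample points are made up.

```python
import math
import random


def kmeans(points, k, init=None, max_iter=100, seed=0):
    """Basic K-means; returns (centroids, cluster index of each point)."""
    rng = random.Random(seed)
    # Initial centroids: given explicitly, or k points chosen at random.
    centroids = list(init) if init is not None else rng.sample(points, k)
    assignment = [None] * len(points)
    for _ in range(max_iter):
        # Assignment step: each point goes to its closest centroid.
        new_assignment = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # no point changed cluster
        assignment = new_assignment
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, assignment


points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
centroids, labels = kmeans(points, k=2, init=[(1.0, 1.0), (5.0, 5.0)])
print(centroids, labels)
```

Calling `kmeans(points, k=2)` without `init` picks the starting centroids at random, which is where the run-to-run variation mentioned above comes from.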

52 Importance of Choosing Initial Centroids [Figure: scatter plot of the clusters at one iteration; axes x, y]

53 Importance of Choosing Initial Centroids [Figure: six scatter-plot panels, Iteration 1 to Iteration 6; axes x, y]

54 Hierarchical Clustering Produces a set of nested clusters organized as a hierarchical tree Can be visualized as a dendrogram A tree like diagram that records the sequences of merges or splits

55 Hierarchical Clustering [Figure: four panels over points p1, p2, p3, p4: Traditional Hierarchical Clustering, Traditional Dendrogram, Non-traditional Hierarchical Clustering, Non-traditional Dendrogram]

56 DBSCAN [Figure panels: Original Points, Clusters] Resistant to noise. Can handle clusters of different shapes and sizes.

57 Data Mining: Assessment

58 Assessment Algorithms Supervised Metrics Validation Algorithms Unsupervised Metrics

59 Supervised validation alg. Resubstitution

60 Supervised validation alg. Hold-out

61 Supervised validation alg. N-fold cross validation

62 Supervised validation alg. Leave-one-out (maximum number of folds): N-fold cross validation when N = number of samples in the data
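The fold-based schemes on these slides differ only in how the sample indices are split. A minimal sketch of N-fold cross validation follows, with leave-one-out as the special case where the number of folds equals the number of samples. The function name is hypothetical, and the folds are built without shuffling, which a real experiment would normally add.

```python
def k_fold_indices(n_samples, n_folds):
    """Yield (train_indices, test_indices) for each of the n_folds folds."""
    indices = list(range(n_samples))
    # Fold i takes every n_folds-th index starting at i.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


# 3-fold cross validation over 6 samples.
splits = list(k_fold_indices(6, 3))
for train_idx, test_idx in splits:
    print("train:", train_idx, "test:", test_idx)

# Leave-one-out: as many folds as samples.
loo = list(k_fold_indices(5, 5))
```

Each sample appears in exactly one test fold, so every record is used for validation once and for training in all the other rounds.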

63 Supervised validation alg. Bootstrap

64 Supervised metrics Calibration: distance between the real class and the predicted class. Continuous, in $[0, \infty)$. Discrimination: probability of correct classification. Continuous, in $[0, 1]$. In classification, we want the lowest calibration possible and the highest discrimination possible.

65 Supervised metrics Example: Real class: 1. Predicted class: 0.6 (using regression). Discrimination: 1, supposing that if the predicted value > 0.5 then the predicted class = 1. Calibration: 0.4 (1 - 0.6).

66 Supervised metrics Accuracy (well classified) [Discrimination]; Log Likelihood [Calibration]; AUC [Discrimination]; Brier Score [Calibration + Discrimination]. Hosmer DW, Lemeshow S (2000) Applied Logistic Regression, 2nd edn. Wiley, New York.

67 AUC Area Under the ROC Curve Continuous [0,1]
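Two of the metrics listed above are easy to compute by hand. The sketch below assumes binary labels (0/1) and predicted scores in [0, 1]; it computes AUC as the fraction of (positive, negative) pairs that are ranked correctly, counting ties as half, and the Brier score as the mean squared difference between score and label. The data and function names are made up for illustration.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise ranking of positives vs negatives."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def brier_score(labels, scores):
    """Mean squared error between predicted score and the 0/1 label."""
    return sum((s - y) ** 2 for y, s in zip(labels, scores)) / len(labels)


# Hypothetical real classes and predicted scores.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.6, 0.7, 0.8, 0.2, 0.4]

auc_value = auc(labels, scores)
brier_value = brier_score(labels, scores)
print(auc_value, brier_value)
```

Here 8 of the 9 positive-negative pairs are ordered correctly, so the AUC is 8/9; a higher AUC and a lower Brier score are better.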

68 Unsupervised validation

69 Unsupervised alg. Compactness: the members of each cluster should be as close to each other as possible. A common measure of compactness is the variance, which should be minimized. Separation: the clusters themselves should be widely spaced. There are three common approaches to measuring the distance between two different clusters. Single linkage: measures the distance between the closest members of the clusters. Complete linkage: measures the distance between the most distant members. Comparison of centroids: measures the distance between the centers of the clusters. Halkidi M, Batistakis Y, Vazirgiannis M (2001) On Clustering Validation Techniques. Journal of IIS.
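The three separation measures can be written down directly. This is a minimal sketch assuming Euclidean distance and clusters given as lists of coordinate tuples; the two clusters and the function names are made up.

```python
import math
from itertools import product


def single_linkage(c1, c2):
    # Distance between the closest members of the two clusters.
    return min(math.dist(p, q) for p, q in product(c1, c2))


def complete_linkage(c1, c2):
    # Distance between the most distant members of the two clusters.
    return max(math.dist(p, q) for p, q in product(c1, c2))


def centroid(cluster):
    return tuple(sum(coord) / len(cluster) for coord in zip(*cluster))


def centroid_distance(c1, c2):
    # Distance between the centers of the two clusters.
    return math.dist(centroid(c1), centroid(c2))


cluster_a = [(0.0, 0.0), (1.0, 0.0)]
cluster_b = [(4.0, 0.0), (6.0, 0.0)]

print(single_linkage(cluster_a, cluster_b))
print(complete_linkage(cluster_a, cluster_b))
print(centroid_distance(cluster_a, cluster_b))
```

For these two clusters the centroid distance lies between the single-linkage and complete-linkage values.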

70 Measures of Cluster Validity Numerical measures that are applied to judge various aspects of cluster validity are classified into the following three types. External index: used to measure the extent to which cluster labels match externally supplied class labels (e.g., entropy). Internal index: used to measure the goodness of a clustering structure without respect to external information (e.g., Sum of Squared Error, SSE). Relative index: used to compare two different clusters. Often an external or internal index is used for this function, e.g., SSE or entropy. Halkidi M, Batistakis Y, Vazirgiannis M (2001) On Clustering Validation Techniques. Journal of IIS.

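71 Using Similarity Matrix for Cluster Validation Order the similarity matrix with respect to cluster labels and inspect visually. [Figure: scatter plot (axes x, y) next to the ordered similarity matrix (axes Points, Points; colour scale Similarity); panel label: Complete Link]

72 Using Similarity Matrix for Cluster Validation Clusters in random data are not so crisp. [Figure: scatter plot (axes x, y) next to the ordered similarity matrix (axes Points, Points; colour scale Similarity); panel label: Complete Link]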

73 Using Similarity Matrix for Cluster Validation DBSCAN

74 Santiago González

Advanced classifica-on methods

Advanced classifica-on methods Advanced classifica-on methods Instance-based classifica-on Bayesian classifica-on Instance-Based Classifiers Set of Stored Cases Atr1... AtrN Class A B B C A C B Store the training records Use training

More information

Lecture Notes for Chapter 5 (PART 1)

Lecture Notes for Chapter 5 (PART 1) Data Mining Classification: Alternative Techniques Lecture Notes for Chapter 5 (PART 1) 1 Agenda Rule Based Classifier PART 1 Bayesian Classifier Artificial Neural Network Support Vector Machine PART 2

More information

7 Classification: Naïve Bayes Classifier

7 Classification: Naïve Bayes Classifier CSE4334/5334 Data Mining 7 Classifiation: Naïve Bayes Classifier Chengkai Li Department of Computer Siene and Engineering University of Texas at rlington Fall 017 Slides ourtesy of ang-ning Tan, Mihael

More information

DATA MINING LECTURE 10

DATA MINING LECTURE 10 DATA MINING LECTURE 10 Classification Nearest Neighbor Classification Support Vector Machines Logistic Regression Naïve Bayes Classifier Supervised Learning 10 10 Illustrating Classification Task Tid Attrib1

More information

DATA MINING: NAÏVE BAYES

DATA MINING: NAÏVE BAYES DATA MINING: NAÏVE BAYES 1 Naïve Bayes Classifier Thomas Bayes 1702-1761 We will start off with some mathematical background. But first we start with some visual intuition. 2 Grasshoppers Antenna Length

More information

Classification Part 2. Overview

Classification Part 2. Overview Classification Part 2 Dr. Sanjay Ranka Professor Computer and Information Science and Engineering University of Florida, Gainesville Overview Rule based Classifiers Nearest-neighbor Classifiers Data Mining

More information

Data Mining Classification: Alternative Techniques. Lecture Notes for Chapter 5. Introduction to Data Mining

Data Mining Classification: Alternative Techniques. Lecture Notes for Chapter 5. Introduction to Data Mining Data Mining Classification: Alternative Techniques Lecture Notes for Chapter 5 Introduction to Data Mining by Tan, Steinbach, Kumar Tan,Steinbach, Kumar Introduction to Data Mining 4/18/2004 1 Neural networks

More information

Probabilistic Methods in Bioinformatics. Pabitra Mitra

Probabilistic Methods in Bioinformatics. Pabitra Mitra Probabilistic Methods in Bioinformatics Pabitra Mitra pabitra@cse.iitkgp.ernet.in Probability in Bioinformatics Classification Categorize a new object into a known class Supervised learning/predictive

More information

Lecture Notes for Chapter 5. Introduction to Data Mining

Lecture Notes for Chapter 5. Introduction to Data Mining Data Mining Classification: Alternative Techniques Lecture Notes for Chapter 5 Introduction to Data Mining by Tan, Steinbach, Kumar Tan,Steinbach, Kumar Introduction to Data Mining 4/18/2004 1 Rule-Based

More information

Classification Lecture 2: Methods

Classification Lecture 2: Methods Classification Lecture 2: Methods Jing Gao SUNY Buffalo 1 Outline Basics Problem, goal, evaluation Methods Nearest Neighbor Decision Tree Naïve Bayes Rule-based Classification Logistic Regression Support

More information

Data Mining Classification: Basic Concepts, Decision Trees, and Model Evaluation

Data Mining Classification: Basic Concepts, Decision Trees, and Model Evaluation Data Mining Classification: Basic Concepts, Decision Trees, and Model Evaluation Lecture Notes for Chapter 4 Part I Introduction to Data Mining by Tan, Steinbach, Kumar Adapted by Qiang Yang (2010) Tan,Steinbach,

More information

Machine Learning 2nd Edition

Machine Learning 2nd Edition ETHEM ALPAYDIN, modified by Leonardo Bobadilla and some parts from http://wwwcstauacil/~apartzin/machinelearning/ and wwwcsprincetonedu/courses/archive/fall01/cs302 /notes/11/emppt The MIT Press, 2010

More information

SUPERVISED LEARNING: INTRODUCTION TO CLASSIFICATION

SUPERVISED LEARNING: INTRODUCTION TO CLASSIFICATION SUPERVISED LEARNING: INTRODUCTION TO CLASSIFICATION 1 Outline Basic terminology Features Training and validation Model selection Error and loss measures Statistical comparison Evaluation measures 2 Terminology

More information

Lecture VII: Classification I. Dr. Ouiem Bchir

Lecture VII: Classification I. Dr. Ouiem Bchir Lecture VII: Classification I Dr. Ouiem Bchir 1 Classification: Definition Given a collection of records (training set ) Each record contains a set of attributes, one of the attributes is the class. Find

More information

Data Mining Classification: Basic Concepts and Techniques. Lecture Notes for Chapter 3. Introduction to Data Mining, 2nd Edition

Data Mining Classification: Basic Concepts and Techniques. Lecture Notes for Chapter 3. Introduction to Data Mining, 2nd Edition Data Mining Classification: Basic Concepts and Techniques Lecture Notes for Chapter 3 by Tan, Steinbach, Karpatne, Kumar 1 Classification: Definition Given a collection of records (training set ) Each

More information

Data preprocessing. DataBase and Data Mining Group 1. Data set types. Tabular Data. Document Data. Transaction Data. Ordered Data

Data preprocessing. DataBase and Data Mining Group 1. Data set types. Tabular Data. Document Data. Transaction Data. Ordered Data Elena Baralis and Tania Cerquitelli Politecnico di Torino Data set types Record Tables Document Data Transaction Data Graph World Wide Web Molecular Structures Ordered Spatial Data Temporal Data Sequential

More information

Mining Classification Knowledge

Mining Classification Knowledge Mining Classification Knowledge Remarks on NonSymbolic Methods JERZY STEFANOWSKI Institute of Computing Sciences, Poznań University of Technology SE lecture revision 2013 Outline 1. Bayesian classification

More information

Given a collection of records (training set )

Given a collection of records (training set ) Given a collection of records (training set ) Each record contains a set of attributes, one of the attributes is (always) the class. Find a model for class attribute as a function of the values of other

More information

Principles of Pattern Recognition. C. A. Murthy Machine Intelligence Unit Indian Statistical Institute Kolkata

Principles of Pattern Recognition. C. A. Murthy Machine Intelligence Unit Indian Statistical Institute Kolkata Principles of Pattern Recognition C. A. Murthy Machine Intelligence Unit Indian Statistical Institute Kolkata e-mail: murthy@isical.ac.in Pattern Recognition Measurement Space > Feature Space >Decision

More information

Clustering. CSL465/603 - Fall 2016 Narayanan C Krishnan

Clustering. CSL465/603 - Fall 2016 Narayanan C Krishnan Clustering CSL465/603 - Fall 2016 Narayanan C Krishnan ckn@iitrpr.ac.in Supervised vs Unsupervised Learning Supervised learning Given x ", y " "%& ', learn a function f: X Y Categorical output classification

More information

Final Overview. Introduction to ML. Marek Petrik 4/25/2017

Final Overview. Introduction to ML. Marek Petrik 4/25/2017 Final Overview Introduction to ML Marek Petrik 4/25/2017 This Course: Introduction to Machine Learning Build a foundation for practice and research in ML Basic machine learning concepts: max likelihood,

More information

Clustering Lecture 1: Basics. Jing Gao SUNY Buffalo

Clustering Lecture 1: Basics. Jing Gao SUNY Buffalo Clustering Lecture 1: Basics Jing Gao SUNY Buffalo 1 Outline Basics Motivation, definition, evaluation Methods Partitional Hierarchical Density-based Mixture model Spectral methods Advanced topics Clustering

More information

Data Mining: Data. Lecture Notes for Chapter 2. Introduction to Data Mining

Data Mining: Data. Lecture Notes for Chapter 2. Introduction to Data Mining Data Mining: Data Lecture Notes for Chapter 2 Introduction to Data Mining by Tan, Steinbach, Kumar 1 Types of data sets Record Tables Document Data Transaction Data Graph World Wide Web Molecular Structures

More information

Oliver Dürr. Statistisches Data Mining (StDM) Woche 11. Institut für Datenanalyse und Prozessdesign Zürcher Hochschule für Angewandte Wissenschaften

Oliver Dürr. Statistisches Data Mining (StDM) Woche 11. Institut für Datenanalyse und Prozessdesign Zürcher Hochschule für Angewandte Wissenschaften Statistisches Data Mining (StDM) Woche 11 Oliver Dürr Institut für Datenanalyse und Prozessdesign Zürcher Hochschule für Angewandte Wissenschaften oliver.duerr@zhaw.ch Winterthur, 29 November 2016 1 Multitasking

More information

Data Mining algorithms

Data Mining algorithms Data Mining algorithms 2017-2018 spring 02.07-09.2018 Overview Classification vs. Regression Evaluation I Basics Bálint Daróczy daroczyb@ilab.sztaki.hu Basic reachability: MTA SZTAKI, Lágymányosi str.

More information

Modern Information Retrieval

Modern Information Retrieval Modern Information Retrieval Chapter 8 Text Classification Introduction A Characterization of Text Classification Unsupervised Algorithms Supervised Algorithms Feature Selection or Dimensionality Reduction

More information

Data Mining: Data. Lecture Notes for Chapter 2. Introduction to Data Mining

Data Mining: Data. Lecture Notes for Chapter 2. Introduction to Data Mining Data Mining: Data Lecture Notes for Chapter 2 Introduction to Data Mining by Tan, Steinbach, Kumar Tan,Steinbach, Kumar Introduction to Data Mining 4/18/2004 1 10 What is Data? Collection of data objects

More information

Part I. Linear regression & LASSO. Linear Regression. Linear Regression. Week 10 Based in part on slides from textbook, slides of Susan Holmes

Part I. Linear regression & LASSO. Linear Regression. Linear Regression. Week 10 Based in part on slides from textbook, slides of Susan Holmes Week 10 Based in part on slides from textbook, slides of Susan Holmes Part I Linear regression & December 5, 2012 1 / 1 2 / 1 We ve talked mostly about classification, where the outcome categorical. If

More information

Statistics 202: Data Mining. c Jonathan Taylor. Week 6 Based in part on slides from textbook, slides of Susan Holmes. October 29, / 1

Statistics 202: Data Mining. c Jonathan Taylor. Week 6 Based in part on slides from textbook, slides of Susan Holmes. October 29, / 1 Week 6 Based in part on slides from textbook, slides of Susan Holmes October 29, 2012 1 / 1 Part I Other classification techniques 2 / 1 Rule based classifiers Rule based classifiers Examples: if Refund==No

More information

9/12/17. Types of learning. Modeling data. Supervised learning: Classification. Supervised learning: Regression. Unsupervised learning: Clustering

9/12/17. Types of learning. Modeling data. Supervised learning: Classification. Supervised learning: Regression. Unsupervised learning: Clustering Types of learning Modeling data Supervised: we know input and targets Goal is to learn a model that, given input data, accurately predicts target data Unsupervised: we know the input only and want to make

More information

CS145: INTRODUCTION TO DATA MINING

CS145: INTRODUCTION TO DATA MINING CS145: INTRODUCTION TO DATA MINING 4: Vector Data: Decision Tree Instructor: Yizhou Sun yzsun@cs.ucla.edu October 10, 2017 Methods to Learn Vector Data Set Data Sequence Data Text Data Classification Clustering

More information

CS626 Data Analysis and Simulation

CS626 Data Analysis and Simulation CS626 Data Analysis and Simulation Instructor: Peter Kemper R 104A, phone 221-3462, email:kemper@cs.wm.edu Today: Data Analysis: A Summary Reference: Berthold, Borgelt, Hoeppner, Klawonn: Guide to Intelligent

More information

University of Florida CISE department Gator Engineering. Clustering Part 1

University of Florida CISE department Gator Engineering. Clustering Part 1 Clustering Part 1 Dr. Sanjay Ranka Professor Computer and Information Science and Engineering University of Florida, Gainesville What is Cluster Analysis? Finding groups of objects such that the objects

More information

Text Mining. Dr. Yanjun Li. Associate Professor. Department of Computer and Information Sciences Fordham University

Text Mining. Dr. Yanjun Li. Associate Professor. Department of Computer and Information Sciences Fordham University Text Mining Dr. Yanjun Li Associate Professor Department of Computer and Information Sciences Fordham University Outline Introduction: Data Mining Part One: Text Mining Part Two: Preprocessing Text Data

More information

Data Mining. Preamble: Control Application. Industrial Researcher s Approach. Practitioner s Approach. Example. Example. Goal: Maintain T ~Td

Data Mining. Preamble: Control Application. Industrial Researcher s Approach. Practitioner s Approach. Example. Example. Goal: Maintain T ~Td Data Mining Andrew Kusiak 2139 Seamans Center Iowa City, Iowa 52242-1527 Preamble: Control Application Goal: Maintain T ~Td Tel: 319-335 5934 Fax: 319-335 5669 andrew-kusiak@uiowa.edu http://www.icaen.uiowa.edu/~ankusiak

More information

CS6220: DATA MINING TECHNIQUES
