
Assignment No A-05

Aim
Implement Naive Bayes to predict the work type for a person.

Pre-requisite
1. Probability.
2. Scikit-Learn Python library.
3. Programming language basics.

Objective
1. To learn the basic concepts of the Naive Bayes classifier.
2. To implement an application using the Naive Bayes classifier for the given problem.

Problem Statement
Implement Naive Bayes to predict the work type for a person with the following parameters: age 30, Qualification MTech, Experience 8. The following table provides the details of the available data:

Work Type     Age   Qualification   Experience
Consultancy   30    Ph.D.           9
Service       21    MTech           1
Research      26    MTech           2
Service       28    BTech           10
Consultancy   40    MTech           14
Research      35    Ph.D.           10
Research      27    BTech           6
Service       32    MTech           9
Consultancy   45    BTech           17
Research      36    Ph.D.           7

Hardware / Software Used
1. Python
2. Scikit-Learn library
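The program at the end of this write-up reads the training data from a CSV file in which the qualification is encoded as an integer (the prediction call uses 2 for MTech). The snippet below is one possible way to generate such a file; the file name A5_nb_data.csv, the column order (age, qualification, experience, work type) and the encoding BTech = 1, MTech = 2, Ph.D. = 3 are assumptions made for illustration, not values given in the problem statement.

# Write the training data to a CSV file in the format the program expects.
# Assumed encoding (not specified in the assignment): BTech = 1, MTech = 2, Ph.D. = 3.
rows = [
    (30, 3, 9,  "Consultancy"),
    (21, 2, 1,  "Service"),
    (26, 2, 2,  "Research"),
    (28, 1, 10, "Service"),
    (40, 2, 14, "Consultancy"),
    (35, 3, 10, "Research"),
    (27, 1, 6,  "Research"),
    (32, 2, 9,  "Service"),
    (45, 1, 17, "Consultancy"),
    (36, 3, 7,  "Research"),
]
with open("A5_nb_data.csv", "w") as f:
    for age, qual, exp, work in rows:
        f.write("%d,%d,%d,%s\n" % (age, qual, exp, work))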

Mathematical Model

M = { s, e, X, Y, DD, NDD, Fme, Memshared, success, failure, CPUCoreCount }

1. s = Start State - Open the dataset file and read its contents.
2. e = End State - Work type predicted.
3. X = Input - Given dataset.
4. Y = Output - Prediction of the work type for the given problem.
5. DD = Deterministic Data - Work Type, Age, Qualification, Experience.
6. NDD = Non-Deterministic Data - Truncation or rounding errors in calculations.
7. Fme = GaussianNB(), fit(input, output), predict([30, 2, 8]).
8. Memshared = No shared memory is used.
9. Success = Work type predicted.
10. Failure = Work type not predicted.
11. CPUCoreCount = 1.

Theory

The objective of classification is to analyze the input data and to develop an accurate description or model for each class using the features present in the data. This model is then used to classify test data for which the class descriptions are not known. The input data, also called the training set, consists of multiple records, each having multiple attributes or features; each record is tagged with a class label. The data analysis task is called classification when a model or classifier is constructed to predict categorical labels (e.g. safe or risky, yes or no).

Data classification is a two-step process:

1. Learning: In this step, a classifier is built describing a predetermined set of data classes or concepts. This first step of the classification process can also be viewed as the learning of a mapping or function, y = f(X), that can predict the associated class label y of a given tuple X. This mapping is represented in the form of classification rules, decision trees or mathematical formulae.

2. Classification: The model created in the previous step is used for classification. Test data are used to estimate the accuracy of the classification rules. If the accuracy is considered acceptable, the rules can be applied to the classification of new data tuples.

Bayesian classifiers are statistical classifiers. They can predict class membership probabilities, such as the probability that a given tuple belongs to a particular class. Bayesian classification is based on Bayes' theorem, and Bayesian classifiers have exhibited high accuracy and speed when applied to large databases. Bayesian classifiers assume that the effect of an attribute value on a given class is independent of the values of the other attributes. This assumption is called class conditional independence. It is made to simplify the computations involved and, in this sense, is considered naive (hence the Bayesian classifier is also called the Naive Bayes classifier).

Bayes theorem

Bayes' theorem is named after Thomas Bayes, who did early work in probability and decision theory during the 18th century. Let X be a data tuple. In Bayesian terms, X is considered evidence. Let H be some hypothesis, such as that the data tuple X belongs to a specified class C. For classification problems, we want to determine P(H|X), the probability that the hypothesis H holds given the evidence or observed data tuple X. In other words, we are looking for the probability that tuple X belongs to class C, given that we know the attribute description of X.

1) P(H|X) is the posterior probability, or a posteriori probability, of H conditioned on X, i.e. the probability of H given X. For example, P(H|X) reflects the probability that customer X will buy a computer given that we know the customer's age and income.

2) P(X|H) is the posterior probability of X conditioned on H, i.e. the probability of X given H. That is, it is the probability that a customer X is 35 years old and earns $40,000, given that we know the customer will buy a computer.

3) P(H) is the prior probability, or a priori probability, of H. For example, this is the probability that any given customer will buy a computer, regardless of age, income, or any other information, for that matter.

4) P(X) is the prior probability of X. For example, it is the probability that a person from our set of customers is 35 years old and earns $40,000.

Bayes' theorem is useful in that it provides a way of calculating the posterior probability P(H|X) from P(H), P(X|H), and P(X). Bayes' theorem is given below:

P(H|X) = P(X|H) P(H) / P(X)
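As a small numerical illustration of the theorem (the figures below are made-up values for the buys-computer example, not taken from the assignment data), the posterior can be computed directly from the three quantities on the right-hand side:

# Hypothetical probabilities for the buys-computer example (illustration only).
p_h = 0.5          # P(H): prior probability that a customer buys a computer
p_x_given_h = 0.3  # P(X|H): probability that a buyer is 35 years old and earns $40,000
p_x = 0.2          # P(X): probability that any customer is 35 years old and earns $40,000

# Bayes' theorem: P(H|X) = P(X|H) * P(H) / P(X)
p_h_given_x = p_x_given_h * p_h / p_x
print(p_h_given_x)  # 0.75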

Bayesian classifier

The naive Bayesian classifier, or simple Bayesian classifier, works as follows:

1) Let D be a training set of tuples and their associated class labels, where each tuple is represented by an n-dimensional attribute vector X = (x1, x2, ..., xn).

2) Suppose there are m classes C1, C2, ..., Cm. Given a tuple X, the classifier will predict that X belongs to the class having the highest posterior probability conditioned on X. That is, the naive Bayesian classifier predicts that tuple X belongs to the class Ci if and only if

P(Ci|X) > P(Cj|X)  for 1 ≤ j ≤ m, j ≠ i.

Thus we maximize P(Ci|X). The class Ci for which P(Ci|X) is maximized is called the maximum posteriori hypothesis. By Bayes' theorem,

P(Ci|X) = P(X|Ci) P(Ci) / P(X)

3) As P(X) is constant for all classes, only P(X|Ci) P(Ci) needs to be maximized. If the class prior probabilities are not known, then it is commonly assumed that the classes are equally likely, that is, P(C1) = P(C2) = ... = P(Cm), and we would therefore maximize P(X|Ci). Otherwise, we maximize P(X|Ci) P(Ci).

4) Given data sets with many attributes, it would be extremely computationally expensive to compute P(X|Ci). To reduce the computation, the naive assumption of class conditional independence is made:

P(X|Ci) = P(x1|Ci) × P(x2|Ci) × ... × P(xn|Ci)

Here xk refers to the value of attribute Ak for tuple X.
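As a small illustration of this factorisation and of the decision rule in step 2, the sketch below uses purely hypothetical per-attribute probabilities (placeholders, not values from the assignment):

# Hypothetical per-attribute conditional probabilities for two classes (illustration only).
cond = {
    "C1": {"age": 0.05, "qualification": 0.30, "experience": 0.08},
    "C2": {"age": 0.02, "qualification": 0.50, "experience": 0.10},
}
prior = {"C1": 0.6, "C2": 0.4}

scores = {}
for c in cond:
    # Class conditional independence: P(X|Ci) is the product of the per-attribute terms.
    p_x_given_c = 1.0
    for p in cond[c].values():
        p_x_given_c *= p
    # Unnormalised posterior: P(X|Ci) * P(Ci).
    scores[c] = p_x_given_c * prior[c]

# Predict the class with the maximum posterior (the maximum posteriori hypothesis).
print(max(scores, key=scores.get))  # C1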

For each attribute, we look at whether the attribute is categorical or continuous-valued. For instance, to compute P(X|Ci), we consider the following:

a) If Ak is categorical, then P(xk|Ci) is the number of tuples of class Ci in D having the value xk for Ak, divided by |Ci,D|, the number of tuples of class Ci in D.

b) If Ak is continuous-valued, then the attribute is typically assumed to have a Gaussian distribution with mean µ and standard deviation σ, defined by

g(x, µ, σ) = (1 / (√(2π) σ)) exp(−(x − µ)² / (2σ²))

so that P(xk|Ci) = g(xk, µCi, σCi), where µCi and σCi are the mean and standard deviation of attribute Ak for the tuples of class Ci.

5) In order to predict the class label of X, P(X|Ci) P(Ci) is evaluated for each class Ci. The classifier predicts that the class label of tuple X is the class Ci if and only if P(X|Ci) P(Ci) > P(X|Cj) P(Cj) for 1 ≤ j ≤ m, j ≠ i. In other words, the predicted class label is the class Ci for which P(X|Ci) P(Ci) is the maximum.

The following formulas are used for predicting the work type for a person:

1. Probability density formula: the Gaussian density g(x, µ, σ) given above.
2. Formula for calculating the mean: µ = (1/n) Σ xi.
3. Formula for calculating the standard deviation: σ = √( Σ (xi − µ)² / (n − 1) ).
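These three formulas translate directly into a few lines of Python. The sketch below assumes the sample (n − 1) form of the standard deviation; the helper name gaussian_pdf is my own choice for illustration and is not part of scikit-learn.

import math
from statistics import mean, stdev  # stdev uses the sample (n - 1) formula

def gaussian_pdf(x, mu, sigma):
    # Gaussian probability density g(x, mu, sigma).
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)

# Example: ages of the Research tuples from the training data.
research_ages = [26, 35, 27, 36]
mu = mean(research_ages)            # 31.0
sigma = stdev(research_ages)        # approx. 5.23
print(gaussian_pdf(30, mu, sigma))  # approx. 0.075, i.e. P(age = 30 | Research)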

Using the Naive Bayes classifier, the given problem (age 30, qualification MTech, experience 8) can be solved as below.

1. Calculate the mean and standard deviation of the continuous attributes for each class:

Mean of age, Consultancy = 38.33
Standard deviation of age, Consultancy = 7.63
Mean of age, Service = 27
Standard deviation of age, Service = 7.87
Mean of age, Research = 31
Standard deviation of age, Research = 5.23
Mean of experience, Consultancy = 13.33
Standard deviation of experience, Consultancy = 4.04
Mean of experience, Service = 6.66
Standard deviation of experience, Service = 4.93
Mean of experience, Research = 6.25
Standard deviation of experience, Research = 3.30

2. Calculate the prior probability of each class:

Prior probability of Work Type Service = 0.3
Prior probability of Work Type Consultancy = 0.3
Prior probability of Work Type Research = 0.4

3. Calculate the conditional probabilities of the given attribute values for each class:

P(age = 30 | Consultancy) = 0.011
P(age = 30 | Service) = 0.047
P(age = 30 | Research) = 0.075
P(experience = 8 | Consultancy) = 0.041
P(experience = 8 | Service) = 0.077
P(experience = 8 | Research) = 0.104
P(Qualification = MTech | Consultancy) = 0.33
P(Qualification = MTech | Service) = 0.66
P(Qualification = MTech | Research) = 0.25

4. Multiply the conditional probabilities and the prior to give the (unnormalised) posterior probability of each work type:

(a) Consultancy = 0.011 * 0.041 * 0.33 * 0.3 = 0.0000447
(b) Service = 0.047 * 0.077 * 0.66 * 0.3 = 0.0007165
(c) Research = 0.075 * 0.104 * 0.25 * 0.4 = 0.00078

The predicted work type is RESEARCH.

Procedure
Execution of program: python NaiveBays.py

Conclusion
Thus we have implemented the Naive Bayes classifier in Python using sklearn.
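For reference, the multiplication in step 4 can be reproduced with a short script that takes the prior and conditional probabilities listed above as given; this is a sketch of the hand calculation only, separate from the scikit-learn program below.

# Prior and conditional probabilities taken from the hand calculation above.
prior = {"Consultancy": 0.3, "Service": 0.3, "Research": 0.4}
cond = {
    "Consultancy": [0.011, 0.041, 0.33],  # P(age=30|C), P(exp=8|C), P(MTech|C)
    "Service":     [0.047, 0.077, 0.66],
    "Research":    [0.075, 0.104, 0.25],
}

posterior = {}
for work_type in prior:
    score = prior[work_type]
    for p in cond[work_type]:
        score *= p
    posterior[work_type] = score

print(posterior)
print("Predicted work type:", max(posterior, key=posterior.get))  # Research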

Program

# =================================================
# GROUP A
# Assignment No : A5
# Title : Implement Naive Bayes to predict the work type for a person.
# Roll No :
# Batch : B
# Class : BE ( Computer )
# =================================================

import numpy as np
from sklearn.naive_bayes import GaussianNB

data = []
x = []  # feature vectors: [age, qualification, experience]
y = []  # class labels: work type

# Read the training data from the CSV file.
raw_file = open('A5_nb_data.csv', 'r')
lines = raw_file.readlines()
for line in lines:
    splitted = line.split(',')
    splitted[3] = splitted[3].replace('\n', '')
    data.append(splitted)

# Split each record into integer features and the class label.
for i in data:
    temp = []
    for a in i[:3]:
        temp.append(int(a))
    x.append(temp)
    y.append(i[3])

input = np.array(x)
output = np.array(y)

# Train a Gaussian Naive Bayes classifier and predict the work type
# for age 30, qualification 2 (MTech), experience 8.
c = GaussianNB()
c.fit(input, output)
print(c.predict([[30, 2, 8]]))

Output

administrator@administrator-optiplex-390:~/desktop/cl2$ python A5_nb.py
['Research']

Plagiarism Score