CSE-4412(M) Midterm
22 February 2007

Sur / Last Name:
Given / First Name:
Student ID:

Instructor: Parke Godfrey
Exam Duration: 75 minutes
Term: Winter 2007

Answer the following questions to the best of your knowledge. Be precise and be careful. The exam is closed-book and closed-notes. Write any assumptions you need to make along with your answers, whenever necessary.

There are five major questions, each worth 10 points, for a total of 50 points. Points for each sub-question are as indicated. If you need additional space for an answer, just indicate clearly where you are continuing.

Regrade Policy
Regrading should only be requested in writing. Write what you would like to have reconsidered. Note, however, that an exam accepted for regrading will be reviewed and regraded in its entirety (all questions).

Grading Box
    1.      /10
    2.      /10
    3.      /10
    4.      /10
    5.      /10
    Total   /50

1. (10 points) Misc. Don't eat the Smarties™!
Calculation

a. (3 points) John randomly takes a smartie from one of the bowls A, B, or C. (He randomly chose a bowl, then randomly chose a smartie.) We saw that the one he took, and then ate, was red. We also somehow know what the distribution of the smarties in the bowls was before John took one:

             A    B    C   total
    red     10   20   30      60
    blue    40   30   20      90
    total   50   50   50     150

We learn that, unfortunately, all the smarties in bowl A are poisoned! (The ones in B and C are fine.) What is the probability that John has been poisoned? State the conditional probability that represents this and calculate the probability as a fraction.

b. (3 points) We learn the following about students and the salaries they earn in their first jobs after graduating, where high means at least $200k and low means less than $200k. Students who took the data mining course are represented by dm and those who did not by ¬dm.

             low   high   total
    dm        70     30     100
    ¬dm      830     70     900
    total    900    100    1000

What is the lift of dm & high? Calculate the number.
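For reference, a minimal sketch of a lift computation in the sense of part (b), assuming the standard definition lift(A, B) = P(A and B) / (P(A) P(B)); the function and the counts in the example call are hypothetical, not the question's numbers.

    # Sketch: lift of A & B from contingency-table counts, using
    # lift(A, B) = P(A and B) / (P(A) * P(B)).
    def lift(n_ab, n_a, n_b, n_total):
        p_ab = n_ab / n_total      # P(A and B)
        p_a = n_a / n_total        # P(A)
        p_b = n_b / n_total        # P(B)
        return p_ab / (p_a * p_b)

    # Hypothetical counts: 40 of 400 records contain both A and B,
    # 80 contain A, and 100 contain B.
    print(lift(n_ab=40, n_a=80, n_b=100, n_total=400))   # prints 2.0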

c. (4 points) Consider that the following

    I1  I2  I3        I1  I2  I3
    A   B   C         B   C   E
    A   C   D         B   C   F
    A   C   E         B   D   F
    A   C   F         C   D   E
    A   D   F         C   E   F
    A   E   F         D   E   F

were the frequent 3-itemsets found by the Apriori algorithm (each row lists one itemset's items I1, I2, I3). Show the 4-itemset pre-candidates that would be generated by the join step. Which of these pre-candidates would then be eliminated by the apriori pruning step (leaving the actual candidates)?
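For reference, a minimal sketch of the generic Apriori join and prune steps that part (c) refers to; the helper name apriori_gen and the tiny 2-itemset example are hypothetical, not the question's data.

    # Sketch: Apriori candidate generation.
    # Join: merge two frequent k-itemsets that agree on their first k-1 items.
    # Prune: drop any pre-candidate that has an infrequent k-subset.
    from itertools import combinations

    def apriori_gen(frequent_k, k):
        level = sorted(tuple(sorted(s)) for s in frequent_k)
        freq_set = set(level)
        pre_candidates = []
        for i in range(len(level)):
            for j in range(i + 1, len(level)):
                a, b = level[i], level[j]
                if a[:k - 1] == b[:k - 1]:            # join step
                    pre_candidates.append(a + (b[k - 1],))
        candidates = [c for c in pre_candidates       # prune step
                      if all(sub in freq_set for sub in combinations(c, k))]
        return pre_candidates, candidates

    # Hypothetical example with frequent 2-itemsets:
    pre, cand = apriori_gen([('A', 'B'), ('A', 'C'), ('B', 'C'), ('B', 'D')], k=2)
    # pre  == [('A', 'B', 'C'), ('B', 'C', 'D')]
    # cand == [('A', 'B', 'C')]  -- ('B','C','D') is pruned: ('C','D') is not frequent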

2. (10 points) Association Rule Mining. No, I'm not associating with you.
Analysis

An association rule can be generated as follows: For a given frequent itemset L, generate all nonempty, proper subsets of L. For every nonempty, proper subset S of L, output the rule

    S ⇒ (L - S)    if    support(L) / support(S) ≥ θ_minconf

(in which θ_minconf is the minimum confidence threshold).

a. (5 points) For a frequent itemset L of size k, how many rules should be tested under this method?

b. (5 points) By apriori, we know that for any nonempty subset S′ of S, support(S′) ≥ support(S). Given frequent itemset L and subset S of L, prove that

    confidence(S′ ⇒ (L - S′)) ≤ confidence(S ⇒ (L - S)),    for S′ ⊆ S.
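For reference, a minimal sketch of the rule-generation procedure described at the top of this question, assuming support counts for all frequent itemsets are available in a dictionary keyed by frozenset (the function name and arguments are illustrative):

    # Sketch: for a frequent itemset L, emit S => (L - S) for every nonempty,
    # proper subset S whose confidence support(L)/support(S) meets the threshold.
    from itertools import combinations

    def generate_rules(L, support, min_conf):
        L = frozenset(L)
        rules = []
        for r in range(1, len(L)):                    # sizes of nonempty, proper subsets
            for S in map(frozenset, combinations(L, r)):
                conf = support[L] / support[S]        # confidence(S => L - S)
                if conf >= min_conf:
                    rules.append((set(S), set(L - S), conf))
        return rules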

3. (10 points) General. Rock, scissors, paper...
Multiple Choice

Each question below is worth one point. Choose one best answer for each.

a. Consider the following statements about strong association rules.

    I.   If A ⇒ B, then B ⇒ A.
    II.  If A ⇒ B and B ⇒ C, then A ⇒ C.
    III. If A ⇒ C, then A ∪ B ⇒ C.
    IV.  If A ∪ B ⇒ C, then A ⇒ C.

Which of the above statements are true for any A, B, and C?

    A. I & II    B. I, II, & III    C. I, II, & IV    D. II & III    E. none

Assume that the largest frequent itemset is of size k.

b. How many passes does the Apriori algorithm need in the worst case?

    A. k - 1    B. k    C. k + 1    D. k^2    E. 2k    F. 2^k - 1

c. There are at least how many frequent itemsets?

    A. k - 1    B. k    C. k + 1    D. k^2    E. 2k    F. 2^k - 1

d. Consider pass i of the Apriori algorithm, in which the frequent i-itemsets are being found. Let n be the number of frequent items. We know that a transaction T can be eliminated during pass i if it is found that T does not contain at least some number of candidate i-itemsets. (T then could never support any candidate j-itemsets for j > i.) So, we will throw away T if it does not contain at least x candidate i-itemsets. What is the largest value for x that we can safely use?

    A. 1    B. 2    C. i - 1    D. i    E. i + 1    F. n

e. Which of the following is true?

    A. If A ⇒ B is an association rule, A and B are positively correlated.
    B. If A ⇒ B is an association rule, A and B are at least not negatively correlated.
    C. If both A ⇒ B and B ⇒ A are association rules, A and B are positively correlated.
    D. If A and B are correlated, then A ⇒ B is a strong association rule.
    E. Association does not imply correlation.

f. Apriori pruning in the search for frequent itemsets works because

    A. support count is monotonic with respect to itemsets.
    B. support count is anti-monotonic with respect to itemsets.
    C. support count diverges as we add to the itemset.
    D. we search in transaction-ID order.
    E. it is an excellent heuristic, but it does not work 100% of the time.

g. Naïve Bayesian classification

    A. is guaranteed never to misclassify any of its training data.
    B. needs no prior probabilities.
    C. is theoretically optimal with respect to minimizing classification error, modulo the conditional independence assumption.
    D. works well even with very little training data.
    E. cannot be made to work with continuous attributes.

Consider that we have four classes A, B, C, & D of 100 items each that constitute our sample set.

h. The expected information to classify a given sample is

    A. 0    B. 1/2    C. 1    D. 2    E. π

Keep considering the four classes A, B, C, & D of 100 items each. Consider also a boolean attribute b (let t denote true and f denote false) such that

    s_{A,t} = 0      s_{B,t} = 50    s_{C,t} = 50    s_{D,t} = 100
    s_{A,f} = 100    s_{B,f} = 50    s_{C,f} = 50    s_{D,f} = 0

i. The entropy of b with respect to the sample set is

    A. 0    B. 1    C. 1 1/2    D. 2    E. π

j. Gain(b) is

    A. 0    B. 1/2    C. 1    D. 1 1/2    E. 2    F. π
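For reference, a minimal sketch of the standard expected-information (entropy) and information-gain computations that parts (h) through (j) rely on; it is generic, and the example counts at the end are hypothetical rather than the exam's.

    # Sketch: Info(D) and Gain(attribute) from class counts.
    from math import log2

    def info(counts):
        """Expected information I(c1, ..., cm) = -sum_i p_i * log2(p_i)."""
        total = sum(counts)
        return -sum((c / total) * log2(c / total) for c in counts if c > 0)

    def gain(class_counts, partition):
        """Gain = Info(D) - sum_j (|D_j|/|D|) * Info(D_j), where `partition`
        gives the per-class counts D_j for each value of the attribute."""
        total = sum(class_counts)
        expected = sum((sum(dj) / total) * info(dj) for dj in partition)
        return info(class_counts) - expected

    # Hypothetical example: two equal classes split perfectly by a boolean attribute.
    print(info([50, 50]))                       # 1.0 bit
    print(gain([50, 50], [[50, 0], [0, 50]]))   # 1.0 bit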

4. (10 points) Bayesian Classification. Hey, I'm not that naïve.
Exercise

    outlook     temperature   humidity   windy?   play?
    rainy       cool          normal     Y        no
    rainy       cool          normal     N        yes
    rainy       mild          high       Y        no
    rainy       mild          high       N        yes
    rainy       mild          normal     N        yes
    overcast    cool          normal     Y        yes
    overcast    cool          high       Y        no
    overcast    mild          high       Y        yes
    overcast    hot           high       N        yes
    overcast    hot           normal     N        yes
    sunny       cool          normal     N        yes
    sunny       mild          high       N        no
    sunny       mild          normal     Y        yes
    sunny       hot           high       Y        no
    sunny       hot           high       N        no

a. (4 points) Calculate P(C_i | X = ⟨sunny, hot, high, N⟩). How would the naïve Bayes classifier classify the data instance X = ⟨sunny, hot, high, N⟩? Does this agree with the classification given for X = ⟨sunny, hot, high, N⟩ in the table? Justify your answer via calculations.

b. (3 points) Consider a new data instance X = ⟨overcast, cool, high, Y⟩. How would the naïve Bayes classifier classify X? Again, justify your answer via calculations.

c. (3 points) What is a Laplacian correction? Would it be needed for this dataset for any classification calculations?
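For reference, a minimal sketch of naïve Bayesian classification with an optional Laplacian correction, in the generic form that parts (a) through (c) refer to; the data structures and the `laplace` parameter are illustrative, and this is not a worked answer for the table above.

    # Sketch: train counts for a naive Bayes classifier and classify an instance,
    # optionally applying a Laplacian correction to the conditional probabilities.
    from collections import Counter, defaultdict

    def train(rows, labels):
        class_counts = Counter(labels)
        value_counts = defaultdict(Counter)   # (class, attribute index) -> value counts
        domains = defaultdict(set)            # attribute index -> distinct values seen
        for row, c in zip(rows, labels):
            for i, v in enumerate(row):
                value_counts[(c, i)][v] += 1
                domains[i].add(v)
        return class_counts, value_counts, domains

    def classify(x, class_counts, value_counts, domains, laplace=0):
        n = sum(class_counts.values())
        best_class, best_score = None, -1.0
        for c, cc in class_counts.items():
            score = cc / n                                    # prior P(C_i)
            for i, v in enumerate(x):
                # Laplacian correction: add `laplace` to every count so an
                # unseen (value, class) pair does not zero out the product.
                num = value_counts[(c, i)][v] + laplace
                den = cc + laplace * len(domains[i])
                score *= num / den                            # P(x_i | C_i)
            if score > best_score:
                best_class, best_score = c, score
        return best_class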

5. (10 points) Decision Tree & Rule Induction. Break all the rules!
Short Answer

a. (3 points) Is the ID3 decision tree induction algorithm guaranteed to find an optimal tree (that is, a tree that best classifies the training tuples over all possible trees)? Why or why not?

b. (2 points) When designing a sequential rule induction algorithm, one needs a rule quality measure to choose the next best candidate rule. What two things should this measure balance?
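For reference, ID3 selects each split greedily by information gain; below is a minimal, self-contained sketch of that selection step (the function names and data layout are illustrative, not a specific implementation referenced by the exam).

    # Sketch: ID3's split choice picks the attribute with the highest information
    # gain on the data at the current node, with no lookahead.
    from collections import Counter, defaultdict
    from math import log2

    def entropy(labels):
        total = len(labels)
        return -sum((c / total) * log2(c / total) for c in Counter(labels).values())

    def best_attribute(rows, labels, attributes):
        def gain(i):
            groups = defaultdict(list)
            for row, y in zip(rows, labels):
                groups[row[i]].append(y)
            expected = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
            return entropy(labels) - expected
        return max(attributes, key=gain)   # locally best split only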

c. (3 points) Why is a conflict resolution strategy often necessary for rule-based classifiers?

d. (2 points) List three common conflict resolution strategies for rule-based classifiers.

(Scratch space.)

Relax. Turn in your exam. Go home.