Alternative Approach to Mining Association Rules


Jan Rauch 1, Milan Šimůnek 1,2
1 Faculty of Informatics and Statistics, University of Economics, Prague, Czech Republic
2 Institute of Computer Sciences, Czech Academy of Sciences, Czech Republic
rauch@vse.cz, simunek@{vse.cz, cs.cas.cz}

Abstract

An alternative approach to mining association rules is described. Special techniques and algorithms are used that lead to a much richer syntax of association rules, while the computation time depends only linearly on the number of rows of the analysed data. The free and open system LISp-Miner implements these algorithms and can serve as a demonstration of the techniques used. The same techniques can be used in other kinds of mining, e.g. multi-relational mining and conditional frequency analysis.

1. Introduction

An association rule is commonly understood as an expression of the form X → Y, where X and Y are sets of items. The intuitive meaning is that transactions (e.g. supermarket baskets) containing the set X of items tend to contain the set Y of items. Two measures of the intensity of an association rule are used, confidence and support. An association rule discovery task is the task of finding all association rules of the form X → Y such that the support and confidence of X → Y are above the user-defined thresholds minsup and minconf.

The conventional algorithm for association rule discovery proceeds in two steps. All frequent itemsets are found in the first step. A frequent itemset is an itemset that is included in at least minsup transactions. The association rules with confidence of at least minconf are generated in the second step [1].

Particular items can be represented by Boolean attributes, and a Boolean data matrix can represent the whole set of transactions. The algorithm can be modified to deal with attributes with more than two values. Thus, association rules of the form e.g. A(a_1) ∧ B(b_3) → C(c_7) can be mined. We suppose that the attribute A has k particular values a_1, ..., a_k.
The expression A(a_1) denotes the Boolean attribute that is true if the value of the attribute A is a_1, etc.

The goal of this paper is to draw attention to an alternative approach to mining association rules, based on representing each possible value of each attribute by a single string of bits. It is possible to mine for association rules of the form e.g. A(α) ∧ B(β) → C(δ), where α is a coefficient (a subset of all the possible values) of the attribute A. The expression A(α) denotes the Boolean attribute that is true for a particular row of the data matrix if the value of A in this row belongs to α, and similarly for B(β) and C(δ).

The bit string approach also makes it easy to compute all the necessary frequencies. We can then mine not only for association rules based on confidence and support but also for rules corresponding to various further relations of Boolean attributes, including relations described by statistical hypothesis tests. It is also possible to mine for conditional association rules and to deal with missing information. The presented form of association rules can be understood as a contribution to the discussion about the notion of interesting patterns.

Several data structures consisting of disjunctions and conjunctions of bit strings representing particular values of attributes are maintained to optimise the generation and verification of association rules. The resulting algorithm is very fast, and its running time depends linearly on the number of rows of the analysed data matrix. Time and memory complexity are discussed in section 3.

As a demonstration of the capabilities of the bit string approach we present the procedure 4ft-Miner (see section 2). The 4ft-Miner procedure is a part of the academic data mining system LISp-Miner (see ...). The bit string approach proved to be very efficient. Experience with it has led to the development of new mining procedures; an example can be found in section 4.
The presented approach was first applied in connection with the development of the GUHA method of mechanized hypothesis formation [2], [3].

2. Procedure 4ft-Miner

The procedure 4ft-Miner mines for association rules of the form ϕ ≈ ψ and for conditional association rules of the form ϕ ≈ ψ / χ. Here ϕ, ψ and χ are conjunctions of Boolean attributes automatically derived from many-valued attributes in various ways. The symbol ≈ is called a 4ft-quantifier. The association rule ϕ ≈ ψ means that the Boolean attributes ϕ and ψ are associated in the sense of the 4ft-quantifier ≈. A conditional association rule ϕ ≈ ψ / χ means that ϕ and ψ are associated (in the sense of ≈) if the condition χ is satisfied.

The left part of an association rule (ϕ) is called the antecedent, the part denoted by ψ is called the succedent and χ is the condition. All these parts are referred to as cedents.

This section describes features of the procedure 4ft-Miner that show the advantages of the bit string approach. The first is the rich set of simple ways to define the set of interesting association rules to be automatically generated and verified, see section 2.1. The second is the possibility of dealing with many types of association rules, see section 2.2. The important features of the output of 4ft-Miner are outlined in section 2.3.

2.1. Sets of Interesting Association Rules

The analysed data are stored in a data matrix. Rows of the data matrix correspond to observed objects and columns correspond to attributes, i.e. properties of the observed objects. An example is the data matrix Loans, see Figure 1.

Client  Age  Sex  Salary     District  Quality
1       45   M    very high  Prague    good
2       22   F    very low   Plzen     bad
3       37   F    average    Brno      good
4       53   F    high       Benesov   good
...
             M    low        Kolin     bad
             F    high       Brod      good

Figure 1. Data matrix Loans

Each row of the data matrix Loans describes one loan given to a client of a bank. There are 6181 loans. The first row describes a loan received by a 45-year-old man. This man has a very high salary and lives in the district of Prague. The quality of his loan is good.

Each cedent is a conjunction of Boolean attributes called literals. A literal is an expression of the form A(α), where A is an attribute and α is a subset of all possible values (i.e. categories) of the attribute A. The subset α is called the coefficient of the literal A(α). Examples of cedents ϕ, ψ and χ are:

- ϕ = Age<20;30), which is true if the value of the attribute Age is in the interval <20;30),
- ψ = Quality(good), which is true if the value of the attribute Quality is good,
- χ = District(Prague, Plzen) ∧ Salary(very high), which is true if both the value of the attribute District is Prague or Plzen and the value of the attribute Salary is very high.

The set of interesting association rules to be generated and tested on the given data matrix is defined by:

- a simple definition of all antecedents,
- an analogous simple definition of all succedents,
- an analogous simple definition of all conditions (if desired),
- a definition of a 4ft-quantifier; there are 17 types of 4ft-quantifiers.

The antecedents are conjunctions of literals automatically generated from the given set of antecedent attributes. It is also possible to divide this set into several subsets called partial antecedents. A partial antecedent is also a conjunction of literals, and the antecedent as a whole is a conjunction of partial antecedents. A partial antecedent is given by:

- a list of attributes, some of which are marked as basic (a partial antecedent must contain at least one basic attribute),
- a minimal and maximal number of attributes to be used in the partial cedent,
- a simple definition of the set of all literals to be generated from each attribute.

Any literal can be positive or negative. The positive literal is the literal A(α) itself. The negative literal is the expression ¬A(α), the Boolean negation of A(α). The set of all literals to be generated for a particular attribute is given by:

- a type of coefficient; six types of coefficients are available: subsets, intervals, left cuts, right cuts, cuts, one particular value,
- a minimal and maximal number of values in the coefficient,
- a positive/negative literal option: a) only positive literals are generated, b) only negative literals are generated, c) both positive and negative literals are generated.

We use the attribute A with categories {1, 2, 3, 4, 5} to give examples of particular types of coefficients:

- Subsets: the definition of subsets with 2-3 categories defines the literals A(1,2), A(1,3), A(1,4), A(1,5), A(2,3), ..., A(4,5), A(1,2,3), A(1,2,4), A(1,2,5), A(2,3,4), ..., A(3,4,5).
- Intervals: the definition of intervals with 2-3 categories defines the literals A(1,2), A(2,3), A(3,4), A(4,5), A(1,2,3), A(2,3,4) and A(3,4,5).
- Left cuts: the definition of left cuts with at most 3 categories defines the literals A(1), A(1,2) and A(1,2,3).
- Right cuts: the definition of right cuts with at most 4 categories defines the literals A(5), A(5,4), A(5,4,3) and A(5,4,3,2).
- Cuts: both left cuts and right cuts.
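The coefficient types listed above can be illustrated with a short Python sketch. This is not the LISp-Miner implementation; the function names and the representation of a coefficient as a tuple of categories are assumptions made for illustration only. Categories are assumed to be given as an ordered list.

```python
from itertools import combinations


def subsets(categories, min_len, max_len):
    """All subsets of the categories with min_len..max_len elements."""
    return [combo
            for k in range(min_len, max_len + 1)
            for combo in combinations(categories, k)]


def intervals(categories, min_len, max_len):
    """All runs of consecutive categories with min_len..max_len elements."""
    n = len(categories)
    return [tuple(categories[start:start + k])
            for k in range(min_len, max_len + 1)
            for start in range(n - k + 1)]


def left_cuts(categories, min_len, max_len):
    """Intervals that start at the first category."""
    top = min(max_len, len(categories))
    return [tuple(categories[:k]) for k in range(min_len, top + 1)]


def right_cuts(categories, min_len, max_len):
    """Intervals that end at the last category, written from the last one backwards."""
    backwards = list(reversed(categories))
    top = min(max_len, len(categories))
    return [tuple(backwards[:k]) for k in range(min_len, top + 1)]


def cuts(categories, min_len, max_len):
    """Both left cuts and right cuts."""
    return (left_cuts(categories, min_len, max_len)
            + right_cuts(categories, min_len, max_len))


if __name__ == "__main__":
    a_categories = [1, 2, 3, 4, 5]
    print(intervals(a_categories, 2, 3))
    print(left_cuts(a_categories, 1, 3))
    print(right_cuts(a_categories, 1, 4))
```

Each returned tuple plays the role of one coefficient α, so `(1, 2, 3)` stands for the literal A(1,2,3).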

An example of the antecedent definition is in Figure 2.

Figure 2. Example of the antecedent definition

There are two partial antecedents in Figure 2. The partial antecedent Client_Basic contains the attributes Sex, Salary and District. Each line defines the types of coefficients to be generated for the corresponding attribute. The line Sex(*), 1 1 means that subsets of categories of length from 1 to 1 are to be generated for the attribute Sex, i.e. the literals Sex(F) and Sex(M). Cuts are to be generated for the attribute Salary. This attribute has the categories very low, low, average, high and very high. All the possible cuts of length from 1 to 2 are the literals Salary(very low), Salary(very low, low), Salary(very high) and Salary(high, very high). Subsets of length from 1 to 2 are to be generated for the attribute District, see District(*), 1 2. This means that all single districts, e.g. District(Prague), and all pairs of districts, e.g. District(Plzen, Prague), will be generated. There are 77 particular districts, thus 3003 literals (77 single districts plus 2926 pairs) are defined this way. The partial cedent Client_Basic has length from 1 to 3, so at least one of the attributes Sex, Salary and District will always be used in the antecedent.

The partial cedent Client_Age is defined such that none or one of two types of literals concerning Age will be used. By defining Age(int) 5 5 we ask for all the intervals of length 5 to be generated. In other words, there will be a sliding window of length 5. The definition Age(lcut) 1 10 means that left cuts will be generated, thus we will investigate young clients.

An example of a coefficient given by one value is in Figure 3. In this case we concentrate on the loans with bad quality.

Figure 3. Example of the coefficient of one value

Let us emphasize that each cedent and even each partial cedent is treated as an object and can be copied or moved to another task or cedent.

2.2. Verification of Association Rules

The association rule ϕ ≈ ψ means that the Boolean attributes ϕ and ψ are associated in the sense of the 4ft-quantifier ≈. The rule ϕ ≈ ψ can be true or false in the analysed data matrix M. The conditional association rule ϕ ≈ ψ / χ is true in the analysed data matrix M if the rule ϕ ≈ ψ is true in the data matrix M / χ. The data matrix M / χ consists of all rows of the matrix M satisfying the condition χ. At least one such row must exist for ϕ ≈ ψ / χ to be true.

The association rule ϕ ≈ ψ is verified on the basis of the four-fold table 4ft(ϕ, ψ, M) of ϕ and ψ in M, see Figure 4.

M    ψ    ¬ψ
ϕ    a    b
¬ϕ   c    d

Figure 4. Four-fold table 4ft(ϕ, ψ, M) of ϕ, ψ in M

Here a is the number of objects satisfying both ϕ and ψ, b is the number of objects satisfying ϕ and not satisfying ψ, c is the number of objects not satisfying ϕ and satisfying ψ, and d is the number of objects satisfying neither ϕ nor ψ.

Each 4ft-quantifier defines a true/false function of the frequencies <a,b,c,d> from the four-fold table. The association rule ϕ ≈ ψ is true in the data matrix M if the function defined by the 4ft-quantifier ≈ is true for the four-fold table 4ft(ϕ, ψ, M) of ϕ, ψ in M. Various 4ft-quantifiers are defined in [2] and [4]. Some examples follow.

Founded implication $\Rightarrow_{p;Base}$
Parameters: $0 < p \le 1$ and $Base > 0$
True iff
$$\frac{a}{a+b} \ge p \;\wedge\; a \ge Base$$
The association rule ϕ $\Rightarrow_{p;Base}$ ψ can be interpreted as "100p per cent of objects satisfying ϕ satisfy also ψ" or "ϕ implies ψ on the level 100p per cent".

Lower critical implication $\Rightarrow^{!}_{p;\alpha;Base}$
Parameters: $0 < p \le 1$, $Base > 0$ and $0 < \alpha \le 0.5$
True iff
$$\sum_{i=a}^{a+b} \frac{(a+b)!}{i!\,(a+b-i)!}\; p^{i} (1-p)^{a+b-i} \le \alpha \;\wedge\; a \ge Base$$
The association rule ϕ $\Rightarrow^{!}_{p;\alpha;Base}$ ψ corresponds to a test (on the level α) of the null hypothesis $H_0: P(\psi \mid \varphi) \le p$ against the alternative $H_1: P(\psi \mid \varphi) > p$. If the association rule ϕ $\Rightarrow^{!}_{p;\alpha;Base}$ ψ is true in the data matrix M then the alternative hypothesis is accepted.
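The two quantifiers defined above can be evaluated directly from the frequencies a and b. The following Python sketch is only an illustration of the stated conditions, not the LISp-Miner code; the function and parameter names are assumptions. The binomial tail is computed term by term, whereas the paper notes elsewhere that pre-computed tables of critical frequencies can replace this computation.

```python
from math import comb


def founded_implication(a, b, p, base):
    """True iff a / (a + b) >= p and a >= base.

    The ratio test is written as a >= p * (a + b) so that an empty
    antecedent (a + b == 0) does not cause a division by zero.
    """
    return a >= base and a >= p * (a + b)


def lower_critical_implication(a, b, p, alpha, base):
    """True iff the upper binomial tail from a to a + b is at most alpha and a >= base."""
    n = a + b
    tail = sum(comb(n, i) * p ** i * (1 - p) ** (n - i)
               for i in range(a, n + 1))
    return tail <= alpha and a >= base


if __name__ == "__main__":
    # Hypothetical frequencies, chosen only to exercise both outcomes.
    print(founded_implication(a=90, b=10, p=0.7, base=20))
    print(lower_critical_implication(a=72, b=28, p=0.7, alpha=0.05, base=20))
```

With 72 successes out of 100 and p = 0.7 the observed ratio exceeds p, so the founded implication holds, but the tail probability is far above 0.05, so the lower critical implication does not. This is the practical difference between the two quantifiers.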

Double founded implication $\Leftrightarrow_{p;Base}$
Parameters: $0 < p \le 1$ and $Base > 0$
True iff
$$\frac{a}{a+b+c} \ge p \;\wedge\; a \ge Base$$
The association rule ϕ $\Leftrightarrow_{p;Base}$ ψ can be interpreted as "100p per cent of objects satisfying ϕ or ψ satisfy both ϕ and ψ" or "ϕ ∨ ψ implies ϕ ∧ ψ on the level 100p per cent".

All the implemented 4ft-quantifiers are described at ...

The four-fold table can be computed very quickly, see section 3. Let us remark that pre-computed tables of critical frequencies can be used to verify 4ft-quantifiers based on statistical hypothesis tests [4]. This way we need only one test of inequality instead of the computation of a complex formula.

When we deal with missing information we have to compute nine-fold tables or even eighteen-fold tables. The bit string approach is again used for very fast computation of these tables. There are also several ways to reduce these tables back to a four-fold table. For details see e.g. [5].

2.3. Output of 4ft-Miner

The output of the procedure consists of all prime association rules. An association rule is prime if it is true in the analysed data matrix and it does not follow immediately from other, simpler association rules already in the output. The question is what it means that the association rule ϕ ≈ ψ immediately follows from a simpler association rule ϕ_1 ≈ ψ_1. The answer depends on the properties of the 4ft-quantifier used.

The definition of a prime association rule for the 4ft-quantifier $\Rightarrow_{p;Base}$ of founded implication must take into account that if, for example, the association rule Sex(M) $\Rightarrow_{p;Base}$ District(Prague) is true, then the association rule Sex(M) $\Rightarrow_{p;Base}$ District(Prague, Plzen) is always true as well. Thus the second association rule immediately follows from the first, simpler one. All such followers are automatically omitted from the output. There is a theoretical background concerning the logical properties of association rules. For details see section 4 or e.g. [4].

An example of the output of 4ft-Miner is in Figure 5. This output represents the task with the sets of interesting antecedents and succedents defined in Figure 2 and Figure 3 respectively and with the quantifier $\Rightarrow_{0.7;20}$ of founded implication. The whole solution contains 46 prime association rules.

Figure 5. Example of the 4ft-Miner output

3. Bit String Approach

The basic principle of the bit string approach is the representation of the analysed data by suitable strings of bits (see section 3.1). This makes it possible to use simple algorithms and data structures to compute the necessary frequencies efficiently (see section 3.2).

3.1. Bit-string Representation of Attributes

Each category of each attribute (i.e. each of its possible values) is represented by one string of bits. This string is called the card of the category [3]. We can use the attribute District as an example. The attribute District has 77 categories: Benesov, Brno, ..., Prague, Plzen, ..., Znojmo. Its representation is shown in Figure 6.

Client     1       2      3     4        ...
District   Prague  Plzen  Brno  Benesov  ...  Kolin  Brod
Cards of categories:
Brno       0       0      1     0        ...  0      0
Kolin      0       0      0     0        ...  1      0
Plzen      0       1      0     0        ...  0      0
Prague     1       0      0     0        ...  0      0

Figure 6. Cards of categories

The first row of this table corresponds to the column Client (row number) of the data matrix Loans, see Figure 1. The second row of the table corresponds to the column District. Each of the further rows of Figure 6 is the card of one category. Each bit of the card of a category corresponds to one row of the data matrix Loans. The first bit corresponds to the first row, the second bit corresponds to the second row, etc. A particular bit is 1 if the value (i.e. category) in the column District in the row corresponding to this bit is the category in question. Otherwise the bit is 0. The first bit of the card of the category Benesov is 0 because the value in the first row of the data matrix is not Benesov (but Prague). The third bit of the card of the category Brno is 1 because the value in the third row is Brno, etc.

There are 6181 rows in the data matrix Loans, therefore 6181 bits or 773 bytes are necessary to represent one category by its card. The attribute District has 77 categories. This means that 77 × 773 = 59521 bytes (i.e. about 58 KB) are necessary to represent this attribute.

3.2. Algorithm and Data Structures

A structure named the card of antecedent represents each antecedent. We denote it by Card_[antecedent]. It is a string of bits of the same length as the number of rows in the analysed data matrix. Each bit of the card again corresponds to one row of the analysed data matrix. A particular bit is 1 if the row corresponding to this bit satisfies the antecedent. The card of antecedent is thus the bit-wise representation of the Boolean attribute antecedent. It is created as the conjunction of the cards of all its literals. The card of a literal is created beforehand as the disjunction of the cards of the categories from the coefficient of the literal. A detailed description is beyond the scope of this article and can be found in e.g. [3].

The number of 1s in the card of antecedent is the number of rows satisfying the antecedent. We use a low-level bit-string function Count(α) returning the number of 1 values in the string α. The number of rows satisfying the antecedent must be equal to or greater than the value of the parameter Base, see section 2.2. For every generated antecedent we test whether Count(Card_[antecedent]) ≥ Base to decide whether this antecedent can be part of a true association rule at all. This test can be understood as a test of whether the corresponding itemset is frequent [1].
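The card structures described above can be sketched in a few lines of Python. This is an illustrative sketch, not the LISp-Miner implementation: it assumes that a card is stored as a Python integer whose bit i (counted from the least significant end) stands for row i, and all function names are hypothetical. The sample data are only the six rows visible in Figure 1. The last helper derives the four-fold table of section 2.2 from two cards.

```python
def make_cards(column):
    """Return {category: card}; bit i of a card is 1 iff row i holds that category."""
    cards = {}
    for row, value in enumerate(column):
        cards[value] = cards.get(value, 0) | (1 << row)
    return cards


def literal_card(cards, coefficient):
    """Card of a literal: disjunction (OR) of the cards of its categories."""
    card = 0
    for category in coefficient:
        card |= cards.get(category, 0)
    return card


def negate(card, n_rows):
    """Card of a negative literal: bit-wise NOT restricted to n_rows bits."""
    return ~card & ((1 << n_rows) - 1)


def cedent_card(literal_cards, n_rows):
    """Card of a cedent: conjunction (AND) of the cards of its literals."""
    card = (1 << n_rows) - 1
    for lit in literal_cards:
        card &= lit
    return card


def count(card):
    """Count(card): the number of 1 bits, i.e. the number of satisfying rows."""
    return bin(card).count("1")


def four_fold(card_ant, card_suc, n_rows):
    """Frequencies (a, b, c, d) of the four-fold table of two cards."""
    a = count(card_ant & card_suc)
    b = count(card_ant) - a
    c = count(card_suc) - a
    d = n_rows - a - b - c
    return a, b, c, d


if __name__ == "__main__":
    district = ["Prague", "Plzen", "Brno", "Benesov", "Kolin", "Brod"]
    salary = ["very high", "very low", "average", "high", "low", "high"]
    quality = ["good", "bad", "good", "good", "bad", "good"]
    n = len(district)

    antecedent = cedent_card(
        [literal_card(make_cards(salary), ["high", "very high"])], n)
    succedent = literal_card(make_cards(quality), ["good"])

    base = 2  # hypothetical value of the Base parameter
    if count(antecedent) >= base:  # the antecedent is frequent enough
        print(four_fold(antecedent, succedent, n))
```

Python integers have arbitrary length, so one integer can hold a card of any number of rows; a production implementation would more likely use fixed-width machine words and a hardware population count.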
Both Card_[antecedent] and Card_[succedent] (analogous to the card of antecedent) are used to compute the frequencies of the four-fold table of the antecedent and succedent, see Figure 7.

M             Succedent   ¬Succedent
Antecedent    a           b
¬Antecedent   c           d

Figure 7. Four-fold table from cards

The particular frequencies are computed in the following way:

a = Count(Card_[Antecedent] ∧ Card_[Succedent])
b = Count(Card_[Antecedent]) − a
c = Count(Card_[Succedent]) − a
d = n − a − b − c

Here n is the total number of rows in the data matrix M.

The memory used by the strings of bits while running a data mining task is not a significant problem, especially when compared to the significant time savings during generation and verification. Let us remark that, for example, much medical data concerns thousands of patients and tens or hundreds of attributes. The corresponding data mining tasks can be solved without problems on common PCs. Moreover, in many cases we get the solution in several minutes or even in several seconds. Therefore 4ft-Miner is also suitable for teaching purposes.

Here we provide the results of an experiment on a Pentium 400 MHz computer with 98 MB RAM. We solved tasks to find true and prime association rules in the data matrices Loans, Loans_10 and Loans_20. The data matrix Loans_10 has 10 times more rows than the original data matrix Loans. Analogously, the data matrix Loans_20 has 20 times more rows. There are about ... relevant association rules that have to be generated and verified according to the task definition. Only about ... association rules were actually verified thanks to all the optimisations, some of which are described above. The time of solution for the particular data matrices is given in Figure 8.

Data matrix          Loans   Loans_10   Loans_20
Rows                 ...     ...        ...
Time of sol. [sec]   ...     ...        ...

Figure 8. Time of solution of various tasks

Let us emphasize that the time of the bit string operations AND, NOT, OR and Count depends linearly on the length of the particular cards. The length of each card is equal to the number of rows of the analysed data matrix. Thus the time the procedure 4ft-Miner needs to solve a given task depends linearly on the number of rows of the analysed data matrix.

4. New Data Mining Procedures

The advantages of the bit string approach can be further used in new data mining procedures. An example is the procedure Pareto-Miner. Figures 9 and 10 express the motivation for this procedure. Both figures concern the distribution of clients (see the data matrix Loans, Figure 1) among particular regions. The first one concerns all clients and the second one concerns only the clients with a high salary.

Figure 9. Distribution of all clients among regions

Figure 10. Distribution of clients with high salary among regions

The distribution of clients with a high salary differs remarkably from the distribution of all clients. The difference concerns mainly the pair Prague and south Moravia. It can be useful to find all segments of clients that differ in a given way from the segment of all clients in the distribution of clients among particular regions. The Pareto-Miner procedure is intended to solve such tasks. Its input consists of:

- a data matrix with columns corresponding to attributes and rows corresponding to observed objects,
- an analysed attribute A (usually with several values),
- parameters defining a large set of conditions in the same way as a set of conditions is defined in the 4ft-Miner procedure,
- a criterion of interestingness of a particular condition.

The criterion of interestingness describes a distribution of rows of the data matrix among the particular values of the attribute A. Examples of the criteria are:

- a remarkable difference between the distribution when the particular condition is satisfied and the distribution for the whole analysed data matrix; the difference can be measured e.g. by the number of values with a different order,
- a remarkable difference between the distribution when the particular condition is satisfied and the distribution under another given condition.

The evaluation of these criteria requires knowledge of the frequencies of particular values of the attribute A under the condition in question. These frequencies can be computed using the cards of cedents for conditions and the cards of particular categories. Thus the tools already developed can be used, both for generating particular conditions C and for computing the card Card_[C]. The particular frequencies can be computed as

f_i,j = Count(Card_[a_i] ∧ Card_[s_j] ∧ Card_[C]).

This paper has been supported by the grant COST ACTION 274 TARSKI (Theory and Applications of Relational Structures as Knowledge Instruments).

Literature

[1] Agrawal, R. et al.: Fast Discovery of Association Rules, Advances in Knowledge Discovery and Data Mining (Fayyad, U. M. et al., eds.), AAAI Press / The MIT Press, 1996, pp. ...
[2] Hájek, P., Havránek, T.: Mechanising Hypothesis Formation: Mathematical Foundations for a General Theory, Springer-Verlag, 1978, pp. ...
[3] Rauch, J.: Some Remarks on Computer Realisations of GUHA Procedures, International Journal of Man-Machine Studies 10, 1978, pp. ...
[4] Rauch, J.: Classes of Four-Fold Table Quantifiers, Principles of Data Mining and Knowledge Discovery (J. Zytkow, M. Quafafou, eds.), Springer-Verlag, 1998, pp. ...
[5] Rauch, J.: Four-fold Table Calculi and Missing Information, JCIS '98, Association for Intelligent Machinery, Vol. II (Wang, Paul, ed.), Durham, Duke University, ...
[6] Rauch, J., Šimůnek, M.: Mining for 4ft Association Rules by 4ft-Miner, INAP 2001, The Proceeding of the International Conference On Applications of Prolog, Prolog Association of Japan, Tokyo, October 2001, pp. ...


More information

The Calculus of Computation: Decision Procedures with Applications to Verification. Part I: FOUNDATIONS. by Aaron Bradley Zohar Manna

The Calculus of Computation: Decision Procedures with Applications to Verification. Part I: FOUNDATIONS. by Aaron Bradley Zohar Manna The Calculus of Computation: Decision Procedures with Applications to Verification Part I: FOUNDATIONS by Aaron Bradley Zohar Manna 1. Propositional Logic(PL) Springer 2007 1-1 1-2 Propositional Logic(PL)

More information

Association Analysis. Part 2

Association Analysis. Part 2 Association Analysis Part 2 1 Limitations of the Support/Confidence framework 1 Redundancy: many of the returned patterns may refer to the same piece of information 2 Difficult control of output size:

More information

OPPA European Social Fund Prague & EU: We invest in your future.

OPPA European Social Fund Prague & EU: We invest in your future. OPPA European Social Fund Prague & EU: We invest in your future. Frequent itemsets, association rules Jiří Kléma Department of Cybernetics, Czech Technical University in Prague http://ida.felk.cvut.cz

More information

CSC Discrete Math I, Spring Propositional Logic

CSC Discrete Math I, Spring Propositional Logic CSC 125 - Discrete Math I, Spring 2017 Propositional Logic Propositions A proposition is a declarative sentence that is either true or false Propositional Variables A propositional variable (p, q, r, s,...)

More information

A Lower Bound of 2 n Conditional Jumps for Boolean Satisfiability on A Random Access Machine

A Lower Bound of 2 n Conditional Jumps for Boolean Satisfiability on A Random Access Machine A Lower Bound of 2 n Conditional Jumps for Boolean Satisfiability on A Random Access Machine Samuel C. Hsieh Computer Science Department, Ball State University July 3, 2014 Abstract We establish a lower

More information
