Basic Association Rules
|
|
- Brendan Day
- 6 years ago
- Views:
Transcription
1 Basic Association Rules Downloaded 11/21/17 to Redistribution subject to SIAM license or coyright; see htt:// Abstract Previous aroaches for mining association rules generate large sets of association rules. Such sets are difficult for users to understand and manage. Here, the concet of a restricted conditional robability distribution is used to exlain an association rule. Based on this concet, a new tye of association rules, called basic association rules, is defined. We roose the GenBR algorithm to generate the set of classes of basic association rules. Theoretical analysis shows that the search sace of the algorithm can be translated to an n- cube grah. The set of classes of basic association rules generated by GenBR is easy for users to understand and manage. Our exeriments on synthetic and real datasets show that GenBR is either faster than revious aroaches or generates fewer rules or both. Keywords: Association Rules, Probabilistic/Statistical Methods, Data Reduction/Prerocessing, Restricted Conditional Probability Distributions, Inference Systems, and Basic Association Rules. 1 Introduction Techniques for mining association rules [1,2] were originally devised for alication to market basket data, but they have also been alied in many other domains to erform tasks [21,23,26]. Market basket data describes the items urchased from retail stores groued into transactions. A transaction tyically consists of items bought together at the same oint of time, but it may consist of items bought by a customer over a eriod of time. An itemset is a set of items, and a frequent itemset X is an itemset whose frequency in transactions, also referred to as its suort, denoted as su(x), is greater than a user secified suort threshold, minsu. The main task of association rule discovery is to extract frequent itemsets from market basket data and to generate association rules from these frequent itemsets. An association rule r is an imlication of the form X Y, where X and Y are two disjoint itemsets. The suort of the rule is the suort of X Y, denoted as su(r), which is given by the observed robability P(X = 1,Y = 1). The confidence of the rule, denoted conf(r), is given by the conditional observed robability P(X =1,Y = 1) / P(X = 1), which is denoted as (xy) / (x) in this aer. If an association rule has suort at least as great as minsu and Guichong Li and Howard J. Hamilton Deartment of Comuter Science University of Regina Regina, SK, Canada, S4S 0A2 {liguicho, hamilton}@cs.uregina.ca confidence at least as great as the confidence threshold called minconf, it is referred to as a valid association rule. An association rule with confidence 100% is an exact association rule; all other association rules are aroximate association rules. The Ariori algorithm [2] was roosed to discover all frequent itemsets and to generate all valid association rules corresonding to these itemsets by a fast algorithm, called FastGenRules. Many algorithms have since been roosed that reduce the time and sace required to find the frequent itemsets [2,14]. After all frequent itemsets have been found, valid association rules are generated. A serious roblem in association rule discovery is that the set of association rules can grow to be unwieldy as the number of transactions increases, esecially if the suort and confidence thresholds are small. As the number of frequent itemsets increases, the number of rules resented to the user tyically increases roortionately. Many of these rules may be redundant. The definition of redundancy for association rules has varied in revious aroaches. Toivonen et al. roosed finding a structural rule cover, which describes the same database rows as the original set of association rules [28]. Therefore, those rules that are not in the cover are regarded as redundant. In [11,20,24,30], the definition of redundant rules is based on several inference rules or an inference system. Therefore, all association rules that can be derived from other rules by alying inference rules are regarded as redundant. We adot the latter tye of definition. To address the roblem of rule redundancy, four tyes of research on mining association rules have been erformed. First, rules have been extracted based on userdefined temlates or item constraints [3,27]. Secondly, researchers have develoed interestingness measures to select only interesting rules [16,19,18]. Thirdly, researchers have roosed inference rules or inference systems to rune redundant rules and thus resent smaller, and usually more understandable sets of association rules to the user [5,11,20,24,30]. Finally, new frameworks for mining association rule have been roosed that find association rules with different formats or roerties [7,8,9]. The main roblems with revious aroaches are that they still generate too many rules, and these rules may be
2 redundant. For examle, a valid association rule X Y that is generated by one these aroaches may in fact be derived from some simler rule X Y with the same confidence as X Y, where X X and Y Y. Inference rules roosed by these aroaches do not resemble Armstrong axioms on functional deendencies. As well, in some aroaches, inference rules cannot infer the confidence of rules without extra information. In our research, we are creating an inference system on association rules, consisting of a set of inference rules such as augmentation and transitivity, which resembles the Armstrong axioms on functional deendencies and which allows the inference of the confidences of rules. The remainder of this aer is organized as follows. In Section 2, we resent related work. In Section 3, we define the concet of a basic association rule, and roose a new algorithm called GenBR for generating the set BR of classes of basic association rules from a set of frequent itemsets. The comutational comlexity of GenBR is also described. A comarison of our aroach and other aroaches is resented in Section 4. Our exeriments comared the erformance of our algorithm with that of revious algorithms on synthetic datasets and real-life datasets, with resect to the number of rules and the elased running time. Conclusions and future work are described in Section 5. 2 Previous Work Previous research showed that relatively small sets of association rules can be resented to users instead of all valid association rules. As well, for some aroaches, inference rules were suggested that allowed additional association rules to be derived from such small sets of rules. In this section, we describe three aroaches. First, reresentative association rules (RR) are based on a cover oerator with which other non-reresentative association rules can be generated [20]. Suose we have an association rule X Y. A cover oerator C, denoted C(X Y), is given by C(X Y) = {X Z V Z, V Y Z V = V } The set of all reresentative association rules is a minimal set of rules that covers all association rules by means of the cover oerator. The FastGenReresentative algorithm was roosed to efficiently comute a RR [20]. Secondly, a kind of non-redundant association rules with minimal antecedents and maximal consequents, called minimal non-redundant association rules, has been identified as articularly useful and relevant [5]. An association rule r: X Y is a minimal non-redundant association rule iff there does not exist an association rule r : X Y with su(r) = su(r ), conf(r) = conf(r ), X X and Y Y. A small non-redundant generating set for all valid association rules is formed by combining a generic basis GB for exact association rules and an informative basis IB for aroximate association rules. RI is defined as a transitive reduction of the informative basis corresonding to IB. Given a closure oerator c of the Galois connection, a set FC of frequent closed itemsets, the set G of their generators, and a artial order (inclusion relation) on the set of itemsets, the definitions of GB, IB and RI are as follows. GB = {r: g (f \ g) f FC g G f g f} IB = {r: g (f \ g) f FC g G c(g) f} RI = {r: g (f \ g) f FC g G c(g) f / f c(g) f f} Bastide et al. have roven that GB and IB contain only minimal non-redundant association rules and all exact association rules and aroximate association rules can be derived from GB and IB, resectively [5]. The GB and RI algorithms were roosed to generate a generic basis and a transitive reduction of the informative basis, resectively. According to the definition of a minimal non-redundant association rule, the suort and confidence of any association rules inferred from the generating set are the same as the suort and confidence of the rules from which they were inferred. The authors claim that none of the Armstrong axioms hold in non-redundant association rules. A similar aroach has been roosed for discovering a small cover for association rules based on closed itemsets, which adats the Duquenne-Guigues basis for exact association rules and the Luxenburger results for aroximate association rules [24]. Thirdly, informative cover has been roosed together with a new inference rule [11]. Let r, r be two association rules, denoted X Y and X Y, such that X Y X Y. If su(x ) su(x), we say that r covers r, denoted r r. The goal is to find an informative cover that covers all other association rules. The CoverRules algorithm has been roosed to generate an informative cover for association rules [11]. The cover oerator in the informative cover aroach is similar to the cover oerator in the reresentative association rule aroach. The difference between them is that the cover oerator of the informative cover aroach does not require the antecedent of the resulting association rule to be included in the antecedent of the initial association rule. In addition, the inference rocedure is not urely syntactic [11], because it uses Table 2.1. A Binary Dataset. A B C D E
3 information about the suort of the antecedent of the resulting association rule. The inference rules in the two aroaches are sound, but neither aroach infers the confidence of an association rule. We found that the rules generated by these aroaches may be redundant. If X Y is generated by any of these aroaches, it may be ossible to derive it 3.1 Definitions Definition Given a dataset D with I as a set of items and T as a set of transactions, an association rule X Y over a relation R I T is said to be in canonical form if Y = 1. According to this definition, we only consider the case from simler valid association rules. of Y containing a single item. Before we introduce other Examle 2.1. Suose we have the dataset shown in new notions, let us discuss another concet related to Table 2.1, and that minsu is 3 and minconf is 6. The conditional robability. sets of rules generated by the revious aroaches are A conditional robability distribution (CPD) P(Y X) shown in Table 2.2. is defined as P(X, Y) / P(X), where X and Y are random Consider the rules in Table 2.2 from the ersective variables [10]. Y is conditionally indeendent of Z given of a user. For RR, CE AD can be derived from simler X, denoted as I(Y, Z X), if and only if P(Y X) = P(Y X, association rules CE A and CE D by right union as Z), where X, Y, and Z are three disjoint sets of random with Armstrong axioms, and variables. The statement I(X, Z X) is referred to as a conf(ce AD) = conditional indeendence statement (CIS) [10] conf(ce A) conf(ce D). For GB, AB D is Four roerties that are satisfied by any joint unnecessary, because B D is in GB. For C, B AD can be derived from B A and B D by right union, assuming B A and B D are valid association rules. Transitivity cannot be used between GB and RI. For examle, E D in GB and D A in RI cannot be used to infer a valid association rule by transitivity. These examles show that some association rules in these generating sets are not in the most desirable form. ڤ 3 Discovery of Basic Association Rules We roose a new aroach to solve the roblems mentioned in Section 2. Table 2.2. The Generated Association Rules Algorithm Set Rule #Rule FastGenRules AR 36 FastGenRR RR CE AD, C AD, D E, D C, D A, B AD, AC DE, C DE, AE CD, E CD, A CD GB GB and RI C D, E D, AB D, E CD, CoverRules RI C CE D, B D, AC D, A D CE AD, C AD, D E, D C, D A, B AD, AC DE, C DE, E CD, A CD E CD, D E, C AD, D C, B AD, CE AD, D A robability distribution (JPD) are symmetry, decomosition, weak union, and contraction [10]. For examle, for the decomosition roerty, if I(X, Y W Z), then I (X, Y Z) and I (X, W Z). We describe a CIS in another way in Lemma Lemma Given a subset X X, we say I(Y, X \ X X ), i.e., Y is conditionally indeendent of X \ X given X, if and only if P(Y X) = P(Y X ), where X and Y are two disjoint sets of variables. Proof: Suose Z = X \ X. Then X, Y, and Z are three disjoint sets of variables. Since X = X Z, P(Y X) = P(Y X, Z). Assuming P(Y X) = P(Y X ), we derive P(Y X, Z) = P(Y X ). Thus, we have I(Y, Z X ), i.e., I(Y, X \ X X ) according to the definition of CIS. For the converse, assuming I(Y, X \ X X ), then according to the definition of CIS, we have P(Y X, X \ X ) = P(Y X ), i.e., P(Y X) = P(Y X ). ڤ Furthermore, we have Lemma Lemma Given two disjoint sets of variables X and Y, and X X, if I(Y, X \ X X ), then Z X \ X, I(Y, Z X ), i.e., Z X \ X, P(Y X) = P(Y X ) = P(Y X, Z). Proof: If Z = X \ X, the roof follows immediately. The decomosition roerty of conditional indeendencies states that if I(X, Y W Z), then I (X, Y Z) and I (X, W Z). Because I(Y, X \ X X ), Z X \ X, we obtain I(Y, Z X ) and I(Y, {X \ X } \ Z X ). From I(Y, X \ X X ), we obtain P(Y X) = P(Y X, X \ X ) = P(Y X ), and from I(Y, Z X ), we obtain P(Y X ) = P(Y X, Z). Hence, P(Y X) = P(Y X ) = P(Y X, Z). ڤ To exlain this idea more fully, we first give a more general definition as follows. Definition Given two disjoint sets of variables X and Y, and a conditional robability distribution P(Y X), a restricted conditional robability distribution (RCPD),
4 denoted as Pˆ (Yˆ Xˆ ), is a subset of P(Y X) defined by secifying subsets Domain(Yˆ ) Domain(Y) and Domain( Xˆ ) Domain(X). For binary variables X {0,1} and Y {0,1}, with Xˆ {1} and Yˆ {1}, the confidence (y x) of the association rule r: X Y is a ositive conditional robability of the RCPD Pˆ (Yˆ Xˆ ). In the following discussion, Pˆ (Yˆ Xˆ ) is simly denoted as Pˆ (Y X). Given X X, if we have Pˆ (Y X) = Pˆ (Y X ), we cannot guarantee that Z X \ X, Pˆ (Y X) = Pˆ (Y X, Z). Hence, the decomosition roerty cannot be alied in a RCPD. Examle Suose that we have the transaction dataset in Table 2.1 and that minsu is 3 and minconf is 6. Consider two association rules ACD E and D E. Although conf(acd E) = conf(d E) = 2/3, conf(cde) = 3/4, i.e., if X = {A, C, D} and Y = {E}. If we choose X ={D}, then Z = X \ X = {A, C}, (y x) = (y x, z) = (y x ) = 2/3, i.e., Pˆ (Y X) = Pˆ (Y X ). Let Z = {C} X \ X, since (y x, z) = 3/4, then (y x) (y x, z) and (y x ) (y x, z), i.e., Pˆ (Y X) Pˆ (Y X, Z) and Pˆ (Y X ) Pˆ (Y X, Z). ڤ In the context of RCPDs, we hoe to find a minimal subset X of X with resect to Y such that Pˆ (Y X) = Pˆ (Y X ), and Z X \ X, Pˆ (Y X) = Pˆ (Y X, Z). Definition Let X and Y be two disjoint sets of variables. X is a minimal conditional subset (MCS) of X with resect to Y if X is a subset of X that satisfies the following conditions: (1) Pˆ (Y X) = Pˆ (Y X ). (2) Z X \ X, Pˆ (Y X) = Pˆ (Y X, Z). (3) / X X such that conditions (1) and (2) hold for X. If X is a minimal conditional subset of itself with resect to Y, then X is conditionally minimal with resect to Y. For examle, in Table 2.1, AC is conditionally minimal with resect to E. We also define the dual. Definition Suose X and Y are disjoint sets of variables. If X is a MCS of X with resect to Y, then the restricted conditional robability distribution Pˆ (Yˆ Xˆ ) is a minimal conditional robability distribution (MCPD) of the RCPD Pˆ (Yˆ Xˆ ) with resect to Yˆ. If X is conditionally minimal with resect to Y, then Pˆ (Yˆ Xˆ ) is a MCPD of itself. If Pˆ (Yˆ Xˆ ) is a MCPD of Pˆ (Yˆ Xˆ ), then a series of equations follow, i.e., Z X \ X, Pˆ (Yˆ Xˆ ) = Pˆ (Y ˆ Xˆ, Ẑ ). Because MCS and MCPD are duals of each other, we use whichever is convenient. In revious research, MCS and MCPD have not been defined or emhasized by researchers. In the context of mining association rules, we use a RCPD Pˆ (Y X) for inference on association rules. We do not consider how Pˆ (XY) and Pˆ (X Y) behave. Examle Suose we have the transaction dataset shown in Table 2.1. Let X = {A, C, D}, Y = {E}, X = {A, C}. Because the confidences of both X Y and X Y are 2/3, i.e., the ositive conditional robability (e acd) = (e ac), we see that Pˆ (Y X) = Pˆ (Y X ) and that Z X \ X = {D}, Pˆ (Y X) = Pˆ (Y X, Z). Because (e acd) (e a) = 2/4 and (e acd) (e c) = 3/4, there is no X X = {A, C} such that Pˆ (Y X) = Pˆ (Y X ). So X is a MCS of X with resect to Y. Pˆ (Y X ) is a MCPD of Pˆ (Y X) with resect to Y. Similarly, {A}, {C}, and {E} are three MCSs of {A, C, E} with resect to D. ڤ In the context of mining association rules, a CPD P(Y X) is always referred to as a RCPD Pˆ (Yˆ Xˆ ). We define a new notion of minimal association rules analogously to minimal functional deendencies [25]. Definition Suose we have a set I of items and a transaction dataset D. A canonical association rule X Y over D is a basic association rule if X is conditionally minimal with resect to Y, i.e., / X X such that (1) P(Y X) = P(Y X ) and (2) Z X \ X, P(Y X) = P(Y X, Z). For examle, given the transaction dataset shown in Table 2.1, AC E is a basic association rule, and AC is a MCS of ACD with resect to E while P(E AC) is a MCPD of P(E ACE) with resect to E. 3.2 Comuting MCPDs According to the definition of basic association rules, either a MCS X with resect to Y or a MCPD P(Y X) corresonds to a basic association rule X Y. The confidence of the rule is a ositive conditional robability of P(Y X). Therefore, the crucial task of finding basic association rules is the comutation of all MCPDs. Given a set L of frequent itemsets, X L, and minconf, our aroach for comuting MCPDs corresonding to X is divided into two stes. We first construct a set of RCPDs in canonical form from X, and then we comute their MCPDs, in which all ositive conditional robabilities are at least as great as minconf. Our aroach is similar to the aroach for discovering the minimal directed I-Ma of a joint robability distribution (JPD) [10]. Suose that a ermutation (ordering) Y = {Y 1,,Y n } of a set of variables X = {X 1,,X n }, and (x) is a JPD of X. This aroach comutes any minimal set of redecessors i with resect
5 to Y i, and i satisfies (y i b i ) = (y i π i ), where i B i = {Y 1,,Y i-1 }. Hence, a directed minimal I-ma of (x) is constructed by designating i as arents of Y i. The differences from comuting I-ma from a JPD is that we do not ermutate items of a frequent itemset, and the conditions (contexts) in the restricted conditional robabilities contain all items in the frequent itemset excet for a single test item. For examle, to find MCPDs corresonding to the frequent itemset X 1 X 2 X 3 X 4 X 5, first the following RCPDs are constructed: P(X 1 X 2 X 3 X 4 X 5 ), P(X 2 X 1 X 3 X 4 X 5 ), P(X 3 X 1 X 2 X 4 X 5 ), P(X 4 X 1 X 2 X 3 X 5 ), and P(X 5 X 1 X 2 X 3 X 4 ). For P(X 1 X 2 X 3 X 4 X 5 ), one can observe all MCSs of X 2 X 3 X 4 X 5 with resect to X 1, such as a minimal subset {X 2 X 3 X 4 X 5 }, and Z {X 2 X 3 X 4 X 5 } \, (x 1 x 2 x 3 x 4 x 5 ) = (x 1 π) = (x 1 π, z), where P(X 1 ) is a MCPD of P(X 1 X 2 X 3 X 4 X 5 ). If (x 1 x 2 x 3 x 4 x 5 ) is less than minconf, then we sto the comutation of MCPDs. Otherwise, is a MCS of X 2 X 3 X 4 X 5 with resect to X 1. The corresonding basic association rule is X 1, as exlained in Section 3.1. Similarly, we comute the MCPDs of P(X 2 X 1 X 3 X 4 X 5 ), P(X 3 X 1 X 2 X 4 X 5 ), P(X 4 X 1 X 2 X 3 X 5 ), and P(X 5 X 1 X 2 X 3 X 4 ). Finally, a set of MCPDs is obtained. Thus, for each frequent itemset, a set of basic association rules with resect to X can be found. Because RCPDs do not obey decomosition, to comute a MCPD P(Y X ) of P(Y X), we should examine all cases, i.e., Z X \ {Y X }, P(Y X ) = P(Y X, Z). This rocess goes from to to bottom, and is deicted as a semi-lattice [12] in Figure 3.1. The frequent itemset itself, such as ABCDE, is laced at the first level. At the second level, we construct a set of RCPDs, such as P(A BCDE), P(B ACDE), P(E ABCD), P(D ABCE), and P(E ABCD). All RCPDs at the third level are constructed by setting their contexts as maximal subsets of the contexts of the RCPDs at the second level. At the fourth level, all RCPDs are formed by setting their contexts as the intersections of the contexts of the RCPDs at the third level. The number k of itemsets being intersected at level d of the semi-lattice is related to the length l of the frequent itemset at the first level. We have the formula k = d 2, 4 d l. For examle, for frequent 4-itemsets, the number of itemsets being intersected at level 4 is 2. For frequent 5-itemsets, the number of itemsets being intersected at level 5 is 3. The deth of the semi-lattice equals the size of the corresonding frequent itemset. For examle, for the comutation of the MCPDs of P(E ABCD), we first examine P(E ABC), P(E ABD), P(E ACD), and P(E BCD). If P(E ABCD) = P(E ABC) and P(E ABC) is minimal, then P(E ABC) is already a MCPD of P(E ABCD), and we examine P(E ABD), P(E ACD), and P(E BCD). Similar cases arise ABCDE P(A BCDE) P(B ACDE) P(E ABCD) P(D ABCE) P(C ABDE) level 2... P(E ABC) P(E ABD) P(E ACD) P(E BCD)... P(E AB) P(E AC) P(E AD) P(E BC) P(E BD) P(E CD) P(E A) P(E B) P(E C) P(E D) level 1 level 3 level 4 level 5 Figure 3.1. A Possible Semi-lattice for the Frequent Itemset ABCDE. for P(E ABD), P(E ACD), and P(E BCD). If P(E AB) is a MCPD of P(E ABCD), we only require P(E ABCD) = P(E ABC) = P(E ABD). The context of P(E AB) is the intersection of the contexts of P(E ABC) and P(E ABD). If P(E A) is a MCPD of P(E ABCD), we require P(E AB) = P(E AC) = P(E AD) = P(E ABCD), where P(E AB), P(E AC), and P(E AD) have already been obtained. Therefore, according to Definition and 3.1.4, P(E A) is a MCPD of P(E ABCD). During extension of the semi-lattice, the RCPDs in new children (the nodes at the next level) are also called the candidate MCPDs (not unique). We reduce the number of candidate MCPDs of new children by intersecting itemsets in the contexts of their arents (nodes at the revious level), and checking whether the candidate MCPDs of the children are equal to those of their arents, and whether the ositive conditional robabilities of their MCPDs are at least as great as minconf. To find all basic association rules, we should check all frequent itemsets and comute the corresonding minimal conditional robabilities. Therefore, we define the following concets. Definition Suose we have a transaction dataset D, minsu, minconf, and a set L of frequent itemsets over D. A class of basic association rules for X L is a set of basic association rules r:x A, denoted as C r (X), such that (1) X A X (2) X is a MCS of X \ A with resect to A (3) conf(r) minconf Definition Suose we have a transaction database D, minsu, and minconf. A basic association rule system, denoted as BR(D, minsu, minconf), is defined as the set of k distinct classes of valid basic association rules, i.e., BR(D, minsu, minconf) = {C i i r C r is a class of basic
6 association rules, 1 i k such that for all j, 1 j k, j i i, C r C j i r, C r C j r }. BR(D, minsu, minconf) is also simly denoted as BR when D, minsu, and minconf are clear from context. Examle Given the dataset in Table 2.1, minsu = 3, and minconf = 6, in Figure 3.2, we show how to comute all MCPDs corresonding to the frequent itemset ACDE. Figure 3.2 shows the search sace used to find MCSs by comuting a set of MCPDs corresonding to the frequent itemset ACDE. This search sace consists of four semi-lattices, in which the single node at the to level corresonds to the frequent itemset, and other nodes corresond to its subsets, each of which includes a ositive conditional robability. Regardless of the data in the dataset, at the second level in the structure, we always have four ositive conditional robabilities for ACDE. Each of them corresonds to an item in ACDE, such as (a cde), etc. At the third level of the structure, the itemsets aearing in the contexts of ositive conditional robabilities are always maximal subsets of the itemsets aearing in the context of ositive conditional robabilities in their arents. For examle, along the branch containing (a cde), de, ce and cd are maximal subsets of cde. If the ositive conditional robability of a child is equal to the ositive conditional robability of its arent and both ositive conditional robabilities are at least as great as minconf, then in Figure 3.2, they are connected with a bold arrow; e.g., a bold arrow is shown from the node including (a cde) to the node including (a ce), because (a cde) = (a ce). If the ositive conditional robability of a arent is not equal to the ositive conditional robability of its child, but the ositive conditional robability of the child is at least as great as minconf, we connect the arent node and the child node with a narrow arrow; e.g., a narrow arrow is shown from the node including (a cde) to the node including (a cd). If the ositive conditional robability of a arent is not equal to the ositive conditional robability of its child or a ositive conditional robability is less than minconf, further comutation of minimal conditional ACDE (a cd e) (c ade) (d ace) (e acd) (a de) (a ce) (a cd ) (c de) (c ae) (c ad) robabilities along this ath is terminated; e.g., a dotted arrow is shown from the node including (a cde) to the node including (a de). Hence, P(A CE) is a MCPD of P(A CDE). Similarly, we comute all MCPDs of P(C ADE), P(D ACE) and P(E ACD). As a result, a set of MCPDs with resect to the frequent itemset ACDE is obtained, i.e., P(A CE), P(C AE), P(D E), P(D C), P(D A) and P(E AC), where P(A CE) is a MCPD of P(A CDE) with resect to A, and P(C AE) is a MCPD of P(C ADE) with resect to C, etc. ڤ From the MCPDs corresonding the frequent itemset X, we can readily obtain a class of basic association rules, C r (X). A class of basic association rules derived from one frequent itemset may be comletely included in another class of basic association rules derived from another frequent itemset. Hence, classes of basic association rules that are comletely contained in other classes of basic association rules have no more information than the classes containing them. They are called redundant classes and are discarded. Examle Given the transaction dataset in Table 2.1, minsu = 3, and minconf = 6, from Examle 3.2.1, we obtain C r (ACDE) = {CE A, AE C, E D, C D, A D, AC E}. Similarly, we also obtain another class of basic association rules corresonding to the frequent itemset ADE, C r (ADE) = {A D, E D}. Since C r (ADE) C r (ACDE), C r (ADE) is discarded. ڤ 3.3 The GenBR Algorithm We roose the GenBR algorithm for generating BR. Its goal is different from that of the second ste of the Ariori algorithm [1], which generates all association rules. Our aroach consists of two main stes. Given a set of frequent itemsets and minconf, GenBR generates all classes of basic association rules. Secondly, the algorithm generates BR by discarding all redundant classes. The GenBR algorithm, resented in Figure 3.3, generates BR from a set of frequent itemsets L. For each frequent itemset I in L, the GenBC algorithm is called to generate a class of basic association rules corresonding to I. All classes discovered by GenBC are collected into (d ce) (d ae) (d ac) (e cd ) (e ad) (e ac) (a e) (a d ) (a c) (c e) (c d) (c a) (d e) (d c) (d a ) (e d) (e c) (e a) Figure 3.2. The Comutation of MCPDs.
7 Algorithm GenBR(L) Algorithm MinimalSubsets(i, I ) Purose: generate BR from a set L of frequent itemsets Purose: comute a set S of minimal conditional subsets of I Inut: L, a set of frequent itemsets, where L k L is all itemsets with resect to i. in L containing k items. Inut: i, an item such that I i is a frequent itemset. Outut: BR, a set of classes of basic association rules. I, an itemset. BAR = Ø Outut: S, a set of minimal conditional subsets of foreach I L k, k 2 I with resect to i. BAR = BAR GenBC(I) S = Ø end = su(i {i}) / su(i ) BR = RemoveRedundantClass(BAR) if ( minconf) then return BR S = {I } end S = MaximalSubsets(I ) Figure 3.3. The GenBR Algorithm. k = 2 Algorithm GenBC(I) while (S Ø) Purose: generate the class of basic association rules corresonding to I. foreach s S Inut: I, a frequent itemset. Outut: R, a class of basic association rules corresonding to I. 1 = su(s {i}) / su(s) if ( 1 = ) then R = Ø S = S {s} foreach item i I else S = S \ {s} I = I \ {i} end S = MinimalSubsets(i, I ) foreach s S foreach s S DelSuerset(s, S) // remove all s S such that R = R {s i} end s s from S return R S = IntersectionSet(S, k) end k = k + 1 Figure 3.4. The GenBC Algorithm. end return S BAR. The RemoveRedundantClass algorithm removes end redundant classes from BAR to give BR. Figure 3.5. The MinimalSubsets Algorithm. The GenBC algorithm, resented in Figure 3.4, generates the class of basic association rules corresonding to the frequent itemset I. The main loo of the GenBC algorithm is reeated for each item i in the frequent itemset I. It calls the MinimalSubsets algorithm for comuting the MCSs of I with resect to i. From these MCSs, the algorithm forms a set of basic association rules corresonding to I. The MinimalSubsets algorithm, resented in Figure 3.5, comutes the set of MCSs of I with resect to i. First, the algorithm determines whether = (i I ) minconf. If so, the algorithm initializes the minimal conditional subset S = {I }. The MaximalSubsets function roduces a set S of all maximal subsets of I as a set of candidate MCSs of I with resect to i, e.g., S = MaximalSubsets(BDE) = {BD, BE, DE}. Because this function is straightforward, we omit it. In the while loo, the algorithm examines the validity of candidate MCSs in S. For s S, the algorithm comutes the conditional robability (i s), and then comares (i s) with. If (i s) =, then s is a valid candidate MCS of I with resect to i, and it is stored in S. If smaller valid candidate MCSs of I are found, then suersets of them are removed from S. The DelSuerset function (omitted) does this task. The IntersectionSet(S, k) function (omitted) generates all smaller candidate MCSs of I, which are the intersections of itemsets in S in terms of the deth k of loo. For examle, the intersection of the two itemsets AE and AC is equal to A, and it is regarded as a candidate MCS. Examle Given the dataset in Table 2.1, minsu = 3, and minconf = 6, we describe the rocess of generating BR using GenBR. In the while loo of GenBR, we assume I = {ACDE} is selected from L. The GenBC algorithm is called to comute the class of basic association rules corresonding to ACDE. The main loo of the GenBC algorithm is reeated for each item in a frequent itemset. Inside the main loo, the MinimalSubsets algorithm is first called to generate all MCSs of CDE with resect to A. I = CDE and i = A. Because the ositive conditional robability (a cde) minconf, the MinimalSubsets algorithm s comuting MCPDs of P(A CDE), i.e., a set S of MCSs of CDE with resect to A. Initially, S = {I }. MaximalSubsets roduces a set S of all maximal subsets of CDE as candidate MCSs of
8 CDE with resect to A, S = {DE, CE, CD}. In the while loo, itemsets in S are checked to see if they are candidate MCSs of CDE with resect to A. When CE is examined, (a ce) = (a cde), so we obtain S = {BDE, CE}. For DE and CD, because (a de) (a cde) and (a cd) (a cde), DE and CD are removed from S, and S = {CE}. After DelSuerset, S = {CE} and S = {CE}. Because S has only one itemset, S = IntersectionSet(S, k) = Ø. The while loo in the MinimalSubsets algorithm exits, and S = {CE} is returned to GenBC. In GenBC, the basic association rule {CE A} is laced in R. Similarly, from the comutation of P(C ADE), we have a new basic association rule AE C, and R = {CE A, AE C}. From the comutation of P(D ACE), we have new basic association rules E D, C D and A D, and they are added into R. From the comutation of P(E ACD), a new basic association rule AC E is found. Finally, the algorithm generates a class of basic association rules corresonding to ACDE, R = {CE A, AE C, E D, C D, A E, AC E}. The algorithm returns R to GenBR. At this oint, BR = {{DE A, A B, D B, E B, A E}}. GenBR continues until all frequent itemsets in L, with size of at least 2, have been rocessed. Consequently, all basic association rules and their classes are generated and resented, as shown in Table 3.1. ڤ We define the following terminology over BR: (1) C r denotes a class of basic association rules. C i is a set of items, which aear in C r. (2) The class in which a basic association rule X Y resides is denoted as C r (X Y). For examle, C r (C D) might be referred to as one of C r (CDE), C r (ACD), C r (ACDE) and C r (CD), as shown in Table 3.1. (3) C i (X Y) denotes a set of items, which aear in C r (X Y). e.g, E C i (C D) ={C, D, E}. All redundant classes in Table 3.1 are discovered and discarded by the RemoveRedundantClass algorithm. A Table 3.1. A Set of Classes of Basic Association Rules. No. Class Basic association rule Redundant 1 C r (AB) B A Yes 2 C r (BD) B D Yes 3 C r (DE) E D, D E 4 C r (AD) D A, A D Yes 5 C r (CE) E C, C E Yes 6 C r (AC) A C, C A Yes 7 C r (CD) D C, C D 8 C r (ABD) B D, B A, D A, A D 9 C r (ADE) E D, A D Yes 10 C r (ACE) CE A, AC E, AE C Yes 11 C r (CDE) E D, E C, C E, C D 12 C r (ACD) A D, A C, C D, C A 13 C r (ACDE) E D, CE A, AC E, A D, AE C, C D redundant class C r (I) contains no more information than the class C r (I ) containing C r (I). For examle, in Table 3.1, C r (AB) C r (ABD). If a C r (I 1 ) is redundant and C r (I 1 ) C r (I 2 ), then I 1 I 2. All frequent itemsets are arranged in a semi-lattice based on their inclusion relation in order to discover all redundant classes. For examle, given the dataset shown in Table 2.1 and minsu = 3, all frequent itemsets form the semi-lattice shown in Figure 3.6. To identify the redundant class C r (AB), RemoveRedundantClass comares C r (AB) with C r (ABD) because AB ABD. To check whether C r (DE) is redundant, the algorithm comares C r (DE) with C r (ADE) and C r (CDE). Because C r (DE) C r (ADE) and C r (DE) C r (CDE), the algorithm does not need to comare C r (DE) with C r (ACDE), and C r (DE) is non-redundant. That means that to check whether C r (I 1 ) is redundant, we only comare C r (I 1 ) with all C r (I 2 ), where I 1 and I 2 are frequent itemsets, I 1 I 2, and I 1 = I 2-1, as justified by Theorem Theorem Given three itemsets I 1, I 2, and I 3, and I 1 I 2 I 3, if C r (I 1 ) C r (I 2 ), then C r (I 1 ) C r (I 3 ). Proof: Suose the condition is satisfied. Hence, r: X A C r (I 1 ), and a MCP (a x), but r C r (I 2 ), i.e., Z I 2 \ {X, A}, (a x) (a x, z). Because I 2 \ {X, A} I 3 \ {X, A}, Z I 3 \ {X, A}, (a x) (a x, z), i.e., P(A X) is not a MCPD of P(A I 3 \ A). Hence, r C r (I 3 ). ڤ In [22], we roosed an inference system for basic association rules. It is called the C-inference system, because it ermits a rule s confidence to be inferred. We roved that the C-inference system holds on BR and that all association rules can be derived from BR by the alication of inference rules in the C-inference system. The C-inference system is summarized in Table 3.2, where and q signify the confidences of association rules. Rules that cannot be derived from other rules by the C- inference system are called non-redundant association rules. ABD ACE ACDE ACD ADE CDE AB AC AD AE BD CD CE DE A B C D E L Hashtable Figure 3.6. A Semi-Lattice for All Frequent Itemsets
9 Table 3.2. The C-inference System on Basic Association Rules Inference Rule Premise Conclusion Augmentation X Y C r, X C i \ {X, Y} XX Y Pseudo-Transitivity q q X Y C r, XW Z C r XW Z Contraction q q X Y AR, XY Z AR X YZ Additivity q q X Y C r, W Z C r XW YZ Right union q q X Y C r, X Z C r X YZ Left union X Z C r, Y Z C r XY Z 3.4 The Comutational Comlexity of GenBR Let us analyze the comutational comlexity of GenBR. From Figure 3.1, we observed that itemsets contained in the context of RCPDs may be used to construct a secific semi-lattice for comuting MCPDs. All nodes in the semi-lattice are subitemsets of an itemset. This construction corresonds to the roblem of generating subsets in mathematics, or creating a hyercube (or a k-cube), or a Q n grah in grah theory [15]. We analyze the comutational comlexity of our algorithm based on the binomial theorem [13] as follows. According to the binomial theorem, the number of subitemsets of a k-itemset (nodes of the semi-lattice), denoted NS, is given by: NS = C 1 k + +C k k = 2 k 1 (3.1) By counting edges from the to of the semi lattice for a k-itemset, as shown in Figure 3.7, to its bottom, the number of edges, denoted NE, of the semi-lattice is given by: k 1 k NE = C k + 2C 2 k 3 1 k + 3C k + +(k 1)C k k 1 k = C k + 2C 2 k 3 k + 3C k + +(k 1)C 1 k + kc 0 k 0 kc k k k k i ik! = ick k = k i!( k i)! i= 1 k i= 1 ( k 1)! = k k ( i 1)!( k i)! i=1 k k 1 k 1 i 1 2 = k C = k k (3.2) i= 1 From Equation (3.1), we obtain the number of association rules NR for a k-itemset generated by the FastGenRules algorithm [1] as follows: NR = 2 k 2 (3.3) According to Figure 3.1, to comute the MCSs of ABCD with resect to E, in the worst case, the rocess forms a semi-lattice of ABCD. One edge in the semilattice corresonds to one comarison. For the whole itemset ABCDE, there are five semi-lattices of this kind. Hence, from Equation (3.2), the comutational comlexity TC of GenBR for a k-itemset is determined by: TC = k(k 1)(2 k 2 1) k 2 2 k 2 = O(k 2 2 k 2 ) (3.4) Given that l is the length of the longest itemsets in the set of frequent itemsets, the ratio of the comlexity of GenBR to that of FastGenRules is: R = O(l 2 2 l 2 )/O(2 l+1 ) = O(l 2 / 2 3 ) = O(l 2 ) (3.5) Although the time comlexity of GenBR exonentially increases with l, we show that GenBR erforms very well over actual datasets in the following section. 4 Comarison and Exerimental Results The overall comarison between our aroach and four revious aroaches is shown in Table 4.1. Table 4.1. The Overall Evaluation Algorithm Features Fast- Rules Fast- RR GBRI Cover- Rules GenBR Canonical form Non-redundant Infer a rule s suort Infer a rule s confidence Armstrong axioms-like #Rules All Fewer Most Fewest Best Elased time Fastest Slower Slowest Medium Fast First, consider the form of rules generated. Only GenBR generates non-redundant rules in canonical form. We believe that this tye of rules is easy for users to understand. Rules generated by GenBR are nonredundant, i.e., these rules and their confidences cannot be derived from simler rules and their confidences by using the inference rules defined in any of the other aroaches. Secondly, consider whether an inference rule ermits the suort and confidence of rules to be inferred. None of the aroaches can infer a rule s suort, and only GBRI and GenBR can infer confidences. Thirdly, consider whether the inference system resembles Armstrong s axioms; only the C-inference system does so. Fourthly, consider the number of rules generated. CoverRules generates the fewest, but GenBR organizes the basic association rules into classes related to the frequent itemsets. GenBR also tends to generate the fewest rules when minconf is high. Finally, with resect to the elased time, FastGenRules is the fastest on all datasets. GenBR is more efficient than the other aroaches, excet for a few cases. As described in the remainder of this section, exeriments on several synthetic and real-life datasets were conducted to comare the erformance of GenBR and revious algorithms with resect to the number of rules and the elased running time.
10 4.1 Exerimental Design The exerimental environment was a PC with a 2.53 GHz Intel CPU, 512MB of RAM, and Microsoft Windows XP. All algorithms were imlemented in Microsoft Visual Java++. The datasets used are listed in Table 4.2. Synthetic datasets, T10I4D100K and T20I6D100K, were generated by running GenData [17], which is a synthetic data generator from IBM. Several well-known benchmark real-world datasets were chosen from [29]. CustInfo-5 was obtained from the CustInfo database [4] by joining five tables. The number of attributes shown for the realworld datasets reresents the number after multi-valued attributes were converted to several binary attributes. Table 4.2. Synthetic and Real-world Datasets Dataset # Attributes #Items Length of Record or Transactions # Transactions Chess Connect Mushroom Molecular CustInfo T10I4D100K T20I6D100K To comare the erformance of GenBR and revious aroaches with resect to the number of rules and the elased running time, we imlemented the FastGenRules for generating association rules [2] as well as the GB algorithm for generating generic basis for exact rules, and the RI algorithm for generating informative basis for aroximate association rules [5]. The GB and the RI algorithms were combined into the GBRI algorithm for our exeriments. The FastGenReresentatives (denoted FastGenRR) algorithm generates a set RR of reresentative association rules [20]. CoverRules generates an informative cover of cover rules [11]. The GenBR algorithm consists of two stes to generate basic association rules. The first ste generates all classes of basic association rules. The second ste removes redundant classes. The GenBR algorithm is more similar to the FastGenRules algorithm than to the other algorithms. To comare the GenBR with FastGenRules with resect to the elased running time, we only consider the first ste of GenBR, which generates the basic association rules, and we ignore the elased time required for the second ste. 4.2 Results We first comared the algorithms with regard to the number of rules generated. GenBR generates fewer rules than FastGenRules, and in some cases fewer rules than FastGenRR, GBRI, and CoverRules. Table 4.3. Chess, minsu = 80 % Algorithm / minconf Fast- #Rules Elased time Rules (ms) #Rules GBRI Elased time Fast- #Rules RR Elased time Cover- #Rules Rules Elased time GenBR #Classes #Rules Elased time For Chess with minsu = 80%, the number of rules generated by the algorithms and their elased times are shown in Table 4.3. GenBR generates significantly fewer rules than FastGenRules for all settings of minconf. For examle, when minconf = 10%, FastGenRules generates rules, while GenBR generates rules. Among all aroaches, CoverRules generates the fewest rules when minconf is 90% or less. For examle, in Table 4.3, with minconf = 10%, CoverRules generates 226 rules while GenBR generates rules. However, when minconf is 100%, GenBR generates the fewest rules. For examle, in Table 4.3, GenBR generates only 6 rules, while the other aroaches generate 2228 or more rules. Table 4.4. Comarison wrt the Number of Rules. Dataset Minsu Fast- Rules Fast- RR GBRI Cover- Rules GenBR Chess 80% Chess 90% Connect-4 95% Connect-4 97% Mushroom 30% Mushroom 40% Molecular 5% Molecular 10% CustInfo-5 5% CustInfo-5 10% T10I4D100K 05% T10I4D100K 1% T20I6D100K 2% T20I6D100K 3% For other datasets and different arameters, the exerimental results concerning the number of rules generated are similar to those shown in Table 4.3. We summarize these exerimental results in Table 4.4. Because users are generally concerned with association rules with high confidences, we only describe the number of rules with 100% confidence (minconf = 100%) for all datasets besides CustInfo-5. For CustInfo-5, since there is no exact association rule when minsu = 5% or 10%, we use minconf = 90%. Under these conditions, GenBR
11 generates fewer rules than the other algorithms, excet for a few cases when CoverRules generates the fewest. Although the comutation comlexity of GenBR is exonentially greater than that of FastGenRules (as mentioned in Section 3.4), in our exeriments, we observed that the elased time for GenBR is significantly less than that for FastGenRules, excet for cases where both algorithms have their fastest erformance, which corresond to values of minconf aroaching 100%. For examle, in Table 4.3, when minconf is 10%, the elased time for GenBR is ms while that of FastGenRules is ms. When minconf is 100%, the elased time of GenBR is ms while the elased time of FastGenRules is 969 ms. Across all tested settings of minconf from 10% to 100%, GenBR has the lowest maximum elased time (25593 ms). The number of exact association rules greatly affects the elased time of GenBR. If there are more exact association rules, there may be more functional deendencies, and consequently, GenBR sends more time. We summarize the exerimental results related to elased time by recording the ratios of the elased time of GenBR to revious algorithms in Table 4.5. We set minconf = 100%. The bold entries identify cases where the corresonding algorithm is faster than GenBR. The results show that GenBR is faster than all algorithms excet FastGenRules on most resented datasets. Table 4.5. Evaluation wrt Elased Time Dataset Minsu Fast Fast Cover- Rules RR GBRI Rules Chess 80% Chess 90% Connect-4 95% Connect-4 97% Mushroom 30% Mushroom 40% Molecular 5% Molecular 10% CustInfo-5 5% CustInfo-5 10% T10I4D100K 05% T10I4D100K 1% T20I6D100K 2% T20I6D100K 3% Conclusions and Future Work 5.1 Conclusions In this aer, we roosed a new tye of association rules, called basic association rules. First, by referring the relational database theory on functional deendencies, we develoed the new concets of a restricted conditional robability distribution (RCPD), a minimal conditional robability distribution (MCPD), a minimal conditional subset (MCS), and a basic association rule. We established the C-inference system on basic association rules, which is similar to Armstrong s axioms on functional deendencies. Secondly, we roosed the GenBR algorithm for generating a set of classes of basic association rules from a set of frequent itemsets. GenBR efficiently generates basic association rules as comared with the revious aroaches for generating small sets of association rules. GenBR also generates fewer rules than revious aroaches when minconf is high. Thirdly, arguably, users will find basic association rules to be more manageable and understandable than reviously roosed reduced sets of association rules. The rules are concise, the number of rules is small, redundancy among rules in a class of basic association rules has been eliminated, and inference involving confidence values is ossible on the rules. This oint is argued at greater length in [22]. Fourthly, we showed that the search sace of our algorithm to comute basic association rules is a hyercube (n-cube) or Q n grah. This insight aided in our theoretical analysis of the algorithm. 5.2 Future Work An oen roblem is to find a more efficient algorithm for discovering basic association rules from frequent itemsets. Finding a heuristic method to discover basic association rules without generating frequent itemsets will also be challenging. We roosed the idea of a restricted conditional robability distribution as a foundation for mining association rules, and we distinguished it from the traditional conditional robability distribution. It can be regarded as a more general concet than the contextsecific conditional robability distribution described in [6]. Further work on restricted conditional robability distributions may be romising. We established the C-inference system on association rules, which may be regarded as an extension of Armstrong s axioms on functional deendencies. This kind of inference system works on exact and aroximate rules simultaneously. Further exloration of the roerties of the C-inference system is needed. References [1] R. Agrawal, T. Imielinski, and A. Swami. Mining association rules between sets of items in large databases. In Proc. SIGMOD 93, ages , Washington. DC, May [2] R. Agrawal and R. Srikant. Fast algorithm for mining association rules. In Proc. VLDB 94, Santiago, Chile, [3] E. Baralis and G. Psaila. Designing temlates for mining association rules. Journal of Intelligent Information Systems, 9(1):7-32, July [4] B. Barber and H. J. Hamilton. Extracting share frequent itemsets with infrequent subsets. Data
12 Mining and Knowledge Discovery, 7(2): , Aril [5] Y. Bastide, N. Pasquier, R. Taouil, G. Stumme, and L. Lakhal. Mining minimal non-redundant association rules using frequent closed itemsets. In Proc. of the First International Conference on Comutational Logic, ages , 200 [6] C. Boutilier, N. Friedman, M. Goldszmidt, and D. Koller. Context-secific indeendence in Bayesian networks. In Proc. 12 th Conf. on Uncertainty in Artificial Intelligence (UAI-96), ages , Portland, Oregon, [7] S. Brin, R. Motwani, and C. Silverstein. Beyond market baskets: Generalizing association rules to correlation. In Proc. SIGMOD 97, ages , May [8] S. Brin, R. Motwani, J. D. Ullman and S. Tsur. Dynamic itemset counting and imlication rules for market basket data. In Proc. SIGMOD 97, ages , Montreal, Canada, June [9] C. L. Carter, H. J. Hamilton, and N. Cercone. Share based measures for itemsets. In Proc. PKDD 97, ages 14-24, Trondheim, Norway, June [10] E. Castillo, J. M. Gutierrez, and A. S. Hadi, Exert Systems and Probabilistic Network Models, Sringer, [11] L. Cristofor and D. Simovici. Generating an informative cover for association rules. In Proc. of the IEEE International Conference on Data Mining, [12] B. A. Davey and H. A. Priestley. Introduction to Lattices and Order, Fourth edition. Cambridge University Press, [13] E. G. Goodaire and M. M. Parmenter. Discrete Mathematics with Grah Theory, Second Edition. Prentice Hall, Uer Saddle River, NJ, [14] J. Han, J. Pei, and Y. Yin. Mining frequent atterns without candidate generation. In Proc. SIGMOD 00, ages 1-12, Dallas, TX, May 200 [15] T. W. Haynes, S. T. Hedetniemi, and P. J. Slater. Fundamentals of Domination in Grahs. Marcel Dekker, [16] R. J. Hilderman and H. J. Hamilton. Knowledge Discovery and Interest Measures, Kluwer Academic, Boston, [17] IBM. GenData, [18] M. Kamber and R. Shinghal. Evaluating the interestingness of characteristic rules. In Proc. KDD 96, ages , Portland, Oregon, [19] M. Klemettinen, H. Mannila, P. Ronkainen, H. Toivnen, and A. I. Verkamo. Finding interesting rules from large sets of discovered association rules. In Proc. of the 3 rd CIKM Conference, ages , November [20] M. Kryszkiewicz. Reresentative association rules and minimum condition maximum consequence association rules. In Proc. PKDD 98. ages , Nantes, France, [21] W. Lee, S. J. Stolfo, and K. W. Mokl. A data mining framework for building intrusion detection models. In Proc. IEEE Symosium on Security and Privacy, [22] G. Li. Basic Association Rules, M.Sc. Thesis, Deartment of Comuter Science, University of Regina, [23] W. Lin, S. A. Alvarez, and C. Ruiz. Collaborative recommendation via adative association rules mining. In Proc. WEBKDD 2000, San Francisco, CA, Aug. 200 [24] N. Pasquier, Y. Bastide, R. Taouil, and L. Lakhal. Closed set based discovery of small covers for association rules. In Proc. of the 15 th Conference on Advanced Databases, ages , [25] C. M. Ricardo. Database Systems: Princiles, Design, & Imlementation, Macmillan, 199 [26] K. Satou, G. Shibayama, T. Ono, Y. Yamamura, E. Furuichi, S. Kuhara, and T. Takagi. Finding association rules on heterogeneous genome data. In Proc. of the Pacific Symosium on Biocomuting 97 (PSB 97), ages , Hawaii, Jan [27] R. Srikant, Q. Vu, and R. Agrawal. Mining association rules with item constraints. In Proc. KDD 97, ages [28] H. Toivonen, M. Klemettinen, P. Ronkainen, K. Hatonen, and H. Mannila. Pruning and grouing discovered association rules. In Proc. ECML-95 Worksho on Statistics, Machine Learning, and Knowledge Discovery in Database, ages 47-52, Aril [29] UCI Machine Learning Reository. ft://ft.ics.uci.edu/ub/machine-learningdatabases/ [30] M. Zaki, Generating non-redundant association rules. In Proc. KDD 2000, ages August 200
Efficient discovery of statistically significant association rules
Efficient discovery of statistically significant association rules Wilhelmiina Hämäläinen Deartment of Comuter Science University of Helsinki Finland whamalai@cs.helsinki.fi ABSTRACT In this aer, we introduce
More informationPositive Borders or Negative Borders: How to Make Lossless Generator Based Representations Concise
Positive Borders or Negative Borders: How to Make Lossless Generator Based Representations Concise Guimei Liu 1,2 Jinyan Li 1 Limsoon Wong 2 Wynne Hsu 2 1 Institute for Infocomm Research, Singapore 2 School
More informationData Mining: Concepts and Techniques. (3 rd ed.) Chapter 6
Data Mining: Concepts and Techniques (3 rd ed.) Chapter 6 Jiawei Han, Micheline Kamber, and Jian Pei University of Illinois at Urbana-Champaign & Simon Fraser University 2013 Han, Kamber & Pei. All rights
More informationChapter 6. Frequent Pattern Mining: Concepts and Apriori. Meng Jiang CSE 40647/60647 Data Science Fall 2017 Introduction to Data Mining
Chapter 6. Frequent Pattern Mining: Concepts and Apriori Meng Jiang CSE 40647/60647 Data Science Fall 2017 Introduction to Data Mining Pattern Discovery: Definition What are patterns? Patterns: A set of
More informationCSE 5243 INTRO. TO DATA MINING
CSE 5243 INTRO. TO DATA MINING Mining Frequent Patterns and Associations: Basic Concepts (Chapter 6) Huan Sun, CSE@The Ohio State University Slides adapted from Prof. Jiawei Han @UIUC, Prof. Srinivasan
More informationData Mining Association Rules: Advanced Concepts and Algorithms. Lecture Notes for Chapter 7. Introduction to Data Mining
Data Mining Association Rules: Advanced Concets and Algorithms Lecture Notes for Chater 7 Introduction to Data Mining by Tan, Steinbach, Kumar Tan,Steinbach, Kumar Introduction to Data Mining 4/18/24 1
More informationData Mining and Analysis: Fundamental Concepts and Algorithms
Data Mining and Analysis: Fundamental Concepts and Algorithms dataminingbook.info Mohammed J. Zaki 1 Wagner Meira Jr. 2 1 Department of Computer Science Rensselaer Polytechnic Institute, Troy, NY, USA
More informationEncyclopedia of Machine Learning Chapter Number Book CopyRight - Year 2010 Frequent Pattern. Given Name Hannu Family Name Toivonen
Book Title Encyclopedia of Machine Learning Chapter Number 00403 Book CopyRight - Year 2010 Title Frequent Pattern Author Particle Given Name Hannu Family Name Toivonen Suffix Email hannu.toivonen@cs.helsinki.fi
More informationCSE 5243 INTRO. TO DATA MINING
CSE 5243 INTRO. TO DATA MINING Mining Frequent Patterns and Associations: Basic Concepts (Chapter 6) Huan Sun, CSE@The Ohio State University 10/17/2017 Slides adapted from Prof. Jiawei Han @UIUC, Prof.
More informationMODELING THE RELIABILITY OF C4ISR SYSTEMS HARDWARE/SOFTWARE COMPONENTS USING AN IMPROVED MARKOV MODEL
Technical Sciences and Alied Mathematics MODELING THE RELIABILITY OF CISR SYSTEMS HARDWARE/SOFTWARE COMPONENTS USING AN IMPROVED MARKOV MODEL Cezar VASILESCU Regional Deartment of Defense Resources Management
More informationTopic: Lower Bounds on Randomized Algorithms Date: September 22, 2004 Scribe: Srinath Sridhar
15-859(M): Randomized Algorithms Lecturer: Anuam Guta Toic: Lower Bounds on Randomized Algorithms Date: Setember 22, 2004 Scribe: Srinath Sridhar 4.1 Introduction In this lecture, we will first consider
More informationDistributed Rule-Based Inference in the Presence of Redundant Information
istribution Statement : roved for ublic release; distribution is unlimited. istributed Rule-ased Inference in the Presence of Redundant Information June 8, 004 William J. Farrell III Lockheed Martin dvanced
More informationA New Concise and Lossless Representation of Frequent Itemsets Using Generators and A Positive Border
A New Concise and Lossless Representation of Frequent Itemsets Using Generators and A Positive Border Guimei Liu a,b Jinyan Li a Limsoon Wong b a Institute for Infocomm Research, Singapore b School of
More information732A61/TDDD41 Data Mining - Clustering and Association Analysis
732A61/TDDD41 Data Mining - Clustering and Association Analysis Lecture 6: Association Analysis I Jose M. Peña IDA, Linköping University, Sweden 1/14 Outline Content Association Rules Frequent Itemsets
More informationAssociation Rule. Lecturer: Dr. Bo Yuan. LOGO
Association Rule Lecturer: Dr. Bo Yuan LOGO E-mail: yuanb@sz.tsinghua.edu.cn Overview Frequent Itemsets Association Rules Sequential Patterns 2 A Real Example 3 Market-Based Problems Finding associations
More informationD B M G Data Base and Data Mining Group of Politecnico di Torino
Data Base and Data Mining Group of Politecnico di Torino Politecnico di Torino Association rules Objective extraction of frequent correlations or pattern from a transactional database Tickets at a supermarket
More informationMining Class-Dependent Rules Using the Concept of Generalization/Specialization Hierarchies
Mining Class-Dependent Rules Using the Concept of Generalization/Specialization Hierarchies Juliano Brito da Justa Neves 1 Marina Teresa Pires Vieira {juliano,marina}@dc.ufscar.br Computer Science Department
More informationAssociation Rules. Fundamentals
Politecnico di Torino Politecnico di Torino 1 Association rules Objective extraction of frequent correlations or pattern from a transactional database Tickets at a supermarket counter Association rule
More informationD B M G. Association Rules. Fundamentals. Fundamentals. Elena Baralis, Silvia Chiusano. Politecnico di Torino 1. Definitions.
Definitions Data Base and Data Mining Group of Politecnico di Torino Politecnico di Torino Itemset is a set including one or more items Example: {Beer, Diapers} k-itemset is an itemset that contains k
More informationD B M G. Association Rules. Fundamentals. Fundamentals. Association rules. Association rule mining. Definitions. Rule quality metrics: example
Association rules Data Base and Data Mining Group of Politecnico di Torino Politecnico di Torino Objective extraction of frequent correlations or pattern from a transactional database Tickets at a supermarket
More informationLinear diophantine equations for discrete tomography
Journal of X-Ray Science and Technology 10 001 59 66 59 IOS Press Linear diohantine euations for discrete tomograhy Yangbo Ye a,gewang b and Jiehua Zhu a a Deartment of Mathematics, The University of Iowa,
More informationAn Analysis of Reliable Classifiers through ROC Isometrics
An Analysis of Reliable Classifiers through ROC Isometrics Stijn Vanderlooy s.vanderlooy@cs.unimaas.nl Ida G. Srinkhuizen-Kuyer kuyer@cs.unimaas.nl Evgueni N. Smirnov smirnov@cs.unimaas.nl MICC-IKAT, Universiteit
More informationASSOCIATION ANALYSIS FREQUENT ITEMSETS MINING. Alexandre Termier, LIG
ASSOCIATION ANALYSIS FREQUENT ITEMSETS MINING, LIG M2 SIF DMV course 207/208 Market basket analysis Analyse supermarket s transaction data Transaction = «market basket» of a customer Find which items are
More informationShadow Computing: An Energy-Aware Fault Tolerant Computing Model
Shadow Comuting: An Energy-Aware Fault Tolerant Comuting Model Bryan Mills, Taieb Znati, Rami Melhem Deartment of Comuter Science University of Pittsburgh (bmills, znati, melhem)@cs.itt.edu Index Terms
More informationCOMP 5331: Knowledge Discovery and Data Mining
COMP 5331: Knowledge Discovery and Data Mining Acknowledgement: Slides modified by Dr. Lei Chen based on the slides provided by Jiawei Han, Micheline Kamber, and Jian Pei And slides provide by Raymond
More informationDistributed Mining of Frequent Closed Itemsets: Some Preliminary Results
Distributed Mining of Frequent Closed Itemsets: Some Preliminary Results Claudio Lucchese Ca Foscari University of Venice clucches@dsi.unive.it Raffaele Perego ISTI-CNR of Pisa perego@isti.cnr.it Salvatore
More informationUncorrelated Multilinear Principal Component Analysis for Unsupervised Multilinear Subspace Learning
TNN-2009-P-1186.R2 1 Uncorrelated Multilinear Princial Comonent Analysis for Unsuervised Multilinear Subsace Learning Haiing Lu, K. N. Plataniotis and A. N. Venetsanooulos The Edward S. Rogers Sr. Deartment
More informationOn Line Parameter Estimation of Electric Systems using the Bacterial Foraging Algorithm
On Line Parameter Estimation of Electric Systems using the Bacterial Foraging Algorithm Gabriel Noriega, José Restreo, Víctor Guzmán, Maribel Giménez and José Aller Universidad Simón Bolívar Valle de Sartenejas,
More informationA Qualitative Event-based Approach to Multiple Fault Diagnosis in Continuous Systems using Structural Model Decomposition
A Qualitative Event-based Aroach to Multile Fault Diagnosis in Continuous Systems using Structural Model Decomosition Matthew J. Daigle a,,, Anibal Bregon b,, Xenofon Koutsoukos c, Gautam Biswas c, Belarmino
More informationFrequent Itemset Mining
ì 1 Frequent Itemset Mining Nadjib LAZAAR LIRMM- UM COCONUT Team (PART I) IMAGINA 17/18 Webpage: http://www.lirmm.fr/~lazaar/teaching.html Email: lazaar@lirmm.fr 2 Data Mining ì Data Mining (DM) or Knowledge
More informationOn split sample and randomized confidence intervals for binomial proportions
On slit samle and randomized confidence intervals for binomial roortions Måns Thulin Deartment of Mathematics, Usala University arxiv:1402.6536v1 [stat.me] 26 Feb 2014 Abstract Slit samle methods have
More informationGIVEN an input sequence x 0,..., x n 1 and the
1 Running Max/Min Filters using 1 + o(1) Comarisons er Samle Hao Yuan, Member, IEEE, and Mikhail J. Atallah, Fellow, IEEE Abstract A running max (or min) filter asks for the maximum or (minimum) elements
More informationOn the Toppling of a Sand Pile
Discrete Mathematics and Theoretical Comuter Science Proceedings AA (DM-CCG), 2001, 275 286 On the Toling of a Sand Pile Jean-Christohe Novelli 1 and Dominique Rossin 2 1 CNRS, LIFL, Bâtiment M3, Université
More informationDiscovering Non-Redundant Association Rules using MinMax Approximation Rules
Discovering Non-Redundant Association Rules using MinMax Approximation Rules R. Vijaya Prakash Department Of Informatics Kakatiya University, Warangal, India vijprak@hotmail.com Dr.A. Govardhan Department.
More informationConvex Optimization methods for Computing Channel Capacity
Convex Otimization methods for Comuting Channel Caacity Abhishek Sinha Laboratory for Information and Decision Systems (LIDS), MIT sinhaa@mit.edu May 15, 2014 We consider a classical comutational roblem
More informationMATHEMATICAL MODELLING OF THE WIRELESS COMMUNICATION NETWORK
Comuter Modelling and ew Technologies, 5, Vol.9, o., 3-39 Transort and Telecommunication Institute, Lomonosov, LV-9, Riga, Latvia MATHEMATICAL MODELLIG OF THE WIRELESS COMMUICATIO ETWORK M. KOPEETSK Deartment
More informationDATA MINING - 1DL360
DATA MINING - 1DL36 Fall 212" An introductory class in data mining http://www.it.uu.se/edu/course/homepage/infoutv/ht12 Kjell Orsborn Uppsala Database Laboratory Department of Information Technology, Uppsala
More informationTowards understanding the Lorenz curve using the Uniform distribution. Chris J. Stephens. Newcastle City Council, Newcastle upon Tyne, UK
Towards understanding the Lorenz curve using the Uniform distribution Chris J. Stehens Newcastle City Council, Newcastle uon Tyne, UK (For the Gini-Lorenz Conference, University of Siena, Italy, May 2005)
More informationApproximating min-max k-clustering
Aroximating min-max k-clustering Asaf Levin July 24, 2007 Abstract We consider the roblems of set artitioning into k clusters with minimum total cost and minimum of the maximum cost of a cluster. The cost
More informationElementary Analysis in Q p
Elementary Analysis in Q Hannah Hutter, May Szedlák, Phili Wirth November 17, 2011 This reort follows very closely the book of Svetlana Katok 1. 1 Sequences and Series In this section we will see some
More informationCS 584 Data Mining. Association Rule Mining 2
CS 584 Data Mining Association Rule Mining 2 Recall from last time: Frequent Itemset Generation Strategies Reduce the number of candidates (M) Complete search: M=2 d Use pruning techniques to reduce M
More informationLecture Notes for Chapter 6. Introduction to Data Mining
Data Mining Association Analysis: Basic Concepts and Algorithms Lecture Notes for Chapter 6 Introduction to Data Mining by Tan, Steinbach, Kumar Tan,Steinbach, Kumar Introduction to Data Mining 4/18/2004
More informationMining Free Itemsets under Constraints
Mining Free Itemsets under Constraints Jean-François Boulicaut Baptiste Jeudy Institut National des Sciences Appliquées de Lyon Laboratoire d Ingénierie des Systèmes d Information Bâtiment 501 F-69621
More informationIntroduction to Probability and Statistics
Introduction to Probability and Statistics Chater 8 Ammar M. Sarhan, asarhan@mathstat.dal.ca Deartment of Mathematics and Statistics, Dalhousie University Fall Semester 28 Chater 8 Tests of Hyotheses Based
More informationDATA MINING LECTURE 3. Frequent Itemsets Association Rules
DATA MINING LECTURE 3 Frequent Itemsets Association Rules This is how it all started Rakesh Agrawal, Tomasz Imielinski, Arun N. Swami: Mining Association Rules between Sets of Items in Large Databases.
More informationFor q 0; 1; : : : ; `? 1, we have m 0; 1; : : : ; q? 1. The set fh j(x) : j 0; 1; ; : : : ; `? 1g forms a basis for the tness functions dened on the i
Comuting with Haar Functions Sami Khuri Deartment of Mathematics and Comuter Science San Jose State University One Washington Square San Jose, CA 9519-0103, USA khuri@juiter.sjsu.edu Fax: (40)94-500 Keywords:
More informationA Comparison between Biased and Unbiased Estimators in Ordinary Least Squares Regression
Journal of Modern Alied Statistical Methods Volume Issue Article 7 --03 A Comarison between Biased and Unbiased Estimators in Ordinary Least Squares Regression Ghadban Khalaf King Khalid University, Saudi
More informationThe Graph Accessibility Problem and the Universality of the Collision CRCW Conflict Resolution Rule
The Grah Accessibility Problem and the Universality of the Collision CRCW Conflict Resolution Rule STEFAN D. BRUDA Deartment of Comuter Science Bisho s University Lennoxville, Quebec J1M 1Z7 CANADA bruda@cs.ubishos.ca
More informationOn Minimal Infrequent Itemset Mining
On Minimal Infrequent Itemset Mining David J. Haglin and Anna M. Manning Abstract A new algorithm for minimal infrequent itemset mining is presented. Potential applications of finding infrequent itemsets
More informationUncertainty Modeling with Interval Type-2 Fuzzy Logic Systems in Mobile Robotics
Uncertainty Modeling with Interval Tye-2 Fuzzy Logic Systems in Mobile Robotics Ondrej Linda, Student Member, IEEE, Milos Manic, Senior Member, IEEE bstract Interval Tye-2 Fuzzy Logic Systems (IT2 FLSs)
More informationA PROBABILISTIC POWER ESTIMATION METHOD FOR COMBINATIONAL CIRCUITS UNDER REAL GATE DELAY MODEL
A PROBABILISTIC POWER ESTIMATION METHOD FOR COMBINATIONAL CIRCUITS UNDER REAL GATE DELAY MODEL G. Theodoridis, S. Theoharis, D. Soudris*, C. Goutis VLSI Design Lab, Det. of Electrical and Comuter Eng.
More informationFE FORMULATIONS FOR PLASTICITY
G These slides are designed based on the book: Finite Elements in Plasticity Theory and Practice, D.R.J. Owen and E. Hinton, 1970, Pineridge Press Ltd., Swansea, UK. 1 Course Content: A INTRODUCTION AND
More informationAssociation Rules. Acknowledgements. Some parts of these slides are modified from. n C. Clifton & W. Aref, Purdue University
Association Rules CS 5331 by Rattikorn Hewett Texas Tech University 1 Acknowledgements Some parts of these slides are modified from n C. Clifton & W. Aref, Purdue University 2 1 Outline n Association Rule
More informationRANDOM WALKS AND PERCOLATION: AN ANALYSIS OF CURRENT RESEARCH ON MODELING NATURAL PROCESSES
RANDOM WALKS AND PERCOLATION: AN ANALYSIS OF CURRENT RESEARCH ON MODELING NATURAL PROCESSES AARON ZWIEBACH Abstract. In this aer we will analyze research that has been recently done in the field of discrete
More informationConstruction of High-Girth QC-LDPC Codes
MITSUBISHI ELECTRIC RESEARCH LABORATORIES htt://wwwmerlcom Construction of High-Girth QC-LDPC Codes Yige Wang, Jonathan Yedidia, Stark Draer TR2008-061 Setember 2008 Abstract We describe a hill-climbing
More informationData Mining Concepts & Techniques
Data Mining Concepts & Techniques Lecture No. 04 Association Analysis Naeem Ahmed Email: naeemmahoto@gmail.com Department of Software Engineering Mehran Univeristy of Engineering and Technology Jamshoro
More informationOn the capacity of the general trapdoor channel with feedback
On the caacity of the general tradoor channel with feedback Jui Wu and Achilleas Anastasooulos Electrical Engineering and Comuter Science Deartment University of Michigan Ann Arbor, MI, 48109-1 email:
More informationMATH 2710: NOTES FOR ANALYSIS
MATH 270: NOTES FOR ANALYSIS The main ideas we will learn from analysis center around the idea of a limit. Limits occurs in several settings. We will start with finite limits of sequences, then cover infinite
More informationDATA MINING - 1DL360
DATA MINING - DL360 Fall 200 An introductory class in data mining http://www.it.uu.se/edu/course/homepage/infoutv/ht0 Kjell Orsborn Uppsala Database Laboratory Department of Information Technology, Uppsala
More informationA Unified 2D Representation of Fuzzy Reasoning, CBR, and Experience Based Reasoning
University of Wollongong Research Online Faculty of Commerce - aers (Archive) Faculty of Business 26 A Unified 2D Reresentation of Fuzzy Reasoning, CBR, and Exerience Based Reasoning Zhaohao Sun University
More informationCombining Logistic Regression with Kriging for Mapping the Risk of Occurrence of Unexploded Ordnance (UXO)
Combining Logistic Regression with Kriging for Maing the Risk of Occurrence of Unexloded Ordnance (UXO) H. Saito (), P. Goovaerts (), S. A. McKenna (2) Environmental and Water Resources Engineering, Deartment
More informationarxiv: v1 [physics.data-an] 26 Oct 2012
Constraints on Yield Parameters in Extended Maximum Likelihood Fits Till Moritz Karbach a, Maximilian Schlu b a TU Dortmund, Germany, moritz.karbach@cern.ch b TU Dortmund, Germany, maximilian.schlu@cern.ch
More informationCS 484 Data Mining. Association Rule Mining 2
CS 484 Data Mining Association Rule Mining 2 Review: Reducing Number of Candidates Apriori principle: If an itemset is frequent, then all of its subsets must also be frequent Apriori principle holds due
More informationBayesian System for Differential Cryptanalysis of DES
Available online at www.sciencedirect.com ScienceDirect IERI Procedia 7 (014 ) 15 0 013 International Conference on Alied Comuting, Comuter Science, and Comuter Engineering Bayesian System for Differential
More informationA New Method of DDB Logical Structure Synthesis Using Distributed Tabu Search
A New Method of DDB Logical Structure Synthesis Using Distributed Tabu Search Eduard Babkin and Margarita Karunina 2, National Research University Higher School of Economics Det of nformation Systems and
More informationTopic 7: Using identity types
Toic 7: Using identity tyes June 10, 2014 Now we would like to learn how to use identity tyes and how to do some actual mathematics with them. By now we have essentially introduced all inference rules
More informationBasic Data Structures and Algorithms for Data Profiling Felix Naumann
Basic Data Structures and Algorithms for 8.5.2017 Overview 1. The lattice 2. Apriori lattice traversal 3. Position List Indices 4. Bloom filters Slides with Thorsten Papenbrock 2 Definitions Lattice Partially
More informationProbability Estimates for Multi-class Classification by Pairwise Coupling
Probability Estimates for Multi-class Classification by Pairwise Couling Ting-Fan Wu Chih-Jen Lin Deartment of Comuter Science National Taiwan University Taiei 06, Taiwan Ruby C. Weng Deartment of Statistics
More informationImproved Capacity Bounds for the Binary Energy Harvesting Channel
Imroved Caacity Bounds for the Binary Energy Harvesting Channel Kaya Tutuncuoglu 1, Omur Ozel 2, Aylin Yener 1, and Sennur Ulukus 2 1 Deartment of Electrical Engineering, The Pennsylvania State University,
More information4. Score normalization technical details We now discuss the technical details of the score normalization method.
SMT SCORING SYSTEM This document describes the scoring system for the Stanford Math Tournament We begin by giving an overview of the changes to scoring and a non-technical descrition of the scoring rules
More informationDecoding Linear Block Codes Using a Priority-First Search: Performance Analysis and Suboptimal Version
IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 44, NO. 3, MAY 1998 133 Decoding Linear Block Codes Using a Priority-First Search Performance Analysis Subotimal Version Yunghsiang S. Han, Member, IEEE, Carlos
More informationSystem Reliability Estimation and Confidence Regions from Subsystem and Full System Tests
009 American Control Conference Hyatt Regency Riverfront, St. Louis, MO, USA June 0-, 009 FrB4. System Reliability Estimation and Confidence Regions from Subsystem and Full System Tests James C. Sall Abstract
More informationMining Positive and Negative Fuzzy Association Rules
Mining Positive and Negative Fuzzy Association Rules Peng Yan 1, Guoqing Chen 1, Chris Cornelis 2, Martine De Cock 2, and Etienne Kerre 2 1 School of Economics and Management, Tsinghua University, Beijing
More informationGeneration of Linear Models using Simulation Results
4. IMACS-Symosium MATHMOD, Wien, 5..003,. 436-443 Generation of Linear Models using Simulation Results Georg Otte, Sven Reitz, Joachim Haase Fraunhofer Institute for Integrated Circuits, Branch Lab Design
More informationResearch of PMU Optimal Placement in Power Systems
Proceedings of the 5th WSEAS/IASME Int. Conf. on SYSTEMS THEORY and SCIENTIFIC COMPUTATION, Malta, Setember 15-17, 2005 (38-43) Research of PMU Otimal Placement in Power Systems TIAN-TIAN CAI, QIAN AI
More informationA Special Case Solution to the Perspective 3-Point Problem William J. Wolfe California State University Channel Islands
A Secial Case Solution to the Persective -Point Problem William J. Wolfe California State University Channel Islands william.wolfe@csuci.edu Abstract In this aer we address a secial case of the ersective
More informationEvaluating Circuit Reliability Under Probabilistic Gate-Level Fault Models
Evaluating Circuit Reliability Under Probabilistic Gate-Level Fault Models Ketan N. Patel, Igor L. Markov and John P. Hayes University of Michigan, Ann Arbor 48109-2122 {knatel,imarkov,jhayes}@eecs.umich.edu
More informationLogistics Optimization Using Hybrid Metaheuristic Approach under Very Realistic Conditions
17 th Euroean Symosium on Comuter Aided Process Engineering ESCAPE17 V. Plesu and P.S. Agachi (Editors) 2007 Elsevier B.V. All rights reserved. 1 Logistics Otimization Using Hybrid Metaheuristic Aroach
More informationMonopolist s mark-up and the elasticity of substitution
Croatian Oerational Research Review 377 CRORR 8(7), 377 39 Monoolist s mark-u and the elasticity of substitution Ilko Vrankić, Mira Kran, and Tomislav Herceg Deartment of Economic Theory, Faculty of Economics
More informationA Parallel Algorithm for Minimization of Finite Automata
A Parallel Algorithm for Minimization of Finite Automata B. Ravikumar X. Xiong Deartment of Comuter Science University of Rhode Island Kingston, RI 02881 E-mail: fravi,xiongg@cs.uri.edu Abstract In this
More informationarxiv: v2 [quant-ph] 2 Aug 2012
Qcomiler: quantum comilation with CSD method Y. G. Chen a, J. B. Wang a, a School of Physics, The University of Western Australia, Crawley WA 6009 arxiv:208.094v2 [quant-h] 2 Aug 202 Abstract In this aer,
More informationPretest (Optional) Use as an additional pacing tool to guide instruction. August 21
Trimester 1 Pretest (Otional) Use as an additional acing tool to guide instruction. August 21 Beyond the Basic Facts In Trimester 1, Grade 8 focus on multilication. Daily Unit 1: Rational vs. Irrational
More informationHomework Set #3 Rates definitions, Channel Coding, Source-Channel coding
Homework Set # Rates definitions, Channel Coding, Source-Channel coding. Rates (a) Channels coding Rate: Assuming you are sending 4 different messages using usages of a channel. What is the rate (in bits
More informationInformation collection on a graph
Information collection on a grah Ilya O. Ryzhov Warren Powell February 10, 2010 Abstract We derive a knowledge gradient olicy for an otimal learning roblem on a grah, in which we use sequential measurements
More informationDETC2003/DAC AN EFFICIENT ALGORITHM FOR CONSTRUCTING OPTIMAL DESIGN OF COMPUTER EXPERIMENTS
Proceedings of DETC 03 ASME 003 Design Engineering Technical Conferences and Comuters and Information in Engineering Conference Chicago, Illinois USA, Setember -6, 003 DETC003/DAC-48760 AN EFFICIENT ALGORITHM
More informationA MIXED CONTROL CHART ADAPTED TO THE TRUNCATED LIFE TEST BASED ON THE WEIBULL DISTRIBUTION
O P E R A T I O N S R E S E A R C H A N D D E C I S I O N S No. 27 DOI:.5277/ord73 Nasrullah KHAN Muhammad ASLAM 2 Kyung-Jun KIM 3 Chi-Hyuck JUN 4 A MIXED CONTROL CHART ADAPTED TO THE TRUNCATED LIFE TEST
More informationHotelling s Two- Sample T 2
Chater 600 Hotelling s Two- Samle T Introduction This module calculates ower for the Hotelling s two-grou, T-squared (T) test statistic. Hotelling s T is an extension of the univariate two-samle t-test
More informationFrequent Itemset Mining
1 Frequent Itemset Mining Nadjib LAZAAR LIRMM- UM IMAGINA 15/16 2 Frequent Itemset Mining: Motivations Frequent Itemset Mining is a method for market basket analysis. It aims at finding regulariges in
More informationLilian Markenzon 1, Nair Maria Maia de Abreu 2* and Luciana Lee 3
Pesquisa Oeracional (2013) 33(1): 123-132 2013 Brazilian Oerations Research Society Printed version ISSN 0101-7438 / Online version ISSN 1678-5142 www.scielo.br/oe SOME RESULTS ABOUT THE CONNECTIVITY OF
More informationand INVESTMENT DECISION-MAKING
CONSISTENT INTERPOLATIVE FUZZY LOGIC and INVESTMENT DECISION-MAKING Darko Kovačević, č Petar Sekulić and dal Aleksandar Rakićević National bank of Serbia Belgrade, Jun 23th, 20 The structure of the resentation
More informationAn Ant Colony Optimization Approach to the Probabilistic Traveling Salesman Problem
An Ant Colony Otimization Aroach to the Probabilistic Traveling Salesman Problem Leonora Bianchi 1, Luca Maria Gambardella 1, and Marco Dorigo 2 1 IDSIA, Strada Cantonale Galleria 2, CH-6928 Manno, Switzerland
More informationDetection Algorithm of Particle Contamination in Reticle Images with Continuous Wavelet Transform
Detection Algorithm of Particle Contamination in Reticle Images with Continuous Wavelet Transform Chaoquan Chen and Guoing Qiu School of Comuter Science and IT Jubilee Camus, University of Nottingham Nottingham
More informationBayesian Networks Practice
Bayesian Networks Practice Part 2 2016-03-17 Byoung-Hee Kim, Seong-Ho Son Biointelligence Lab, CSE, Seoul National University Agenda Probabilistic Inference in Bayesian networks Probability basics D-searation
More informationMatching Partition a Linked List and Its Optimization
Matching Partition a Linked List and Its Otimization Yijie Han Deartment of Comuter Science University of Kentucky Lexington, KY 40506 ABSTRACT We show the curve O( n log i + log (i) n + log i) for the
More informationTests for Two Proportions in a Stratified Design (Cochran/Mantel-Haenszel Test)
Chater 225 Tests for Two Proortions in a Stratified Design (Cochran/Mantel-Haenszel Test) Introduction In a stratified design, the subects are selected from two or more strata which are formed from imortant
More informationwhere x i is the ith coordinate of x R N. 1. Show that the following upper bound holds for the growth function of H:
Mehryar Mohri Foundations of Machine Learning Courant Institute of Mathematical Sciences Homework assignment 2 October 25, 2017 Due: November 08, 2017 A. Growth function Growth function of stum functions.
More informationUniversal Finite Memory Coding of Binary Sequences
Deartment of Electrical Engineering Systems Universal Finite Memory Coding of Binary Sequences Thesis submitted towards the degree of Master of Science in Electrical and Electronic Engineering in Tel-Aviv
More informationEffective Elimination of Redundant Association Rules
Effective Elimination of Redundant Association Rules James Cheng Yiping Ke Wilfred Ng Department of Computer Science and Engineering The Hong Kong University of Science and Technology Clear Water Bay,
More informationFree-sets : a Condensed Representation of Boolean Data for the Approximation of Frequency Queries
Free-sets : a Condensed Representation of Boolean Data for the Approximation of Frequency Queries To appear in Data Mining and Knowledge Discovery, an International Journal c Kluwer Academic Publishers
More informationGOOD MODELS FOR CUBIC SURFACES. 1. Introduction
GOOD MODELS FOR CUBIC SURFACES ANDREAS-STEPHAN ELSENHANS Abstract. This article describes an algorithm for finding a model of a hyersurface with small coefficients. It is shown that the aroach works in
More information