XReason: A Semantic Approach that Reasons with Patterns to Answer XML Keyword Queries


Cem Aksoy (1), Aggeliki Dimitriou (2), Dimitri Theodoratos (1), Xiaoying Wu (3)
(1) New Jersey Institute of Technology, Newark, NJ, USA
(2) National Technical University of Athens, Athens, Greece
(3) Wuhan University, Wuhan, China

Introduction
Keyword search is a very popular technique: easy for users, hard for systems!
Queries are unstructured sets of keywords and tend to be ambiguous.
Our focus in this work is XML data, a popular export and exchange format with tree-structured data.


Example XML database
[Figure: a bibliography XML tree rooted at bib with four paper subtrees, nodes numbered 1 to 23. Node labels: bib, paper, author, affl, name, title, cite, booktitle. Text values include MIT, Miller, UIC, Burt, XML Integration, XML Design, XML Query, Data Integration, VLDB, SIGMOD and EDBT.]

Search on XML
Languages such as XQuery and XPath can be used.
An example XQuery expression:
    for $x in doc("bibliography.xml")/bib/paper where $x/booktitle = "dasfaa" return $x/author
But they have problems: the language syntax is complex, and the user needs to know the database schema.

Keyword Search on XML
Query: a set of keywords.
Answer: subtrees of the XML tree (not whole documents).
Minimum connecting trees (MCTs) are commonly used as results. The root of an MCT is the lowest common ancestor (LCA) of the keyword matches.
The number of results is large and most of them are irrelevant, so ranking is important.

XML Keyword Search Example
Q = {physics, james, harrison}
R1 = {(physics, 4), (james, 9), (harrison, 10)}
R2 = {(physics, 15), (james, 19), (harrison, 20)}
R3 = {(physics, 7), (james, 13), (harrison, 14)}
3 x 3 x 3 = 27 candidate results!
[Figure: example XML tree with nodes numbered 1 to 20. Recoverable labels include title, prerequisite, fname, lname, textbook and author, with values [Physics II], [Statistical Physics], [Physics I] and [Calculus]. The results R1, R2 and R3 are marked on the tree.]
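The count of 27 follows from pairing every match of one keyword with every match of the others. A minimal sketch, assuming (as R1 to R3 suggest) that each keyword matches exactly three nodes; the variable names are illustrative and not from the slides:

```python
from itertools import product

# Matching node ids per keyword, read off R1, R2 and R3 above.
matches = {
    "physics": [4, 15, 7],
    "james": [9, 19, 13],
    "harrison": [10, 20, 14],
}

# A candidate result picks one matching node for every keyword.
candidates = list(product(*matches.values()))
print(len(candidates))  # 27
```

R1, R2 and R3 are three of these 27 combinations; most of the others connect keyword matches that lie far apart in the tree.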


XML Keyword Search Example
Q = {physics, james, harrison}
Smallest LCA (SLCA)
[Figure: the example tree with the LCAs of R1, R2 and R3 marked; each LCA is labeled as an SLCA or as not an SLCA.]

XML Keyword Search Example
Q = {physics, james, harrison}
Exclusive LCA (ELCA)
[Figure: the example tree with the LCAs of R1, R2 and R3 marked; nodes are additionally labeled as ELCA or SLCA.]

XML Keyword Search Example
Q = {physics, james, harrison}
Reasonable but ad hoc! Decisions are based on locality; patterns are not considered.
[Figure: the example tree with the LCA, SLCA and ELCA labels from the previous slides.]

Contributions
Novel keyword search semantics on XML which reasons with keyword query patterns.
Reasoning is based on homomorphisms between patterns.
Ranking and filtering semantics are based on a graph of patterns.
A stack-based algorithm for generating patterns.
Promising effectiveness and efficiency experimental results on real and benchmark datasets.

Definitions: Data Model
XML data is modeled as an ordered, node-labeled tree.
Nodes are encoded with Dewey codes.
1 bib
  1.1 paper
    1.1.1 author
      1.1.1.1 affl: MIT
      1.1.1.2 name: Miller
    1.1.2 title: XML Integration
    1.1.3 booktitle: VLDB
  1.2 paper
    1.2.1 author
      1.2.1.1 affl: UIC
      1.2.1.2 name: Burt
    1.2.2 title: XML Design
    1.2.3 booktitle: SIGMOD
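One reason Dewey codes are convenient here is that the code of a node extends the code of its parent, so the LCA of a set of nodes is the longest common prefix of their codes. A small sketch of that property; the function names are illustrative and not taken from the slides:

```python
def dewey(code):
    """Parse a Dewey code such as '1.1.1.2' into a tuple of ints."""
    return tuple(int(part) for part in code.split("."))


def lca(codes):
    """Dewey code of the lowest common ancestor: the longest common prefix."""
    parsed = [dewey(c) for c in codes]
    prefix = []
    for parts in zip(*parsed):
        if len(set(parts)) != 1:
            break
        prefix.append(parts[0])
    return ".".join(str(p) for p in prefix)


print(lca(["1.1.1.2", "1.1.2"]))  # 1.1
print(lca(["1.1.1.2", "1.2.2"]))  # 1
```

Comparing the parsed tuples lexicographically also gives document order, which is what an algorithm that scans inverted lists in document order relies on.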

Definitions
An answer to a keyword query on XML data is a set of instance trees (ITs).
IT: a minimum subtree that is rooted at the root of the XML tree and contains a matching for the query keywords.
[Figure: (a) an IT I and (b) its MCT M on the example tree; matched nodes n carry their keyword annotation ann(n).]
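With Dewey codes, the IT of a set of keyword matches is the union of the root-to-match paths, and the MCT is the part of the IT at or below the LCA of the matches. A minimal sketch under that reading; the match codes are hypothetical and the helper names are not from the slides:

```python
def prefixes(code):
    """All ancestors-or-self of a Dewey code (a tuple), root first."""
    return [code[:i] for i in range(1, len(code) + 1)]


def instance_tree(match_codes):
    """Nodes of the minimal subtree rooted at the document root that contains every match."""
    nodes = set()
    for code in match_codes:
        nodes.update(prefixes(code))
    return sorted(nodes)


def mct(match_codes):
    """Nodes of the minimum connecting tree: the IT nodes at or below the LCA."""
    it = instance_tree(match_codes)
    # The LCA is the deepest node that is an ancestor-or-self of every match.
    common = [n for n in it if all(c[:len(n)] == n for c in match_codes)]
    root = max(common, key=len)
    return [n for n in it if n[:len(root)] == root]


matches = [(1, 1, 1), (1, 1, 2, 1), (1, 1, 2, 2)]  # hypothetical match codes
print(instance_tree(matches))
print(mct(matches))
```

The IT always starts at the document root, while the MCT drops the path from the document root down to the LCA.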

Definitions
Ranking semantics: all the result ITs are ranked.
Filtering semantics: a subset of the result ITs is returned as an answer.

IT Patterns
An IT pattern represents the set of ITs that share the same structure, including the labels and annotations.
Q = {physics, james, harrison}
[Figure: example XML tree rooted at university, nodes numbered 1 to 24. Recoverable labels include events, year, seminars, seminar, topic, speaker, title, prerequisite, fname and lname, with values such as [2012], [Physics II], [Statistical Physics], [Physics I], [Quantum Physics] and [James Harrison].]
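Grouping ITs by pattern can be sketched by reducing each IT to a canonical signature that keeps labels and keyword annotations but forgets node identities. The sketch below assumes that sibling order does not matter for a pattern, which the slides do not state; the labels `courses`, `course` and `lecturer` and both ITs are hypothetical:

```python
from collections import defaultdict


def signature(tree, node):
    """Canonical form of the subtree at node: (label, annotations, child signatures)."""
    label, annotations, children = tree[node]
    return (
        label,
        tuple(sorted(annotations)),
        tuple(sorted(signature(tree, c) for c in children)),
    )


# Two hypothetical ITs: node id -> (label, keyword annotations, child ids).
it_a = {1: ("courses", [], [2]), 2: ("course", [], [4, 5]),
        4: ("title", ["physics"], []), 5: ("lecturer", ["james", "harrison"], [])}
it_b = {1: ("courses", [], [3]), 3: ("course", [], [8, 7]),
        7: ("title", ["physics"], []), 8: ("lecturer", ["harrison", "james"], [])}

groups = defaultdict(list)
for name, it in [("IT_a", it_a), ("IT_b", it_b)]:
    groups[signature(it, 1)].append(name)

print(list(groups.values()))  # [['IT_a', 'IT_b']]
```

Both ITs use different nodes of the data tree but collapse to one signature, so they would be represented by a single pattern.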

Example on Patterns
Q = {physics, james, harrison}
[Figure: the university example tree together with the eight patterns P1 to P8 obtained for Q. Each pattern is rooted at university and combines branches such as events/seminars/seminar with topic and speaker [James, Harrison], and branches containing title, prerequisites, fname and lname.]

Example on Patterns (2)
Q = {physics, james, harrison}
[Figure: the 20-node example tree from the keyword search example.]

IT Patterns
Q = {physics, james, harrison}
[Figure: the instance trees IT1 and IT3 highlighted on the example tree, next to the single pattern (with title, fname and lname nodes) that represents both of them.]

Semantics
Direct comparison between patterns (instead of assigning scores to patterns).
Comparison is based on different kinds of relations defined on patterns.
These relations are defined using different types of homomorphisms between patterns.

Pattern Homomorphism
A mapping from P to P′.
P′ can be obtained from P by merging paths and unioning annotations.
[Figure: three patterns rooted at seminars, each with a year [2012] node. P has two seminar branches, each with its own speaker; P′ has a single seminar with two speaker children; P″ has a single seminar with a single speaker annotated [James, Harrison]. Homomorphisms h lead from P to P′ and from P′ to P″.]
Intuitively, P″ is more relevant than P′, and P′ is more relevant than P (compactness).
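The existence of such a mapping can be checked recursively. The sketch below assumes the usual notion of a tree homomorphism (root mapped to root, labels preserved, child edges preserved, annotations of a node contained in those of its image); the slides only describe it informally, so this definition and the `Node` class are assumptions:

```python
class Node:
    def __init__(self, label, annotations=(), children=()):
        self.label = label
        self.annotations = frozenset(annotations)
        self.children = list(children)


def maps_to(p, q):
    """True if the subtree at p can be mapped into the subtree at q with p sent to q."""
    if p.label != q.label or not p.annotations <= q.annotations:
        return False
    # A homomorphism need not be injective, so each child picks its image independently.
    return all(any(maps_to(c, d) for d in q.children) for c in p.children)


# Two seminar branches, one speaker keyword each ...
spread = Node("seminars", children=[
    Node("seminar", children=[Node("speaker", ["james"])]),
    Node("seminar", children=[Node("speaker", ["harrison"])]),
])
# ... versus one seminar whose speaker carries both keywords.
merged = Node("seminars", children=[
    Node("seminar", children=[Node("speaker", ["james", "harrison"])]),
])

print(maps_to(spread, merged))  # True
print(maps_to(merged, spread))  # False
```

The mapping exists only in the direction that merges the two seminar paths, which matches the intuition that the merged pattern is the more compact one.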

Pattern Homomorphism (2)
Pattern homomorphism does not always work.
[Figure: two patterns P and P′. P contains lname, fname and title nodes; P′ contains fname and lname nodes and a title node below prerequisites.]

Path Homomorphism
Maps separately every root-to-annotated-node path of P to a path in P′.
[Figure: the patterns P (lname, fname, title) and P′ (fname, lname, and title below prerequisites), with the paths of P mapped one at a time.]
P is more relevant than P′.
Intuition: keyword instances in P are connected more tightly than in P′.

All-Path-Homomorphism (aph) Relation
We use the homomorphisms introduced earlier to define a relation on patterns.
P and P′: two patterns. M and M′: the corresponding MCTs.
P ≺aph P′ holds if one of two conditions on homomorphisms between M and M′ is satisfied.
[Formula not recoverable from the extracted text: a disjunction of two conditions, each a conjunction of homomorphism statements between M and M′.]
[Figure: the relation illustrated on the two patterns from the path homomorphism slides and on the three seminars patterns from the pattern homomorphism slide.]

All-Path-Homomorphism (aph) Relation (2)
The aph relation cannot help us in all cases, for instance when heterogeneous data is merged into one XML document.
[Figure: patterns P4 and P5, both rooted at university with an events/seminars branch. P4 has a seminar with topic and speaker [James, Harrison]; P5 has title and lname nodes and a seminar with a speaker.]

Partial-Path-Homomorphism (pph) Relation
P ≺pph P′ if:
there is a mapping of a root-to-annotated-node path, and
the MCT root of P is mapped to a descendant of the MCT root of P′ (the destination MCT root).
Example: P4 ≺pph P5.
[Figure: patterns P4 and P5 with their MCT roots marked as LCA.]

XReason Semantics
Precedence relation ≺: P ≺ P′ if P ≺aph P′ or P ≺pph P′.
Precedence graph G≺: vertex = pattern, edge = P ≺ P′.
[Figure: precedence graph over the patterns P1 to P15.]

XReason Ranking Semantics
We define the pattern order O based on:
(a) ascending GLevel
(b) descending MCTDepth
(c) ascending MCTSize
Order after (a), GLevel 1 to 5: 1. P2 P4; 2. P1; 3. P3 P7; 4. P5 P8 P9 P10 P12 P13 P15 P14; 5. P11 P6
Order after (a) and (b), using MCTDepth = 3 for P2 and MCTDepth = 4 for P4: 1. P4; 2. P2; 3. P1; 4. P3; 5. P7; 6. P5 P8 P9 P10 P12 P13 P14 P15; 7. P11 P6
Order after (a), (b) and (c), using MCTSize = 9 for P5 and MCTSize = 7 for P8: 1. P4; 2. P2; 3. P1; 4. P3; 5. P7; 6. P8; 7. P5 P9 P10 P12; 8. P13; 9. P15 P14; 10. P11 P6
XReason ranks ITs in an order which complies with the order O of their patterns.
This might create equivalence classes.
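The three criteria amount to a lexicographic sort. A minimal sketch on four of the patterns: the GLevel values, the MCTDepth of P2 and P4, and the MCTSize of P5 and P8 are taken from the slides, while the remaining numbers are hypothetical fillers that do not affect the outcome:

```python
# (pattern, GLevel, MCTDepth, MCTSize)
patterns = [
    ("P2", 1, 3, 6),  # MCTSize is a hypothetical filler
    ("P4", 1, 4, 6),  # MCTSize is a hypothetical filler
    ("P5", 4, 5, 9),  # MCTDepth is a hypothetical filler
    ("P8", 4, 5, 7),  # MCTDepth is a hypothetical filler
]

# Ascending GLevel, then descending MCTDepth, then ascending MCTSize.
ranked = sorted(patterns, key=lambda p: (p[1], -p[2], p[3]))
print([name for name, *_ in ranked])  # ['P4', 'P2', 'P8', 'P5']
```

Patterns that agree on all three values end up adjacent with the same key, which is how equivalence classes of equally ranked patterns arise.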

XReason Filtering Semantics
Answer: ITs whose patterns are source nodes in G≺. In the example, these are the ITs of P2 and P4.
[Figure: the precedence graph over P1 to P15 with the source nodes P2 and P4 highlighted.]
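Finding the source nodes only requires knowing which vertices have an incoming edge. A small sketch, assuming an edge (P, P′) is stored for every P ≺ P′; the edges listed are hypothetical, since the edges of the slide's graph cannot be read from the transcript:

```python
def source_patterns(vertices, edges):
    """Vertices with no incoming precedence edge."""
    targets = {dst for _, dst in edges}
    return [v for v in vertices if v not in targets]


vertices = ["P1", "P2", "P3", "P4"]
edges = [("P2", "P1"), ("P4", "P1"), ("P1", "P3")]  # hypothetical edges
print(source_patterns(vertices, edges))  # ['P2', 'P4']
```

The filtering answer would then consist of all ITs attached to the returned patterns.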

Algorithm PatternStack
Input: inverted lists of the query keywords.
Output: patterns of the query with the associated ITs.
Reads the inverted lists in document order.
Constructs the patterns on the fly, incrementally.
Links the constructed ITs with their respective patterns.
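The first step, reading several inverted lists in document order, can be sketched as a k-way merge over Dewey codes. This is not PatternStack itself, only an illustration of the input scan; the inverted lists below are hypothetical and each is assumed to be sorted already:

```python
import heapq

# Hypothetical inverted lists: keyword -> Dewey codes (as tuples) in document order.
inverted = {
    "physics": [(1, 1, 1), (1, 2, 1)],
    "james": [(1, 1, 2, 1), (1, 2, 2, 1)],
    "harrison": [(1, 1, 2, 2)],
}

# Tag every entry with its keyword, then merge the sorted streams.
streams = [[(code, kw) for code in codes] for kw, codes in inverted.items()]
merged = list(heapq.merge(*streams))

for code, kw in merged:
    print(".".join(map(str, code)), kw)
```

Tuple comparison on Dewey codes coincides with document order, so the merged stream visits the keyword matches in the order a stack-based algorithm would expect.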

Experimental Setup
Datasets:
Effectiveness: Mondial (1.7 MB), Sigmod (467 KB), EBAY (34 KB)
Efficiency: NASA (23 MB), XMark (150 MB)
Metrics for the filtering experiments:
Precision: ratio of the relevant results in the result set of the system.
Recall: ratio of the relevant results in the result set to all relevant results.
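The two filtering metrics can be written down directly from these definitions. A minimal sketch with hypothetical result identifiers:

```python
def precision_recall(returned, relevant):
    """Precision and recall of a returned result set against the relevant results."""
    returned, relevant = set(returned), set(relevant)
    hits = len(returned & relevant)
    precision = hits / len(returned) if returned else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Four results returned, two of them among the three relevant ones.
precision, recall = precision_recall(["r1", "r2", "r3", "r4"], ["r1", "r2", "r5"])
print(precision, recall)
```

Here precision is 2/4 and recall is 2/3; returning 0.0 for an empty set is a convention chosen for this sketch.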

Experimental Setup
Metrics for the ranking experiments:
Mean Average Precision (MAP): the mean, over the queries, of the average of the precision scores measured after each relevant result of a query is retrieved.
Reciprocal Rank (R-Rank): reciprocal of the rank of the first correct result of a query.
Precision at N (P@N): precision of the ranked list cut off at rank N.
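A short sketch of the three ranking metrics for a single query, using their standard definitions; MAP is then the mean of `average_precision` over all queries. The ranked list and relevance judgments are hypothetical:

```python
def average_precision(ranked, relevant):
    """Mean of the precision values measured at the rank of each relevant result."""
    relevant = set(relevant)
    hits, total = 0, 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant result, or 0.0 if none is retrieved."""
    relevant = set(relevant)
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0


def precision_at(ranked, relevant, n):
    """Fraction of relevant results among the first n ranked results."""
    relevant = set(relevant)
    return sum(1 for item in ranked[:n] if item in relevant) / n


ranked = ["a", "x", "b", "y"]
relevant = {"a", "b"}
print(average_precision(ranked, relevant))
print(reciprocal_rank(ranked, relevant))
print(precision_at(ranked, relevant, 2))
```

For this list the average precision is (1/1 + 2/3) / 2, the reciprocal rank is 1, and P@2 is 1/2.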

Experimental Setup
Handling equivalence classes (ITs that share the same rank):
Best: all correct results are assumed to be ranked at the beginning of the equivalence class.
Worst: all correct results are assumed to be ranked at the end of the equivalence class.
The best and worst versions give upper and lower bounds for the metrics, respectively.
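The best and worst versions can be produced by flattening the ranked equivalence classes with the correct results moved to the front or to the back of each class. A minimal sketch with hypothetical classes; the function name is illustrative:

```python
def expand(classes, relevant, best=True):
    """Flatten ranked equivalence classes, placing correct results first (best) or last (worst)."""
    ranked = []
    for group in classes:
        # sorted() is stable and False sorts before True, so the key decides
        # only whether correct results lead or trail inside the class.
        ranked.extend(sorted(group, key=lambda r: (r in relevant) != best))
    return ranked


classes = [["x", "a"], ["y", "b", "z"]]  # two equivalence classes, best rank first
relevant = {"a", "b"}
print(expand(classes, relevant, best=True))   # ['a', 'x', 'b', 'y', 'z']
print(expand(classes, relevant, best=False))  # ['x', 'a', 'y', 'z', 'b']
```

Evaluating a metric on both flattened lists yields the bounds: in this example the reciprocal rank is 1 for the best version and 1/2 for the worst.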

Filtering Experiments
XReason is compared with:
SLCA: ITs whose MCT root is an SLCA.
ELCA: ITs whose MCT root is an ELCA.
ITReal: an adaptation of XReal. XReal finds an intended node type; ITReal returns ITs whose MCT root is a descendant of XReal's intended node type.
6 queries over each dataset (Mondial, SIGMOD, EBAY).

Ranking Results

Dataset   Semantics   MAP worst   MAP best   R-Rank worst   R-Rank best
Mondial   XReason     0.95        0.95       1.00           1.00
Mondial   ITReal      0.60        0.87       0.59           1.00
Sigmod    XReason     1.00        1.00       1.00           1.00
Sigmod    ITReal      0.19        0.69       0.26           0.83
EBAY      XReason     1.00        1.00       1.00           1.00
EBAY      ITReal      0.60        0.80       0.53           1.00

XReason has perfect R-Rank scores.
XReason outperforms ITReal w.r.t. MAP.

Ranking Results
[Figure: P@10 of XReason and ITReal; the worst and best versions are superimposed.]
XReason has better P@10 values than ITReal.

Filtering Results
[Figure: charts with values between 0.0 and 1.0 per query: (a) Mondial, queries M1 to M6; (b) SIGMOD, queries S1 to S6; (c) EBAY, queries E1 to E6.]
All approaches have perfect recall.
XReason: close to perfect precision.
Worst is ELCA.

Efficiency Experiments
Compare PatternStack with a naïve algorithm for computing the patterns.
[Figure: two log-scale charts (axis ticks from 0.01 to 1,000.00), one for queries M1 to M6 and one for queries S1 to S6.]
PatternStack is significantly faster.
Response times for PatternStack are reasonable for real-time systems.

Efficiency Experiments: Scalability
With three and four keywords (k=3, k=4).
We truncated the inverted lists at different sizes: 20%, 40%, 60%, 80% and 100%.
[Figure: response time t (msec) against the number of results for k=3 and k=4. (a) NASA: axes up to 250,000 results and 550 msec. (b) XMark: axes up to 25,000 results and 1,800 msec.]

Conclusion & Future Work
Homomorphisms can be effectively used to compare patterns.
Results show that the approach is effective and efficient.
Future work: optimizations of the precedence graph for efficiency.
Future work: additional information, such as statistical information, can be combined for ranking the results.