Approximation: Theory and Algorithms
|
|
- Bethanie Powers
- 5 years ago
- Views:
Transcription
1 Approximation: Theory and Algorithms The Tree Edit Distance (II) Nikolaus Augsten Free University of Bozen-Bolzano Faculty of Computer Science DIS Unit 7 April 17, 2009 Nikolaus Augsten (DIS) Approximation: Theory and Algorithms Unit 7 April 17, / 23
2 Outline 1 Tree Edit Distance (II) The Tree Edit Distance Algorithm 2 Conclusion Nikolaus Augsten (DIS) Approximation: Theory and Algorithms Unit 7 April 17, / 23
3 First Recursive Formula: Forest Distance Lemma (First Recursive Formula) Given two trees T 1 and T 2, i N(T 1 ) and d i desc(i), j N(T 2 ) and d j desc(j), then: fdist(t 1 [l(i)..d i 1], T 2 [l(j)..d j ]) + ω del fdist(t 1 [l(i)..d i ], T 2 [l(j)..d j 1]) + ω ins fdist(t 1 [l(i)..d i ], T 2 [l(j)..d j ]) = min fdist(t 1 [l(i)..l(d i ) 1], T 2 [l(j)..l(d j ) 1]) + fdist(t 1 [l(d i )..d i 1], T 2 [l(d j )..d j 1]) + ω ren Nikolaus Augsten (DIS) Approximation: Theory and Algorithms Unit 7 April 17, / 23
4 Proof Tree Edit Distance (II) Proof. Let M be the minimum-cost map between T 1 [l(i)..d i ] and T 2 [l(j)..d j ], i.e., the map we are looking for. Then for T 1 [d i ] and T 2 [d j ] there are tree possibilities: (1) T 1 [d i ] is not touched by a line in M: T 1 [d i ] is deleted and fdist(t 1 [l(i)..d i ], T 2 [l(j)..d j ]) = fdist(t 1 [l(i)..d i 1], T 2 [l(j)..d j ]) + ω del (2) T 2 [d j ] is not touched by a line in M: T 2 [d j ] is inserted and fdist(t 1 [l(i)..d i ], T 2 [l(j)..d j ]) = fdist(t 1 [l(i)..d i ], T 2 [l(j)..d j 1]) + ω ins (3) Both, T 1 [d i ] and T 2 [d j ] are touched by a line in M: We show (by contradiction) that in this case (T 1 [d i ], T 2 [d j ]) M, i.e., T 1 [d i ] is renamed to T 2 [d j ]: Assume (T 1 [d i ], T 2 [d i ]) M and (T 1[d j ], T 2[d j ]) M. Case T 1 [d i ] is to the right of T 1 [d j ]: By sibling condition on M also T 2 [d i ] must be to the right of T 2[d j ]. Impossible in T 2 [l(j)..d j ]. Case T 1 [d i ] is proper ancestor of T 1 [d j ]: By ancestor condition on M also T 2 [d i ] must be ancestor of T 2[d j ]. Impossible in T 2 [l(j)..d j ]. As these three cases express all possible mappings yielding fdist(t 1 [l(i)..d i ], T 2 [l(j)..d j ]), we take the minimum of these tree costs. Nikolaus Augsten (DIS) Approximation: Theory and Algorithms Unit 7 April 17, / 23
5 Example: First Recursive Formula (1/3) T 1 f 6 d 4 a 1 c 3 b 2 e 5 T 2 f 6 c 4 d 3 a 1 b 2 e 5 T 1 [l(i)..d i ] T 2 [l(j)..d j ] (i = 6, d i = 3) (j = 6, d j = 3) (1) fdist(t 1 [l(i)..d i 1], T 2 [l(j)..d j ]) + ω del a 1 c 3 b 2 d 3 a 1 b 2 T 1 [l(i)..d i 1] T 2 [l(j)..d j ] edit script: ins(d 3 ), del(c 3 ) cost: = 2 Nikolaus Augsten (DIS) Approximation: Theory and Algorithms Unit 7 April 17, / 23
6 Example: First Recursive Formula (2/3) T 1 f 6 d 4 a 1 c 3 b 2 e 5 T 2 f 6 c 4 d 3 a 1 b 2 e 5 T 1 [l(i)..d i ] T 2 [l(j)..d j ] (i = 6, d i = 3) (j = 6, d j = 3) (2) fdist(t 1 [l(i)..d i ], T 2 [l(j)..d j 1]) + ω ins a 1 c 3 b 2 d 3 a 1 b 2 T 1 [l(i)..d i 1] T 2 [l(j)..d j ] edit script: del(c 3 ), ins(d 3 ) cost: = 2 Nikolaus Augsten (DIS) Approximation: Theory and Algorithms Unit 7 April 17, / 23
7 Example: First Recursive Formula (3/3) (3) fdist(t 1 [l(i)..l(d i ) 1], T 2 [l(j)..l(d j ) 1]) + fdist(t 1 [l(d i )..d i 1], T 2 [l(d j )..d j 1]) + ω ren a 1 c 3 b 2 d 3 a 1 b 2 T 1 [l(i)..l(d i ) 1] T 1 [l(d i )..d i 1] T 2 [l(j)..l(d j ) 1] T 2 [l(d j )..d j 1] T 1 [l(i)..l(d i ) 1] T 2 [l(j)..l(d j ) 1]: del(a 1 ) T 1 [l(d i )..d i 1] T 2 [l(d j )..d j 1]: ins(a 1 ) c 3 d 3 : ren(c 3, d 3 ) cost: = 3 Nikolaus Augsten (DIS) Approximation: Theory and Algorithms Unit 7 April 17, / 23
8 Analogy to the String Case Why is the third formula not (in analogy to the string case): fdist(t 1 [l(i)..d i 1], T 2 [l(j)..d j 1]) + ω ren Consider the previous example: a 1 c 3 d 3 b 2 a 1 b 2 T 1 [l(i)..d i 1] T 2 [l(j)..d j 1] ren(c 3, d 3 ) does not transform T 1 [l(i)..d i ] to T 2 [l(j)..d j ] In fact the mapping M = {(a 1, a 1 ), (b 2, b 2 ), (c 3, d 3 )} is not valid: Connect all trees in the forest with a dummy node ( ): As d 3 is an ancestor of a 1, c 3 must be an ancestor of a 1, which is false. a 1 c 3 b 2 d 3 a 1 b 2 Nikolaus Augsten (DIS) Approximation: Theory and Algorithms Unit 7 April 17, / 23
9 Observation Tree Edit Distance (II) Observation about the First Recursive Formula: suggests bottom-up dynamic programming algorithm fdist(t 1 [l(d i )..d i 1], T 2 [l(d j )..d j 1]) [D] does not fit this scheme We derive the Second Recursive Formula: we distinguish two cases (both forests are trees/one forest is not a tree) in each case we replace term [D] by a new term that is easier to handle in a bottom up algorithm fdist(t 1 [l(i)..d i 1], T 2 [l(j)..d j ]) + ω del fdist(t 1 [l(i)..d i ], T 2 [l(j)..d j 1]) + ω ins fdist(t 1 [l(i)..d i ], T 2 [l(j)..d j ]) = min fdist(t 1 [l(i)..l(d i ) 1], T 2 [l(j)..l(d j ) 1]) + fdist(t 1 [l(d i )..d i 1], T 2 [l(d j )..d j 1]) + ω ren Nikolaus Augsten (DIS) Approximation: Theory and Algorithms Unit 7 April 17, / 23
10 Second Recursive Formula: Forest Distance Lemma (Second Recursive Formula) Given two trees T 1 and T 2, i N(T 1 ) and d i desc(i), j N(T 2 ) and d j desc(j), then: (1) If l(i) = l(d i ) and l(j) = l(d j ), i.e., both forests are trees: fdist(t 1 [l(i)..d i 1], T 2 [l(j)..d j ]) + ω del fdist(t 1 [l(i)..d i ], T 2 [l(j)..d j ]) = min fdist(t 1 [l(i)..d i ], T 2 [l(j)..d j 1]) + ω ins fdist(t 1 [l(i)..d i 1], T 2 [l(j)..d j 1]) + ω ren (2) If l(i) l(d i ) and/or l(j) l(d j ), i.e., one of the forests is not a tree: fdist(t 1 [l(i)..d i 1], T 2 [l(j)..d j ]) + ω del fdist(t 1 [l(i)..d i ], T 2 [l(j)..d j 1]) + ω ins fdist(t 1 [l(i)..d i ], T 2 [l(j)..d j ]) = min fdist(t 1 [l(i)..l(d i ) 1], T 2 [l(j)..l(d j ) 1]) + fdist(t 1 [l(d i )..d i ], T 2 [l(d j )..d j ]) Nikolaus Augsten (DIS) Approximation: Theory and Algorithms Unit 7 April 17, / 23
11 Proof of the Second Recursive Formula Proof. (1) follows from the previous recursive formula for l(i) = l(d i ) and l(j) = l(d j ) as the following holds: fdist(t 1 [l(i)..l(d i ) 1], T 2 [l(j)..l(d j ) 1]) = fdist(, ) = 0. (2) The following inequation holds: [A] fdist(t 1 [l(i)..d i ], T 2 [l(j)..d j ]) fdist(t 1 [l(i)..l(d i ) 1], T 2 [l(j)..l(d j ) 1]) [B] + fdist(t 1 [l(d i )..d i ], T 2 [l(d j )..d j ]) [C] fdist(t 1 [l(i)..l(d i ) 1], T 2 [l(j)..l(d j ) 1]) [B] + fdist(t 1 [l(d i )..d i 1], T 2 [l(d j )..d j 1]) [D] + ω ren A B + C as the left-hand side is the minimal cost mapping, while the right-hand side is a particular case with a possibly sub-optimal mapping. C D + ω ren holds for the same reason. As we are looking for the minimum distance, we can substitute D + ω ren by C. Nikolaus Augsten (DIS) Approximation: Theory and Algorithms Unit 7 April 17, / 23
12 Illustration: Proof of the Second Recursive Formula (1/2) Case (1): l(i) = l(d i ) and l(j) = l(d j ): i j d i d j l(i) = l(d i ) l(j) = l(d j ) T 1 [l(i)..l(d i ) 1] T 1 [l(d i )..d i 1] T 2 [l(j)..l(d j ) 1] T 2 [l(d j )..d j 1] Nikolaus Augsten (DIS) Approximation: Theory and Algorithms Unit 7 April 17, / 23
13 Illustration: Proof of the Second Recursive Formula (2/2) Case (2): l(i) l(d i ) and/or l(j) l(d j ): i j d i d j l(i) l(d i ) T 1 [l(i)..l(d i ) 1] T 1 [l(d i )..d i 1] l(j) l(d j ) T 2 [l(j)..l(d j ) 1] T 2 [l(d j )..d j 1] Nikolaus Augsten (DIS) Approximation: Theory and Algorithms Unit 7 April 17, / 23
14 Implications by the Second Recursive Formula Note: fdist(t 1 [l(d i )..d i ], T 2 [l(d j )..d j ] is the tree edit distance between the subtrees rooted in T[d i ] and T[d j ]. We use the following notation: treedist(d i, d j ) = fdist(t 1 [l(d i )..d i ], T 2 [l(d j )..d j ]) Dynamic Programming: As the same sub-problem must be solved many times, we use a dynamic programming approach. Bottom-Up: As for the computation of the tree distance treedist(i, j) we need almost all values treedist(d i, d j ) (d i desc(t 1 [i]), d j desc(t 1 [j])), we use a bottom-up approach. Key Roots: If d i is on the path from l(i) to T 1 [i] and d j is on the path from l(j) to T 2 [j], then treedist(d i, d j ) is computed as a byproduct of treedist(i, j). We call the nodes that are not computed as a byproducts the key roots. Nikolaus Augsten (DIS) Approximation: Theory and Algorithms Unit 7 April 17, / 23
15 Key Roots Tree Edit Distance (II) Definition (Key Root) The set of key roots of a tree T is defined as kr(t) = {k N(T) k N(T) : k > k and l(k) = l(k )} Alternative definition: A key root is a node of T that either has a left sibling or is the root of T. Example: kr(t) = {3, 5, 6} f 6 d 4 e 5 a 1 c 3 b 2 Only subtrees rooted in a key root need a separate computation. The number of key roots is equal to the number of leaves in the tree. Nikolaus Augsten (DIS) Approximation: Theory and Algorithms Unit 7 April 17, / 23
16 The Edit Distance Algorithm I/II The Tree Edit Distance Algorithm tree-edit-dist(t 1, T 2 ) td[1.. T 1, 1.. T 2 ] : empty array for tree distances; l 1 = lmld(root(t 1 )); kr 1 = kr(l 1, leaves(t 1 ) ); l 2 = lmld(root(t 2 )); kr 2 = kr(l 2, leaves(t 2 ) ); for x = 1 to kr 1 do for y = 1 to kr 2 do forest-dist(kr 1 [x], kr 2 [y], l 1, l 2, td); l 1 is an array of size T 1, l 1 [i] is the leftmost leaf descendant of node i; l 2 is the analog for T 2 (detailed algorithm lmld(.) follows) kr 1 is an array that contains all the key roots of T 1 sorted in ascending order; kr 2 is the analog for T 2 (detailed algorithm kr(.,.) follows) Algorithm and lemmas by [ZS89] (see also [AG97]) Nikolaus Augsten (DIS) Approximation: Theory and Algorithms Unit 7 April 17, / 23
17 The Edit Distance Algorithm II/II The Tree Edit Distance Algorithm forest-dist(i, j, l 1, l 2, td) fd[l 1 [i] 1..i, l 2 [j] 1..j] : empty array; fd[l 1 [i] 1, l 2 [j] 1] = 0; for d i = l 1 [i] to i do fd[d i, l 2 [j] 1] = fd[d i 1, l 2 [j] 1] + ω del ; for d j = l 2 [j] to j do fd[l 1 [i] 1, d j ] = fd[l 1 [i] 1, d j 1] + ω ins ; for d i = l 1 [i] to i do for d j = l 2 [j] to j do if l 1 [d i ] = l 1 [i] and l 2 [d j ] = l 2 [j] then fd[d i, d j ] = min(fd[d i 1, d j ] + ω del, fd[d i, d j 1] + ω ins, fd[d i 1, d j 1] + ω ren ); td[d i, d j ] = f [d i, d j ]; else fd[d i, d j ] = min(fd[d i 1, d j ] + ω del, fd[d i, d j 1] + ω ins, fd[l 1 [d i ] 1, l 2 [d j ] 1] + td[d i, d j ]); Nikolaus Augsten (DIS) Approximation: Theory and Algorithms Unit 7 April 17, / 23
18 The Tree Edit Distance Algorithm The Temporary Forest Distance Matrix fd[d i, d j ] contains the forest distance between T 1 [l(i)..d i ], where d i desc(t 1 [i]) and T 2 [l(j)..d j ], where d j desc(t 2 [j]). d j d i = T 1 [l(i)..l(i) 1] T 1 [l(i)..l(i)] T 1 [l(i)..i] T 2 [l(j)..l(j) 1] = T 2 [l(j)..l(j)] T 2 [l(j)..j] fdist(t 1 [l(i)..d i ], T 2 [l(j)..d j ]) fd is temporary and exists only in forest-dist() Nikolaus Augsten (DIS) Approximation: Theory and Algorithms Unit 7 April 17, / 23
19 The Tree Distance Matrix The Tree Edit Distance Algorithm td[i][j] stores the tree edit distance between the tree rooted in T 1 [i] (i.e., T 1 [l(i)..i]) and the tree rooted in T 2 [j] (i.e., T 2 [l(j)..j]). each call of forest-dist() fills new values into td td[ T 1, T 2 ] stores the tree edit distance between T 1 and T 2 Nikolaus Augsten (DIS) Approximation: Theory and Algorithms Unit 7 April 17, / 23
20 Summary Conclusion Tree Edit Distance recursive formula for the edit distance dynamic programming algorithm Nikolaus Augsten (DIS) Approximation: Theory and Algorithms Unit 7 April 17, / 23
21 What s Next? Conclusion Tree Edit Distance (III): correctness and complexity of the algorithm edit distance variants Lower Bounds for the Edit Distance Nikolaus Augsten (DIS) Approximation: Theory and Algorithms Unit 7 April 17, / 23
22 Alberto Apostolico and Zvi Galill, editors. Pattern Matching Algorithms, chapter Tree Pattern Matching, pages Oxford University Press, Kaizhong Zhang and Dennis Shasha. Simple fast algorithms for the editing distance between trees and related problems. SIAM Journal on Computing, 18(6): , Nikolaus Augsten (DIS) Approximation: Theory and Algorithms Unit 7 April 17, / 23
Approximation: Theory and Algorithms
Approimation: Theor and Algorithms Edit Distance Compleit, Upper and Lower Bounds Nikolaus Augsten Free Universit of Bozen-Bolzano Facult of Computer Science DIS Unit 8 April, 009 Nikolaus Augsten (DIS)
More informationOutline. Approximation: Theory and Algorithms. Motivation. Outline. The String Edit Distance. Nikolaus Augsten. Unit 2 March 6, 2009
Outline Approximation: Theory and Algorithms The Nikolaus Augsten Free University of Bozen-Bolzano Faculty of Computer Science DIS Unit 2 March 6, 2009 1 Nikolaus Augsten (DIS) Approximation: Theory and
More informationApproximation: Theory and Algorithms
Approximation: Theory and Algorithms The String Edit Distance Nikolaus Augsten Free University of Bozen-Bolzano Faculty of Computer Science DIS Unit 2 March 6, 2009 Nikolaus Augsten (DIS) Approximation:
More informationSimilarity Search. The String Edit Distance. Nikolaus Augsten. Free University of Bozen-Bolzano Faculty of Computer Science DIS. Unit 2 March 8, 2012
Similarity Search The String Edit Distance Nikolaus Augsten Free University of Bozen-Bolzano Faculty of Computer Science DIS Unit 2 March 8, 2012 Nikolaus Augsten (DIS) Similarity Search Unit 2 March 8,
More informationOutline. Similarity Search. Outline. Motivation. The String Edit Distance
Outline Similarity Search The Nikolaus Augsten nikolaus.augsten@sbg.ac.at Department of Computer Sciences University of Salzburg 1 http://dbresearch.uni-salzburg.at WS 2017/2018 Version March 12, 2018
More informationSimilarity Search. The String Edit Distance. Nikolaus Augsten.
Similarity Search The String Edit Distance Nikolaus Augsten nikolaus.augsten@sbg.ac.at Dept. of Computer Sciences University of Salzburg http://dbresearch.uni-salzburg.at Version October 18, 2016 Wintersemester
More informationTASM: Top-k Approximate Subtree Matching
TASM: Top-k Approximate Subtree Matching Nikolaus Augsten 1 Denilson Barbosa 2 Michael Böhlen 3 Themis Palpanas 4 1 Free University of Bozen-Bolzano, Italy augsten@inf.unibz.it 2 University of Alberta,
More informationFundamental Algorithms
Fundamental Algorithms Chapter 5: Searching Michael Bader Winter 2014/15 Chapter 5: Searching, Winter 2014/15 1 Searching Definition (Search Problem) Input: a sequence or set A of n elements (objects)
More informationChapter 5 Data Structures Algorithm Theory WS 2017/18 Fabian Kuhn
Chapter 5 Data Structures Algorithm Theory WS 2017/18 Fabian Kuhn Priority Queue / Heap Stores (key,data) pairs (like dictionary) But, different set of operations: Initialize-Heap: creates new empty heap
More informationAsurvey on tree edit distance and related problems
Theoretical Computer Science 337 (2005) 217 239 www.elsevier.com/locate/tcs Asurvey on tree edit distance and related problems Philip Bille The IT University of Copenhagen, Rued Langgardsvej 7, DK-2300
More informationAnalysis of tree edit distance algorithms
Analysis of tree edit distance algorithms Serge Dulucq 1 and Hélène Touzet 1 LaBRI - Universit Bordeaux I 33 0 Talence cedex, France Serge.Dulucq@labri.fr LIFL - Universit Lille 1 9 6 Villeneuve d Ascq
More informationDecomposition algorithms for the tree edit distance problem
Decomposition algorithms for the tree edit distance problem Serge Dulucq LaBRI - UMR CNRS 5800 - Université Bordeaux I 05 Talence cedex, France Serge.Dulucq@labri.fr Hélène Touzet LIFL - UMR CNRS 80 -
More informationTrees. A tree is a graph which is. (a) Connected and. (b) has no cycles (acyclic).
Trees A tree is a graph which is (a) Connected and (b) has no cycles (acyclic). 1 Lemma 1 Let the components of G be C 1, C 2,..., C r, Suppose e = (u, v) / E, u C i, v C j. (a) i = j ω(g + e) = ω(g).
More informationAssignment 5: Solutions
Comp 21: Algorithms and Data Structures Assignment : Solutions 1. Heaps. (a) First we remove the minimum key 1 (which we know is located at the root of the heap). We then replace it by the key in the position
More informationDynamic Programming: Shortest Paths and DFA to Reg Exps
CS 374: Algorithms & Models of Computation, Fall 205 Dynamic Programming: Shortest Paths and DFA to Reg Exps Lecture 7 October 22, 205 Chandra & Manoj (UIUC) CS374 Fall 205 / 54 Part I Shortest Paths with
More informationSpace Efficient Algorithms for Ordered Tree Comparison
Algorithmica (2008) 51: 283 297 DOI 10.1007/s00453-007-9100-z Space Efficient Algorithms for Ordered Tree Comparison Lusheng Wang Kaizhong Zhang Received: 27 January 2006 / Accepted: 27 June 2006 / Published
More informationData Structures in Java
Data Structures in Java Lecture 20: Algorithm Design Techniques 12/2/2015 Daniel Bauer 1 Algorithms and Problem Solving Purpose of algorithms: find solutions to problems. Data Structures provide ways of
More informationAn Optimal Decomposition Algorithm for Tree Edit Distance
An Optimal Decomposition Algorithm for Tree Edit Distance ERIK D. DEMAINE Massachusetts Institute of Technology and SHAY MOZES Brown University and BENJAMIN ROSSMAN Massachusetts Institute of Technology
More informationOutline. Approximation: Theory and Algorithms. Application Scenario. 3 The q-gram Distance. Nikolaus Augsten. Definition and Properties
Outline Approximation: Theory and Algorithms Nikolaus Augsten Free University of Bozen-Bolzano Faculty of Computer Science DIS Unit 3 March 13, 2009 2 3 Nikolaus Augsten (DIS) Approximation: Theory and
More informationEach internal node v with d(v) children stores d 1 keys. k i 1 < key in i-th sub-tree k i, where we use k 0 = and k d =.
7.5 (a, b)-trees 7.5 (a, b)-trees Definition For b a an (a, b)-tree is a search tree with the following properties. all leaves have the same distance to the root. every internal non-root vertex v has at
More informationData Structures and Algorithms " Search Trees!!
Data Structures and Algorithms " Search Trees!! Outline" Binary Search Trees! AVL Trees! (2,4) Trees! 2 Binary Search Trees! "! < 6 2 > 1 4 = 8 9 Ordered Dictionaries" Keys are assumed to come from a total
More informationBinary Search Trees. Motivation
Binary Search Trees Motivation Searching for a particular record in an unordered list takes O(n), too slow for large lists (databases) If the list is ordered, can use an array implementation and use binary
More informationSearch Trees. Chapter 10. CSE 2011 Prof. J. Elder Last Updated: :52 AM
Search Trees Chapter 1 < 6 2 > 1 4 = 8 9-1 - Outline Ø Binary Search Trees Ø AVL Trees Ø Splay Trees - 2 - Outline Ø Binary Search Trees Ø AVL Trees Ø Splay Trees - 3 - Binary Search Trees Ø A binary search
More informationModels of Computation. by Costas Busch, LSU
Models of Computation by Costas Busch, LSU 1 Computation CPU memory 2 temporary memory input memory CPU output memory Program memory 3 Example: f ( x) x 3 temporary memory input memory Program memory compute
More informationFunctional Data Structures
Functional Data Structures with Isabelle/HOL Tobias Nipkow Fakultät für Informatik Technische Universität München 2017-2-3 1 Part II Functional Data Structures 2 Chapter 1 Binary Trees 3 1 Binary Trees
More informationSkylines. Yufei Tao. ITEE University of Queensland. INFS4205/7205, Uni of Queensland
Yufei Tao ITEE University of Queensland Today we will discuss problems closely related to the topic of multi-criteria optimization, where one aims to identify objects that strike a good balance often optimal
More informationDynamic Programming: Shortest Paths and DFA to Reg Exps
CS 374: Algorithms & Models of Computation, Spring 207 Dynamic Programming: Shortest Paths and DFA to Reg Exps Lecture 8 March 28, 207 Chandra Chekuri (UIUC) CS374 Spring 207 / 56 Part I Shortest Paths
More informationSIGNAL COMPRESSION Lecture 7. Variable to Fix Encoding
SIGNAL COMPRESSION Lecture 7 Variable to Fix Encoding 1. Tunstall codes 2. Petry codes 3. Generalized Tunstall codes for Markov sources (a presentation of the paper by I. Tabus, G. Korodi, J. Rissanen.
More informationINF2220: algorithms and data structures Series 1
Universitetet i Oslo Institutt for Informatikk I. Yu, D. Karabeg INF2220: algorithms and data structures Series 1 Topic Function growth & estimation of running time, trees (Exercises with hints for solution)
More informationAn Algorithm for Computing the Invariant Distance from Word Position
An Algorithm for Computing the Invariant Distance from Word Position Sergio Luán-Mora 1 Departamento de Lenguaes y Sistemas Informáticos, Universidad de Alicante, Campus de San Vicente del Raspeig, Ap.
More informationDictionary: an abstract data type
2-3 Trees 1 Dictionary: an abstract data type A container that maps keys to values Dictionary operations Insert Search Delete Several possible implementations Balanced search trees Hash tables 2 2-3 trees
More informationOutline. 1 Merging. 2 Merge Sort. 3 Complexity of Sorting. 4 Merge Sort and Other Sorts 2 / 10
Merge Sort 1 / 10 Outline 1 Merging 2 Merge Sort 3 Complexity of Sorting 4 Merge Sort and Other Sorts 2 / 10 Merging Merge sort is based on a simple operation known as merging: combining two ordered arrays
More information4.8 Huffman Codes. These lecture slides are supplied by Mathijs de Weerd
4.8 Huffman Codes These lecture slides are supplied by Mathijs de Weerd Data Compression Q. Given a text that uses 32 symbols (26 different letters, space, and some punctuation characters), how can we
More informationOptimal Tree-decomposition Balancing and Reachability on Low Treewidth Graphs
Optimal Tree-decomposition Balancing and Reachability on Low Treewidth Graphs Krishnendu Chatterjee Rasmus Ibsen-Jensen Andreas Pavlogiannis IST Austria Abstract. We consider graphs with n nodes together
More informationarxiv: v1 [cs.ds] 27 Mar 2017
Tree Edit Distance Cannot be Computed in Strongly Subcubic Time (unless APSP can) Karl Bringmann Paweł Gawrychowski Shay Mozes Oren Weimann arxiv:1703.08940v1 [cs.ds] 27 Mar 2017 Abstract The edit distance
More informationINF4130: Dynamic Programming September 2, 2014 DRAFT version
INF4130: Dynamic Programming September 2, 2014 DRAFT version In the textbook: Ch. 9, and Section 20.5 Chapter 9 can also be found at the home page for INF4130 These slides were originally made by Petter
More informationIn-Class Soln 1. CS 361, Lecture 4. Today s Outline. In-Class Soln 2
In-Class Soln 1 Let f(n) be an always positive function and let g(n) = f(n) log n. Show that f(n) = o(g(n)) CS 361, Lecture 4 Jared Saia University of New Mexico For any positive constant c, we want to
More informationTree Edit Distance Cannot be Computed in Strongly Subcubic Time (unless APSP can)
Tree Edit Distance Cannot be Computed in Strongly Subcubic Time (unless APSP can) Karl Bringmann Paweł Gawrychowski Shay Mozes Oren Weimann Abstract The edit distance between two rooted ordered trees with
More informationR u t c o r Research R e p o r t. Relations of Threshold and k-interval Boolean Functions. David Kronus a. RRR , April 2008
R u t c o r Research R e p o r t Relations of Threshold and k-interval Boolean Functions David Kronus a RRR 04-2008, April 2008 RUTCOR Rutgers Center for Operations Research Rutgers University 640 Bartholomew
More informationSearch Trees. EECS 2011 Prof. J. Elder Last Updated: 24 March 2015
Search Trees < 6 2 > 1 4 = 8 9-1 - Outline Ø Binary Search Trees Ø AVL Trees Ø Splay Trees - 2 - Learning Outcomes Ø From this lecture, you should be able to: q Define the properties of a binary search
More informationData Structures and Algorithms
Data Structures and Algorithms Spring 2017-2018 Outline 1 Sorting Algorithms (contd.) Outline Sorting Algorithms (contd.) 1 Sorting Algorithms (contd.) Analysis of Quicksort Time to sort array of length
More informationContext-Free Languages
CS:4330 Theory of Computation Spring 2018 Context-Free Languages Non-Context-Free Languages Haniel Barbosa Readings for this lecture Chapter 2 of [Sipser 1996], 3rd edition. Section 2.3. Proving context-freeness
More informationThe null-pointers in a binary search tree are replaced by pointers to special null-vertices, that do not carry any object-data
Definition 1 A red black tree is a balanced binary search tree in which each internal node has two children. Each internal node has a color, such that 1. The root is black. 2. All leaf nodes are black.
More informationBinary Decision Diagrams. Graphs. Boolean Functions
Binary Decision Diagrams Graphs Binary Decision Diagrams (BDDs) are a class of graphs that can be used as data structure for compactly representing boolean functions. BDDs were introduced by R. Bryant
More informationPropositional and Predicate Logic - IV
Propositional and Predicate Logic - IV Petr Gregor KTIML MFF UK ZS 2015/2016 Petr Gregor (KTIML MFF UK) Propositional and Predicate Logic - IV ZS 2015/2016 1 / 19 Tableau method (from the previous lecture)
More informationProblem. Problem Given a dictionary and a word. Which page (if any) contains the given word? 3 / 26
Binary Search Introduction Problem Problem Given a dictionary and a word. Which page (if any) contains the given word? 3 / 26 Strategy 1: Random Search Randomly select a page until the page containing
More informationReduction of Ultrametric Minimum Cost Spanning Tree Games to Cost Allocation Games on Rooted Trees. Kazutoshi ANDO and Shinji KATO.
RIMS-1674 Reduction of Ultrametric Minimum Cost Spanning Tree Games to Cost Allocation Games on Rooted Trees By Kazutoshi ANDO and Shinji KATO June 2009 RESEARCH INSTITUTE FOR MATHEMATICAL SCIENCES KYOTO
More informationGreedy Trees, Caterpillars, and Wiener-Type Graph Invariants
Georgia Southern University Digital Commons@Georgia Southern Mathematical Sciences Faculty Publications Mathematical Sciences, Department of 2012 Greedy Trees, Caterpillars, and Wiener-Type Graph Invariants
More informationGap Embedding for Well-Quasi-Orderings 1
WoLLIC 2003 Preliminary Version Gap Embedding for Well-Quasi-Orderings 1 Nachum Dershowitz 2 and Iddo Tzameret 3 School of Computer Science, Tel-Aviv University, Tel-Aviv 69978, Israel Abstract Given a
More information1. Introduction Bottom-Up-Heapsort is a variant of the classical Heapsort algorithm due to Williams ([Wi64]) and Floyd ([F64]) and was rst presented i
A Tight Lower Bound for the Worst Case of Bottom-Up-Heapsort 1 by Rudolf Fleischer 2 Keywords : heapsort, bottom-up-heapsort, tight lower bound ABSTRACT Bottom-Up-Heapsort is a variant of Heapsort. Its
More informationarxiv: v1 [math.co] 22 Jan 2013
NESTED RECURSIONS, SIMULTANEOUS PARAMETERS AND TREE SUPERPOSITIONS ABRAHAM ISGUR, VITALY KUZNETSOV, MUSTAZEE RAHMAN, AND STEPHEN TANNY arxiv:1301.5055v1 [math.co] 22 Jan 2013 Abstract. We apply a tree-based
More informationHW #4. (mostly by) Salim Sarımurat. 1) Insert 6 2) Insert 8 3) Insert 30. 4) Insert S a.
HW #4 (mostly by) Salim Sarımurat 04.12.2009 S. 1. 1. a. 1) Insert 6 2) Insert 8 3) Insert 30 4) Insert 40 2 5) Insert 50 6) Insert 61 7) Insert 70 1. b. 1) Insert 12 2) Insert 29 3) Insert 30 4) Insert
More informationRed-black Trees of smlnj Studienarbeit. Angelika Kimmig
Red-black Trees of smlnj Studienarbeit Angelika Kimmig January 26, 2004 Contents 1 Introduction 1 2 Development of the Isabelle/HOL Theory RBT 2 2.1 Translation of Datatypes and Functions into HOL......
More informationAVL Trees. Manolis Koubarakis. Data Structures and Programming Techniques
AVL Trees Manolis Koubarakis 1 AVL Trees We will now introduce AVL trees that have the property that they are kept almost balanced but not completely balanced. In this way we have O(log n) search time
More informationSorting Algorithms. We have already seen: Selection-sort Insertion-sort Heap-sort. We will see: Bubble-sort Merge-sort Quick-sort
Sorting Algorithms We have already seen: Selection-sort Insertion-sort Heap-sort We will see: Bubble-sort Merge-sort Quick-sort We will show that: O(n log n) is optimal for comparison based sorting. Bubble-Sort
More information8 Priority Queues. 8 Priority Queues. Prim s Minimum Spanning Tree Algorithm. Dijkstra s Shortest Path Algorithm
8 Priority Queues 8 Priority Queues A Priority Queue S is a dynamic set data structure that supports the following operations: S. build(x 1,..., x n ): Creates a data-structure that contains just the elements
More informationFundamental Algorithms
Fundamental Algithms Chapter 4: AVL Trees Jan Křetínský Winter 2016/17 Chapter 4: AVL Trees, Winter 2016/17 1 Part I AVL Trees (Adelson-Velsky and Landis, 1962) Chapter 4: AVL Trees, Winter 2016/17 2 Binary
More informationRandomized Sorting Algorithms Quick sort can be converted to a randomized algorithm by picking the pivot element randomly. In this case we can show th
CSE 3500 Algorithms and Complexity Fall 2016 Lecture 10: September 29, 2016 Quick sort: Average Run Time In the last lecture we started analyzing the expected run time of quick sort. Let X = k 1, k 2,...,
More informationParikh s theorem. Håkan Lindqvist
Parikh s theorem Håkan Lindqvist Abstract This chapter will discuss Parikh s theorem and provide a proof for it. The proof is done by induction over a set of derivation trees, and using the Parikh mappings
More informationBinary Search Trees. Lecture 29 Section Robb T. Koether. Hampden-Sydney College. Fri, Apr 8, 2016
Binary Search Trees Lecture 29 Section 19.2 Robb T. Koether Hampden-Sydney College Fri, Apr 8, 2016 Robb T. Koether (Hampden-Sydney College) Binary Search Trees Fri, Apr 8, 2016 1 / 40 1 Binary Search
More informationOptimal Color Range Reporting in One Dimension
Optimal Color Range Reporting in One Dimension Yakov Nekrich 1 and Jeffrey Scott Vitter 1 The University of Kansas. yakov.nekrich@googlemail.com, jsv@ku.edu Abstract. Color (or categorical) range reporting
More informationAlgorithm Design and Analysis
Algorithm Design and Analysis LECTURE 5 Greedy Algorithms Interval Scheduling Interval Partitioning Guest lecturer: Martin Furer Review In a DFS tree of an undirected graph, can there be an edge (u,v)
More informationNotes on Logarithmic Lower Bounds in the Cell Probe Model
Notes on Logarithmic Lower Bounds in the Cell Probe Model Kevin Zatloukal November 10, 2010 1 Overview Paper is by Mihai Pâtraşcu and Erik Demaine. Both were at MIT at the time. (Mihai is now at AT&T Labs.)
More informationAlgorithms for Variable Length Subnet Address Assignment. Abstract
Algorithms for Variable Length Subnet Address Assignment Mikhail J. Atallah (Fellow) COAST Laboratory and Department of Computer Sciences Purdue University West Lafayette, IN 47907 U.S.A. mja@cs.purdue.edu
More informationB- TREE. Michael Tsai 2017/06/06
B- TREE Michael Tsai 2017/06/06 2 B- Tree Overview Balanced search tree Very large branching factor Height = O(log n), but much less than that of RB tree Usage: Large amount of data to be stored - - Partially
More informationCOS597D: Information Theory in Computer Science October 19, Lecture 10
COS597D: Information Theory in Computer Science October 9, 20 Lecture 0 Lecturer: Mark Braverman Scribe: Andrej Risteski Kolmogorov Complexity In the previous lectures, we became acquainted with the concept
More informationBiased Quantiles. Flip Korn Graham Cormode S. Muthukrishnan
Biased Quantiles Graham Cormode cormode@bell-labs.com S. Muthukrishnan muthu@cs.rutgers.edu Flip Korn flip@research.att.com Divesh Srivastava divesh@research.att.com Quantiles Quantiles summarize data
More informationBinary Search Trees. Dictionary Operations: Additional operations: find(key) insert(key, value) erase(key)
Binary Search Trees Dictionary Operations: find(key) insert(key, value) erase(key) Additional operations: ascend() get(index) (indexed binary search tree) delete(index) (indexed binary search tree) Complexity
More informationLower Bounds for Dynamic Connectivity (2004; Pǎtraşcu, Demaine)
Lower Bounds for Dynamic Connectivity (2004; Pǎtraşcu, Demaine) Mihai Pǎtraşcu, MIT, web.mit.edu/ mip/www/ Index terms: partial-sums problem, prefix sums, dynamic lower bounds Synonyms: dynamic trees 1
More informationCPSC 320 Sample Final Examination December 2013
CPSC 320 Sample Final Examination December 2013 [10] 1. Answer each of the following questions with true or false. Give a short justification for each of your answers. [5] a. 6 n O(5 n ) lim n + This is
More informationEasy Shortcut Definitions
This version Mon Dec 12 2016 Easy Shortcut Definitions If you read and understand only this section, you ll understand P and NP. A language L is in the class P if there is some constant k and some machine
More information/463 Algorithms - Fall 2013 Solution to Assignment 4
600.363/463 Algorithms - Fall 2013 Solution to Assignment 4 (30+20 points) I (10 points) This problem brings in an extra constraint on the optimal binary search tree - for every node v, the number of nodes
More information3 Greedy Algorithms. 3.1 An activity-selection problem
3 Greedy Algorithms [BB chapter 6] with different examples or [Par chapter 2.3] with different examples or [CLR2 chapter 16] with different approach to greedy algorithms 3.1 An activity-selection problem
More informationBinary Decision Diagrams
Binary Decision Diagrams Binary Decision Diagrams (BDDs) are a class of graphs that can be used as data structure for compactly representing boolean functions. BDDs were introduced by R. Bryant in 1986.
More informationOutline. Computer Science 331. Cost of Binary Search Tree Operations. Bounds on Height: Worst- and Average-Case
Outline Computer Science Average Case Analysis: Binary Search Trees Mike Jacobson Department of Computer Science University of Calgary Lecture #7 Motivation and Objective Definition 4 Mike Jacobson (University
More informationFind an Element x in an Unsorted Array
Find an Element x in an Unsorted Array What if we try to find a lower bound for the case where the array is not necessarily sorted? J.-L. De Carufel (U. of O.) Design & Analysis of Algorithms Fall 2017
More informationComputer Science & Engineering 423/823 Design and Analysis of Algorithms
Computer Science & Engineering 423/823 Design and Analysis of s Lecture 09 Dynamic Programming (Chapter 15) Stephen Scott (Adapted from Vinodchandran N. Variyam) 1 / 41 Spring 2010 Dynamic programming
More informationAdvanced Implementations of Tables: Balanced Search Trees and Hashing
Advanced Implementations of Tables: Balanced Search Trees and Hashing Balanced Search Trees Binary search tree operations such as insert, delete, retrieve, etc. depend on the length of the path to the
More informationComputer Science & Engineering 423/823 Design and Analysis of Algorithms
Computer Science & Engineering 423/823 Design and Analysis of Algorithms Lecture 03 Dynamic Programming (Chapter 15) Stephen Scott and Vinodchandran N. Variyam sscott@cse.unl.edu 1/44 Introduction Dynamic
More informationCS 4407 Algorithms Lecture: Shortest Path Algorithms
CS 440 Algorithms Lecture: Shortest Path Algorithms Prof. Gregory Provan Department of Computer Science University College Cork 1 Outline Shortest Path Problem General Lemmas and Theorems. Algorithms Bellman-Ford
More informationDictionary: an abstract data type
2-3 Trees 1 Dictionary: an abstract data type A container that maps keys to values Dictionary operations Insert Search Delete Several possible implementations Balanced search trees Hash tables 2 2-3 trees
More informationComputational Models - Lecture 4 1
Computational Models - Lecture 4 1 Handout Mode Iftach Haitner and Yishay Mansour. Tel Aviv University. April 3/8, 2013 1 Based on frames by Benny Chor, Tel Aviv University, modifying frames by Maurice
More information10-704: Information Processing and Learning Fall Lecture 10: Oct 3
0-704: Information Processing and Learning Fall 206 Lecturer: Aarti Singh Lecture 0: Oct 3 Note: These notes are based on scribed notes from Spring5 offering of this course. LaTeX template courtesy of
More informationTransforming Hierarchical Trees on Metric Spaces
CCCG 016, Vancouver, British Columbia, August 3 5, 016 Transforming Hierarchical Trees on Metric Spaces Mahmoodreza Jahanseir Donald R. Sheehy Abstract We show how a simple hierarchical tree called a cover
More information7.3 AVL-Trees. Definition 15. Lemma 16. AVL-trees are binary search trees that fulfill the following balance condition.
Definition 15 AVL-trees are binary search trees that fulfill the following balance condition. F eery node height(left sub-tree()) height(right sub-tree()) 1. Lemma 16 An AVL-tree of height h contains at
More informationLecture 17: Trees and Merge Sort 10:00 AM, Oct 15, 2018
CS17 Integrated Introduction to Computer Science Klein Contents Lecture 17: Trees and Merge Sort 10:00 AM, Oct 15, 2018 1 Tree definitions 1 2 Analysis of mergesort using a binary tree 1 3 Analysis of
More informationComputational Models - Lecture 5 1
Computational Models - Lecture 5 1 Handout Mode Iftach Haitner. Tel Aviv University. November 28, 2016 1 Based on frames by Benny Chor, Tel Aviv University, modifying frames by Maurice Herlihy, Brown University.
More informationCS Data Structures and Algorithm Analysis
CS 483 - Data Structures and Algorithm Analysis Lecture VII: Chapter 6, part 2 R. Paul Wiegand George Mason University, Department of Computer Science March 22, 2006 Outline 1 Balanced Trees 2 Heaps &
More informationHeight, Size Performance of Complete and Nearly Complete Binary Search Trees in Dictionary Applications
Height, Size Performance of Complete and Nearly Complete Binary Search Trees in Dictionary Applications AHMED TAREK Department of Math and Computer Science California University of Pennsylvania Eberly
More informationData Structures and and Algorithm Xiaoqing Zheng
Data Structures and Algorithm Xiaoqing Zheng zhengxq@fudan.edu.cn Trees (max heap) 1 16 2 3 14 10 4 5 6 7 8 7 9 3 8 9 10 2 4 1 PARENT(i) return i /2 LEFT(i) return 2i RIGHT(i) return 2i +1 16 14 10 8 7
More information13 Dynamic Programming (3) Optimal Binary Search Trees Subset Sums & Knapsacks
13 Dynamic Programming (3) Optimal Binary Search Trees Subset Sums & Knapsacks Average-case analysis Average-case analysis of algorithms and data structures: Input is generated according to a known probability
More informationAmortized analysis. Amortized analysis
In amortized analysis the goal is to bound the worst case time of a sequence of operations on a data-structure. If n operations take T (n) time (worst case), the amortized cost of an operation is T (n)/n.
More informationCS60007 Algorithm Design and Analysis 2018 Assignment 1
CS60007 Algorithm Design and Analysis 2018 Assignment 1 Palash Dey and Swagato Sanyal Indian Institute of Technology, Kharagpur Please submit the solutions of the problems 6, 11, 12 and 13 (written in
More information1 Basic Definitions. 2 Proof By Contradiction. 3 Exchange Argument
1 Basic Definitions A Problem is a relation from input to acceptable output. For example, INPUT: A list of integers x 1,..., x n OUTPUT: One of the three smallest numbers in the list An algorithm A solves
More informationCS 410/584, Algorithm Design & Analysis: Lecture Notes 3
CS 410/584, : Lecture Notes 3 Amortized Cost Not an algorithm design mechanism per se For analyzing cost of algorithms - - Example: Exercise 17.3-6 Implement a queue with two stacks Why? 1 enqueue(x, Q)
More informationPhylogenetic Reconstruction with Insertions and Deletions
Phylogenetic Reconstruction with Insertions and Deletions [Full Version] Alexandr Andoni Mark Braverman Princeton University Avinatan Hassidim Bar-Ilan University October 19, 2014 Abstract We study phylogenetic
More information1 Approximate Quantiles and Summaries
CS 598CSC: Algorithms for Big Data Lecture date: Sept 25, 2014 Instructor: Chandra Chekuri Scribe: Chandra Chekuri Suppose we have a stream a 1, a 2,..., a n of objects from an ordered universe. For simplicity
More informationLet S be a set of n species. A phylogeny is a rooted tree with n leaves, each of which is uniquely
JOURNAL OF COMPUTATIONAL BIOLOGY Volume 8, Number 1, 2001 Mary Ann Liebert, Inc. Pp. 69 78 Perfect Phylogenetic Networks with Recombination LUSHENG WANG, 1 KAIZHONG ZHANG, 2 and LOUXIN ZHANG 3 ABSTRACT
More informationMETA: An Efficient Matching-Based Method for Error-Tolerant Autocompletion
: An Efficient Matching-Based Method for Error-Tolerant Autocompletion Dong Deng Guoliang Li He Wen H. V. Jagadish Jianhua Feng Department of Computer Science, Tsinghua National Laboratory for Information
More informationLecture 1 : Data Compression and Entropy
CPS290: Algorithmic Foundations of Data Science January 8, 207 Lecture : Data Compression and Entropy Lecturer: Kamesh Munagala Scribe: Kamesh Munagala In this lecture, we will study a simple model for
More information