Disjoint set (Union-Find)


CS124 Lecture 7, Fall 2018

For Kruskal's algorithm for the minimum spanning tree problem, we found that we needed a data structure for maintaining a collection of disjoint sets. That is, we need a data structure that can handle the following operations:

MAKESET(x) - create a new set containing the single element x
UNION(x,y) - replace the two sets containing x and y by their union
FIND(x) - return the name of the set containing the element x

Naturally, this data structure is useful in other situations, so we shall consider its implementation in some detail.

Within our data structure, each set is represented by a tree, so that each element points to a parent in the tree. The root of each tree will point to itself. In fact, we shall use the root of the tree as the name of the set itself; hence the name of each set is given by a canonical element, namely the root of the associated tree.

It is convenient to add a fourth operation LINK(x,y) to the above, where we require for LINK that x and y are two roots. LINK changes the parent pointer of one of the roots, say x, and makes it point to y. It returns the root of the now composite tree y. With this addition, we have UNION(x,y) = LINK(FIND(x),FIND(y)), so the main problem is to arrange our data structure so that FIND operations are very efficient.

Notice that the time to do a FIND operation on an element corresponds to its depth in the tree. Hence our goal is to keep the trees short. Two well-known heuristics for keeping trees short in this setting are UNION BY RANK and PATH COMPRESSION.

We start with the UNION BY RANK heuristic. The idea of UNION BY RANK is to ensure that when we combine two trees, we try to keep the overall depth of the resulting tree small. This is implemented as follows: the rank of an element x is initialized to 0 by MAKESET. An element's rank is only updated by the LINK operation. If x and y have the same rank r, then invoking LINK(x,y) causes the parent pointer of x to be updated to point to y, and the rank of y is then updated to r + 1.
On the other hand, if x and y have different ranks, then when invoking LINK(x,y) the parent pointer of the element with smaller rank is updated to point to the element with larger rank. The idea is that the rank of the root is associated with the depth of the tree, so this process keeps the depth small. (Exercise: Try some examples by hand with and without using the UNION BY RANK heuristic.)
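As a starting point for the exercise, here is a minimal Python sketch of UNION BY RANK on its own, with a FIND that only walks parent pointers. The dictionary layout, the lowercase function names, the `depth` helper, and the early return when both elements are already in the same set are assumptions made for illustration; they are not part of the notes.

```python
# Sketch only: UNION BY RANK without any pointer rewriting in find().
parent = {}  # parent[x] is the parent pointer of x; roots point to themselves
rank = {}    # rank[x] starts at 0 and is changed only by link()


def make_set(x):
    parent[x] = x
    rank[x] = 0


def find(x):
    # Walk parent pointers up to the root without modifying them.
    while parent[x] != x:
        x = parent[x]
    return x


def link(x, y):
    # x and y must both be roots.
    if rank[x] > rank[y]:
        x, y = y, x              # make y the root of larger (or equal) rank
    if rank[x] == rank[y]:
        rank[y] += 1             # equal ranks: the surviving root grows by one
    parent[x] = y
    return y


def union(x, y):
    rx, ry = find(x), find(y)
    if rx == ry:                 # assumed guard: already in the same set
        return rx
    return link(rx, ry)


def depth(x):
    # Number of pointers followed from x to its root.
    d = 0
    while parent[x] != x:
        x = parent[x]
        d += 1
    return d


for i in range(8):
    make_set(i)
for a, b in [(0, 1), (2, 3), (4, 5), (6, 7), (0, 2), (4, 6), (0, 4)]:
    union(a, b)

max_depth = max(depth(i) for i in range(8))
print(find(0), rank[find(0)], max_depth)
```

Merging eight singletons pairwise in this order is the worst case for rank growth, and the deepest element still sits only three pointers from the root.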

The idea of PATH COMPRESSION is that, once we perform a FIND on some element, we should adjust its parent pointer so that it points directly to the root; that way, if we ever do another FIND on it, we start out much closer to the root. Note that, until we do a FIND on an element, it might not be worth the effort to update its parent pointer, since we may never access it at all. Once we access an item, however, we must walk through every pointer to the root, so modifying the pointers only changes the cost of this walk by a constant factor.

procedure MAKESET(x)
    p(x) := x
    rank(x) := 0

function FIND(x)
    if x ≠ p(x) then
        p(x) := FIND(p(x))
    return(p(x))

function LINK(x,y)
    if rank(x) > rank(y) then x ↔ y
    if rank(x) = rank(y) then rank(y) := rank(y) + 1
    p(x) := y
    return(y)

procedure UNION(x,y)
    LINK(FIND(x),FIND(y))

In our analysis, we show that any sequence of m UNION and FIND operations on n elements takes at most $O((m + n)\log^* n)$ steps, where $\log^* n$ is the number of times you must iterate the $\log_2$ function on n before getting a number less than or equal to 1. (So $\log^* 4 = 2$, $\log^* 16 = 3$, $\log^* 65536 = 4$.) We should note that this is not the tightest analysis possible; however, this analysis is already somewhat complex!

Note that we are going to do an amortized analysis here. That is, we are going to consider the cost of the algorithm over a sequence of steps, instead of considering the cost of a single operation. In fact a single UNION or FIND operation could require $O(\log n)$ operations. (Exercise: Prove this!) Only by considering an entire sequence of operations at once can we obtain the above bound. Our argument will require some interesting accounting to total the cost of a sequence of steps.

We first make a few observations about rank.

- if $v \neq p(v)$ then $\mathrm{rank}(p(v)) > \mathrm{rank}(v)$
- whenever p(v) is updated, rank(p(v)) increases
- the number of elements with rank k is at most $n/2^k$
- the number of elements with rank at least k is at most $n/2^{k-1}$

The first two assertions are immediate from the description of the algorithm. The third assertion follows from the fact that the rank of an element v changes only if LINK(v,w) is executed, rank(v) = rank(w), and v remains the root of the combined tree; in this case v's rank is incremented by 1. A simple induction then yields that when rank(v) is incremented to k, the resulting tree has at least $2^k$ elements. The sets of elements counted this way for distinct rank-k elements are disjoint, so at most $n/2^k$ elements can reach rank k. The last assertion then follows from the third assertion, as $\sum_{j=k}^{\infty} n/2^j = n/2^{k-1}$.

Exercise: Show that the maximum rank an item can have is $\log n$.

As soon as an element becomes a non-root, its rank is fixed. Let us divide the (non-root) elements into groups according to their ranks. Group i contains all elements whose rank r satisfies $\log^* r = i$. For example, elements in group 3 have ranks in the range (4,16], and in general the range of ranks associated with group i is $(k, 2^k]$, where k is a tower of i - 1 twos ($k = 1, 2, 4, 16, 65536, \ldots$ for $i = 1, 2, 3, 4, 5, \ldots$). For convenience we shall write this more simply by saying group $(k, 2^k]$ to mean the group with these ranks.

It is easy to establish the following assertions about these groups:

- The number of distinct groups is at most $\log^* n$. (Use the fact that the maximum rank is $\log n$.)
- The number of elements in the group $(k, 2^k]$ is at most $n/2^k$.

Let us assign $2^k$ tokens to each element in group $(k, 2^k]$. The total number of tokens assigned to all elements from that group is then at most $2^k \cdot n/2^k = n$, and the total number of groups is at most $\log^* n$, so the total number of tokens given out is at most $n \log^* n$. We use these tokens to account for the work done by FIND operations. Recall that the number of steps for a FIND operation is proportional to the number of pointers that the FIND operation must follow up the tree.
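To make the iterated logarithm and the rank groups concrete, the following Python sketch computes log* with exact integer arithmetic and lists which ranks fall into each group. The function names `log_star` and `ranks_in_group` are hypothetical, and treating rank 0 as belonging to group 0 is an assumption made here so that every rank has a group.

```python
def log_star(n):
    """Number of times log2 must be iterated on n before the value is <= 1.

    Computed without floating point: the answer is the smallest i such that
    a tower of i twos (1, 2, 4, 16, 65536, ...) is at least n.
    """
    count = 0
    tower = 1
    while tower < n:
        tower = 2 ** tower
        count += 1
    return count


def ranks_in_group(i, max_rank):
    """All ranks r in 0..max_rank whose group number log_star(r) equals i."""
    return [r for r in range(max_rank + 1) if log_star(r) == i]


# The three values quoted in the text.
print(log_star(4), log_star(16), log_star(65536))

# Groups for ranks up to 16: group 3 should be exactly the ranks in (4, 16].
for i in range(4):
    print(i, ranks_in_group(i, 16))
```

Because each group's upper end is 2 raised to the previous group's upper end, even an astronomically large n produces only a handful of groups, which is why the log* factor in the bound is so small in practice.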
We separate the pointers into two groups, depending on the groups of u and p(u) = v, as follows:

Type 1: a pointer is of Type 1 if u and v belong to different groups, or v is the root.
Type 2: a pointer is of Type 2 if u and v belong to the same group.

We account for the two Types of pointers in two different ways. Type 1 links are charged directly to the FIND operation; Type 2 links are charged to u, who pays for the operation using one of the tokens. Let us consider these charges more carefully.

The number of Type 1 links each FIND operation goes through is at most $\log^* n$, since there are only $\log^* n$ groups, and the group number increases as we move up the tree.

What about Type 2 links? We charge these links directly back to u, who is supposed to pay for them with a token. Does u have enough tokens? The point here is that each time a FIND operation goes through an element u, its parent pointer is changed to the current root of the tree (by PATH COMPRESSION), so the rank of its parent increases by at least 1. If u is in the group $(k, 2^k]$, then the rank of u's parent can increase fewer than $2^k$ times before it moves to a higher group. Therefore the $2^k$ tokens we assign to u are sufficient to pay for all FIND operations that go through u to a parent in the same group.

We now count the total number of steps for m UNION and FIND operations. Clearly LINK requires just O(1) steps, and since a UNION operation is just a LINK and 2 FIND operations, it suffices to bound the time for at most 2m FIND operations. Each FIND operation is charged at most $\log^* n$, for a total of $O(m \log^* n)$. The total number of tokens used is at most $n \log^* n$, and each token pays for a constant number of steps. Therefore the total number of steps is $O((m + n)\log^* n)$.

Let us give a more equation-oriented explanation. The total time spent over the course of m UNION and FIND operations is just

$$\sum_{\text{all FIND ops}} (\#\text{ links passed through}).$$

We split this sum up into two parts:

$$\sum_{\text{all FIND ops}} (\#\text{ links in same group}) + \sum_{\text{all FIND ops}} (\#\text{ links in different groups}).$$

(Technically, the case where a link goes to the root should be handled explicitly; however, this is just O(m) links in total, so we don't need to worry!) The second term is clearly $O(m \log^* n)$.
The first term can be upper bounded by:

$$\sum_{\text{all elements } u} (\#\text{ ranks in the group of } u),$$

because each element u can be charged only once for each rank in its group. (Note here that this is because the links to the root count in the second sum!) This last sum is bounded above by

$$\sum_{\text{all groups}} (\#\text{ items in group}) \cdot (\#\text{ ranks in group}) \le \sum_{k=1}^{\log^* n} \frac{n}{2^k} \cdot 2^k \le n \log^* n.$$

This completes the proof.

Figure 7.1: Examples of UNION BY RANK and PATH COMPRESSION. (Panels: UNION(x,y) applied to roots x and y; FIND(d) applied to a tree on the elements a, b, c, d.)
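The FIND(d) example named in Figure 7.1 can be tried out with a short Python sketch of the full structure, combining UNION BY RANK with PATH COMPRESSION. The class name `DisjointSets`, its method names, and the early return in `union` when both elements share a root are assumptions for illustration. The chain d, c, b, a is also wired up by hand here, since UNION BY RANK would not produce such a tall tree on four elements; that setup is one reading of the figure, not something the notes specify.

```python
class DisjointSets:
    """Sketch of union-find with UNION BY RANK and PATH COMPRESSION."""

    def __init__(self):
        self.parent = {}
        self.rank = {}

    def make_set(self, x):
        self.parent[x] = x
        self.rank[x] = 0

    def find(self, x):
        # Recursive, mirroring the FIND pseudocode: every element on the
        # path from x to the root ends up pointing directly at the root.
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])
        return self.parent[x]

    def link(self, x, y):
        # x and y must both be roots.
        if self.rank[x] > self.rank[y]:
            x, y = y, x
        if self.rank[x] == self.rank[y]:
            self.rank[y] += 1
        self.parent[x] = y
        return y

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:              # assumed guard: nothing to merge
            return rx
        return self.link(rx, ry)


ds = DisjointSets()
for name in "abcd":
    ds.make_set(name)

# Hand-built chain d -> c -> b -> a, for illustration only. Ranks are set
# so that they still strictly increase along parent pointers.
ds.parent.update({"d": "c", "c": "b", "b": "a"})
ds.rank.update({"a": 3, "b": 2, "c": 1, "d": 0})

root = ds.find("d")
print(root, ds.parent)
```

After the single call `ds.find("d")`, the elements b, c, and d all point straight at a, so any later FIND on them follows one pointer instead of up to three.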