Chapter 10 Search Structures

10.2 AVL Trees: Motivation
Objective: maintain the binary search tree structure so that access time stays O(log n) under dynamic changes of identifiers (i.e., elements) in the tree.
[Figure: two binary search trees built from the month names by dynamic insertion, one for the best-case insertion order and one for the worst-case insertion order.]

10.2 AVL Trees: Definition
Height-balanced: An empty tree is height-balanced. If T is a non-empty binary tree with $T_L$ and $T_R$ as its left and right subtrees, respectively, then T is height-balanced iff (1) $T_L$ and $T_R$ are height-balanced and (2) $|h_L - h_R| < 2$, where $h_L$ and $h_R$ are the heights of $T_L$ and $T_R$, respectively.
Balance factor: The balance factor BF(k) of a node k in a binary tree is $h_L - h_R$, where $h_L$ and $h_R$ are the heights of the left and right subtrees of k.
AVL tree: a binary search tree in which BF(k) = -1, 0, or 1 for every node k.
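The definition above can be checked directly in code. The following is a minimal sketch, not taken from the slides: the `Node` class and the function names are hypothetical, and the height of an empty tree is assumed to be 0.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    # Hypothetical node type for illustration only.
    key: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def height(t: Optional[Node]) -> int:
    """Height of a binary tree; an empty tree is assumed to have height 0."""
    if t is None:
        return 0
    return 1 + max(height(t.left), height(t.right))


def balance_factor(t: Node) -> int:
    """BF(k) = h_L - h_R for the node k = t."""
    return height(t.left) - height(t.right)


def is_height_balanced(t: Optional[Node]) -> bool:
    """Recursive check that follows the two conditions of the definition."""
    if t is None:
        return True  # an empty tree is height-balanced
    return (is_height_balanced(t.left)
            and is_height_balanced(t.right)
            and abs(balance_factor(t)) < 2)


# A right-leaning chain MAR -> MAY -> NOV is not height-balanced.
chain = Node("MAR", right=Node("MAY", right=Node("NOV")))
# The same keys with MAY at the root are height-balanced.
balanced = Node("MAY", Node("MAR"), Node("NOV"))
```

This version recomputes heights on every call, so it is only meant to mirror the definition; a practical AVL tree would store the height or balance factor in each node.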

10.2 AVL Trees: Insertion
[Figure: the AVL tree after each insertion of a month name, with balance factors and the rotation applied whenever the tree becomes unbalanced.]
(a) Insert MARCH: no rebalancing.
(b) Insert MAY: no rebalancing.
(c) Insert NOVEMBER: RR rotation.
(d) Insert AUGUST: no rebalancing.
(e) Insert APRIL: LL rotation.
(f) Insert JANUARY: LR rotation.
(g) Insert DECEMBER: no rebalancing.
(h) Insert JULY: no rebalancing.
(i) Insert FEBRUARY: RL rotation.
(j) Insert JUNE: LR rotation.
(k) Insert OCTOBER: RR rotation.

10.2 AVL Trees: Rebalancing rotations
[Figure: LL and RR rotations. Each row shows the balanced subtree rooted at A with child B (height h+2), the unbalanced subtree following an insertion, and the rebalanced subtree with B as its root (height h+2). For LL the height of $B_L$ increases to h+1; for RR the height of $B_R$ increases to h+1.]
[Figure: LR(a), LR(b), and LR(c) rotations. Each row shows the balanced subtree rooted at A with left child B and B's right child C, the unbalanced subtree following an insertion (BF(B) = -1, BF(A) = +2), and the rebalanced subtree with C as its root and B and A as its children.]
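The rotations can be sketched in code as follows. This is an illustrative sketch under stated assumptions rather than the slides' implementation: the `Node` class, the stored `height` field, and all function names are hypothetical, and the LR and RL cases are handled as two single rotations.

```python
class Node:
    """Hypothetical AVL node that stores the height of its subtree."""

    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.height = 1


def h(n):
    return n.height if n else 0


def update(n):
    n.height = 1 + max(h(n.left), h(n.right))


def bf(n):
    """Balance factor h_L - h_R."""
    return h(n.left) - h(n.right)


def rotate_right(a):
    """LL case: the left child B becomes the root of the subtree."""
    b = a.left
    a.left = b.right
    b.right = a
    update(a)
    update(b)
    return b


def rotate_left(a):
    """RR case: the right child B becomes the root of the subtree."""
    b = a.right
    a.right = b.left
    b.left = a
    update(a)
    update(b)
    return b


def insert(t, key):
    """Insert key and rebalance on the way back up; returns the new root."""
    if t is None:
        return Node(key)
    if key < t.key:
        t.left = insert(t.left, key)
    elif key > t.key:
        t.right = insert(t.right, key)
    else:
        return t  # duplicate keys are ignored in this sketch
    update(t)
    if bf(t) == 2:
        if bf(t.left) < 0:          # LR: rotate the left child first
            t.left = rotate_left(t.left)
        return rotate_right(t)      # LL
    if bf(t) == -2:
        if bf(t.right) > 0:         # RL: rotate the right child first
            t.right = rotate_right(t.right)
        return rotate_left(t)       # RR
    return t
```

Inserting MARCH, MAY, and NOVEMBER in that order triggers the RR case at the root, so MAY becomes the new root.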

10.2 AVL Trees: Performance comparisons

Operation              Sequential list   Linked list   AVL tree
Search for x           O(log n)          O(n)          O(log n)
Search for k-th item   O(1)              O(k)          O(log n)
Delete x               O(n)              O(1) [1]      O(log n)
Delete k-th item       O(n-k)            O(k)          O(log n)
Insert x               O(n)              O(1) [2]      O(log n)
Output in order        O(n)              O(n)          O(n)

[1] Doubly linked list and position of x known.
[2] Position for insertion known.

10.3 2-3 Trees
- The degree of a node may be more than 2.
- A special case of B-trees.
Definition (2-3 tree):
(1) Each internal node is a 2-node or a 3-node.
(2) If e is a 2-node with key key(e), then every key in LeftChild(e) < key(e) and every key in MiddleChild(e) > key(e).
(3) If e is a 3-node with keys keyL(e) < keyR(e), then every key in LeftChild(e) < keyL(e); keyL(e) < every key in MiddleChild(e) < keyR(e); and every key in RightChild(e) > keyR(e).
(4) All external nodes are at the same level.
[Figure: an example 2-3 tree with root A and children B and C.]

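10.3.3 Inserting into a 2-3 Tree
Searching: O(log n)
Inserting: O(log n)
[Figure: the example 2-3 tree after three successive insertions. The first goes into a leaf that has room; the second splits leaf B into B and D and moves a key up into root A; the third splits a leaf and then the root, creating a new root G with children A and F.]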

10.3.4 Deletion from a 2-3 Tree
Deleting: O(log n)
[Figure: a 2-3 tree with root A and leaves B, C, D, shown after each of six successive deletions.]

10.3.4 Deletion from a 2-3 Tree: Rotations
[Figure: the three rotation cases, each showing the parent r, the node p, and its sibling q before and after the rotation: (a) p is the left child of r; (b) p is the middle child of r; (c) p is the right child of r.]

10.3.4 Deletion from a 2-3 Tree: Procedure
Step 1: Modify node p as necessary to reflect its status after the desired element has been deleted.
Step 2:
    for (; p has zero elements && p != root; p = r) {
        let r be the parent of p, and let q be the left or right sibling of p (as appropriate);
        if (q is a 3-node) perform a rotation;
        else perform a combine;
    }
Step 3: If p has zero elements, then p must be the root. The left child of p becomes the new root, and node p is deleted.

10.6 B-Trees
Definition: An m-way search tree satisfies
(1) The root has at most m subtrees and has the structure $n, A_0, (K_1, A_1), (K_2, A_2), \ldots, (K_n, A_n)$, where the $A_i$ are subtrees and the $K_i$ are key values.
(2) $K_i < K_{i+1}$, $1 \le i < n$.
(3) $K_i <$ all key values in subtree $A_i$ $< K_{i+1}$, $1 \le i < n$.
(4) $K_n <$ all key values in subtree $A_n$, and all key values in subtree $A_0$ $< K_1$.
(5) The subtrees $A_i$, $0 \le i \le n$, are also m-way search trees.
[Figure 10.35: Example of a 3-way search tree that is not a 2-3 tree, together with the schematic format of each node.]
Definition: A B-tree of order m is an m-way search tree satisfying
(1) The root node has at least two children.
(2) All nodes other than the root and failure nodes have at least $\lceil m/2 \rceil$ children.
(3) All failure nodes are at the same level.

10.6.3 B-Trees: Properties
Let N be the number of keys in a B-tree of order m with l levels. Then
N + 1 = the number of failure nodes = the number of nodes at level l+1 $\ge 2\lceil m/2 \rceil^{l-1}$.
Hence, if there are N key values, the number of levels l satisfies
$l \le \log_{\lceil m/2 \rceil}\{(N+1)/2\} + 1$.
Choice of m: depends on the access time, i.e. the time for reading a node from disk plus the time to search the node for x.
[Figure: plot of total maximum search time against m.]
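As a worked example of the level bound, the sketch below finds the largest number of levels l for which $2\lceil m/2 \rceil^{l-1} \le N + 1$ still holds. The function name `max_levels` is hypothetical, and integer arithmetic is used instead of a floating-point logarithm to avoid rounding problems.

```python
def max_levels(n_keys: int, m: int) -> int:
    """Largest l with 2 * ceil(m/2)**(l-1) <= n_keys + 1.

    Assumes m >= 3 (so ceil(m/2) >= 2) and n_keys >= 1.
    """
    if m < 3 or n_keys < 1:
        raise ValueError("this sketch assumes m >= 3 and n_keys >= 1")
    d = -(-m // 2)  # ceil(m / 2)
    levels = 1
    # Increase the level count while one more level is still feasible.
    while 2 * d ** levels <= n_keys + 1:
        levels += 1
    return levels
```

For instance, `max_levels(1_999_998, 200)` evaluates to 3: a B-tree of order 200 holding about two million keys needs at most three levels of nodes.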

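10.9 Tries
[Figure: a trie for the keys bluebird, bunting, cardinal, chickadee, godwit, goshawk, gull, oriole, thrasher, thrush, and wren. Each branch node has one link per character (blank, a, b, ...); the first level branches on the first character of the key and lower levels branch on later characters.]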

10.9 Tries: Searching and Sampling Strategies
1. Searching: O(l), where l is the number of levels.
2. How to reduce l: choose a sampling strategy that determines which character of the key value x is used at the i-th level.
Example: Sample(x, i) = $x_{r(x,i)}$, for r(x, i) a randomization function.
[Figure: a trie obtained by sampling one character at a time, from right to left.]
[Figure: an optimal trie, in which sampling on the first level is done by using the fourth character.]

10.9 Tries: Insertion and Deletion
Grow when inserting.
Shrink when deleting: this needs a count data member in each branch node.
[Figure: section of a trie showing the changes resulting from inserting bobwhite and bluejay.]
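The grow and shrink behaviour can be sketched as follows. This is a simplified illustration, not the slides' structure: it samples characters left to right, stores one character per level down to the end of the key (rather than stopping at the shortest distinguishing prefix), and the class and attribute names are hypothetical. The `count` field plays the role of the count data member mentioned above.

```python
class TrieNode:
    def __init__(self):
        self.children = {}    # character -> TrieNode
        self.count = 0        # number of keys stored in this subtrie
        self.is_key = False   # True if a key ends at this node


class Trie:
    def __init__(self):
        self.root = TrieNode()

    def search(self, key: str) -> bool:
        """Follow one link per character; the cost is the number of levels."""
        node = self.root
        for ch in key:
            node = node.children.get(ch)
            if node is None:
                return False
        return node.is_key

    def insert(self, key: str) -> None:
        """Grow the trie along the path of key, updating the counts."""
        if self.search(key):
            return
        node = self.root
        node.count += 1
        for ch in key:
            node = node.children.setdefault(ch, TrieNode())
            node.count += 1
        node.is_key = True

    def delete(self, key: str) -> bool:
        """Shrink the trie: drop a branch as soon as its count reaches 0."""
        if not self.search(key):
            return False
        node = self.root
        node.count -= 1
        for ch in key:
            child = node.children[ch]
            child.count -= 1
            if child.count == 0:
                del node.children[ch]  # the whole branch held only this key
                return True
            node = child
        node.is_key = False
        return True
```

Because every node knows how many keys lie below it, deletion can discard an entire branch in one step instead of walking back up to look for empty nodes.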

Outline
1. Introduction
2. Finding max. and min.
3. Finding the 2nd largest key
4. The Selection Problem
5. A lower bound for finding the median

1. Introduction
SP (Selection Problem): Given a set of n real numbers, find the k-th smallest one, $1 \le k \le n$.
How can you solve it? Well,
(1) Sort the numbers.
(2) Pick the k-th smallest one.
This takes O(n log n) time. Any better way?

What is a trivial lower bound on the time complexity of solving SP?
$T_L(n) = \Omega(n)$. Why?
What if we consider only comparisons? Well, ...

P: Given a set S of n real numbers, find the largest one.
$W_c = \{(?, ?, \ldots, x_1), (?, ?, \ldots, x_2), \ldots, (?, ?, \ldots, x_n)\}$, so $|W_c| = n$ and
$T_L(n) = \lceil \log_2 |W_c| \rceil = \lceil \log_2 n \rceil$.
However, this is not tight!!! Why?
Example: n = 3, $S = \{x_1, x_2, x_3\}$, $W_c = \{(?, ?, x_1), (?, ?, x_2), (?, ?, x_3)\}$, and the set of leaves is $L = \{l_1, l_2, l_3, l_4\}$.
[Figure: decision tree whose root compares $x_1 : x_2$ and whose two children compare $x_2 : x_3$ and $x_1 : x_3$; the four leaves $l_1, \ldots, l_4$ output $x_3, x_2, x_3, x_1$.]
$|L| \gg |W_c| = n$ as $n \to \infty$!!!

Adversary Arguments
A Guessing Game!!! $Z_{1000} = \{0, 1, \ldots, 999\}$. Guess the number in $Z_{1000}$ that I have in mind.
I can change my mind as long as my answers (responses) are consistent!!!
Maximize the number of leaves in a decision tree.

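2. Finding Max. and Min.
MM: Given a set of n real numbers, find max and min, where max is the largest number and min is the smallest number.
How can you solve MM? Assume n = 2m.
(1) Compare the keys in pairs $(x_1, x_2), (x_3, x_4), \ldots, (x_{2m-1}, x_{2m})$, dividing them into the winners $W = \{x_{11}, x_{21}, \ldots, x_{m1}\}$ and the losers $L = \{x_{12}, x_{22}, \ldots, x_{m2}\}$.
(2) Find max among W.
(3) Find min among L.
How many comparisons?
$m$ (dividing) $+\ (m-1)$ (finding max) $+\ (m-1)$ (finding min) $= 3m - 2 = \frac{3n}{2} - 2$.
Any better way?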
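The pairing idea can be sketched in code with an explicit comparison counter. This is a minimal sketch under assumptions: the function name `max_min` is hypothetical, the winners and losers are processed pair by pair instead of being collected into separate sets, and an odd n is handled by seeding max and min with the first key.

```python
def max_min(xs):
    """Return (max, min, number_of_key_comparisons) using pairwise comparison."""
    n = len(xs)
    if n == 0:
        raise ValueError("need at least one key")
    comparisons = 0
    if n % 2 == 1:
        # Odd n: the first key starts as both max and min, no comparison needed.
        hi = lo = xs[0]
        start = 1
    else:
        # Even n: the first pair costs one comparison.
        comparisons += 1
        if xs[0] > xs[1]:
            hi, lo = xs[0], xs[1]
        else:
            hi, lo = xs[1], xs[0]
        start = 2
    for i in range(start, n, 2):
        a, b = xs[i], xs[i + 1]
        # One comparison inside the pair, one against max, one against min.
        comparisons += 3
        winner, loser = (a, b) if a > b else (b, a)
        if winner > hi:
            hi = winner
        if loser < lo:
            lo = loser
    return hi, lo, comparisons
```

For n = 2m this uses 1 + 3(m - 1) = 3m - 2 comparisons, matching the count above; for example, six keys take 7 comparisons.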

What information is needed for finding max and min?
Finding max: every number except max itself must lose at least one comparison (n-1 losses).
Finding min: every number except min itself must win at least one comparison (n-1 wins).
1 unit of information = 1 win or 1 loss, so (2n - 2) units of information are needed!!!
[Figure: six keys $x_1, \ldots, x_6$ including max and min; every key except max is marked with a loss (L) and every key except min is marked with a win (W).]

Status of a number $(x_i, s_i)$:
W: at least one win, no loss
L: at least one loss, no win
WL: wins and losses
N: no comparisons yet

Adversary strategy for keys x and y compared by an algorithm:

Status of (x, y)                 Adversary response                 New status          Units of new information
(N, N)                           x > y                              (W, L)              2
(W, N) or (WL, N) *              x > y                              (W, L) or (WL, L)   1
(L, N) **                        x < y                              (L, W)              1
(W, W)                           x > y                              (W, WL)             1
(L, L)                           x > y                              (WL, L)             1
(W, L), (WL, L) or (W, WL) ***   x > y                              No change           0
(WL, WL)                         Consistent with assigned values    No change           0

*   (N, W) or (N, WL) can be treated symmetrically.
**  (N, L) can be treated symmetrically.
*** (L, W), (L, WL) or (WL, W) can be treated symmetrically.

Example
[Table: a run of the adversary strategy on six keys $x_1, \ldots, x_6$ for the comparison sequence $(x_1, x_2)$, $(x_1, x_5)$, $(x_3, x_4)$, $(x_3, x_6)$, $(x_3, x_1)$, $(x_2, x_4)$, $(x_5, x_6)$, $(x_6, x_4)$, listing the status and the assigned value of each key after every comparison. Initially every key has status N and no assigned value (*).]

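Theorem: Any algorithm to find max and min of n numbers must do at least $\frac{3n}{2} - 2$ comparisons in the worst case.
[Proof] Let n = 2m (for simplicity). Only a comparison of two keys with status (N, N) yields 2 units of information, and there can be at most $\frac{n}{2} = m$ such comparisons, yielding at most 2m units. The information needed is $2n - 2 = 4m - 2$ units, so the remaining $(4m - 2) - 2m = 2m - 2$ units cost at least one comparison each. In total,
$m + (2m - 2) = \frac{n}{2} + (n - 2) = \frac{3n}{2} - 2$.
What if n = 2m + 1? Exercise.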

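3. Finding the 2nd largest key
2L: Given a set of n real numbers, find the largest two numbers, max and max2, where max2 is the 2nd largest one.
Straightforward approach: find max with n - 1 comparisons, then find max2 among the remaining n - 1 keys with n - 2 comparisons, for 2n - 3 comparisons in total!!!
[Figure: keys $x_1, x_2, x_3, \ldots, x_n$ marked with W (win) and L (loss).]
Do we need all those L? Any better algorithm?
max: n - 1 comparisons.
max2: ? How many numbers were compared directly with max?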

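[Figure: tournament tree for finding max, in which the key 19 wins at the root; max2 is then found among the keys that lost directly to 19.]
$(n - 1) + (\lceil \log_2 n \rceil - 1) = n + \lceil \log_2 n \rceil - 2$ comparisons.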
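The tournament method can be sketched as follows. This is an illustrative sketch, not the slides' code: the function name `second_largest` is hypothetical, keys are assumed distinct, and a key without a partner in some round simply advances with a bye.

```python
def second_largest(xs):
    """Return (max, max2, number_of_key_comparisons) via a knockout tournament."""
    if len(xs) < 2:
        raise ValueError("need at least two keys")
    comparisons = 0
    # Each player is (key, list of keys that lost directly to it).
    players = [(x, []) for x in xs]
    while len(players) > 1:
        next_round = []
        for i in range(0, len(players) - 1, 2):
            (a, a_beaten), (b, b_beaten) = players[i], players[i + 1]
            comparisons += 1
            if a > b:
                a_beaten.append(b)
                next_round.append((a, a_beaten))
            else:
                b_beaten.append(a)
                next_round.append((b, b_beaten))
        if len(players) % 2 == 1:
            next_round.append(players[-1])  # bye: advances without a comparison
        players = next_round
    best, direct_losers = players[0]
    # max2 must be one of the keys that lost directly to max.
    second = direct_losers[0]
    for x in direct_losers[1:]:
        comparisons += 1
        if x > second:
            second = x
    return best, second, comparisons
```

With 8 keys the tournament uses 7 comparisons and max has 3 direct losers, so 2 more comparisons find max2, for 9 = 8 + 3 - 2 in total.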

Initially, $w(x_i) = 1$ for i = 1, 2, ..., n. Upon each comparison of $x_i$ and $x_j$, the adversary's response and the weight update depend on the weights $w(x_i)$ and $w(x_j)$:

Case                    Adversary response                   Weight update
w(x_i) > w(x_j)         x_i > x_j                            w(x_i) := w(x_i) + w(x_j); w(x_j) := 0
w(x_i) = w(x_j) > 0     same as above                        same as above
w(x_i) < w(x_j)         x_i < x_j                            w(x_j) := w(x_j) + w(x_i); w(x_i) := 0
w(x_i) = w(x_j) = 0     consistent with previous responses   no change

Example (n = 5):

Comparison    Response     w(x_1)  w(x_2)  w(x_3)  w(x_4)  w(x_5)
(initially)                1       1       1       1       1
(x_1, x_2)    x_1 > x_2    2       0       1       1       1
(x_3, x_4)    x_3 > x_4    2       0       2       0       1
(x_3, x_5)    x_3 > x_5    2       0       3       0       0

Lemma: The number of direct losers to max is at least $\lceil \log_2 n \rceil$.
[Proof] Let max = $x_i$ for some i. When the algorithm stops, $w(x_i) = n$.
Let $w_k(x_i)$ be $w(x_i)$ after the k-th win against a previously undefeated key. Then $w_k(x_i) \le 2\,w_{k-1}(x_i)$. Why?
$w_0(x_i) = 1$ and $w_k(x_i) = w_{k-1}(x_i) + w(x_j) \le 2\,w_{k-1}(x_i)$, since $w(x_j) \le w_{k-1}(x_i)$ for the comparison $(x_i, x_j)$.
Suppose that $x_i$ eventually wins against t previously undefeated keys. Then $n = w_t(x_i) \le 2^t$. Why? $w_t(x_i) \le 2^t\,w_0(x_i) = 2^t$. Hence $\log_2 n \le t$, and since t is an integer, $t \ge \lceil \log_2 n \rceil$.
Theorem: Any algorithm to find the 2nd largest number in a set of n real numbers must do at least $n + \lceil \log_2 n \rceil - 2$ comparisons.

Lecture Schedule
November 18 (Friday)
1: ~ 11: class A
14: ~ 15: class B
Room #4443 (Oh Sang-su lecture room)

4. Selection Problem
SP: Given a set S of n real numbers, find the k-th smallest one, $N_k$.
[Figure: $N_k$ drawn with the n - k numbers greater than $N_k$ above it and the k - 1 numbers less than $N_k$ below it; an edge from a down to b means that b is less than a (b < a).]
In order to fix the k-th smallest number $N_k$, the relation of $N_k$ to each number in S must be established!!! Why?

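[Figure: keys x and y drawn together with $N_k$, where y has been compared with x but is not related to $N_k$.]
An adversary could change the value of a y which is not related to $N_k$!!!
n - 1 crucial comparisons!!! Why?
Theorem: Finding the k-th smallest element in S requires at least |S| - 1 comparisons.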

How to find the k-th smallest one
A straightforward approach:
(1) Sort S.
(2) Pick the k-th smallest one.
This takes O(n log n) time, which is far from optimal!!!
Any better idea? Well, try Divide and Conquer!!!

S = { 21, 15, 13, 8, 7, 29, 22, 2, 5, 10, 3, 26, 4, 19, 12, 20, 18, 24, 16, 23, 11, 1, 25, 14, 27, 6, 17, 9, 28 }

Divide S into $\lfloor |S|/5 \rfloor$ sequences of 5 elements each, with up to 4 leftover elements (each column is one sequence; the last column holds the leftovers):

21  29   3  20  11   6
15  22  26  18   1  17
13   2   4  24  25   9
 8   5  19  16  14  28
 7  10  12  23  27

Sort each 5-element sequence:

21  29  26  24  27   6
15  22  19  23  25  17
13  10  12  20  14   9
 8   5   4  18  11  28
 7   2   3  16   1

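Arrange the sorted sequences in increasing order of their medians (the figure labels the four quadrants of this arrangement A and B at the top, C and D at the bottom):

29  26  21  27  24   6
22  19  15  25  23  17
10  12  13  14  20   9
 5   4   8  11  18  28
 2   3   7   1  16

M = {10, 12, 13, 14, 20}, the set of medians of the 5-element sequences
m = the median of M = 13
$S_1 = \{s \mid s < m \text{ and } s \in S\}$
$S_2 = \{s \mid s = m \text{ and } s \in S\}$
$S_3 = \{s \mid s > m \text{ and } s \in S\}$
$|S_1| \le \frac{3}{4}|S|$ and $|S_3| \le \frac{3}{4}|S|$. Why?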

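[Figure: the general case: the sorted 5-element sequences arranged by median, with m at the center and the four regions A, B, C, D around it.]
$S_1 = \{s \mid s < m \text{ and } s \in S\}$
$S_2 = \{s \mid s = m \text{ and } s \in S\}$
$S_3 = \{s \mid s > m \text{ and } s \in S\}$
$|S_1| \le \frac{3}{4}|S|$ and $|S_3| \le \frac{3}{4}|S|$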

if $|S_1| \ge k$ then select($S_1$, k)
else if $|S_1| + |S_2| \ge k$ then m is the k-th smallest one
else select($S_3$, $k - |S_1| - |S_2|$)
end

$T(n) = T\left(\frac{3n}{4}\right) + c \cdot n + T\left(\frac{n}{5}\right)$. Why?

Algorithm (finding the k-th smallest element in S)

procedure SELECT(k, S)
begin
    if |S| < 50 then
    begin
        Sort S;
        SELECT := the k-th smallest one
    end {if}
    else
    begin
        Divide S into ⌊|S|/5⌋ sequences of 5 elements each, with up to 4 leftover elements;   { c_1 n }
        Sort each 5-element sequence;                                                         { c_2 n }
        Let M be the set of medians of the 5-element sets (sequences);
        m := SELECT(⌈|M|/2⌉, M);                                                              { T(n/5) }
        S_1 := {s | s < m and s ∈ S};
        S_2 := {s | s = m and s ∈ S};
        S_3 := {s | s > m and s ∈ S};                                                         { c_3 n }
        if |S_1| ≥ k then SELECT := SELECT(k, S_1)                                            { T(3n/4) }
        else if |S_1| + |S_2| ≥ k then SELECT := m
        else SELECT := SELECT(k - |S_1| - |S_2|, S_3)
    end
end

Why do the individual steps cost c_1 n, c_2 n, T(n/5), c_3 n, and T(3n/4)?
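The SELECT procedure can be turned into a short runnable sketch. The function name `select` and the list-based representation of S are assumptions made for illustration, and the cutoff of 50 below which the input is simply sorted is also an assumption of this sketch (any small constant that keeps the recurrence valid would do).

```python
def select(k, s):
    """Return the k-th smallest element of the list s (k is 1-based)."""
    if not 1 <= k <= len(s):
        raise ValueError("k must satisfy 1 <= k <= len(s)")
    if len(s) < 50:  # assumed cutoff for the base case
        return sorted(s)[k - 1]
    # Divide s into floor(len(s)/5) groups of 5; up to 4 leftovers are skipped here.
    full = len(s) - len(s) % 5
    groups = [sorted(s[i:i + 5]) for i in range(0, full, 5)]
    # M is the collection of group medians; m is the median of M.
    medians = [g[2] for g in groups]
    m = select((len(medians) + 1) // 2, medians)
    # Partition all of s (leftovers included) around m.
    s1 = [x for x in s if x < m]
    s2 = [x for x in s if x == m]
    s3 = [x for x in s if x > m]
    if len(s1) >= k:
        return select(k, s1)
    if len(s1) + len(s2) >= k:
        return m
    return select(k - len(s1) - len(s2), s3)
```

The two recursive calls correspond to the T(n/5) term (median of the medians) and the T(3n/4) term (the surviving part S_1 or S_3); everything else is linear work.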

$T(n) \le c \cdot n$ if $n < 50$
$T(n) \le T(n/5) + T(3n/4) + c \cdot n$ if $n \ge 50$

Show that $T(n) \le 20cn$. How? By induction!!!

Basis (n = 50):
$T(50) \le T(10) + T(150/4) + c \cdot 50 \le T(10) + T(38) + c \cdot 50 \le c \cdot 10 + c \cdot 38 + c \cdot 50 = c \cdot 98 \le 20c \cdot 50$

Hypothesis: $T(n) \le 20cn$ for $50 < n \le k$.

Step (n = k+1):
$T(k+1) \le T((k+1)/5) + T(3(k+1)/4) + c(k+1)$
$\le 20c(k+1)/5 + 20c \cdot 3(k+1)/4 + c(k+1)$
$= 4c(k+1) + 15c(k+1) + c(k+1)$
$= 20c(k+1)$

Finding the Median: number of comparisons

16n          Blum [1973]
5.4n         Hyafil [1976]
3n + o(n)    Schönhage, Paterson, and Pippenger [1976]: the best known algorithm (o is little-o)

5. A Lower Bound for Finding the Median
The median is the k-th smallest key with k = (n+1)/2, so n - 1 (crucial) comparisons are needed.
Can you find any tighter lower bound? Well, why not use an adversary argument?

Observation
[Figure: keys x and y placed relative to the median, contrasting crucial comparisons with non-crucial comparisons.]
Definition: A comparison involving an element x is said to be a crucial comparison for x if it is the first comparison with a y satisfying one of the following conditions:
(1) x > y for some $y \ge$ median.
(2) x < y for some $y \le$ median.
Note:
(i) A crucial comparison for x establishes the relation of x to the median.
(ii) The relation of y to the median is not necessarily known at the time the crucial comparison for x is done.

Adversary Strategy
Force an algorithm to perform as many non-crucial comparisons as possible. How? By assigning values to variables.
Status of a key $(x_i, s_i)$:
L: assigned a value larger than the median
S: assigned a value smaller than the median
N: not yet in a comparison

Comparing $x_i$ and $x_j$, $i \ne j$:

Status of (x_i, x_j)             Adversary response                                  New status
(N, N)                           x_i > median > x_j                                  (L, S)
(L, N)                           make the unassigned one smaller than the median     (L, S)
(S, N)                           reverse the above                                   (S, L)
(L, S), (L, L), (S, L), (S, S)   consistent with previous responses                  unchanged

[Figure: the n keys in sorted order: (n-1)/2 elements < median, the median as the ((n+1)/2)-th element, and (n-1)/2 elements > median.]
(1) Unless there are already (n - 1)/2 elements with status S (or L), keep the strategy previously stated!!!
(2) Otherwise, assign values so as to balance the numbers of keys with status L and with status S.
(n - 1)/2 non-crucial comparisons are possible!!! Why?
$(n - 1)$ crucial $+\ \frac{n-1}{2}$ non-crucial $= \frac{3}{2}(n - 1)$ comparisons.

Theorem: Any algorithm to find the median of n numbers must do at least $\frac{3}{2}(n - 1)$ comparisons.
Improved lower bounds: $\frac{3}{2}(n - 1)$, then $1.75n - \log n$, then $1.8n$, then $2n$, the best lower bound currently known!!!
(n - 1) comparisons is a tight lower bound only for k = 1 and k = n!!!

Project 2
Graph-related algorithms
- Both directed and undirected graphs
- Menu-driven:
  1) Initialize Graphs
  2) Min-Cost spanning tree
  3) Dijkstra's shortest path
  4) Depth-first Search
  5) Breadth-first Search
  6) Biconnected components
  7) Strongly connected components