RANDOMIZED ALGORITHMS

Similar documents
WHAT IS A SKIP LIST? S 2 S 1. A skip list for a set S of distinct (key, element) items is a series of lists

Skip Lists. What is a Skip List. Skip Lists 3/19/14

Skip Lists. Presentation for use with the textbook, Algorithm Design and Applications, by M. T. Goodrich and R. Tamassia, Wiley, 2015 S 3 S S 1

1 Probability Review. CS 124 Section #8 Hashing, Skip Lists 3/20/17. Expectation (weighted average): the expectation of a random quantity X is:

Skip List. CS 561, Lecture 11. Skip List. Skip List

Design and Analysis of Algorithms

Chapter 18 Sampling Distribution Models

Outline Resource Introduction Usage Testing Exercises. Linked Lists COMP SCI / SFWR ENG 2S03. Department of Computing and Software McMaster University

Motivation. Dictionaries. Direct Addressing. CSE 680 Prof. Roger Crawfis

Data Structures and Algorithms " Search Trees!!

CS 125 Section #12 (More) Probability and Randomized Algorithms 11/24/14. For random numbers X which only take on nonnegative integer values, E(X) =

Search Trees. Chapter 10. CSE 2011 Prof. J. Elder Last Updated: :52 AM

Randomized Sorting Algorithms Quick sort can be converted to a randomized algorithm by picking the pivot element randomly. In this case we can show th

27 Binary Arithmetic: An Application to Programming

AVL Trees. Manolis Koubarakis. Data Structures and Programming Techniques

1 Maintaining a Dictionary

Dictionary: an abstract data type

CS 361: Probability & Statistics

6.045: Automata, Computability, and Complexity (GITCS) Class 17 Nancy Lynch

Dictionary: an abstract data type

Recitation 6. Randomization. 6.1 Announcements. RandomLab has been released, and is due Monday, October 2. It s worth 100 points.

Fast Sorting and Selection. A Lower Bound for Worst Case

CSE525: Randomized Algorithms and Probabilistic Analysis April 2, Lecture 1

CSE 103 Homework 8: Solutions November 30, var(x) = np(1 p) = P r( X ) 0.95 P r( X ) 0.

University of New Mexico Department of Computer Science. Final Examination. CS 561 Data Structures and Algorithms Fall, 2013

Randomized Algorithms

2. This exam consists of 15 questions. The rst nine questions are multiple choice Q10 requires two

Lecture 8 HASHING!!!!!

CS 124 Math Review Section January 29, 2018

Lecture 10 Algorithmic version of the local lemma

Chapter 7 Wednesday, May 26th

Math 180B Problem Set 3

ECEN 689 Special Topics in Data Science for Communications Networks

CIS 2033 Lecture 5, Fall

Hypothesis Testing. ) the hypothesis that suggests no change from previous experience

Probability Theory. Probability and Statistics for Data Science CSE594 - Spring 2016

Chapter 5 Data Structures Algorithm Theory WS 2017/18 Fabian Kuhn

Computation Theory Finite Automata

6.041SC Probabilistic Systems Analysis and Applied Probability, Fall 2013 Transcript Tutorial:A Random Number of Coin Flips

Entropy. Probability and Computing. Presentation 22. Probability and Computing Presentation 22 Entropy 1/39

Basic Probability. Introduction

10.4 The Kruskal Katona theorem

Topics in Probabilistic Combinatorics and Algorithms Winter, Basic Derandomization Techniques

Introductory Probability

CS 240 Data Structures and Data Management. Module 4: Dictionaries

PROBABILITY THEORY 1. Basics

Cuckoo Hashing and Cuckoo Filters

The probability of an event is viewed as a numerical measure of the chance that the event will occur.

Theorem 1.7 [Bayes' Law]: Assume that,,, are mutually disjoint events in the sample space s.t.. Then Pr( )

Search Trees. EECS 2011 Prof. J. Elder Last Updated: 24 March 2015

Great Theoretical Ideas in Computer Science

Random Variable. Pr(X = a) = Pr(s)

Balanced Allocation Through Random Walk

CS505: Distributed Systems

1 Ways to Describe a Stochastic Process

Parameter Learning With Binary Variables

Lecture 5. 1 Review (Pairwise Independence and Derandomization)

Math 105 Course Outline

Lecture 24: Randomized Complexity, Course Summary

Lecture 8: Probability

Hash Tables. Given a set of possible keys U, such that U = u and a table of m entries, a Hash function h is a

SkipList. Searching a key x in a sorted linked list. inserting a key into a Sorted linked list. To insert

6.041/6.431 Spring 2009 Quiz 1 Wednesday, March 11, 7:30-9:30 PM. SOLUTIONS

Searching, mainly via Hash tables

CMPSCI 240: Reasoning Under Uncertainty

Advanced Algorithms (2IL45)

Computer Sciences Department

Math 416 Lecture 3. The average or mean or expected value of x 1, x 2, x 3,..., x n is

Design of discrete-event simulations

Algorithm Design Strategies V

Name: Firas Rassoul-Agha

P (E) = P (A 1 )P (A 2 )... P (A n ).

University of Regina. Lecture Notes. Michael Kozdron

Lecture 10: Random Walks in One Dimension

CMPT 307 : Divide-and-Conqer (Study Guide) Should be read in conjunction with the text June 2, 2015

4. The Negative Binomial Distribution

Breadth First Search, Dijkstra s Algorithm for Shortest Paths

Extended Algorithms Courses COMP3821/9801

An Introduction to Bioinformatics Algorithms Hidden Markov Models

Presentation for use with the textbook, Algorithm Design and Applications, by M. T. Goodrich and R. Tamassia, Wiley, Union-Find Structures

Business Statistics. Lecture 3: Random Variables and the Normal Distribution

Lecture 4: Stacks and Queues

Monty Hall Puzzle. Draw a tree diagram of possible choices (a possibility tree ) One for each strategy switch or no-switch

CS5371 Theory of Computation. Lecture 23: Complexity VIII (Space Complexity)

Linear Models for Regression CS534

MACHINE LEARNING INTRODUCTION: STRING CLASSIFICATION

Outline Resource Introduction Usage Testing Exercises. Linked Lists COMP SCI / SFWR ENG 2S03. Department of Computing and Software McMaster University

Lecture 11: Non-Interactive Zero-Knowledge II. 1 Non-Interactive Zero-Knowledge in the Hidden-Bits Model for the Graph Hamiltonian problem

N/4 + N/2 + N = 2N 2.

Lecture Notes. Here are some handy facts about the probability of various combinations of sets:

Discrete Mathematics and Probability Theory Spring 2016 Rao and Walrand Note 14

Lecture 4: Probability, Proof Techniques, Method of Induction Lecturer: Lale Özkahya

A Brief Review of Probability, Bayesian Statistics, and Information Theory

11.1 Set Cover ILP formulation of set cover Deterministic rounding

arxiv: v2 [cs.dc] 28 Apr 2013

14.1 Finding frequent elements in stream

X = X X n, + X 2

Binomial and Poisson Probability Distributions

Priority queues implemented via heaps

Compatible probability measures

Transcription:

CH 9.4 : SKIP LISTS ACKNOWLEDGEMENT: THESE SLIDES ARE ADAPTED FROM SLIDES PROVIDED WITH DATA STRUCTURES AND ALGORITHMS IN C++, GOODRICH, TAMASSIA AND MOUNT (WILEY 2004) AND SLIDES FROM NANCY M. AMATO AND JORY DENNY 1

RANDOMIZED ALGORITHMS A randomized algorithm controls its execution through random selection (e.g., coin tosses) It contains statements like: b randombit() if b = 0 do something else //b = 1 do something else Its running time depends on the outcomes of the coin tosses Through probabilistic analysis we can derive the expected running time of a randomized algorithm We make the following assumptions in the analysis: the coins are unbiased the coin tosses are independent The worst-case running time of a randomized algorithm is often large but has very low probability (e.g., it occurs when all the coin tosses give heads ) We use a randomized algorithm to insert items into a skip list to insert in expected O(log n) time When randomization is used in data structures they are referred to as probabilistic data structures

WHAT IS A SKIP LIST? A skip list for a set S of distinct (key, element) items is a series of lists,,, S h Each list S i contains the special keys + and List contains the keys of S in non-decreasing order Each list is a subsequence of the previous one, i.e., S h List S h contains only the two special keys Skip lists are one way to implement the Ordered Map ADT Java applet S 3 31 23 31 34 64 3 12 23 26 31 34 44 56 64 78

IMPLEMENTATION We can implement a skip list with quad-nodes A quad-node stores: (Key, Value) links to the nodes before, after, below, and above Also, we define special keys + and, and we modify the key comparator to handle them quad-node x 4

SEARCH - FIND(k) We search for a key k in a skip list as follows: We start at the first position of the top list At the current position p, we compare k with y p. next(). key() x = y: we return p. next(). value() x > y: we scan forward x < y: we drop down If we try to drop down past the bottom list, we return NO_SUCH_KEY S 3 Example: search for 78 31 23 31 34 64 5 12 23 26 31 34 44 56 64 78

EXERCISE SEARCH We search for a key k in a skip list as follows: We start at the first position of the top list At the current position p, we compare k with y p. next(). key() x = y: we return p. next(). value() x > y: we scan forward x < y: we drop down If we try to drop down past the bottom list, we return NO_SUCH_KEY Ex 1: search for 64: list the (S i, node) pairs visited and the return value Ex 2: search for 27: list the (S i, node) pairs visited and the return value S 3 31 23 31 34 64 6 12 23 26 31 34 44 56 64 78

INSERTION - PUT(k, v) To insert an item (k, v) into a skip list, we use a randomized algorithm: We repeatedly toss a coin until we get tails, and we denote with i the number of times the coin came up heads If i h, we add to the skip list new lists S h+1,, S i+1 each containing only the two special keys We search for k in the skip list and find the positions p 0, p 1,, p i of the items with largest key less than k in each list,,, S i For i 0,, i, we insert item (k, v) into list S i after position p i Example: insert key 15, with i = 2 S 3 p 2 15 p 1 23 15 23 7 p 0 10 23 36 10 15 23 36

DELETION - ERASE(k) To remove an item with key k from a skip list, we proceed as follows: We search for k in the skip list and find the positions p 0, p 1,, p i of the items with key k, where position p i is in list S i We remove positions p 0, p 1,, p i from the lists,,, S i We remove all but one list containing only the two special keys Example: remove key 34 S 3 p 2 34 p 1 23 34 p0 23 8 12 23 34 45 12 23 45

SPACE USAGE The space used by a skip list depends on the random bits used by each invocation of the insertion algorithm We use the following two basic probabilistic facts: Fact 1: The probability of getting i consecutive heads when flipping a coin is 1 2 i Fact 2: If each of n items is present in a set with probability p, the expected size of the set is np Consider a skip list with n items By Fact 1, we insert an item in list S i with probability 1 2 i By Fact 2, the expected size of list S i is n 2 i The expected number of nodes used by the skip list is h i=0 n 2 i = n i=0 h 1 2i < 2n Thus the expected space is O 2n 9

HEIGHT Consider a skip list with n items The running time of find k, put k, v, and erase k operations are affected by the height h of the skip list We show that with high probability, a skip list with n items has height O log n We use the following additional probabilistic fact: Fact 3: If each of n events has probability p, the probability that at least one event occurs is at most np By Fact 1, we insert an item in list S i with probability 1 2 i By Fact 3, the probability that list S i has at least one item is at most n 2 i By picking i = 3 log n, we have that the probability that S 3 log n has at least one item is at most n 2 3 log n = n n 3 = 1 n 2 Thus a skip list with n items has height at most 3 log n with probability at least 1 1 n 2 10

SEARCH AND UPDATE TIMES The search time in a skip list is proportional to the number of drop-down steps the number of scan-forward steps The drop-down steps are bounded by the height of the skip list and thus are O log n expected time To analyze the scan-forward steps, we use yet another probabilistic fact: Fact 4: The expected number of coin tosses required in order to get tails is 2 When we scan forward in a list, the destination key does not belong to a higher list A scan-forward step is associated with a former coin toss that gave tails By Fact 4, in each list the expected number of scan-forward steps is 2 Thus, the expected number of scan-forward steps is O(log n) We conclude that a search in a skip list takes O log n expected time The analysis of insertion and deletion gives similar results 11

EXERCISE You are working for ObscureDictionaries.com a new online start-up which specializes in sci-fi languages. The CEO wants your team to describe a data structure which will efficiently allow for searching, inserting, and deleting new entries. You believe a skip list is a good idea, but need to convince the CEO. Perform the following: Illustrate insertion of X-wing into this skip list. Randomly generated (1, 1, 1, 0). Illustrate deletion of an incorrect entry Enterprise Argue the complexity of deleting from a skip list Enterprise Boba Fett Enterprise Yoda 12

SUMMARY A skip list is a data structure for dictionaries that uses a randomized insertion algorithm In a skip list with n items The expected space used is O n The expected search, insertion and deletion time is O(log n) Using a more complex probabilistic analysis, one can show that these performance bounds also hold with high probability Skip lists are fast and simple to implement in practice 13