CSE 4095/5095 Topics in Big Data Analytics Spring 2017; Homework 1 Solutions


Note: Solutions to problems 4, 5, and 6 are due to Marius Nicolae.

1. Consider the following algorithm:

    for $i := 1$ to $\alpha \sqrt{n} \log_e n$ do
        Pick a random $j \in [1, n]$;
        if $a[j] = a[j + 1]$ or $a[j] = a[j - 1]$ then output "Type II" and quit;
    Output "Type I";

Analysis: Note that if the array is of type I, the above algorithm will never give an incorrect answer. Thus assume that the array is of type II. We calculate the probability of an incorrect answer as follows. The probability of coming up with the correct answer in one iteration of the for loop is $\sqrt{n}/n = 1/\sqrt{n}$. Thus, the probability of failure in any iteration is $1 - 1/\sqrt{n}$. As a consequence, the probability of failure in $q$ successive iterations is $(1 - 1/\sqrt{n})^q \le \exp(-q/\sqrt{n})$ (using the fact that $(1 - 1/x)^x \le 1/e$ for any $x \ge 1$). This probability will be $\le n^{-\alpha}$ when $q \ge \alpha \sqrt{n} \log_e n$. Thus the output of this algorithm is correct with high probability.

2. The algorithm runs in phases. In each phase we eliminate a constant fraction of the input keys that cannot be the element of interest. When the number of remaining keys is at most $\sqrt{n}$, one of the processors performs an appropriate selection and outputs the right element. To begin with, all the keys are alive. In any phase of the algorithm let $N$ stand for the number of alive keys at the beginning of the phase. At the beginning of the first phase, $N = n$.

Consider a phase where the number of alive keys is $N$ at the beginning of the phase. Let $Y$ be the collection of alive keys. We employ $\sqrt{N}$ processors in this phase. Partition the $N$ keys into $\sqrt{N}$ parts with $\sqrt{N}$ keys in each part. Each processor is assigned a part. Each processor in parallel finds the median of its keys in $O(\sqrt{N})$ time. Let $M_1, M_2, \ldots, M_{\sqrt{N}}$ be these group medians. One of the processors finds the median $M$ of these $\sqrt{N}$ group medians. This will take $O(\sqrt{N})$ time. Now partition $Y$ into $Y_1$ and $Y_2$, where $Y_1 = \{q \in Y : q < M\}$ and $Y_2 = \{q \in Y : q > M\}$. There are 3 cases to consider:

Case 1: If $|Y_1| = i - 1$, $M$ is the element of interest. In this case, we output $M$ and quit.
Case 2: If $|Y_1| \ge i$, $Y_1$ will constitute the alive keys for the next phase.
Case 3: If the above two cases do not hold, $Y_2$ will constitute the collection of alive keys for the next phase. In this case we set $i := i - |Y_1| - 1$.

In cases 2 and 3 we can perform the partitions using a prefix computation that can be done in $O(\sqrt{N})$ time using $\sqrt{N}$ processors. It is easy to see that $|Y_1| \le \frac{3}{4}N$ and $|Y_2| \le \frac{3}{4}N$. As a result, it follows that the number of alive keys at the end of this phase is $\le \frac{3}{4}N$. Thus we infer that the run time of the algorithm is $O\left(\sqrt{n} + \sqrt{(3/4)\,n} + \sqrt{(3/4)^2\,n} + \cdots\right) = O(\sqrt{n})$.

3. If we employ $k$-way merge where $k = cM/B$, the height of the merge tree will be $\frac{\log(N/M)}{\log(cM/B)}$. However, in the worst case we may have to do $c$ passes through the data at each level of the tree, since we can only keep $B/c$ keys of each run. Thus the worst-case number of I/O passes needed is $1 + c\,\frac{\log(N/M)}{\log(cM/B)}$.

4. The BFPRT algorithm for selection works as follows. Let $X = k_1, k_2, \ldots, k_n$; $i$ be the input for the selection problem. Here are the steps:

    1) Partition $X$ into groups of size 5 each, and find the median of each group. Let the groups be $G_i$ for $1 \le i \le n/5$. Let the median of $G_i$ be $M_i$, for $1 \le i \le n/5$;
    2) Find recursively the median $M$ of $M_1, M_2, \ldots, M_{n/5}$;
    3) From $X$ get $X_1 = \{q \in X : q < M\}$ and $X_2 = \{q \in X : q > M\}$. Let $n_1 = |X_1|$;
    4) If $i = n_1 + 1$ then output $M$ and quit, else if $i \le n_1$ then recursively find and output the $i$th smallest element of $X_1$, else recursively find and output the $(i - n_1 - 1)$st smallest element of $X_2$.

We can implement each of the above steps as it is. Let $T(n)$ be the number of I/O operations taken by the above algorithm on any input of size $n$. Step 1 can be done in one pass (i.e., $n/B$ I/O operations) through the data. Step 2 takes $T(n/5)$ I/O operations. Step 3 takes one pass through the data. We can show that the size of $X_1$ and the size of $X_2$ cannot be more than $\frac{7}{10}n$. As a result, step 4 takes no more than $T\left(\frac{7}{10}n\right)$ I/O operations. Thus we get the following recurrence relation for $T(n)$:

$$T(n) \le T\left(\frac{n}{5}\right) + T\left(\frac{7n}{10}\right) + \Theta\left(\frac{n}{B}\right),$$

which solves to $T(n) = O(n/B)$.
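The four steps of the selection algorithm in problem 4 can be sketched in memory as follows. This is only an illustrative sketch, not the external-memory implementation analyzed above: the function name `bfprt_select` is hypothetical, the keys are assumed to be distinct (as the partition into $X_1$ and $X_2$ implicitly requires), and inputs of at most 5 keys are handled by sorting directly.

```python
def bfprt_select(keys, i):
    """Return the i-th smallest (1-indexed) element of a list of distinct keys.

    Hypothetical in-memory sketch of the four steps described in problem 4.
    """
    # Base case (an assumption of this sketch): tiny inputs are sorted directly.
    if len(keys) <= 5:
        return sorted(keys)[i - 1]

    # Step 1: partition into groups of 5 and take the median of each group.
    groups = [sorted(keys[j:j + 5]) for j in range(0, len(keys), 5)]
    medians = [g[(len(g) - 1) // 2] for g in groups]

    # Step 2: recursively find the median M of the group medians.
    m = bfprt_select(medians, (len(medians) + 1) // 2)

    # Step 3: split the keys around M (one pass over the data).
    x1 = [q for q in keys if q < m]
    x2 = [q for q in keys if q > m]
    n1 = len(x1)

    # Step 4: either M is the answer, or recurse into the relevant side.
    if i == n1 + 1:
        return m
    if i <= n1:
        return bfprt_select(x1, i)
    return bfprt_select(x2, i - n1 - 1)
```

For example, `bfprt_select(list(range(100, 0, -1)), 10)` selects the 10th smallest of the keys 1 through 100. In an external-memory setting, the two list comprehensions in step 3 and the grouping in step 1 would each correspond to one sequential pass over the data.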

5. If a leaf can store more keys, insertion happens in a similar way; we just have to redefine what it means for a node to be full. Node $u$ is full if it is an internal node with $2t - 1$ keys or if it is a leaf with $4t - 3$ keys.

Algorithm 1: IsFull($u$)
    Data: $u$: a B-Tree node;
    Result: True if node $u$ is full, False otherwise;
    return (leaf$_u$ AND $n_u == 4t - 3$) OR (NOT leaf$_u$ AND $n_u == 2t - 1$);

Also, for simplicity, we will always make the root a non-leaf. The other thing to modify is how to split a full leaf. A full leaf, which has $4t - 3$ keys, will be split into two leaves with $2t - 2$ keys each. The middle key from the original leaf moves up and becomes a key in the parent node. Let SPLIT NODE be the algorithm discussed in class for splitting a full node. The following algorithm will split a node, taking into account splitting full leaves:

Algorithm 2: SplitNode($p$, $i$, $u$)
    Data: $p$, $u$: two nodes such that $p$ = parent($u$) and $u$ is the $i$-th child of $p$;
    Result: Splits the node $u$ into two nodes;
    if leaf$_u$ then
        Create node $u'$;
        Copy last $2t - 2$ keys of $u$ to $u'$;
        Insert key $k^u_{2t-1}$ as the $i$-th key of $p$;
        Insert $u'$ as the $(i + 1)$-th child of $p$;
        Remove last $2t - 1$ keys from $u$;
    else
        SPLIT NODE($p$, $i$, $u$);

The insertion algorithm is then the following:

Algorithm 3: Insert($T$, $k$)
    Data: $T$: a B-Tree; $k$: a key;
    Result: Inserts key $k$ into $T$;
    $r$ := root($T$);
    if IsFull($r$) then
        Create a new node $s$;
        $n_s := 0$; leaf$_s$ := False; $c^s_1 := r$;
        SplitNode($s$, 1, $r$);
        root($T$) := $s$;
        $r := s$;
    InsertNonFull($r$, $k$);

Algorithm 4: InsertNonFull($u$, $k$)
    Data: $u$: a non-full B-Tree node; $k$: a key;
    Result: Inserts key $k$ into the subtree rooted at $u$;
    if leaf$_u$ then
        Insert $k$ at the right place;
    else
        Choose $i$ s.t. $k^u_{i-1} \le k < k^u_i$;
        if IsFull($c^u_i$) then
            SplitNode($u$, $i$, $c^u_i$);
            Update $i$ s.t. $k^u_{i-1} \le k < k^u_i$;
        InsertNonFull($c^u_i$, $k$);

6. Dijkstra's algorithm can be described as follows:

Algorithm 5: Dijkstra($V$, $E$, $s$)
    Data: $(V, E)$: a graph; $s$: a source node; let $w(u, v)$ be the weight of edge $(u, v)$;
    Result: array $d$ where $d_u$ is the length of the shortest path from $s$ to $u$;
    for $u$ in $V$ do $d_u := \infty$;
    $d_s := 0$;
    Create a priority queue $Q$ to store pairs of the form (node, distance);
    Insert the pair $(s, 0)$ into $Q$;
    while $Q$ not empty do
        $(u, r)$ := ExtractMin($Q$);
        for every child $c$ of $u$ do
            if $d_c > d_u + w(u, c)$ then
                $d_c := d_u + w(u, c)$;
                Insert($Q$, $(c, d_c)$); // update distance if $c$ present

We assume that we can store the priority queue in memory ($O(|V|)$ space). The algorithm will read the neighbors of each node at most once. Therefore, the total number of I/Os is

$$\sum_{u \in V} \left\lceil \frac{\deg_u}{B} \right\rceil = O\left(\frac{|E|}{B} + |V|\right).$$
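The I/O accounting above can be illustrated with a small in-memory sketch. Several things here are assumptions rather than part of the solution: the function name `dijkstra_with_io_count`, the adjacency-dictionary representation, and the `block_size` parameter standing in for $B$. The sketch also swaps the "update distance if present" step for lazy deletion (stale queue entries are skipped when popped), so its queue can hold more than $|V|$ entries, unlike the queue assumed in the analysis.

```python
import heapq
import math


def dijkstra_with_io_count(adj, s, block_size):
    """Shortest-path lengths from s, plus a count of adjacency-list block reads.

    adj maps every node to a list of (neighbor, weight) pairs; each neighbor
    must also appear as a key of adj. block_size plays the role of B.
    """
    d = {u: math.inf for u in adj}
    d[s] = 0
    queue = [(0, s)]   # pairs of the form (distance, node)
    done = set()       # nodes whose adjacency list was already read
    io_count = 0

    while queue:
        r, u = heapq.heappop(queue)
        if u in done:
            continue   # stale entry left over from lazy deletion
        done.add(u)
        # Reading the neighbors of u costs ceil(deg(u) / B) block transfers.
        io_count += math.ceil(len(adj[u]) / block_size)
        for c, w in adj[u]:
            if d[c] > d[u] + w:
                d[c] = d[u] + w
                heapq.heappush(queue, (d[c], c))
    return d, io_count


# Hypothetical example graph with four nodes.
example = {
    "s": [("a", 1), ("b", 4)],
    "a": [("b", 2), ("t", 6)],
    "b": [("t", 1)],
    "t": [],
}
distances, reads = dijkstra_with_io_count(example, "s", block_size=2)
```

Because each node enters `done` at most once, its adjacency list is read at most once, which is what the sum of $\lceil \deg_u / B \rceil$ terms counts.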