Algorithms Exam TIN093 /DIT602

Similar documents
Algorithms Re-Exam TIN093/DIT600

Lecture 2: Divide and conquer and Dynamic programming

CMSC 451: Lecture 7 Greedy Algorithms for Scheduling Tuesday, Sep 19, 2017

NATIONAL UNIVERSITY OF SINGAPORE CS3230 DESIGN AND ANALYSIS OF ALGORITHMS SEMESTER II: Time Allowed 2 Hours

University of Toronto Department of Electrical and Computer Engineering. Final Examination. ECE 345 Algorithms and Data Structures Fall 2016

FINAL EXAM PRACTICE PROBLEMS CMSC 451 (Spring 2016)

Class Note #20. In today s class, the following four concepts were introduced: decision

Friday Four Square! Today at 4:15PM, Outside Gates

University of California Berkeley CS170: Efficient Algorithms and Intractable Problems November 19, 2001 Professor Luca Trevisan. Midterm 2 Solutions

Do not turn this page until you have received the signal to start. Please fill out the identification section above. Good Luck!

CSE 421 Introduction to Algorithms Final Exam Winter 2005

/633 Introduction to Algorithms Lecturer: Michael Dinitz Topic: Dynamic Programming II Date: 10/12/17

The min cost flow problem Course notes for Optimization Spring 2007

P P P NP-Hard: L is NP-hard if for all L NP, L L. Thus, if we could solve L in polynomial. Cook's Theorem and Reductions

CMPSCI 311: Introduction to Algorithms Second Midterm Exam

CS173 Lecture B, November 3, 2015

CS 374: Algorithms & Models of Computation, Spring 2017 Greedy Algorithms Lecture 19 April 4, 2017 Chandra Chekuri (UIUC) CS374 1 Spring / 1

Complexity Theory VU , SS The Polynomial Hierarchy. Reinhard Pichler

Outline. Complexity Theory EXACT TSP. The Class DP. Definition. Problem EXACT TSP. Complexity of EXACT TSP. Proposition VU 181.

CSCE 750 Final Exam Answer Key Wednesday December 7, 2005

Computer Science 385 Analysis of Algorithms Siena College Spring Topic Notes: Limitations of Algorithms

1 Some loose ends from last time

CPSC 320 (Intermediate Algorithm Design and Analysis). Summer Instructor: Dr. Lior Malka Final Examination, July 24th, 2009

Algorithm Design CS 515 Fall 2015 Sample Final Exam Solutions

Topics in Complexity Theory

Greedy Algorithms. Kleinberg and Tardos, Chapter 4

Aditya Bhaskara CS 5968/6968, Lecture 1: Introduction and Review 12 January 2016

Exam EDAF August Thore Husfeldt

Fall 2016 Test 1 with Solutions

Greedy Algorithms. CSE 101: Design and Analysis of Algorithms Lecture 10

The min cost flow problem Course notes for Search and Optimization Spring 2004

CS103 Handout 12 Winter February 4, 2013 Practice Midterm Exam

CS103 Handout 22 Fall 2012 November 30, 2012 Problem Set 9

Lecture 18: More NP-Complete Problems

CS 598RM: Algorithmic Game Theory, Spring Practice Exam Solutions

Minimizing the Number of Tardy Jobs

Lecture 14 - P v.s. NP 1

University of New Mexico Department of Computer Science. Final Examination. CS 362 Data Structures and Algorithms Spring, 2007

MIDTERM: CS 6375 INSTRUCTOR: VIBHAV GOGATE October,

CSEP 521 Applied Algorithms. Richard Anderson Winter 2013 Lecture 1

CS60007 Algorithm Design and Analysis 2018 Assignment 1

More Approximation Algorithms

ECS122A Handout on NP-Completeness March 12, 2018

Complexity Theory Part I

CS 170 Algorithms Spring 2009 David Wagner Final

Algorithms. NP -Complete Problems. Dong Kyue Kim Hanyang University

CMPSCI 611 Advanced Algorithms Midterm Exam Fall 2015

Fall 2017 November 10, Written Homework 5

P, NP, NP-Complete, and NPhard

0. Introduction 1 0. INTRODUCTION

Combinatorial Optimization

CSE 431/531: Analysis of Algorithms. Dynamic Programming. Lecturer: Shi Li. Department of Computer Science and Engineering University at Buffalo

CSE 417. Chapter 4: Greedy Algorithms. Many Slides by Kevin Wayne. Copyright 2005 Pearson-Addison Wesley. All rights reserved.

Module 1: Analyzing the Efficiency of Algorithms

Introduction to Divide and Conquer

Midterm Exam 2 Solutions

Lecture 6: Greedy Algorithms I

Aside: Golden Ratio. Golden Ratio: A universal law. Golden ratio φ = lim n = 1+ b n = a n 1. a n+1 = a n + b n, a n+b n a n

Lecture 4: An FPTAS for Knapsack, and K-Center

Computational Complexity. This lecture. Notes. Lecture 02 - Basic Complexity Analysis. Tom Kelsey & Susmit Sarkar. Notes

Algorithm Design and Analysis

Dynamic Programming: Shortest Paths and DFA to Reg Exps

1 T 1 = where 1 is the all-ones vector. For the upper bound, let v 1 be the eigenvector corresponding. u:(u,v) E v 1(u)

CS103 Handout 08 Spring 2012 April 20, 2012 Problem Set 3

HW8. Due: November 1, 2018

This means that we can assume each list ) is

Preparing for the CS 173 (A) Fall 2018 Midterm 1

Computational Complexity and Intractability: An Introduction to the Theory of NP. Chapter 9

Lecture Notes 4. Issued 8 March 2018

Announcements. CompSci 102 Discrete Math for Computer Science. Chap. 3.1 Algorithms. Specifying Algorithms

4 CONNECTED PROJECTIVE-PLANAR GRAPHS ARE HAMILTONIAN. Robin Thomas* Xingxing Yu**

10.4 The Kruskal Katona theorem

More Dynamic Programming

CIS 375 Intro to Discrete Mathematics Exam 3 (Section M004: Blue) 6 December Points Possible

University of New Mexico Department of Computer Science. Final Examination. CS 561 Data Structures and Algorithms Fall, 2006

CS 350 Algorithms and Complexity

Design and Analysis of Algorithms

Lecture Notes on Inductive Definitions

CS 350 Algorithms and Complexity

Limitations of Algorithm Power

ECS 20: Discrete Mathematics for Computer Science UC Davis Phillip Rogaway June 12, Final Exam

ICS 252 Introduction to Computer Design

Data Structures in Java

Dynamic Programming: Shortest Paths and DFA to Reg Exps

Written Exam Linear and Integer Programming (DM545)

Design and Analysis of Algorithms April 16, 2015 Massachusetts Institute of Technology Profs. Erik Demaine, Srini Devadas, and Nancy Lynch Quiz 2

More Dynamic Programming

Lecture 18: P & NP. Revised, May 1, CLRS, pp

where Q is a finite set of states

CS 70 Discrete Mathematics and Probability Theory Fall 2016 Seshia and Walrand Midterm 1 Solutions

1 Basic Definitions. 2 Proof By Contradiction. 3 Exchange Argument

CS 170 Algorithms Fall 2014 David Wagner MT2

Approximation Algorithms

CIS 375 Intro to Discrete Mathematics Exam 3 (Section M001: Green) 6 December Points Possible

Complexity Theory Part II

CS154 Final Examination

1 More finite deterministic automata

COL 352 Introduction to Automata and Theory of Computation Major Exam, Sem II , Max 80, Time 2 hr. Name Entry No. Group

Algorithms, Probability, and Computing Fall 2010 Exam

Algorithm Design and Analysis

Transcription:

Algorithms Exam TIN093 /DIT602 Course: Algorithms Course code: TIN 093, TIN 092 (CTH), DIT 602 (GU) Date, time: 21st October 2017, 14:00 18:00 Building: SBM Responsible teacher: Peter Damaschke, Tel. 5405 Examiner: Peter Damaschke Exam aids: one A4 paper (both sides), dictionary, printed Lecture Notes and slides (possibly with own annotations), any edition of Kleinberg, Tardos: Algorithm Design. Time for questions: around 15:00 and around 16:30. Solutions: will appear on the course homepage. Results: will appear in ladok. Point limits: CTH: 28 for 3, 38 for 4, 48 for 5; GU: 28 for G, 48 for VG; PhD students: 38. Maximum: 60. Inspection of grading (exam review): Time will be announced on the course homepage. also fortin092 1

Instructions and Advice: First read through all problems, such that you know what is unclear to you and what to ask the responsible teacher. Write solutions in English. Start every new problem on a new sheet of paper. Write your exam number on every sheet. Write legible. Unreadable solutions will not get points. Answer precisely and to the point, without digressions. Unnecessary additional writing may obscure the actual solutions. Motivate all claims and answers. Strictly avoid code for describing an algorithm. Instead explain how the algorithm works. If you cannot manage a problem completely, still provide your approach or partial solution to earn some points. Facts that are known from the course material can be used. You don t have to repeat their proofs. Remark: The number of points is not always proportional to the length or difficulty of a solution, but it may also be influenced by the importance of the topics and skills. Good luck! 2

Problem 1 (12 points) Let x 1,..., x n be positive real numbers which are however smaller than 1. We wish to find a pair i, j such that x i + x j 1, but x i + x j is as close as possible to 1. In other words, the difference 1 (x i + x j ) shall be minimized but still be non-negative. An O(n 2 )-time algorithm for this problem is pretty obvious: examine all pairs and take the best. But we can do faster: First sort the numbers, that is, re-index them such that x 1... x n. Initially let i := 1 and j := n. As long as i < j, repeat the following actions: (1) If x i + x j 1 then increment i, that is, i := i + 1. (2) If x i + x j > 1 then decrement j, that is, j := j 1. Always maintain the largest sum x i + x j 1 that has been found so far. 1.1. How much time does this algorithm need? (Also explain your answer.) You can consider operations with real numbers as elementary. (5 points) 1.2. Show that this algorithm returns, in fact, the largest sum x i + x j 1. You need to argue why the optimal pair is not overlooked, although by far not all pairs i, j are considered. (7 points) Remark/hint: It can be debated whether this is a typical greedy algorithm, however the argument needed in 1.2 is much reminescent of an exchange argument. 3

Problem 2 (12 points) We had solved the String Editing problem, with two input strings named A = a 1... a n and B = b 1... b m, by a dynamic programming algorithm. The core of this algorithm was the edit distance formula OP T (i, j) = min{op T (i 1, j)+1, OP T (i 1, j 1)+δ ij, OP T (i, j 1)+1}, where δ ij = 1 was defined by: δ ij = 1 if a i b j, and δ ij = 0 if a i = b j. Initial values were OP T (i, 0) = i and OP T (0, j) = j for all i and j, respectively. But now the famous microbiologist Prof. Dr. Chris-Per Kaas faces some variations of the String Editing problem: 2.1. A must be transformed into B by a minimum number of insert and delete operations, whereas characters cannot directly be replaced by other characters at unit cost. Adapt the dynamic programming algorithm to this new problem, and explain why your modification correctly solves the problem. (2 points) 2.2. A must be transformed into B by a minimum number of insert and replace operations, whereas characters cannot be deleted. (Clearly, a solution can only exist if A is no longer than B.) Again: Adapt the dynamic programming algorithm to this new problem, and explain why your modification correctly solves the problem. (4 points) 2.3. A must be transformed into B by insert operations, whereas characters can neither be deleted nor be replaced by other characters. Again, a solution does not always exist, but if one exists, the algorithm must find some. Here we do not need dynamic programming. Give a greedy algorithm for this problem version, along with a time bound and correctness argument. (A fully worked-out correctness proof is not expected, but the key argument should be visible.) The time bound should be considerably better than the known one for the original String Editing problem. (6 points) 4

Problem 3 (8 points) We are given b bins, which are denoted by indices 1,..., b. For each index i, the ith bin contains p i positive and n i negative objects. Furthermore, a number f is given. We wish to split the set of bins {1,..., b} into two sets P and N such that i P n i f and i N p i is minimized. How can you compute such sets P, N {1,..., b}, and how much time does this take? Express your time bound as a function of b and f. Hint: You should avoid developing a new algorithm from scratch. Instead, you can reformulate the problem to obtain a similar one that we have already solved. Motivation of Problem 3 (You can skip this paragraph during the exam; it is not needed for solving the exercise, but maybe you are curious afterwards.) The problem can appear in classification tasks in machine learning. Objects in a sample are grouped into abstract bins, according to their properties, and every object is also labeled as either positive or negative. Assuming that the sample is representative, we want to predict the label of new, previously unseen objects, only based on their properties. The cases predicted as positive/negative are collected in P /N. A given rate of false positive predictions is allowed (the negative objects in P ), and under this restriction we wish to minimize the rate of false negative predictions (the positive objects in N). 5

Problem 4 (4 points) The time analysis of the known divide-and-conquer algorithm for Counting Inversions had led to the reucrrence T (n) = 2T (n/2) + O(n) with solution T (n) = O(n log n). There we had counted operations with arbitrary integers as elementary operations. If we are, instead, interested in the number of operations with digits, we can see that a merging phase has to add O(n) numbers of size O(n), hence with O(n log n) digits, and therefore merging needs O(n log n) digit operations. Now we get the recurrence T (n) = 2T (n/2) + O(n log n). Unfortunately it is not covered by the master theorem for recurrences, yet we can follow the same solution method. We do the first step: Repeated substitution yields T (n) in explicit form: T (n) = log 2 n i=0 2 i (n/2 i ) log(n/2 i ). (We may assume that n is a power of 2, and that the constant in the O(n log n) term is 1, since all this does not affect the O-result.) Show that this sum is bounded by O(n(log n) 2 ). The calculation is short and straightforward, but please be precise. 6

Problem 5 (10 points) Reminder: A clique in a graph is a subset of nodes such that all pairs of nodes in this subset are joined by edges. ( All possible edges exist. ) We define the problem Double Clique as follows. We are given an undirected graph G = (V, E) and an integer k. The problem is to decide whether G contains two disjoint cliques, each with k nodes. More formally, the problem asks, for the given graph G: Do there exist two node sets V 1 V and V 2 V such that V 1 = V 2 = k, V 1 V 2 =, and each of V 1 and V 2 is the node set of a clique in G? Prove that Double Clique is NP-complete. In more detail: 5.1. Why is Double Clique in NP? (2 points) 5.2. Describe a reduction from a suitable start problem to Double Clique. Make sure that it needs only polynomial time. (4 points) 5.3. Prove that your reduction is correct, that is: Show that any given instance of your start problem is equivalent to the instance of Double Clique that your reduction constructs. (4 points) Remark: Do not think complicated in 5.2 an idea of a possible reduction should be rather obvious. 7

Problem 6 (14 points) The following problem is of great interest for scheduling of unit-time jobs under precedence constraints. A k-coloring of a directed graph G = (V, E) is a function c that assigns to every node v an integer c(v) {1,..., k} (a color ) such that every directed edge (u, v) E satisfies c(u) < c(v). The problem is to construct a k-coloring with minimum k. Obviously, a coloring can exist only if the graph is a DAG. For k-coloring of undirected graphs, the condition was only c(u) c(v) on every undirected edge. While the usual, undirected coloring problem is known to be NP-complete, amazingly the directed coloring problem on DAGs can be solved in polynomial time, as you are supposed to show now. We split the task into three small exercises that you can solve independently one-by-one. Given a DAG G = (V, E), we first create an additional node s and all possible directed edges (s, v), v V. We define c(s) := 0. For any two nodes, let L(u, v) denote the length of a longest (not shortest!) directed path from u to v. (The length of a path is simply the number of edges.) 6.1. Prove that the function defined by c(v) := L(s, v) for all v is a coloring of G = (V, E). That is, you must show c(u) < c(v) for every directed edge (u, v) E. (5 points) 6.2. We claim that every possible coloring c of G must have the property c (v) L(s, v) for all nodes v. Explain why. (4 points) 6.3. Finally, based on 5.1 and 5.2, propose an algorithm that solves the k-coloring problem on DAGs. Explain why it is correct, and give a time bound. (5 points) 8

Solutions (attached after the exam) 1.1. Sorting costs O(n log n) time. Then, every iteration either increases i or decreases j, which can happen at most n times. Every iteration costs O(1) time, since a constant number of additions and comparisons are done. Thus, the loop costs O(n) time. In total we need O(n log n) time. (5 points) 1.2. Consider the first step where i = 1 and j = n. (1) If x 1 + x n 1 then x 1 + x n is a valid solution. But there is no need to consider other pairs with x 1. If x 1 + x j 1 for any j, then x j x n implies x 1 + x j x 1 + x n 1. That is, we can exchange x j by x n and get a better solution. Therefore it is safe to ignore x 1 henceforth. We can set i := 2 and consider the instance x 2,..., x n only. (2) If x 1 + x n > 1 then it is safe to ignore x n henceforth: Since x 1 x j for all j, we have x j + x n > 1 x 1 + x n > 1. Hence all other pairs with x n are invalid solutions, too. Therefore we can set j := n 1 and consider the instance x 1,..., x n 1 only. The same arguments as above apply to the instance x i,..., x j for general i and j. (7 points) 2.1. We keep the formula for OP T (i, j) but we make replace operations too expensive to be applied, simply by defining δ ij = if a i b j. Alternatively we can define δ ij = 2 if a i b j, since a replace operation is equivalent to one insert and one delete operation. (2 points) 2.2. The term OP T (i 1, j) + 1 accounts for the case that we transform a 1... a i 1 into b 1... b j and delete a i at unit cost. By omitting this term we disable this option. Hence OP T (i, j) = min{op T (i 1, j 1) + δ ij, OP T (i, j 1) + 1} is a suitable dynamic programming formula for this problem version. We must also change the initial values, since no solution exists if A is longer than B. For instance, we may set OP T (i, 0) := for all i > 0. (4 points) 2.3. Scan the string A from left to right and assign every symbol in A to the earliest possible symbol in B. The time is O(n + m), because we need to traverse both A and B only once. The correctness argument is easier to write down by using a more formal description of this greedy algorithm: We construct a function g inductively. 9

Let g(0) := 0, and let g(i + 1) be the smallest j > g(i) such that a i = b j. We align every a i to b g(i). A solution exists if and only if g(n) still exists. To show correctness, we consider any different solution to the instance, described by another function h. Let i be the smallest index such that h(i + 1) g(i + 1). Since g(i + 1) is the smallest j > g(i) = h(i) such that a i = b j, we have h(i + 1) > j = g(i + 1). Thus we can re-define h(i + 1) := j and get another valid solution where also h(i + 1) = g(i + 1). By this exchange argument we can, step by step, turn h into g. (6 points) 3. Minimizing i N p i is equivalent to maximizing i P p i, because i p i is fixed by the given instance, and i N p i + i P p i = i p i. Now it should be evident that the given problem is just the Knapsack problem with the following input parameters: capacity f, sizes n i, and values p i. (Consider P as the knpasack to be filled.) As we know, Knapsack can be solved in O(bf) time by dynamic programming. (8 points) 4. The only idea needed here is to simplify the logarithmic term. T (n) = log 2 n i=0 2 i (n/2 i ) log(n/2 i ) < log 2 n i=0 2 i (n/2 i ) log n = log 2 n i=0 n log n = O(n(log n) 2 ). (4 points) 5.1. We can verify in polynomial time that two given subsets of nodes are cliques, that they are disjoint, and that they have k nodes each. (2 points) 5.2. Naturally we give a reduction from Clique. Let H be any graph, and k an integer. We construct a graph G, simply by taking two copies of H on two disjoint node sets. Obviously, copying can be done in polynomial time. We keep the given k. (4 points) 5.3. If H has a clique of k nodes, then, clearly, G has two disjoint cliques of k nodes, in the two copies of H. Conversely, suppose that G has two disjoint cliques of k nodes. Consider any one of them. This clique is entirely in one copy of H, because no edges exist between the two copies. Therefore H has a clique of k nodes. (4 points) 6.1. Consider a longest path Q from s to u. By definition, its length is c(u). Appending the edge (u, v) yields a path from u to v. The node v is not already on Q, since this would create a directed cycle. Hence we have found a path from s to v with c(u) + 1 edges. Thus c(v) c(u) + 1. (5 points) 10

6.2. This follows instantly from the condition c (u) < c (v) for every directed edge: Along a directed path from s to any node, the value of c inreases by at least 1 on every edge, therefore the color must be at least the path length. In particular, this holds also for a longest path. (4 points) 6.3. Compute the lengths L(s, v) of longest paths from s to all other nodes v. As we know, this can be done in O( E ) time. Due to 6.1 and 6.2, L(s, v) equals the smallest possible color of v. Hence, if all colors (i.e., lengths) are bounded by k, we have found a k-coloring, and otherwise we know that no k-coloring can exist. (5 points) Addendum to solutions (attached later) An alternative way to write the proof of 1.2: Let x a + x b be an optimal solution. We can miss it in only two ways: (1) We reach i < a and j = b, but next we decrement j. (2) We reach i = a and j > b, but next we increment i. Both cases lead to contradictions: (1) Here x i + x j x a + x b 1, and in this case i is incremented instead. (2) Here x a + x b x i + x j. If x b = x j, we have just found another optimal solution, which is fine. If x b < x j then, since x a + x b 1 was optimal, we have x i + x j > 1, and in this case j is decremented instead. 11