The Effect of Homogeneity on the Computational Complexity of Combinatorial Data Anonymization

Size: px
Start display at page:

Download "The Effect of Homogeneity on the Computational Complexity of Combinatorial Data Anonymization"

Transcription

1 Preprint To appear in Data Mining and Knowledge Discovery DOI /s Online available The Effect of Homogeneity on the Computational Complexity of Combinatorial Data Anonymization Robert Bredereck 1,, André Nichterlein 1, Rolf Niedermeier 1, Geevarghese Philip 2, Received: 18 April 2012 / Accepted: 24 September 2012 Abstract A matrix M is said to be k-anonymous if for each row r in M there are at least k 1 other rows in M which are identical to r The NP-hard k-anonymity problem asks, given an n m-matrix M over a fixed alphabet and an integer s > 0, whether M can be made k-anonymous by suppressing (blanking out) at most s entries Complementing previous work, we introduce two new data-driven parameterizations for k-anonymity the number t in of different input rows and the number t out of different output rows both modeling aspects of data homogeneity We show that k-anonymity is fixed-parameter tractable for the parameter t in, and that it is NP-hard even for t out = 2 and alphabet size four Notably, our fixed-parameter tractability result implies that k-anonymity can be solved in linear time when t in is a constant Our computational hardness results also extend to the related privacy problems p-sensitivity and l-diversity, while our fixed-parameter tractability results extend to p-sensitivity and the usage of domain generalization hierarchies, where the entries are replaced by more general data instead of being completely suppressed Keywords k-anonymity p-sensitivity l-diversity Domain generalization hierarchies Matrix modification problems Parameterized algorithmics Fixed-parameter tractability NP-hardness An extended abstract entitled The Effect of Homogeneity on the Complexity of k- Anonymity appeared in Proceedings of the 18th International Symposium on Fundamentals of Computation Theory (FCT 11), volume 6914 of LNCS, pages 53-64, Springer 2011 Apart from the full proofs omitted in that version, the current article also contains new results on l-diversity, p-sensitivity, and on the usage of domain generalization hierarchies Supported by the DFG, research project PAWS, NI 369/10 Supported by the Indo-German Max Planck Centre for Computer Science (IMPECS) 1 Institut für Softwaretechnik und Theoretische Informatik, TU Berlin, Germany 2 Max-Planck-Institut für Informatik, Saarbrücken, Germany {robertbredereck,andrenichterlein,rolfniedermeier}@tu-berlinde gphilip@mpi-infmpgde

2 1 Introduction Assume that data about n individuals are represented by length-m vectors consisting of attribute values In other words, assume that our input is an n m data matrix with entries from some (potentially large) alphabet Σ If all vectors (that is, rows) are identical, then we have full homogeneity and thus full anonymity of all individuals Relaxing full anonymity to k-anonymity (requiring that every vector occurs at least k times), in this work we investigate how the degree of (in)homogeneity (measured by the number of different vectors) both with respect to the input data matrix as well as the corresponding anonymized output data matrix influences the computational complexity of the NP-hard problem of making sets of individuals k-anonymous Moreover, we extend our investigations to further combinatorial data anonymization problems including the concepts of p-sensitivity and l-diversity Samarati and Sweeney (1998), Samarati (2001), and Sweeney (2002b) devised the notion of k-anonymity to better quantify the degree of anonymity in sanitized data This notion formalizes the intuition that entities who have identical sets of attributes cannot be distinguished from one another For a positive integer k we say that a matrix M is k-anonymous if, for each row r in M, there are at least k 1 other rows in M which are identical to r Thus k-anonymity provides a clear and simple combinatorial model for sanitizing data: choose a value of k which would satisfy the relevant privacy requirements, and then try to modify at minimum cost the matrix in such a way that it becomes k-anonymous The corresponding decision problem k- Anonymity asks, additionally given an upper bound s for the number of suppressions allowed, whether a matrix can be made k-anonymous by suppressing (blanking out) at most s entries While k-anonymity is our central data anonymization problem, our results also extend to several more restrictive anonymization problems, 1 including p-sensitivity (Truta and Vinay, 2006) and l-diversity (Machanavajjhala et al, 2007) Motivation The increased use of technology brings with it various non-obvious threats to personal privacy For example, location data from tracking devices such as Global Positioning System (GPS) devices used in public transport systems or communication devices such as mobile phones can be used by an adversary in space or time-correlated inference attacks to deduce private information about individuals Another seemingly unlikely source of breach of privacy lies in the search logs of web search engines, which are sometimes released as a valuable aid for research on information retrieval Yet other, perhaps more obvious, sources of private information are medical data or the relationship graphs of social networks The protection of privacy is widely recognized as a human right, and at the same time there are large economic and scientific incentives for analyzing data sets which could potentially reveal private information To be able to do this without harming privacy, k- 1 See Section 2 for formal definitions 2

3 Anonymity and its variants have been proposed as a possible approach to addressing privacy concerns arising in all these scenarios (Campan and Truta, 2009; Gedik and Liu, 2008; Gkoulalas-Divanis et al, 2010; Gruteser and Grunwald, 2003; Monreale et al, 2010; Navarro-Arribas et al, 2012; Zhou and Pei, 2011) 2 We focus on a better understanding of the computational complexity and on tractable special cases of combinatorial data privacy problems; see Machanavajjhala et al (2007) and Sweeney (2002b) for discussions on the pros and cons of these models in terms of privacy vs preservation of meaningful data In particular, note that with differential privacy (cleverly adding some random noise) a statistical model has become very popular as well (Dwork, 2011; Fung et al, 2010); we do not study this here However, k-anonymity and its variants are very natural combinatorial problems (with potential applications beyond data privacy) that are easy to interpret with respect to performed data modifications, and for instance in the case of one-time anonymization, are obviously valuable for the sake of providing simple models that do not introduce noise Important Variants of k-anonymity While the k-anonymity problem is described in terms of the operation of suppressing blanking out data entries, this level of obfuscation is sometimes not required or desirable Sweeney (2002a) introduced the notion of domain generalization and domain generalization hierarchies (DGH) as a generalization of and an alternative to suppression for achieving k-anonymous datasets The operation of generalization involves replacing a data entry with a less specific element For instance, the precise age of a person might be replaced with an age range which includes the given age The new value, while still being correct for the person, is not as precise as the previous one was This idea is extended to the notion of domain generalization hierarchies, which can be thought of as increasingly more general ranges of data The operation of suppressing a data element is thus a special case of generalization The DGH-k-Anonymity problem asks for a minimal under a certain natural ordering generalization of the input matrix which is k-anonymous Observe that the k-anonymity problem is a special case of DGH-k-Anonymity where the only generalization maps each data element to the -symbol; thus DGH-k-Anonymity is at least as hard as k-anonymity; we extend some of our positive results for k-anonymity to DGH-k-Anonymity The attributes used in the well-known linking attack method by Sweeney (2002b) for re-identification were gender, ZIP code and birth date She called such attributes containing publicly available data quasi-identifiers Published datasets, such as medical data, consist of these quasi-identifiers and private information (eg disease) To protect against a linking attack, it is sufficient 2 While it is beyond the scope of this work to fully address all the potential weaknesses of k-anonymity, we mention for the sake of completeness that k-anonymity is known to be vulnerable against the attack models attribute linkage, table linkage, and probabilistic attack (Fung et al, 2010) 3

4 to make the quasi-identifiers k-anonymous Hence, the input matrix M for k- Anonymity contains only quasi-identifiers This leads to a major drawback of k-anonymity, namely the possibility that the private information may end up being highly uniform in the output For example, an attacker cannot determine the row corresponding to one individual but using the linking attack method of Sweeney (2000) he may identify k rows from which exactly one corresponds to the individual If the private information stored in these k rows is equal, then the attacker does not know the row corresponding to the individual but the private information belonging to the individual is revealed To prevent this, different concepts were introduced, eg p-sensitivity (Truta and Vinay, 2006), l-diversity (Machanavajjhala et al, 2007), and t-closeness (Li et al, 2007) We will partially extend our findings for k-anonymity to p-sensitivity and l-diversity p-sensitivity is an enhancement to k-anonymity in the sense that there is the additional requirement that in the output matrix each maximal set of rows that are identical in the quasi-identifiers contains at least p different private values l-diversity is an enhancement requiring that each such set of rows contains at least l well represented private values The combinatorial realization of well represented that we use is to require that in each such set of rows the relative frequency of each private value is at most 1/l (Wong et al, 2006; Xiao and Tao, 2006) Computational Complexity k-anonymity and many related problems are NP-hard (Meyerson and Williams, 2004), even when the input matrix is highly restricted Concerning polynomial-time approximability, k-anonymity is APX-hard when k = 3, even when the alphabet size is just two (Bonizzoni et al, 2011a); it is MAX SNP-hard when k = 7, even when the number of columns in the input dataset is just three (Chakaravarthy et al, 2010); and it is MAX SNP-hard when k = 3, even when the number of columns in the input dataset is 27 (Blocki and Williams, 2010) Moreover, it is APX-hard when k = 3, even when the input matrix has only three columns (Bonizzoni et al, 2011b) On the positive side, many polynomial-time approximation algorithms have been developed for the problem and its variants Meyerson and Williams (2004) devised an approximation algorithm with the approximation ratio O(k ln k) which runs in O(n 2k ) time, and another one which runs in time polynomial in both n and k and has the ratio O(k ln n) Gionis and Tassa (2009) came up with an algorithm which runs in O(n 2k ) time and has the approximation ratio O(ln k) This was later improved upon by Park and Shim (2007) and by Kenig and Tassa (2012), who devised approximation algorithms with the same ratio and the same worst-case running time, but which run much faster on large classes of inputs Parameterized Complexity Confronted with this computational hardness, we study aspects of the parameterized (or multivariate) computational complexity of k-anonymity and its variants as initiated by Evans et al (2009) The 4

5 central question here is how naturally occurring parameters influence computational complexity For example, consider the following parameterization Is k-anonymity polynomial-time solvable for constant values of k? While 2-Anonymity is polynomial-time solvable (Blocki and Williams, 2010), the general answer is no since already 3-Anonymity is NP-hard (Meyerson and Williams, 2004), even on binary data sets (Bonizzoni et al, 2011a) Thus, k alone does not yield a promising parameterization k-anonymity has a number of meaningful parameterizations beyond k, including the number of rows n, the alphabet size Σ, the number of columns m, and, in the spirit of multivariate algorithmics (Fellows, 2009; Niedermeier, 2010), various combinations of single parameters Here the arity of the alphabet Σ may range from binary (such as gender) to unbounded (such as net worth) For instance, answering an open question of Evans et al (2009), Bonizzoni et al (2011b) showed that k-anonymity is fixed-parameter tractable with respect to the combined parameter (m, Σ ), whereas there is no hope for fixedparameter tractability with respect to the single parameters m and Σ (Evans et al, 2009) We emphasize that Bonizzoni et al (2011b) made use of the fact that the value Σ m is an upper bound on the number of different input rows, thus implicitly exploiting a very rough upper bound on input homogeneity In this work, we refine this view by asking how the degree of homogeneity of the input matrix influences the complexity of k-anonymity In other words, is k-anonymity fixed-parameter tractable for the parameter number of different input rows? In a similar vein, we also study the effect of the degree of homogeneity of the output matrix on the complexity of k-anonymity Table 1, which extends similar tables due to Evans et al (2009) and Bonizzoni et al (2011b), summarizes known and new results for k-anonymity Our Contributions We introduce the homogeneity parameters the number t in of different input rows and the number t out of different output rows, for studying the computational complexity of k-anonymity and related problems Typically, we expect t in n and t in Σ m The latter relation is obvious since Σ m just refers to having all possible input rows over the alphabet Σ The relation t in n, however, may also be justified in natural settings First, assume that the data are already k -anonymous and shall be made k-anonymous with k > k This may be due stronger safety requirements Then we have t in n/k Second, the parameter t in can be viewed as a distance from triviality-parameterization in the sense that if t in is small then the input is almost k-anonymous and hopefully needs not many modifications to be made k-anonymous This may be particularly true when the matrix entries represent ranges (such as age range 40-50) and are not too fine-grained (such as specifying the exact birth day) Third, we study the adult data set (Frank and Asuncion, 2010): Preparing it as described in Machanavajjhala et al (2007), it consists of n = rows and t in = input row types, that is, t in 06 n Furthermore, the data set has at least one column with 72 different values, its alphabet size is Σ = 72, and it has eight columns Hence, for this dataset we have Σ m = , and so t in Σ m Indeed, 5

6 Table 1 The parameterized complexity of k-anonymity FPT stands for fixed-parameter tractability and W[1]-hardness means presumable fixed-parameter intractability, see Section 2 for definitions Results proved in this paper are in bold The column and row entries represent parameters For instance, the entry in row and column s refers to the (parameterized) complexity for the single parameter s whereas the entry in row m and column s refers to the (parameterized) complexity for the combined parameter (s, m) An entry NPhard means that k-anonymity remains NP-hard even if the parameter adopts constant values In addition,? marks a currently unknown parameterized complexity status k s k, s NP-hard a NP-hard a W[1]-hard b W[1]-hard b Σ NP-hard c NP-hard c?? m NP-hard d NP-hard d FPT e FPT e n FPT e FPT e FPT e FPT e Σ, m FPT b FPT e FPT e FPT e Σ, n FPT e FPT e FPT e FPT e t in FPT FPT FPT FPT t out NP-hard? FPT FPT Σ, t out NP-hard? FPT FPT k s Σ m n t in t out parameters degree of anonymity number of suppressions alphabet size number of columns number of rows number of different input rows number of different output rows a Meyerson and Williams (2004) b Bonizzoni et al (2011b) c Aggarwal et al (2005) d Bonizzoni et al (2011a) e Evans et al (2009) t in is a data-driven parameterization in the sense that one can efficiently measure in advance the instance-specific value of t in, whereas Σ m denotes the maximum possible degree of inhomogeneity As to parameterization by output homogeneity t out, we basically show that this leads to computational intractability even when t out = 2 and the alphabet size is t out +2, irrespective of the variant of t out being part of the input or not etc which we consider This holds for k-anonymity as well as for the more restrictive privacy concepts that lead to the problems p-sensitivity (Truta and Vinay, 2006) and l-diversity (Machanavajjhala et al, 2007) On the positive side we show that parameterization by input homogeneity t in yields several tractability results For instance, we derive an algorithm that solves k-anonymity in O(nm + 2 t2 in t 2 in (m + t in log(t in ))) time, which compares favorably with the Bonizzoni et al (2011b) algorithm which runs in O(2 ( Σ +1)m kmn 2 ) time In particular, when t in is a constant, our algorithm 6

7 solves k-anonymity in time linear in the size of the input Moreover, we prove that k-anonymity becomes fixed-parameter tractable for the parameter m when t out is a constant, and that k-anonymity becomes fixed-parameter tractable for the combined parameter (t out, s) where s denotes the number of allowed suppressions in the matrix We also show that our positive results for k- Anonymity parameterized by t in generalize to the problems p-sensitivity, DGH-k-Anonymity, and DGH-p-Sensitivity, with similar time bounds In Table 1 we summarize our results on k-anonymity and present them in the context of previous results on the parameterized complexity of k-anonymity Organization of the Paper In the next section we introduce the notation and terminology which we use in the paper In Section 3 we consider the homogeneity of the output matrix and demonstrate that achieving a very homogeneous output matrix (as might seem desirable for privacy purposes) is computationally very hard This applies to all problems studied in our work In Section 4, we show that for homogeneous input matrices one can gain several fixedparameter tractability results for most problems considered in this article, leaving the case of l-diversity as a major open question We conclude in Section 5 with a discussion of directions for future research 2 Preliminaries We briefly overview some relevant concepts from parameterized complexity which we use later We then describe the notation and terminology used in the discussion on k-anonymity, p-sensitivity, and l-diversity The definitions specific to DGH-k-Anonymity can be found in Subsection 44 Parameterized Complexity Our algorithmic results mostly rely on concepts of parameterized algorithmics (Downey and Fellows, 1999; Flum and Grohe, 2006; Niedermeier, 2006) The fundamental idea herein is, given a computationally hard problem X, to identify a parameter p (typically a positive integer or a tuple of positive integers) for X and to determine whether a size-n input instance of X can be solved in f(p) n O(1) time, where f is an arbitrary computable function If this is the case, then one says that X is fixed-parameter tractable for the parameter p The corresponding complexity class is called FPT If X could only be solved in polynomial running time where the degree of the polynomial depends on p (such as n O(p) ), then, for parameter p, problem X is said to lie in the strictly larger parameterized complexity class XP Finally, we also consider the parameterized complexity class W[1] with FPT W[1] XP It is widely believed that a problem which is W[1]- hard based on the so-called parameterized reductions (Downey and Fellows, 1999) does not have FPT algorithms 7

8 k-anonymity Our inputs are datasets in the form of n m-matrices, where the n rows refer to the individuals and the m columns correspond to attributes with entries drawn from an alphabet Σ Suppressing an entry M[i, j] of an n m-matrix M over alphabet Σ with 1 i n and 1 j m means to simply replace M[i, j] Σ by the new symbol ending up with a matrix over the alphabet Σ { } Definition 1 A row type is a string from (Σ { }) m We say that a row in a matrix has a certain row type if it coincides in all its entries with the row type In what follows, synonymously, we sometimes also speak of a row lying in a row type, and that the row type contains these rows Definition 2 (k-anonymity (Samarati, 2001; Samarati and Sweeney, 1998; Sweeney, 2002b)) A matrix is k-anonymous if for every row in the matrix one can find at least k 1 other identical rows A natural objective when trying to achieve k-anonymity is to minimize the number of suppressed matrix entries For a row y (with some entries suppressed) in the output matrix, we call a row x in the input matrix the preimage of y if y is obtained from x by suppressing in x the -entry positions of y The basic problem of this work reads as follows k-anonymity Input: An n m-matrix M over a fixed alphabet Σ and nonnegative integers k, s Question: Can one suppress at most s elements of M to obtain a k- anonymous matrix M? We study two new parameterizations t in and t out, referring to the input and the output homogeneity, respectively, where t in denotes the number of row types in the input matrix and t out denotes the number of row types in the output k-anonymous matrix Note that as we show later t in can be determined efficiently In Section 3, we discuss several possibilities for a more fine-grained and mathematically precise definition of t out p-sensitivity and l-diversity For the problems p-sensitivity and l- Diversity, the input matrix M consists of a matrix where the last column is declared as private and the other columns as quasi-identifiers 3 The row types are defined over the submatrix restricted to the quasi-identifier columns (denoted by M qi ) That is, when two rows are identical in their quasi-identifier entries quasi-identifiers for short then they belong to the same row type, irrespective of whether they differ in their private columns This definition fits to the anonymization concepts: The more homogeneous the quasi-identifiers are, the easier p-sensitivity and l-diversity will be to solve A similar 3 Following an approach due to Xiao et al (2010) we consider the case with one private information column 8

9 argument does not hold for the private information: Homogeneous private information does not make these two problems more efficiently solvable Hence, row types are defined over the quasi-identifiers only Definition 3 (p-sensitive (Truta and Vinay, 2006)) A matrix is p-sensitive if for each row type there are p different values in the private column p-sensitivity Input: An n m-matrix M over a fixed alphabet Σ and nonnegative integers k, s, and p Question: Can one suppress at most s elements in M qi to obtain a p- sensitive matrix M where M qi is k-anonymous? Definition 4 (l-diverse (Wong et al, 2006; Xiao and Tao, 2006)) A matrix is l-diverse if for each row type the relative frequency of each value in the private column is at most 1/l Note that Definition 4 implies that in an l-diverse matrix each row type is of size at least l l-diversity Input: An n m-matrix M over a fixed alphabet Σ and nonnegative integers l and s Question: Can one suppress at most s elements in M qi to obtain an l- diverse matrix M? 3 Hardness of Achieving Homogeneous Outputs In Section 31, we discuss some basic concepts of output homogeneity In Section 32, we demonstrate the computational intractability of achieving homogeneous outputs for k-anonymity, l-diversity and p-sensitivity 31 Output Homogeneity There are two cases to consider when investigating the homogeneity of the output matrix: The user specifies the required output homogeneity by providing the parameter t out as part of the input The output homogeneity is not specified by the user In the following, we briefly discuss both cases 9

10 User-Specified Output Homogeneity In this case, the user wants to specify the number of output row types Here, we consider three variants: The minimum number of output row types is specified in the input The maximum number of output row types is specified in the input The exact number of output row types is specified in the input There is a close relationship between k-anonymity and clustering problems where one is also interested in grouping together similar objects Such a relationship has already been observed in related work (Aggarwal et al, 2010; Bredereck et al, 2011; Fard and Wang, 2010) This clustering view on k- Anonymity makes the three described problem variants very interesting because they allow the user to specified a minimum, maximum, or exact number of clusters which is a useful feature for many applications Output Homogeneity not Specified by the User If the output homogeneity is not specified, then it is a property of the set of feasible solutions Here, the two simplest properties yield the following two parameters: The minimum number t min out of output row types among all feasible solutions The maximum number t max out of output row types among all feasible solutions The parameter t max out is the more interesting one from the point of view of anonymization Given a fixed number s of allowed suppressions and a required degree k of anonymity, in this case the user of the data is interested in output matrices with as many output row types as possible Due to the higher diversity more output row types potentially provide more information to the end-user of the data The parameter t min out is more interesting than t max out from an algorithmic per- t max out, fixed-parameter algorithms with t min out as the pa- are potentially more effective It is not difficult to is much smaller than t max spective Since t min out rameter instead of t max out think of real-world examples where t min out out 32 Hardness Results It is a natural question whether k-anonymity is fixed-parameter tractable with respect to the number of output types, for each of the parameters discussed in Section 31 It is easy to see that the problem becomes polynomialtime solvable when there is only one output row type Answering this question for more than one output row types in the negative, we show that k- Anonymity is NP-hard even when there are only two output types and the alphabet has size four, destroying any hope for fixed-parameter tractability already for the combined parameter number of output types and alphabet size This intractability also transfers to all the three variants of user-specified output homogeneity discussed above The hardness proof uses a polynomial-time many-one reduction from Balanced Complete Bipartite Subgraph: 10

11 Balanced Complete Bipartite Subgraph (BCBS) Input: A bipartite graph G = (A B, E) and an integer k 1 Question: Is there a complete bipartite subgraph G = (A B, E ) of G with A A and B B such that A = B k? BCBS is NP-complete by a reduction from Clique (Johnson, 1987) We provide a polynomial-time many-one reduction from a special case of BCBS with a balanced input graph, that is, a bipartite graph that can be partitioned into two independent sets of the same size, and k = V /4 This special case is also NP-complete by a simple reduction from the general BCBS problem: Let V := A B with A and B being the two vertex partition classes of the input graph If the input graph is not balanced, that is, if A B 0, then add A B isolated vertices to the partition class of smaller size If k < V /4, then repeat the following until k = V /4: Add a new vertex a to A and a new vertex b to B Make a adjacent to all vertices in B, and b adjacent to all vertices in A Increment k by 1 If k > V /4, then add 2k V /2 isolated vertices to each of A and B We devise a reduction from BCBS with a balanced input graph and k = V /4 to show the NP-hardness of k-anonymity This reduction also introduces the basic structure of the constructions used in all the hardness proofs in this work Proposition 1 k-anonymity is NP-complete for two output row types and alphabet size four Proof We only have to prove NP-hardness, since containment in NP is clear Let (G, n/4) be a BCBS-instance where G = (V, E) is a balanced bipartite graph, and where n := V Let A = {a 1, a 2,, a n/2 } and B = {b 1, b 2,, b n/2 } be the two vertex partition classes of G We construct an (n/4+1)-anonymityinstance that is a yes-instance if and only if (G, n/4) BCBS To this end, the main idea is to use a matrix expressing the adjacencies between A and B as the input matrix Partition class A corresponds to the rows and partition class B corresponds to the columns The salient points of our reduction are as follows 1 By making a matrix with 2x rows x-anonymous we ensure that there are at most two output types One of the types, the solution type, corresponds to the solution set of the original instance: Solution set vertices from A are represented by rows that are preimages of rows in the solution type and solution set vertices from B are represented by columns in the solution type that are not suppressed 2 We add one row that contains the -symbol in each entry Since the - symbol is not used in any other row, this enforces the other output type to be fully suppressed, that is, each column is suppressed 11

12 b 1 b 2 b n/2 1 b n/2 a a a n/ a n/ a n/ Fig 1 Typical structure of the matrix D of the k-anonymity instance obtained by the reduction from Balanced Complete Bipartite Subgraph for t out = 2 3 Since the rows in the solution type have to agree on the columns that are not suppressed, we have to ensure that they agree on adjacencies to model BCBS This is done by using two different types of 0-symbols representing non-adjacency The 1-symbol represents adjacency The matrix D is described in the following and illustrated in Figure 1 There is one row for each vertex in A and one column for each vertex in B The value in the i th column of the j th row is 1 if a j is adjacent to b i and, otherwise, 0 1 if 4 j n/4 and 0 2 if j > n/4 Additionally, there are two further rows, one containing only 1s and one containing only -symbols The number of allowed suppressions is s := (n/4 + 1) n/2 + (n/4 + 1) n/4 This completes the construction It remains to show that (G, n/4) is a yes-instance of BCBS if and only if the constructed matrix can be transformed into an (n/4 + 1)-anonymous matrix by suppressing at most s elements We start with the easier direction: : Let A A and B B be the vertex subsets forming a balanced complete bipartite subgraph of G such that A = B = n/4 The (n/4 + 1)- anonymous matrix looks as follows: The first output type is fully suppressed and contains the all- row, and each row which corresponds to a vertex from A \ A The second output type contains the remaining rows; in it each column corresponding to a vertex in B \ B is suppressed These rows clearly agree in each unsuppressed column, because each vertex from A is adjacent to each vertex from B Hence, each unsuppressed column contains a 1 The fully suppressed output type needs (n/4 + 1) n/2 suppressions and the second output type needs at most (n/4+1) n/4 suppressions Thus, the total number of suppressions is at most s : The (n/4 + 1)-anonymous matrix consists of exactly two output types because having only one type would cause (n/2+2) n/2 > s suppressions (due to the all- row everything would have to be suppressed) and an (n/4 + 1)- anonymous matrix with n/2+2 rows cannot have more than two output types One output type contains n/4 + 1 fully suppressed rows because of the all- 4 We assume without loss of generality that n is divisible by four 12

13 b 1 b 2 b n/2 1 b n/2 d 1 d 2 d n/2 a a a n/ a n/ a n/ n/ n/ n/4 + 1 t t t t t t t t Fig 2 Structure of the matrix D of the k-anonymity instance obtained by the extended reduction from Balanced Complete Bipartite Subgraph for general t out New parts of the reduction are in boldface row in the input Since the total number s of suppressions is (n/4 + 1) n/2 + (n/4 + 1) n/4 and for the first type one already needs at least (n/4 + 1) n/2 suppressions, the other type contains at most n/4 suppressed columns Since no symbol other than 1 appears in more than n/4 rows (by construction), each unsuppressed column consists entirely of 1s Now, consider the vertex subset A A corresponding to the rows of the second output type and the vertex subset B B corresponding to the unsuppressed columns Since at most one row in the second output type does not correspond to a vertex from A (the all-1 row), A n/4 Furthermore, B n/4 because at most n/4 of the n/2 columns can be suppressed in the second output type Since the second output type only consists of - and 1-entries, the subgraph induced by A and B is a balanced complete bipartite subgraph of G Observe that, for the matrix D constructed in the proof of Proposition 1, t min out = t max out = 2: If the matrix D is a yes-instance, then by construction it must have exactly two output types It follows that k-anonymity is NPcomplete for the maximization of the number of output row types, for its minimization, as well as when asking for an exact number of output row types In the following, we show that this hardness result holds for all fixed constants greater than or equal to two Theorem 1 For any fixed constant t out 2, k-anonymity is NP-complete, where t out = t min out = t max out and the alphabet size is t out + 2, even if the exact, minimum, or maximum number of output row types is additionally specified in the input 13

14 Proof We will show that the reduction from Proposition 1 can be extended such that for any given constant t out 2, a solution must have exactly t out output row types An illustration of the reduction can be found in Figure 2 As before, the reduction is from BCBS First, we construct a k-anonymity instance as described in Proposition 1 Then, we add n/2 dummy columns (d 1,, d n/2 ) filled with 1-entries Finally, for each integer 2 < i t out, we add n/4 + 1 dummy rows filled with the new symbol i in each entry As in the previous proof, the number of allowed suppressions is s := (n/4 + 1) n/2 + (n/4 + 1) n/4, and the degree of anonymity is k := (n/4 + 1) : Given a solution for BCBS, constructing the solution for the extended k-anonymity instance works exactly like before Each dummy row occurs n/4 + 1 times in the reduced instance, and so none of their entries need be suppressed The resulting solution has exactly t out output types : two output types as in the previous proof, and t out 2 other output types each of which consists of all the dummy rows which have the same symbol : Consider a solution for the k-anonymity instance If a dummy row and a row from another type have to be included in the same output type, then this requires suppressing all the entries of both these rows, at a cost of (n/4 + 1) n suppressions Since this is more than the allowed cost, it follows that in the output the dummy rows are left as they are So the dummy rows form t out 2 output row types By the same argument as in the proof of Proposition 1, it follows that there is a solution for the corresponding BCBS instance which is encoded in two further output row types Observe that the (n/4+1)-anonymous output matrix consists of exactly t out output row types Hence the theorem holds for t min out and t max out This clearly transfers to the cases where t out is specified in the input By allowing more symbols in the alphabet, the intractability result of Theorem 1 also extends to p-sensitivity and l-diversity Corollary 1 For any fixed constant t out 2, 1 p-sensitivity is NP-complete, where t out = t min out = t max out and the alphabet size is max(t out + 2, p), even if the exact, minimum, or maximum number of output row types is additionally specified in the input 2 l-diversity is NP-complete, where t out = t min out = t max out, even if the exact, minimum, or maximum number of output row types is additionally specified in the input Proof To extend the reduction given for the proof of Theorem 1 to also work for l-diversity and p-sensitivity, we define all original columns as quasiidentifiers, and add one extra private column To keep the argument simple, we only sketch the proof for 2-Sensitivity Cases with larger values of p can be handled in a similar fashion, with an alphabet of size max(t out + 2, p) 1 For 2-Sensitivity, set the private entries for the all-1 row and the all- row to 1 For each dummy row type, set half of the private entries to 0 and the rest to 1 Finally, set the private entries of the remaining rows to 0 and set p to 2 14

15 2 For l-diversity, set the private entry for each row to a new unique symbol and set l := k = (n/4 + 1) Now, for every row type containing at least (n/4 + 1) rows, the relative frequency of each private value is at most 1/(n/4 + 1) Furthermore, for every row type containing less than (n/4 + 1) rows, the relative frequency of each private value is at least 1/(n/4) Thus, every l-diverse matrix is also k-anonymous (restricted on the quasi-identifiers) The rest of the proof works analogously to the proof of Theorem 1 4 Tractability Results As we saw in the previous section, k-anonymity and its variants l- Diversity and p-sensitivity are all NP-hard even when there are just two different row types in the output As such, unless P=NP, these problems cannot be solved in polynomial time where the degree of the polynomial is a constant independent of the input even when it is specified as part of the input that the output has to contain a bounded number of row types In this section we see that the situation changes for the better when the number of row types in the input matrix is small We show, for instance, that when the number of row types in the input matrix is a constant, then we can solve k-anonymity in time linear in the input size (see Subsection 41) We then show how to adjust the algorithm to achieve fixed-parameter tractability for the combined parameter (t out, s) (where s denotes the allowed number of suppressions) and for the parameter number m of columns when t out is a constant (see Subsection 42) We also derive results for p-sensitivity (see Subsection 43) and for anonymization using the so-called domain generalization hierarchies (see Subsection 44) The algorithms described in this section find solutions that have at most t out many output row types If t out is not part of the input, then by iteratively trying the values 1,, t out the minimum value such that a solution exists can be determined Thus, if t out is not given, we have t out = t min out 41 Parameter t in In this subsection we show that k-anonymity is fixed-parameter tractable with respect to the parameter number t in of input row types Since t in Σ m and t in n, this result implies that k-anonymity is also fixed-parameter tractable with respect to the combined parameter (m, Σ ) and the single parameter n Both these latter results were known previously; Bonizzoni et al (2011b) demonstrated fixed-parameter tractability for the parameter (m, Σ ), and Evans et al (2009) showed the same for the parameter n Besides achieving a fixed-parameter tractability result for a typically smaller parameter, we improve their results by giving a simpler algorithm with a (usually) better running time 15

16 Algorithm 1 Pseudo-code for solving k-anonymity The function solverowassignment solves Row Assignment in polynomial time, see Lemma 1 1: procedure solvekanonymity(m, k, s, t out) 2: Determine the row types R 1,, R tin Phase 1, Step 1 3: for each possible A : [0, 1] t in t out do Phase 1, Step 2 4: for j 1 to t out do Phase 1, Step 3 5: if A[1, j] = A[2, j] = = A[t in, j] = 0 then 6: delete empty output row type R j 7: decrease t out by one 8: else 9: Determine all entries of R j 10: if solverowassignment then Phase 2 11: return yes 12: return no Let (M, k, s) be an instance of k-anonymity, and let M be the (unknown) k-anonymous solution matrix which we seek to obtain from M by suppressing at most s elements Our fixed-parameter algorithm works in two phases In the first phase, the algorithm guesses the entries of each row type R j in M In the second phase, the algorithm computes an assignment of the rows of M to the row types R j in M see Algorithm 1 for an outline We now explain the two phases in detail, beginning with Phase 1 To first determine the row types R i of M (line 2 in Algorithm 1), the algorithm constructs a trie (Fredkin, 1960) on the rows of M The leaves of the trie correspond to the row types of M For later use, the algorithm also keeps track of the numbers n 1, n 2,, n tin of each type of row that is present in M; this can clearly be done by keeping a counter at each leaf of the trie and incrementing it by one whenever a new row matches the path to a leaf All of this can be done in a single pass over M For implementing the guess in Step 2 of Phase 1, the algorithm goes over all binary matrices of dimension t in t out ; such a matrix A is interpreted as follows: A row of type R i is mapped 5 to a row of type R j if and only if A[i, j] = 1 (see line 3) Note that we allow in our guessing step that an output type may contain no row of any input row type These empty output row types are deleted Hence, with our guessing in Step 2, we guess not only output matrices M with exactly t out types, but also matrices M with at most t out types Now the algorithm computes the entries of each row type R j, 1 j t out, of M (Step 3 of Phase 1) Assume for ease of notation that R 1,, R l are the row types of M which contribute (according to the guessing) at least one row to the (as yet unknown) output row type R j Now, for each 1 i m, if R 1 [i] = R 2 [i] = = R l [i], then set R j [i] := R 1[i]; otherwise, set R j [i] := This yields the entries of the output type R j, and the number ω j of suppres- 5 Note that not all rows of an input type need to be mapped to the same output type 16

17 sions required to convert any input row (if possible) to the type R j is the number of -entries in R j The guessing in Step 2 of Phase 1 takes time exponential in the parameter (t in, t out ), but Phase 2 can be done in polynomial time To show this, we prove that Row Assignment is polynomial-time solvable We do this in the next lemma, after formally defining the Row Assignment problem To this end, we use the two sets T in = {1,, t in } and T out = {1,, t out } Row Assignment Input: Nonnegative integers k, s, ω 1,, ω tout and n 1,, n tin with tin i=1 n i = n, and a function a : T in T out {0, 1} Question: Is there a function g : T in T out {0,, n} such that t in t out i=1 j=1 a(i, j) n g(i, j) i T in j T out (1) t in i=1 g(i, j) k j T out (2) t out g(i, j) = n i i T in (3) j=1 g(i, j) ω j s (4) Row Assignment formally defines the remaining problem in Phase 2: At this stage of the algorithm the input row types R 1,, R tin and the number of rows n 1,, n tin in these input row types are known The algorithm has also computed the output row types R 1,, R t out and the number of suppressions ω 1,, ω tout in these output row types Now, the algorithm computes an assignment of the rows of the input row types to output row types such that: The assignment of the rows respects the guessing in Step 2 of Phase 1 This is secured by Inequality (1) M is k-anonymous, that is, each output row type contains at least k rows This is secured by Inequality (2) All rows of each input row type are assigned This is secured by Equation (3) The total cost of the assignment is at most s This is secured by Inequality (4) Note that in the definition of Row Assignment no row type occurs and, hence, the problem is independent of the specific entries of the input or output row types Lemma 1 Row Assignment can be solved in O((t in +t out ) log(t in +t out )(t in t out + (t in + t out ) log(t in + t out ))) time Proof We reduce Row Assignment to the Uncapacitated Minimum Cost Flow problem, which is defined as follows (Orlin, 1988): 17

18 n 1 v 1 n 2 v 2 n 3 v 3 n 4 v 4 n 5 v 5 ω 1 ω 1 ω 3 ω 2 ω 2 ω 3 ω 4 ω 4 k u 1 k u 2 k u 3 k u ( tin ) n i k t out i=1 t Fig 3 Example of the constructed network with t in = 5 and t out = 4 The number on each arc denotes its cost The number next to each node denotes its demand Uncapacitated Minimum Cost Flow Input: Task: A network (directed graph) D = (V, A) with demands d : V Z on the nodes and costs c : V V N Find a function f which minimizes (u,v) A c(u, v) f(u, v) and satisfies: f(u, v) f(v, u) = d(u) u V {v (u,v) A} {v (v,u) A} f(u, v) 0 (u, v) A We first describe the construction of the network with demands and costs For each n i, 1 i t in, add a node v i with demand n i (that is, a supply of n i ) and for each ω j add a node u j with demand k If a(i, j) = 1, then add an arc (v i, u j ) with cost ω j Finally, add a sink t with demand ( n i ) k t out and the arcs (u j, t) with cost zero See Figure 3 for an example of the construction Note that, although the arc capacities are unbounded, the maximum flow over one arc is implicitly bounded by n because the sum of all supplies is tin i=1 n i = n The Uncapacitated Minimum Cost Flow problem is solvable in O( V log( V )( A + V log( V ))) time in a network (directed graph) D = (V, A) (Orlin, 1988) Since our constructed network has O(t in + t out ) nodes and O(t in t out ) arcs, we can solve our Uncapacitated Minimum Cost Flow-instance in O((t in + t out ) log(t in + t out )(t in t out + (t in + t out ) log(t in + t out ))) time It remains to prove that the Row Assignment-instance is a yes-instance if and only if the constructed network has a minimum cost flow of cost at most s : Assume that g is a function fulfilling constraints (1) to (4) Then define a flow f as follows: For each 1 i t in, 1 j t out, set f(v i, u j ) = g(i, j) and f(u j, t) = t in i=1 g(i, j) k The flow f fulfills the demands on the nodes due to Equation (3) and Inequality (2) Since g fulfills Inequality (4) and the cost of each arc (u j, t), 1 j t out, is zero, flow f has cost of at most s 18

19 : Assume that f is a flow with cost of at most s All costs and demands are integer valued, and hence, due to the Integrality Property of network flow problems, an optimal flow has also integer values Then set g(i, j) = f(v i, u j ) for each 1 i t in, 1 j t out Note that g fulfills Equation (3) and Inequality (2) due to the demands on the nodes of the network Since n i n for all 1 i t in, also Inequality (1) is fulfilled Note that f has cost of at most s and, hence, g fulfills Inequality (4) Putting all these together, we arrive at the following theorem: Theorem 2 k-anonymity can be solved in O(nm + 2 tintout (t in t out m + (t in + t out ) log(t in + t out )(t in t out + (t in + t out ) log(t in + t out )))) time Proof The correctness of the described two-phase-algorithm (see Algorithm 1) follows from the fact that Phase 1 performs an exhaustive search, Lemma 1, and (as discussed above) that Row Assignment is indeed the remaining problem As for the running time: As described above, Step 1 of Phase 1 (line 2 in Algorithm 1) can be done in one pass of the input matrix, that is, in O(nm) time The loop starting at line 3 iterates O(2 tintout ) times The loop starting at line 4 iterates at most t out times; the condition inside this loop can be checked in O(t in ) time, and the column in A corresponding to an empty output row type can be marked as deleted in constant time The entries of each R j can be computed in O(mt in) time, as described above As we saw in Lemma 1, the Row Assignment problem can be solved in O((t in + t out ) log(t in + t out )(t in t out + (t in + t out ) log(t in + t out ))) time Putting all these together, the algorithm runs in O(nm + 2 tintout (t in t out m + (t in + t out ) log(t in + t out )(t in t out + (t in + t out ) log(t in + t out )))) time In case that t out is not given, the algorithm simple tries all possible values from 0 to t out The running time is O(nm+ t out i=1 2tini (t in im+(t in +i) log(t in + i)(t in i+(t in +i) log(t in +i)))) = O(nm+2 tintout (t in t out m+(t in +t out ) log(t in + t out )(t in t out + (t in + t out ) log(t in + t out )))) We now show that the exponential dependence of the running time of the algorithm of Theorem 2 on the number t out of output types can be done away with To this end, we exploit that the problem definition (see Section 2) does not impose any restriction on t out In particular, we are free to look for a solution which minimizes the number of output types We take advantage of this to show that, without loss of generality, one may assume t out t in (remember that in this section we set t out := t min out ) Lemma 2 Let (M, k, s) be a yes-instance of k-anonymity If M has t in row types, then by suppressing at most s elements one can construct a k-anonymous matrix M Proof Let M be a k-anonymous matrix obtained from M by suppressing at most s elements If M has at most t in row types, then there is nothing to prove So let M have more than t in row types We now describe an operation 19

20 that reduces the number of row types in M without increasing the cost (in terms of the number of suppressed elements) of obtaining M from M Call a row type R in M redistributable if, for each row r in R, there is a row type R r in M such that (i) the preimage r of r in M can be converted to the type R r by suppressing in r the -symbol positions of R r, and (ii) R r contains at most as many -symbols as R Observe that if a row type in M is redistributable, then we can eliminate this row type from M by moving each of its rows to a different, at most as expensive row type, while preserving the condition that there are at least k rows per (output) row type This operation reduces the number of row types in M without increasing the cost of obtaining M from M As long as there are redistributable row types left in M, we repeatedly eliminate row types in this manner If the number of remaining row types is at most t in, then we are done So let there be more than t in row types left in M, none of which is redistributable Consider a row type R in M that is not redistributable Let r be a row in R such that the cost of suppressing the preimage r of r to match any row type R in M with R R is more than the cost of suppressing r to match R Since R is not redistributable such a row r must exist Let R be the row type of r in M Then the cost of suppressing any row in R to match any row type R in M with R R is more than the cost of suppressing this row to match R It follows that R is the only row type in M having this property with respect to R Let f be a mapping that takes each row type R in M to some row type R for which R is the unique cheapest row type From the above argument, such an f exists and is one-to-one It follows that M has more than t in row types, a contradiction Combining Lemma 2 with Theorem 2 we get: Theorem 3 k-anonymity can be solved in O(nm + 2 t2 in t 2 in (m + t in log(t in ))) time In the beginning of this section, we remarked that our algorithm implies the fixed-parameter tractability of k-anonymity with respect to the combined parameter ( Σ, m) This was previously shown by Bonizzoni et al (2011b) They presented an algorithm for k-anonymity which runs in O(2 ( Σ +1)m kmn 2 ) time and works similarly to our algorithm in two phases: First, their algorithm guesses all possible output row types together with their entries in O(2 ( Σ +1)m ) time In Phase 1 our algorithm guesses the output row types producible from M within O(2 tintout t in t out m + mn) time using a different approach Note that, in general, t in is much smaller than the number Σ m of all possible different input types Hence, in general the guessing step of our algorithm is faster For instances where Σ m t in t out, one can exchange the guessing step with the one from Bonizzoni et al which runs in O(2 ( Σ +1)m ) time Next, we compare Phase 2 of our algorithm to the second step of Bonizzoni et al s algorithm In both algorithms, the same problem Row Assignment is solved Bonizzoni et al did this by finding a maximum matching on a bipartite 20

Using Patterns to Form Homogeneous Teams

Using Patterns to Form Homogeneous Teams Algorithmica manuscript No. (will be inserted by the editor) Using Patterns to Form Homogeneous Teams Robert Bredereck, Thomas Köhler, André Nichterlein, Rolf Niedermeier, Geevarghese Philip Abstract Homogeneous

More information

The Complexity of Maximum. Matroid-Greedoid Intersection and. Weighted Greedoid Maximization

The Complexity of Maximum. Matroid-Greedoid Intersection and. Weighted Greedoid Maximization Department of Computer Science Series of Publications C Report C-2004-2 The Complexity of Maximum Matroid-Greedoid Intersection and Weighted Greedoid Maximization Taneli Mielikäinen Esko Ukkonen University

More information

Classical Complexity and Fixed-Parameter Tractability of Simultaneous Consecutive Ones Submatrix & Editing Problems

Classical Complexity and Fixed-Parameter Tractability of Simultaneous Consecutive Ones Submatrix & Editing Problems Classical Complexity and Fixed-Parameter Tractability of Simultaneous Consecutive Ones Submatrix & Editing Problems Rani M. R, Mohith Jagalmohanan, R. Subashini Binary matrices having simultaneous consecutive

More information

arxiv: v2 [cs.dm] 12 Jul 2014

arxiv: v2 [cs.dm] 12 Jul 2014 Interval Scheduling and Colorful Independent Sets arxiv:1402.0851v2 [cs.dm] 12 Jul 2014 René van Bevern 1, Matthias Mnich 2, Rolf Niedermeier 1, and Mathias Weller 3 1 Institut für Softwaretechnik und

More information

Approximability and Parameterized Complexity of Consecutive Ones Submatrix Problems

Approximability and Parameterized Complexity of Consecutive Ones Submatrix Problems Proc. 4th TAMC, 27 Approximability and Parameterized Complexity of Consecutive Ones Submatrix Problems Michael Dom, Jiong Guo, and Rolf Niedermeier Institut für Informatik, Friedrich-Schiller-Universität

More information

Finding Consensus Strings With Small Length Difference Between Input and Solution Strings

Finding Consensus Strings With Small Length Difference Between Input and Solution Strings Finding Consensus Strings With Small Length Difference Between Input and Solution Strings Markus L. Schmid Trier University, Fachbereich IV Abteilung Informatikwissenschaften, D-54286 Trier, Germany, MSchmid@uni-trier.de

More information

Some hard families of parameterised counting problems

Some hard families of parameterised counting problems Some hard families of parameterised counting problems Mark Jerrum and Kitty Meeks School of Mathematical Sciences, Queen Mary University of London {m.jerrum,k.meeks}@qmul.ac.uk September 2014 Abstract

More information

NP-Hardness and Fixed-Parameter Tractability of Realizing Degree Sequences with Directed Acyclic Graphs

NP-Hardness and Fixed-Parameter Tractability of Realizing Degree Sequences with Directed Acyclic Graphs NP-Hardness and Fixed-Parameter Tractability of Realizing Degree Sequences with Directed Acyclic Graphs Sepp Hartung and André Nichterlein Institut für Softwaretechnik und Theoretische Informatik TU Berlin

More information

On the Complexity of the Minimum Independent Set Partition Problem

On the Complexity of the Minimum Independent Set Partition Problem On the Complexity of the Minimum Independent Set Partition Problem T-H. Hubert Chan 1, Charalampos Papamanthou 2, and Zhichao Zhao 1 1 Department of Computer Science the University of Hong Kong {hubert,zczhao}@cs.hku.hk

More information

Kernelization Lower Bounds: A Brief History

Kernelization Lower Bounds: A Brief History Kernelization Lower Bounds: A Brief History G Philip Max Planck Institute for Informatics, Saarbrücken, Germany New Developments in Exact Algorithms and Lower Bounds. Pre-FSTTCS 2014 Workshop, IIT Delhi

More information

Parameterized Algorithms and Kernels for 3-Hitting Set with Parity Constraints

Parameterized Algorithms and Kernels for 3-Hitting Set with Parity Constraints Parameterized Algorithms and Kernels for 3-Hitting Set with Parity Constraints Vikram Kamat 1 and Neeldhara Misra 2 1 University of Warsaw vkamat@mimuw.edu.pl 2 Indian Institute of Science, Bangalore neeldhara@csa.iisc.ernet.in

More information

A Cubic-Vertex Kernel for Flip Consensus Tree

A Cubic-Vertex Kernel for Flip Consensus Tree To appear in Algorithmica A Cubic-Vertex Kernel for Flip Consensus Tree Christian Komusiewicz Johannes Uhlmann Received: date / Accepted: date Abstract Given a bipartite graph G = (V c, V t, E) and a nonnegative

More information

Preliminaries and Complexity Theory

Preliminaries and Complexity Theory Preliminaries and Complexity Theory Oleksandr Romanko CAS 746 - Advanced Topics in Combinatorial Optimization McMaster University, January 16, 2006 Introduction Book structure: 2 Part I Linear Algebra

More information

an efficient procedure for the decision problem. We illustrate this phenomenon for the Satisfiability problem.

an efficient procedure for the decision problem. We illustrate this phenomenon for the Satisfiability problem. 1 More on NP In this set of lecture notes, we examine the class NP in more detail. We give a characterization of NP which justifies the guess and verify paradigm, and study the complexity of solving search

More information

Detecting Backdoor Sets with Respect to Horn and Binary Clauses

Detecting Backdoor Sets with Respect to Horn and Binary Clauses Detecting Backdoor Sets with Respect to Horn and Binary Clauses Naomi Nishimura 1,, Prabhakar Ragde 1,, and Stefan Szeider 2, 1 School of Computer Science, University of Waterloo, Waterloo, Ontario, N2L

More information

More on NP and Reductions

More on NP and Reductions Indian Institute of Information Technology Design and Manufacturing, Kancheepuram Chennai 600 127, India An Autonomous Institute under MHRD, Govt of India http://www.iiitdm.ac.in COM 501 Advanced Data

More information

On Explaining Integer Vectors by Few Homogeneous Segments

On Explaining Integer Vectors by Few Homogeneous Segments On Explaining Integer Vectors by Few Homogeneous Segments Robert Bredereck a,1, Jiehua Chen a,2, Sepp Hartung a, Christian Komusiewicz a, Rolf Niedermeier a, Ondřej Suchý b,3 a Institut für Softwaretechnik

More information

Parameterized Complexity of the Arc-Preserving Subsequence Problem

Parameterized Complexity of the Arc-Preserving Subsequence Problem Parameterized Complexity of the Arc-Preserving Subsequence Problem Dániel Marx 1 and Ildikó Schlotter 2 1 Tel Aviv University, Israel 2 Budapest University of Technology and Economics, Hungary {dmarx,ildi}@cs.bme.hu

More information

Lecture 14 - P v.s. NP 1

Lecture 14 - P v.s. NP 1 CME 305: Discrete Mathematics and Algorithms Instructor: Professor Aaron Sidford (sidford@stanford.edu) February 27, 2018 Lecture 14 - P v.s. NP 1 In this lecture we start Unit 3 on NP-hardness and approximation

More information

Essential facts about NP-completeness:

Essential facts about NP-completeness: CMPSCI611: NP Completeness Lecture 17 Essential facts about NP-completeness: Any NP-complete problem can be solved by a simple, but exponentially slow algorithm. We don t have polynomial-time solutions

More information

Technische Universität München, Zentrum Mathematik Lehrstuhl für Angewandte Geometrie und Diskrete Mathematik. Combinatorial Optimization (MA 4502)

Technische Universität München, Zentrum Mathematik Lehrstuhl für Angewandte Geometrie und Diskrete Mathematik. Combinatorial Optimization (MA 4502) Technische Universität München, Zentrum Mathematik Lehrstuhl für Angewandte Geometrie und Diskrete Mathematik Combinatorial Optimization (MA 4502) Dr. Michael Ritter Problem Sheet 1 Homework Problems Exercise

More information

The matrix approach for abstract argumentation frameworks

The matrix approach for abstract argumentation frameworks The matrix approach for abstract argumentation frameworks Claudette CAYROL, Yuming XU IRIT Report RR- -2015-01- -FR February 2015 Abstract The matrices and the operation of dual interchange are introduced

More information

arxiv: v1 [cs.ds] 2 Oct 2018

arxiv: v1 [cs.ds] 2 Oct 2018 Contracting to a Longest Path in H-Free Graphs Walter Kern 1 and Daniël Paulusma 2 1 Department of Applied Mathematics, University of Twente, The Netherlands w.kern@twente.nl 2 Department of Computer Science,

More information

Efficient Approximation for Restricted Biclique Cover Problems

Efficient Approximation for Restricted Biclique Cover Problems algorithms Article Efficient Approximation for Restricted Biclique Cover Problems Alessandro Epasto 1, *, and Eli Upfal 2 ID 1 Google Research, New York, NY 10011, USA 2 Department of Computer Science,

More information

Closest 4-leaf power is fixed-parameter tractable

Closest 4-leaf power is fixed-parameter tractable Discrete Applied Mathematics 156 (2008) 3345 3361 www.elsevier.com/locate/dam Closest 4-leaf power is fixed-parameter tractable Michael Dom, Jiong Guo, Falk Hüffner, Rolf Niedermeier Institut für Informatik,

More information

Exploiting a Hypergraph Model for Finding Golomb Rulers

Exploiting a Hypergraph Model for Finding Golomb Rulers Exploiting a Hypergraph Model for Finding Golomb Rulers Manuel Sorge, Hannes Moser, Rolf Niedermeier, and Mathias Weller Institut für Softwaretechnik und Theoretische Informatik, TU Berlin, Berlin, Germany

More information

Copyright 2013 Springer Science+Business Media New York

Copyright 2013 Springer Science+Business Media New York Meeks, K., and Scott, A. (2014) Spanning trees and the complexity of floodfilling games. Theory of Computing Systems, 54 (4). pp. 731-753. ISSN 1432-4350 Copyright 2013 Springer Science+Business Media

More information

The Maximum Flow Problem with Disjunctive Constraints

The Maximum Flow Problem with Disjunctive Constraints The Maximum Flow Problem with Disjunctive Constraints Ulrich Pferschy Joachim Schauer Abstract We study the maximum flow problem subject to binary disjunctive constraints in a directed graph: A negative

More information

On improving matchings in trees, via bounded-length augmentations 1

On improving matchings in trees, via bounded-length augmentations 1 On improving matchings in trees, via bounded-length augmentations 1 Julien Bensmail a, Valentin Garnero a, Nicolas Nisse a a Université Côte d Azur, CNRS, Inria, I3S, France Abstract Due to a classical

More information

A Polynomial-Time Algorithm for Pliable Index Coding

A Polynomial-Time Algorithm for Pliable Index Coding 1 A Polynomial-Time Algorithm for Pliable Index Coding Linqi Song and Christina Fragouli arxiv:1610.06845v [cs.it] 9 Aug 017 Abstract In pliable index coding, we consider a server with m messages and n

More information

Advanced topic: Space complexity

Advanced topic: Space complexity Advanced topic: Space complexity CSCI 3130 Formal Languages and Automata Theory Siu On CHAN Chinese University of Hong Kong Fall 2016 1/28 Review: time complexity We have looked at how long it takes to

More information

The Budgeted Unique Coverage Problem and Color-Coding (Extended Abstract)

The Budgeted Unique Coverage Problem and Color-Coding (Extended Abstract) The Budgeted Unique Coverage Problem and Color-Coding (Extended Abstract) Neeldhara Misra 1, Venkatesh Raman 1, Saket Saurabh 2, and Somnath Sikdar 1 1 The Institute of Mathematical Sciences, Chennai,

More information

ON COST MATRICES WITH TWO AND THREE DISTINCT VALUES OF HAMILTONIAN PATHS AND CYCLES

ON COST MATRICES WITH TWO AND THREE DISTINCT VALUES OF HAMILTONIAN PATHS AND CYCLES ON COST MATRICES WITH TWO AND THREE DISTINCT VALUES OF HAMILTONIAN PATHS AND CYCLES SANTOSH N. KABADI AND ABRAHAM P. PUNNEN Abstract. Polynomially testable characterization of cost matrices associated

More information

Strongly chordal and chordal bipartite graphs are sandwich monotone

Strongly chordal and chordal bipartite graphs are sandwich monotone Strongly chordal and chordal bipartite graphs are sandwich monotone Pinar Heggernes Federico Mancini Charis Papadopoulos R. Sritharan Abstract A graph class is sandwich monotone if, for every pair of its

More information

1 Basic Game Modelling

1 Basic Game Modelling Max-Planck-Institut für Informatik, Winter 2017 Advanced Topic Course Algorithmic Game Theory, Mechanism Design & Computational Economics Lecturer: CHEUNG, Yun Kuen (Marco) Lecture 1: Basic Game Modelling,

More information

An Improved Algorithm for Parameterized Edge Dominating Set Problem

An Improved Algorithm for Parameterized Edge Dominating Set Problem An Improved Algorithm for Parameterized Edge Dominating Set Problem Ken Iwaide and Hiroshi Nagamochi Department of Applied Mathematics and Physics, Graduate School of Informatics, Kyoto University, Japan,

More information

CS 350 Algorithms and Complexity

CS 350 Algorithms and Complexity 1 CS 350 Algorithms and Complexity Fall 2015 Lecture 15: Limitations of Algorithmic Power Introduction to complexity theory Andrew P. Black Department of Computer Science Portland State University Lower

More information

Lecture December 2009 Fall 2009 Scribe: R. Ring In this lecture we will talk about

Lecture December 2009 Fall 2009 Scribe: R. Ring In this lecture we will talk about 0368.4170: Cryptography and Game Theory Ran Canetti and Alon Rosen Lecture 7 02 December 2009 Fall 2009 Scribe: R. Ring In this lecture we will talk about Two-Player zero-sum games (min-max theorem) Mixed

More information

Realization Plans for Extensive Form Games without Perfect Recall

Realization Plans for Extensive Form Games without Perfect Recall Realization Plans for Extensive Form Games without Perfect Recall Richard E. Stearns Department of Computer Science University at Albany - SUNY Albany, NY 12222 April 13, 2015 Abstract Given a game in

More information

P is the class of problems for which there are algorithms that solve the problem in time O(n k ) for some constant k.

P is the class of problems for which there are algorithms that solve the problem in time O(n k ) for some constant k. Complexity Theory Problems are divided into complexity classes. Informally: So far in this course, almost all algorithms had polynomial running time, i.e., on inputs of size n, worst-case running time

More information

Approximation Algorithms for Re-optimization

Approximation Algorithms for Re-optimization Approximation Algorithms for Re-optimization DRAFT PLEASE DO NOT CITE Dean Alderucci Table of Contents 1.Introduction... 2 2.Overview of the Current State of Re-Optimization Research... 3 2.1.General Results

More information

CS264: Beyond Worst-Case Analysis Lecture #11: LP Decoding

CS264: Beyond Worst-Case Analysis Lecture #11: LP Decoding CS264: Beyond Worst-Case Analysis Lecture #11: LP Decoding Tim Roughgarden October 29, 2014 1 Preamble This lecture covers our final subtopic within the exact and approximate recovery part of the course.

More information

Introduction to Complexity Theory

Introduction to Complexity Theory Introduction to Complexity Theory Read K & S Chapter 6. Most computational problems you will face your life are solvable (decidable). We have yet to address whether a problem is easy or hard. Complexity

More information

34.1 Polynomial time. Abstract problems

34.1 Polynomial time. Abstract problems < Day Day Up > 34.1 Polynomial time We begin our study of NP-completeness by formalizing our notion of polynomial-time solvable problems. These problems are generally regarded as tractable, but for philosophical,

More information

A An Overview of Complexity Theory for the Algorithm Designer

A An Overview of Complexity Theory for the Algorithm Designer A An Overview of Complexity Theory for the Algorithm Designer A.1 Certificates and the class NP A decision problem is one whose answer is either yes or no. Two examples are: SAT: Given a Boolean formula

More information

A Fixed-Parameter Algorithm for Max Edge Domination

A Fixed-Parameter Algorithm for Max Edge Domination A Fixed-Parameter Algorithm for Max Edge Domination Tesshu Hanaka and Hirotaka Ono Department of Economic Engineering, Kyushu University, Fukuoka 812-8581, Japan ono@cscekyushu-uacjp Abstract In a graph,

More information

CS6901: review of Theory of Computation and Algorithms

CS6901: review of Theory of Computation and Algorithms CS6901: review of Theory of Computation and Algorithms Any mechanically (automatically) discretely computation of problem solving contains at least three components: - problem description - computational

More information

On the Space Complexity of Parameterized Problems

On the Space Complexity of Parameterized Problems On the Space Complexity of Parameterized Problems Michael Elberfeld Christoph Stockhusen Till Tantau Institute for Theoretical Computer Science Universität zu Lübeck D-23538 Lübeck, Germany {elberfeld,stockhus,tantau}@tcs.uni-luebeck.de

More information

CS151 Complexity Theory. Lecture 1 April 3, 2017

CS151 Complexity Theory. Lecture 1 April 3, 2017 CS151 Complexity Theory Lecture 1 April 3, 2017 Complexity Theory Classify problems according to the computational resources required running time storage space parallelism randomness rounds of interaction,

More information

On the Fixed Parameter Tractability and Approximability of the Minimum Error Correction problem

On the Fixed Parameter Tractability and Approximability of the Minimum Error Correction problem On the Fixed Parameter Tractability and Approximability of the Minimum Error Correction problem Paola Bonizzoni, Riccardo Dondi, Gunnar W. Klau, Yuri Pirola, Nadia Pisanti and Simone Zaccaria DISCo, computer

More information

Limitations of Algorithm Power

Limitations of Algorithm Power Limitations of Algorithm Power Objectives We now move into the third and final major theme for this course. 1. Tools for analyzing algorithms. 2. Design strategies for designing algorithms. 3. Identifying

More information

An improved approximation algorithm for the stable marriage problem with one-sided ties

An improved approximation algorithm for the stable marriage problem with one-sided ties Noname manuscript No. (will be inserted by the editor) An improved approximation algorithm for the stable marriage problem with one-sided ties Chien-Chung Huang Telikepalli Kavitha Received: date / Accepted:

More information

The Parameterized Complexity of Intersection and Composition Operations on Sets of Finite-State Automata

The Parameterized Complexity of Intersection and Composition Operations on Sets of Finite-State Automata The Parameterized Complexity of Intersection and Composition Operations on Sets of Finite-State Automata H. Todd Wareham Department of Computer Science, Memorial University of Newfoundland, St. John s,

More information

Characterization of Semantics for Argument Systems

Characterization of Semantics for Argument Systems Characterization of Semantics for Argument Systems Philippe Besnard and Sylvie Doutre IRIT Université Paul Sabatier 118, route de Narbonne 31062 Toulouse Cedex 4 France besnard, doutre}@irit.fr Abstract

More information

Characterization of Convex and Concave Resource Allocation Problems in Interference Coupled Wireless Systems

Characterization of Convex and Concave Resource Allocation Problems in Interference Coupled Wireless Systems 2382 IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL 59, NO 5, MAY 2011 Characterization of Convex and Concave Resource Allocation Problems in Interference Coupled Wireless Systems Holger Boche, Fellow, IEEE,

More information

A bad example for the iterative rounding method for mincost k-connected spanning subgraphs

A bad example for the iterative rounding method for mincost k-connected spanning subgraphs A bad example for the iterative rounding method for mincost k-connected spanning subgraphs Ashkan Aazami a, Joseph Cheriyan b,, Bundit Laekhanukit c a Dept. of Comb. & Opt., U. Waterloo, Waterloo ON Canada

More information

Local Search for String Problems: Brute Force is Essentially Optimal

Local Search for String Problems: Brute Force is Essentially Optimal Local Search for String Problems: Brute Force is Essentially Optimal Jiong Guo 1, Danny Hermelin 2, Christian Komusiewicz 3 1 Universität des Saarlandes, Campus E 1.7, D-66123 Saarbrücken, Germany. jguo@mmci.uni-saarland.de

More information

Pareto Optimality in Coalition Formation

Pareto Optimality in Coalition Formation Pareto Optimality in Coalition Formation Haris Aziz Felix Brandt Paul Harrenstein Department of Informatics Technische Universität München 85748 Garching bei München, Germany {aziz,brandtf,harrenst}@in.tum.de

More information

Lecture 22: Counting

Lecture 22: Counting CS 710: Complexity Theory 4/8/2010 Lecture 22: Counting Instructor: Dieter van Melkebeek Scribe: Phil Rydzewski & Chi Man Liu Last time we introduced extractors and discussed two methods to construct them.

More information

K-center Hardness and Max-Coverage (Greedy)

K-center Hardness and Max-Coverage (Greedy) IOE 691: Approximation Algorithms Date: 01/11/2017 Lecture Notes: -center Hardness and Max-Coverage (Greedy) Instructor: Viswanath Nagarajan Scribe: Sentao Miao 1 Overview In this lecture, we will talk

More information

1 The linear algebra of linear programs (March 15 and 22, 2015)

1 The linear algebra of linear programs (March 15 and 22, 2015) 1 The linear algebra of linear programs (March 15 and 22, 2015) Many optimization problems can be formulated as linear programs. The main features of a linear program are the following: Variables are real

More information

CS 6820 Fall 2014 Lectures, October 3-20, 2014

CS 6820 Fall 2014 Lectures, October 3-20, 2014 Analysis of Algorithms Linear Programming Notes CS 6820 Fall 2014 Lectures, October 3-20, 2014 1 Linear programming The linear programming (LP) problem is the following optimization problem. We are given

More information

The Mixed Chinese Postman Problem Parameterized by Pathwidth and Treedepth

The Mixed Chinese Postman Problem Parameterized by Pathwidth and Treedepth The Mixed Chinese Postman Problem Parameterized by Pathwidth and Treedepth Gregory Gutin, Mark Jones, and Magnus Wahlström Royal Holloway, University of London Egham, Surrey TW20 0EX, UK Abstract In the

More information

Exact and Approximate Equilibria for Optimal Group Network Formation

Exact and Approximate Equilibria for Optimal Group Network Formation Exact and Approximate Equilibria for Optimal Group Network Formation Elliot Anshelevich and Bugra Caskurlu Computer Science Department, RPI, 110 8th Street, Troy, NY 12180 {eanshel,caskub}@cs.rpi.edu Abstract.

More information

Parameterized Algorithms for the H-Packing with t-overlap Problem

Parameterized Algorithms for the H-Packing with t-overlap Problem Journal of Graph Algorithms and Applications http://jgaa.info/ vol. 18, no. 4, pp. 515 538 (2014) DOI: 10.7155/jgaa.00335 Parameterized Algorithms for the H-Packing with t-overlap Problem Alejandro López-Ortiz

More information

Complexity Theory VU , SS The Polynomial Hierarchy. Reinhard Pichler

Complexity Theory VU , SS The Polynomial Hierarchy. Reinhard Pichler Complexity Theory Complexity Theory VU 181.142, SS 2018 6. The Polynomial Hierarchy Reinhard Pichler Institut für Informationssysteme Arbeitsbereich DBAI Technische Universität Wien 15 May, 2018 Reinhard

More information

Outline. Complexity Theory EXACT TSP. The Class DP. Definition. Problem EXACT TSP. Complexity of EXACT TSP. Proposition VU 181.

Outline. Complexity Theory EXACT TSP. The Class DP. Definition. Problem EXACT TSP. Complexity of EXACT TSP. Proposition VU 181. Complexity Theory Complexity Theory Outline Complexity Theory VU 181.142, SS 2018 6. The Polynomial Hierarchy Reinhard Pichler Institut für Informationssysteme Arbeitsbereich DBAI Technische Universität

More information

Additive Combinatorics and Szemerédi s Regularity Lemma

Additive Combinatorics and Szemerédi s Regularity Lemma Additive Combinatorics and Szemerédi s Regularity Lemma Vijay Keswani Anurag Sahay 20th April, 2015 Supervised by : Dr. Rajat Mittal 1 Contents 1 Introduction 3 2 Sum-set Estimates 4 2.1 Size of sumset

More information

Computational Aspects of Aggregation in Biological Systems

Computational Aspects of Aggregation in Biological Systems Computational Aspects of Aggregation in Biological Systems Vladik Kreinovich and Max Shpak University of Texas at El Paso, El Paso, TX 79968, USA vladik@utep.edu, mshpak@utep.edu Summary. Many biologically

More information

CS 350 Algorithms and Complexity

CS 350 Algorithms and Complexity CS 350 Algorithms and Complexity Winter 2019 Lecture 15: Limitations of Algorithmic Power Introduction to complexity theory Andrew P. Black Department of Computer Science Portland State University Lower

More information

Exact Algorithms for Dominating Induced Matching Based on Graph Partition

Exact Algorithms for Dominating Induced Matching Based on Graph Partition Exact Algorithms for Dominating Induced Matching Based on Graph Partition Mingyu Xiao School of Computer Science and Engineering University of Electronic Science and Technology of China Chengdu 611731,

More information

DIMACS Technical Report March Game Seki 1

DIMACS Technical Report March Game Seki 1 DIMACS Technical Report 2007-05 March 2007 Game Seki 1 by Diogo V. Andrade RUTCOR, Rutgers University 640 Bartholomew Road Piscataway, NJ 08854-8003 dandrade@rutcor.rutgers.edu Vladimir A. Gurvich RUTCOR,

More information

The minimum G c cut problem

The minimum G c cut problem The minimum G c cut problem Abstract In this paper we define and study the G c -cut problem. Given a complete undirected graph G = (V ; E) with V = n, edge weighted by w(v i, v j ) 0 and an undirected

More information

Int. Journal of Foundations of CS, Vol. 17(6), pp , 2006

Int. Journal of Foundations of CS, Vol. 17(6), pp , 2006 Int Journal of Foundations of CS, Vol 7(6), pp 467-484, 2006 THE COMPUTATIONAL COMPLEXITY OF AVOIDING FORBIDDEN SUBMATRICES BY ROW DELETIONS SEBASTIAN WERNICKE, JOCHEN ALBER 2, JENS GRAMM 3, JIONG GUO,

More information

Time-bounded computations

Time-bounded computations Lecture 18 Time-bounded computations We now begin the final part of the course, which is on complexity theory. We ll have time to only scratch the surface complexity theory is a rich subject, and many

More information

Computer Science 385 Analysis of Algorithms Siena College Spring Topic Notes: Limitations of Algorithms

Computer Science 385 Analysis of Algorithms Siena College Spring Topic Notes: Limitations of Algorithms Computer Science 385 Analysis of Algorithms Siena College Spring 2011 Topic Notes: Limitations of Algorithms We conclude with a discussion of the limitations of the power of algorithms. That is, what kinds

More information

1.1 P, NP, and NP-complete

1.1 P, NP, and NP-complete CSC5160: Combinatorial Optimization and Approximation Algorithms Topic: Introduction to NP-complete Problems Date: 11/01/2008 Lecturer: Lap Chi Lau Scribe: Jerry Jilin Le This lecture gives a general introduction

More information

A New Upper Bound for Max-2-SAT: A Graph-Theoretic Approach

A New Upper Bound for Max-2-SAT: A Graph-Theoretic Approach A New Upper Bound for Max-2-SAT: A Graph-Theoretic Approach Daniel Raible & Henning Fernau University of Trier, FB 4 Abteilung Informatik, 54286 Trier, Germany {raible,fernau}@informatik.uni-trier.de Abstract.

More information

Network-based dissolution

Network-based dissolution Network-based dissolution Bevern, van, R.; Bredereck, R.; Chen, J.; Froese, V.; Niedermeier, R.; Woeginger, G. Published: 01/01/2014 Document Version Publisher s PDF, also known as Version of Record (includes

More information

arxiv: v1 [cs.dm] 12 Jun 2016

arxiv: v1 [cs.dm] 12 Jun 2016 A Simple Extension of Dirac s Theorem on Hamiltonicity Yasemin Büyükçolak a,, Didem Gözüpek b, Sibel Özkana, Mordechai Shalom c,d,1 a Department of Mathematics, Gebze Technical University, Kocaeli, Turkey

More information

Asymptotic enumeration of sparse uniform linear hypergraphs with given degrees

Asymptotic enumeration of sparse uniform linear hypergraphs with given degrees Asymptotic enumeration of sparse uniform linear hypergraphs with given degrees Vladimir Blinovsky Instituto de atemática e Estatística Universidade de São Paulo, 05508-090, Brazil Institute for Information

More information

arxiv: v3 [cs.fl] 2 Jul 2018

arxiv: v3 [cs.fl] 2 Jul 2018 COMPLEXITY OF PREIMAGE PROBLEMS FOR DETERMINISTIC FINITE AUTOMATA MIKHAIL V. BERLINKOV arxiv:1704.08233v3 [cs.fl] 2 Jul 2018 Institute of Natural Sciences and Mathematics, Ural Federal University, Ekaterinburg,

More information

Tree sets. Reinhard Diestel

Tree sets. Reinhard Diestel 1 Tree sets Reinhard Diestel Abstract We study an abstract notion of tree structure which generalizes treedecompositions of graphs and matroids. Unlike tree-decompositions, which are too closely linked

More information

Discrete Optimization 23

Discrete Optimization 23 Discrete Optimization 23 2 Total Unimodularity (TU) and Its Applications In this section we will discuss the total unimodularity theory and its applications to flows in networks. 2.1 Total Unimodularity:

More information

arxiv: v1 [math.oc] 3 Jan 2019

arxiv: v1 [math.oc] 3 Jan 2019 The Product Knapsack Problem: Approximation and Complexity arxiv:1901.00695v1 [math.oc] 3 Jan 2019 Ulrich Pferschy a, Joachim Schauer a, Clemens Thielen b a Department of Statistics and Operations Research,

More information

CS3719 Theory of Computation and Algorithms

CS3719 Theory of Computation and Algorithms CS3719 Theory of Computation and Algorithms Any mechanically (automatically) discretely computation of problem solving contains at least three components: - problem description - computational tool - analysis

More information

On the mean connected induced subgraph order of cographs

On the mean connected induced subgraph order of cographs AUSTRALASIAN JOURNAL OF COMBINATORICS Volume 71(1) (018), Pages 161 183 On the mean connected induced subgraph order of cographs Matthew E Kroeker Lucas Mol Ortrud R Oellermann University of Winnipeg Winnipeg,

More information

NP-Completeness I. Lecture Overview Introduction: Reduction and Expressiveness

NP-Completeness I. Lecture Overview Introduction: Reduction and Expressiveness Lecture 19 NP-Completeness I 19.1 Overview In the past few lectures we have looked at increasingly more expressive problems that we were able to solve using efficient algorithms. In this lecture we introduce

More information

Lectures 6, 7 and part of 8

Lectures 6, 7 and part of 8 Lectures 6, 7 and part of 8 Uriel Feige April 26, May 3, May 10, 2015 1 Linear programming duality 1.1 The diet problem revisited Recall the diet problem from Lecture 1. There are n foods, m nutrients,

More information

NP-Completeness. NP-Completeness 1

NP-Completeness. NP-Completeness 1 NP-Completeness Reference: Computers and Intractability: A Guide to the Theory of NP-Completeness by Garey and Johnson, W.H. Freeman and Company, 1979. NP-Completeness 1 General Problems, Input Size and

More information

k-protected VERTICES IN BINARY SEARCH TREES

k-protected VERTICES IN BINARY SEARCH TREES k-protected VERTICES IN BINARY SEARCH TREES MIKLÓS BÓNA Abstract. We show that for every k, the probability that a randomly selected vertex of a random binary search tree on n nodes is at distance k from

More information

Steiner Forest Orientation Problems

Steiner Forest Orientation Problems Steiner Forest Orientation Problems Marek Cygan Guy Kortsarz Zeev Nutov November 20, 2011 Abstract We consider connectivity problems with orientation constraints. Given a directed graph D and a collection

More information

Computational Complexity. IE 496 Lecture 6. Dr. Ted Ralphs

Computational Complexity. IE 496 Lecture 6. Dr. Ted Ralphs Computational Complexity IE 496 Lecture 6 Dr. Ted Ralphs IE496 Lecture 6 1 Reading for This Lecture N&W Sections I.5.1 and I.5.2 Wolsey Chapter 6 Kozen Lectures 21-25 IE496 Lecture 6 2 Introduction to

More information

Notes for Lecture Notes 2

Notes for Lecture Notes 2 Stanford University CS254: Computational Complexity Notes 2 Luca Trevisan January 11, 2012 Notes for Lecture Notes 2 In this lecture we define NP, we state the P versus NP problem, we prove that its formulation

More information

Local Search for String Problems: Brute Force is Essentially Optimal

Local Search for String Problems: Brute Force is Essentially Optimal Local Search for String Problems: Brute Force is Essentially Optimal Jiong Guo 1, Danny Hermelin 2, Christian Komusiewicz 3 1 Universität des Saarlandes, Campus E 1.7, D-66123 Saarbrücken, Germany. jguo@mmci.uni-saarland.de

More information

5 Flows and cuts in digraphs

5 Flows and cuts in digraphs 5 Flows and cuts in digraphs Recall that a digraph or network is a pair G = (V, E) where V is a set and E is a multiset of ordered pairs of elements of V, which we refer to as arcs. Note that two vertices

More information

1 T 1 = where 1 is the all-ones vector. For the upper bound, let v 1 be the eigenvector corresponding. u:(u,v) E v 1(u)

1 T 1 = where 1 is the all-ones vector. For the upper bound, let v 1 be the eigenvector corresponding. u:(u,v) E v 1(u) CME 305: Discrete Mathematics and Algorithms Instructor: Reza Zadeh (rezab@stanford.edu) Final Review Session 03/20/17 1. Let G = (V, E) be an unweighted, undirected graph. Let λ 1 be the maximum eigenvalue

More information

A Note on Perfect Partial Elimination

A Note on Perfect Partial Elimination A Note on Perfect Partial Elimination Matthijs Bomhoff, Walter Kern, and Georg Still University of Twente, Faculty of Electrical Engineering, Mathematics and Computer Science, P.O. Box 217, 7500 AE Enschede,

More information

The approximability of Max CSP with fixed-value constraints

The approximability of Max CSP with fixed-value constraints The approximability of Max CSP with fixed-value constraints Vladimir Deineko Warwick Business School University of Warwick, UK Vladimir.Deineko@wbs.ac.uk Mikael Klasson Dep t of Computer and Information

More information