Computing Bi-Clusters for Microarray Analysis
|
|
- Terence Powers
- 5 years ago
- Views:
Transcription
1 December 21, 2006
2 General Bi-clustering Problem Similarity of Rows (1-5) Cheng and Churchs model General Bi-clustering Problem Input: a n m matrix A. Output: a sub-matrix A P,Q of A such that the rows of A P,Q are similar. That is, all the rows are identical. Why sub-matrix? A subset of genes are co-regulated and co-expressed under specific conditions. It is interesting to find the subsets of genes and conditions.
3 General Bi-clustering Problem Similarity of Rows (1-5) Cheng and Churchs model Similarity of Rows (1-5) 1. All rows are identical All the elements in a row are identical (the same as 1 if we treat columns as rows)
4 General Bi-clustering Problem Similarity of Rows (1-5) Cheng and Churchs model Similarity of Rows (1-5) 3. The curves for all rows are similar (additive) a i,j a i,k = c(j, k) for i = 1, 2,..., m. Case 3 is equivalent to case 2 (thus also case 1) if we construct a new matrix a i,j = a i,j a i,p for a fixed p indicate a row.
5 General Bi-clustering Problem Similarity of Rows (1-5) Cheng and Churchs model Similarity of Rows (1-5) 4. The curves for all rows are similar (multiplicative) a 1,1 a 1,2 a 1,3... a 1,m c 1 a 1,1 c 1 a 1,2 c 1 a 1,3... c 1 a 1,m c 2 a 1,1 c 2 a 1,2 c 2 a 1,3... c 2 a 1,m... c n a 1,1 c n a 1,2 c n a 1,3... c n a 1,m Transfer to case 2 (thus case 1) by taking log and subtraction. Case 3 and Case 4 are called bi-clusters with coherent values.
6 General Bi-clustering Problem Similarity of Rows (1-5) Cheng and Churchs model Similarity of Rows (1-5) 5. The curves for all rows are similar (multiplicative and additive) a i,j = c i a k,j + d i Transfer to case 2 (thus case 1) by subtraction of a fixed row (row i), taking log and subtraction of row i again. The basic model: All the rows in the sub-matrix are identical.
7 General Bi-clustering Problem Similarity of Rows (1-5) Cheng and Churchs model Cheng and Churchs model The model introduced a similarity score called the mean squared residue score H to measure the coherence of the rows and columns in the submatrix. where H(P, Q) = 1 P Q a i,q = 1 Q j Q i P,j Q a i,j, a P,j = 1 P (a i,j a i,q a P,j + a P,Q ) 2 i P a i,j, a P,Q = 1 P Q i P,j Q If there is no error, H(P, Q)=0 for case 1, 2 and 3. A lot of heuristics (programs) have been produced. a i,j.
8 Consensus Sub-matrix Problem Bottleneck Sub-matrix Problem NP-Hardness Results Consensus Sub-matrix Problem Bottleneck Sub-matrix Problem
9 Consensus Sub-matrix Problem Bottleneck Sub-matrix Problem NP-Hardness Results Consensus Sub-matrix Problem Input: a n m matrix A, integers l and k. Output: a sub-matrix A P,Q of A with l rows and k columns and a consensus row z (of k elements) such that r i P d(r i Q, z) is minimized. Here d(, ) is the Hamming distance.
10 Consensus Sub-matrix Problem Bottleneck Sub-matrix Problem NP-Hardness Results Bottleneck Sub-matrix Problem Input: a n m matrix A, integers l and k. Output: a sub-matrix A P,Q of A with l rows and k columns and a consensus row z (of k elements) such that for any r i in P d(r i Q, z) d and d is minimized Here d(, ) is the Hamming distance.
11 Consensus Sub-matrix Problem Bottleneck Sub-matrix Problem NP-Hardness Results NP-Hardness Results Theorem 1: Both consensus sub-matrix and bottleneck sub-matrix problems are NP-hard. Proof: We use a reduction from maximum edge bipartite problem.
12 for Consensus Sub-matrix Problem Input: a n m matrix A, integers l and k. Output: a sub-matrix A P,Q of A with l rows and k columns and a consensus row z (of k elements) such that r i P d(r i Q, z) is minimized. Here d(, ) is the Hamming distance.
13 Assumptions: H opt = p i P opt d(x pi Qopt, z opt) = O(kl), H opt c = kl and Q opt = k = O(n), k c = n. : We use a random sampling technique to randomly select O(logm) columns in Q opt, enumerate all possible vectors of length O(logm) for those columns. At some moment, we know O(logm) bits of r opt and we can use the partial z opt to select the l rows which are closest to z opt in those O(logm) bits. After that we can construct a consensus vector r as follows: for each column, choose the (majority) letter that appears the most in each of the l letters in the l selected rows. Then for each of the n columns, we can calculate the number of mismatches between the majority letter and the l letters in the l selected rows. By selecting the best k columns, we can get a good solution.
14 How to randomly select O(logm) columns in Q opt while Q opt is unknown? Our new idea is to randomly select a set B of (c + 1)logm columns from A and enumerate all size logm subsets of B in time O(m c+1) which is polynomial in terms of the input size O(mn). We can show that with high probability, we can get a set of logm columns randomly selected from Q opt.
15 Algorithm 1 for The Consensus Submatrix Problem Input: one m n matrix A, integers l and k, and ɛ > 0 Output: a size l subset P of rows, a size k subset Q of columns and a length k consensus vector z Step 1: randomly select a set B of (c + 1)( 4 log m + 1) columns from A. ɛ 2 (1.1) for every size 4 log m ɛ 2 subset R of B do (1.2) for every z R Σ R do (a) Select the best l rows P = {p 1,..., p l } that minimize d(z R, x i R ). (b) for each column j do Compute f (j) = l i=1 d(s j, a pi,j), where s j is the majority element of the l rows in P in column j. Select the best k columns Q = {q 1,..., q k } with minimum value f (j) and let z(q) = s q1 s q2... s qk. (c) Calculate H = l i=1 d(xp i Q, z) of this solution. Step 2: Output P, Q and z with minimum H.
16 Lemma 1: With probability at most m 2 ɛ 2 c 2 (c+1), no subset R of size 4 log m used in Step 1 of Algorithm 1 satisfies ɛ 2 R Q opt. Lemma 2: Assume R = 4 log m and R Q ɛ 2 opt. Let ρ = k R. With probability at most m 1, there is a row x i in X satisfying d(z opt, x i Qopt ) ɛk ρ > d(z opt R, x i R ). With probability at most m 1 3, there is a row x i in X satisfying d(z opt R, x i R ) > d(z opt, x i Qopt ) + ɛk. ρ
17 Lemma 3: When R Q opt and z R = z opt R, with probability at most 2m 1 3, the set of rows P = {p 1,..., p l } selected in Step 1 (a) of Algorithm 1 satisfies l i=1 d(z opt, x pi Qopt ) > H opt + 2ɛkl. Theorem 2: For any δ > 0, with probability at least 2 1 m 8c δ 2 c 2 (c+1) 2m 1 3, Algorithm 1 will output a solution with consensus score at most (1 + δ)h opt in O(nm O( 1 δ 2 ) ) time.
18 Thanks for Bottleneck Sub-matrix Problem Input: a n m matrix A, integers l and k. Output: a sub-matrix A P,Q of A with l rows and k columns and a consensus row z (of k elements) such that for any r i in P d(r i Q, z) d and d is minimized Here d(, ) is the Hamming distance.
19 Thanks Assumptions: d opt = MAX pi P opt d(x pi Qopt, z opt ) = O(k), d opt c = k and Q opt = k = O(n), k c = n. : (1) Use random sampling technique to know O(logm) bits of z opt and select l best rows like Algorithm 1. (2) Use linear programming and randomized rounding technique to select k columns in the matrix.
20 Thanks Linear programming Given a set of rows P = {p 1,..., p l }, we want to find a set of k columns Q and vector z such that bottleneck score is minimized. min d; Σ n y i,j = k, i=1 Σ j=1 j=1 y i,j 1, i = 1, 2,..., n, Σ n χ(π j, x ps,i)y i,j d, s = 1, 2,..., l. i=1 j=1 y i,j = 1 if and only if column i is in Q and the corresponding bit in z is π j. Here, for any a, b Σ, χ(a, b) = 0 if a = b and χ(a, b) = 1 if a b.
21 Thanks Randomized rounding To achieve two goals: (1) Select k columns, where k k δd opt. (2) Get integers values for y i,j such that the distance (restricted on the k selected columns) between any row in P and the center vector thus obtained is at most (1 + γ)d opt. Here δ > 0 and γ > 0 are two parameters used to control the errors.
22 Thanks Lemma 4: When nγ 2 3(cc ) 2 2 log m, for any γ, δ > 0, with probability at most exp( nδ2 ) + m 1, the rounding result 2(cc ) 2 y = {y 1,1,..., y 1, Σ,..., y n,1,..., y n, Σ } does not satisfy at least one of the following inequalities, Σ n ( y i,j) > k δd opt, i=1 j=1 and for every row x ps (s = 1, 2,..., l), Σ n ( χ(π j, x ps,i)y i,j) < d + γd opt. i=1 j=1
23 Thanks Algorithm 2 for The bottleneck Sub-matrix Problem Input: one matrix A Σ m n, integer l, k, a row z Σ n and small numbers ɛ > 0, γ > 0 and δ > 0. Output: a size l subset P of rows, a size k subset Q of columns and a length k consensus vector z. nγ 2 3(cc ) 2 if 2 log m then try all size k subset Q of the n columns and all z of length k to solve the problem. if > 2 log m then nγ 2 3(cc ) 2 Step 1: randomly select a set B of size subset R of B do 4 log m ɛ 2 4(c+1) log m columns from A. for every ɛ 2 for every z R Σ R do (a) Select the best l rows P = {p 1,..., p l } that minimize d(z R, x i R ). (b)solve the optimization problem by linear programming and randomized rounding to get Q and z. Step 2: Output P,Q and z with minimum bottleneck score d.
24 Thanks Lemma 5: When R Q opt and z R = z opt R, with probability at most 2m 1 3, the set of rows P = {p 1,..., p l } obtained in Step 1(a) of Algorithm 2 satisfies d(z opt, x pi Qopt ) > d opt + 2ɛk for some row x pi (1 i l). Theorem 3: With probability at least 1 m 2 ɛ 2 c 2 (c+1) 2m 1 3 exp( nδ2 ) m 1, Algorithm 2 2(cc ) 2 runs in time O(n O(1) m O( 1 ɛ γ 2 ) ) and obtains a solution with bottleneck score at most (1 + 2c ɛ + γ + δ)d opt for any fixed ɛ, γ, δ > 0.
25 Thanks Thanks Acknowledgements This work is fully supported by a grant from the Research Grants Council of the Hong Kong Special Administrative Region, China [Project No. CityU 1070/02E]. This work is collaborated with Dr. Lusheng Wang and Xiaowen Liu in City University of Hong Kong, Hong Kong, China.
26 Thanks Let X 1, X 2,..., X n be n independent random 0-1 variables, where X i takes 1 with probability p i, 0 < p i < 1. Let X = n i=1 X i, and µ = E[X ]. Then for any 0 < ɛ 1, Pr(X > µ + ɛ n) < e 1 3 nɛ2, Pr(X < µ ɛ n) e 1 2 nɛ2.
Closest String and Closest Substring Problems
January 8, 2010 Problem Formulation Problem Statement I Closest String Given a set S = {s 1, s 2,, s n } of strings each length m, find a center string s of length m minimizing d such that for every string
More informationACO Comprehensive Exam October 14 and 15, 2013
1. Computability, Complexity and Algorithms (a) Let G be the complete graph on n vertices, and let c : V (G) V (G) [0, ) be a symmetric cost function. Consider the following closest point heuristic for
More informationFisher Equilibrium Price with a Class of Concave Utility Functions
Fisher Equilibrium Price with a Class of Concave Utility Functions Ning Chen 1, Xiaotie Deng 2, Xiaoming Sun 3, and Andrew Chi-Chih Yao 4 1 Department of Computer Science & Engineering, University of Washington
More informationBounds for pairs in partitions of graphs
Bounds for pairs in partitions of graphs Jie Ma Xingxing Yu School of Mathematics Georgia Institute of Technology Atlanta, GA 30332-0160, USA Abstract In this paper we study the following problem of Bollobás
More informationCS6999 Probabilistic Methods in Integer Programming Randomized Rounding Andrew D. Smith April 2003
CS6999 Probabilistic Methods in Integer Programming Randomized Rounding April 2003 Overview 2 Background Randomized Rounding Handling Feasibility Derandomization Advanced Techniques Integer Programming
More informationLet S be a set of n species. A phylogeny is a rooted tree with n leaves, each of which is uniquely
JOURNAL OF COMPUTATIONAL BIOLOGY Volume 8, Number 1, 2001 Mary Ann Liebert, Inc. Pp. 69 78 Perfect Phylogenetic Networks with Recombination LUSHENG WANG, 1 KAIZHONG ZHANG, 2 and LOUXIN ZHANG 3 ABSTRACT
More informationTHE SZEMERÉDI REGULARITY LEMMA AND ITS APPLICATION
THE SZEMERÉDI REGULARITY LEMMA AND ITS APPLICATION YAQIAO LI In this note we will prove Szemerédi s regularity lemma, and its application in proving the triangle removal lemma and the Roth s theorem on
More informationOn the k-closest Substring and k-consensus Pattern Problems
On the k-closest Substring and k-consensus Pattern Problems Yishan Jiao 1,JingyiXu 1, and Ming Li 2 1 Bioinformatics lab, Institute of Computing Technology, Chinese Academy of Sciences, 6#, South Road,
More informationDistances and similarities Based in part on slides from textbook, slides of Susan Holmes. October 3, Statistics 202: Data Mining
Distances and similarities Based in part on slides from textbook, slides of Susan Holmes October 3, 2012 1 / 1 Similarities Start with X which we assume is centered and standardized. The PCA loadings were
More informationx 1 + x 2 2 x 1 x 2 1 x 2 2 min 3x 1 + 2x 2
Lecture 1 LPs: Algebraic View 1.1 Introduction to Linear Programming Linear programs began to get a lot of attention in 1940 s, when people were interested in minimizing costs of various systems while
More information6.842 Randomness and Computation Lecture 5
6.842 Randomness and Computation 2012-02-22 Lecture 5 Lecturer: Ronitt Rubinfeld Scribe: Michael Forbes 1 Overview Today we will define the notion of a pairwise independent hash function, and discuss its
More informationProblem Set 1: Solutions Math 201A: Fall Problem 1. Let (X, d) be a metric space. (a) Prove the reverse triangle inequality: for every x, y, z X
Problem Set 1: s Math 201A: Fall 2016 Problem 1. Let (X, d) be a metric space. (a) Prove the reverse triangle inequality: for every x, y, z X d(x, y) d(x, z) d(z, y). (b) Prove that if x n x and y n y
More informationNon-Interactive Zero Knowledge (II)
Non-Interactive Zero Knowledge (II) CS 601.442/642 Modern Cryptography Fall 2017 S 601.442/642 Modern CryptographyNon-Interactive Zero Knowledge (II) Fall 2017 1 / 18 NIZKs for NP: Roadmap Last-time: Transformation
More informationMatrices: 2.1 Operations with Matrices
Goals In this chapter and section we study matrix operations: Define matrix addition Define multiplication of matrix by a scalar, to be called scalar multiplication. Define multiplication of two matrices,
More informationOn the complexity of unsigned translocation distance
Theoretical Computer Science 352 (2006) 322 328 Note On the complexity of unsigned translocation distance Daming Zhu a, Lusheng Wang b, a School of Computer Science and Technology, Shandong University,
More informationFast Algorithms for Constant Approximation k-means Clustering
Transactions on Machine Learning and Data Mining Vol. 3, No. 2 (2010) 67-79 c ISSN:1865-6781 (Journal), ISBN: 978-3-940501-19-6, IBaI Publishing ISSN 1864-9734 Fast Algorithms for Constant Approximation
More informationApproximation Algorithms
Approximation Algorithms What do you do when a problem is NP-complete? or, when the polynomial time solution is impractically slow? assume input is random, do expected performance. Eg, Hamiltonian path
More informationThe Algorithmic Aspects of the Regularity Lemma
The Algorithmic Aspects of the Regularity Lemma N. Alon R. A. Duke H. Lefmann V. Rödl R. Yuster Abstract The Regularity Lemma of Szemerédi is a result that asserts that every graph can be partitioned in
More informationOn approximation of real, complex, and p-adic numbers by algebraic numbers of bounded degree. by K. I. Tsishchanka
On approximation of real, complex, and p-adic numbers by algebraic numbers of bounded degree by K. I. Tsishchanka I. On approximation by rational numbers Theorem 1 (Dirichlet, 1842). For any real irrational
More informationTopics in Approximation Algorithms Solution for Homework 3
Topics in Approximation Algorithms Solution for Homework 3 Problem 1 We show that any solution {U t } can be modified to satisfy U τ L τ as follows. Suppose U τ L τ, so there is a vertex v U τ but v L
More informationReductions. Reduction. Linear Time Reduction: Examples. Linear Time Reductions
Reduction Reductions Problem X reduces to problem Y if given a subroutine for Y, can solve X. Cost of solving X = cost of solving Y + cost of reduction. May call subroutine for Y more than once. Ex: X
More information1 The Knapsack Problem
Comp 260: Advanced Algorithms Prof. Lenore Cowen Tufts University, Spring 2018 Scribe: Tom Magerlein 1 Lecture 4: The Knapsack Problem 1 The Knapsack Problem Suppose we are trying to burgle someone s house.
More informationApproximate Counting and Markov Chain Monte Carlo
Approximate Counting and Markov Chain Monte Carlo A Randomized Approach Arindam Pal Department of Computer Science and Engineering Indian Institute of Technology Delhi March 18, 2011 April 8, 2011 Arindam
More informationA conjecture on the alphabet size needed to produce all correlation classes of pairs of words
A conjecture on the alphabet size needed to produce all correlation classes of pairs of words Paul Leopardi Thanks: Jörg Arndt, Michael Barnsley, Richard Brent, Sylvain Forêt, Judy-anne Osborn. Mathematical
More informationICALP g Mingji Xia
: : ICALP 2010 g Institute of Software Chinese Academy of Sciences 2010.7.6 : 1 2 3 Matrix multiplication : Product of vectors and matrices. AC = e [n] a ec e A = (a e), C = (c e). ABC = e,e [n] a eb ee
More informationInteger Factorization using lattices
Integer Factorization using lattices Antonio Vera INRIA Nancy/CARAMEL team/anr CADO/ANR LAREDA Workshop Lattice Algorithmics - CIRM - February 2010 Plan Introduction Plan Introduction Outline of the algorithm
More informationStatistics and data analyses
Statistics and data analyses Designing experiments Measuring time Instrumental quality Precision Standard deviation depends on Number of measurements Detection quality Systematics and methology σ tot =
More informationStatistics 202: Data Mining. c Jonathan Taylor. Week 2 Based in part on slides from textbook, slides of Susan Holmes. October 3, / 1
Week 2 Based in part on slides from textbook, slides of Susan Holmes October 3, 2012 1 / 1 Part I Other datatypes, preprocessing 2 / 1 Other datatypes Document data You might start with a collection of
More informationPart I. Other datatypes, preprocessing. Other datatypes. Other datatypes. Week 2 Based in part on slides from textbook, slides of Susan Holmes
Week 2 Based in part on slides from textbook, slides of Susan Holmes Part I Other datatypes, preprocessing October 3, 2012 1 / 1 2 / 1 Other datatypes Other datatypes Document data You might start with
More informationFactoring Algorithms Pollard s p 1 Method. This method discovers a prime factor p of an integer n whenever p 1 has only small prime factors.
Factoring Algorithms Pollard s p 1 Method This method discovers a prime factor p of an integer n whenever p 1 has only small prime factors. Input: n (to factor) and a limit B Output: a proper factor of
More informationSZEMERÉDI S REGULARITY LEMMA FOR MATRICES AND SPARSE GRAPHS
SZEMERÉDI S REGULARITY LEMMA FOR MATRICES AND SPARSE GRAPHS ALEXANDER SCOTT Abstract. Szemerédi s Regularity Lemma is an important tool for analyzing the structure of dense graphs. There are versions of
More informationApproximation Algorithms and Hardness of Approximation. IPM, Jan Mohammad R. Salavatipour Department of Computing Science University of Alberta
Approximation Algorithms and Hardness of Approximation IPM, Jan 2006 Mohammad R. Salavatipour Department of Computing Science University of Alberta 1 Introduction For NP-hard optimization problems, we
More informationA Polytope for a Product of Real Linear Functions in 0/1 Variables
Don Coppersmith Oktay Günlük Jon Lee Janny Leung A Polytope for a Product of Real Linear Functions in 0/1 Variables Original: 29 September 1999 as IBM Research Report RC21568 Revised version: 30 November
More informationLecture 2. MATH3220 Operations Research and Logistics Jan. 8, Pan Li The Chinese University of Hong Kong. Integer Programming Formulations
Lecture 2 MATH3220 Operations Research and Logistics Jan. 8, 2015 Pan Li The Chinese University of Hong Kong 2.1 Agenda 1 2 3 2.2 : a linear program plus the additional constraints that some or all of
More informationLecture 5: Two-point Sampling
Randomized Algorithms Lecture 5: Two-point Sampling Sotiris Nikoletseas Professor CEID - ETY Course 2017-2018 Sotiris Nikoletseas, Professor Randomized Algorithms - Lecture 5 1 / 26 Overview A. Pairwise
More informationImproved Parallel Approximation of a Class of Integer Programming Problems
Improved Parallel Approximation of a Class of Integer Programming Problems Noga Alon 1 and Aravind Srinivasan 2 1 School of Mathematical Sciences, Raymond and Beverly Sackler Faculty of Exact Sciences,
More informationMath 1302 Notes 2. How many solutions? What type of solution in the real number system? What kind of equation is it?
Math 1302 Notes 2 We know that x 2 + 4 = 0 has How many solutions? What type of solution in the real number system? What kind of equation is it? What happens if we enlarge our current system? Remember
More informationSignal Recovery from Permuted Observations
EE381V Course Project Signal Recovery from Permuted Observations 1 Problem Shanshan Wu (sw33323) May 8th, 2015 We start with the following problem: let s R n be an unknown n-dimensional real-valued signal,
More informationProbability and Samples. Sampling. Point Estimates
Probability and Samples Sampling We want the results from our sample to be true for the population and not just the sample But our sample may or may not be representative of the population Sampling error
More informationON COST MATRICES WITH TWO AND THREE DISTINCT VALUES OF HAMILTONIAN PATHS AND CYCLES
ON COST MATRICES WITH TWO AND THREE DISTINCT VALUES OF HAMILTONIAN PATHS AND CYCLES SANTOSH N. KABADI AND ABRAHAM P. PUNNEN Abstract. Polynomially testable characterization of cost matrices associated
More informationAnalysis and Design of Algorithms Dynamic Programming
Analysis and Design of Algorithms Dynamic Programming Lecture Notes by Dr. Wang, Rui Fall 2008 Department of Computer Science Ocean University of China November 6, 2009 Introduction 2 Introduction..................................................................
More informationInteger Linear Programs
Lecture 2: Review, Linear Programming Relaxations Today we will talk about expressing combinatorial problems as mathematical programs, specifically Integer Linear Programs (ILPs). We then see what happens
More informationRANK AND PERIMETER PRESERVER OF RANK-1 MATRICES OVER MAX ALGEBRA
Discussiones Mathematicae General Algebra and Applications 23 (2003 ) 125 137 RANK AND PERIMETER PRESERVER OF RANK-1 MATRICES OVER MAX ALGEBRA Seok-Zun Song and Kyung-Tae Kang Department of Mathematics,
More informationHEURISTICS FOR TWO-MACHINE FLOWSHOP SCHEDULING WITH SETUP TIMES AND AN AVAILABILITY CONSTRAINT
HEURISTICS FOR TWO-MACHINE FLOWSHOP SCHEDULING WITH SETUP TIMES AND AN AVAILABILITY CONSTRAINT Wei Cheng Health Monitor Network, Parmus, NJ John Karlof Department of Mathematics and Statistics University
More informationTechnische Universität München, Zentrum Mathematik Lehrstuhl für Angewandte Geometrie und Diskrete Mathematik. Combinatorial Optimization (MA 4502)
Technische Universität München, Zentrum Mathematik Lehrstuhl für Angewandte Geometrie und Diskrete Mathematik Combinatorial Optimization (MA 4502) Dr. Michael Ritter Problem Sheet 1 Homework Problems Exercise
More informationZero-sum square matrices
Zero-sum square matrices Paul Balister Yair Caro Cecil Rousseau Raphael Yuster Abstract Let A be a matrix over the integers, and let p be a positive integer. A submatrix B of A is zero-sum mod p if the
More informationRandom Variable. Pr(X = a) = Pr(s)
Random Variable Definition A random variable X on a sample space Ω is a real-valued function on Ω; that is, X : Ω R. A discrete random variable is a random variable that takes on only a finite or countably
More informationA Note on the Density of the Multiple Subset Sum Problems
A Note on the Density of the Multiple Subset Sum Problems Yanbin Pan and Feng Zhang Key Laboratory of Mathematics Mechanization, Academy of Mathematics and Systems Science, Chinese Academy of Sciences,
More informationScheduling on Unrelated Parallel Machines. Approximation Algorithms, V. V. Vazirani Book Chapter 17
Scheduling on Unrelated Parallel Machines Approximation Algorithms, V. V. Vazirani Book Chapter 17 Nicolas Karakatsanis, 2008 Description of the problem Problem 17.1 (Scheduling on unrelated parallel machines)
More informationCS 372: Computational Geometry Lecture 14 Geometric Approximation Algorithms
CS 372: Computational Geometry Lecture 14 Geometric Approximation Algorithms Antoine Vigneron King Abdullah University of Science and Technology December 5, 2012 Antoine Vigneron (KAUST) CS 372 Lecture
More informationDiameter of the Zero Divisor Graph of Semiring of Matrices over Boolean Semiring
International Mathematical Forum, Vol. 9, 2014, no. 29, 1369-1375 HIKARI Ltd, www.m-hikari.com http://dx.doi.org/10.12988/imf.2014.47131 Diameter of the Zero Divisor Graph of Semiring of Matrices over
More informationLecture Semidefinite Programming and Graph Partitioning
Approximation Algorithms and Hardness of Approximation April 16, 013 Lecture 14 Lecturer: Alantha Newman Scribes: Marwa El Halabi 1 Semidefinite Programming and Graph Partitioning In previous lectures,
More informationLecture 9 Nearest Neighbor Search: Locality Sensitive Hashing.
COMS 4995-3: Advanced Algorithms Feb 15, 2017 Lecture 9 Nearest Neighbor Search: Locality Sensitive Hashing. Instructor: Alex Andoni Scribes: Weston Jackson, Edo Roth 1 Introduction Today s lecture is
More informationAlgorithms for Molecular Biology
Algorithms for Molecular Biology BioMed Central Research A polynomial time biclustering algorithm for finding approximate expression patterns in gene expression time series Sara C Madeira* 1,2,3 and Arlindo
More informationAssignment 2 : Probabilistic Methods
Assignment 2 : Probabilistic Methods jverstra@math Question 1. Let d N, c R +, and let G n be an n-vertex d-regular graph. Suppose that vertices of G n are selected independently with probability p, where
More informationMatrices. Chapter Definitions and Notations
Chapter 3 Matrices 3. Definitions and Notations Matrices are yet another mathematical object. Learning about matrices means learning what they are, how they are represented, the types of operations which
More informationSeparation Techniques for Constrained Nonlinear 0 1 Programming
Separation Techniques for Constrained Nonlinear 0 1 Programming Christoph Buchheim Computer Science Department, University of Cologne and DEIS, University of Bologna MIP 2008, Columbia University, New
More informationProofs Not Based On POMI
s Not Based On POMI James K. Peterson Department of Biological Sciences and Department of Mathematical Sciences Clemson University February 1, 018 Outline Non POMI Based s Some Contradiction s Triangle
More informationSTABILITY OF JOHNSON S SCHEDULE WITH LIMITED MACHINE AVAILABILITY
MOSIM 01 du 25 au 27 avril 2001 Troyes (France) STABILITY OF JOHNSON S SCHEDULE WITH LIMITED MACHINE AVAILABILITY Oliver BRAUN, Günter SCHMIDT Department of Information and Technology Management Saarland
More informationHandout 5. α a1 a n. }, where. xi if a i = 1 1 if a i = 0.
Notes on Complexity Theory Last updated: October, 2005 Jonathan Katz Handout 5 1 An Improved Upper-Bound on Circuit Size Here we show the result promised in the previous lecture regarding an upper-bound
More informationData Mining and Matrices
Data Mining and Matrices 08 Boolean Matrix Factorization Rainer Gemulla, Pauli Miettinen June 13, 2013 Outline 1 Warm-Up 2 What is BMF 3 BMF vs. other three-letter abbreviations 4 Binary matrices, tiles,
More informationApproximation Schemes for Job Shop Scheduling Problems with Controllable Processing Times
Approximation Schemes for Job Shop Scheduling Problems with Controllable Processing Times Klaus Jansen 1, Monaldo Mastrolilli 2, and Roberto Solis-Oba 3 1 Universität zu Kiel, Germany, kj@informatik.uni-kiel.de
More informationLP based heuristics for the multiple knapsack problem. with assignment restrictions
LP based heuristics for the multiple knapsack problem with assignment restrictions Geir Dahl and Njål Foldnes Centre of Mathematics for Applications and Department of Informatics, University of Oslo, P.O.Box
More informationCONSTRUCTION OF SLICED SPACE-FILLING DESIGNS BASED ON BALANCED SLICED ORTHOGONAL ARRAYS
Statistica Sinica 24 (2014), 1685-1702 doi:http://dx.doi.org/10.5705/ss.2013.239 CONSTRUCTION OF SLICED SPACE-FILLING DESIGNS BASED ON BALANCED SLICED ORTHOGONAL ARRAYS Mingyao Ai 1, Bochuan Jiang 1,2
More informationEE512: Error Control Coding
EE512: Error Control Coding Solution for Assignment on Linear Block Codes February 14, 2007 1. Code 1: n = 4, n k = 2 Parity Check Equations: x 1 + x 3 = 0, x 1 + x 2 + x 4 = 0 Parity Bits: x 3 = x 1,
More informationFinding Consensus Strings With Small Length Difference Between Input and Solution Strings
Finding Consensus Strings With Small Length Difference Between Input and Solution Strings Markus L. Schmid Trier University, Fachbereich IV Abteilung Informatikwissenschaften, D-54286 Trier, Germany, MSchmid@uni-trier.de
More informationLecture 9: Submatrices and Some Special Matrices
Lecture 9: Submatrices and Some Special Matrices Winfried Just Department of Mathematics, Ohio University February 2, 2018 Submatrices A submatrix B of a given matrix A is any matrix that can be obtained
More informationCOVARIANCE IDENTITIES AND MIXING OF RANDOM TRANSFORMATIONS ON THE WIENER SPACE
Communications on Stochastic Analysis Vol. 4, No. 3 (21) 299-39 Serials Publications www.serialspublications.com COVARIANCE IDENTITIES AND MIXING OF RANDOM TRANSFORMATIONS ON THE WIENER SPACE NICOLAS PRIVAULT
More informationCHAPTER 6. Direct Methods for Solving Linear Systems
CHAPTER 6 Direct Methods for Solving Linear Systems. Introduction A direct method for approximating the solution of a system of n linear equations in n unknowns is one that gives the exact solution to
More informationApproximability of Dense Instances of Nearest Codeword Problem
Approximability of Dense Instances of Nearest Codeword Problem Cristina Bazgan 1, W. Fernandez de la Vega 2, Marek Karpinski 3 1 Université Paris Dauphine, LAMSADE, 75016 Paris, France, bazgan@lamsade.dauphine.fr
More information1 Column Generation and the Cutting Stock Problem
1 Column Generation and the Cutting Stock Problem In the linear programming approach to the traveling salesman problem we used the cutting plane approach. The cutting plane approach is appropriate when
More informationRECAP How to find a maximum matching?
RECAP How to find a maximum matching? First characterize maximum matchings A maximal matching cannot be enlarged by adding another edge. A maximum matching of G is one of maximum size. Example. Maximum
More informationLinear and Integer Programming - ideas
Linear and Integer Programming - ideas Paweł Zieliński Institute of Mathematics and Computer Science, Wrocław University of Technology, Poland http://www.im.pwr.wroc.pl/ pziel/ Toulouse, France 2012 Literature
More informationThe University of Texas at Austin Department of Electrical and Computer Engineering. EE381V: Large Scale Learning Spring 2013.
The University of Texas at Austin Department of Electrical and Computer Engineering EE381V: Large Scale Learning Spring 2013 Assignment 1 Caramanis/Sanghavi Due: Thursday, Feb. 7, 2013. (Problems 1 and
More informationOn The Computational Complexity Of Bi-Clustering
On The Computational Complexity Of Bi-Clustering by Sharon Wulff A thesis presented to the University of Waterloo in fulfillment of the thesis requirement for the degree of Master of Mathematics in Computer
More informationLinear Classification: Linear Programming
Linear Classification: Linear Programming Yufei Tao Department of Computer Science and Engineering Chinese University of Hong Kong 1 / 21 Y Tao Linear Classification: Linear Programming Recall the definition
More informationLecture 4 Scheduling 1
Lecture 4 Scheduling 1 Single machine models: Number of Tardy Jobs -1- Problem 1 U j : Structure of an optimal schedule: set S 1 of jobs meeting their due dates set S 2 of jobs being late jobs of S 1 are
More informationConsensus Optimizing Both Distance Sum and Radius
Consensus Optimizing Both Distance Sum and Radius Amihood Amir 1, Gad M. Landau 2, Joong Chae Na 3, Heejin Park 4, Kunsoo Park 5, and Jeong Seop Sim 6 1 Bar-Ilan University, 52900 Ramat-Gan, Israel 2 University
More informationCSE 431/531: Analysis of Algorithms. Dynamic Programming. Lecturer: Shi Li. Department of Computer Science and Engineering University at Buffalo
CSE 431/531: Analysis of Algorithms Dynamic Programming Lecturer: Shi Li Department of Computer Science and Engineering University at Buffalo Paradigms for Designing Algorithms Greedy algorithm Make a
More informationNational University of Singapore Department of Electrical & Computer Engineering. Examination for
National University of Singapore Department of Electrical & Computer Engineering Examination for EE5139R Information Theory for Communication Systems (Semester I, 2014/15) November/December 2014 Time Allowed:
More informationSource Coding. Master Universitario en Ingeniería de Telecomunicación. I. Santamaría Universidad de Cantabria
Source Coding Master Universitario en Ingeniería de Telecomunicación I. Santamaría Universidad de Cantabria Contents Introduction Asymptotic Equipartition Property Optimal Codes (Huffman Coding) Universal
More informationAlgorithms for Data Science: Lecture on Finding Similar Items
Algorithms for Data Science: Lecture on Finding Similar Items Barna Saha 1 Finding Similar Items Finding similar items is a fundamental data mining task. We may want to find whether two documents are similar
More informationSequential pairing of mixed integer inequalities
Sequential pairing of mixed integer inequalities Yongpei Guan, Shabbir Ahmed, George L. Nemhauser School of Industrial & Systems Engineering, Georgia Institute of Technology, 765 Ferst Drive, Atlanta,
More informationQ(x n+1,..., x m ) in n independent variables
Cluster algebras of geometric type (Fomin / Zelevinsky) m n positive integers ambient field F of rational functions over Q(x n+1,..., x m ) in n independent variables Definition: A seed in F is a pair
More informationCS 6820 Fall 2014 Lectures, October 3-20, 2014
Analysis of Algorithms Linear Programming Notes CS 6820 Fall 2014 Lectures, October 3-20, 2014 1 Linear programming The linear programming (LP) problem is the following optimization problem. We are given
More informationA Branch-and-Cut Algorithm for the Stochastic Uncapacitated Lot-Sizing Problem
Yongpei Guan 1 Shabbir Ahmed 1 George L. Nemhauser 1 Andrew J. Miller 2 A Branch-and-Cut Algorithm for the Stochastic Uncapacitated Lot-Sizing Problem December 12, 2004 Abstract. This paper addresses a
More informationBipartite Perfect Matching
Bipartite Perfect Matching We are given a bipartite graph G = (U, V, E). U = {u 1, u 2,..., u n }. V = {v 1, v 2,..., v n }. E U V. We are asked if there is a perfect matching. A permutation π of {1, 2,...,
More informationCONSTRUCTION OF SLICED ORTHOGONAL LATIN HYPERCUBE DESIGNS
Statistica Sinica 23 (2013), 1117-1130 doi:http://dx.doi.org/10.5705/ss.2012.037 CONSTRUCTION OF SLICED ORTHOGONAL LATIN HYPERCUBE DESIGNS Jian-Feng Yang, C. Devon Lin, Peter Z. G. Qian and Dennis K. J.
More informationSolving the Order-Preserving Submatrix Problem via Integer Programming
Solving the Order-Preserving Submatrix Problem via Integer Programming Andrew C. Trapp, Oleg A. Prokopyev Department of Industrial Engineering, University of Pittsburgh, Pittsburgh, Pennsylvania 15261,
More informationFaster Algorithms for some Optimization Problems on Collinear Points
Faster Algorithms for some Optimization Problems on Collinear Points Ahmad Biniaz Prosenjit Bose Paz Carmi Anil Maheshwari J. Ian Munro Michiel Smid June 29, 2018 Abstract We propose faster algorithms
More informationLecture 13 October 6, Covering Numbers and Maurey s Empirical Method
CS 395T: Sublinear Algorithms Fall 2016 Prof. Eric Price Lecture 13 October 6, 2016 Scribe: Kiyeon Jeon and Loc Hoang 1 Overview In the last lecture we covered the lower bound for p th moment (p > 2) and
More informationCS Homework Chapter 6 ( 6.14 )
CS50 - Homework Chapter 6 ( 6. Dan Li, Xiaohui Kong, Hammad Ibqal and Ihsan A. Qazi Department of Computer Science, University of Pittsburgh, Pittsburgh, PA 560 Intelligent Systems Program, University
More informationApproximations for MAX-SAT Problem
Approximations for MAX-SAT Problem Chihao Zhang BASICS, Shanghai Jiao Tong University Oct. 23, 2012 Chihao Zhang (BASICS@SJTU) Approximations for MAX-SAT Problem Oct. 23, 2012 1 / 16 The Weighted MAX-SAT
More informationConcentration function and other stuff
Concentration function and other stuff Sabrina Sixta Tuesday, June 16, 2014 Sabrina Sixta () Concentration function and other stuff Tuesday, June 16, 2014 1 / 13 Table of Contents Outline 1 Chernoff Bound
More informationApproximations for MAX-SAT Problem
Approximations for MAX-SAT Problem Chihao Zhang BASICS, Shanghai Jiao Tong University Oct. 23, 2012 Chihao Zhang (BASICS@SJTU) Approximations for MAX-SAT Problem Oct. 23, 2012 1 / 16 The Weighted MAX-SAT
More informationCHAPTER 10 Shape Preserving Properties of B-splines
CHAPTER 10 Shape Preserving Properties of B-splines In earlier chapters we have seen a number of examples of the close relationship between a spline function and its B-spline coefficients This is especially
More information5 Integer Linear Programming (ILP) E. Amaldi Foundations of Operations Research Politecnico di Milano 1
5 Integer Linear Programming (ILP) E. Amaldi Foundations of Operations Research Politecnico di Milano 1 Definition: An Integer Linear Programming problem is an optimization problem of the form (ILP) min
More informationCSCI1950 Z Computa4onal Methods for Biology Lecture 4. Ben Raphael February 2, hhp://cs.brown.edu/courses/csci1950 z/ Algorithm Summary
CSCI1950 Z Computa4onal Methods for Biology Lecture 4 Ben Raphael February 2, 2009 hhp://cs.brown.edu/courses/csci1950 z/ Algorithm Summary Parsimony Probabilis4c Method Input Output Sankoff s & Fitch
More informationConference Program Design with Single-Peaked and Single-Crossing Preferences
Conference Program Design with Single-Peaked and Single-Crossing Preferences Dimitris Fotakis 1, Laurent Gourvès 2, and Jérôme Monnot 2 1 School of Electrical and Computer Engineering, National Technical
More informationApplied Mathematics Letters
Applied Mathematics Letters (009) 15 130 Contents lists available at ScienceDirect Applied Mathematics Letters journal homepage: www.elsevier.com/locate/aml Spectral characterizations of sandglass graphs
More information