Segment Problem. Xue Wang, Fasheng Qiu Sushil K. Prasad, Guantao Chen Georgia State University, Atlanta, GA April 21, 2010

Similar documents
Efficient Algorithms for Locating the Length-Constrained Heaviest Segments, with Applications to Biomolecular Sequence Analysis

Constrained Minkowski Sum Selection and Finding

Whole Genome Alignments and Synteny Maps

Introduction to numerical computations on the GPU

Analysis and Design of Algorithms Dynamic Programming

An Introduction to Sequence Similarity ( Homology ) Searching

Shortest Lattice Vector Enumeration on Graphics Cards

Dense Arithmetic over Finite Fields with CUMODP

Theoretical aspects of ERa, the fastest practical suffix tree construction algorithm

Sparse LU Factorization on GPUs for Accelerating SPICE Simulation

On the Monotonicity of the String Correction Factor for Words with Mismatches

Measuring freeze-out parameters on the Bielefeld GPU cluster

R ij = 2. Using all of these facts together, you can solve problem number 9.

Comparative genomics: Overview & Tools + MUMmer algorithm

Using a CUDA-Accelerated PGAS Model on a GPU Cluster for Bioinformatics

Hybrid CPU/GPU Acceleration of Detection of 2-SNP Epistatic Interactions in GWAS

MACFP: Maximal Approximate Consecutive Frequent Pattern Mining under Edit Distance

Dictionary Matching in Elastic-Degenerate Texts with Applications in Searching VCF Files On-line

Dynamic Programming. Shuang Zhao. Microsoft Research Asia September 5, Dynamic Programming. Shuang Zhao. Outline. Introduction.

I519 Introduction to Bioinformatics, Genome Comparison. Yuzhen Ye School of Informatics & Computing, IUB

Locating regions in a sequence under density constraints

Practical Free-Start Collision Attacks on 76-step SHA-1

Computing Maximum Subsequence in Parallel

Efficient Polynomial-Time Algorithms for Variants of the Multiple Constrained LCS Problem

Multivariate density estimation and its applications

Optimization Techniques for Parallel Code 1. Parallel programming models

Let S be a set of n species. A phylogeny is a rooted tree with n leaves, each of which is uniquely

TR A Comparison of the Performance of SaP::GPU and Intel s Math Kernel Library (MKL) for Solving Dense Banded Linear Systems

Greedy Algorithms. CS 498 SS Saurabh Sinha

Parallelization of Molecular Dynamics (with focus on Gromacs) SeSE 2014 p.1/29

Mayer Sampling Monte Carlo Calculation of Virial. Coefficients on Graphics Processors

SP-CNN: A Scalable and Programmable CNN-based Accelerator. Dilan Manatunga Dr. Hyesoon Kim Dr. Saibal Mukhopadhyay

Solving PDEs with CUDA Jonathan Cohen

Searching Sear ( Sub- (Sub )Strings Ulf Leser

S0214 : GPU Based Stacking Sequence Generation For Composite Skins Using GA

Lecture 2: Pairwise Alignment. CG Ron Shamir

CRYPTOGRAPHIC COMPUTING

Notes on MapReduce Algorithms

Pattern Matching (Exact Matching) Overview

Background. Another interests. Sieve method. Parallel Sieve Processing on Vector Processor and GPU. RSA Cryptography

I519 Introduction to Bioinformatics, Genome Comparison. Yuzhen Ye School of Informatics & Computing, IUB

Practical Free-Start Collision Attacks on full SHA-1

RGP finder: prediction of Genomic Islands

Shortest paths with negative lengths

Perm State University Research-Education Center Parallel and Distributed Computing

StreamSVM Linear SVMs and Logistic Regression When Data Does Not Fit In Memory

Parallel Transposition of Sparse Data Structures

Accelerating Model Reduction of Large Linear Systems with Graphics Processors

Scaled gradient projection methods in image deblurring and denoising

Welcome to MCS 572. content and organization expectations of the course. definition and classification

Bisection Ideas in End-Point Conditioned Markov Process Simulation

Jacobi-Based Eigenvalue Solver on GPU. Lung-Sheng Chien, NVIDIA

Population annealing study of the frustrated Ising antiferromagnet on the stacked triangular lattice

How many double squares can a string contain?

Compressed Index for Dynamic Text

A CUDA Solver for Helmholtz Equation

Prediction o f of cis -regulatory B inding Binding Sites and Regulons 11/ / /

arxiv: v1 [hep-lat] 7 Oct 2010

Edge-pancyclicity of Möbius cubes

Tips Geared Towards R. Adam J. Suarez. Arpil 10, 2015

Minimal Height and Sequence Constrained Longest Increasing Subsequence

The Lattice Boltzmann Method for Laminar and Turbulent Channel Flows

A Phylogenetic Network Construction due to Constrained Recombination

On the Number of Distinct Squares

Smith et al. American Journal of Botany 98(3): Data Supplement S2 page 1

Randomized Selection on the GPU. Laura Monroe, Joanne Wendelberger, Sarah Michalak Los Alamos National Laboratory

Small RNA in rice genome

Bio nformatics. Lecture 3. Saad Mneimneh


On the Computational Complexity of the Discrete Pascal Transform

S XMP LIBRARY INTERNALS. Niall Emmart University of Massachusetts. Follow on to S6151 XMP: An NVIDIA CUDA Accelerated Big Integer Library

Real-time signal detection for pulsars and radio transients using GPUs

Analytical Modeling of Parallel Programs (Chapter 5) Alexandre David

Optimal Folding Of Bit Sliced Stacks+

3. SEQUENCE ANALYSIS BIOINFORMATICS COURSE MTAT

Parallel Performance Theory - 1

CycleTandem: Energy-Saving Scheduling for Real-Time Systems with Hardware Accelerators

COL 730: Parallel Programming

Quicksort. Where Average and Worst Case Differ. S.V. N. (vishy) Vishwanathan. University of California, Santa Cruz

Parallel Rabin-Karp Algorithm Implementation on GPU (preliminary version)

RESEARCH ON THE DISTRIBUTED PARALLEL SPATIAL INDEXING SCHEMA BASED ON R-TREE

Accelerating linear algebra computations with hybrid GPU-multicore systems.

Research on GPU-accelerated algorithm in 3D finite difference neutron diffusion calculation method

On-line String Matching in Highly Similar DNA Sequences

Streaming and communication complexity of Hamming distance

10-810: Advanced Algorithms and Models for Computational Biology. microrna and Whole Genome Comparison

Sequence Database Search Techniques I: Blast and PatternHunter tools

Cpt S 223. School of EECS, WSU

Chromosomal rearrangements in mammalian genomes : characterising the breakpoints. Claire Lemaitre

A Self-Stabilizing Algorithm for Finding a Minimal Distance-2 Dominating Set in Distributed Systems

Hidden Markov Models. Main source: Durbin et al., Biological Sequence Alignment (Cambridge, 98)

sri 2D Implicit Charge- and Energy- Conserving Particle-in-cell Application Using CUDA Christopher Leibs Karthik Murthy

Introduction to sequence alignment. Local alignment the Smith-Waterman algorithm

Lecture 8 Learning Sequence Motif Models Using Expectation Maximization (EM) Colin Dewey February 14, 2008

Analytical Modeling of Parallel Systems

Parallel Asynchronous Hybrid Krylov Methods for Minimization of Energy Consumption. Langshi CHEN 1,2,3 Supervised by Serge PETITON 2

Information Sciences Institute 22 June 2012 Bob Lucas, Gene Wagenbreth, Dan Davis, Roger Grimes and

Lecture 14: Multiple Sequence Alignment (Gene Finding, Conserved Elements) Scribe: John Ekins

OPTIMAL PARALLEL SUPERPRIMITIVITY TESTING FOR SQUARE ARRAYS

Sequence Alignment. Johannes Starlinger

Transcription:

Efficient Algorithms for Maximum Density Segment Problem Xue Wang, Fasheng Qiu Sushil K. Prasad, Guantao Chen Georgia State University, Atlanta, GA 30303 April 21, 2010

Introduction Outline Origin of the maximum density problem Description of the maximum density problem Current progress Why more efficient algorithm is needed Our contribution Our Approach Basic concepts Overall idea Find right skew pointers efficiently Three steps to find maximum density segment. Implementation and Experimental Results Comparison with naïve algorithm Some additional words Ongoing and Future Work

Origin In DNA sequence, GC content is positively correlated with gene length, gene density, patterns of codon usage, recombination rate within chromosomes, Find a segment where the density of C and G is maximum http://www.topnews.in/health/files/dna.jpg

Origin http://home.cc.umanitoba.ca/~psgendb/biolegato/multalign/defensin.jalview.gif Sequence alignment Each aligned column has an alignment score Identify segment that is aligned best from output sequence

Problem Definition S = s 1 s n =(a 1,w 1 ), (a 2,w 2 ),, (a n,w n ). w i is the width of s i 0 1 2 3 4 5 6 7 1 1 0 5 22 4 1 1 μ( i, j) = i k j i k j ( ak ) S[ i, j] = ( w ) w ( i, j ) k µ(i,j) is the density of S[i,j]. C min is the width constraint Definition: for given S and C find i j such that C <= Definition: for given S and C min, find i, j such that C min <= w(i,j) and µ(i,j) is maximized.

An Example 0 1 2 3 4 5 6 7 1 1 0 5 22 4 1 1 w i = 1, 0 i 7 a1 = 1, a2 = 1,., a7 = 1 C min = 2

An Example 0 1 2 3 4 5 6 7 1 1 0 5 22 4 1 1 w i = 1, 0 i 7 a1 = 1, a2 = 1,., a7 = 1 C min = 3

Current Status O(n) algorithm for fixed width. O(nlogC min ) algorithm proposed in 2002. Two optimal sequential algorithms (O(n)) proposed in 2005. 2-1 3 1 7 0 1-4 6 4 5 3 0-2 -4-8 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 Treatment of current position depends on previous position No non trivial parallel algorithm has been known.

Why more efficient algorithms are needed Super large sequences: The longest chromosome from human has approximately 2,220,000,000220 000 000 base pairs. http://www.odec.ca/projects/2005/anna5m0/public_html/images/chromosome.gif Output maximum density segments starting from each index. Current best sequential algorithm is O(nlogC min ) 0 1 2 3 4 5 6 7 1 1 0 5 2 4 1 1

Our Contribution Proved a Combination Lemma. Proposed n processor O(log 2 n) algorithm. Implemented the proposed algorithm on GPU Implemented the proposed algorithm on GPU using CUDA.

Some Thoughts About the C min n processor, O(n) time complexity n 2 processor, O(logn) time complexity cost is = Ω(n 2 ) 0 1 2 3 4 5 6 7 1 1 0 5 2 4 1 1 width w i = 1 i Is there a better way?

Introduction Outline Origin of the maximum density problem Description of the maximum density problem Current progress Why more efficient algorithm is needed Our contribution Our Approach Basic concepts Overall idea Find right skew pointers efficiently Three steps to find maximum density segment. Implementation and Experimental Results Comparison with naïve algorithm Some additional words Ongoing and Future Work

Basic Concepts/DRSP [Lin et al. ] DRSP: Decreasing right skew partition Existence and uniqueness S 1 S 2 S 3 S 4 S 5 S 6 s 1 s 2 s 3 s 4 s 5 2-1 3 1 7 0 1-4 6 4 5 3 0-2 -4-8 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 Decreasing: µ(s 1 ) > µ(s 2 ) > µ(s 3 ) > µ(s 4 ) > µ(s 5 ) > µ(s 6 ) Right skew: the density of anyprefix sequence isno morethanthe Right skew: the density of any prefix sequence is no more than the density of remaining suffix : µ(s 1 ) µ(s 2 s 3 s 4 s 5 ), µ(s 1 s 2 ) µ(s 3 s 4 s 5 ),

Basic Concepts/DRSP [Lin et al. ] Bitonic property p S 1 S 2 S 3 S 4 S 5 S 6 S L 1 1 2-1 3 1 7 0 1-4 6 4 5 3 0-2 -4-8 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 µ(s L S 1 ) µ(s L S 1 S 2 ) µ(s L S 1 S 2 S 3 S 4 S 5 S 6 ).

Basic Concepts/DRSP [Lin et al. ] Bitonic property p Atomic property: if µ(s L S 1 ) µ(s L ), then µ(s L S 1 )>=µ(s L s 1 s i ), 1<= i <=5 S 1 S 2 S 3 S 4 S 5 S 6 S L 1 1 s 1 s 2 s 3 s 4 s 5 2-1 3 1 7 0 1-4 6 4 5 3 0-2 -4-8 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 µ(s L S 1 ) =14/7=2, µ(s L s 1 )=4/3, µ(s L s 1 s 2 )=3/4,

General Idea If for each i, 1<= i <=n, we know the DRSP of s i+li s n With one processor, the maximum density segment beginning with ican be identified within O(log(n)) by binary search. With n processors, maximum density segment beginning with i can be identified in O(log(n)) in parallel S 2 S 3 S 4 S 5 S 6 s i 2-1 3 1 7 0 1-4 6 4 5 3 0-2 -4-8 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 s i+li

General Idea Efficiently find the DRSP of each suffix of S?

Right Skew Pointers [Goldwasser et al.] The right skew pointer tells us the DRSP of s i s n (The use of right skew pointers) For each s i, find the longest right skew segment beginning with s i (Lemma) (How to find right skew pointers) 2-1 3 1 7 0 1-4 6 4 5 3 0-2 -4-8 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16

Efficiently Find Right Skew Pointers 2 1 3 1 7 0 1 4 6 4 5 3 0 2 4 8 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 Left half: S L Right half: S R 2-1 3 1 7 0 1-4 6 4 5 3 0-2 -4-8 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 As a whole

Efficiently Find Right Skew Pointers Guarantee that the construction of right skew pointers can be done in a parallel and binary fashion.

General Steps Step 1: Find right skew pointers for S (log 2 n) 2-1 3 1 7 0 1-4 6 4 5 3 0-2 -4-8 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 Step 2: Find for each i, i* such that w(i, i*) C min and µ(i, i*) reaches maximum. (logn) 2 1 3 1 7 0 1 4 6 4 5 3 0 2 4 8 1 1* Step 3: Identify the maximum density segment from all S[i, i*]. (logn)

Efficiently Find Right Skew Pointers

Introduction Outline Origin of the maximum density problem Description of the maximum density problem Current progress Why more efficient algorithm is needed Our contribution Our Approach Basic concepts Overall idea Find right skew pointers efficiently Three steps to find maximum density segment. Implementation and Experimental Results Comparison with naïve algorithm Some additional words Ongoing and Future Work

CUDA Architecture nvidia GTX280

Finding good partner with CUDA n elements, block size t, #blocks is n/t. t 512 due to the required shared memory. t = 512 for maximal use of shared memory. Find right skew pointers. Seq. is divided up to equal-size sub sequences (except the last one). -Find right skew pointers for each block (subsequence) intra-block communication (through shared memory, less expensive). Each block processes a subsequence of length t. - Find right skew pointers for the entire sequence (by melding the RSP of all subsequences). inter-block communication (through global memory, much expensive). Find good partners Each thread finds the good partner for its corresponding element.

Performance O(n 2 )

Performance

Some Additional Words We have implemented optimal sequential algorithm One a single CPU of GTX 280, out of memory above 3 million size Current parallel algorithm should be further optimized.

Ongoing g and Future Work Algorithm for both lower and upper bound is available Optimize CUDA code and compare parallel algorithm with optimal sequential algorithm Apply to real sequences Reduce memory requirement Cost optimal parallel algorithm

Thanks! Questions?

References Yaw Ling Lin, Tao Jiang, and Kun Mao Chao. Efficient algorithms for locating the length constrained heaviest segments with applications to biomolecularlar sequence ence analysis. Journal of Computer and System Sciences, 65(3):570 586, 2002. Michael H. Goldwasser, Ming Yang Kao, and Hsueh I Lu. Linear time algorithms for computing maximum density sequence segments with bioinformatics applications. Journal of Computer and System Sciences, 70(2):128 144, 2005. Kai min Chung and Hsueh I Lu. An optimal algorithm for the maximum density segment problem. SIAM J Comput, 34:373 387, 2005. X. Huang. Local rates of recombination are positively correlated with gc content in the human genome. Mol Biol Evol, 10:219 225, 1994.