Computational Genetics Winter 2013 Lecture 10 Eleazar Eskin University of California, Los ngeles
Pair End Sequencing Lecture 10. February 20th, 2013 (Slides from Ben Raphael)
Chromosome Painting: Normal Cells
Chromosome Painting: Tumor Cells
Rearrangements in Tumors Change gene structure and regulatory wiring of the genome. Create bad novel fusion genes and break good old genes. Example: translocation in leukemia. Chromosome 9 promoter Chromosome 22 BL gene promoter BCR gene promoter BCR-BL oncogene n Gleevec TM (Novartis 2001) targets BCR-BL oncogene.
Complex Tumor Genomes 1) What are detailed architectures of tumor genomes? 2) What rearrangements/duplications produce these architectures and what is the order of these events? 3) What are the novel fusion genes and old broken genes?
Genome rearrangements Mouse (X chrom.) Unknown ancestor ~ 80 million years ago (X chrom.) n What are the the architectural blocks forming the existing genomes and how to find them? n What is the architecture of the ancestral genome? n What is the evolutionary scenario for transforming one genome into the other?
History of Chromosome X Rat Consortium, Nature, 2004
Inversions 1 2 3 9 10 8 4 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 7 6 5 n Blocks represent conserved genes. n In the course of evolution or in a clinical context, blocks 1,2,,10 could be misread as: 1, 2, 3, -8, -7, -6, -5, -4, 9, 10.
Inversions 1 2 3 9 10 8 4 1, 2, 3, -8, -7, -6, -5, -4, 9, 10 7 6 5 n n n n Blocks represent conserved genes. In the course of evolution or in a clinical context, blocks 1,,10 could be misread as 1, 2, 3, -8, -7, -6, -5, -4, 9, 10. Evolution: occurred one-two times every million years. Cancer: may occur every month.
Inversions 1 2 3 9 10 8 4 1, 2, 3, -8, -7, -6, -5, -4, 9, 10 7 6 5 The inversion introduced two breakpoints (disruptions in order).
Measuring Structural Changes in Tumors: Cytogenetics Directly visualize (fluorescently) labeled chromosomes. Chromosome banding, mfish, SKY Weakness: Physical location of chromosomal junctions not revealed. Low resolution. No/little information about copy number changes.
Paired End Sequencing (PE) C. Collins et al. (UCSF Cancer Center) Tumor DN 1) Pieces of tumor genome: clones (100-250kb). DN x y 2) Sequence ends of clones (500bp). 3) Map end sequences to human genome. Each clone corresponds to pair of end sequences (PE pair) (x,y). Typical Next Generation Sequencing read lengths are shorter.
PE Pairs Order PE pair such that x < y. PE pair (x,y) is valid if x,y on same chromosome. and l y x L, min (max) size of clone. x, y have opposite, convergent orientations invalid, otherwise. Results from rearrangement or experimental noise. L x y x 1 x x 3 x 4 y 1 y x 5 y y y 2 2 5 4 3
Tumor Genome Reconstruction Puzzle Reconstruct tumor genome x 1 x 2 B C D E genome (known) Unknown sequence of rearrangements Tumor genome -C -D B E (unknown) Map PE pairs to human genome. x 3 x 4 y 1 y x 5 y 2 5 y 4 y 3 Location of PE pairs in human genome. (known)
Tumor Genome Reconstruction E B Tumor -D -C B C D E
Tumor Genome Reconstruction E B Tumor -D -C B C D E
Tumor Genome Reconstruction E B (x 3,y 3 ) Tumor (x 2,y -D 2 ) -C (x 4,y 4 ) (x 1,y 1 ) B C D E x 1 x 2 x 3 x 4 y 1 y 2 y 4 y 3
PE Plot E (x 3,y 3 ) (x 4,y 4 ) D C B (x 1,y 1 ) (x 2,y 2 ) 2D Representation of PE Data Each point is PE pair. Can we reconstruct the tumor genome from the positions of the PE pairs? B C D E
E D C B 2D Representation of PE Data Each point is PE pair. Can we reconstruct the tumor genome from the positions of the PE pairs? B C D E
E D C B 2D Representation of PE Data Each point is PE pair. Can we reconstruct the tumor genome from the positions of the PE pairs? B C D E
E D C B 2D Representation of PE Data Each point is PE pair. Can we reconstruct the tumor genome from the positions of the PE pairs? B C D E
E D C B 2D Representation of PE Data Each point is PE pair. Can we reconstruct the tumor genome from the positions of the PE pairs? B C D E
E D C B 2D Representation of PE Data Each point is PE pair. Can we reconstruct the tumor genome from the positions of the PE pairs? B C D E
E D C B 2D Representation of PE Data Each point is PE pair. Can we reconstruct the tumor genome from the positions of the PE pairs? B C D E
E D C B 2D Representation of PE Data Each point is PE pair. Can we reconstruct the tumor genome from the positions of the PE pairs? B C D E
E D C B 2D Representation of PE Data Each point is PE pair. Can we reconstruct the tumor genome from the positions of the PE pairs? B C D E
E E D -D C -C B B Reconstructed Tumor Genome B C D E -C -D B E
Real data noisy and incomplete!
Reconstruction of Tumor History n Use knowledge of known rearrangement mechanisms e.g. inversions, translocations, etc. n Find simplest explanation for data, given these mechanisms. n Motivation: Sorting by Reversals
n G = [0,M], unichromosomal genome. n Reversal ρ s,t (x)= x, if x < s or x > t, PE Sorting Problem t (x s), otherwise. B C G x 1 s y 1 t x 2 y 2 ρ -B x 1 y G = ρg 1 x 2 y 2 Given: PE pairs (x 1, y 1 ),, (x n, y n ) Find: Minimum number of reversals ρ s1,t1,, ρ sn, tn such that if ρ = ρ s1,t1 ρ sn, tn then (ρ x 1, ρ y 1 ),, (ρ x n, ρ y n ) are valid PE pairs.
B C x 1 s x x y 1 t 2 3 y 3 y 2 -C -B x 1 y 1 y 3 x 3 x 2 y 2 ρ t s Sequence of reversals. s t ll ES pairs valid.
Breast Cancer MCF7 Cell Line chromosomes MCF7 chromosomes 5 inversions 15 translocations Raphael et al. 2003.
Complications with MCF7: Chromosomes 1,3,17, 20 33/70 clusters Total length: 31Mb
s t C -B s t C B inversion u B u w D B C D v E w C D v E duplication/ transposition u B w C v E???? Rearrangement Signatures Tumor s t -B s t -C B D C D translocation
Complex Tumor Genomes
Structure of Duplications in Tumors? Duplicated segments may co-localize (Guan et al. Nat.Gen.1994) genome Tumor genome n Mechanisms not well understood.
Tumor mplisomes