Supplementary materials

Mitochondrial diversity. The mtDNA sequences used to compare nucleotide diversity between Culicidae species were available on GenBank: Ae. albopictus (cytb: AJ970990-AJ9702, AY072044; COI: AF253022, AY072044, AY666-671, AY101848-AY101854, DQ181451, DQ181457, DQ181458, DQ397908-DQ397912), Ae. aegypti (cytb: AJ970943-AJ970958; ND4: AF203344-AF203366, AF334841-AF334859, AF334861-AF334865), Ae. caspius (COI: FJ210902-FJ210908; COII: DQ300479-DQ300499), Ae. vexans (COI: AY645241-AY645247; COII: AY645304-AY645309, GU229896); C. pipiens (ND4: AY793688-AY793693, EF028084, EF030092, EF033661; COI: AJ557889, AJ557891, AJ557892, AJ633083-AJ633086, AY33086, GQ255648-GQ255651, GQ255659-GQ25564, GQ255666; COII: EU014281, EU014282, L344351); Culex sp. (ND4: AY793694-AY7937003), C. tarsalis (nad4: EF125799-EF125862), An. aconitus (COI: AY423055, DQ000253-DQ000264; COII: AJ194448-AJ194451, AJ547367-AJ547369, AY626951-AY626978), An. funestus (cytb: AF062501-AF062511), An. gambiae (COI: AF020967, AF020968, AF020970, AF020971, AF020973, AF020980, AF020988, AF020989, AF020991-AF02093, AF020998, AF020999, AF022, AF023, AF021011-AF021023) and An. maculipennis (COI: AF342716-AF342722, AF491682-AF491736).

| Gene | Putative product | Locus tag in wPip(Pel) | Primers (5'-3') | Size (bp) | Number of alleles found in this study (accession numbers) | References |
|---|---|---|---|---|---|---|
| gatB | Glutamyl-tRNA(Gln) amidotransferase, subunit B | WPa_0087 | gatB_F1-GAKTTAAAYCGYGCAGGBGTT; gatB_R1-TGGYAAYTCRGGYAAAGATGA | 369 | 1 | (Baldo et al. 2006) |
| coxA | Cytochrome c oxidase, subunit I | WPa_0082 | coxA_F1-TTGGRGCRATYAACTTTATAG; coxA_R1-CTAAAGACTTTKACRCCAGT | 402 | 1 | (Baldo et al. 2006) |
| hcpA | Conserved hypothetical protein | WPa_1214 | hcpA_F1-GAAATARCAGTTGCTGCAAA; hcpA_R1-GAAAGTYRAGCAAGYTCTG | 444 | 1 | (Baldo et al. 2006) |
| ftsZ | Cell division protein | WPa_0577 | ftsZ_F1-ATYATGGARCATATAAARGATAG; ftsZ_R1-TCRAGYAATGGATTRGATAT | 435 | 1 | (Baldo et al. 2006) |
| fbpA | Fructose-bisphosphate aldolase | WPa_1081 | fbpA_F1-GCTGCTCCRCTTGGYWTGAT; fbpA_R1-CCRCCAGARAAAAYYACTATTC | 429 | 1 | (Baldo et al. 2006) |
| wsp | Surface protein | WPa_0937 | 81F-TGGTCCAATAAGTGATGAAGAAAC; 691R-AAAAATTAAACGCTACTCCA | 602 | 1 | (Braig et al. 1998) |
| MutL | DNA mismatch repair protein | WPa_0278 | F-ACTTCATTGCCCTTCCAGCT; R-GGCATCAAATTAAGGGACA | 0-1,063 | 6 (HQ709389-HQ709394) | This study |
| ank2 | Ankyrin domain protein | WPa_0652 | F-CTTCTTCTGTGAGTGTACGT; R2-TCCATATCGATCTACTGCGT | 313-511 | 5 (AM397068-AM397072) | (Duron et al. 2007) |
| pk1 | Ankyrin domain protein | WPa_0256 (1); WPa_0313 (2); WPa_1306 (3) | F-CCACTACATTGCGCTATAGA; R-ACAGTAGAACTACACTCCTCCA | 1,334-1,349 | 5 (AM397075-AM397079) | (Sinkins et al. 2005); (Duron et al. 2007) |
| pk2 | Ankyrin domain protein | WPa_0299 (1); WPa_0413 (2) | F-ATTATGATAAAGCTTGGTAAGAA; R-TTAGCCCTTCATAAATAGCTT | 453 | 4 (AM397073-AM397073; DQ000471-DQ000472) | (Sinkins et al. 2005); (Duron et al. 2007) |
| GP12 | Phage related DNA methylase-like protein | WPa_0258 (1); WPa_0317 (2); WPa_1310 (3); WPa_0429 (4) | F-ATGAATTTAGCAATCCACTACT; R-TTACTAAATAACAGACATATTGCT | 1,215-1,302 | 7 (GU827985-GU827987; HQ709395-HQ709398) | (Atyame et al. in press) |
| GP15 (=vrlc) | Phage related probable secretory protein | WPa_1322 | F1-ACCATTACAGAACTTGAGGA; R1-TAGACGTTCATAGGCAACCA; F2-ACCTGACTCTGCAGTACTTGA; R2-ACTGCTTCTCTCATAAATTCA | 1,511-1,538 | 7 (GU827988-GU827991; HQ709399-HQ709401) | (Duron, Fort, and Weill 2006); (Atyame et al. in press) |
| RepA | Phage related replication protein | WPa_1312 | Tr1e-F1-ACTTTAGAGGGGTGCTTTCT; Tr1e-R2-ACAAACAACGGCACAGATT | 583-1,501 | 2 (AJ646884; AJ646887) | (Duron et al. 2005) |

Table S1. List of primers and characteristics of genes used to examine the Wolbachia polymorphism.
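Several MLST primers in Table S1 contain IUPAC degenerate bases (K, Y, R, B, W), so each primer name stands for a small pool of concrete oligonucleotides. The following is a minimal sketch of how such a primer can be expanded; the function name `expand_degenerate` and the lookup table `IUPAC` are illustrative names, not part of the original study, and the gatB forward primer is written in uppercase here.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "K": "GT", "M": "AC",
    "S": "CG", "W": "AT", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}

def expand_degenerate(primer):
    """Return every concrete sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]

# gatB forward primer from Table S1: K, Y, Y and B are degenerate positions.
gatb_f1 = "GAKTTAAAYCGYGCAGGBGTT"
variants = expand_degenerate(gatb_f1)
print(len(variants))  # 2 * 2 * 2 * 3 = 24 concrete sequences
```

The degeneracy of a primer is simply the product of the number of bases allowed at each position, which is why a handful of ambiguity codes already yields a pool of 24 sequences.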

| Forward primer | Sequence (5'-3') | Reverse primer | Sequence (5'-3') |
|---|---|---|---|
| 1F | AATGAATTGCCTGATAAAAAGGA | 417R | TGAAGAGGCAAAAGCTTGAGT |
| 161F^a | GCTATTGGGTTCATACCCCAC | 773R^a | GCTATTAATATTCAACCTAAG |
| 286F | TGGCTTGGTGCTTGAATAGGGT | 1442R | AATGGCTGAAGTTTAGGCGAT |
| 1254F | ACTAATAGCCTTCAAAGCTGA | 2123R | TGGATCTCCTCCTCCAATTGGA |
| 2045F | AGCTGGTGCTATTACTATGT | 3921R | AGTTAATCATCTAATAGGGGCT |
| 2768F | TCCAGATAGTTACTTAGCATGA | 4798R | AGCTCCAATAGCTCCTGT |
| 3738F | TTCATTAGATGACTGAAAGCA | 5968R | TTAGGTCGAAACTAATTGCA |
| 4781F | ACAGGAGCTATTGGAGCT | 7002R | CTTTTTTAGCAGGGTTTTATTC |
| 5949F | TGCAATTAGTTTCGACCTAA | 7723R^b | GGGTGGGATGGATTAGGATTGG |
| 6290F | CATCTTCAGTGTCATGCTCT | 8112R^b | GATTTGTGGTGTCAATGATA |
| 6981F^b | GAATAAAACCCTGCTAAAAAAG | 8871R | TGATTACCTAAGGCTCATGT |
| 7702F^b | CCAATCCTAATCCATCCCACCC | 9259R | AGCAAGAGAAAGAGTTGTACGA |
| 7940F | TGAAACAATTTCCCATTCA | 99R | AATAAAACTAATATTCCTCCT |
| 8636F | TGAGCAACAGAAGAATAAGCA | 11217R^c | ACTAAAGGATTAGCAGGAATGA |
| 8781F | GTAATAATCCATATCCTCCT | 12178R | TACGAGCGGTTGCTCAAACA |
| 9239F | CGTACAACTCTTTCTCTTGCT | 12409R | TACTAAGGAACAAACTTATCCT |
| 9851F | AGAAATCTCTTTGTCACTAACT | 13182R | TGAATGAGATATATACTGTCT |
| 10366F^c | CTTTATTAGTAACTGTAAAAATTAC | 13587R | TATTTTAAGGGATTAGCTTTAA |
| 10912F | ACAATGGATTTGAGGAGGA | 13706R | TAATTAGAAATGAAATGTTAATCG |
| 11985F | AGGAGTACGATTAGTTTCAGCT | 14067R | TTAAAGCTTAATTAGTAAAGTA |
| 12387F | AGGATAAGTTTGTTCCTTAGTAA | 14998R | AGCAATGGGAAGGCTTACACT |
| 12856F | TCCAACATCGAGGTCGCAATC | | |
| 13338F | GCCGAATTCCTTATTTAAACCTTTC | | |
| 13566F | TTAAAGCTAATCCCTTAAAATA | | |
| 13802F | ACCCTGATACACAAGGTACA | | |
| 14793F | AATTCACACAAAAATTTACATGT | | |

Table S2. List of primers used to examine the Culex pipiens mitochondrial polymorphism. The name of each primer indicates its position in the mitochondrial genome. ^a, ^b, ^c: primers used to amplify fragments of the ND2, ND5 and cytb genes, respectively. GenBank accession numbers: ND2 (HQ709410-HQ709413), ND5 (HQ724607-HQ724613), cytb (HQ709402-HQ709409), complete mitochondrial genomes (HQ724614-HQ724617).

| Gene | No. of alleles^a | Fragment size | % of VI^b | π^b | G+C content (%)^b | Ka/Ks^b | Intragenic recombination (Sawyer's test)^a |
|---|---|---|---|---|---|---|---|
| MutL | 6 | 960-1,023 | 7.1 | 0.03 | 35.2 | 0.25 | Yes (P < 10^-4) |
| ank2 | 5 | 273-471 | 3.3 | 0.01 | 38.6 | 0.00 | No (P = 0.33) |
| pk1 | 5 | 1,292-1,307 | 16.2 | 0.07 | 33.7 | 0.17 | Yes (P < 10^-4) |
| pk2 | 4 | 409 | 11.2 | 0.03 | 38.3 | 0.04 | Yes (P < 10^-4) |
| GP12 | 7 | 1,193-1,278 | 7.9 | 0.03 | 37.5 | 0.05 | Yes (P < 10^-4) |
| GP15 | 8^c | 1,470-1,497 | 12.4 | 0.04 | 37.6 | 0.12 | Yes (P < 10^-4) |
| RepA | 2 | 544-1,462 | 0.0 | 0.00 | 32.6 | 0.00 | not reliable |

Table S3. Genetic characteristics of the seven polymorphic genes used for wPip characterization. VI: number of variable sites; π: pairwise nucleotide diversity based on the average of all pairwise comparisons; ^a characteristics estimated considering indels in sequence alignments; ^b characteristics assessed excluding indels; ^c including the null GP15 wPip(JHB) allele. Note that primer regions were not considered in these analyses.
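Table S3 defines π as the average over all pairwise comparisons. As a minimal sketch of that definition, the function below averages the proportion of differing sites over every pair of aligned, gap-free sequences. The function name and the three toy sequences are invented for illustration; how the original analysis weighted alleles or handled alignment columns with indels is not specified beyond the footnotes above, so this should not be read as the authors' exact procedure.

```python
from itertools import combinations

def pairwise_diversity(seqs):
    """Mean proportion of differing sites over all pairs of aligned sequences."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    pairs = list(combinations(seqs, 2))
    total = 0.0
    for s1, s2 in pairs:
        # Count mismatched sites, then scale by alignment length.
        mismatches = sum(a != b for a, b in zip(s1, s2))
        total += mismatches / length
    return total / len(pairs)

# Toy alignment (hypothetical data, not taken from the study).
toy = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAT"]
pi = pairwise_diversity(toy)
print(round(pi, 4))
```

In the toy alignment the three pairs differ at 1, 2 and 1 of 10 sites, so π is (0.1 + 0.2 + 0.1) / 3.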

| Genes | MutL | ank2 | pk1 | pk2 | GP12 | GP15 | RepA |
|---|---|---|---|---|---|---|---|
| MutL | | 0.000*** | 0.000*** | 0.087 | 0.000*** | 0.000*** | 0.060 |
| ank2 | 0.966 | | 0.000*** | 0.013 | 0.000*** | 0.000*** | 0.009 |
| pk1 | 0.966 | 0.999 | | 0.020 | 0.000*** | 0.000*** | 0.010 |
| pk2 | 0.639 | 0.700 | 0.700 | | 0.478 | 0.206 | 0.008 |
| GP12 | 0.914 | 0.967 | 0.967 | 0.600 | | 0.000*** | 0.501 |
| GP15 | 0.911 | 0.999 | 0.999 | 0.700 | 0.999 | | 0.058 |
| RepA | 0.688 | 0.750 | 0.750 | 0.750 | 0.667 | 0.750 | |

Table S4. Linkage disequilibrium (LD) measures and tests of association between the wPip genes. The upper half shows probabilities based on the null hypothesis of random association of allelic diversity between loci. The lower half shows LD measures (D values). ***, the null hypothesis is rejected at α = 0.001, taking into account a Bonferroni adjustment for 21 comparisons.
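The 21 comparisons mentioned in the Table S4 caption correspond to the number of unordered pairs among the seven genes. The short sketch below reproduces that count and the per-test threshold under the standard Bonferroni rule (α divided by the number of tests); the exact threshold applied by the authors is not stated in the caption, so the value computed here is an assumption based on that standard rule.

```python
from math import comb

genes = ["MutL", "ank2", "pk1", "pk2", "GP12", "GP15", "RepA"]

# Number of unordered gene pairs tested for association.
n_comparisons = comb(len(genes), 2)

# Standard Bonferroni rule: divide the family-wise alpha by the number of tests.
alpha = 0.001
per_test_threshold = alpha / n_comparisons

print(n_comparisons)                 # 21
print(f"{per_test_threshold:.2e}")   # 4.76e-05
```

A probability reported as 0.000 in the upper half of the table is below this per-test threshold at the printed precision, whereas values such as 0.008 or 0.013 are not.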

| Gene | Position, direction of transcription | tRNA anticodon/position | Start codon | End codon |
|---|---|---|---|---|
| tRNA Ile | 2-69, CW | GAU/31-33 | | |
| tRNA Gln | 70-138, CCW | UUG/108-106 | | |
| tRNA Met | 142-210, CW | CAU/172-174 | | |
| ND2 | 211-1233, CW | _ | ATC (Ile) | TAA |
| tRNA Trp | 1235-1303, CW | UCA/1265-1267 | | |
| tRNA Cys | 1303-1369, CCW | GCA/1340-1338 | | |
| tRNA Tyr | 1382-1447, CCW | GUA/1416-1414 | | |
| COI | 1446-2983, CW | _ | TCG (Ser) | T |
| tRNA Leu | 2983-3049, CW | UAA/3012-3014 | | |
| COII | 3055-3739, CW | _ | ATG (Met) | T |
| tRNA Lys | 3740-3810, CW | CUU/3770-3772 | | |
| tRNA Asp | 3821-3888, CW | GUC/3852-3854 | | |
| ATPase8 | 3888-4050, CW | _ | ATT (Ile) | TAA |
| ATPase6 | 4044-4724, CW | _ | ATG (Met) | TAA |
| COIII | 4724-5512, CW | _ | ATG (Met) | TAA |
| tRNA Gly | 5512-5578, CW | UCC/5543-5545 | | |
| ND3 | 5578-5932, CW | _ | ATT (Ile) | TAA |
| tRNA Arg | 5931-5994, CW | UCG/5960-5962 | | |
| tRNA Ala | 5995-6060, CW | UGC/6024-6026 | | |
| tRNA Asn | 6061-6127, CW | GUU/6091-6093 | | |
| tRNA Ser | 6131-6197, CW | GCA/6170-6172 | | |
| tRNA Glu | 6198-6263, CW | UUC/6228-6230 | | |
| tRNA Phe | 6262-6328, CCW | GAA/6296-6294 | | |
| ND5 | 6329-8071, CCW | _ | GTG (Val) | TAA |
| tRNA His | 8072-8137, CCW | GUG/8107-8105 | | |
| ND4 | 8137-9480, CCW | _ | ATG (Met) | TAA |
| ND4L | 9474-9770, CCW | _ | ATG (Met) | TAA |
| tRNA Thr | 9776-9840, CW | UGU/9806-9808 | | |
| tRNA Pro | 9841-9906, CCW | UGG/9876-9874 | | |
| ND6 | 9907-10427, CW | _ | ATA (Met) | TAA |
| CytB | 10428-11567, CW | _ | ATG (Met) | TAA |
| tRNA Ser | 11562-11627, CW | UGA/11590-11592 | | |
| ND1 | 11646-12596, CCW | _ | TTG (Phe) | TAA |
| tRNA Leu | 12597-12663, CCW | UAG/12634-12632 | | |
| Large rRNA | 12664-13999, CCW | _ | | |
| tRNA Val | 14000-14071, CCW | UAC/14038-14036 | | |
| Small rRNA | 14072-14865, CCW | _ | | |
| A + T rich region | 14867-15587 | _ | | |

Table S5. Summary of the Culex pipiens mitochondrial genome. Position: expressed in nucleotides based on the Pel sequence. Direction of transcription: CW, clockwise; CCW, counterclockwise.

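| Mitotype | ND2 256 | ND2 470 | ND2 543 | ND2 591 | ND2 660 | ND5 7,061 | ND5 7,106 | ND5 7,280 | ND5 7,341 | ND5 7,345 | ND5 7,571 | ND5 7,824 | ND5 7,826 | ND5 7,927 | cytb 10,502 | cytb 10,554 | cytb 10,758 | cytb 10,887 | cytb 10,918 | cytb 10,943 | cytb 10,952 | cytb 11,118 | Mosquito line |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| pi1 | A | T | C | T | T | T | A | T | G | C | G | A | C | A | A | A | G | G | G | C | A | A | Pel |
| pi2 | G | - | - | - | - | - | - | - | - | - | - | G | - | - | - | G | - | - | - | - | - | - | Cot-A, Cot-B, Ma-B |
| pi3 | G | - | - | - | - | - | - | - | - | T | - | G | - | - | - | G | - | - | - | - | - | - | Ep-A, Ep-B |
| pi4 | G | - | - | C | - | A | - | - | - | - | - | G | - | - | - | G | - | A | - | - | - | - | Ko, Tn |
| pi5 | G | - | - | C | - | - | - | - | - | - | - | G | - | - | - | G | - | A | - | - | - | - | Bf-A |
| pi6 | G | - | - | - | A | - | G | - | - | - | - | G | T | - | - | G | - | - | - | - | - | G | Au |
| pi7 | G | - | - | - | A | - | - | - | - | - | - | G | T | - | - | G | - | - | - | - | - | G | Lv |
| pi8 | G | C | - | - | A | - | - | C | - | - | - | G | T | G | - | G | - | - | - | - | - | G | Ke-A |
| pi9 | G | C | - | - | A | - | - | - | - | - | - | G | T | - | - | G | - | - | - | - | - | G | Ke-B |
| pi10 | G | - | - | - | A | - | - | - | - | - | - | G | T | - | G | G | - | - | - | - | - | G | Bf-B, Mc |
| pi11 | G | - | - | - | A | - | - | - | - | - | - | G | T | - | G | G | - | - | - | - | G | G | Sl |
| pi12 | G | - | G | - | A | - | - | - | A | - | A | G | T | - | - | G | - | - | A | - | - | - | Is |
| pi13 | G | - | - | - | A | - | - | - | - | - | - | G | T | - | - | G | A | - | - | T | - | - | Ka-C |
| pi14 | G | - | - | - | A | - | - | - | - | - | - | G | T | - | - | G | - | - | - | T | - | - | Ma-A |

Table S6. Nucleotide polymorphism in the ND2, ND5 and cytb mitochondrial genes of Culex pipiens. Mosquito lines are listed according to mitotype (pi1 to pi14). Only polymorphic sites are indicated, and a dash indicates identity with the top sequence. Position: expressed in nucleotides based on the complete mitochondrial sequence of the Pel C. pipiens line.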
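Because Table S6 records each mitotype as differences from the top (pi1, Pel) sequence, with a dash meaning "same as pi1", the full haplotype at the 22 polymorphic sites has to be reconstructed before mitotypes can be compared with one another. The sketch below shows one way to do that; the helper names `expand_row` and `count_differences` are illustrative, and only the pi1 and pi4 rows of the table are used.

```python
def expand_row(reference, row):
    """Replace each dash in a mitotype row with the reference base at that site."""
    if len(reference) != len(row):
        raise ValueError("row and reference must cover the same sites")
    return "".join(ref if base == "-" else base
                   for ref, base in zip(reference, row))

def count_differences(hap1, hap2):
    """Number of polymorphic sites at which two expanded haplotypes differ."""
    return sum(a != b for a, b in zip(hap1, hap2))

# Rows pi1 (reference, Pel line) and pi4 (Ko, Tn lines) from Table S6.
pi1 = "ATCTTTATGCGACAAAGGGCAA"
pi4_row = "G--C-A-----G---G-A----"

pi4 = expand_row(pi1, pi4_row)
print(pi4)                          # GTCCTAATGCGGCAAGGAGCAA
print(count_differences(pi1, pi4))  # 6
```

Expanding every row this way also makes comparisons between two non-reference mitotypes straightforward, since a dash in both rows and an identical letter in both rows are then treated alike.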

Supplementary figures

Figure S1. Wolbachia phylogeny constructed using Bayesian inference on concatenated sequences of the five MLST genes gatB, coxA, hcpA, ftsZ and fbpA. Wolbachia of major supergroups (A, B, D, F and H) were included in the analysis to delineate the wPip group (highlighted). Host species of Wolbachia are reported, followed by the name of the Wolbachia strain. The scale bar is in units of substitutions/site.

Figure S2. Examples of recombination breakpoints along the pk1 (A, B), pk2 (C, D, E) and GP12 (F) sequences. For each alignment, only polymorphic sites around the breakpoints are shown. Polymorphisms shared with the underlined sequence are highlighted in grey. Arrows indicate the significant breakpoints and the nucleotide positions detected by Sawyer's procedure.

Figure S3. Mapping of the 13 genes examined in this study on the wPip(Pel) genome and on the five major contigs of the wPip(JHB) genome. Black boxes designate prophage genes, or genes inserted in phage regions. Lines connect orthologous genes. The wPip(JHB) genome description corresponds to the current situation and could change when the assembly is completed.

Figure S4. Wolbachia phylogenies constructed with six wPip polymorphic genes. A: MutL; B: ank2; C: pk1; D: pk2; E: GP12; F: GP15. The phylogeny of the RepA gene was not constructed because the polymorphism of this gene is based only on the presence or absence of the transposon Tr1. The scale bar is in units of substitutions/site.

Figure S5. Map of the Culex pipiens mitochondrial genome. The map has been linearized and nucleotide 1 is arbitrarily allocated to the tRNA Ile transcription start. All genes are indicated as boxes above (transcription from left to right) or below (transcription from right to left) the baseline. tRNAs are represented by the single-letter code for the cognate amino acid. Sites found to be polymorphic between the five C. pipiens mtDNA genomes (without the A+T rich region) are indicated by stars.

[Figure S1]

[Figure S2: sequence alignments, panels A-F, comparing wPip(Pel) and wPip(JHB) with other wPip strains (Is, Lv, Ka-C, Sl, Ep-A, Bf-B, Ma-A)]

[Figure S3: gene map of the wPip(Pel) genome (1,482,355 bp) and of wPip(JHB) contigs 1299 (478,325 bp), 1298 (316,943 bp), 1302 (42,565 bp), 1301 (126,623 bp) and 1300 (466,173 bp)]

[Figure S4: phylogenetic trees, panels A-F, for the wPip strains Pel, JHB, Cot-A, Cot-B, Ma-B, Bf-A, Ko, Tn, Lv, Ke-A, Ke-B, Au, Is, Ka-C, Ma-A, Ep-A, Ep-B, Bf-B, Mc and Sl]

[Figure S5: linearized map of the Culex pipiens mitochondrial genome showing ND2, COI, COII, ATPase8, ATPase6, COIII, ND3, ND5, ND4, ND4L, ND6, cytb, ND1, the large and small rRNAs, the tRNAs (single-letter amino acid codes) and the A+T rich region]

Literature Cited

Atyame, C., O. Duron, P. Tortosa, N. Pasteur, P. Fort, and M. Weill. 2011. Multiple Wolbachia determinants control the evolution of cytoplasmic incompatibilities in Culex pipiens mosquito populations. Mol Ecol (in press).

Baldo, L., J. C. Dunning Hotopp, K. A. Jolley, S. R. Bordenstein, S. A. Biber, R. R. Choudhury, C. Hayashi, M. C. Maiden, H. Tettelin, and J. H. Werren. 2006. Multilocus sequence typing system for the endosymbiont Wolbachia pipientis. Appl Environ Microbiol 72:7098-7110.

Braig, H. R., W. Zhou, S. L. Dobson, and S. L. O'Neill. 1998. Cloning and characterization of a gene encoding the major surface protein of the bacterial endosymbiont Wolbachia pipientis. J Bacteriol 180:2373-2378.

Duron, O., A. Boureux, P. Echaubard, A. Berthomieu, C. Berticat, P. Fort, and M. Weill. 2007. Variability and expression of ankyrin domain genes in Wolbachia variants infecting the mosquito Culex pipiens. J Bacteriol 189:4442-4448.

Duron, O., P. Fort, and M. Weill. 2006. Hypervariable prophage WO sequences describe an unexpected high number of Wolbachia variants in the mosquito Culex pipiens. Proc Biol Sci 273:495-502.

Duron, O., J. Lagnel, M. Raymond, K. Bourtzis, P. Fort, and M. Weill. 2005. Transposable element polymorphism of Wolbachia in the mosquito Culex pipiens: evidence of genetic diversity, superinfection and recombination. Mol Ecol 14:1561-1573.

Sinkins, S. P., T. Walker, A. R. Lynd, A. R. Steven, B. L. Makepeace, H. C. Godfray, and J. Parkhill. 2005. Wolbachia variability and host effects on crossing type in Culex mosquitoes. Nature 436:257-260.