Research Survey of human mitochondrial diseases using new genomic/proteomic tools

Similar documents
Survey of Human Mitochondrial Diseases Using New Genomic/Proteomic Tools

Rick Van Kooten, Indiana Univ., FNAL Users Meeting, HEPAP Session, June 27, 2000

råáp=== = = ======råáîéêëáíó=çñ=pìêêéó

Parts Manual. EPIC II Critical Care Bed REF 2031

2 Genome evolution: gene fusion versus gene fission

M i t o c h o n d r i a

A L A BA M A L A W R E V IE W

Leber s Hereditary Optic Neuropathy

Research Extracting biological information from DNA arrays: an unexpected link between arginine and methionine metabolism in Bacillus subtilis

What are mitochondria?

Minireview Rhesus factors and ammonium: a function in efflux?

Research Comparison of complete nuclear receptor sets from the human, Caenorhabditis elegans and Drosophila genomes

176 5 t h Fl oo r. 337 P o ly me r Ma te ri al s

Research The =/> fold uracil DNA glycosylases: a common origin with diverse fates

Chapter 30 Design and Analysis of

Introduction to Bioinformatics Integrated Science, 11/9/05

P a g e 5 1 of R e p o r t P B 4 / 0 9

NVLAP Proficiency Test Round 14 Results. Rolf Bergman CORM 16 May 2016

T h e C S E T I P r o j e c t

Hereditary Hemochromatosis

NACC Uniform Data Set (UDS) FTLD Module

Scale in the biological world

In-Depth Assessment of Local Sequence Alignment

# shared OGs (spa, spb) Size of the smallest genome. dist (spa, spb) = 1. Neighbor joining. OG1 OG2 OG3 OG4 sp sp sp

The Mitochondrion. Renáta Schipp

NACC Uniform Data Set (UDS) FTLD Module

Executive Committee and Officers ( )

Cellular Neuroanatomy I The Prototypical Neuron: Soma. Reading: BCP Chapter 2

7.06 Cell Biology EXAM #3 April 21, 2005

Energy and Cellular Metabolism

The Minimal-Gene-Set -Kapil PHY498BIO, HW 3

Subsystem: Succinate dehydrogenase

H STO RY OF TH E SA NT

74 ).20;51 /;),20)4 )+ /; % #& %`! MMM FF H= M F * 5 1) ; ) 5 1) ; *; *)+6-42; ;5)++0)41,- 1,7+-,+; )

Properties of amino acids in proteins

necessita d'interrogare il cielo

S E R U M O S M O L A L I T Y. Group B: hypothyroid animals. Group C: hyperthyroid animals

Newly made RNA is called primary transcript and is modified in three ways before leaving the nucleus:

Lecture 15: Realities of Genome Assembly Protein Sequencing

Bioinformatics Practical for Biochemists

Midterm Review Guide. Unit 1 : Biochemistry: 1. Give the ph values for an acid and a base. 2. What do buffers do? 3. Define monomer and polymer.

Mitochondrial DNA Medicine

råáp=== = = ======råáîéêëáíó=çñ=pìêêéó

Pushbutton Units and Indicator Lights

TCA Cycle. Voet Biochemistry 3e John Wiley & Sons, Inc.

L-NAME 10 mg/kg Control. N=5~6 * p<0.05. AG 10 mg/kg AG 5 mg/kg. * AG 20 mg/kg. Normal days

1 64,7+61 5= E?O =JAE J IFA?EBE?= JECA E = AHCE?DOFAHIA IEJELEJOHA=?JE MDE?D?=??KHKF? J=?J B= HC= EI MEJDI= E?O EJ

Translation. A ribosome, mrna, and trna.

Genetic defects causing mitochondrial respiratory chain disorders and disease

Mitochondria: Transducers or Time Bombs?

Colon. Duodenum 0,02 0,015 0,01 0,005. control PEMF exposure (wk) 0,02 0,015 0,01 0,005. control PEMF exposure (wk)

TRANSLATION: How to make proteins?

The discovery of the important role of vagal nerves in control of gastric and pancreatic secretion by IP Pavlov in 1890

ΔG o' = ηf ΔΕ o' = (#e ( V mol) ΔΕ acceptor

MATHEMATICAL MODELS - Vol. III - Mathematical Modeling and the Human Genome - Hilary S. Booth MATHEMATICAL MODELING AND THE HUMAN GENOME

Advanced Topics in RNA and DNA. DNA Microarrays Aptamers

Informatica e diritto, Vol. XIII, 2004, n. 1-2, pp

SUPPLEMENTARY INFORMATION

GUIDE. mirfieldshow.com. Sponsored by. Orange Design Studio.

$ 2 =JA IK> L=I?K =HHEI B=?J H FH BE A MDE?D CAIE F =JA AJ=CCHAC=JE EJIA B LA F OFDA HE?DA

Sequence analysis and comparison

wi& be demonstrated. Ilexbert Arkro,

BIOINFORMATICS LAB AP BIOLOGY

Objective: You will be able to justify the claim that organisms share many conserved core processes and features.

$ 1 64,7+61?= M = JA?D CE?= M 9D=JEI?= M IHA O E B H =JE >AE BH ACA AH=JE J JDA ANJE A? K EJO HE =I = HACE.

Secondary Structure. Bioch/BIMS 503 Lecture 2. Structure and Function of Proteins. Further Reading. Φ, Ψ angles alone determine protein structure

SUPPLEMENTARY EXPERIMENTAL PROCEDURES

Cell Organelles. a review of structure and function

Importance of Protein sorting. A clue from plastid development

OH BOY! Story. N a r r a t iv e a n d o bj e c t s th ea t e r Fo r a l l a g e s, fr o m th e a ge of 9

THE TRANSLATION PLANES OF ORDER 49 AND THEIR AUTOMORPHISM GROUPS

! 1 64,7+61 E?E I =HAE F HJ= J? B=?J HIE HAJD= A O = O A O =JE?FH JAE I 1JEI M JD=J EIFHAIA JE JDA?A JH= AHL KI IOIJA MDAHAEJH=EIAIA A?JHE?= =?J

One of the great challenges in molecular biology is to understand the mechanisms by

I N A C O M P L E X W O R L D

Introduction to Comparative Protein Modeling. Chapter 4 Part I

2011 The Simple Homeschool Simple Days Unit Studies Cells

Graduate Funding Information Center

GO ID GO term Number of members GO: translation 225 GO: nucleosome 50 GO: calcium ion binding 76 GO: structural

E *71,1 / 7 61, )6,1/16) 1*4) / ),* >O E?D=A A I * 5 =O'' 8EHCE E=2 OJA?D E?1 ELAHIEJO )6DAIEI5K>

Protein structure. Protein structure. Amino acid residue. Cell communication channel. Bioinformatics Methods

Reconstructing Mitochondrial Evolution?? Morphological Diversity. Mitochondrial Diversity??? What is your definition of a mitochondrion??

Comparing whole genomes

Multiple Choice Review- Eukaryotic Gene Expression

Glycogen. Glucose-6-Phosphate. Pyruvate. Acetyl-CoA

Regulatory Sequence Analysis. Sequence models (Bernoulli and Markov models)

Lecture 7 Cell Biolog y ٢٢٢ ١

Evolutionary Use of Domain Recombination: A Distinction. Between Membrane and Soluble Proteins

From Gene to Protein

MOLECULAR CELL BIOLOGY

(Lys), resulting in translation of a polypeptide without the Lys amino acid. resulting in translation of a polypeptide without the Lys amino acid.

What is the central dogma of biology?

Genome Annotation. Bioinformatics and Computational Biology. Genome sequencing Assembly. Gene prediction. Protein targeting.

Protein structure forces, and folding

I n t e r n a t i o n a l E l e c t r o n i c J o u r n a l o f E l e m e n t a r y E.7 d u, c ai ts is ou n e, 1 V3 1o-2 l6, I n t h i s a r t

Tools and Algorithms in Bioinformatics

Extranuclear Inheritance. Dr.Shivani Gupta, PGGCG-11, Chandigarh

Early History up to Schedule. Proteins DNA & RNA Schwann and Schleiden Cell Theory Charles Darwin publishes Origin of Species

SUPPLEMENTARY INFORMATION

PROTEIN EVOLUTION AND PROTEIN FOLDING: NON-FUNCTIONAL CONSERVED RESIDUES AND THEIR PROBABLE ROLE

Comparative genomics: Overview & Tools + MUMmer algorithm

Transcription:

http://genomebiology.com/2001/2/6/research/0021.1 Research Survey of human mitochondrial diseases using new genomic/proteomic tools 6D =I 2 =IJAHAH 6A F A 5 EJD V = @5? JJ+ DH W )@@HAIIAI *E A?K =H- CE AAHE C4AIA=H?D+A JAH= @,AF=HJ A J B2D=H =? CO * IJ 7 ELAHIEJO5?D B A@E?E A & - +? H@5JHAAJ * IJ ) & 75) V *E A?K =H- CE AAHE C4AIA=H?D+A JAH * IJ 7 ELAHIEJO+ ACA B- CE AAHE C!$ +K E CJ 5JHAAJ * IJ ) # 75) W *E A?K =H- CE AAHE C4AIA=H?D+A JAH= @,AF=HJ A J B+DA EIJHO * IJ 7 ELAHIEJO #'+ MA= JD)LA KA * IJ ) # 75) + HHAIF @A?A 5? JJ+ DH - =E DH(@=HME >K A@K Published: 1 June 2001 Genome Biology 2001, 2(6):research0021.1 0021.16 The electronic version of this article is the complete one and can be found online at http://genomebiology.com/2001/2/6/research/0021 2001 Plasterer et al., licensee BioMed Central Ltd (Print ISSN 1465-6906; Online ISSN 1465-6914) Abstract Background: We have constructed Bayesian prior-based, amino-acid sequence profiles for the complete yeast mitochondrial proteome and used them to develop methods for identifying and characterizing the context of protein mutations that give rise to human mitochondrial diseases. (Bayesian priors are conditional probabilities that allow the estimation of the likelihood of an event - such as an amino-acid substitution - on the basis of prior occurrences of similar events.) Because these profiles can assemble sets of taxonomically very diverse homologs, they enable identification of the structurally and/or functionally most critical sites in the proteins on the basis of the degree of sequence conservation. These profiles can also find distant homologs with determined threedimensional structures that aid in the interpretation of effects of missense mutations. Results: This survey reports such an analysis for 15 missense mutations, one insertion and three deletions involved in Leber s hereditary optic neuropathy, Leigh syndrome, mitochondrial neurogastrointestinal encephalomyopathy, Mohr-Tranebjaerg syndrome, iron-storage disorders related to Friedreich s ataxia, and hereditary spastic paraplegia. We present structural correlations for seven of the mutations. Conclusions: Of the 19 mutations analyzed, 14 involved changes in very highly conserved parts of the affected proteins. Five out of seven structural correlations provided reasonable explanations for the malfunctions. As additional genetic and structural data become available, this methodology can be extended. It has the potential for assisting in identifying new disease-related genes. Furthermore, profiles with structural homologs can generate mechanistic hypotheses concerning the underlying biochemical processes - and why they break down as a result of the mutations. Background )IMA LAE J JDAF IJ CA AAH= >E E B H =JE?IJ I FH LE@A = F MAHBK A= I B HC= E E C = @ E JACH=JE C E B H =JE HA =JA@J DAHA@EJ=HO@EIA=IAI 9AFHAIA JDAHA =FH J JOFAIJK@O BJDEI=FFH =?D 7IE C=@=J=>=IA B= Received: 8 February 2001 Revised: 3 April 2001 Accepted: 26 April 2001 E@A JEBE=> AFH JAE IJD=J? JHE>KJAJ JDAIJHK?JKHA= @ H >E IO JDAIEI BOA=IJ5=??D=H O?AI?AHALEIE=A EJ?D @HE= MAD=LACA AH=JA@= E>H=HO BFH JAE FH BE AI JD=J A => AIKIJ E@A JEBOD CIE =ME@AH= CA BCA AI E? K@E C DK = )I = HAIK J MA D=LA =L=E => A IAJI B comment reviews reports deposited research refereed research interactions information

2 Genome Biology Vol 2 No 6 2 =IJAHAHAJ= DECD O =??KH=JA = E =?E@ IAGKA?A = EC A JI LAH JDA? IAHLA@ HACE I JD=J? HHAIF @ J JDAIA FH BE AI *O =FFE C M DK = @EIA=IA HA =JA@ KJ=JE I J JDAIA = EC A JI MA?= E LAIJEC=JA JDA HA =JE IDEFI J IAGKA?A? IAHL=JE = @ J IAGKA?A L=HE=JE I 1 JDA?=IAI MDAHA A H HA B JDA E@A JEBEA@ D CI D=I = @AJAH E A@IJHK?JKHA JDAIA KJ=JE I?= >AE JAHFHAJA@E IJHK?JKH= JAH I = ME CKIJ @H=M A?D= EIJE?E BAHA?AI 9AD=LAKIA@FH BE A= = OIEI>=IA@ *=OAIE= FHE HI? @EJE = FH >=>E EJEAIJD=J= MJDAAIJE =JE BJDA E A E D @ B= ALA J JDA>=IEI BFHE H??KHHA?AI BIE E =H ALA JI J = A JDEI E>H=HO B IAGKA?A FH BE AI 6DA IJHA CJD B FHE H >=IA@ FH BE A = = OIEI EAI E EJI =>E EJO J?=FJKHA D C KI FH JAE @ =E I =I HA =JA@ > A?JI -=?D@ =E EIHAFHAIA JA@=I= =JHEN B= E =?E@FH > =>E EJEAI HA FHA?EIA O C E A ED @ L= KAI MDAHA A=?D F IEJE E JDA@ =E D=I= AIJE =JA@FH >=>E EJO B>AE C??KFEA@ >O = O F=HJE?K =H = E =?E@ 6DA IE E =HEJEAI H @ACHAAI B HAIE@KA? IAHL=JE >AJMAA D C KI IAGKA?AIHALA= F IEJE I?HEJE?= B HFH JAE IJHK?JKHA= @ BK?JE ) FH BE A E @K?A@ K JEF A = EC A J @EIF =OI E KJ=> AF IEJE I FDOIE??DA E?= O? IAHLA@F IEJE I = @ DECD O L=HE=> A F IEJE I 6DA AN= F AI FHAIA JA@ E JDEI F=FAH ID M D M IK?D = EC A JI FH LE@A E IECDJ E J DK = EJ?D @HE= F=JD CEAI EJ?D @HE= FH JAE I=HA= =JJH=?JELAIK>IAJ BJDA?A K =HFH JA AJ = = O A>A?=KIAJDAOFAHB H = K >AH B? HHA =JA@ BK?JE I = @ E JAH=?J ANJA IELA O MEJD A = JDAH 6DAEHFH FAHJEAI= @>AD=LE HHAB A?J= E JACH=JA@ IOIJA = @JDAABBA?JI B@EBBAHA J KJ=JE =?D= CAI?= >A A= E CBK O?H II? F=HA@ 6DAIKHLAO BJDEI E E= JKHAFH JA AID K @HALA= = O BJDABA=JKHAIJD=J?D=H =?JAHE A JDA =HCAH? F AJA FH JA AI B AK =HO JE??A JOFAI EJ?D @HE= F=JD CEAI = I FHAIA J K EGKA O? F ANAL KJE =HO= @CA AJE?BA=JKHAI=I=HAIK J BJDAEH HECE MEJDE HC= A AI@AHELA@BH =?EA JIO >E JI= @ JDA IJHE?J O =JAH = E DAHEJ=?A F=JJAH B MA@ >O = E F HJ= JIK>IAJ BJDAHA AL= JCA AI Mitochondrial genome overview 6DA EJ?D @HE= CA AEJIA BA? @AI O=D= @BK B CA AI 6DA =HCAIJ EJ?D @HE= CA A @EI? LAHA@ J @=JA BJDAFH JEIJ4A? E =I = AHE?= =? @AIB H '%CA AFH @K?JI$%FH JAE I= @!IJHK?JKH= 4 )IE = IF= B $'!" >=IA F=EHI 0K = EJ?D @HE=, )EI K?DI = AH A? @E C!%CA AFH @K?JI!FH JAE I H4 )I= @ J4 )I LAH$ #$'>=IAF=EHI! ) EJIA? @A@FH JAE I=HA@EHA?J OE L LA@E NE@=JELA FD IFD HO =JE = @ E? K@A? F A JI B? F AN 1 ),0 @ADO@H CA =IA IK>K EJI,,,!,",",#,$? F AN111?OJ?DH A>? F AN 18?OJ?DH A NE@=IA IK>K EJI 1 11 = @ 111 = @? F AN8 EJ?D @HE= )62=IAIK>K EJI$= @& " ) JDAH DK = CA AI A? @E C EJ?D @HE= FH JAE I J J= AIJE =JA@=J=H K @ # =HAJH= I?HE>A@E JDA K? AKI JH= I =JA@ E JDA?OJ F =I = @ JDAEH FH @K?JI E F HJA@ E J EJ?D @HE= 6DA E JAHF =O >AJMAA HC= A =H= @ K? A=HCA AI=?? K JIB H= =HCAF=HJ B JDA@ELAHIEJOE EJ?D @HE= F=JD CEAI -=?D DK =?A? J=E I K JEF A EJ?D @HE= BJA @O = E?= O E JAH? A?JA@ E =? F AN HAJE?K =H AJM H $ % = @ A=?D EJ?D @HE =O? J=E JA H HA EJ?D @HE=, ) J, ) A?K AI 7IK= O =? FEAI B J, ) =HA E@A JE?= = IJ=JA M =I D F =I O??=IE = O D MALAH J, ) KJ=JE I??KH = @ JDAHA =HEIA JM H HA F FK =JE I B J, ) = IJ=JA?= A@ DAJAH F =I O " 1 B=?J DAJAH F =I E? J, ) KCDJJ >A HA =JELA O?? IE@AHE CJD=JDK = J, ) KJ=JAI JE AI B=IJAH JD= K? A=H, ) =I = HAIK J B E =@A GK=JAFH BHA=@E C>O EJ?D @HE=, )F O AH=IAI " = @ E EJA@ J, )HAF=EH?=F=>E EJO 6DEIANFA?J=JE EIJ I AANJA J> H A KJ>OJDAHA =JELAFH E A?A B EJ?D @HE= @EI H@AHI=HEIE CBH KJ=JE IE JDA J, ) = JD KCD EJ EI = I E F HJ= J J JA JD=J IK?D KJ=JE I >AE C? F=H=JELA OA=IOJ E@A JEBO>OIAGKA?E C ME =J KH= OD=LA>AA = CJDABEHIJJ >A?D=H=?JAHE A@ Disease classification 5?D=FEH=D=I@ALEIA@=? =IIEBE?=JE I?DA AB H EJ?D @H E= @ABA?JI & >=IA@FHE =HE O MDAJDAH H JJDA=BBA?JA@ CA A FH @K?J EI @EHA?J O E L LA@ E NE@=JELA FD IFD HO = JE = @MDAJDAHEJEIA? @A@>O J, ) H K? A=H, ) )FHE =HO@ABA?JE NE@=JELAFD IFD HO =JE @ABE AI? =II 1,EIJE?J EJ?D @HE= F=JD CEAI =O LAH =F HAJD= A? =II B HAN= F A AECDIO @H AIAA>A M 5K>? =II 1=E? K@AI@EIA=IAI=HEIE CBH KJ=JE IE J, )CA AI A? @E C IK>K EJI B FH JAE I E L LA@ E NE@=JELA FD I FD HO =JE EJ?D @HE= J4 ) CA AI = @ EJ?D @HE= H4 ) CA AI 6DEI IK>? =II D=I JDHAA HA =JA@?=JAC HEAI @AFA @E C JDA =JKHA B JDA K @AH OE C KJ=JE I E =HCA I?= A J, ) @A AJE I = @ @KF E?=JE I EE F E J KJ=JE I = @ I = HA=HH= CA A JI E FH JAE? @E C J, ) HACE I = @ EEE I = I?= A J, ) KJ=JE I E J4 )= @H4 )CA AI & 1 H@AHB H=? =II1=F=JD CO J FHAIA JEJIA B JDA J, ) KJ=JE KIJ??KHE =IEC EB E?= J BH=?JE B JDA DAJAH F =I E? J, ) F FK =JE =I DECD=I$B H J, )@A AJE I ' 1= E= @ALA KFJ '#B HJ4 ) KJ=JE I1= EEE + =II1> EJ?D @HE= @EIA=IAI=HEIABH KJ=JE IE = O BJDA% @@ K? A=H CA AIA? @E CFH JAE IK>K EJIE L LA@E NE@=JELAFD I FD HO =JE # 6DAIA KJ=JE I?=??KH E AN I H E JH I=IMA =IFH JAHI + =II 11 EJ?D @HE= @EI H@AHI IJA BH IA? @=HO @ABA?JI E NE@=JELA FD IFD HO =JE + =II 11= CA AJE? F=JD CEAI=HA?D=H=?JAHE A@>O KJ=JE IE K? A=H EJ?D @HE= FH JAE CA AI MD IA FH @K?JI =HA E F HJA@ E J JDA EJ?D @HE= >KJ =HA J IK>K EJI B JDA NE@=JELA

http://genomebiology.com/2001/2/6/research/0021.3 FD IFD HO =JE =FF=H=JKI 6DAHA =HA B KH IK>JOFAI E J, )=> H = EJO@KAJ KJ=JE IE K? A=HCA AI=BBA?J E C J, )JH= I?HEFJE JH= I =JE HHAF E?=JE EE@EHA?J J, ) @= =CA H @ABA?JI E J, ) HAF=EH EEE @ABA?JELA NE@=JELA FD IFD HO =JE IK>K EJ E F HJ = @ EL @ABA?JELA NE@=JELA FD IFD HO =JE IK>K EJ =IIA > O & 6OFA 11> F=JD CEAI=HEIABH A @ CA KI= @AN CA KIJ NE I Results 6=> A IK =HE AI JDA AN= F AI @AI?HE>A@ >A M 6DAO E? K@A =?=IAI B H MDE?D IKBBE?EA J O IFA?EBE? KJ=JE = E B H =JE ANEIJIJ A => AABBA?JELAKIA BJDAFH BE AJ Primary mutational disorders of oxidative phosphorylation (class I) Leber s hereditary optic neuropathy A>AH\I DAHA@EJ=HO FJE? AKH F=JDO 0 JDA IJ??=KIA B = A=@ AI?A J> E @ AII EI=JJHE>KJA@J KJ=JE IE J, )CA AIA? @E CIK>K EJI B? F AN1 ),0 @ADO@H CA =IA, = JAH =JELA O JAH A@ ),0 + 3 HA@K?J=IA 6DA IJBHAGKA J OA? K JAHA@ KJ=JE # B?=IAI = C -KH FA= I '# = C )IE= I EI /%%&) E," )HC!" 0EI )@@EJE = O /!"$) ) =# 6DH E, = @ > JD 6""&"+ AJ$" 8= = @/""#')) =% 8= E,$H= = CJDA JDAH FHE =HO 0 =II?E=JA@ KJ=JE I " Table 1 Selected mutations associated with human mitochondrial disorders )IIAA E ECKHA JDA=HCE E AHAIE@KA=JF IEJE!"E JDADK = IAGKA?AB H," 7" 07 ) MDE?D? H HAIF @I J F IEJE! B JDA FH BE A EI =>I KJA O? IAHLA@ =?H II /H= F IEJELA 4L!#% /H= AC=JELA -+ %% =H?D=A= 2)*"! = @ AK =HO JE? J=N= 7" 07 ) = @ =@" 24) 6DA F=JD CA E? KJ= JE B =HCE E A J DEIJE@E A =J JDEI F IEJE HAFHAIA JI = @AH=JA O IEC EBE?= J @ELAHCA?A E IAGKA?A IK>IJEJKJE C = E E@= ACH KFB H=CK= E@E CH KF 6DAJM = E =?E@IE@A?D=E I@EBBAHE ID=FA DO@H CA > @E C?=F=>E EJO = @ @ACHAA B F IEJELA?D=HCA =HCE E A = M=OI >AE C F IE JELA O?D=HCA@ =J FDOIE CE?= F0 DEIJE@E A O D=LE C = F IEJELA?D=HCA=> KJD= BJDAJE A 6DA? EIIA IA IK>IJEJKJE E, JD=J A=@I J 0 ) =# 6DH @ AI JD=LA=I >LE KI= ANF = = JE B HEJIABBA?J 1 IFA?JE BJDAIAJ B= EC A@IAGKA?AI ECKHA ID MI JD=J JDEI F IEJE J AH=JAI C O?E A = @ AJDE E A=IMA =I= = E A 5K>IJEJKJE C=JDHA E AB H JDA = = E A E JDA DK = IAGKA?A =@@I = >AJ= >H=?DA@ = E =?E@?=F=> A B @EIHKFJE C?= IA? @=HO JAHJE=HO IJHK?JKHA H F IIE> O E JAHIK>K EJ E JAH=?JE I 6DA DO@H NO CH KF B JDA JDHA E A =O = I B H DO@H CA > @IMEJD AECD> HE C= E =?E@HAIE@KAII =IJ =@@J JDEI F JA JE= @EIHKFJE KHJDAH IJHK?JKH= IJK@O B? F AN 1 ME >A HAGKEHA@ J ANF =E D M JDEI KJ=JE? JHE>KJAIJ 0 Disorder Affected protein DNA change Protein change Evolutionary conservation Structural environment LHON ND4 G11778A R340H Absolute N/A ND1 G3460A A52T Slight N/A ND6 T14484C M64V Moderate N/A G14459A A72V High N/A Leigh syndrome ATP6 T8993G L156R Absolute Subunit interface PDHA1 C892G R263G Slight Near active site SURF1 G385A G124E Absolute N/A T751C I246T High Predicted β-sheet SDHA C1684T R554W High Surface-exposed MNGIE TP G1419A G145R High Near active site G1443A G153S Absolute Near active site A2744G K222S Absolute Near active site A3371 E289A Absolute Not in active site Deafness-dystonia DDP1 T151del (1 nt) Truncation - see text High N/A A183del (10 nt) Truncation - see text High NA C198G C66W Absolute N/A Iron-storage ABCB7 ATT ATG I400M High Predicted tight turn HSP Paraplegin 784del (2 nt) 60% truncated High N/A 2228ins (1 nt = A) 7.2% Truncated Moderate N/A ABCB7, ATP-binding cassette, subfamily B, member 7; ATP4, adenosine triphosphate synthase subunit 4 (ATP6 subunit 6); DDP1, deafness-dystonia peptide 1, del, deletion; HSP, hereditary spastic paraplegia; ins, insertion; LHON, Leber s hereditary optic neuropathy; MNGIE, mitochondrial neurogastrointestinal encephalomyopathy; N/A, not available; ND, NADH dehydrogenase; ND1 NDn, ND subunits 1 n; nt, nucleotide; PDHA1, pyruvate dehydrogenase subunit E1α; SDHA, succinate dehydrogenase subunit A; SURF1, surfeit locus protein 1; TP, thymidine phosphorylase. comment reviews reports deposited research refereed research interactions information

4 Genome Biology Vol 2 No 6 2 =IJAHAHAJ= (a) 1 100 200 300 400 500 565 Rv3157 EC2277 PAB1402 NU4M_HUMAN nad4_pra (b) 280 290 300 310 320 PATTERN hhxfxfaxhh axxhxaffff jxxfxxxxir XfXXXXGaXs XfXfXXXXXf Rv3157 GSTLYMLNHG LSTAAVFLIA GFLIARRGSR SIADYGGVQK VAPILAGTFM EC2277 GAVIQMIAHG LSAAGLFILC GQLYERIHTR DMRMMGGLWS KMKWLPALSL PAB1402 AALFHLINHA IAKALLFLAV GVFIHVAGSR NIDDLAGLGR KMPITTFSLA NU4M_HUMAN GAVILMIAHG LTSSLLFCLA NSNYERTHSR IMILSQGLQT LLPLMAFWWL nad4_pra GSILLMLSHG IVSSALFLCI GVLYDRHKTR LLKYYSGVVQ TMPIFATLFM Conserved Arg340 His Figure 1 Profile-induced multiple alignment of NADH dehydrogenase, subunit 4. (a) Protein domain diagram. Profile-aligned regions (in red) set the boundaries for sequence domains. Gaps are indicated by gray interruptions in the red bars. Coordinates in the domain diagram are based on the aligned set of sequences. Amino-acid position is given on the scale above. (b) The zoomed-in aligned region marked by the yellow box on the domain diagram (see Additional data files). The alignment is colored according to amino-acid class assignments, with red indicating identity. Coordinates in the aligment are taken from the position in the profile. For more details see [48,53]. Sequences are obtained from Mycobacterium tuberculosis (Rv3157), Escherichia coli (EC2277), Pyrococcus abyssi (PAB1402), Homo sapiens mitochondrion (NU4M_HUMAN) and Reclinomonas americana mitochondrion (nad4_pra). 6DA JM? F=JD CA E? KJ=JE I E,$ AJ$" 8= = @) =% 8= A=@J = AIIIALAHA= @=LAHOIALAHA B H B 0 HAIFA?JELA O )I O? >=?JAHEK JK>AH?K IEI = HA=@O KIAI = L= E A E F IEJE $" ECKHA! EJ EI K @AHIJ= @=> A JD=J JDA AJ$" 8= IK>IJEJKJE ID K @?=KIA = AII IALAHA @EIA=IA FDA JOFA 1 F IEJE % O = = E A= @ AJDE E AD=LA>AA IAA = CJDAE@A JEBEA@ D CI )I IK?D JDA E JH @K?JE B L= E A =@@I = β >H=?DA@ = E =?E@ =J =? IAHLA@ β >H=?DA@ DO@H FD >E? F IEJE )@@EJE = O JDA JM AK =HO JAI :A FKI =ALEI = @ 0 I=FEA I D=LA? IAHLA@ JDA = = E A MDAHA=I JDA HA @EIJ= J O HA =JA@ IFA?EAI =? IAHLA= AJDE E A 9DAHA=I O A BJDA ),0 @ADO@H CA =IAIK>K EJ KJ=JE I??KHI=J= =>I KJA O? IAHLA@ F IEJE JDAHA=HAHA=I => AANF = =JE IB HD MJDA JDAH KJ=JE I? K @ = EBAIJJDAEH E A OF=JD CO Leigh syndrome AECD IO @H A = I M =I IK>=?KJA A?H JE E C A?AFD= OA F=JDOEI= JDAH? J, ) KJ=JE @EI H@AH BHAGKA J O =II?E=JA@ MEJD?OJ?DH A NE@=IA + :@ABE?EA?O! JD KCD @EIA=IA HA =JA@ KJ=JE I BJDA J, ) A? @A@+ :IK>K EJID=LA>AA HAF HJA@ ) IE C A6 J /JH= ILAHIE =J J, )F IEJE & ''!?D= CAI AK?E A#$ B)62IO JD=IAIK>K EJ$J =HCE E A AK#$ )HC " ECKHA "= MAH? FO K >AHI BJDEI KJ=JE %=HA=II?E=JA@ MEJD AKH F=JDO =J=NE= = @ HAJE EJEI FEC A J I= )42 MDAHA=IDECD? FO K >AHI 'CELAHEIAJ AECDIO @H A # KJ=JE C JDA? IAHLA@ AK?E A J =HCE E A =J F IEJE #$ EI = @H=IJE??D= CA @EIHKFJE C = γ >H=?DA@ = EFD=JE? HAIE@KA MEJD = >=IE? CK= E@E CH KF = = H?DA E?= = @ FHAIK => O IJHK?JKH= IK>IJEJKJE 1 JDEI?=IAIJHK?JKH= @=J==HA=L=E => AIAA ECKHA ">= @? A=H O E @E?=JAI JDA AO?=JE B JDEI HAIE@KA =J JDA FH JAE FH JAE E JAHB=?A BJM BJDAIK>K EJI BJDA? F A J B? F AN8)62IO JD=IA KJ=JE IB K @E =J A=IJJM K? A=HCA AI= I? HHA =JA MEJD AECD IO @H A A B H = IK>K EJ B JDA FOHKL=JA @ADO@H CA =IA? F AN = @ A B H IKHBAEJ?KI FH JAE

http://genomebiology.com/2001/2/6/research/0021.5 Rv3152 EC2282 AF1831 nad1_pra NU1M_HUMAN 1 100 200 300 400 452 30 40 50 60 70 PATTERN naxhxfqxrx GPXXVhggXX GXaQXfADXf nffxrexfxx XXXsXXfdXX Rv3152 KLLGRMQLRP GPNRVG--PK GALQSLADGI KLALKESITP GGIDRFVYFV EC2282 RLLGLFQNRY GPNRVG--WG GSLQLVADMI KMFFKEDWIP KFSDRVIFTL AF1831 KLTARIQWRV GPLEVARPIR GAIQALADGI RYFFQEAIVH REAHRPYFVQ nad1_pra KVIAAMQQRK GPNVVG--VF GLLQPLADGL KLLLKETIVP TRANSAIFIL NU1M_HUMAN KILGYMQLRK GPNVVG--PY GLLQPFADAM KLFTKEPLKP ATSTITLYIT Non-conserved Ala52 Figure 2 Domain diagram and partial alignment for NADH dehydrogenase, subunit 1 (ND1). The column indicated by the arrow at the bottom is position 52 of NU1M_HUMAN. Sequences are from M. tuberculosis (Rv3152), E. coli (EC2282), Archaeoglobus fulgidus (AF1831), H. sapiens mitochondrion (NU1M_HUMAN) and R. americana mitochondrion (nad1_pra). Colors are as in Figure 1. 574 MDE?DEIE L LA@E JDA=IIA > O B?OJ?DH A? NE@=IA )IJDAIA=HA J NE@=JELAFD IFD HO =JE FH JAE I JDA KJ=JE IGK= EBO=I? =II11@EI H@AHI ) EIIA IA KJ= JE )HC $! / OE JDA-α IK>K EJ BFOHKL=JA@ADO@H CA =IA 2,0) D=I >AA =II?E=JA@ MEJD AECD IO @H A ECKHA # 6DEI KJ=JE IAA I J =BBA?J =IIA > O B JDA -α -β DAJAH JAJH= AH $ 6DA CA A\I?=JE JDA :?DH I A? F E?=JAI EJI ANFHAIIE F=JJAH E BA = AI =I=HAIK J BJDAH= @ F=JJAH B:E =?JEL=JE % 9DAHA=I OJDADK = IAGKA?AKIAI=HCE E A=JJDEIF IE JE = JDAH J=N= KIA = =HCA DO@H FD >E? = E =?E@ AEJDAH JOH IE A H AK?E A JA JD=J ALA = =HCE E A IE@A?D=E D=I=IK>IJ= JE= DO@H FD >E?IAC A J>A=HE CEJIJAH E = CK= E@E CH KF 5K>IJEJKJE CJDAI = JKH B H E C C O?E A HAIE@KA? K @ D=LA = IEC EBE?= J E F=?J FH FAH B @E C B JDEI IK>K EJ )I ID M E ECKHA #> )HC $! EI?=JA@GKEJA? IAJ JDA=?JELAIEJA BJDAA O A =HACE BJDAIJHK?JKHAMDAHAALA I = FAHJKH>=JE I? K @D=LA =HCAABBA?JI =?JELEJO 6DACA AB H574 @EIF =OI KJ=JE I/ O " / K= @ 1 A "$ 6DHJD=J?= = I A=@J AECDIO @H A & ' Thr ) K >AH B @A AJE I IAA E 574 @ I =I MA 6DA / O " / KIK>IJEJKJE @EIHKFJI= =>I KJA O? IAHLA@ C O?E AJD=J IJFH >=> OF=HJE?EF=JAIE =JKH ECKHA $ 6DA 1 A "$ 6DH KJ=JE??KHI E = @ FH >=> O @EI HKFJI = FHA@E?JA@ β IDAAJ >A EALA@ J >A FHAIA J E JDEI FH JAE BH = DECDAHAK =HO JAI 1J??KHIJ M=H@JDA?=H > NO JAH E KI B JDA DK = FH JAE = @ EI >AO @ JDA HACE? LAHA@>OJDAFH BE A =HAB A?JE BIAGKA?A@ELAH CA?A>AJMAA AK =HO JAI= @FH =HO JAIE JDEIHACE KJ=JE I CELE C HEIA J AECD IO @H A D=LA = I >AA @AI?HE>A@ E JDA B =L FH JAE IK>K EJ F H 5,0) B JDA IK??E =JA @ADO@H CA =IA? F AN 5,0 6DA KJ= JLAHIE +$&"6 )HC##" 6HF ECKHA %= B5,0) D=I = H = = @ E B H = =JA >KJ EI K?D HA IA IEJELAJ @ M HACK =JE >O N= =?AJ=JA )) KJ=JE CJDA? IAHLA@=HCE E AHAIE@KA=JF IEJE ##"J = JHOFJ FD=?=KIAI= II B=F =HHAIE@KA= @HAF =?A A J MEJD= K?D =HCAH=H =JE?CH KF 1JEI J=FF=HA JBH JDA?=JE BJDEIHAIE@KAE JDAIJHK?JKHA ECKHA %>MDO JDEI?D= CAID K @D=LAIK?D= ABBA?J BK?JE 1J EAI JDA IKHB=?A =J =? IE@AH=> A @EIJ=?A BH JDA =?JELA IEJA comment reviews reports deposited research refereed research interactions information

6 Genome Biology Vol 2 No 6 2 =IJAHAHAJ= 1 100 200 272 nad6_pra NU6M_HUMAN Rv3154 EC2280 NU6M_XENLA 60 70 80 90 100 PATTERN fhffxffayx GhfffaFXdX XfffXXXXXo XXXsXXXXXX XXXaXXXfXX nad6_pra IAMIFIMVYV GAIAVLFLFV VMMLNVRLVQ FNVNMLRYLP LGGVIGLIFL NU6M_HUMAN MGLMVFLIYL GGMMVVFGYT TAMAIEEYPE AWGSGVEVLV SVLVGLAMEV Rv3154 LGVVQVVVYT GAVMMLFLFV LMLIGVDSAE SLKETLRGQR VAAVLTGVGF EC2280 AGALEIIVYA GAIMVLFVFV VMMLNLGGSE IEQERQWLKP QVWIGPAIL S NU6M_XENLA LSIVLFLIYL GGMLVVFAYS AARA-KPYPE AWGS--WSVV FYVLVYLIGV Semi-conserved Met64 Val Eukaryotic-conserved Ala72 Val Figure 3 Domain diagram and partial alignment for NADH dehydrogenase, subunit 6 (ND6). The columns indicated with arrows at the bottom are positions 64 and 72 of NU6M_HUMAN. Sequences are from M. tuberculosis (Rv3154), Escherichia coli (EC2280), X. laevis mitochondrion (NUAM6_XENLA), H. sapiens mitochondrion (NU6M_HUMAN) and R. americana mitochondrion (nad6_pra). Colors are as in Figure 1. = @@ AI JD=LA= O=FF=HA JH AE IK>K EJ IK>K EJE JAH =?JE I 1 B=?J D MALAH JDADECD@ACHAA BIAGKA?A? IAH L=JE LAHJDEIA JEHAHACE ECKHA %=IKCCAIJIJD=J5,0) EIB=EH OE J AH= J B KJ=JE IDAHA 2H FAHE JAH=?JE MEJD JDA EFE@>E =OAH =OF=HJ OANF =E JDAFDA A Secondary mutational disorders in oxidative phosphorylation (class II) Mitochondrial neurogastrointestinal encephalomyopathy + =II 11 EJ?D @HE= @EI H@AHI =HA @KA J KJ=JE I E K? A=H EJ?D @HE= FH JAE CA AI EJ?D @HE= AKH C=IJH E JAIJE = A?AFD= O F=JDO /1- EI?D=H=?JAH E A@ >O FJ IEI FH CHAIIELA ANJAH = FDJD= F ACE= C=IJH E JAIJE = @EI JE EJO JDE > @O D=>EJKI FAHEFDAH= AKH F=JDO O F=JDO AK A?AFD= F=JDO= @ =?JE?=?E @ IEI= @ K JEF A J, )@A AJE I H J, )@AF AJE H > JD 1J EI =? =II 11 J, ) @AF AJE @EI H@AH 1J EI >A EALA@JD=JJDAF=JD CA E? A?D= EI HA =JAIJ =>AHH= J JDO E@E A AJ=> EI CELE CHEIAJ E F=EHA@ J, )HAF E?=JE = @ H J, ) =E JA =?A EIIA IA KJ= JE IE JDAA O AJDO E@E AFD IFD HO =IA62/ O"# )HC / O#! 5AH OI 5AH / K &' ) = = @ = BAM JDAHI D=LA >AA B K @ E K JEF A /1- F=JEA JI = @=HA>A EALA@J?=KIAJDAF=JD CO 62?=J= O AI = IJAF E JDA I= L=CA F=JDM=O B H JDO E A JDO E@E A FD IFD=JA JDO E A @A NO, HE> IA FD IFD=JA 1 DECDAH AK =HO JAI 62 F =OI JDA H A B F =J AJ @AHELA@A @ JDA E=?A CH MJDB=?J H! ALE@A J O >O LEHJKA B JDA B=?J JD=J JDA @A NOHE> IA IKC=H FH @K?J EI KIA@ =I = = CE CA AIEI E @K?E C B=?J H " 62 = I D=I C E IJ=JE =?JELEJO 2H BE A E @K?A@ K JEF A = EC A JI B JDEI FH JAE ID M = CHA=J @ACHAA B? IAHL=JE LAH JDA F=JD CA E? F IEJE I F=HJE?K =H O B H JDA / O#! 5AH KJ=JE ECKHA & / O?E AEI=>I KJA OFHAIAHLA@=JJDEI F IEJE E @E?=JE CJD=JEJ =O>A?HEJE?= B HFH FAHB @E C 5K>IJEJKJE C=IAHE A? K @@EIHKFJJDEI = JD KCD=FHA?A@ E C C O?E A =O F=HJE= O? FA I=JA 6DA / O"# )HC IK>IJEJKJE @ AI J@EIHKFJ= =>I KJA O? IAHLA@F IE JE =I IAHE A EI = I FAH EJJA@ >KJ EJ E JH @K?AI = =HCA >=IE? = @ F=HJ O DO@H FD >E? HAIE@KA JD=J FHAIK => O @E E EIDAI =J A=IJ A AO BK?JE B 62 OI 5AH = @ / K &' ) = > JD @EIHKFJ =>I KJA O? IAHLA@ IAGKA?AF IEJE I JDAB H AH BMDE?D EAI JDAFAHEFDAHO

http://genomebiology.com/2001/2/6/research/0021.7 (a) EC3738 ATP6_HUMAN HP0828 Q0085 Rv1304 1 100 200 288 160 170 180 190 200 PATTERN fxxxxxxfxf ffxipxasxf hnxaslhfrl XhNIgXGXXf ffaafxfg-g EC3738 PFNHWAFIPV NLILEGVSLL SKPVSLGLRL FGNMYAGELI FILIAGLLPW ATP6_HUMAN QGTPTPLIPM LVIIETISLL IQPMALAVRL TANITAGHLL MHLIGSA--- HP0828 AGPVKWLAPF MFPIEIISHF SRIVSLSFRL FGNI-KGDDM FLLIMLL--- Q0085 AGTPLPLVPL LVIIETLSYF ARAISLGLRL GSNILAGHLL MVILAGL--- Rv1304 IKLLKGHVTL LAPINLVEEV AKPISLSLRL FGNIFAGGIL VALIALFP-P (b) Conserved Leu156 1 Arg Figure 4 Domain diagram and structure of ATP synthase. (a) Domain diagram and partial alignment for ATP synthase, subunit 6 (ATP6). The column indicated by the arrow at the bottom corresponds to position 156 of ATP6_HUMAN. Sequences are obtained from M. tuberculosis (Rv1304), Helicobacter pylori (HP0828), E. coli (EC3738), S. cerevisiae mitochondrion (Q0085) and H. sapiens mitochondrion (ATP6_HUMAN). Colors are as in Figure 1. (b) Ribbon diagram showing the structure of ATP synthase subunits: ATP6 (a-chain, green) and ATP9 (c-chain, red). Leucine 156 (1) is shown as a space-filling model. The structure comes from the Protein Data Bank (code: 1C17). comment reviews reports deposited research refereed research interactions information

8 Genome Biology Vol 2 No 6 2 =IJAHAHAJ= (a) CT245 bsubtilis-pdha YER178W APE1677 ODPA_HUMAN 1 100 200 300 400 450 160 170 180 190 200 PATTERN sxxfxxxxxg -galgxdxfx XfXXXXXAXr XfXsXXhPXa aexxxxrfxj CT245 SQAISYGLSS ITLNGFDLFN SLIGFREAYH HMQQTGSPII VEALCSRFRG bsubtilis-pdha KAVAAGIVGV -QVDGMDPLA VYAATAEARE RAINGEGPTL IETLTFRYGP YER178W RGQYIPGLK- --VNGMDILA VYQASKFAKD WCLSGKGPLV LEYETYRYGG APE1677 KGLAYGVPGV -RIDGNDVMV VYKIVSDAAE KARRGGGPTL IEAVTYRLGP ODPA_HUMAN RGDFIPGLR- --VDGMDILC VREATRFAAA YCRSGKGPIL MELQTYRYHG Non-conserved Arg263 Gly (b) Chain B Chain A Thiamine diphosphate 1 3 2 Figure 5 Structures of pyruvate dehydrogenase E1-α subunit and branched-chain α-ketoacid dehydrogenase α subunit. (a) Domain diagram and partial alignment for pyruvate dehydrogenase E1-α subunit (PDHA1). The column indicated by the arrow at the bottom corresponds to position 263 of ODPA_HUMAN. Sequences come from Chlamydia trachomatis (CT245), Bacillus subtilis (bsubtilis-pdha), S. cerevisiae (YER178W), Aeropyrum pernix (APE1677) and H. sapiens (ODPA_HUMAN). Colors are as in Figure 1. (b) Ribbon diagram of branched-chain α-ketoacid dehydrogenase α subunit (H. sapiens) - a close homolog to PDHA1 (Z-score 73.2). The structure comes from the Protein Data Bank ([49]; code: 1DTWA). Chain A is colored red, chain B gray. A space-filling model of thiamine diphosphate (TPP) (center) indentifies the enzyme active site. Below and slightly to the left, the catalytic group Glu92 is also shown in space-filling representation (1), and further down (directly under the TPP) Tyr262 is similarly shown (2). Arg263 occupies this position in ODPA_HUMAN. Helix H3 is the second helix at the lower left (3).

http://genomebiology.com/2001/2/6/research/0021.9 CT27995 PA0110 RP733 Rv2235 SUR1_HUMAN YGR112W 1 100 200 300 400 417 60 70 80 90 100 PATTERN gxxxfg--gs XfXDXXdXnV XfsGrdXXrX sfff-xxr-- ---------- CT27995 LPDDL----T DLAQMEYRLV KIRGRFLHDK EMRL-GPRSL IRPDGVETQG PA0110 -PGQL----E GLRDPAYVRV RLHGRFDERH TLLL-DNR-- ---------- RP733 ----L---EK VQENLLYHKV KITGQFLPNK DIYLYGIR-- ---------- Rv2235 LKTLLPQQDS SAPDAQWRRV TATGQYLPDV QVLA-RLR-- ---------- SUR1_HUMAN LPADPM---- ELKNLEYRPV KVRGCFDHSK ELYM-MPRTM VDPVRE---- YGR112W LPKSFT--PD MCEDWEYRKV ILTGHFLHNE EMFV-GPR-- ---------- Conserved Gly124 Glu Figure 6 Domain diagram and partial alignment for surfeit locus protein 1 (SURF1). The column indicated by an arrow at the bottom corresponds to position 124 of SUR1_HUMAN. Sequences come from Drosophila melanogaster (CT27995), Pseudomonas aeruginosa (PA0110), Rickettsia prowazekii (RP733), M. tuberculosis (Rv2235), H. sapiens (SUR1_HUMAN) and S. cerevisiae (YGR112W). Colors are as in Figure 1. B JDA =?JELA IEJA @=J= J ID M / K &' F=HJE?EF=JAI E I A F=HJ O >KHEA@? IAHLA@ E E? E JAH=?JE I HA JD= # @EIJ= JBH JDA=?JELAIEJA@=J= JID M Deafness-dystonia,A=B AII @OIJ E= H DH 6H= A> =AHC IO @H A EI = EJ?D @HE= FH JAE E F HJ @EI H@AH KJ=JE I E,,2 @A=B AII @OIJ E= FAFJE@A A=@ J @A=B AII @OIJ E= 1 JM?=IAI IAA J @=JA = >F @A AJE E AN 6#@A HAIK JI E = BH= AIDEBJ A=@E C J E? HF H=JE B # AM = E =?E@I =BJAH / K!& B MA@ >O = IJ F? @ 1 = IA? @?=IA = >F@A AJE E AN?=KIAI=BH= AIDEBJ=J AJ"& A=@E C J E? HF H=JE B AM HAIE@KAI = @ = IJ F? @ # 6DAIAFH JAE I=HA D C KIJ JDAOA=IJ EJ?D @HE= E F HJFH JAE 6E &F; 4!#9 ) $ )IIAA E ECKHA ' > JDBH= AIDEBJ KJ=JE I A=@J A E E = JE B=DECD O? IAHLA@ JEB +8F4 =,6 A=HJDA?=H > NO JAH E KI BJDAFAFJE@A =HACE JD=J KIJ>A?HEJE?= J FH FAHBK?JE HB @E C 6DAHA?A JFK> E?=JE BJDABEHIJ,,2 EIIA IA KJ=JE +OI$$ 6HF % IJH C O IKF F HJI JDEI FHE H?? KIE >=IA@ JDA BH= AIDEBJI )I AEJDAH6E &F H= O? IAD CD=IOAJD=@EJIIJHK? JKHA@AJAH E A@ MA?= JE BAH@AJ=E A@IJHK?JKHA BK?JE HA =JE IDEFI 6E &F= @6E!FANEIJE JDA EJ?D @HE= E JAH A >H= AIF=?A= @@AJAH E AJDA?=JE B6E 'F 6E 'F EI?=JA@ AEJDAH E JDA %,= E F HJ? F AN >AJMAA JDA EJ?D @HE= A >H= AI H JDA!,= 61? F AN A >A@@A@ E JDA E AH A >H= A &,A AJE I B6E &F=BBA?JJDA=>E EJOJ E F HJ EJ?D @HE= FH JAE I JD=J >A? A A >A@@A@ E JDA E AH A >H= A ' 1J EI F IIE> A JD=J JDA BH= AIDEBJ KJ=JE I IAA E @A=B AII @OIJ E= E JAHBAHA MEJD JDA E JAHB=?A >AJMAA 6E &F= @6E!F Iron-storage disorders HEA@HAE?D\I =J=NE= = I B= I E J JDA?=JAC HO B =? =II 11 J, ) @= =CA HAF=EH @EI H@AH 1J EI?D=H=?JAHE A@ >O comment reviews reports deposited research refereed research interactions information

10 Genome Biology Vol 2 No 6 2 =IJAHAHAJ= (a) 1 100 200 300 400 500 600 674 YKL148C YJL045W Rv3318 AF0681 CEC03G5_1 DHSA_HUMAN RP128 PATTERN YKL148C YJL045W Rv3318 AF0681 CEC03G5_1 DHSA_HUMAN RP128 500 XrsXXrasXX EKTFDDVKTT DKTFEDVHVS KERYSRITVH KERYRNVEIN YKDQAHLNVA YGDLKHLKTF RNRYKDIKIN 510 DnhXXdNiDL 520 XsXaEaXXfL 530 XXAXXXffhA 540 XXRrESRGhH DRSMIWNSDL VETLELQNLL TCASQTAVSA ANRKESRGAH DKSMIWNSDL VETLELQNLL TCATQTAVSA SKRKESRGAH DKGKRFNTDL LEAIELGFLL ELAEVTVVGA LNRKESRGGH DKSRGFNTDL TSAIEIGYML DLAEVVAIGA LKRQESRGAH DKGLVWNSDL IETLELQNLL INATQTIVAA ENREESRGAH DRGMVWNTDL VETLELQNLM LCALQTIYGA EARKESRGAH DKSLIWNSDL VEALELDNLL DQALVTVCSA AARKESRGAH Conserved Arg554 Trp (b) Flavinadenine dinucleotide 1 Figure 7 Structures of the flavoprotein subunits of succinate dehydrogenase and fumarate reductase. (a) Domain diagram and alignment for the flavoprotein subunit of succinate dehydrogenase (SDHA). The column indicated by the arrow on the bottom corresponds to position 554 of DHSA_HUMAN. Sequences come from S. cerevisiae (YKL148W and YJL045W), M. tuberculosis (Rv3318), A. fulgidus (AF0681), Caenorhabditis elegans (CEC03G5_1), H. sapiens (DHSA_HUMAN) and R. prowazekii (RP128). Colors are as in Figure 1. (b) Ribbon diagram of the flavoprotein subunit of fumarate reductase (E. coli) - a very close homolog of SDH Fp (Z-score 175.8). The structure comes from the Protein Data Bank (code: 1FUMA). Flavin adenine dinucleotide can be seen (in space-filling representation) in the middle of the structure and Thr494 is similarly indicated at the right-hand end of the bottom-most helix (1). Arg554 occupies this position in DHSA_HUMAN.

http://genomebiology.com/2001/2/6/research/0021.11 (a) TYPH_HUMAN TYPH_MYCPI MJ0667 EC4382 1 100 200 300 400 500 538 110 120 130 140 150 PATTERN XXXaXXXShR haxxxhgtfd XaEXaXXXsX XXXXXsgrXa XrXXjffafG TYPH_HUMAN GCKVPMISGR GLGHTGGTLD KLESIPGFNV IQSPEQMQVL LDQAGCCIVG TYPH_MYCPI DLGVAKLSGR GLGFTGGTID KLESINVNTD IDLKNS-KKI LNIANMFIVG MJ0667 GLKIPKTSSR AITSAAGTAD VVEVLTRVDL TIEEIK-RVV KETNGCMVWG EC4382 GGYIPMISGR GLGHTGGTLD KLESIPGFDI FPDDNRFREI IKDVGVAIIa Semi-conserved Gly145 Arg Conserved Gly153 Ser (b) 2 1 Figure 8 Domain diagram and structure of thymidine phosphorylase (TP). (a) Domain diagram and partial alignment for thymidine phosphorylase. The columns indicated by arrows at the bottom correspond to positions 145 and 153 of TYPH_HUMAN. Sequences are obtained from H. sapiens (TYPH_HUMAN), Mycoplasma pyrum (TYPH_MYCPI), Methanococcus jannaschii (MJ0667) and E. coli (EC4382). Colors are as in Figure 1. (b) Ribbon diagram of TP (E. coli). Structure from the Protein Data Bank (code: 2TPT). Space-filling representations show a sulfate ion (center) (3) and two conserved glycines (left), the lower of which (1) aligns with Gly145 in TYPH_HUMAN and the other (2) with Gly153 in TYPH_HUMAN. 3 comment reviews reports deposited research refereed research interactions information

12 Genome Biology Vol 2 No 6 2 =IJAHAHAJ= CEY39A3C_83_D CT4964 YJR135W-A DDP_HUMAN 1 100 10 20 30 40 50 PATTERN XsXfXXXXXX rxoaxsfaxx EssKXrfpXX ahofspacdr KCfXsXXg-X CEY39A3C_83_D MDSA--DPQL NRFLQQ-LQA ETQRQKFTEQ VHTLTGRCWD VCFADYRPPS CT4964 MSDFENLSGN DKELQEFLLI EKQKAQVNAQ IHEFNEICWE KCIGKPS--T YJR135W-A SDLASLDDTS KKEIATFLEG ENSKQKVQMS IHQFTNICFK KCVESVND-S DDP_HUMAN SSSAAGLGAV DPQLQHFIEV ETQKQRFQQL VHQMTELCWE KCMDKPG--P 60 70 80 81 PATTERN rlssxxexcf XNCVpRFaDT sxxixpxfxp X CEY39A3C_83_D KMDGKTQTCI QNCVNRMIDA SNFMVEHLSK M CT4964 KLDHATETCL SNCVDRFIDT SLLITQRFAQ M YJR135W-A NLSSQEEQCL SNCVNRFLDT NIRIVNGLQN T DDP_HUMAN KLDSRAEACF VNCVERFIDT SQFILNRLEQ T Codon-39 frameshift Codon-49 frameshift Conserved Cys66 Trp Figure 9 Domain diagram and complete alignment for Tim8p. The columns indicated by the two arrows on the right correspond to positions 39 and 49 of DDP_HUMAN. Sequences are obtained from C. elegans (CEY39A3C_83_D), D. melanogaster (CT4964), H. sapiens (DDP_HUMAN) and S. cerevisiae (YJR135W-A). Colors are as in Figure 1. FH CHAIIELA C=EJ = @ E > =J=NE= =? B JA @ HAB ANAI @OI=HJDHE= = @ FOH= E@= MA= AII E E BAHE H E >I! =? B BH=J=NE FH JAE?=KIAI = >KE @ KF B EH E JDA =JHEN IF=?AI B JDA EJ?D @HE= )I EH =??K K =JAI EJ? FAJAI MEJD?=J= =IA B H DO@H CA FAH NE@A = @ B H I DO@H NO H=@E?= I A 0 A! 0 0 I?= A@HA=?JELA NOCA IFA?EAIJD=J?= =JJ=? J, ) A >H= A EFE@I?=H> DO@H=JAI = @ FH JAE I 6DEI?= @= =CA JDA A JEHA NE@=JELA FD IFD HO =JE IOIJA AIFA?E= O JDA DECD O LK AH=> A EH IK BKH?A JAHI 6DA ID HJ=CA B BH=J=NE @ AI JHAIK JBH B H =JE B KJ= JFH JAE I >KJH=JDAHBH ANF= IE B=/))HAFA=JE JDABEHIJE JH B JDA BH=J=NE CA A 6DEI ANF= @A@ HAFA=J EI >A EALA@ J B H?= E A@4 )IA? @=HOIJHK?JKHAIJD=JE JAHBAHAMEJD JH= I?HEFJE!! ;BDF ;, 9 JDA BH=J=NE D C E OA=IJ E JAH=?JI MEJD JDA OA=IJ EJ?D @HE= E JAH A@E=JAFAFJE@=IA?J ;!"+J MAH =JHENEH??A JH=JE @EHA?J O= @>O? FAJEJE!! H = EJ?D @HE= EH D A IJ=IEI E E = O@AFA @I ME @ JOFA BH=J=NE J? JH ABB KN * JD FH JA OJE?? A=L=CALE= 22= @ H1 2? F ANAI= @?D=FAH E DA F J 052% 5I? F E A >H= A JH=BBE? E C = @ H B @E C=HA AA@A@B HJDA =JKHABH=J=NE FH JAE J =HHELA =J EJI @AIJE =JE E JDA =JHEN E EJI FH FAH? B H =JE 1H ABB KN= I @AFA @IKF JDAEH IK BKH? KIJAHJH= I F HJAH)J F; 4!+ ) EIIA IA KJ=JE E )>?>% H D)*+% JDADK = D C B)J F 1 A" AJ D=I >AA E F E?=JA@E : E A@IE@AH > =IJE?= A E== @=J=NE=!" = @EI H@AH? IA O HA =JA@ J HEA@HAE?D\I =J=NE= ECKHA IK =HE AI JDA IAGKA?A = @ FDO CA AJE?? JANJ B H JDEI F=JD CA E??D= CA ANFAHE A J= IJHK? JKH= @=J= =HA =L=E => A >KJ @EI?HAJA IJ=JA IF=?A @A I!#!$ 0 0A /?) EIJAH= @6 5 EJD = KI?HEFJE FHAF=H=JE B H A >H= AFH JAE I? A=H OE @E?=JAJD=JJDA KJ=JE B= I E = ANJH=?A K =H JECDJ JKH >AJMAA JH= I A >H= ADA E?AIBELA= @IEN )F=EH B?=JE JH= IF HJAHI J; 4%%9= @ J ;2 "+E OA=IJ? JH IEH E B KNE J JDA EJ?D @HE= 6DEI JDAH D= B B EH D A IJ=IEI? K @ = I A=IE O D=LA=H AE HEA@HAE?D\I=J=NE== @: E A@IE@AH > =IJE? = A E= Hereditary spastic paraplegia 0AHA@EJ=HOIF=IJE?F=H=F ACE=052 =? =II11=IIA > O@EI H@AH EI?D=H=?JAHE A@>OFH CHAIIELA KIK= OIALAHA MAH ANJHA EJOIF=IJE?EJO@KAJ @ACA AH=JE BJDA C=N I B JDA?A JH= AHL KI IOIJA MDE A JDA?A > @EAI HA =E E J=?J 1 DAHEJ=?A?= >A=KJ I = @ E = J =KJ I = HA?AIIELA H: E A@ 5 A?=IAI B052HAIK JBH KJ= JE I E JDA CA A B H F=H=F ACE = K? A=H A? @A@ EJ?D @HE= AJ= FH JA=IA E? K@E C = >F @A AJE =J F IEJE %&" MDE?D?=KIAI=BH= AIDEBJJD=JA E E =JAI$ BJDAFH JAE = @= =@A E AE IAHJE =JF IEJE & B JDAF=H=F ACE?, ) MDE?D?HA=JAI=BH= AIDEBJHAIK JE CE JHK?=JE BJDA =IJ#%HAIE@KAI BJDEI%'# HAIE@KAFH JAE!% )I ID M E ECKHA JDAHA =HA LAHO BAM MA? IAHLA@F IEJE IE JDA@A AJA@#% HAIE@KAJ=E E F OE CJD=J JDEI =O>A=? F=H=JELA O@EI H@AHA@F=HJ BJDAFH JAE EIIA IA KJ=JE ID=LA>AA B K @E DK = F=H=F A CE J @=JA 6DEI FH JAE HAIA > AI JDHAA OA=IJ FH JAE I MDE?D BK?JE =I?D=FAH E I = @ H FH JA=IAI )BC!F ;-4%+ 4?=F ; 4&'+ = @ ; AF ;24 "9 6DAOF=HJE?EF=JAE HA?O? E CJDA? F A JI B? F AN8 )62 IO JD=IA =I MA =I E EJI =IIA > O 6DA FH @K?JI B ;-4%+= @; 4&'+ = AKFJDA EJ?D @HE= =@A IE A JHEFD IFD=J=IA ))) FH JA=IA? F AN MDE?D EI A >A@@A@E JDAE AH EJ?D @HE= A >H= AB=?E CJDA =JHEN ;24 "9 A? @AI JDA E ))) FH JA=IA? F AN MDE?DB=?AIJDAE JAH A >H= AIF=?A 6DAOA=IJ )))FH JA=IA? F ANEIEJIA BHACK =JA@>OJM FH DE>EJE FH JAE I 2D> ;/4! + = @ 2D> ;/4!+!& 2=H=F ACE D CI = = C KIJ JDAOA=IJF=H= CI

http://genomebiology.com/2001/2/6/research/0021.13 ABC7_HUMAN RP205 YMR301C PATTERN 1 100 200 300 400 500 600 700 756 260 LXXYrshXXK 270 XisiLhfLNX Figure 10 Domain diagram and partial alignment for the Fe-S cluster transporter Atm1p. The column indicated by the arrow at the bottom corresponds to position 400 of ABC7_HUMAN. Sequences are obtained from H. sapiens (ABC7_HUMAN), R. prowazekii (RP205) and S. cerevisiae (YMR301C). Colors are as in Figure 1. )BC!F 4?=F = @ ; AF KIJ = IJ?AHJ=E O ANEIJ = @ DK = FH DE>EJE I =O>AANFA?JA@=IMA 2 E J KJ=JE I E? IAHLA@ HACE I B JDAIA FH JAE I?= >A ANFA?JA@ J?=KIA EIIA IA KJ=JE I= @JH= I =JE = JHK?=JE IJD=J ME = I A=@J DAHA@EJ=HOIF=IJE?F=H=F ACE= Discussion 6DA FH?AII B K @AHIJ= @E C CA AJE? @EIA=IA JOFE?= O FH?AA@I JDH KCD JDHAA IJ=CAI BEHIJ HA? C EJE B JDA @EIA=IA IJ=JA HIO @H AE? K@E CEJIDAHA@EJ=HO?D=H=?JAH IA? @ @EI? LAHO= @ =FFE C BJDAHA =JA@ KJ=JE I = @JDEH@ A K?E@=JE B JDA >E?DA E?= >E FDOIE?= A?D= EI A=@E CJ JDA@EIA=IAFDA JOFA 5E? A?A = A E=FH LE@AI JDA? =IIE? AN= F A 1 JDA?=IA B EJ?D @HE= @EIA=IAI JDAHA =HA HA JD= @AI?HE>A@? @EJE I 1!' 16 2 " = @ 9756 # MA>IEJAI = @ =? F=H=> A = @ CH ME C K >AH B KJ=JE I & )I = HA=@O @AI?HE>A@B H 0 AECDIO @H A= @052 JDA?=KI=JELA KJ=JE I =HA J A?AII=HE O? BE A@ J = IE C A FH JAE? @E CCA A HALA J JDA K? A=HLAHIKIJDA EJ?D @HE= CA A 2KJ @EBBAHA J O JDAHA EI A J A =FFE C >AJMAA KJ=JE I H ALA MD A CA AI = @ = F=HJE?K =H @ABE A@@EIA=IAA JEJO 6DKI A=?D@EIA=IAIFA?JHK =O? H HAIF @ J IALAH= IAJI B KJ=JE I A=?D IAJ =BBA?JE C A FH JAE HFH JAE IK>K EJ >LE KI O KJ=JE IE J J4 ) 280 GQsfIfiXhL 290 XhfMffhXsh 300 axxjsfxvgd ABC7_HUMAN LKTYETASLK STSTLAMLNF GQSAIFSVGL TAIMVLASQG IVAGTLTVGD RP205 LQAYEKSATK ITNSLSILNI GQDVIISLGL VSLMILSVNA INQNKMMVGD YMR301C LMNYRDSQIK VSQSLAFLNS GQNLIFTTAL TAMMYMGCTG VIGGNLTVGD Conserved Ile400 Met J H4 )= @FH JAE IJD=J=BBA?J EJ?D @HE= CA AANFHAI IE ME = I F JA JE= OD=LAF AE JH FE?ABBA?JI 6DAHAIK JIFHAIA JA@DAHA=@@HAIIJDAJDEH@IJ=CA BK @AH IJ= @E CCA AJE?@EIA=IA = A OJDA?D=H=?JAHE =JE BJDA @EIA=IA FDA JOFA =J JDA A?K =H ALA MEJD = LEAM J ANF =E E C EJI >E?DA E?= A?D= EI 1 FHE?EF A JDA =FFH =?D?= >A =FF EA@ J = O KJ=JA@ FH JAE? @E C CA A 9DA IJHK?JKH= E B H =JE EI =L=E => A JDA AN=?J A?D= EI B@EIA=IA?=KI=JE =O>A=I?AHJ=E => A=JJDA A?K =H ALA = @JAIJ=> A>O A= I B?=HABK OJ=HCAJA@ ANFAHE A JI ) BKHJDAH ANJA IE B JDEI >=IE? M A@CA =FFH =?D EAIE JDA@EHA?JE B A?K =HAL KJE )IMA ME ID M A IAMDAHA 6 2 6 5 = @ 5 + K FK> EIDA@ >IAHL=JE I JDA ANJA @A@ IAJI B D CI =IIA > A@ KIE CFHE H >=IA@FH BE AIFH LE@A=HE?DI KH?A BFDO CA AJE? E B H =JE J O ID K @ EJ >A F IIE> A J I AJ?D KJJDADEIJ HO BJDA@ALA F A J B@EBBAHA J>E?DA E?= =?DE AHEAI EJ ID K @ = I >A F IIE> A J AIJ=> EID MDE?D F=HJI BJDAFH JAE I=HA IJ?HEJE?= J BK?JE JDA>=IEI B = E =?E@ IAGKA?A? IAHL=JE? HHA =JA@ MEJD JDA IJHK?JKH=? JANJI M A@CAC=E A@>OJDA=FFH =?D@AI?HE>A@DAHAD=I >LE KI =FF E?=JE J > JDJDA@E=C IEI= @JDAH=FO B EJ?D @HE= @EIA=IAI?A = KJ=JE D=I >AA?D=H=?JAHE A@ JDA AN=?J comment reviews reports deposited research refereed research interactions information

14 Genome Biology Vol 2 No 6 2 =IJAHAHAJ= 1 100 200 300 400 500 600 700 800 857 YER017C YPR024W YMR089C bsubtilis-ftsh hum paraplegin hum parapleg_ins XF0093 480 490 500 510 520 523 PATTERN alxkrxkrar XfhpXLLrXE XasXXpfXXa fxxxxxjxg- -gxrxxxxxx XXD YER017C LLTKNLDKVD LVAKELLRKE AITREDMIRL LGPRPFKE-- --RNEAFEKY LDP YPR024W LLTKKNVELH RLAQGLIEYE TLDAHEIEQV CKGEKLDKLK TSTNTVVEGP DSD YMR089C LLKEKAEDVE KIAQVLLKKE VLTREDMIDL LGKRPFPE-- --RNDAFDKY LND bsubtilis-ftsh ILTENRDKLE LIAQTLLKVE TLDAEQIKHL IDHGTLPE-- --RNFSDDEK NDD paraplegin VLQDNLDKLQ ALANALLEKE VINYEDIEAL IGPPPHGP-- --KKMIAPQR WID paraplegin_ins VLQDNLDKLQ ALANALLEKE VIN XF0093 ILADNLDKLH AMSQLLLQYE TIDAPQIDAI MEGREPPPPV GWGKPSDNGS DDD Figure 11 Domain diagram and partial alignment for the mitochondrial AAA-proteases. Sequences are obtained from H. sapiens (paraplegin and paraplegin_ins containing the adenine insertion), B. subtilis (bsubtilis-ftsh), Xylella fastidiosa (XF0093) and S. cerevisiae (YER017C, YPR024W and YMR089C). Colors are as in Figure 1. =JKHA BJDA=II?E=JA@E AIIEI M MEJDFHA?EIE 6DEI D=I = >A=HE C KF?D E?A B JDAH=FO = @ = MI = K?D HAAN=?JFH C IEIJD= =?D=H=?JAHE =JE IE F O=I I=O AECDIO @H A 1 =@@EJE JDAE B H =JE BH IAGKA?A = EC A JID=IFHA@E?JELAF MAH KJ=JE I JOAJHA? H@A@ E JDA EJAH=JKHA ECDJJKH KF=J AOF IEJE IE I A BJDA FH JAE I M J >A =BBA?JA@ E EJ?D @HE= @EIA=IAI 1 @ELE@K= I?=HHOE C IK?D?D= CAI ECDJ >A = AHJA@ J JDA F IIE> A D=H BK? IAGKA?AI = @ J= A MD=JALAH FHA?=K JE =HO A=IKHAIIAA =FFH FHE=JA JAJD=JIK>IJEJKJE I E JDA DK = IAGKA?A JD=J =J?D = E =?E@ HAIE@KAI B K @ =J JDA? HHAIF @E C?=JE I E J=N E?= O @EIJ= JIAGKA?AI=HA AII E A OJ FH LAD=H BK JD=? F AJA O LA IK>IJEJKJE I 6DEI =FF E?=JE B E B H =JE?I @=J=ME?AHJ=E OCH ME E F HJ=?A=I=?GKEIEJE B=@@E JE = IE C A K? A JE@AF O HFDEI 5 2@=J=>A? AI E?HA=IE C O? = @ AM F=H= CI = @ HJD CI =HA =@@A@ J JDA IAGKA?A @=J=>=IA E = O @AFA @E C JDA =JKHA B JDA =BBA?JA@ FH JAE = @ JDA BHAGKA?O B??KH HA?A B@EIA=IA@E @ELE@K= I EJ =O>AF IIE> AJ KIAJDA @=J=B H=JJA FJI=JH=JE = @HKC@AIEC HCA AJDAH=FO 6DAM H HAF HJA@DAHAHAFHAIA JI=FHA E E =HO LAHLEAM B JDA BEA @ -LA JK= O = AND=KIJELA IKHLAO ID K @ >A? F AJA@? JE K KI O KF@=JA@ = @ =@A FK> E? O =L=E => A 6D=J ME HAGKEHA FHA?EIA?D=H=?JAHE =JE B = O HA KJ=JE I=IMA =IANF= IE BJDA@=J=>=IA B EJ?D @HE= FH JAE FH BE AI>AO @MD=JMAD=LA? IJHK?JA@KIE C OA=IJ=I= @A 5K?DABB HJI=HAK @AHM=O Materials and methods Yeast mitochondrial protein database (YMPD) 9A?HA=JA@=@=J=>=IA B "!OA=IJ EJ?D @HE= FH JAE I KIE C>E?DA E?= AOM H@IA=H?DAIB H M EJ?D @HE= FH?AIIAI BH JDA 5=??D=H O?AI /A A,=J=>=IA " JDA K E?D1 B H =JE +A JAHB H2H JAE 5AGKA?AI " JDA;A=IJ2H JA A,=J=>=IA "! = @ EJ*)5- "" = @ >O EJ HE C JDA?KHHA J EJAH=JKHA -=?D A >AH B JDEI IAJ B FH JAE I EI?D=H=?JAHE A@ >O =J A=IJ A FHE H >=IA@ FH BE A JD=JHAFHAIA JIJDA IJ? IAHLA@HACE I )I 5?AHALEIE=A @ AI J KIA =?= E?= EJ?D @HE=? F AN 1 MA D=LA IK>IJEJKJA@ JDA ),0 @ADO@H CA =IA FH JAE IK>K EJIBH 4 = AHE?= = J E EJE= E AJDAIAFH BE AI Developing prior-based profiles 2HE H >=IA@FH BE AI JDAFHE?EF= J IKIA@E JDEIM H HAFHAIA J AL KJE =HE O? IAHLA@ IAGKA?A >=IA@ BK? JE = @ =E I?A?HA=JA@ A=?D FH BE A?= >A KIA@ J

http://genomebiology.com/2001/2/6/research/0021.15 IA=H?D = =L=E => A @=J=>=IAI E = =JJA FJ J?=JA = D C KIIAGKA?A =J?DAI MDE?DE E@A=?=IAI? HHA IF @J IE C AFH JAE @ =E I 6DEIANF= @A@IAJ BD C KIIAGKA?AIEIJDA K JEF O= EC A@KIE CJDAFH BE A=I = JA F =JA 6DA IAJ B K JEF O = EC A@ IAGKA?AI @ABE AI IAGKA?A @ =E > K @=HEAI = MI FKJ=JELA BK?JE = =IIEC A JI B H FHALE KI O K E@A JEBEA@ FH JAE I = @ DECD ECDJI AO? IAHLA@ HAIE@KAI B A >AHI B = D C KI IAJ BFH JAE I )I AMIAGKA?AI= @IAGKA?A@CA AI >A? A =L=E => A KH IAJ B FH BE AI?= E@A JEBO JDA =@@E JE = D CIMEJDDECDIA IEJELEJO= @IFA?EBE?EJO The profile-defining set ) @ABE E C IAJ MEJD = ME@A J=N E? IFHA=@ ID K @ >A FJE = B H?=JE C@EIJ= JD CIBH =I = OIFA?EAI =I F IIE> A 1@A= FH BE AI D=LA = @ABE E C IAJ? F IA@ B A FH JAE A=?D BH = AK =HO JA = =H?D=A = /H= F IEJELA>=?JAHEK = @=/H= AC=JELA>=?JAHEK = @ A OA=IJ EJ?D @HE= FH JAE 6?HA=JA JDA @ABE E C IAJ B H A=?D FH BE A MA AIJ=> EIDA@ = IKFAHIAJ B F JA JE= B= E O A >AHIKIE C* )56IA=H?DAI "# =C=E IJ KH*E A?K =H - CE AAHE C 4AIA=H?D +A JAH * -4+ = CA AI @=J=>=IA = @ 5MEII 2H J "$ -=?D E@A JEBEA@ OA=IJ EJ?D @HE= FH JAE IAHLAI=I=[IAA@\E EJE= E E CFH JAE B H= * )56 IA=H?D 6DA?KJ BB I? HAI =HA IAJ =J = ANFA?J=JE L= KA B- `$ KIE CJDAIJ= @=H@5-/= @: 7BE JAHIB H M? F ANEJO HACE I "% H JDAIA * )56 DEJI MA IA A?JA@ JDA E EJE= @ABE E C IAJ B =NE K J=N E? IFHA=@ 1 I A?=IAI * )56 B=E I J?=JA IAGKA?A D CI JD=J D=LA @ELAHCA@ CHA=J O BH JDA IAA@ EJ?D @HE= FH JAE 9A JDA KIA@ = EJAH=JELA FH?A@KHA J CA AH=JA = FH BE A KIE C?= @O = E? FH CH= E C J =NE E AJDAE B H =JE? JA J HIK?D?=IAI JDAE EJE= FH BE AEIHK =C=E IJJDA= CA AI@=J=>=IA>OAND=KIJELA?= @O = E?FH CH= E C 6DA JDA ANJDECDAIJI? HE C A >AHBH =@EBBAHA J E C@ EI=@@A@J JDA@ABE E C IAJ 9AEJAH=JA@JDEIFH?AIIK JE JDAE@A= J=N E?IFHA=@ M=I =?DEALA@ H JDA E B H =JE? JA J B JDA FH BE A @H FFA@>A MJA = E =?E@AGKEL= A?EAI "& Expanding the set of profile homologs?a = FH BE A EI >KE J MA =JJA FJ J BE @ = IAGKA?AI ID=HE C JDEI IAGKA?A @ =E 6DEI EI @ A >O DAKHEIJE? = @ HAND=KIJELA=FFH =?DAI 1 JDADAKHEIJE?=FFH =?DJDA FH BE AIA=H?DAI* -4+\I= CA AI= @5MEII 2H J>OBE JAHE CJDA =HCAH@=J=>=IAIKIE C* )56MEJD= ANFA?J=JE L= KA?KJ BB B- ` B MA@>O=ID HJ E CFH BE A IA=H?D 1 JDAAND=KIJELA=FFH =?D JDAFH BE AIA=H?DAIJDA = CA AI @=J=>=IA H JDA?KHHA J LAHIE B 5MEII 2H J KIE CID HJ E C@O = E?FH CH= E C Multiple alignment sets -=?DFH BE AE @K?AI= K JEF A= EC A JB H= OIAJ BFH JAE IJD=J? J=E EJ 6DAFH BE AI O= EC JD IAIK>HACE I B=FH JAE JD=JJDAO =J?D 6DAIAHACE IE? HF H=JAJDA IJ? IAHLA@= E =?E@F IEJE I= CJDAIAGKA?AI JD=J = A KF JDA FH BE A\I @ABE E C IAJ 9A D=LA?D IA @ABE E C IAJI JD=J? J=E JDA ID HJAIJ E B H =JELA FH JAE IAGKA?A E H@AH J =NE E A JDA?D=?A B FH @K?E C IE C A @ =E FH BE A > A?JI ) CAH IAGKA?A E JDA @ABE E CIAJ ECDJ = A=I ECDJ ODECDAH I? HE CFH BE A>O JDA =@@EJE B = BAM?=H> NO H = E JAH E = HAIE@KAI >KJ O =J JDA HEI B EIIE C I ECDJ O ID HJAH A >AHI B JDAIAJMDA JDAFH BE AEIKIA@J IA=H?DJDA@=J=>=IAI Homologous structures from the Protein Data Bank 9A H KJE A O IA=H?D JDA 4+5* 2H JAE,=J= *= "' B H F JA JE= IJHK?JKH= D CI 2H BE A =J?DAIMEJD I? HAI => LA=HAHAJ=E A@= @KIA@J = EC JDAFH BE A =J?DA@ HACE J =IJHK?JKH= @ =E 6DAIAFH BE A E @K?A@IJHK? JKH= = EC A JI?= >A LEIK= E A@ E 4=I # = @ E =CAI>KE JKIE C I?HEFJ # = @4=IJAH!, # Additional data files 6DA B ME C =@@EJE = @=J= BE AI =HA =L=E => A MEJD JDEI F=FAH E A? F AJA = EC A JI B H JDA FH BE A? LAHA@ HACE I B JDA IAGKA?AI ID M E ECKHA ECKHA ECKHA! ECKHA " ECKHA # ECKHA $ ECKHA % ECKHA & ECKHA ' ECKHA = @ ECKHA Acknowledgements We thank Greg McAllister for helpful discussions of transmembrane protein structures. This work was partially supported by US Department of Energy Grant DE-FG02-98ER62558 and NSF Grant DBI-9807993. References 1. Das S, Smith, TF: Identifying nature s protein Lego set. In Advances in Protein Chemistry. Edited by Richards FM, Eisenberg DS, Kim PS. San Diego: Academic Press; 2000:159-183. 2. Lang BF, Burger G, O Kelly CJ, Cedergren R, Golding GB, Lemieux C, Sankoff D, Turmel M, Gray MW: An ancestral mitochondrial DNA resembling a eubacterial genome in miniature. Nature 1997, 387:493-497. 3. Taanman JW: The mitochondrial genome: structure, transcription, translation and replication. Biochim Biophys Acta 1999, 1410:103-123. 4. Leonard JV, Schapira AH: Mitochondrial respiratory chain disorders I: mitochondrial DNA defects. Lancet 2000, 355:299-304. 5. Mitochondrial disorders [http://www.neuro.wustl.edu/neuromuscular/mitosyn.html] 6. Rizzuto R, Pinton P, Carrington W, Fay FS, Fogarty KE, Lifshitz LM, Tuft RA, Pozzan T: Close contacts with the endoplasmic reticulum as determinants of mitochondrial Ca 2+ responses. Science 1998, 280:1763-1766. 7. Skulachev VP: Mitochondrial filaments and clusters as intracellular power-transmitting cables. Trends Biochem Sci 2001, 26:23-29. 8. Schapira AH: Mitochondrial disorders. Biochim Biophys Acta 1999, 1410:99-102. 9. Hayashi J, Ohta S, Kikuchi A, Takemitsu M, Goto Y, Nonaka I: Introduction of disease-related mitochondrial DNA deletions into HeLa cells lacking mitochondrial DNA results in mitochondrial dysfunction. Proc Natl Acad Sci USA 1991, 88:10614-10618. 10. Chomyn A, Martinuzzi A, Yoneda M, Daga A, Hurko O, Johns D, Lai ST, Nonaka I, Angelini C, Attardi G: MELAS mutation in mtdna binding site for transcription termination factor causes defects in protein synthesis and in respiration but no comment reviews reports deposited research refereed research interactions information

16 Genome Biology Vol 2 No 6 2 =IJAHAHAJ= change in levels of upstream and downstream mature transcripts. Proc Natl Acad Sci USA 1992, 89:4221-4225. 11. Riordan-Eva P, Harding AE: Leber s hereditary optic neuropathy: the clinical relevance of different mitochondrial DNA mutations. J Med Genet 1995, 32:81-87. 12. Brown MD, Voljavec AS, Lott MT, MacDonald I, Wallace DC: Leber s hereditary optic neuropathy: a model for mitochondrial neurodegenerative diseases. FASEB J 1992, 6:2791-2799. 13. Zeviani M, Bertagnolio B, Uziel G: Neurological presentations of mitochondrial diseases. J Inherit Metab Dis 1996, 19:504-520. 14. Trounce I, Neill S, Wallace DC: Cytoplasmic transfer of the mtdna nt 8993 T G (ATP6) point mutation associated with Leigh syndrome into mtdna-less cells demonstrates cosegregation with a decrease in state III respiration and ADP/O ratio. Proc Natl Acad Sci USA 1994, 91:8334-8338. 15. Santorelli FM, Shanske S, Macaya A, DeVivo DC, DiMauro S: The mutation at nt 8993 of mitochondrial DNA is a common cause of Leigh s syndrome. Annl Neurol 1993, 34:827-834. 16. Marsac C, Benelli C, Desguerre I, Diry M, Fouque F, De Meirleir L, Ponsot G, Seneca S, Poggi F, Saudubray JM, et al.: Biochemical and genetic studies of four patients with pyruvate dehydrogenase E1 alpha deficiency. Hum Genet 1997, 99:785-792. 17. Dahl H-HM: Getting to the nucleus of mitochondrial disorders: Identification of respiratory chain-enzyme genes causing Leigh syndrome. Am J Hum Genet 1998, 63:1594-1597. 18. Tiranti V, Hoertnagel K, Carrozzo R, Galimberti C, Munaro M, Granatiero M, Zelante L, Gasparini P, Marzella R, Rocchi M, et al.: Mutations of SURF-1 in Leigh disease associated with cytochrome c oxidase deficiency. Am J Hum Genet 1998, 63:1609-1621. 19. Poyau A, Buchet K, Bouzidi MF, Zabot MT, Echenne B, Yao J, Shoubridge EA, Godinot C: Missense mutations in SURF1 associated with deficient cytochrome c oxidase assembly in Leigh syndrome patients. Hum Genet 2000, 106:194-205. 20. Bourgeron T, Rustin P, Chretien D, Birch-Machin M, Bourgeois M, Viegas-Pequignot E, Munnich A, Rotig A: Mutation of a nuclear succinate dehydrogenase gene results in mitochondrial respiratory chain deficiency. Nat Genet 1995, 11:144-149. 21. Parfait B, Chretien D, Rotig A, Marsac C, Munnich A, Rustin P: Compound heterozygous mutations in the flavoprotein gene of the respiratory chain complex II in a patient with Leigh syndrome. Hum Genet 2000, 106:236-243. 22. Nishino I, Spinazzola A, Hirano M: Thymidine phosphorylase gene mutations in MNGIE, a human mitochondrial disorder. Science 1999, 283:689-692. 23. Miyadera K, Dohmae N, Takio K, Sumizawa T, Haraguchi M, Furukawa T, Yamada Y, Akiyama S: Structural characterization of thymidine phosphorylase from human placenta. Biochem Biophys Res Commun 1995, 212:1040-1045. 24 Brown NS, Bicknell R: Thymidine phosphorylase, 2-deoxy-Dribose and angiogenesis. Biochem J 1998, 334:1-8. 25. Jin H, May M, Tranebjaerg L, Kendall E, Fontan G, Jackson J, Subramony SH, Arena F, Lubs H, Smith S, et al.: A novel X-linked gene, DDP, shows mutations in families with deafness (DFN-1), dystonia, mental deficiency and blindness. Nat Genet 1996, 14:177-180. 26. Koehler CM, Leuenberger D, Merchant S, Renold A, Junne T, Schatz G: Human deafness dystonia syndrome is a mitochondrial disease. Proc Natl Acad Sci USA 1999, 96:2141-2146. 27. Tranebjaerg L, Hamel BC, Gabreels FJ, Renier WO, Van Ghelue M: A de novo missense mutation in a critical domain of the X- linked DDP gene causes the typical deafness-dystonia-optic atrophy syndrome. Eur J Hum Genet 2000, 8:464-467. 28. Koehler CM, Merchant S, Schatz, G: How membrane proteins travel across the mitochondrial intermembrane space. Trends Biochem Sci 1999, 24:428-432. 29. Leuenberger D, Bally NA, Schatz G, Koehler CM: Different import pathways through the mitochondrial intermembrane space for inner membrane proteins. EMBO J 1999, 18:4816-4822. 30. Rotig A, de Lonlay P, Chretien D, Foury F, Koenig M, Sidi D, Munnich A, Rustin P: Aconitase and mitochondrial ironsulphur protein deficiency in Friedreich ataxia. Nat Genet 1997, 17:215-217. 31. Delatycki MB, Williamson R, Forrest SM: Friedreich ataxia: an overview. J Med Genet 2000, 37:1-8. 32. Sakamoto N, Chastain PD, Parniewski P, Ohshima K, Pandolfo M, Griffith JD, Wells RD: Sticky DNA: self-association properties of long GAA.TTC repeats in R.R.Y Triplex structures from Friedreich s ataxia. Mol Cell 1999, 3:465-475. 33. Branda SS, Yang ZY, Chew A, Isaya G: Mitochondrial intermediate peptidase and the yeast frataxin homolog together maintain mitochondrial iron homeostasis in Saccharomyces cerevisiae. Hum Mol Genet 1999, 8:1099-1110. 34. Allikmets R, Raskind WH, Hutchinson A, Schueck ND, Dean M, Koeller DM: Mutation of a putative mitochondrial iron transporter gene (ABC7) in X-linked sideroblastic anemia and ataxia (XLSA/A). Hum Mol Genet 1999, 8:743-749. 35. Stultz CM, White JV, Smith TF: Structural analysis based on state-space modeling. Prot Sci 1993, 2:305-314. 36. White JV, Stultz CM, Smith TF: Protein classification by stochastic modeling and optimal filtering of amino acid sequences. Math Biosci 1994, 119:35-75. 37. Casari G, De Fusco M, Ciarmatori S, Zeviani M, Mora M, Fernandez P, De Michele G, Filla A, Cocozza S, Marconi R, et al.: Spastic paraplegia and OXPHOS impairment caused by mutations in paraplegin, a nuclear-encoded mitochondrial metalloprotease. Cell 1998, 93:973-983. 38. Steglich G, Neupert W, Langer T: Prohibitins regulate membrane protein degradation by the m-aaa protease in mitochondria. Mol Cell Biol 1999, 19:3435-3442. 39. Online Mendelian Inheritance in Man [http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=omim] 40. Mitochondrial Project http://www.mips.biochem.mpg.de/proj/medgen/mitop/] 41. Chervitz SA, Hester ET, Ball CA, Dolinski K, Dwight SS, Harris MA, Juvik G, Malekian A, Roberts S, Roe T, et al.: Using the Saccharomyces Genome Database (SGD) for analysis of protein similarities and structure. Nucleic Acids Res 1999, 27:74-78. 42. Mewes HW, Frishman D, Gruber C, Geier B, Haase D, Kaps A, Lemcke K, Mannhaupt G, Pfeiffer F, Schuller C, et al.: MIPS: a database for genomes and protein sequences. Nucleic Acids Res 2000, 28:37-40. 43. Hodges PE, McKee AH, Davis BP, Payne WE, Garrels JI: The Yeast Proteome Database (YPD): a model for the organization and presentation of genome-wide functional data. Nucleic Acids Res 1999, 27:69-73. 44. de Pinto B, Malladi SB, Altamura N: MitBASE pilot: a database on nuclear genes involved in mitochondrial biogenesis and its regulation in Saccharomyces cerevisiae. Nucleic Acids Res 1999, 27:147-149. 45. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ: Basic local alignment search tool. J Mol Biol 1990, 215:403-410. 46. Bairoch A, Apweiler R: The SWISS-PROT protein sequence data bank and its supplement TrEMBL in 1999. Nucleic Acids Res 1999, 27:49-54. 47. Wootton JC, Federhen S: Analysis of compositionally biased regions in sequence databases. Methods Enzymol 1996, 266:554-571. 48. Smith RF, Smith TF: Automatic generation of primary sequence patterns from sets of related protein sequences. Proc Natl Acad Sci USA 1990, 87:118-122. 49. Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, Shindyalov IN, Bourne PE: The Protein Data Bank. Nucleic Acids Res 2000, 28:235-242. 50. Sayle RA, Milner-White EJ: RASMOL: biomolecular graphics for all. Trends Biochem Sci 1995, 20:374. 51. Kraulis PJ: MOLSCRIPT: a program to produce both detailed and schematic plots of protein structures. J Appl Crystallogr 1991, 24:946-950. 52. Merritt E, Bacon D: Raster3D: photorealistic molecular graphics. Methods Enzymol 1997, 277:505-524. 53. Amino acid classes [http://bmerc-www.bu.edu/description/aaclasses.html]