A Theoretical Inference of Protein Schemes from Amino Acid Sequences
Angel Villahoz-Baleta

ABSTRACT
Proteins are based on three-dimensional dispositions generated from amino acid sequences. The disposition of a new protein must be completely or almost completely stable for it to have a biological meaning. Proteins can be viewed as the building blocks of living beings. Finding such molecular stability is a well-known computational problem that artificial intelligence (AI) has yet to solve fully for the biology community. The theoretical inference proposed in this paper tries to find potentially stable protein schemes using two algorithmic tools from AI: uninformed search and informed search.

Author Keywords
artificial intelligence, amino acid, hydrophobic, hydrophilic, informed search, molecular stability, neutral, protein, protein structure, uninformed search, scheme, sequence, water affinity.

INTRODUCTION
The chemical challenge of molecular stability in proteins is frequently covered in scientific publications. Many of these publications show the importance of AI as a valuable aid to research. The theoretical inference in this paper also uses AI, but it makes AI and chemical properties depend on each other as an alternative line of research.

Several chemical properties influence the molecular stability of proteins, but the most important one would be water affinity, since every known biological process takes place in an aqueous environment. About 60% of the adult human body is composed of water, and our biological processes occur there. Each amino acid comes from a set of 22 different amino acids. They are classified into three classes, hydrophobic, hydrophilic, and neutral, depending on their water affinity. The water-affinity subdivision of the 22 amino acids, together with their short abbreviations, is shown in Table 1.
Hydrophobic amino acids have an aversion to water molecules, so they tend to be inside the protein. Hydrophilic amino acids are attracted to water molecules, so they tend to be in the outer layer of the protein. Neutral amino acids have neither an adverse nor a favorable reaction to water molecules, so they can be placed anywhere in the protein.

Amino Acid Name (Short Abbreviation): Alanine (A), Arginine (R), Asparagine (N), Aspartic Acid (D), Cysteine (C), Glutamic Acid (E), Glutamine (Q), Glycine (G), Histidine (H), Hydroxyproline (O), Isoleucine (I), Leucine (L), Lysine (K), Methionine (M), Phenylalanine (F), Proline (P), Pyroglutamic Acid (U), Serine (S), Threonine (T), Tryptophan (W), Tyrosine (Y), Valine (V)
Water Affinity: Neutral, Neutral
Table 1. The water-affinity subdivision of the 22 amino acids.
A 1D sequence can leave its one-dimensionality by folding onto itself. This folding mechanism generates 2D schemes and 3D structures. 2D schemes are the intermediate step between 1D sequences and 3D structures. The 3D structures are the most accurate representation of proteins, but they are probably also among the most expensive mathematical objects to compute. The 2D schemes are less accurate but demand less computational effort. So there is a trade-off between accuracy and computational cost. Yet this trade-off can be improved with two AI algorithms: uninformed search and informed search. Uninformed search uses brute force to form every possible 2D scheme. Informed search refines the results of the uninformed search using the chemical property of water affinity. Finally, a successful 2D scheme can serve as the basis from which the algorithms that produce 3D structures start, with a higher probability of success and at a lower cost.

PROJECT DESCRIPTION
A queue contains the amino acid sequence, and it is preferable to put the first amino acid at the head of the queue, not the tail. Each amino acid taken from the queue is classified according to its water affinity. The first two amino acids always have a unique 2D disposition, since there is only one possible way for them to be connected with each other. This initial disposition is the start state. Let $s_i$ be a state defined as a succession of planar coordinates, each one being the position of an amino acid. Then the start state can be written as $s_0 = \{(0, 0), (1, 0)\}$ (see Figure 1). Notice that, in the arbitrary amino acid sequence used as the example, the first amino acid is hydrophilic (white) and the second is hydrophobic (black).
A priori, the minimal number of amino acids to begin with is two, but the threshold for a minimal biological meaning is about fifteen amino acids, as in viruses, the simplest living beings. After the first two amino acids are connected in an imaginary plane mimicking an aqueous environment, the next amino acid from the queue is placed on each of the 3 possible sides of the last amino acid, following an uninformed search strategy (USS). The three new states are written as $s_1 = \{(0, 0), (1, 0), (1, 1)\}$, $s_2 = \{(0, 0), (1, 0), (2, 0)\}$, and $s_3 = \{(0, 0), (1, 0), (1, -1)\}$ (see Figure 2). The USS always tries every possible disposition on these 3 sides; the fourth side is excluded because the last amino acid is connected through it to the next-to-last amino acid. Yet one or more sides are often unavailable to the next amino acid because they are already occupied by amino acids placed earlier in the search. So each next amino acid typically has 1 to 3 free sides to try, and the branching factor of the USS is at most 3.

The size of the search space based on the USS is equal to or less than $3^{m-2}$ planar dispositions, where $m$ is the number of amino acids. For example, the number of planar dispositions for a sequence of 5 amino acids is at most $3^{5-2} = 3^3 = 27$ states. Unfortunately, since the proteins most studied by biologists have about 150 amino acids, the uninformed search space can easily become too big. So there is an alternative search strategy to consider: the informed search strategy (ISS). The main difference between the USS and the ISS is that the ISS uses a chemical property, water affinity, as an information rule that refines the search so that fewer sides are considered. Hydrophobic amino acids are the target of this information rule.
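The expansion step of the USS described above can be sketched in Python, the language the project reports using. This is a minimal illustration and not the project's code: the names `START`, `SIDES`, `uss_successors`, and `count_dispositions` are hypothetical, and the order of the sides is chosen only so that the successors appear in the order $s_1$, $s_2$, $s_3$ given in the text.

```python
from typing import Iterator, Tuple

Point = Tuple[int, int]
State = Tuple[Point, ...]

# Start state from the text: the first two amino acids at (0, 0) and (1, 0).
START: State = ((0, 0), (1, 0))

# Up, right, down, left (an assumed order that reproduces s1, s2, s3).
SIDES = ((0, 1), (1, 0), (0, -1), (-1, 0))


def uss_successors(state: State) -> Iterator[State]:
    """Yield each state obtained by placing one more amino acid on a free side."""
    occupied = set(state)
    x, y = state[-1]
    for dx, dy in SIDES:
        site = (x + dx, y + dy)
        # Occupied sites are skipped; this also excludes the side that
        # connects the last amino acid to the next-to-last one.
        if site not in occupied:
            yield state + (site,)


def count_dispositions(m: int) -> int:
    """Count the planar dispositions of m amino acids (m >= 2) grown from START."""
    frontier = [START]
    for _ in range(m - 2):
        frontier = [nxt for st in frontier for nxt in uss_successors(st)]
    return len(frontier)


print(list(uss_successors(START)))
print(count_dispositions(5))  # 25, below the bound 3**3 = 27
```

For 5 amino acids the count is 25 rather than 27 because two of the candidate placements would land on the already occupied site (0, 0). For the 150-amino-acid proteins mentioned above, the bound $3^{148}$ is on the order of $10^{70}$, which is why exhaustive enumeration quickly becomes impractical.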
Figure 1. The start state. Hydrophilic amino acids are white circles, hydrophobic amino acids black circles, and neutral amino acids gray circles.
Figure 2. The three possible states, $s_1$, $s_2$, and $s_3$, based on the USS with $m = 3$ after the start state.

The information rule of water affinity for any amino acid is defined by the following points:
- If a hydrophilic or neutral amino acid comes from the queue, the ISS follows the rules of the USS regarding the sides.
- If a hydrophobic amino acid comes from the queue, it is placed on the free side(s) nearest to the last hydrophobic amino acid placed in the 2D scheme.
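The two points of the information rule can be sketched as a filter over the free sides. This is an illustrative sketch under stated assumptions, not the project's implementation: the function name `iss_sides` and the one-letter labels ("H" for hydrophobic, anything else for hydrophilic or neutral) are hypothetical, squared distances are compared instead of distances (the ordering is the same), and falling back to the USS sides when no hydrophobic amino acid has been placed yet is an assumption.

```python
from typing import List, Sequence, Tuple

Point = Tuple[int, int]
SIDES = ((0, 1), (1, 0), (0, -1), (-1, 0))


def iss_sides(positions: Sequence[Point], kinds: Sequence[str],
              next_kind: str) -> List[Point]:
    """Return the sites the informed search would try for the next amino acid.

    positions: sites of the amino acids placed so far, in chain order.
    kinds: one label per placed amino acid; "H" marks a hydrophobic one.
    next_kind: label of the amino acid coming from the queue.
    """
    occupied = set(positions)
    x, y = positions[-1]
    free = [(x + dx, y + dy) for dx, dy in SIDES
            if (x + dx, y + dy) not in occupied]
    placed_h = [p for p, k in zip(positions, kinds) if k == "H"]
    # Hydrophilic or neutral amino acid (or no hydrophobic one placed yet,
    # an assumed fallback): same sides as the uninformed search.
    if next_kind != "H" or not placed_h or not free:
        return free
    hx, hy = placed_h[-1]  # last hydrophobic amino acid placed
    squared = {s: (s[0] - hx) ** 2 + (s[1] - hy) ** 2 for s in free}
    nearest = min(squared.values())
    return [s for s in free if squared[s] == nearest]


# Hypothetical labeling: hydrophobic, non-hydrophobic, then a hydrophobic
# amino acid arriving from the queue.
print(iss_sides([(0, 0), (1, 0)], ["H", "P"], "H"))  # [(1, 1), (1, -1)]
```

In the usage line the first amino acid is labeled hydrophobic, an assumption made so that the three candidate sites lie at different distances from it; the site (2, 0) is then the farthest and is dropped.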
The new states generated by the ISS after the start state are written as $s_1 = \{(0, 0), (1, 0), (1, 1)\}$ and $s_2 = \{(0, 0), (1, 0), (1, -1)\}$ (see Figure 3). Notice that the potential state $\{(0, 0), (1, 0), (2, 0)\}$ is dismissed because the distance between its two hydrophobic amino acids is greater than in $s_1$ and $s_2$.

Figure 3. The two possible states, $s_1$ and $s_2$, based on the ISS after the start state.

Evidently, if a given sequence has more hydrophobic amino acids than hydrophilic or neutral ones, the ISS becomes more powerful than the USS. According to Table 1, the ratio between hydrophobic and hydrophilic amino acids is 9:12, which lets the ISS work with far fewer sides than the USS.

The size of the search space based on the ISS is not greater than $3^m + n$ planar dispositions, where $m$ is the number of non-hydrophobic amino acids in the subsequence starting from the third amino acid. The second term, $n$, is a quasi-linear function of the total number of planar dispositions generated by each hydrophobic amino acid under the minimum Euclidean distance rule. The example in Figure 3 would have $n = 2$ states.

Each state generated by the USS or the ISS is classified as stable or unstable. The main condition of molecular stability, which is the goal state for any amino acid sequence, is summarized in the following rule:

Each hydrophobic amino acid must be isolated from the water molecules; that is, each of its sides must be occupied by a hydrophilic or neutral amino acid and never by a water molecule.

An amino acid sequence in the queue is abstracted as a data set consisting of a string of uppercase letters. Each uppercase letter is the short abbreviation of an amino acid (see Table 1). The data set can contain as many amino acids as the computational resources can handle, but it is very rare for even a giant protein to have more than 400 amino acids. Only the data set of a single amino acid sequence is accepted as input data for both the USS and the ISS, but the project can work with two or more data sets in batch mode.

Biologists use a standard format known as the FASTA format to store and exchange data sets. There are two FASTA formats, one for nucleic acids and one for amino acids; this project uses the latter. Figure 4 shows an example of the FASTA format. The format requires the contents of any written or electronic medium to be readable as plain text. The first line must carry information about the protein, such as its name and its NCBI identifier. The amino acid sequence is broken into short lines of 70 alphanumeric characters, which follow in the file. Each line can be seen as a subsequence of the sequence, split only for easier human reading. The project accepts only data sets that follow the FASTA format.

>gi gb AAD cytochrome b [Elephas maximus maximus]
LCLYTHIGRNIYYGSYLYSETWNTGIMLLLITMATAFMGYVLPWGQMSFWGATVITNLFSAIPYIGTNLV
EWIWGGFSVDKATLNRFFAFHFILPFTMVALAGVHLTFLHETGSNNPLGLTSDSDKIPFHPYYTIKDFLG
LLILILLLLLLALLSPDMLGDPDNHMPADPLNTPLHIKPEWYFLFAYAILRSVPNKLGGVLALFLSIVIL
GLMPFLHTSKHRSMMLRPLSQALFWTLTMDLLTLTWIGSQPVEYPYTIIGQMASILYFSIILAFLPIAGX
IENY
Figure 4. An example of the FASTA format.

ANALYSIS OF RESULTS
The results were produced by running the project on several electronic files containing biological information organized according to the FASTA format. The execution of the project pauses only when a stable 2D scheme is detected and displayed. During the tests there was always a pause whenever a stable 2D scheme was generated and detected (see Figure 5 for an example). The remaining 2D schemes were unstable, and the project dismissed them and continued its execution.
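Reading amino acid data sets in the FASTA format can be sketched as follows. This is a minimal reader written for illustration, not the project's loader: the function name `read_fasta` is hypothetical, and returning one entry per header line is an assumption about how a batch of data sets might be handled.

```python
from typing import Dict, Iterable, Optional


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Map each header line (without the leading '>') to its joined sequence."""
    records: Dict[str, str] = {}
    header: Optional[str] = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # blank lines carry no data
        if line.startswith(">"):
            header = line[1:].strip()
            records[header] = ""
        elif header is None:
            raise ValueError("sequence data found before any '>' header line")
        else:
            # The short lines are only a reading aid; join them back
            # into one string of uppercase letters.
            records[header] += line.upper()
    return records


# A made-up two-line record, shortened for the example.
demo = [">demo protein", "LCLYTHIG", "RNIY"]
print(read_fasta(demo))  # {'demo protein': 'LCLYTHIGRNIY'}
```

Because the function accepts any iterable of lines, an open text file can be passed to it directly.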
All executions of the project were visual, and each node expansion of both the USS and the ISS was visually checked at one-second intervals. Eventually a one-dimensional pattern was discovered. If the 1D sequence was too fragmented, that is, if the alternating subsequences of the same water affinity were too short, no stable 2D scheme was generated.

Figure 5. A stable 2D scheme detected as the goal state.
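The goal test applied to each 2D scheme, following the stability rule stated in the project description, might look like the sketch below. It is an interpretation and not the project's code: the name `is_stable` and the "H" label for hydrophobic amino acids are hypothetical, an empty lattice site is taken to stand for a water molecule, and any occupied neighboring site is treated as isolating, including one held by another hydrophobic amino acid, which is an assumption beyond the wording of the rule.

```python
from typing import Sequence, Tuple

Point = Tuple[int, int]
SIDES = ((0, 1), (1, 0), (0, -1), (-1, 0))


def is_stable(positions: Sequence[Point], kinds: Sequence[str]) -> bool:
    """Return True if no hydrophobic amino acid ("H") has a side open to water."""
    occupied = set(positions)
    for (x, y), kind in zip(positions, kinds):
        if kind != "H":
            continue
        for dx, dy in SIDES:
            if (x + dx, y + dy) not in occupied:
                return False  # an empty site stands for a water molecule
    return True


# A 3x3 spiral of nine amino acids that starts at the center.
spiral = [(0, 0), (1, 0), (1, 1), (0, 1), (-1, 1),
          (-1, 0), (-1, -1), (0, -1), (1, -1)]
print(is_stable(spiral, "HPPPPPPPP"))  # True: the center is fully enclosed
print(is_stable(spiral, "HHPPPPPPP"))  # False: (1, 0) faces the empty site (2, 0)
```

Under this reading, a scheme with no hydrophobic amino acid is trivially stable, so a stricter test may be needed for such sequences.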
If the longest subsequence of hydrophilic and neutral amino acids was 4 times longer (matching the number of sides) than the longest subsequence of hydrophobic amino acids, then the probability of reaching a goal state was very high. Moreover, the successful ordering of these subsequences by water affinity was the hydrophobic subsequence first, followed by the other subsequence.

The metrics of both the USS and the ISS assign a cost of one to the creation of each node. The accumulated cost of all previously created nodes is carried into the creation of the next node at each recursive call of the USS and the ISS. After several metric tests with the same electronic files as before, the node counts at the states detected as goal states showed that the performance difference between the USS and the ISS was, as expected, not very marked for short sequences but became more and more marked for longer amino acid sequences.

DISCUSSION
The inference proposed here would improve if more chemical properties were added to water affinity in the algorithmic engine of the ISS. The branching factor would then be one and, rarely, two, thanks to the refinement provided by the new chemical properties. A few properties would be studied and chosen according to their chemical influence on molecular stability for the next version of the ISS. Candidate chemical properties include pH, polarity, and aromaticity.

Another interesting point of discussion is whether an amino acid sequence should be processed as a queue or as a stack. The input queue here is processed beginning at its head. But the 1D sequence offers another alternative, beginning at its tail, in which case the amino acid sequence would also be stored as a stack.
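Both processing orders can be served by one double-ended queue, as in the short sketch below. The generator name `feed` and its `from_tail` parameter are hypothetical and only illustrate the idea; `collections.deque` is Python's standard double-ended queue.

```python
from collections import deque
from typing import Iterator


def feed(sequence: str, from_tail: bool = False) -> Iterator[str]:
    """Yield amino acid letters from the head (queue order) or the tail (stack order)."""
    pending = deque(sequence)
    while pending:
        # popleft() takes the head of the sequence, pop() takes the tail.
        yield pending.pop() if from_tail else pending.popleft()


print(list(feed("LCLY")))                  # ['L', 'C', 'L', 'Y']
print(list(feed("LCLY", from_tail=True)))  # ['Y', 'L', 'C', 'L']
```

Since a deque removes items from either end in constant time, the same structure would also allow a search to alternate between the two ends.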
The inclusion of new chemical properties and the treatment of the amino acid sequence as a bidirectional data structure would allow a hypothetical new version of the ISS to restrict the explosive growth in the number of states generated when working with 3D structures instead of 2D schemes. The start state would then be $s_0 = \{(0, 0, 0)\}$. The first algorithmic step of the ISS would generate 6 states: $s_1 = \{(0, 0, 0), (1, 0, 0)\}$, $s_2 = \{(0, 0, 0), (0, 1, 0)\}$, $s_3 = \{(0, 0, 0), (0, 0, 1)\}$, $s_4 = \{(0, 0, 0), (-1, 0, 0)\}$, $s_5 = \{(0, 0, 0), (0, -1, 0)\}$, and $s_6 = \{(0, 0, 0), (0, 0, -1)\}$. Then the next states would be processed according to the formula of the minimum Euclidean distance:

$$d = \left((x_m - x_n)^2 + (y_m - y_n)^2 + (z_m - z_n)^2\right)^{1/2}$$

where $(x_m, y_m, z_m)$ and $(x_n, y_n, z_n)$ are the positions of the new and the last hydrophobic amino acids.

It is known that Python, the programming language used by the project, has a well-defined, standard GUI toolkit, Tkinter, for 2D schemes but, unfortunately, no standard GUI for 3D structures. Perhaps PyMOL could be integrated as a 3D GUI in the next version of the ISS for 3D structures.

CONCLUSION
The theoretical inference proposed here demonstrates that it is not necessary to generate all the possible final states, given their computational expense. Instead, it is possible to reach only the subset of final states with the highest probability of being goal states, at a lower computational cost. The use of AI as an algorithmic tool in informed searches, together with chemical properties used as refinement factors, opens new and less computationally costly research paths for biologists to discover proteins unknown in nature and beneficial for mankind.

ACKNOWLEDGMENT
The work described in this paper was conducted as part of a Fall 2012 Artificial Intelligence course, taught in the Computer Science department of the University of Massachusetts Lowell by Prof. Fred Martin.
GENERAL BIOLOGY LABORATORY EXERCISE Amino Acid Sequence Analysis of Cytochrome C in Bacteria and Eukarya Using Bioinformatics INTRODUCTION: All life forms undergo metabolic processes to obtain energy.
More informationStudies Leading to the Development of a Highly Selective. Colorimetric and Fluorescent Chemosensor for Lysine
Supporting Information for Studies Leading to the Development of a Highly Selective Colorimetric and Fluorescent Chemosensor for Lysine Ying Zhou, a Jiyeon Won, c Jin Yong Lee, c * and Juyoung Yoon a,
More information8 Grundlagen der Bioinformatik, SoSe 11, D. Huson, April 18, 2011
8 Grundlagen der Bioinformatik, SoSe 11, D. Huson, April 18, 2011 2 Pairwise alignment We will discuss: 1. Strings 2. Dot matrix method for comparing sequences 3. Edit distance and alignment 4. The number
More informationIn eukaryotes the most important regulatory genes contain homeobox sequences and are called homeotic genes.
1 rowth and development in organisms is controlled by a number of mechanisms that operate at the cellular level. The control elements involved in these mechanisms include hormones, the second messenger
More informationStructures in equilibrium at point A: Structures in equilibrium at point B: (ii) Structure at the isoelectric point:
ame 21 F10-Final Exam Page 2 I. (42 points) (1) (16 points) The titration curve for L-lysine is shown below. Provide (i) the main structures in equilibrium at each of points A and B indicated below and
More informationInvestigating Evolutionary Relationships between Species through the Light of Graph Theory based on the Multiplet Structure of the Genetic Code
07 IEEE 7th International Advance Computing Conference Investigating Evolutionary Relationships between Species through the Light of Graph Theory based on the Multiplet Structure of the Genetic Code Antara
More informationA rapid and highly selective colorimetric method for direct detection of tryptophan in proteins via DMSO acceleration
A rapid and highly selective colorimetric method for direct detection of tryptophan in proteins via DMSO acceleration Yanyan Huang, Shaoxiang Xiong, Guoquan Liu, Rui Zhao Beijing National Laboratory for
More informationLS1a Fall 2014 Problem Set #2 Due Monday 10/6 at 6 pm in the drop boxes on the Science Center 2 nd Floor
LS1a Fall 2014 Problem Set #2 Due Monday 10/6 at 6 pm in the drop boxes on the Science Center 2 nd Floor Note: Adequate space is given for each answer. Questions that require a brief explanation should
More informationProtein Structure Marianne Øksnes Dalheim, PhD candidate Biopolymers, TBT4135, Autumn 2013
Protein Structure Marianne Øksnes Dalheim, PhD candidate Biopolymers, TBT4135, Autumn 2013 The presentation is based on the presentation by Professor Alexander Dikiy, which is given in the course compedium:
More informationCollision Cross Section: Ideal elastic hard sphere collision:
Collision Cross Section: Ideal elastic hard sphere collision: ( r r 1 ) Where is the collision cross-section r 1 r ) ( 1 Where is the collision distance r 1 r These equations negate potential interactions
More informationSupporting information. Contents
Qi Jiang, Chunhua Hu and Michael D. Ward* Contribution from the Molecular Design Institute, Department of Chemistry, New York University, 100 Washington Square East, New York, NY 10003-6688 Supporting
More information4) Chapter 1 includes heredity (i.e. DNA and genes) as well as evolution. Discuss the connection between heredity and evolution?
Name- Chapters 1-5 Questions 1) Life is easy to recognize but difficult to define. The dictionary defines life as the state or quality that distinguishes living beings or organisms from dead ones and from
More informationCHEMISTRY ATAR COURSE DATA BOOKLET
CHEMISTRY ATAR COURSE DATA BOOKLET 2018 2018/2457 Chemistry ATAR Course Data Booklet 2018 Table of contents Periodic table of the elements...3 Formulae...4 Units...4 Constants...4 Solubility rules for
More informationDental Biochemistry EXAM I
Dental Biochemistry EXAM I August 29, 2005 In the reaction below: CH 3 -CH 2 OH -~ ethanol CH 3 -CHO acetaldehyde A. acetoacetate is being produced B. ethanol is being oxidized to acetaldehyde C. acetaldehyde
More informationHomework 9: Protein Folding & Simulated Annealing : Programming for Scientists Due: Thursday, April 14, 2016 at 11:59 PM
Homework 9: Protein Folding & Simulated Annealing 02-201: Programming for Scientists Due: Thursday, April 14, 2016 at 11:59 PM 1. Set up We re back to Go for this assignment. 1. Inside of your src directory,
More informationDiscussion Section (Day, Time): TF:
ame: Chemistry 27 Professor Gavin MacBeath arvard University Spring 2004 Final Exam Thursday, May 28, 2004 2:15 PM - 5:15 PM Discussion Section (Day, Time): Directions: TF: 1. Do not write in red ink.
More informationSeparation of Large and Small Peptides by Supercritical Fluid Chromatography and Detection by Mass Spectrometry
Separation of Large and Small Peptides by Supercritical Fluid Chromatography and Detection by Mass Spectrometry Application Note Biologics and Biosimilars Author Edgar Naegele Agilent Technologies, Inc.
More informationScoring Matrices. Shifra Ben-Dor Irit Orr
Scoring Matrices Shifra Ben-Dor Irit Orr Scoring matrices Sequence alignment and database searching programs compare sequences to each other as a series of characters. All algorithms (programs) for comparison
More information1014NSC Fundamentals of Biochemistry Semester Summary
1014NSC Fundamentals of Biochemistry Semester Summary Griffith University, Nathan Campus Semester 1, 2014 Topics include: - Water & ph - Protein Diversity - Nucleic Acids - DNA Replication - Transcription
More information(Lys), resulting in translation of a polypeptide without the Lys amino acid. resulting in translation of a polypeptide without the Lys amino acid.
1. A change that makes a polypeptide defective has been discovered in its amino acid sequence. The normal and defective amino acid sequences are shown below. Researchers are attempting to reproduce the
More informationSystematic approaches to study cancer cell metabolism
Systematic approaches to study cancer cell metabolism Kivanc Birsoy Laboratory of Metabolic Regulation and Genetics The Rockefeller University, NY Cellular metabolism is complex ~3,000 metabolic genes
More informationResearch Article Novel Numerical Characterization of Protein Sequences Based on Individual Amino Acid and Its Application
BioMed Research International Volume 215, Article ID 99567, 8 pages http://dx.doi.org/1.1155/215/99567 Research Article Novel Numerical Characterization of Protein Sequences Based on Individual Amino Acid
More informationBCH 4053 Exam I Review Spring 2017
BCH 4053 SI - Spring 2017 Reed BCH 4053 Exam I Review Spring 2017 Chapter 1 1. Calculate G for the reaction A + A P + Q. Assume the following equilibrium concentrations: [A] = 20mM, [Q] = [P] = 40fM. Assume
More informationUNIT TWELVE. a, I _,o "' I I I. I I.P. l'o. H-c-c. I ~o I ~ I / H HI oh H...- I II I II 'oh. HO\HO~ I "-oh
UNT TWELVE PROTENS : PEPTDE BONDNG AND POLYPEPTDES 12 CONCEPTS Many proteins are important in biological structure-for example, the keratin of hair, collagen of skin and leather, and fibroin of silk. Other
More informationPrinciples of Biochemistry
Principles of Biochemistry Fourth Edition Donald Voet Judith G. Voet Charlotte W. Pratt Chapter 4 Amino Acids: The Building Blocks of proteins (Page 76-90) Chapter Contents 1- Amino acids Structure: 2-
More informationNSCI Basic Properties of Life and The Biochemistry of Life on Earth
NSCI 314 LIFE IN THE COSMOS 4 Basic Properties of Life and The Biochemistry of Life on Earth Dr. Karen Kolehmainen Department of Physics CSUSB http://physics.csusb.edu/~karen/ WHAT IS LIFE? HARD TO DEFINE,
More informationChapter 5. Proteomics and the analysis of protein sequence Ⅱ
Proteomics Chapter 5. Proteomics and the analysis of protein sequence Ⅱ 1 Pairwise similarity searching (1) Figure 5.5: manual alignment One of the amino acids in the top sequence has no equivalent and
More informationModule No. 31: Peptide Synthesis: Definition, Methodology & applications
PAPER 9: TECHNIQUES USED IN MOLECULAR BIOPHYSICS I Module No. 31: Peptide Synthesis: Definition, Methodology & applications Objectives: 1. Introduction 2. Synthesis of peptide 2.1. N-terminal protected
More informationBENG 183 Trey Ideker. Protein Sequencing
BENG 183 Trey Ideker Protein Sequencing The following slides borrowed from Hong Li s Biochemistry Course: www.sb.fsu.edu/~hongli/4053notes Introduction to Proteins Proteins are of vital importance to biological
More informationIntroduction to graph theory and molecular networks
Introduction to graph theory and molecular networks Sushmita Roy sroy@biostat.wisc.edu Computational Network Biology Biostatistics & Medical Informatics 826 https://compnetbiocourse.discovery.wisc.edu
More informationNational Nutrient Database for Standard Reference Release 28 slightly revised May, 2016
National base for Standard Reference Release 28 slightly revised May, 206 Full Report (All s) 005, Cheese, cottage, lowfat, 2% milkfat Report Date: February 23, 208 02:7 EST values and weights are for
More informationProperties of Amino Acids
Biochemistry Department Date:19/9/ 2017 Properties of Amino Acids Prof.Dr./ FAYDA Elazazy Professor of Biochemistry and Molecular Biology 1 Intended Learning Outcomes (ILOs) By the end of this lecture,
More informationDiscussion Section (Day, Time):
Chemistry 27 pring 2005 Exam 3 Chemistry 27 Professor Gavin MacBeath arvard University pring 2005 our Exam 3 Friday April 29 th, 2005 11:07 AM 12:00 PM Discussion ection (Day, Time): TF: Directions: 1.
More informationFrom Amino Acids to Proteins - in 4 Easy Steps
From Amino Acids to Proteins - in 4 Easy Steps Although protein structure appears to be overwhelmingly complex, you can provide your students with a basic understanding of how proteins fold by focusing
More informationSTRUCTURAL BIOINFORMATICS I. Fall 2015
STRUCTURAL BIOINFORMATICS I Fall 2015 Info Course Number - Classification: Biology 5411 Class Schedule: Monday 5:30-7:50 PM, SERC Room 456 (4 th floor) Instructors: Vincenzo Carnevale - SERC, Room 704C;
More information4. The Michaelis-Menten combined rate constant Km, is defined for the following kinetic mechanism as k 1 k 2 E + S ES E + P k -1
Fall 2000 CH 595C Exam 1 Answer Key Multiple Choice 1. One of the reasons that enzymes are such efficient catalysts is that a) the energy level of the enzyme-transition state complex is much higher than
More informationDental Biochemistry Exam The total number of unique tripeptides that can be produced using all of the common 20 amino acids is
Exam Questions for Dental Biochemistry Monday August 27, 2007 E.J. Miller 1. The compound shown below is CH 3 -CH 2 OH A. acetoacetate B. acetic acid C. acetaldehyde D. produced by reduction of acetaldehyde
More informationGeometric interpretation of signals: background
Geometric interpretation of signals: background David G. Messerschmitt Electrical Engineering and Computer Sciences University of California at Berkeley Technical Report No. UCB/EECS-006-9 http://www.eecs.berkeley.edu/pubs/techrpts/006/eecs-006-9.html
More informationBiophysical Society On-line Textbook
Biophysical Society On-line Textbook PROTEINS CHAPTER 1. PROTEIN STRUCTURE Section 1. Primary structure, secondary motifs, tertiary architecture, and quaternary organization Jannette Carey* and Vanessa
More informationQuantifying sequence similarity
Quantifying sequence similarity Bas E. Dutilh Systems Biology: Bioinformatic Data Analysis Utrecht University, February 16 th 2016 After this lecture, you can define homology, similarity, and identity
More informationUsing an Artificial Regulatory Network to Investigate Neural Computation
Using an Artificial Regulatory Network to Investigate Neural Computation W. Garrett Mitchener College of Charleston January 6, 25 W. Garrett Mitchener (C of C) UM January 6, 25 / 4 Evolution and Computing
More information(Bio)chemical Proteomics. Alex Kentsis October, 2013
(Bio)chemical Proteomics Alex Kentsis October, 2013 http://alexkentsis.net A brief history of chemical proteomics 1907: Eduard Buchner, demonstration of cell-free alcohol fermentation (i.e. enzymes) 1946:
More informationAnalysis of Relevant Physicochemical Properties in Obligate and Non-obligate Protein-protein Interactions
Analysis of Relevant Physicochemical Properties in Obligate and Non-obligate Protein-protein Interactions Mina Maleki, Md. Mominul Aziz, Luis Rueda School of Computer Science, University of Windsor 401
More information