BIOINF 4120 Bioinformatics 2 - Structures and Systems - Oliver Kohlbacher Summer 2013 14. Systems Biology Exp. Methods
Overview
- Transcriptomics
  - Basics of microarrays
  - Comparative analysis
- Interactomics: yeast two-hybrid system
- Chromatography
- Mass Spectrometry
- Proteomics/Metabolomics
  - LC-MS
  - Differential quantification
  - Peptide identification
OMICS Mania http://omics.org/index.php/the_omics_pathway 3
Microarrays
- Microarrays measure mRNA concentration through hybridization to a probe immobilized on a carrier material ("chip")
- Hybridization requires complementarity between the probe sequence (short oligonucleotides) and the analyte/target sequence
- mRNA is usually reverse-transcribed into DNA (cDNA) for this purpose
- Current microarrays contain a very large number (hundreds of thousands) of these probes arranged in a regular pattern on the chip
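The complementarity requirement can be illustrated with a minimal sketch. It assumes perfect Watson-Crick pairing only (real hybridization tolerates mismatches and depends on temperature and salt), and the function names and both sequences are made up for illustration.

```python
# Watson-Crick base pairing for DNA.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))

def probe_sites(probe: str, target: str) -> list[int]:
    """Return 0-based start positions in `target` that are perfectly
    complementary to `probe` (assumption: no mismatches allowed)."""
    site = reverse_complement(probe)
    return [i for i in range(len(target) - len(site) + 1)
            if target[i:i + len(site)] == site]

target = "ATGGCGTACGTTAGC"  # made-up target (cDNA) fragment
probe = "AACGTACGC"         # made-up probe sequence
print(probe_sites(probe, target))  # [3]
```

The probe binds where the target contains its reverse complement, here `GCGTACGTT` starting at position 3.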
Hybridization and Detection http://www.scq.ubc.ca/wp-content/cdnaarray.gif 5
Types of Probes
- cDNA probes
  - Obtained by DNA amplification through PCR
  - Allow the construction of chips and libraries
  - cDNA extracted from arbitrary samples can be amplified and then spotted; no knowledge of the sequence is necessary
- Custom oligo probes
  - Custom synthesis of oligos
  - Allows the construction of arbitrary chips based on sequence alone
  - Examples: Affymetrix, Illumina
Spotted Arrays
- Spotting robots spot pre-synthesized probes onto the chip in a regular pattern (by pipetting or ink jet)
- This allows the design of microarrays with arbitrary custom probes
- Low-density arrays (few probes compared to industrial arrays)
http://www.digitalapoptosis.com/archives/science/microarray_printer.jpg
http://upload.wikimedia.org/wikipedia/commons/0/0a/microarray_printing.ogg
In Situ Synthesis
- In situ synthesis means that the probe is synthesized on the chip itself
- This can be achieved by photolithography, similar to the production of integrated circuits (see figure)
- Affymetrix arrays are produced this way
- ABI uses in situ synthesis through inkjet application
http://www.scq.ubc.ca/wp-content/genechip.gif
Affymetrix Arrays
- Among the most widely used arrays are those from Affymetrix ("affy chips")
- There are many variants with probes covering all genes of human, mouse, rat, and various other organisms
- Up to 500,000 probes per chip, 25 nt per probe
- Exon arrays contain probes for all exons to detect alternative splicing of genes
- SNP arrays contain well-known SNP variants and allow the detection of genetic variation
Affymetrix Array Synthesis http://www.youtube.com/watch?v=mun54ecfhpw 10
Fluorescence Readout
- A laser excites the fluorescent dyes on the chip
- A CCD detects the fluorescence signal of the whole chip
- Different dyes = different colors/channels
- Fluorescence is quantified per probe/spot from the image
http://www.affymetrix.com/products/instruments/specific/ht_array_plate_scanner.affx
http://www.wi.mit.edu/programs/ask/img/082406b.jpg
Bead Arrays (Illumina) http://www.illumina.com/technology/beadarray_technology.ilmn 12
Comparative Analysis http://www.microarray.lu/images/overview_1.jpg 13
Microarray Analysis http://www.youtube.com/watch?v=vnsthmnjkhm& 14
Interactomics: Networks
- Protein-protein interaction (PPI) networks are graphs containing an edge for each PPI
- The resulting networks are very complex but contain valuable information for reverse engineering pathways
http://www.nature.com/nbt/journal/v20/n10/fig_tab/nbt1002-991_f1.html
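A PPI network of this kind could be held in memory as an undirected graph. The following is a minimal sketch using adjacency sets; the protein names and interactions are invented, and `build_ppi_graph` is a hypothetical helper, not part of any library.

```python
from collections import defaultdict

def build_ppi_graph(interactions):
    """Build an undirected graph: one edge per protein-protein interaction."""
    graph = defaultdict(set)
    for a, b in interactions:
        graph[a].add(b)
        graph[b].add(a)
    return dict(graph)

# Made-up interaction list (each pair is one PPI).
interactions = [("P1", "P2"), ("P1", "P3"), ("P2", "P3"), ("P3", "P4")]
graph = build_ppi_graph(interactions)

# Node degree = number of interaction partners; the maximum marks a hub.
degree = {protein: len(partners) for protein, partners in graph.items()}
hub = max(degree, key=degree.get)
print(degree, hub)
```

Adjacency sets keep neighbor lookups cheap, which matters once the network contains thousands of proteins.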
Yeast Two-Hybrid Screening
- Yeast two-hybrid (Y2H) is a high-throughput method for interactomics based on genetically engineered yeast
- To test whether two protein domains (prey and bait) interact, both are expressed as fusion proteins
  - Bait is fused to a DNA-binding domain that binds to a promoter region
  - Prey is fused to an activation domain
- If both interact, they activate a reporter gene (e.g., lacZ to create colored colonies)
http://www.genscript.com/images/yeast_two_hybrid_system.jpg
Yeast Two-Hybrid Screening
- To test large numbers of interactions, prey and bait constructs are created as separate plasmids (one for bait, one for prey)
- Screening all combinations of prey and bait against each other results in a complete interaction matrix
http://www.nature.com/nature/journal/v422/n6928/fig_tab/nature01512_f2.html
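The all-against-all screen can be sketched as filling a bait x prey matrix. This is only an illustration: the `reporter_active` callback stands in for the experimental read-out (e.g., a colored colony), and the bait/prey names and observed pairs are made up.

```python
def interaction_matrix(baits, preys, reporter_active):
    """Screen every bait against every prey; 1 marks an activated reporter."""
    return [[1 if reporter_active(bait, prey) else 0 for prey in preys]
            for bait in baits]

# Made-up example data standing in for colony read-outs.
baits = ["A", "B", "C"]
preys = ["A", "B", "C"]
observed = {("A", "B"), ("B", "A"), ("C", "C")}

matrix = interaction_matrix(baits, preys, lambda b, p: (b, p) in observed)
for bait, row in zip(baits, matrix):
    print(bait, row)
```

Rows correspond to baits and columns to preys, so an interaction seen in only one orientation shows up as an asymmetric entry.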
Y2H: Problems
- Y2H has some problems leading to false positives (FP) or false negatives (FN)
- The bait-prey interaction has to occur in the yeast nucleus, where the proteins might fold differently than in their natural environment (FN)
- Bait and prey might not be expressed simultaneously in vivo (FP)
- Bait and prey might not be located in the same subcellular location and thus never encounter each other in vivo (FP)
Chromatography
- Chromatography is a separation technique
- From Greek chroma ("color") and graphein ("to write")
- Initially developed by Mikhail Semyonovich Tsvet (1872-1919)
- Simple fundamental idea:
  - Two phases: stationary and mobile
  - Analytes are separated while the mobile phase passes along the stationary phase
  - Various separation mechanisms and various choices for mobile/stationary phases are possible
Column Chromatography http://fig.cox.miami.edu/~cmallery/255/255hist/ecbxp4x3_chrom.jpg 20
Chromatography
- Liquid chromatography (LC)
  - Mobile phase is liquid, stationary phase is usually solid
  - Very versatile technique
  - High-Performance Liquid Chromatography (HPLC) for analytical purposes
- Gas chromatography (GC)
  - Mobile phase is a gas passing over the solid phase
  - Usually at higher temperatures
  - Limited to volatile compounds
- Others
  - Thin-Layer Chromatography (TLC)
  - Paper Chromatography (PC)
HPLC
- High-Performance Liquid Chromatography (HPLC)
  - Liquid mobile phase/eluent (e.g., water/acetonitrile mixtures)
  - Solid stationary phase (column)
  - Mobile phase is pumped through the column at very high pressure (typically several hundred bar)
  - Sample is injected into the column
  - Analytes separate based on their properties
- Commonly used: RP (Reversed Phase) HPLC
  - Separation based on hydrophobicity
  - Hydrophobic analytes are held back on the column
  - Hydrophilic analytes prefer to spend their time in the mobile phase and are washed through
What is HPLC?
[Figure: HPLC schematic; labels: pump, mobile phase, column (stationary phase), detector, retention time (RT)]
How does HPLC work?
[Figure: HPLC animation; labels: eluent, pump, injection valve, analyte mixture, column, detector]
HPLC
- Columns are often miniaturized to very small diameters (mm down to a few µm)
- Typical column lengths are 5-20 cm
- Smaller columns require smaller amounts of analyte (important for rare/precious biological samples)
HPLC http://www.chem.agilent.com/cag/peak/peak3-95/graphics/introducing.gif http://www.uni-saarland.de/fak8/huber/images/institut/ultimate3000a.jpg 34
Mass Spectrometry
- Mass spectrometry (MS) is an analytical technique to measure the mass (or more precisely: the mass-to-charge ratio, m/z) of an analyte
- MS is a key technique in proteomics and metabolomics, although it has been used for many decades in chemistry and physics
- The analyte needs to be ionized; gentle ionization techniques applicable to biomolecules were only developed at the end of the 1980s
- For OMICS analyses, MS is usually coupled to a second separation technique (e.g., HPLC for proteomics, HPLC or GC for metabolomics) and used as its detector ("hyphenated techniques": HPLC-MS, GC-MS)
Principles of Mass Spectrometry
- Goal: Measure mass (m/z) and abundance of the analytes in a sample
[Figure: mass spectrometer schematic; the sample passes through ion source, mass analyzer, and detector; ion counts per species are plotted as intensity (Int) vs. m/z]
MALDI Ionization
- MALDI = Matrix-Assisted Laser Desorption/Ionization, a gentle ionization method particularly suited for biomolecules
- The analyte is embedded into a crystalline matrix (e.g., 3,5-dimethoxy-4-hydroxycinnamic acid) and spotted onto a metal target
- A laser is fired at the spot; its energy is absorbed by the matrix, which evaporates and pulls the analyte into the gas phase
- The matrix is acidic and can provide protons that are transferred to the analyte to ionize it
http://www.magnet.fsu.edu/education/tutorials/tools/images/ionization-maldi.jpg
http://upload.wikimedia.org/wikipedia/commons/c/cb/maldi_target.jpg
Electrospray Ionization (ESI)
- Analyte solution is sprayed into a vacuum through a charged capillary
- Solvent evaporates and the droplets disperse due to the charge
- What remains are charged analyte particles
- Even biomolecules are gently ionized by this technique without breaking apart
- ESI is thus the key technique for peptide MS
- ESI also allows the direct coupling of HPLC to MS (HPLC-MS)
http://www.magnet.fsu.edu/education/tutorials/tools/images/ionization_esi.jpg
http://upload.wikimedia.org/wikipedia/commons/e/e2/nanoesift.jpg
Mass Analyzer: Quadrupole
- Oscillating electrostatic fields stabilize the flight path for a specific mass-to-charge ratio; these ions will pass through the quadrupole
- Ions with a different m/z will be accelerated out of the quadrupole
- Changing the frequency allows the selection of a different m/z
[Figure: quadrupole with a stable and an unstable ion path]
Mass Analyzer: Time of Flight
[Figure: TOF schematic; labels: acceleration voltage $U_a$, drift zone, reflectron, detector]
- Time-of-flight mass analyzer (TOF):
  - Ions are extracted from the ion source in pulses by an electrostatic field and enter a field-free drift zone
  - An electrostatic mirror (reflectron) reflects the ions back onto the detector
  - The detector counts the particles and records the time of flight between the extraction pulse and a particle hitting the detector
Mass Analyzer: Time of Flight
- Drift tubes are over a meter long in real-world instruments
- A reflectron doubles the drift length, and thus the instrument's resolution
- It also focuses the ions onto the detector
[Figure labels: drift tube, reflectron]
http://www.biochem.mpg.de/nigg/research/koerner/instruments/absatz_pic_reflexiii.jpg
http://upload.wikimedia.org/wikipedia/commons/thumb/d/d8/reflectron.jpg/800px-reflectron.jpg
Mass Analyzer: Time of Flight
[Figure: TOF schematic; labels: acceleration voltage $U_a$, drift zone, reflectron, detector]
- The kinetic energy transferred to the ions depends on the acceleration voltage $U_a$ and the particle's charge
- Lighter particles fly faster than heavier particles of the same charge
- Hence, they arrive earlier at the detector
- The time of flight is thus a measure of the particle's mass
Mass Analyzer: Time of Flight
- Energy transferred to an ion with charge $q$ accelerated by an electrostatic field with acceleration voltage $U_a$: $E_{pot} = q U_a$
- This energy is converted into kinetic energy as the ion accelerates: $E_{kin} = \frac{1}{2} m v^2 = q U_a$, hence $v = \sqrt{2 q U_a / m}$
- For a given path length $s$ from extraction to detector, the time of flight $t$ is thus $t = s / v = s \sqrt{m / (2 q U_a)}$
- For a given path length and acceleration voltage, which are instrument parameters, the time of flight depends only on the ion's charge and mass
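A small numerical sketch of this relation, assuming the idealized model from the slide (a single acceleration step, a field-free drift of length s, no reflectron). The 1 m path length and 20 kV acceleration voltage are illustrative assumptions, not instrument specifications, and the function names are hypothetical.

```python
import math

ELEMENTARY_CHARGE = 1.602176634e-19  # coulomb
DALTON = 1.66053906660e-27           # kilogram

def time_of_flight(mass_da, charge, path_m, voltage_v):
    """Flight time in seconds from E_kin = 1/2 m v^2 = q U_a and t = s / v."""
    m = mass_da * DALTON
    q = charge * ELEMENTARY_CHARGE
    v = math.sqrt(2.0 * q * voltage_v / m)
    return path_m / v

def mass_from_time(t_s, charge, path_m, voltage_v):
    """Invert the relation: m = 2 q U_a t^2 / s^2, returned in Da."""
    q = charge * ELEMENTARY_CHARGE
    return 2.0 * q * voltage_v * t_s**2 / path_m**2 / DALTON

# Assumed instrument parameters: 1 m drift path, 20 kV acceleration voltage.
for mass in (1000.0, 2000.0, 4000.0):
    t = time_of_flight(mass, charge=1, path_m=1.0, voltage_v=20_000.0)
    print(f"{mass:6.0f} Da -> {t * 1e6:.2f} microseconds")
```

Under these assumptions a singly charged 1000 Da ion needs roughly 16 microseconds, and quadrupling the mass doubles the flight time, since t grows with the square root of m/q.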
LC-ESI-MS/MS System 44
LC-MS http://www.youtube.com/watch?v=mysa2pfuc0y&hd=1 45
MS Technologies Aebersold, Mann, Nature, 2003, 422:198-207 46
Mass Spectrum
- A mass spectrometer records the ion count as a function of the mass-to-charge ratio (m/z)
- A mix of ions will cause a broad distribution of mass-to-charge ratios
- This is called a mass spectrum
[Figure: mass spectrum, intensity vs. m/z]
Isotope Patterns
- Natural isotopes occur with well-known abundances
- About 1% of all carbon atoms are heavier (by 1 Da) than the other 99%
- Thus some of the peptides have a higher mass
- Other peptides might even contain more than one heavy carbon atom

Isotope   Abundance
12C       98.90%
13C        1.10%
14N       99.63%
15N        0.37%
16O       99.76%
17O        0.04%
18O        0.20%
1H        99.98%
2H         0.02%
Estimating the Isotope Pattern
- Peak intensities correspond to the likelihood of occurrence of that mass
- The likelihood $P_k$ for isotope variant k of the isotope pattern (k = 0: monoisotopic, k = 1: +1 Da, etc.) is determined by a binomial distribution
- The higher the peptide mass, the higher the likelihood of heavier isotope peaks

m [Da]   P(k=0)   P(k=1)   P(k=2)   P(k=3)   P(k=4)
1000     0.55     0.30     0.10     0.02     0.00
2000     0.30     0.33     0.21     0.09     0.03
3000     0.17     0.28     0.25     0.15     0.08
4000     0.09     0.20     0.24     0.19     0.12
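The binomial estimate can be sketched for carbon alone. This is a simplification: it ignores the heavy isotopes of N, O, and H, so its numbers will not reproduce the table above exactly. The carbon counts used (about 45 carbon atoms per 1000 Da of peptide) are a rough assumption for illustration.

```python
from math import comb

P_C13 = 0.011  # natural abundance of 13C (1.10%)

def carbon_isotope_pattern(n_carbon, k_max=4, p_heavy=P_C13):
    """Binomial probability that exactly k of n_carbon atoms are 13C,
    for k = 0 .. k_max (carbon only; other elements are ignored)."""
    return [comb(n_carbon, k) * p_heavy**k * (1.0 - p_heavy)**(n_carbon - k)
            for k in range(k_max + 1)]

# Assumed carbon counts for peptides of roughly 1000 to 4000 Da.
for n_carbon in (45, 90, 135, 180):
    pattern = carbon_isotope_pattern(n_carbon)
    print(n_carbon, " ".join(f"{p:.2f}" for p in pattern))
```

Even this carbon-only sketch shows the trend from the table: with more atoms, the monoisotopic peak (k = 0) loses intensity to the heavier isotope peaks.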
HPLC-MS Map
- HPLC-MS produces a series of MS spectra
- The resulting data is called a map: a two-dimensional data set
- Features are all MS peaks belonging to one chemical species (peptide)
[Figure: HPLC-MS map with a labeled feature]
Differential Proteomics
[Figure: two workflows. Label-free quantification: samples, digestion, HPLC, MS, maps. Stable isotope labeling: samples, digestion, labeling, HPLC, MS, map.]
SILAC
- Stable Isotope Labeling with Amino Acids in Cell Culture
- Mumby, Brekken, Genome Biol (2005), 6:230
Identification
- Once differential features have been quantified, they have to be identified
- Tandem MS
  - Second MS stage of selected peptides
  - Peptides are fragmented
  - Fragment spectra are unique fingerprints
  - Allows assignment of peptide sequences
- Differential peptides are mapped back to their differentially expressed parent proteins
Tandem Mass Spectrometry
- MS can be done in two stages: the first stage separates ions by m/z
- Ions of interest are then selected, trapped, fragmented by collision-induced dissociation (CID), and analyzed by a second MS stage
- These tandem mass spectra (MS/MS spectra) allow the identification of the peptides quantified in the first MS stage
http://www.nature.com/nrd/journal/v2/n2/full/nrd1011.html
Peptide Fragmentation
- Collision-induced dissociation (CID) fragments molecules through collisions with a neutral gas
- The gas molecules transfer their kinetic energy to the analytes
- Bond cleavages occur, resulting in characteristic fragment ions
- Peptides fragment preferentially along the peptide backbone
- This gives rise to several series of fragment ions, of which b and y ions are the most common
[Figure: fragment spectrum, intensity (0-100 %) vs. m/z (0-1000)]
Peptide Sequencing
- From the series of b/y ions (ladders) one can reconstruct the peptide sequence
- We will discuss algorithms for this later in this lecture
[Figure: MS/MS spectra of the peptide SGEFLEEDELK, intensity (0-100 %) vs. m/z (0-1000), showing the [M+2H]2+ precursor and annotated b and y fragment ions]
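To make the b/y ladders concrete, here is a minimal sketch that computes theoretical fragment masses for the peptide from the figure. It assumes singly charged fragments, unmodified residues, and standard monoisotopic residue masses (rounded); `fragment_ions` and `precursor_mz` are hypothetical helper names, and this is not the sequencing algorithm discussed later in the lecture.

```python
# Monoisotopic residue masses in Da (standard values, rounded).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276

def fragment_ions(peptide):
    """Return (b, y): m/z of singly charged b_1..b_(n-1) and y_1..y_(n-1)."""
    masses = [MONO[aa] for aa in peptide]
    b = [sum(masses[:i]) + PROTON for i in range(1, len(masses))]
    y = [sum(masses[-i:]) + WATER + PROTON for i in range(1, len(masses))]
    return b, y

def precursor_mz(peptide, charge):
    """m/z of the intact peptide carrying `charge` protons."""
    neutral = sum(MONO[aa] for aa in peptide) + WATER
    return (neutral + charge * PROTON) / charge

b, y = fragment_ions("SGEFLEEDELK")
print(f"[M+2H]2+ = {precursor_mz('SGEFLEEDELK', 2):.4f}")
for i, (b_mz, y_mz) in enumerate(zip(b, y), start=1):
    print(f"b{i} = {b_mz:9.4f}   y{i} = {y_mz:9.4f}")
```

Neighboring peaks within one ladder differ by exactly one residue mass, which is what allows the sequence to be read off the spectrum.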
Summary
- Modern high-throughput techniques enable the measurement of transcripts, proteins, and metabolites in a massively parallel fashion
- Microarrays can quantify oligonucleotide concentrations through fluorescence
- High-performance liquid chromatography (HPLC) is a powerful separation method used in metabolomics and proteomics
- Mass spectrometry is a very sensitive and highly accurate technique to identify and quantify peptides and metabolites