Ensembl Exercise Answers Adapted from Ensembl tutorials presented by Dr. Bert Overduin, EBI


Exercise 1 - Exploring the human MYH9 gene

(a) Go to the Ensembl homepage. Select Search: Human and type MYH9 gene. Click [Go]. Click on Homo sapiens on the page with search results. Click on Gene. Click on Ensembl protein_coding Gene: ENSG (HGNC Symbol: MYH9).

The gene is located on chromosome 22 on the reverse strand. Ensembl has 11 transcripts annotated for this gene, three of which are protein coding. The longest transcript is MYH9-001, which codes for a protein of 1960 amino acids. MYH9-001 has a CCDS record. CCDS is the consensus coding sequence set: a collection of reviewed coding sequences (CDS) for human and mouse that have been agreed upon by Ensembl, NCBI, UCSC and Havana. These sequences are high-confidence and unlikely to change in the future.

(b) These are some of the phenotypes associated with MYH9 according to MIM: autosomal dominant deafness, Epstein syndrome, and Fechtner syndrome. Click on any of these for more information in the MIM record itself.

(c) Click on ENST. It has 41 exons. This is shown in the Transcript summary. Click on Exons in the side menu. Exon 1 is completely untranslated, and exons 2 and 41 are partially untranslated (UTR sequence is shown in purple). You can also see this in the cDNA view.

Click on General identifiers in the side menu. MYH9_HUMAN from Swiss-Prot matches the Ensembl transcript. Click on it to go to UniProtKB, or click align for the alignment between the Ensembl translation and the Swiss-Prot record.

Have a look at the Ontology table. The Gene Ontology project maps terms to a protein in three classes: biological process, cellular component, and molecular function. Meiotic spindle organization, cell morphogenesis, and cytokinesis are some of the roles associated with MYH9.

(d) Click on Oligo probes in the side menu. Probesets from Affymetrix, Agilent, Codelink, Illumina, and Phalanx match this transcript sequence. Expression analysis with any of these probesets would reveal information about the transcript. Hint: this information can sometimes be found in the ArrayExpress Atlas.

Exercise 2 - Exploring a genomic region in human

(a) Go to the Ensembl homepage. Select Search: Human and type 13: in the "for" text box (or alternatively leave the Search drop-down list as it is and type human 13: in the "for" text box). Click [Go].

This genomic region is located on cytogenetic band q13.1. It is made up of seven contigs, indicated by the alternating light and dark blue coloured bars in the Contigs track.

(b) With your mouse, draw a box encompassing the BRCA2 transcripts. Click on Jump to region in the pop-up menu.

(c) Click [Configure this page] in the side menu. Type clones in the Find a track text box. Select 1Mb clone set, 32k clone set and Tilepath. Click ( ).

No single clone contains the complete BRCA2 gene. For example, clone RP11-37E23 contains most of the gene, but not its very 3' end. This is also reflected in the two contigs needed to make up the BRCA2 gene (the Contigs track is on by default).

(d) Click [Configure this page] in the side menu. Type refseq in the Find a track text box. Select Human RefSeq import Expanded with labels. Click ( ). Click on individual transcript models (RefSeq or otherwise) to retrieve more information about them.

One transcript has been annotated by RefSeq for the BRCA2 gene, i.e. NM_. This transcript is almost identical to Ensembl transcript BRCA2-001 (ENST). Both encode a 3418 aa protein, but the RefSeq transcript is shorter at the 5' UTR and longer at the 3' UTR.

(e) Click [Export data] in the side menu. Click [Next>]. Click on Text.

Note that the sequence has a header that provides information about the genome assembly (GRCh37), the chromosome number, the start and end coordinates and the strand. For example: >13 dna:chromosome chromosome:GRCh37:13: : :1

(f) Click [Configure this page] in the side menu. Click [Reset configuration]. Click ( ).

Exercise 3 - Exploring a sequence variant (human)

(a) Go to the Ensembl homepage. Select Search: Human and type f5 in the "for" text box. Click [Go]. Click on Variation table under F5 (Human Gene). Click on Show for Missense variant in the Summary of variation consequences in ENSG table. Type 534 in the Filter text box.

The dbSNP accession number for the Arg534Gln (Q/R) variant is rs6025.

Note that HGVS (Human Genome Variation Society) notations are not shown in the table by default. They can be added as follows: Click on Configure this page in the side menu. Click on Consequence options. Check Show HGVS notations. Click ( ).

(b) rs6025 is supported by all six possible types of evidence (represented by icons): Multiple observations (the variant has multiple independent dbSNP submissions, i.e. submissions with different submitter handles or different discovery samples), Frequency (the variant is reported to be polymorphic in at least one sample), HapMap (the variant is polymorphic in at least one HapMap panel), 1000 Genomes (the variant was discovered in the 1000 Genomes Project), Cited (dbSNP holds a citation from PubMed for the variant) and ESP (the variant was discovered in the Exome Sequencing Project).

(c) Click on rs6025.

No, rs6025 is a missense variant for only two F5 transcripts. It falls in the 3 prime UTR of one F5 transcript, i.e. ENST. Note that in total four transcripts have been annotated for the F5 gene.

(d) In Ensembl the alleles of rs6025 are given as T and C, because these are the alleles on the forward strand of the genome. In dbSNP the alleles are given as A and G, because the person(s) who submitted this variant apparently sequenced the reverse strand of the genome. In the literature the alleles are mostly given as A and G, because the F5 gene is located on the reverse strand of the genome, so the alleles in the actual gene and transcript sequences are A and G.

(e) Ensembl puts the allele that is present in the GRCh37 reference genome first, i.e. T (forward strand). In the case of rs6025 this is the minor allele. The reference genome can contain the minor allele for a variant because it is an amalgamation of the genomes of just a few individuals, not a representation of what is most common in the human population as a whole. In the literature the major allele (in the population of interest) is normally put first.
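The strand relationship described in (d) and (e) can be checked with a few lines of Python. This is a minimal sketch and not part of the original exercise; the helper name `complement_alleles` is made up for illustration. It assumes single-base alleles, for which moving to the opposite strand is simply base complementation.

```python
# Watson-Crick base pairing, used to convert single-base alleles between strands.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement_alleles(alleles):
    """Return single-base alleles as they read on the opposite strand.

    Hypothetical helper for illustration only.
    """
    return [COMPLEMENT[allele.upper()] for allele in alleles]


# rs6025 alleles as listed by Ensembl: forward strand, reference allele first.
forward = ["T", "C"]
reverse = complement_alleles(forward)

print("/".join(forward), "on the forward strand reads as",
      "/".join(reverse), "on the reverse strand")
```

The order of the alleles is preserved, so the reference allele T (forward strand) corresponds to A on the reverse strand, where the F5 gene is located.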

(f) rs6025 is predicted to be tolerated and benign according to SIFT and PolyPhen, because they predict the effect of the change from the reference allele to the alternate allele, i.e. from T (minor allele) to C (major allele).

(g) Click on Population genetics in the side menu.

Yes, there is ethnic variation in the frequency of the T allele. Among the 1000 Genomes populations studied, it ranges from 0 in the various African and East Asian populations to its highest value in the CEU (Utah Residents (CEPH) with Northern and Western European ancestry) population.

(h) Click on Phenotype Data in the side menu.

rs6025 has been associated with a number of different phenotypes, i.e. venous thromboembolism, susceptibility to Budd-Chiari syndrome, recurrent abortion, thrombophilia due to activated protein C resistance, thrombophilia due to factor V Leiden and susceptibility to ischemic stroke.

(i) Click on Phylogenetic Context in the side menu.

Gorilla, orangutan, macaque and marmoset all have a C in this position, which confirms that C is indeed the ancestral allele.

(j) Go to the Neandertal Genome Browser. Type rs6025 in the Search Neandertal text box. Click [Go]. Click on rs6025. Click on Jump to region in detail. Click on Configure this page in the side menu. Click on Variation features. Select All variations Normal. Click [SAVE and close]. Draw a box of about 50 bp around rs6025 (shown in yellow in the center of the display). Click on Jump to region in the pop-up menu.

The Sequences track shows that there are five reads for Neandertal at the position of rs6025, four with a C and one with a T. However, the T is at the very end of a sequence read and may therefore be of questionable quality. So, all in all, there is not enough evidence that the T allele was already present in Neandertal.

Exercise 4 - Orthologues, paralogues and gene trees (human)

(a) Go to the Ensembl homepage. Select Search: Human and type long wave sensitive opsin in the "for" text box. Click [Go]. Click on OPN1LW (Human Gene).

Note that LW in the gene symbol OPN1LW stands for long-wave.

(b) Click on Comparative Genomics - Paralogues in the side menu.

Nine within-species paralogues have been identified for the human OPN1LW gene. According to the Target and Query %id, the proteins encoded by the genes ENSG (OPN1MW2) and ENSG (OPN1MW), i.e. the medium-wave-sensitive (green) opsins, show the highest sequence similarity to red opsin. Target %id indicates the percentage of the sequence of red opsin matching the sequence of the paralogue protein. Query %id indicates the percentage of the sequence of the paralogue protein matching the sequence of red opsin.

(c) Click on the Location: X:153,409, ,424,507 tab.

The OPN1LW (red opsin) and OPN1MW and OPN1MW2 (green opsin) genes are located next to each other on the X chromosome, while the OPN1SW (blue opsin) gene is located on chromosome 7. As females have two X chromosomes, a normal gene on one chromosome can often make up for a defective one on the other, whereas males, with a single X chromosome, cannot make up for a defective gene. Thus, red-green colour blindness is much more prevalent in males than in females. Variation in the genes for red and green opsin can cause subtle differences in colour perception, while tandem rearrangements due to unequal crossing-over between these genes cause more serious defects in colour vision.

(d) Click on the Gene: OPN1LW tab. Click on Comparative Genomics - Gene tree (image) in the side menu. Click on View options: View paralogs of current gene below the gene tree image. Click on the nodes (red squares) for the duplication events that have given rise to the various paralogues.

A duplication event at the level of the Catarrhini (Apes and Old World monkeys) has given rise to the OPN1LW (red opsin) and OPN1MW and OPN1MW2 (green opsin) genes. The other paralogues are due to earlier duplication events. This agrees with the fact that the green opsins show the highest sequence similarity with red opsin (see question b) and the fact that the genes for the red and green opsins are located close to each other on the genome (see question c).

Note: On the Paralogues page nine paralogues are shown (see question b). Five of these are of the type other paralogue. These are paralogues that are too distant to be in the same gene tree, but can still be related as part of a broader super-family. Therefore, the gene tree for the OPN1LW gene only shows four of its nine paralogues. The precise taxonomic level of duplication for the other paralogues is left undetermined.

(e) Click on the speciation node (blue square) that is at the base of the complete gene tree. Click on Expand for Jalview in the pop-up menu (that should say Taxon: Chordates). Click [Start Jalview]. Close the pop-up window with the gene tree. Click on Select > Select all on the menu bar of the pop-up window with the protein sequence alignment. Click on Calculate > Sort > by ID on the menu bar. Select the protein sequences of the human paralogues. Click on Select > Invert Sequence Selection on the menu bar. Click on Edit > Delete on the menu bar.

As the alignment is based on the complete set of protein sequences in the gene tree, the alignment of this subset of five proteins will contain empty columns. To remove them, click on Edit > Remove Empty Columns on the menu bar.

Exercise 5 - BioMart

Go to the Ensembl homepage. Click on the BioMart link on the toolbar.

Start with all human Ensembl genes: Choose the Ensembl Genes 73 database. Choose the Homo sapiens genes (GRCh37.p12) dataset.

Now, filter for the genes on the Y chromosome: Click on Filters in the left panel. Expand the REGION section by clicking on the + box. Select Chromosome Y. Make sure the check box in front of the filter is ticked, otherwise the filter won't work. Click the [Count] button on the toolbar. This should give you 506 / Genes.

Now filter further for genes that are protein-coding: Expand the GENE section by clicking on the + box. Select Gene type protein_coding. Click the [Count] button on the toolbar. This should give you 54 / Genes.

Finally, filter for genes that encode proteins containing one or more transmembrane domains: Expand the PROTEIN DOMAINS section by clicking on the + box. Select Transmembrane domains Only. Click the [Count] button on the toolbar. This should give you 4 / Genes.

Specify the attributes to be included in the output (note that a number of attributes will already be selected by default): Click on Attributes in the left panel. Expand the GENE section by clicking on the + box. Select, in addition to the attributes Ensembl Gene ID and Ensembl Transcript ID that are already selected, for instance Associated Gene Name and Description.

Have a look at a preview of the results (only 10 rows of the results will be shown): Click the [Results] button on the toolbar.

If you are happy with how the results look in the preview, output all the results: Select View All rows as HTML or export all results to a file.

Note: When you select View All rows as HTML, your results will be shown under a new tab or in a new window in your Internet browser.

Although you have filtered for only four genes, your results will contain more than four rows. This is because several of the genes have more than one transcript that encodes a protein containing one or more transmembrane domains, and consequently the results contain a separate row for each of these transcripts.
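The difference between the gene count and the number of result rows can be illustrated with a small Python sketch. It does not query BioMart; the records, identifiers and field names below are invented for illustration, and the three list comprehensions only mimic the successive Chromosome, Gene type and Transmembrane domains filters under that assumption.

```python
# Hypothetical transcript-level records; all IDs and values are invented.
ROWS = [
    {"gene": "GENE_A", "transcript": "TR_A1", "chrom": "Y", "biotype": "protein_coding", "tm": True},
    {"gene": "GENE_A", "transcript": "TR_A2", "chrom": "Y", "biotype": "protein_coding", "tm": True},
    {"gene": "GENE_B", "transcript": "TR_B1", "chrom": "Y", "biotype": "protein_coding", "tm": False},
    {"gene": "GENE_C", "transcript": "TR_C1", "chrom": "Y", "biotype": "pseudogene", "tm": False},
    {"gene": "GENE_D", "transcript": "TR_D1", "chrom": "1", "biotype": "protein_coding", "tm": True},
]


def count_genes(rows):
    """Count distinct genes, analogous to the number reported by [Count]."""
    return len({row["gene"] for row in rows})


# Apply the filters one after another, narrowing the set each time.
on_y = [row for row in ROWS if row["chrom"] == "Y"]
coding = [row for row in on_y if row["biotype"] == "protein_coding"]
with_tm = [row for row in coding if row["tm"]]

for label, rows in [("chromosome Y", on_y),
                    ("protein coding", coding),
                    ("transmembrane", with_tm)]:
    print(f"{label}: {count_genes(rows)} gene(s), {len(rows)} row(s)")
```

In this toy data set the final filter leaves one gene but two rows, because both transcripts of that gene pass the filter. The same effect explains why the four genes in the exercise produce more than four rows.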


More information

USING BLAST TO IDENTIFY PROTEINS THAT ARE EVOLUTIONARILY RELATED ACROSS SPECIES

USING BLAST TO IDENTIFY PROTEINS THAT ARE EVOLUTIONARILY RELATED ACROSS SPECIES USING BLAST TO IDENTIFY PROTEINS THAT ARE EVOLUTIONARILY RELATED ACROSS SPECIES HOW CAN BIOINFORMATICS BE USED AS A TOOL TO DETERMINE EVOLUTIONARY RELATIONSHPS AND TO BETTER UNDERSTAND PROTEIN HERITAGE?

More information

Bioinformatics: Investigating Molecular/Biochemical Evidence for Evolution

Bioinformatics: Investigating Molecular/Biochemical Evidence for Evolution Bioinformatics: Investigating Molecular/Biochemical Evidence for Evolution Background How does an evolutionary biologist decide how closely related two different species are? The simplest way is to compare

More information

PDF-4+ Tools and Searches

PDF-4+ Tools and Searches PDF-4+ Tools and Searches PDF-4+ 2018 The PDF-4+ 2018 database is powered by our integrated search display software. PDF-4+ 2018 boasts 72 search selections coupled with 125 display fields resulting in

More information

Consents Resource Consents Map

Consents Resource Consents Map Consents Resource Consents Map Select the map from the Maps introduction page http://www.waikatoregion.govt.nz/maps/ If you have the map browser open the Resource Consents map will also display when selected

More information

Please click the link below to view the YouTube video offering guidance to purchasers:

Please click the link below to view the YouTube video offering guidance to purchasers: Guide Contents: Video Guide What is Quick Quote? Quick Quote Access Levels Your Quick Quote Control Panel How do I create a Quick Quote? How do I Distribute a Quick Quote? How do I Add Suppliers to a Quick

More information

(THIS IS AN OPTIONAL BUT WORTHWHILE EXERCISE)

(THIS IS AN OPTIONAL BUT WORTHWHILE EXERCISE) PART 2: Analysis in ArcGIS (THIS IS AN OPTIONAL BUT WORTHWHILE EXERCISE) Step 1: Start ArcCatalog and open a geodatabase If you have a shortcut icon for ArcCatalog on your desktop, double-click it to start

More information

Bioinformatics tools for phylogeny and visualization. Yanbin Yin

Bioinformatics tools for phylogeny and visualization. Yanbin Yin Bioinformatics tools for phylogeny and visualization Yanbin Yin 1 Homework assignment 5 1. Take the MAFFT alignment http://cys.bios.niu.edu/yyin/teach/pbb/purdue.cellwall.list.lignin.f a.aln as input and

More information

Spatial Data Analysis in Archaeology Anthropology 589b. Kriging Artifact Density Surfaces in ArcGIS

Spatial Data Analysis in Archaeology Anthropology 589b. Kriging Artifact Density Surfaces in ArcGIS Spatial Data Analysis in Archaeology Anthropology 589b Fraser D. Neiman University of Virginia 2.19.07 Spring 2007 Kriging Artifact Density Surfaces in ArcGIS 1. The ingredients. -A data file -- in.dbf

More information

1 Introduction. Abstract

1 Introduction. Abstract CBS 530 Assignment No 2 SHUBHRA GUPTA shubhg@asu.edu 993755974 Review of the papers: Construction and Analysis of a Human-Chimpanzee Comparative Clone Map and Intra- and Interspecific Variation in Primate

More information

Presenting Tree Inventory. Tomislav Sapic GIS Technologist Faculty of Natural Resources Management Lakehead University

Presenting Tree Inventory. Tomislav Sapic GIS Technologist Faculty of Natural Resources Management Lakehead University Presenting Tree Inventory Tomislav Sapic GIS Technologist Faculty of Natural Resources Management Lakehead University Suggested Options 1. Print out a Google Maps satellite image of the inventoried block

More information

Quality Measures (QM) Report. Self Guided Tutorial

Quality Measures (QM) Report. Self Guided Tutorial Quality Measures (QM) Report Self Guided Tutorial 1 Tutorial Contents Overview of the QM Online Report Facility Summary Report Resident Drill down Monthly Trend Report Resident Roster Report Printing Reports/Export

More information

Computer simulation of radioactive decay

Computer simulation of radioactive decay Computer simulation of radioactive decay y now you should have worked your way through the introduction to Maple, as well as the introduction to data analysis using Excel Now we will explore radioactive

More information

Session 5: Phylogenomics

Session 5: Phylogenomics Session 5: Phylogenomics B.- Phylogeny based orthology assignment REMINDER: Gene tree reconstruction is divided in three steps: homology search, multiple sequence alignment and model selection plus tree

More information

Supplementary Information

Supplementary Information Supplementary Information Supplementary Figure 1. Schematic pipeline for single-cell genome assembly, cleaning and annotation. a. The assembly process was optimized to account for multiple cells putatively

More information

Task 1: Start ArcMap and add the county boundary data from your downloaded dataset to the data frame.

Task 1: Start ArcMap and add the county boundary data from your downloaded dataset to the data frame. Exercise 6 Coordinate Systems and Map Projections The following steps describe the general process that you will follow to complete the exercise. Specific steps will be provided later in the step-by-step

More information

Sequences, Structures, and Gene Regulatory Networks

Sequences, Structures, and Gene Regulatory Networks Sequences, Structures, and Gene Regulatory Networks Learning Outcomes After this class, you will Understand gene expression and protein structure in more detail Appreciate why biologists like to align

More information

Systematic prediction of gene function in Arabidopsis thaliana using a probabilistic functional gene network

Systematic prediction of gene function in Arabidopsis thaliana using a probabilistic functional gene network Systematic prediction of gene function in Arabidopsis thaliana using a probabilistic functional gene network Sohyun Hwang 1, Seung Y Rhee 2, Edward M Marcotte 3,4 & Insuk Lee 1 protocol 1 Department of

More information

85. Geo Processing Mineral Liberation Data

85. Geo Processing Mineral Liberation Data Research Center, Pori / Pertti Lamberg 15023-ORC-J 1 (23) 85. Geo Processing Mineral Liberation Data 85.1. Introduction The Mineral Liberation Analyzer, MLA, is an automated mineral analysis system that

More information

Tutorial. Getting started. Sample to Insight. March 31, 2016

Tutorial. Getting started. Sample to Insight. March 31, 2016 Getting started March 31, 2016 Sample to Insight CLC bio, a QIAGEN Company Silkeborgvej 2 Prismet 8000 Aarhus C Denmark Telephone: +45 70 22 32 44 www.clcbio.com support-clcbio@qiagen.com Getting started

More information

Appendix B Microsoft Office Specialist exam objectives maps

Appendix B Microsoft Office Specialist exam objectives maps B 1 Appendix B Microsoft Office Specialist exam objectives maps This appendix covers these additional topics: A Excel 2003 Specialist exam objectives with references to corresponding material in Course

More information

Journal of Proteomics & Bioinformatics - Open Access

Journal of Proteomics & Bioinformatics - Open Access Abstract Methodology for Phylogenetic Tree Construction Kudipudi Srinivas 2, Allam Appa Rao 1, GR Sridhar 3, Srinubabu Gedela 1* 1 International Center for Bioinformatics & Center for Biotechnology, Andhra

More information

Uta Bilow, Carsten Bittrich, Constanze Hasterok, Konrad Jende, Michael Kobel, Christian Rudolph, Felix Socher, Julia Woithe

Uta Bilow, Carsten Bittrich, Constanze Hasterok, Konrad Jende, Michael Kobel, Christian Rudolph, Felix Socher, Julia Woithe ATLAS W path Instructions for tutors Version from 2 February 2018 Uta Bilow, Carsten Bittrich, Constanze Hasterok, Konrad Jende, Michael Kobel, Christian Rudolph, Felix Socher, Julia Woithe Technische

More information

Introduction to ArcGIS 10.2

Introduction to ArcGIS 10.2 Introduction to ArcGIS 10.2 Francisco Olivera, Ph.D., P.E. Srikanth Koka Lauren Walker Aishwarya Vijaykumar Keri Clary Department of Civil Engineering April 21, 2014 Contents Brief Overview of ArcGIS 10.2...

More information

Geodatabases and ArcCatalog

Geodatabases and ArcCatalog Geodatabases and ArcCatalog Francisco Olivera, Ph.D., P.E. Srikanth Koka Lauren Walker Aishwarya Vijaykumar Keri Clary Department of Civil Engineering April 21, 2014 Contents Geodatabases and ArcCatalog...

More information

Data Mining with the PDF-4 Databases. FeO Non-stoichiometric Oxides

Data Mining with the PDF-4 Databases. FeO Non-stoichiometric Oxides Data Mining with the PDF-4 Databases FeO Non-stoichiometric Oxides This is one of three example-based tutorials for using the data mining capabilities of the PDF-4+ database and it covers the following

More information

M E R C E R W I N WA L K T H R O U G H

M E R C E R W I N WA L K T H R O U G H H E A L T H W E A L T H C A R E E R WA L K T H R O U G H C L I E N T S O L U T I O N S T E A M T A B L E O F C O N T E N T 1. Login to the Tool 2 2. Published reports... 7 3. Select Results Criteria...

More information

Phylogenetics - Orthology, phylogenetic experimental design and phylogeny reconstruction. Lesser Tenrec (Echinops telfairi)

Phylogenetics - Orthology, phylogenetic experimental design and phylogeny reconstruction. Lesser Tenrec (Echinops telfairi) Phylogenetics - Orthology, phylogenetic experimental design and phylogeny reconstruction Lesser Tenrec (Echinops telfairi) Goals: 1. Use phylogenetic experimental design theory to select optimal taxa to

More information

OECD QSAR Toolbox v.4.0. Tutorial on how to predict Skin sensitization potential taking into account alert performance

OECD QSAR Toolbox v.4.0. Tutorial on how to predict Skin sensitization potential taking into account alert performance OECD QSAR Toolbox v.4.0 Tutorial on how to predict Skin sensitization potential taking into account alert performance Outlook Background Objectives Specific Aims Read across and analogue approach The exercise

More information

Watershed Modeling Orange County Hydrology Using GIS Data

Watershed Modeling Orange County Hydrology Using GIS Data v. 10.0 WMS 10.0 Tutorial Watershed Modeling Orange County Hydrology Using GIS Data Learn how to delineate sub-basins and compute soil losses for Orange County (California) hydrologic modeling Objectives

More information

How many states. Record high temperature

How many states. Record high temperature Record high temperature How many states Class Midpoint Label 94.5 99.5 94.5-99.5 0 97 99.5 104.5 99.5-104.5 2 102 102 104.5 109.5 104.5-109.5 8 107 107 109.5 114.5 109.5-114.5 18 112 112 114.5 119.5 114.5-119.5

More information

BLAST. Varieties of BLAST

BLAST. Varieties of BLAST BLAST Basic Local Alignment Search Tool (1990) Altschul, Gish, Miller, Myers, & Lipman Uses short-cuts or heuristics to improve search speed Like speed-reading, does not examine every nucleotide of database

More information

Automatic Watershed Delineation using ArcSWAT/Arc GIS

Automatic Watershed Delineation using ArcSWAT/Arc GIS Automatic Watershed Delineation using ArcSWAT/Arc GIS By: - Endager G. and Yalelet.F 1. Watershed Delineation This tool allows the user to delineate sub watersheds based on an automatic procedure using

More information

Cerno Application Note Extending the Limits of Mass Spectrometry

Cerno Application Note Extending the Limits of Mass Spectrometry Creation of Accurate Mass Library for NIST Database Search Novel MS calibration has been shown to enable accurate mass and elemental composition determination on quadrupole GC/MS systems for either molecular

More information

Inferring phylogeny. Constructing phylogenetic trees. Tõnu Margus. Bioinformatics MTAT

Inferring phylogeny. Constructing phylogenetic trees. Tõnu Margus. Bioinformatics MTAT Inferring phylogeny Constructing phylogenetic trees Tõnu Margus Contents What is phylogeny? How/why it is possible to infer it? Representing evolutionary relationships on trees What type questions questions

More information

OECD QSAR Toolbox v.4.1. Tutorial on how to predict Skin sensitization potential taking into account alert performance

OECD QSAR Toolbox v.4.1. Tutorial on how to predict Skin sensitization potential taking into account alert performance OECD QSAR Toolbox v.4.1 Tutorial on how to predict Skin sensitization potential taking into account alert performance Outlook Background Objectives Specific Aims Read across and analogue approach The exercise

More information

85. Geo Processing Mineral Liberation Data

85. Geo Processing Mineral Liberation Data Research Center, Pori / Pertti Lamberg 14024-ORC-J 1 (23) 85. Geo Processing Mineral Liberation Data 85.1. Introduction The Mineral Liberation Analyzer, MLA, is an automated mineral analysis system that

More information

Practical considerations of working with sequencing data

Practical considerations of working with sequencing data Practical considerations of working with sequencing data File Types Fastq ->aligner -> reference(genome) coordinates Coordinate files SAM/BAM most complete, contains all of the info in fastq and more!

More information

The Quantizing functions

The Quantizing functions The Quantizing functions What is quantizing? Quantizing in its fundamental form is a function that automatically moves recorded notes, positioning them on exact note values: For example, if you record

More information

A Database of human biological pathways

A Database of human biological pathways A Database of human biological pathways Steve Jupe - sjupe@ebi.ac.uk 1 Rationale Journal information Nature 407(6805):770-6.The Biochemistry of Apoptosis. Caspase-8 is the key initiator caspase in the

More information

Tutorial: Structural Analysis of a Protein-Protein Complex

Tutorial: Structural Analysis of a Protein-Protein Complex Molecular Modeling Section (MMS) Department of Pharmaceutical and Pharmacological Sciences University of Padova Via Marzolo 5-35131 Padova (IT) @contact: stefano.moro@unipd.it Tutorial: Structural Analysis

More information

Basic Local Alignment Search Tool

Basic Local Alignment Search Tool Basic Local Alignment Search Tool Alignments used to uncover homologies between sequences combined with phylogenetic studies o can determine orthologous and paralogous relationships Local Alignment uses

More information

Preparations and Starting the program

Preparations and Starting the program Preparations and Starting the program https://oldwww.abo.fi/fakultet/ookforskning 1) Create a working directory on your computer for your Chemkin work, and 2) download kinetic mechanism files AAUmech.inp

More information