Practical Bioinformatics

Similar documents
Python & Numpy A tutorial

Practical Bioinformatics

Introduction to Python and its unit testing framework

Best Practices. Nicola Chiapolini. Physik-Institut University of Zurich. June 8, 2015

HIRES 2017 Syllabus. Instructors:

Welcome to the Machine Learning Practical Deep Neural Networks. MLP Lecture 1 / 18 September 2018 Single Layer Networks (1) 1

CS Homework 3. October 15, 2009

Department of Chemical Engineering University of California, Santa Barbara Spring Exercise 2. Due: Thursday, 4/19/09

Homework 4, Part B: Structured perceptron

CIS519: Applied Machine Learning Fall Homework 5. Due: December 10 th, 2018, 11:59 PM

Using the EartH2Observe data portal to analyse drought indicators. Lesson 4: Using Python Notebook to access and process data

Numerical linear algebra

Introduction to numerical projects

You will be writing code in the Python programming language, which you may have learnt in the Python module.

Designing Information Devices and Systems I Spring 2017 Babak Ayazifar, Vladimir Stojanovic Homework 2

Scripting Languages Fast development, extensible programs

Designing Information Devices and Systems I Fall 2018 Homework 5

Machine Learning & Data Mining Caltech CS/CNS/EE 155 Set 2 January 12 th, 2018

Designing Information Devices and Systems I Spring 2018 Homework 13

Homework Assignment 4 Root Finding Algorithms The Maximum Velocity of a Car Due: Friday, February 19, 2010 at 12noon

High-Performance Scientific Computing

Analysis Methods in Atmospheric and Oceanic Science

Satellite project, AST 1100

Introduction to Portal for ArcGIS

CS 237 Fall 2018, Homework 06 Solution

epistasis Documentation

Lab 1: Empirical Energy Methods Due: 2/14/18

Notater: INF3331. Veronika Heimsbakk December 4, Introduction 3

COMS 6100 Class Notes

Lab 2 Worksheet. Problems. Problem 1: Geometry and Linear Equations

You will be writing code in the Python programming language, which you may have learnt in the Python module.

6. How Functions Work and Are Accessed. Topics: Modules and Functions More on Importing Call Frames

HOMEWORK #4: LOGISTIC REGRESSION

Discrete Mathematics and Probability Theory Fall 2014 Anant Sahai Homework 5. This homework is due October 6, 2014, at 12:00 noon.

HOMEWORK #4: LOGISTIC REGRESSION

Designing Information Devices and Systems I Fall 2015 Anant Sahai, Ali Niknejad Homework 2. This homework is due September 14, 2015, at Noon.

Project One: C Bump functions

Physics 212E Spring 2004 Classical and Modern Physics. Computer Exercise #2

Designing Information Devices and Systems II Fall 2018 Elad Alon and Miki Lustig Homework 12

Statistical Machine Learning Theory. From Multi-class Classification to Structured Output Prediction. Hisashi Kashima.

Designing Information Devices and Systems I Spring 2018 Homework 11

VPython Class 2: Functions, Fields, and the dreaded ˆr

Statistical Machine Learning Theory. From Multi-class Classification to Structured Output Prediction. Hisashi Kashima.

Deduction by Daniel Bonevac. Chapter 3 Truth Trees

Manual for a computer class in ML

Designing Information Devices and Systems I Fall 2018 Homework 3

Inequalities. CK12 Editor. Say Thanks to the Authors Click (No sign in required)

Adam Blank Spring 2017 CSE 311. Foundations of Computing I. * All slides are a combined effort between previous instructors of the course

Section 2.3: Statements Containing Multiple Quantifiers

Computational Methods for Nonlinear Systems

DATA SCIENCE SIMPLIFIED USING ARCGIS API FOR PYTHON

Portal for ArcGIS: An Introduction

Welcome to Physics 161 Elements of Physics Fall 2018, Sept 4. Wim Kloet

MEAM 520. Homogenous Transformations

f = Xw + b, We can compute the total square error of the function values above, compared to the observed training set values:

Clojure Concurrency Constructs, Part Two. CSCI 5828: Foundations of Software Engineering Lecture 13 10/07/2014

BASIC TECHNOLOGY Pre K starts and shuts down computer, monitor, and printer E E D D P P P P P P P P P P

Designing Information Devices and Systems I Fall 2017 Homework 3

MATH 341, Section 001 FALL 2014 Introduction to the Language and Practice of Mathematics

Welcome to AP Physics C

Python. Tutorial. Jan Pöschko. March 22, Graz University of Technology

PHP-Einführung - Lesson 4 - Object Oriented Programming. Alexander Lichter June 27, 2017

NAME: DATE: Leaving Certificate GEOGRAPHY: Maps and aerial photographs. Maps and Aerial Photographs

27. THESE SENTENCES CERTAINLY LOOK DIFFERENT

Introduction to Portal for ArcGIS. Hao LEE November 12, 2015

Exam: 4 hour multiple choice. Agenda. Course Introduction to Statistics. Lecture 1: Introduction to Statistics. Per Bruun Brockhoff

FALL 2018 MATH 4211/6211 Optimization Homework 4

AMS 132: Discussion Section 2

BANKING DAY SCHEDULE (SHORT DAY)

Designing Information Devices and Systems I Spring 2016 Elad Alon, Babak Ayazifar Homework 11

Tutorial Three: Loops and Conditionals

EECS 349 (Machine Learning) Homework 4, Fall 2011

AP PHYSICS C Mechanics - SUMMER ASSIGNMENT FOR

Problem Set 5 Solutions

Dave wrote to his 3 rd Grade teacher From: Dave Date: (month/date/year, time) Subject: Greeting from Egypt To: Mr./Mrs.

Supplement for MAA 3200, Prof S Hudson, Fall 2018 Constructing Number Systems

27. THESE SENTENCES CERTAINLY LOOK DIFFERENT

Animal Physiology By Knut Schmidt-Nielsen READ ONLINE

Cell phones: If your cell phone rings, you are talking on the cell phone or text messaging I will ask you to leave for the day.

NAME: DATE: MATHS: Statistics. Maths Statistics

Computational Methods for Nonlinear Systems

Jointly Learning Python Programming and Analytic Geometry

Convex Optimization / Homework 1, due September 19

Unit 4 Advanced Functions Review Day

Section 4.2: Mathematical Induction 1

Introduction to Python

Analyzing the Earth Using Remote Sensing

Functions in Python L435/L555. Dept. of Linguistics, Indiana University Fall Functions in Python. Functions.

365 DAYS OF POSITIVE SELF-TALK BY SHAD HELMSTETTER PH.D. DOWNLOAD EBOOK : 365 DAYS OF POSITIVE SELF-TALK BY SHAD HELMSTETTER PH.D.

Classification with Perceptrons. Reading:

CS1110 Lab 3 (Feb 10-11, 2015)

DS Machine Learning and Data Mining I. Alina Oprea Associate Professor, CCIS Northeastern University

ArcGIS API for Python for Data Scientists. Andrew Chapkowski Alberto Nieto

How to Make or Plot a Graph or Chart in Excel

Student Questionnaire (s) Main Survey

8. TRANSFORMING TOOL #1 (the Addition Property of Equality)

L Hopital s Rule. We will use our knowledge of derivatives in order to evaluate limits that produce indeterminate forms.

Students will explore Stellarium, an open-source planetarium and astronomical visualization software.

Monte Python. the way back from cosmology to Fundamental Physics. Miguel Zumalacárregui. Nordic Institute for Theoretical Physics and UC Berkeley

Business Statistics. Lecture 9: Simple Regression

Transcription:

4/24/2017

Resources Course website: http://histo.ucsf.edu/bms270/ Resources on the course website: Syllabus Papers and code (for downloading before class) Slides and transcripts (available after class) On-line textbooks (Dive into Python, Numerical Recipes,...) Programs for this course (Canopy, Cluster3, JavaTreeView,...)

Homework E-mail Mark your python sessions (.ipynb files) after class E-mail Mark any homework code/results before tomorrow s class

Homework E-mail Mark your python sessions (.ipynb files) after class E-mail Mark any homework code/results before tomorrow s class It is fine to work together and to consult books, the web, etc. (but let me know if you do) It is fine to e-mail Mark questions Don t blindly copy-paste other people s code (you won t learn)

Homework E-mail Mark your python sessions (.ipynb files) after class E-mail Mark any homework code/results before tomorrow s class It is fine to work together and to consult books, the web, etc. (but let me know if you do) It is fine to e-mail Mark questions Don t blindly copy-paste other people s code (you won t learn) If you get stuck, try working things out on paper first.

Goals At the end of this class, you should have the confidence to take on the day to day tasks of bioinformatics.

Goals At the end of this class, you should have the confidence to take on the day to day tasks of bioinformatics. Analyzing data.

Goals At the end of this class, you should have the confidence to take on the day to day tasks of bioinformatics. Analyzing data. Writing standalone scripts.

Goals At the end of this class, you should have the confidence to take on the day to day tasks of bioinformatics. Analyzing data. Writing standalone scripts. Shepherding data between analysis tools.

Goals At the end of this class, you should have the confidence to take on the day to day tasks of bioinformatics. Analyzing data. Writing standalone scripts. Shepherding data between analysis tools. Aggregating data from multiple sources.

Goals At the end of this class, you should have the confidence to take on the day to day tasks of bioinformatics. Analyzing data. Writing standalone scripts. Shepherding data between analysis tools. Aggregating data from multiple sources. Implementing new methods from the literature.

Goals At the end of this class, you should have the confidence to take on the day to day tasks of bioinformatics. Analyzing data. Writing standalone scripts. Shepherding data between analysis tools. Aggregating data from multiple sources. Implementing new methods from the literature. This is also good preparation for communicating with computational collaborators.

Course problems: expression and sequence analysis

Course problems: expression and sequence analysis Part 2: Genotype (Sequence analysis) Part 1: Phenotype (Expression profiling)

Course tool: Python

Python distribution: Enthought Canopy

Python distribution: Enthought Canopy

Python distribution: Enthought Canopy

Python shell: ipython (jupyter) notebook

Anatomy of a Programming Language

Anatomy of a Programming Language

Anatomy of a Programming Language

Anatomy of a Programming Language

Talking to Python: Nouns # This i s a comment # This i s an i n t ( i n t e g e r ) 42 # This i s a f l o a t ( r a t i o n a l number ) 4. 2 # These a r e a l l s t r i n g s ( s e q u e n c e s o f c h a r a c t e r s ) ATGC Mendel s Laws >CAA36839. 1 Calmodulin MADQLTEEQIAEFKEAFSLFDKDGDGTITTKELGTVMRSLGQNPTEAEL QDMINEVDADDLPGNGTIDFPEFLTMMARKMKDTDSEEEIREAFRVFDK DGNGYISAAELRHVMTNLGEKLTDEEVDEMIREADIDGDGQVNYEEFVQ MMTAK

Python as a Calculator # A d d i t i o n 1+1 # S u b t r a c t i o n 2 3 # M u l t i p l i c a t i o n 3 5 # D i v i s i o n ( gotcha : be s u r e to use f l o a t s ) 5/3.0 # E x p o n e n t i a t i o n 2 3 # Order o f o p e r a t i o n s 2 3 (3+4) 2

Remembering objects # Use a s i n g l e = f o r a s s i g n m e n t : TLC = GATACA YFG = CTATGT MFG = CTATGT # A name can o ccur on both s i d e s o f an assignment : c o d o n p o s i t i o n = 1857 c o d o n p o s i t i o n = c o d o n p o s i t i o n + 3 # Short hand f o r common updates : codon += 3 weight = 10 e x p r e s s i o n = 2 CFU /= 10.0

Python as a Calculator 1 Calculate the molarity of a 70mer oligonucleotide with A 260 =.03 using the formula from Maniatis: C =.02A 260 330L (1) 2 Calculate the T m of a QuickChange mutagenesis primer with length 25bp (L = 25), 13 GC bases (n GC = 13), and 2 mismatches to the template (n MM = 2) using the formula from Stratagene: T m = 81.5 + 41n GC 100n MM 675 L (2)

Displaying values with print # Use p r i n t to show the v a l u e o f an o b j e c t message = Hello, world p r i n t ( message ) # Or s e v e r a l o b j e c t s : p r i n t ( 1, 2, 3, 4 ) # Older v e r s i o n s o f Python use a # d i f f e r e n t p r i n t s y n t a x p r i n t Hello, world

Comparing objects # Use double == f o r comparison : YFG == MFG # Other comparison o p e r a t o r s : # Not e q u a l : TLC!= MFG # L e s s than : 3 < 5 # G r e a t e r than, or e q u a l to : 7 >= 6

Making decisions i f (YFG == MFG) : p r i n t Synonyms! i f ( p r o t e i n l e n g t h < 6 0 ) : p r i n t Probably too s h o r t to f o l d. e l i f ( p r o t e i n l e n g t h > 1 0 0 0 0 ) : p r i n t What i s t h i s, t i t i n? e l s e : p r i n t Okay, t h i s l o o k s r e a s o n a b l e.

Collections of objects # A l i s t i s a mutable sequence o f o b j e c t s m y l i s t = [ 1, 3.1415926535, GATACA, 4, 5 ] # I n d e x i n g m y l i s t [ 0 ] == 1 m y l i s t [ 1] == 5 # A s s i g n i n g by i n d e x m y l i s t [ 0 ] = ATG # S l i c i n g m y l i s t [ 1 : 3 ] == [ 3. 1 4 1 5 9 2 6 5 3 5, GATACA ] m y l i s t [ : 2 ] == [ 1, 3.1415926535] m y l i s t [ 3 : ] == [ 4, 5 ] # A s s i g n i n g a second name to a l i s t a l s o m y l i s t = m y l i s t # A s s i g n i n g to a copy o f a l i s t m y o t h e r l i s t = m y l i s t [ : ]

Repeating yourself: iteration # A f o r l o o p i t e r a t e s through a l i s t one element # at a time : f o r i i n [ 1, 2, 3, 4, 5 ] : p r i n t i, i 2 # A w h i l e l o o p i t e r a t e s f o r as l o n g as a c o n d i t i o n # i s t r u e : p o p u l a t i o n = 1 while ( p o p u l a t i o n < 1 e5 ) : p r i n t p o p u l a t i o n p o p u l a t i o n = 2

Verb that noun! return value = function(parameter,...) Python, do function to parameter # B u i l t i n f u n c t i o n s # Generate a l i s t from 0 to n 1 a = range ( 5 ) # Sum o v e r an i t e r a b l e o b j e c t sum( a ) # Find the l e n g t h o f an o b j e c t len ( a )

Verb that noun! return value = function(parameter,...) Python, do function to parameter # I m p o r t i n g f u n c t i o n s from modules import numpy numpy. s q r t ( 9 ) import m a t p l o t l i b. p y p l o t as p l t f i g = p l t. f i g u r e ( ) p l t. p l o t ( [ 1, 2, 3, 4, 5 ], [ 0, 1, 0, 1, 0 ] ) from IPython. c o r e. d i s p l a y import d i s p l a y d i s p l a y ( f i g )

New verbs def f u n c t i o n ( parameter1, parameter2 ) : Do t h i s! # Code to do t h i s return r e t u r n v a l u e

Summary Python is a general purpose programming language.

Summary Python is a general purpose programming language. We can extend Python s built-in functions by defining our own functions (or by importing third party modules).

Summary Python is a general purpose programming language. We can extend Python s built-in functions by defining our own functions (or by importing third party modules). We can define complex behaviors through control statements like for, while, and if.

Summary Python is a general purpose programming language. We can extend Python s built-in functions by defining our own functions (or by importing third party modules). We can define complex behaviors through control statements like for, while, and if. We can use an interactive Python session to experiment with new ideas and to explore data.

Summary Python is a general purpose programming language. We can extend Python s built-in functions by defining our own functions (or by importing third party modules). We can define complex behaviors through control statements like for, while, and if. We can use an interactive Python session to experiment with new ideas and to explore data. Saving interactive sessions is a good way to document our computer experiments.

Summary Python is a general purpose programming language. We can extend Python s built-in functions by defining our own functions (or by importing third party modules). We can define complex behaviors through control statements like for, while, and if. We can use an interactive Python session to experiment with new ideas and to explore data. Saving interactive sessions is a good way to document our computer experiments. Likewise, we can use modules and scripts to document our computer protocols.

Summary Python is a general purpose programming language. We can extend Python s built-in functions by defining our own functions (or by importing third party modules). We can define complex behaviors through control statements like for, while, and if. We can use an interactive Python session to experiment with new ideas and to explore data. Saving interactive sessions is a good way to document our computer experiments. Likewise, we can use modules and scripts to document our computer protocols. Most of these statements are applicable to any programming language (Perl, R, Bash, Java, C/C++, FORTRAN,...)

Homework: Make your own Fun Write functions for these calculations, and test them on random data: 1 Mean: 2 Standard deviation: σ x = x = N i N x i N i (x i x) 2 N 1 3 Correlation coefficient (Pearson s r): i r(x, y) = (x i x)(y i ȳ) i (x i x) 2 i (y i ȳ) 2