Introduction to Python for Biologists

Size: px
Start display at page:

Download "Introduction to Python for Biologists"

Transcription

1 Introduction to Python for Biologists Katerina Taškova 1 Jean-Fred Fontaine 1,2 1 Faculty of Biology, Johannes Gutenberg-Universität Mainz, Mainz, Germany 2 Genomics and Computational Biology, Kernel Press, Mainz, Germany March 21, 2017

2 Introduction to Python for Biologists Table of Contents Introduction Running code Literals and variables Numeric types Strings Exercise Lists, tuples and ranges Sets and dictionaries Convert and copy Loops Functions Branching Regular Expressions Annexes March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 2

3 Introduction to Python for Biologists Introduction Introduction Running code Literals and variables Numeric types Strings Exercise Lists, tuples and ranges Sets and dictionaries Convert and copy Loops Functions Branching Regular Expressions Annexes March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 3

4 Introduction to Python for Biologists Introduction What is Python? Python is a general-purpose programming language created by Guido van Rossum (1991) high-level (abstraction from the details of the computer) interpreted (needs an interpreter software) Python design philosophy code readability syntax brevity Python is widely used for Biology rich built-in features powerful scientific extensions plotting capabilities March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 4

5 Introduction to Python for Biologists Introduction Structured programming I Instructions are executed sequentially, one per line Conditional statements allow selective execution of code blocks Loops allow repeated execution of code blocks Functions allow on-demand execution of code blocks March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 5

6 Introduction to Python for Biologists Introduction Structured programming II 1 i n s t r u c t i o n 1 # 1 s t i n s t r u c t i o n ( hashtag # s t a r t s comments ) 2 # blank l i n e 3 repeat 20 times # 2nd i n s t r u c t i o n ( loop s t a r t s a block ) 4 i n s t r u c t i o n a # block defined by i n d e n t a t i o n ( spaces or tabs ) 5 i n s t r u c t i o n b # 2nd i n s t r u c t i o n i n block 6 # blank l i n e 7 i f n>10 # 3 rd i n s t r u c t i o n ( C o n d i t i o n a l statement ) 8 i n s t r u c t i o n a # 1 s t i n s t r u c t i o n i n block 9 i n s t r u c t i o n b # 2nd i n s t r u c t i o n i n block 10 # blank l i n e 11 # blank l i n e 12 # backslashs j o i n l i n e s 13 i n s t r u c t i o n 3 \ # 3 rd i n s t r u c t i o n, p a r t 1 14 i n s t r u c t i o n 3 # 3 rd i n s t r u c t i o n, p a r t 2 15 # blank l i n e 16 # Expressions i n ( ), {}, or [ ] can span m u l t i p l e l i n e s 17 i n s t r u c t i o n 4 ( 1, 2, 3 # 4 th i n s t r u c t i o n, p a r t , 5, 6) # 4 th i n s t r u c t i o n, p a r t 2 March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 6

7 Introduction to Python for Biologists Introduction Namespace Variables are names associated with data e.g. a=2 assigns value 2 to variable a Functions are names associated to specific code blocks built-in functions are available (see list on slide 100) e.g. print(a) will display 2 on the screen The user namespace is the set of names available to the user users can define new names of variables and functions in their namespace imported modules can add names of variables and functions in the user namespace March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 7

8 Introduction to Python for Biologists Introduction Object-oriented programming Data is organized in classes and objects a class is a template defining what objects can store and do an object is an instance of a class objects have attributes to store data and methods to do actions object namespaces are different from user namespace Example class Human is defined as: has a name (an attribute name ) has an age (an attribute age ) can introduce itself (a method who ) example with 1 existing Human object P1: 1 P1. name = Mary # assigns value to a t t r i b u t e name 2 P1. age = 26 # assigns value to a t t r i b u t e age 3 P1. who ( ) # d i s p l a y s My name i s Mary I am 2 6! 4 who ( ) # e r r o r! not i n the user namespace March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 8

9 Introduction to Python for Biologists Introduction Modules Modules can add functionalities to Python e.g. classes and functions Example of available modules: NumPy for scientific computing Matplotlib for plotting BioPython for Biology Modules have to be imported into the code 1 # import datetime module i n i t s own namespace 2 import datetime 3 datetime. date. today ( ) # today ( ) # e r r o r! 5 6 # import f u n c t i o n s log2 and log10 from module math 7 # i n c u r r e n t namespace 8 from math import log2, log10 9 log10 ( 1 ) # equal 0 March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 9

10 Introduction to Python for Biologists Running code Introduction Running code Literals and variables Numeric types Strings Exercise Lists, tuples and ranges Sets and dictionaries Convert and copy Loops Functions Branching Regular Expressions Annexes March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 10

11 Introduction to Python for Biologists Running code Running code I From a terminal by using the interactive Python shell 1 $ python3 # opens Python s h e l l 2 a=2 # assigns 2 to a 3 b=3 # assigns 3 to b 4 e x i t ( ) # closes Python s h e l l From a terminal by running a script file e.g. let say myscript.py is a script file (simple text file) and it contains: print( hello world! ) 1 $ python3 myscript. py # runs python3 and the s c r i p t 2 h e l l o world! # r e s u l t of the s c r i p t on the t e r m i n a l March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 11

12 Introduction to Python for Biologists Running code Running code II From Jupyter Notebook web-based graphical interface manage cells of code or text see execution results on the same notebook save/open notebooks March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 12

13 Introduction to Python for Biologists Running code Documentation and messages I Documentation and help: use the built-in help() function e.g. help(print) to display help for function print() see help menu or Google it Examples of error messages 1 # F o r g e t t i n g quotes 2 p r i n t ( Hello world ) 3 # F i l e < s t d i n >, l i n e 2 4 # p r i n t ( Hello world ) 5 # ˆ 6 # SyntaxError : i n v a l i d syntax March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 13

14 Introduction to Python for Biologists Running code Documentation and messages II 1 # S p e l l i n g mistakes 2 p r i n ( Hello world ) 3 # Traceback ( most recent c a l l l a s t ) : 4 # F i l e < s t d i n >, l i n e 2, i n <module> 5 # NameError : name p r i n i s not defined 1 # Wrong l i n e break w i t h i n a s t r i n g 2 p r i n t ( Hello 3 World ) 4 # F i l e < s t d i n >, l i n e 2 5 # p r i n t ( Hello 6 # ˆ 7 # SyntaxError : EOL while scanning s t r i n g l i t e r a l March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 14

15 Introduction to Python for Biologists Literals and variables Introduction Running code Literals and variables Numeric types Strings Exercise Lists, tuples and ranges Sets and dictionaries Convert and copy Loops Functions Branching Regular Expressions Annexes March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 15

16 Introduction to Python for Biologists Literals and variables Numeric and strings literals I 1 # Numeric l i t e r a l s E3 # means # S t r i n g s l i t e r a l s 7 A s t r i n g # A s t r i n g 8 A s t r i n g # A s t r i n g 9 A s t r i n g # A s t r i n g 10 Three s i n g l e quotes # Three s i n g l e quotes 11 Three double quotes # Three double quotes 12 A \ s t r i n g \ # A s t r i n g ( backslash escape sequence ) 13 r A \ s t r i n g \ # A \ s t r i n g \ ( raw s t r i n g ) Python stores literals in objects of corresponding classes (class int for integers, float for floatting point, and str for strings) March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 16

17 Introduction to Python for Biologists Literals and variables Numeric and strings literals II Printing numeric and strings literals 1 p r i n t (12) # 12 2 p r i n t (1+2) # p r i n t ( Hello World ) # Hello World 5 6 p r i n t ( Hello World, 1+2) # Hello World 3 7 p r i n t ( Hello World, 1+2, sep= ) # Hello World 3 8 p r i n t ( Hello World, 1+2, sep= \ t ) # Hello World 3 9 # (\ t : tab, \n : newline ) p r i n t ( AB, end= ) # AB ( avoid newline at the end ) 12 p r i n t ( CD ) # ABCD p r i n t ( Max i s, 12, and Min i s, 3) # Max i s 12 and Min i s 3 March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 17

18 Introduction to Python for Biologists Literals and variables Variables I Variables are names used to access objects first letter is a character (not a digit) no space characters allowed case-sensitive (variable name var is not Var) prefer alphanumeric characters (e.g. abc123) avoid accents, non-alphanumeric, non English underscores may be used (e.g. abc 123) The following keywords can not be used as variable names and, assert, break, class, continue def, del, elif, else, except, exec, finally, for, from global, if, import, in, is, lambda, not, or, pass print, raise, return, try, while, yield March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 18

19 Introduction to Python for Biologists Literals and variables Variables II 1 # Numeric types 2 a=2 # a i s assigned an i n t o b j e c t of value 2 3 p r i n t ( a ) # p r i n t s the o b j e c t assigned to a ( 2 ) 4 b=a # b i s assigned the same o b j e c t as a ( 2 ) 5 p r i n t ( b ) # 2 6 a=5 # a i s assigned a new o b j e c t of value 5 7 p r i n t ( a ) # 5 8 p r i n t ( b ) # 2 ( b i s s t i l l assigned to o b j e c t of value 2) 9 10 # S t r i n g s 11 c1= a 12 p r i n t ( c1 ) # a 13 myname125 = abc March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 19

20 Introduction to Python for Biologists Numeric types Introduction Running code Literals and variables Numeric types Strings Exercise Lists, tuples and ranges Sets and dictionaries Convert and copy Loops Functions Branching Regular Expressions Annexes March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 20

21 Introduction to Python for Biologists Numeric types Numeric types I 1 type ( 7 ) # <class i n t > ( i n t e g e r number ) 2 type ( ) # <class f l o a t > ( f l o a t i n g p o i n t ) 3 type (4.52 e 3) # <class f l o a t > ( f l o a t i n g p o i n t ) 4 5 # Operators ( s p e c i a l b u i l t i n f u n c t i o n s ) # 4 ( a d d i t i o n ) # 3 ( s u b s t r a c t i o n ) # 6 ( m u l t i p l i c a t i o n ) 9 9 / 2 # 4.5 ( d i v i s i o n ) 10 9 / / 2 # 4 ( i n t e g e r d i v i s i o n ) 11 9 % 2 # 1 ( i n t e g e r d i v i s i o n remainder ) # 8 ( exponent ) # Lowest to highest operators precedence ( equal i f on same l i n e ) 15 +, # Addition, S u b t r a c t i o n 16, /, / /, % # M u l t i p l i c a t i o n, D i v i s i o n s, Remainder 17 +x, x # P o s i t i v e, Negative 18 # Exponentiation March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 21

22 Introduction to Python for Biologists Numeric types Numeric types II 1 # B u i l t i n f u n c t i o n s 2 abs ( 2.58) # 2.58 ( absolute value of x ) 3 round ( 2. 5 ) # 2 ( round to c l o s e s t i n t e g e r ) 4 5 # With v a r i a b l e s 6 a = 1 # 1 7 b = # 2 8 c = a + b # 3 9 d = a+c b # 7 ( precedence of over +) 10 d = ( a+c ) b # 8 ( use parentheses to break precedence ) # Short n o t a t i o n s ( v a l i d f o r +,,, /,... ) 13 a += 1 # a = a a = 5 # a = a # Special f l o a t values 17 f l o a t ( NaN ) # nan ( Not a Number ) 18 f l o a t ( I n f ) # i n f : I n f i n i t e p o s i t i v e ; i n f : I n f i n i t e negative March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 22

23 Introduction to Python for Biologists Strings Introduction Running code Literals and variables Numeric types Strings Exercise Lists, tuples and ranges Sets and dictionaries Convert and copy Loops Functions Branching Regular Expressions Annexes March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 23

24 Introduction to Python for Biologists Strings Sequence types Text sequence type: Strings: immutable sequences of characters Basic sequence types: Lists: mutable sequences Tuples: immutable sequences Ranges: immutable sequence of numbers Sequence operations: All sequence types support common sequence operations (slide 98) Mutable sequence types support specific operations (slide 99) March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 24

25 Introduction to Python for Biologists Strings Strings I 1 # Quotes 2 A s t r i n g # A s t r i n g 3 A s t r i n g # A s t r i n g 4 A s t r i n g # A s t r i n g 5 Three s i n g l e quotes # Three s i n g l e quotes 6 Three double quotes # Three double quotes 7 8 # Escape sequences ( see annexes ) 9 A s i n g l e quote # A s i n g l e quote 10 A s i n g l e quote \ # A s i n g l e quote 11 A t a b u l a t i o n \ t 12 A newline \n See other escape sequences in slide 97 Triple quoted strings may span multiple lines - all associated whitespace will be included in the string literal March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 25

26 Introduction to Python for Biologists Strings Strings II 1 # Operators 2 pipe + t t e # = p i p e t t e ( concatenation ) 3 A 7 # = AAAAAAA ( r e p l i c a t i o n ) 4 A 3 + C 2 # = AAACC 5 A + s t r ( 2. 0 ) # = A2. 0 ( convert number then concatenate ) 6 7 # B u i l t i n f u n c t i o n s 8 len ( A s t r i n g of characters ) # 22 ( l e n g t h i n characters ) 9 type ( a ) # <class s t r > ( s t r i n g ) # S l i c e s [ s t a r t : end : step ] (0 i s index of f i r s t character ) 12 ABCDEFG [ 2 : 5 ] # CDE ( F at index 5 excluded ) 13 ABCDEFG [ : 5 ] # ABCDE ( from begining ) 14 ABCDEFG [ 5 : ] # FG ( to the end ) 15 ABCDEFG [ 2:] # FG ( 2 from the end : to the end ) 16 ABCDEFG [ 0 : 5 : 2 ] # ACE ( every second l e t t e r w ith step =2) March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 26

27 Introduction to Python for Biologists Strings Strings methods I Strings are immutable: new objects are created for changes 1 seq = ACGtCCAgTnAGaaGT 2 3 # Case 4 seq. c a p i t a l i z e ( ) # Acgtccagtnagaagt 5 seq. casefold ( ) # acgtccagtnagaagt ( e s z e t t => ss ) 6 seq. lower ( ) # acgtccagtnagaagt ( e s z e t t => e s z e t t ) 7 seq. swapcase ( ) # acgtccagtnagaagt 8 seq. upper ( ) # ACGTCCAGTNAGAAGT 9 10 # Search and replace 11 seq. count ( a ) # 2 ( case s e n s i t i v e ) 12 seq. count ( G,0, 4) # 1 ( s l i c e s t a r t and end indexes ) 13 seq. endswith ( GT ) # True 14 seq. endswith ( G,0, 4) # False ( s l i c e s t a r t and end indexes ) 15 seq. f i n d ( GtC ) # 2 (1 s t h i t index, 1 otherwise ) 16 seq. replace ( aa, t t ) # ACGtCCAgTnAGttGT ( case s e n s i t i v e ) 17 seq. replace ( A, x, 2) # xcgtccxgtnagaagt (2 f i r s t h i t s only ) March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 27

28 Introduction to Python for Biologists Strings Strings methods II 1 seq = ACGtCCAgTnAGaaGT 2 3 # I s f u n c t i o n s 4 seq. isalnum ( ) # True ( Are a l l characters alphanumeric?) 5 seq. i s a l p h a ( ) # True ( Are a l l characters a l p h a b e t i c?) 6 seq. i s l o w e r ( ) # False ( Are a l l characters lowercase?) 7 seq. isnumeric ( ) # False ( Are a l l numeric characters?) 8 seq. isspace ( ) # False ( Are a l l whitespace characters?) 9 seq. isupper ( ) # False ( Are a l l characters uppercase?) # Join and s p l i t 12. j o i n ( [ A, B ] ) # A B 13. j o i n ( seq ) # A C G t C C A g T n A G a a G T 14 seq. p a r t i t i o n ( aa ) # ( ACGtCCAgTnAG, aa, GT ) : a t u p l e 15 seq. s p l i t ( aa ) # [ ACGtCCAgTnAG, GT ] : a l i s t 16 1\n2. s p l i t l i n e s ( ) # [ 1, 2 ] ( s p l i t at l i n e boundaries \r, \n ) March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 28

29 Introduction to Python for Biologists Strings Strings methods III 1 seq = ACGtCCAgTnAGaaGT 2 3 # D e l e t i n g 4 seq. l s t r i p ( ) # remove leading whitespace characters 5 seq. r s t r i p ( ) # remove t r a i l i n g whitespace characters 6 seq. s t r i p ( ) # remove whitespace characters from both ends 7 8 seq. l s t r i p ( AC ) # GtCCAgTnAGaaGT ( remove C s or A s ) 9 seq. l s t r i p ( CA ) # GtCCAgTnAGaaGT ( remove C s or A s ) 10 seq. l s t r i p ( C ) # ACGtCCAgTnAGaaGT ( no impact ) 11 # same f o r r s t r i p but from the r i g h t and s t r i p from both ends # Simple parsing of t e x t l i n e s from CSV f i l e s 14 l i n e. s t r i p ( ). s p l i t (, ) # remove newline and s p l i t CSV (\ t i f TSV) # t r a n s l a t e ( case s e n s i t i v e ) 17 t a b l e = seq. maketrans ( atcg, tagc ) # map characters by index 18 seq. lower ( ). t r a n s l a t e ( t a b l e ) # t g c a g g t c a n t c t t c a March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 29

30 Introduction to Python for Biologists Exercise Introduction Running code Literals and variables Numeric types Strings Exercise Lists, tuples and ranges Sets and dictionaries Convert and copy Loops Functions Branching Regular Expressions Annexes March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 30

31 Introduction to Python for Biologists Exercise Exercise Create the following directory structure Dokumente python notebooks data Jupyter Notebook File: Literals.ipynb URL: Download the file into the notebooks folder March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 31

32 Introduction to Python for Biologists Lists, tuples and ranges Introduction Running code Literals and variables Numeric types Strings Exercise Lists, tuples and ranges Sets and dictionaries Convert and copy Loops Functions Branching Regular Expressions Annexes March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 32

33 Introduction to Python for Biologists Lists, tuples and ranges Sequence types Text sequence type: Strings: immutable sequences of characters Basic sequence types: Lists: mutable sequences Tuples: immutable sequences Ranges: immutable sequence of numbers Sequence operations: All sequence types support common sequence operations (slide 98) Mutable sequence types support specific operations (slide 99) March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 33

34 Introduction to Python for Biologists Lists, tuples and ranges Lists I A List is an ordered collection of objects 1 L i s t 1 = [ ] # an empty l i s t 2 3 L i s t 1 = [ b, a, 1, cat, K, dog, F ] 4 L i s t 1 [ 0 ] # b ( access item of index 0) 5 L i s t 1 [ 1 ] # a ( access item of index 1) 6 L i s t 1 [ 1] # F ( access the l a s t item ) 7 L i s t 1 [ 2] # dog ( access the second l a s t item ) 8 9 # S l i c e s [ s t a r t : end : step ] 10 L i s t 1 [ 2 : 5 ] # [ 1, cat, K ] ( index 5 excluded ) 11 L i s t 1 [ : 5 ] # [ b, a, 1, cat, K ] 12 L i s t 1 [ 5 : ] # [ dog, F ] 13 L i s t 1 [ 2:] # [ dog, F ] 14 L i s t 1 [ 0 : 5 : 2 ] # [ b, 1, K ] March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 34

35 Introduction to Python for Biologists Lists, tuples and ranges Lists II 1 # B u i l t i n f u n c t i o n s 2 L i s t 2 = [ 1, 2, 3, 4, 5] 3 len ( L i s t 1 ) # 5 ( l e n g t h = 7 items ) 4 max( L i s t 2 ) # 5 5 min ( L i s t 2 ) # 1 6 sum( L i s t 2 ) # # L i s t methods 9 L i s t 2 = [ ] # empty l i s t 10 L i s t 2. append ( 1 ) # [ 1 ] 11 L i s t 2. append ( A ) # [ 1, A ] 12 L i s t 2. extend ( [ B, 2 ] ) # [ 1, A, B, 2] 13 L i s t 2. pop ( 2 ) # [ 1, A, 2] 14 L i s t 2. i n s e r t ( 3, A ) # [ 1, A, 2, A ] ( i n s e r t A at index 3) 15 L i s t 2. index ( A ) # 1 ( index of the 1 s t A ) 16 L i s t 2. count ( A ) # 2 ( number of A ) 17 L i s t 2. reverse ( ) # [ A, 2, A, 1] March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 35

36 Introduction to Python for Biologists Lists, tuples and ranges Lists III 1 # s o r t i n g 2 L i s t 3 = [ 5, 3, 4, 1, 2] 3 sorted ( L i s t 3 ) # [ 1, 2, 3, 4, 5] ( b u i l d a new sorted l i s t ) 4 L i s t 3 # [ 5, 3, 4, 1, 2] ( L i s t 3 not changed ) 5 L i s t 3. s o r t ( ) # modifies the l i s t in place 6 L i s t 3 # [ 1, 2, 3, 4, 5] (. s o r t ( ) did modify L i s t 3! ) 7 8 # nested l i s t / 2D l i s t s / t a b l e s 9 mylist = [ [ b, a ], 10 [ 1, cat ] ] # a l i s t of 2 l i s t s 11 mylist [ 0 ] # r e t u r n s the f i r s t l i s t [ b, a ] 12 mylist [ 0 ] [ 0 ] # b (1 s t item of the 1 s t l i s t ) 13 mylist [ 0 ] [ 1 ] # a (2 nd item of the 1 s t l i s t ) 14 mylist [ 1 ] # r e t u r n s the 2nd l i s t [ 1, cat ] 15 mylist [ 1 ] [ 0 ] = 1 0 # [ [ b, a ], [10, cat ] ] March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 36

37 Introduction to Python for Biologists Lists, tuples and ranges Lists IV 1 mylist = [ [ b, a ], 2 [ 1, cat ] ] 3 4 f o r s u b l i s t i n mylist : # loop over s u b l i s t s 5 f o r value i n s u b l i s t : # loop over values 6 p r i n t ( value ) # p r i n t 1 value per l i n e 7 # b 8 # a 9 # # cat f o r s u b l i s t i n mylist : # loop over s u b l i s t s 13 n e w s u b l i s t = map( s t r, s u b l i s t ) # convert each item to s t r i n g 14 p r i n t ( \ t. j o i n ( n e w s u b l i s t ) ) # p r i n t as TSV t a b l e 15 # b a 16 # 10 cat March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 37

38 Introduction to Python for Biologists Lists, tuples and ranges Tuples and ranges A Tuple is an ordered collection of objects 1 Tuple1 = ( ) # empty t u p l e 2 Tuple1 = ( b, a, 1, cat, K, dog, F ) # defined t u p l e 3 4 Tuple1 [ 0 ] # b 5 Tuple1 [ 1 : 3 ] # ( a, 1) ( index 3 excluded ) Ranges 1 # Range ( s t a r t, stop [, step ] ) 2 range (10) # range ( 0, 10) => no nice p r i n t method 3 l i s t ( range (10) ) # [ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9] 4 l i s t ( range ( 0, 30, 5) ) # [ 0, 5, 10, 15, 20, 25] 5 l i s t ( range ( 0, 5, 1) ) # [ 0, 1, 2, 3, 4] 6 l i s t ( range ( 0 ) ) # [ ] 7 l i s t ( range ( 1, 0) ) # [ ] March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 38

39 Introduction to Python for Biologists Sets and dictionaries Introduction Running code Literals and variables Numeric types Strings Exercise Lists, tuples and ranges Sets and dictionaries Convert and copy Loops Functions Branching Regular Expressions Annexes March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 39

40 Introduction to Python for Biologists Sets and dictionaries Sets I A Set is a mutable unordered collection of objects 1 S0 = set ( ) # an empty set 2 S0 = { a, 1} # a new set of 2 items 3 S1 = { a, 1, b, R } # a new set of 4 items 4 S2 = { a, 1, b, S } # a new set of 4 items 5 len ( S0 ) # # Operators 8 R i n S1 # True 9 R not i n S2 # True 10 S1 S2 # i n S1 but not i n S2 => { R } 11 S1 S2 # i n S1 or i n S2 => {1, a, S, R, b } 12 S1 & S2 # i n S1 and i n S2 => {1, b, a } 13 S1 ˆ S2 # i n S1 or i n S2 but not i n both => { R, S } 14 S0 <= S1 # S0 i s subset of S2 => True 15 S1 >= S2 # S1 i s superset of S2 => False 16 S1 >= S0 # True 17 S0. i s d i s j o i n t ( S1 ) # False March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 40

41 Introduction to Python for Biologists Sets and dictionaries Sets II 1 # Methods 2 S0. copy ( ) # r e t u r n a new set w ith a shallow copy of S0 3 S0. add ( item ) # add element item to the set 4 S0. remove ( item ) # remove element item from the set 5 S0. discard ( item ) # remove element item from the set i f present 6 S0. pop ( ) # remove and r e t u r n an a r b i t r a r y element 7 S0. c l e a r ( ) # remove a l l elements from the set March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 41

42 Introduction to Python for Biologists Sets and dictionaries Dictionaries I A Dictionary is a mutable indexed collection of objects (indexed by unique keys) 1 d = {} # empty d i c t i o n a r y 2 d = { A : ALA, C : CYS } # d i c t i o n a r y w ith 2 items 3 d [ A ] # ALA 4 d [ C ] # CYS 5 d [ H ]= HIS # add new item 6 d # { H : HIS, C : CYS, A : ALA } 7 del d [ A ] # { C : CYS, H : HIS } 8 9 C i n d # True ( key C i s i n d ) 10 A not i n d # True ( key A i s not i n d anymore ) March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 42

43 Introduction to Python for Biologists Sets and dictionaries Dictionaries II d[key] get value by key d[key] = val set value by key del d[key] delete item by key d.clear() delete all items len(d) number of items d.copy() make a shallow copy d.keys() return a view of all keys d.values() return a view of all values d.items() return a view of all items (key,value) d.update(d2) add all items from dictionary d2 d.get(key [, val]) get value by key if exists, otherwise val d.setdefaults(key [, val]) like d.get(k,val), also set d[k]=val if k not in d pop(key[, default]) remove key and return its value, return default otherwise. d.popitem() remove a random item and returns it as tuple Table: Functions for dictionaries March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 43

44 Introduction to Python for Biologists Convert and copy Introduction Running code Literals and variables Numeric types Strings Exercise Lists, tuples and ranges Sets and dictionaries Convert and copy Loops Functions Branching Regular Expressions Annexes March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 44

45 Introduction to Python for Biologists Convert and copy Converting types I Many Python functions are sensitive to the type of data. For example, you cannot concatenate a string with an integer: 1 sign = You are years old # e r r o r!! 2 sign = You are + s t r ( 21) + years old # OK 3 sign # You are 21 years old 4 5 # convert to i n t ( from s t r or f l o a t ) 6 i n t ( 2014 ) # from a s t r i n g 7 i n t ( ) # from a f l o a t 8 9 # convert to f l o a t ( from s t r or i n t ) 10 f l o a t ( 1.99 ) # from a s t r i n g 11 f l o a t ( 5 ) # from an i n t e g e r March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 45

46 Introduction to Python for Biologists Convert and copy Converting types II 1 # convert to s t r ( from i n t, f l o a t, l i s t, tuple, d i c t and set ) 2 s t r ( ) # s t r ( [ 1, 2, 3, 4 ] ) # [ 1, 2, 3, 4 ] 4 5 # convert a sequence type to another 6 # ( s t r, l i s t, tuple, and set f u n c t i o n s ) 7 new set = set ( o l d l i s t ) # l i s t to set 8 new tuple = t u p l e ( o l d l i s t ) # l i s t to t u p l e 9 new set = set ( Hello ) # s t r i n g to set { H, o, e, l } 10 n e w l i s t = l i s t ( Hello ) # s t r i n g to l i s t [ H, e, l, l, o ] March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 46

47 Introduction to Python for Biologists Convert and copy Copy I Assignments (=) do not copy objects, they create bindings between a target and an object. 1 # Numeric types ( immutable ) 2 a = 1 # a binds the o b j e c t 1 3 b = a # b binds the o b j e c t 1 4 b = b + 1 # b binds a new o b j e c t created by the sum 5 a # 1 6 b # # S t r i n g s ( immutable ) 9 a = Hello # a binds the o b j e c t Hello 10 b = a # b binds the o b j e c t Hello 11 a = a. replace ( o, o World! ) # a binds a new o b j e c t 12 a # Hello World! 13 b # Hello March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 47

48 Introduction to Python for Biologists Convert and copy Copy II For collections that are mutable or contain mutable items, a shallow copy is sometimes needed so one can change one copy without changing the other. 1 # D i c t i o n a r y ( mutable ) 2 d1 = { A : ALA, C : CYS } # d1 binds the o b j e c t 3 d2 = d1 # d2 binds the o b j e c t 4 d2 [ H ]= HIS # add item to the o b j e c t 5 d1 # { A : ALA, H : HIS, C : CYS } 6 d2 # { A : ALA, H : HIS, C : CYS } 7 8 d2 = d1. copy ( ) # d2 binds a shallow copy of the o b j e c t 9 d2 [ P ]= PRO # add item to the copied o b j e c t 10 d1 # { A : ALA, H : HIS, C : CYS } 11 d2 # { A : ALA, H : HIS, P : PRO, C : CYS } March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 48

49 Introduction to Python for Biologists Convert and copy Copy III 1 # L i s t ( mutable ) 2 l 1 = [ A, H, C ] 3 l 2 = l 1 4 l 2. append ( P ) 5 l 1 # [ A, H, C, P ] 6 l 2 # [ A, H, C, P ] 7 8 l 2 = l 1 [ : ] # shallow copy by assigning a s l i c e of the a l l l i s t 9 l 2. append ( V ) 10 l 1 # [ A, H, C, P ] 11 l 2 # [ A, H, C, P, V ] March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 49

50 Introduction to Python for Biologists Convert and copy Copy IV Convert types to get copies 1 n e w l i s t = l i s t ( o l d l i s t ) # shallow copy 2 new dict = d i c t ( o l d d i c t ) # shallow copy 3 new set = set ( o l d l i s t ) # copy l i s t as a set 4 new tuple = t u p l e ( o l d l i s t ) # copy l i s t a t u p l e The copy module 1 import copy 2 x. copy ( ) # shallow copy of x 3 x. deepcopy ( ) # deep copy of x, i n c l u d i n g embedded o b j e c t s March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 50

51 Introduction to Python for Biologists Loops Introduction Running code Literals and variables Numeric types Strings Exercise Lists, tuples and ranges Sets and dictionaries Convert and copy Loops Functions Branching Regular Expressions Annexes March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 51

52 Introduction to Python for Biologists Loops For loop I 1 # For items i n a l i s t 2 f o r person i n [ I s a b e l, Kate, Michael ] : 3 p r i n t ( Hi, person ) 4 # Hi I s a b e l 5 # Hi Kate 6 # Hi Michael 7 8 # For items i n a d i c t i o n a r y 9 seq = # an empty s t r i n g 10 d = { A : ALA, C : CYS } # a d i c t i o n a r y w ith 2 keys 11 f o r k i n d. keys ( ) : # loop over the keys 12 seq += d [ k ] # append value to seq 13 p r i n t ( seq ) # CYSALA March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 52

53 Introduction to Python for Biologists Loops For loop II 1 # For items i n a s t r i n g 2 f o r c i n abc : 3 p r i n t ( c ) 4 # a 5 # b 6 # c 7 8 # For items i n a range 9 f o r n i n range ( 3 ) : 10 p r i n t ( n ) 11 # 0 12 # 1 13 # # For items from any i t e r a t o r 16 f o r n i n i t e r a t o r : 17 p r i n t ( n ) March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 53

54 Introduction to Python for Biologists Loops Enumerate 1 # loop g e t t i n g index and value 2 RNAs = [ mirna, trna, mrna ] 3 f o r i, rna i n enumerate (RNAs) : 4 p r i n t ( i, rna ) 5 # 0 mirna 6 # 1 trna 7 # 2 mrna 8 9 # loop over 2 l i s t s 10 RNAtypes = [ micro, t r a n s f e r, messenger ] 11 f o r i, t i n enumerate ( RNAtypes ) : 12 r = RNAs[ i ] 13 p r i n t ( i, t, r ) 14 # 0 micro mirna 15 # 1 t r a n s f e r trna 16 # 2 messenger mrna March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 54

55 Introduction to Python for Biologists Loops While loop 1 i =0 2 value=1 3 while value <200: 4 i +=1 5 value = i 6 p r i n t ( i, value ) 7 # # # # # # March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 55

56 Introduction to Python for Biologists Introduction Running code Literals and variables Numeric types Strings Exercise Lists, tuples and ranges Sets and dictionaries Convert and copy Loops Functions Branching Regular Expressions Annexes March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 56

57 Introduction to Python for Biologists Exercise URL Jupyter Notebook File: Sequences.ipynb Download the file into the notebooks folder Data file File: shrub dimensions.csv Download the file into the data folder March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 57

58 Introduction to Python for Biologists Functions Introduction Running code Literals and variables Numeric types Strings Exercise Lists, tuples and ranges Sets and dictionaries Convert and copy Loops Functions Branching Regular Expressions Annexes March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 58

59 Introduction to Python for Biologists Functions Functions I 1 from random import choice # import f u n c t i o n choice 2 3 # Simple f u n c t i o n 4 def kmerfixed ( ) : # d e f i n e f u n c t i o n kmerfixed 5 p r i n t ( ACGTAGACGC ) # p r i n t predefined s t r i n g 6 7 kmerfixed ( ) # d i s p l a y ACGTAGACGC 8 9 # Returning a value 10 def kmer10 ( ) : # d e f i n e f u n c t i o n kmer10 11 seq= # d e f i n e an empty s t r i n g 12 f o r count i n range ( 10) : # repeat 10 times 13 seq += choice ( CGTA ) # add 1 random nt to s t r i n g 14 r e t u r n ( seq ) # r e t u r n s t r i n g newkmer = kmer10 ( ) # get r e s u l t of f u n c t i o n i n t o v a r i a b l e 17 p r i n t ( newkmer ) # c a l l the f u n c t i o n e. g. ACGGATACGC March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 59

60 Introduction to Python for Biologists Functions Functions II 1 # One parameter 2 def kmer ( k ) : # defin e kmer with 1 param. k 3 seq= 4 f o r count i n range ( k ) : # k i s used to d e f i n e the range 5 seq+=choice ( CGTA ) 6 r e t u r n ( seq ) 7 8 p r i n t ( kmer ( k =4) ) # e. g. TACC 9 p r i n t ( kmer (20) ) # e. g. CACAATGGGTACCCCGGACC 10 p r i n t ( kmer ( 0 ) ) # 11 p r i n t ( kmer ( ) ) # TypeError : kmer ( ) missing 1 r e q u i r e d 12 # p o s i t i o n a l argument : k March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 60

61 Introduction to Python for Biologists Functions Functions III 1 # Parameters with more parameters and d e f a u l t values 2 def generic kmer ( alphabet= ACGT, k=10) : 3 seq= 4 f o r count i n range ( k ) : 5 seq+=choice ( alphabet ) 6 r e t u r n ( seq ) 7 8 generic kmer ( AB12, 15) # e. g. 112AA1A12AA generic kmer ( AB12 ) # e. g. 1AA1B1BA2A 10 generic kmer ( k=20) # e. g. GTGGGCTTGTGCCCTGCACT 11 generic kmer ( ) # e. g. CTTGCCGGGA 12 generic kmer ( k =8, alphabet= #$%& ) # e. g. $$#&%$%$ March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 61

62 Introduction to Python for Biologists Functions Name spaces I Variable and function names defined globally can be seen in functions: this is the global namespace 1 a = 10 # g l o b a l v a r i a b l e 2 3 def my function ( ) : 4 p r i n t ( a ) # w i l l use the g l o b a l v a r i a b l e 5 6 my function ( ) # 10 ( the g l o b a l a ) 7 p r i n t ( a ) # 10 ( the g l o b a l a ) March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 62

63 Introduction to Python for Biologists Functions Name spaces II Names defined within a function can not be seen outside: the function has its own namespace. 1 a = 10 # g l o b a l v a r i a b l e 2 3 def my function ( ) : 4 a = 1 # l o c a l v a r i a b l e defined by assignment 5 b = 2 # l o c a l v a r i a b l e defined by assignment 6 p r i n t ( a ) 7 8 my function ( ) # 1 ( the l o c a l a ) 9 p r i n t ( a ) # 10 ( the g l o b a l a ) 10 p r i n t ( b ) # NameError : name b i s not defined March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 63

64 Introduction to Python for Biologists Functions Name spaces III Use parameters and returned values to get and set variables outside the name space 1 a = 10 # g l o b a l v a r i a b l e 2 3 def my function ( v a l ) : # l o c a l v a r i a b l e v a l 4 b = 2 5 v a l = v a l + b 6 r e t u r n ( v a l ) 7 p r i n t ( a ) # 10 ( the g l o b a l a ) 8 p r i n t ( my function ( a ) ) # 12 9 p r i n t ( a ) # 10 ( the g l o b a l a unchanged ) c = my function ( a ) # set v a l to 10 and assign 10+2 to c 12 p r i n t ( c ) # 12 ( g l o b a l a was changed ) 13 p r i n t ( a ) # 10 ( g l o b a l a was unchanged ) a = my function ( a ) # change g l o b a l a w ith value 10+2 March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 64

65 Introduction to Python for Biologists Branching Introduction Running code Literals and variables Numeric types Strings Exercise Lists, tuples and ranges Sets and dictionaries Convert and copy Loops Functions Branching Regular Expressions Annexes March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 65

66 Introduction to Python for Biologists Branching Truth Value Testing I Any object can be tested for truth value. The following values are considered false (other values are considered True): None False zero value: e.g. 0 or 0.0 an empty sequence or mapping: e.g., (), [ ], { }. Operations and built-in functions that have a Boolean result always return 0 for False and 1 for True March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 66

67 Introduction to Python for Biologists Branching Boolean Operations I A Boolean is equal to True or False a and b (true if a and b are true, false otherwise) a or b (true if a or b is true (1 alone or both), false otherwise) a ˆ b (true if either a or b is true (not both), false otherwise) not b (true if b is false, false otherwise) March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 67

68 Introduction to Python for Biologists Branching Boolean Operations II All example code for tests below return True unless otherwise specified 1 # l e t set values of 3 v a r i a b l e s ( s i n g l e = symbol ) 2 a = True 3 b = False 4 c = True # simple t e s t s using two = symbols (==) 8 a == True 9 b == False 10 c == True March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 68

69 Introduction to Python for Biologists Branching Boolean Operations III 1 # l e t set values of 3 v a r i a b l e s ( one = symbol ) 2 a = True 3 b = False 4 c = True 5 6 # order i s i r r e l e v a n t 7 ( a or b ) == ( b or a ) 8 ( a and b ) == ( b and a ) 9 10 # n e u t r a l ( whatever value of a ) 11 ( a or False ) == a 12 ( a and True ) == a # always the same ( whatever value of a ) 15 ( a and False ) == False 16 ( a or True ) == True March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 69

70 Introduction to Python for Biologists Branching Boolean Operations IV 1 # l e t set values of 3 v a r i a b l e s ( one = symbol ) 2 a = True 3 b = False 4 c = True 5 6 # precedence == > not > and > or 7 ( a and b or c ) == ( ( a and b ) or c ) 8 ( not a == b ) == ( not ( a == b ) ) 9 10 # e q u i v a l e n t expressions 11 ( ( a or b ) or c ) == ( a or ( b or c ) ) == ( a or b or c ) 12 ( a or a or a ) == a 13 ( b and b and b ) == b b and b and b == b # False and False and True => False!! a and ( b or c ) == ( a and b ) or ( a and c ) 18 a or ( b and c ) == ( a or b ) and ( a or c ) March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 70

71 Introduction to Python for Biologists Branching Comparisons 1 Operations 2 < # s t r i c t l y l ess than 3 <= # less than or equal 4 > # s t r i c t l y g r e a t e r than 5 >= # g r e a t e r than or equal 6 == # equal ( two symbols =) 7 math. i s c l o s e ( a, b ) # equal f o r f l o a t i n g p o i n t s a and b 8! = # not equal 9 i s # o b j e c t i d e n t i t y 10 i s not # negated o b j e c t i d e n t i t y 11 x < y <= z # i s e q u i v a l e n t to x < y and y <= z Comparisons between objects of same class are supported if operator defined for the class. Different numerical types can be compared: e.g. 2<4.56 Floating points can not be compared exactly due to the limited precision to represent infinite numbers such as 1/3 = March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 71

72 Introduction to Python for Biologists Branching Conditionals IF-ELIF-ELSE 1 seq = ATGAnnATG 2 i f n i n seq : 3 p r i n t ( sequence contains undefined bases ( n ) ) 4 e l i f x i n seq : 5 p r i n t ( sequence contains unknown bases x but not n ) 6 else : 7 p r i n t ( no undefined bases i n sequence ) 8 9 # 10 # sequence contains undefined bases ELIF and ELSE are optional multiple ELIF are possible March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 72

73 Introduction to Python for Biologists Introduction Running code Literals and variables Numeric types Strings Exercise Lists, tuples and ranges Sets and dictionaries Convert and copy Loops Functions Branching Regular Expressions Annexes March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 73

74 Introduction to Python for Biologists Exercise URL Jupyter Notebook File: Conditionals.ipynb Download the file into the notebooks folder March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 74

75 Introduction to Python for Biologists Regular Expressions Introduction Running code Literals and variables Numeric types Strings Exercise Lists, tuples and ranges Sets and dictionaries Convert and copy Loops Functions Branching Regular Expressions Annexes March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 75

76 Introduction to Python for Biologists Regular Expressions RE: Regular Expressions I Regular expressions (called REs, or regexes, or regex patterns) are a powerful language for matching text patterns (re module) In Python a regular expression search is typically written as: 1 match = re. search ( expression, s t r i n g ) The re.search() method takes a regular expression pattern and a string and searches for that pattern within the string. If the search is successful, re.search() returns a Match object (actually class sre.sre Match ) or None otherwise. March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 76

77 Introduction to Python for Biologists Regular Expressions RE: Regular Expressions II 1 import re # import re module 2 s t r = an example word : cat!! # Example s t r i n g 3 match = re. search ( r word : \w\w\w, s t r ) # Search a p a t t e r n 4 i f match : 5 p r i n t ( found, match. group ( ) ) # found word : cat 6 else : 7 p r i n t ( did not f i n d ) In the pattern string, \w codes a character (letter, digit or underscore) The r at the start of the pattern string designates a python raw string which passes through backslashes without change. March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 77

78 Introduction to Python for Biologists Regular Expressions RE: Basic Patterns Pattern Match a, X, 9, < ordinary characters match themselves exactly. a period matches any single character except newline \w matches a word character: a letter or digit or underbar [a-za-z0-9 ] \W matches any non-word character \b boundary between word and non-word \s a single whitespace character space, newline, return, tab, form [\n \r \t \f] \S matches any non-whitespace character \t tab \n newline \r return \d decimal digit [0-9] ˆ circumflex (top hat) matches the start of a string $ dollar matches the end of a string \ inhibits the specialness of a character. So, for example, use \. to match a period Table: Regular expressions: basic patterns March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 78

79 Introduction to Python for Biologists Regular Expressions RE: Basic examples I The basic rules of RE search for a pattern within a string are: The search proceeds through the string from start to end, stopping at the first match found All of the pattern must be matched, but not all of the string If match = re.search(pat, str) is successful, match is not None and in particular match.group() is the matching text March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 79

80 Introduction to Python for Biologists Regular Expressions RE: Basic examples II 1 match = re. search ( r i i i, p i i i g ) # found 2 match. group ( ) == i i i # True 3 4 match = re. search ( r i g s, p i i i g ) # not found 5 match == None # True 6 7 match = re. search ( r.. g, p i i i g ) # found 8 match. group ( ) == i i g # True 9 10 match = re. search ( r \d\d\d, p123g ) # found 11 match. group ( ) == 123 # True match = re. search ( r ) # found 14 match. group ( ) == abc # True March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 80

81 Introduction to Python for Biologists Regular Expressions RE: Repetitions I Repetitions are defined using +, *,? and { } + means 1 or more occurrences of the pattern to its left e.g. i+ = one or more i s * means 0 or more occurrences of the pattern to its left? means match 0 or 1 occurrences of the pattern to its left curly brackets are used to specify exact number of repetitions e.g. A{5} for 5 A letters A{6,10} for 6 to 10 A letters Leftmost and Largest: First the search finds the leftmost match for the pattern, and second it tries to use up as much of the string as possible i.e. + and * go as far as possible (they are said to be greedy ). March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 81

82 Introduction to Python for Biologists Regular Expressions RE: Repetitions II 1 # simple r e p e t i t i o n s 2 re. search ( r p i +, p i i i g ). group ( ) # p i i i 3 re. search ( r p i?, ap ). group ( ) # p 4 re. search ( r p i?, a p i i ). group ( ) # p i 5 re. search ( r p i, ap ). group ( ) # p 6 re. search ( r p i, a p i i ). group ( ) # p i i 7 re. search ( r p i {3}, a p i i i i i ). group ( ) # p i i i 8 re. search ( r i +, p i i g i i i i ). group ( ) # i i (1 s t h i t only ) 9 10 # 3 d i g i t s p o s s i b l y separated by whitespaces (\ s ) 11 re. search ( r \d\s \d\s \d, xx1 2 3xx ). group ( ) # re. search ( r \d\s \d\s \d, xx12 3xx ). group ( ) # re. search ( r \d\s \d\s \d, xx123xx ). group ( ) # 123 March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 82

83 Introduction to Python for Biologists Regular Expressions RE: Sets of characters I Square brackets indicate a set of characters [ABC] matches A or B or C. The codes \w, \s etc. work inside square brackets too with the one exception that dot (.) just means a literal dot Dash indicate a range or itself if put at the end [a-z] for lowercase alphabetic characters [a-za-z] for alphabetic characters [AB-] for A, B or dash Circumflex (ˆ) at the start inverts the set [ˆAB] for any character except A or B. March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 83

84 Introduction to Python for Biologists Regular Expressions RE: Sets of characters II 1 s t r = purple a l i c e b@google. com monkey dishwasher 2 match = re. search ( r \w+@\w+, s t r ) 3 i f match : 4 p r i n t match. group ( ) ## b@google 5 6 match = re. search ( r [ \w. ]+@[ \w. ]+, s t r ) 7 i f match : 8 p r i n t match. group ( ) ## a l i c e b@google. com March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 84

85 Introduction to Python for Biologists Regular Expressions RE: Functions I RE module functions: re.match() returns a Match object if occurrence found at begining of string, None otherwise re.search() returns a Match object for 1st occurrence, None if not found re.findall() returns a list of matched sub strings, an empty list if not found re.finditer() returns an iterator on Match objects of the occurrences, an empty iterator if not found Match object methods: match.start() returns start index match.end() returns end index match.span() returns start and end index in a tuple match.group() returns matched string March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 85

86 Introduction to Python for Biologists Regular Expressions RE: Functions II 1 import re 2 seq = RPAPPDRAPDQX # A sequence 3 expr = A.{ 1, 2}D # A and D separated by 1 or 2 characters 4 5 match = re. search ( expr, seq ) 6 i f match : 7 p r i n t ( 8 match. s t a r t ( ), # s t a r t index 9 match. end ( ), # end index 10 match. span ( ), # s t a r t and end index 11 match. group ( ), # the matched s t r i n g 12 seq [ match. s t a r t ( ) : match. end ( ) ], # the matched s t r i n g 13 sep= 14 ) 15 # 2 6 ( 2, 6) APPD APPD March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 86

87 Introduction to Python for Biologists Regular Expressions RE: Functions III 1 import re 2 seq = RPAPPDRAPDQX # A sequence 3 expr = A.{ 1, 2}D # A and D separated by 1 or 2 characters 4 5 match = re. match ( expr, seq ) # Not found at begining 6 p r i n t ( match ) 7 # None 8 9 matches = re. f i n d a l l ( expr, seq ) # Found 2 occurrences 10 p r i n t ( matches ) 11 # [ APPD, APD ] matches = re. f i n d i t e r ( expr, seq ) # Found 2 occurrences 14 f o r m i n matches : # I t e r a t e over Match o b j e c t s 15 p r i n t ( m. span ( ), m. group ( ) ) # Use each Match o b j e c t 16 # ( 2, 6) APPD 17 # ( 7, 10) APD March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 87

88 Introduction to Python for Biologists Regular Expressions RE: Group Extraction Groups are defined with parentheses On a successful search match.group(): the whole match text match.group(1): match text of 1st left parenthesis match.group(2): match text of 2nd left parenthesis... 1 import re 2 s t r = purple a l i c e b@google. com monkey dishwasher 3 match = re. search ( ( [ \w. ]+)@( [ \w. ]+), s t r ) 4 i f match : 5 p r i n t ( match. group ( ) ) ## a l i c e b@google. com 6 p r i n t ( match. group ( 1 ) ) ## a l i c e b 7 p r i n t ( match. group ( 2 ) ) ## google. com March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 88

89 Introduction to Python for Biologists Regular Expressions RE: Group Extraction and Findall If the pattern includes a single set of parenthesis, then findall() returns a list of strings corresponding to that single group If the pattern includes 2 or more parenthesis groups, then instead of returning a list of strings, findall() returns a list of tuples. Each tuple represents one match of the pattern, and inside the tuple is the group(1), group(2)... data. 1 s t r = alice@google. com, monkey bob@abc. com dishwasher 2 t u p l e s = re. f i n d a l l ( r ( [ \w\. ]+)@( [ \w\. ]+), s t r ) 3 p r i n t ( t u p l e s ) 4 # [ ( a l i c e, google. com ), ( bob, abc. com ) ] 5 6 f o r t i n t u p l e s : 7 p r i n t ( t [ 0 ], t [ 1 ], sep= ) 8 # a l i c e google. com 9 # bob abc. com March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 89

Notater: INF3331. Veronika Heimsbakk December 4, Introduction 3

Notater: INF3331. Veronika Heimsbakk December 4, Introduction 3 Notater: INF3331 Veronika Heimsbakk veronahe@student.matnat.uio.no December 4, 2013 Contents 1 Introduction 3 2 Bash 3 2.1 Variables.............................. 3 2.2 Loops...............................

More information

Python. chrysn

Python. chrysn Python chrysn 2008-09-25 Introduction Structure, Language & Syntax Strengths & Weaknesses Introduction Structure, Language & Syntax Strengths & Weaknesses Python Python is an interpreted,

More information

Contents. 6.1 Conditional statements 6.2 Iteration 6.3 Procedures and flow control

Contents. 6.1 Conditional statements 6.2 Iteration 6.3 Procedures and flow control Introduction This reference manual gives an essentially complete description of the standard facilities available in SMP. Extensive illustrative examples are included. The same text is given without these

More information

Introduction to Python

Introduction to Python Introduction to Python Luis Pedro Coelho Institute for Molecular Medicine (Lisbon) Lisbon Machine Learning School II Luis Pedro Coelho (IMM) Introduction to Python Lisbon Machine Learning School II (1

More information

Regular Expressions and Finite-State Automata. L545 Spring 2008

Regular Expressions and Finite-State Automata. L545 Spring 2008 Regular Expressions and Finite-State Automata L545 Spring 2008 Overview Finite-state technology is: Fast and efficient Useful for a variety of language tasks Three main topics we ll discuss: Regular Expressions

More information

L435/L555. Dept. of Linguistics, Indiana University Fall 2016

L435/L555. Dept. of Linguistics, Indiana University Fall 2016 in in L435/L555 Dept. of Linguistics, Indiana University Fall 2016 1 / 13 in we know how to output something on the screen: print( Hello world. ) input: input() returns the input from the keyboard

More information

1 Opening URLs. 2 Regular Expressions. 3 Look Back. 4 Graph Theory. 5 Crawler / Spider

1 Opening URLs. 2 Regular Expressions. 3 Look Back. 4 Graph Theory. 5 Crawler / Spider 1 Opening URLs 2 Regular Expressions 3 Look Back 4 Graph Theory 5 Crawler / Spider Sandeep Sadanandan (TU, Munich) Python For Fine Programmers June 5, 2009 1 / 14 Opening URLs The module used for opening

More information

Introduction to Python

Introduction to Python Introduction to Python Luis Pedro Coelho luis@luispedro.org @luispedrocoelho European Molecular Biology Laboratory Lisbon Machine Learning School 2015 Luis Pedro Coelho (@luispedrocoelho) Introduction

More information

Python & Numpy A tutorial

Python & Numpy A tutorial Python & Numpy A tutorial Devert Alexandre School of Software Engineering of USTC 13 February 2012 Slide 1/38 Table of Contents 1 Why Python & Numpy 2 First steps with Python 3 Fun with lists 4 Quick tour

More information

Python. Tutorial. Jan Pöschko. March 22, Graz University of Technology

Python. Tutorial. Jan Pöschko. March 22, Graz University of Technology Tutorial Graz University of Technology March 22, 2010 Why? is: very readable easy to learn interpreted & interactive like a UNIX shell, only better object-oriented but not religious about it slower than

More information

COMP 204. Exceptions continued. Yue Li based on material from Mathieu Blanchette, Carlos Oliver Gonzalez and Christopher Cameron

COMP 204. Exceptions continued. Yue Li based on material from Mathieu Blanchette, Carlos Oliver Gonzalez and Christopher Cameron COMP 204 Exceptions continued Yue Li based on material from Mathieu Blanchette, Carlos Oliver Gonzalez and Christopher Cameron 1 / 27 Types of bugs 1. Syntax errors 2. Exceptions (runtime) 3. Logical errors

More information

Functions in Python L435/L555. Dept. of Linguistics, Indiana University Fall Functions in Python. Functions.

Functions in Python L435/L555. Dept. of Linguistics, Indiana University Fall Functions in Python. Functions. L435/L555 Dept. of Linguistics, Indiana University Fall 2016 1 / 18 What is a function? Definition A function is something you can call (possibly with some parameters, i.e., the things in parentheses),

More information

Introduction to Python and its unit testing framework

Introduction to Python and its unit testing framework Introduction to Python and its unit testing framework Instructions These are self evaluation exercises. If you know no python then work through these exercises and it will prepare yourself for the lab.

More information

Introduction to Language Theory and Compilation: Exercises. Session 2: Regular expressions

Introduction to Language Theory and Compilation: Exercises. Session 2: Regular expressions Introduction to Language Theory and Compilation: Exercises Session 2: Regular expressions Regular expressions (RE) Finite automata are an equivalent formalism to regular languages (for each regular language,

More information

C++ For Science and Engineering Lecture 13

C++ For Science and Engineering Lecture 13 C++ For Science and Engineering Lecture 13 John Chrispell Tulane University Wednesday September 22, 2010 Logical Expressions: Sometimes you want to use logical and and logical or (even logical not) in

More information

10. Finite Automata and Turing Machines

10. Finite Automata and Turing Machines . Finite Automata and Turing Machines Frank Stephan March 2, 24 Introduction Alan Turing s th Birthday Alan Turing was a completely original thinker who shaped the modern world, but many people have never

More information

2 Getting Started with Numerical Computations in Python

2 Getting Started with Numerical Computations in Python 1 Documentation and Resources * Download: o Requirements: Python, IPython, Numpy, Scipy, Matplotlib o Windows: google "windows download (Python,IPython,Numpy,Scipy,Matplotlib" o Debian based: sudo apt-get

More information

MA/CSSE 474 Theory of Computation

MA/CSSE 474 Theory of Computation MA/CSSE 474 Theory of Computation Closure properties of Regular Languages Pumping Theorem Your Questions? Previous class days' material Reading Assignments HW 6 or 7 problems Anything else 1 Regular Expressions

More information

MiniMat: Matrix Language in OCaml LLVM

MiniMat: Matrix Language in OCaml LLVM Terence Lim - tl2735@columbia.edu August 17, 2016 Contents 1 Introduction 4 1.1 Goals........................................ 4 1.1.1 Flexible matrix notation......................... 4 1.1.2 Uncluttered................................

More information

Automata and Computability. Solutions to Exercises

Automata and Computability. Solutions to Exercises Automata and Computability Solutions to Exercises Fall 28 Alexis Maciel Department of Computer Science Clarkson University Copyright c 28 Alexis Maciel ii Contents Preface vii Introduction 2 Finite Automata

More information

Automata and Computability. Solutions to Exercises

Automata and Computability. Solutions to Exercises Automata and Computability Solutions to Exercises Spring 27 Alexis Maciel Department of Computer Science Clarkson University Copyright c 27 Alexis Maciel ii Contents Preface vii Introduction 2 Finite Automata

More information

COMP 204. Object Oriented Programming (OOP) Mathieu Blanchette

COMP 204. Object Oriented Programming (OOP) Mathieu Blanchette COMP 204 Object Oriented Programming (OOP) Mathieu Blanchette 1 / 14 Object-Oriented Programming OOP is a way to write and structure programs to make them easier to design, understand, debug, and maintain.

More information

Sets are one of the basic building blocks for the types of objects considered in discrete mathematics.

Sets are one of the basic building blocks for the types of objects considered in discrete mathematics. Section 2.1 Introduction Sets are one of the basic building blocks for the types of objects considered in discrete mathematics. Important for counting. Programming languages have set operations. Set theory

More information

1 Alphabets and Languages

1 Alphabets and Languages 1 Alphabets and Languages Look at handout 1 (inference rules for sets) and use the rules on some examples like {a} {{a}} {a} {a, b}, {a} {{a}}, {a} {{a}}, {a} {a, b}, a {{a}}, a {a, b}, a {{a}}, a {a,

More information

Comp 11 Lectures. Dr. Mike Shah. August 9, Tufts University. Dr. Mike Shah (Tufts University) Comp 11 Lectures August 9, / 34

Comp 11 Lectures. Dr. Mike Shah. August 9, Tufts University. Dr. Mike Shah (Tufts University) Comp 11 Lectures August 9, / 34 Comp 11 Lectures Dr. Mike Shah Tufts University August 9, 2017 Dr. Mike Shah (Tufts University) Comp 11 Lectures August 9, 2017 1 / 34 Please do not distribute or host these slides without prior permission.

More information

Foundations of

Foundations of 91.304 Foundations of (Theoretical) Computer Science Chapter 1 Lecture Notes (Section 1.3: Regular Expressions) David Martin dm@cs.uml.edu d with some modifications by Prof. Karen Daniels, Spring 2012

More information

Context Free Grammars

Context Free Grammars Automata and Formal Languages Context Free Grammars Sipser pages 101-111 Lecture 11 Tim Sheard 1 Formal Languages 1. Context free languages provide a convenient notation for recursive description of languages.

More information

Outline. policies for the second part. with answers. MCS 260 Lecture 10.5 Introduction to Computer Science Jan Verschelde, 9 July 2014

Outline. policies for the second part. with answers. MCS 260 Lecture 10.5 Introduction to Computer Science Jan Verschelde, 9 July 2014 Outline 1 midterm exam on Friday 11 July 2014 policies for the second part 2 some review questions with answers MCS 260 Lecture 10.5 Introduction to Computer Science Jan Verschelde, 9 July 2014 Intro to

More information

Data Intensive Computing Handout 11 Spark: Transitive Closure

Data Intensive Computing Handout 11 Spark: Transitive Closure Data Intensive Computing Handout 11 Spark: Transitive Closure from random import Random spark/transitive_closure.py num edges = 3000 num vertices = 500 rand = Random(42) def generate g raph ( ) : edges

More information

UNIT II REGULAR LANGUAGES

UNIT II REGULAR LANGUAGES 1 UNIT II REGULAR LANGUAGES Introduction: A regular expression is a way of describing a regular language. The various operations are closure, union and concatenation. We can also find the equivalent regular

More information

Discrete Mathematical Structures: Theory and Applications

Discrete Mathematical Structures: Theory and Applications Chapter 1: Foundations: Sets, Logic, and Algorithms Discrete Mathematical Structures: Theory and Applications Learning Objectives Learn about sets Explore various operations on sets Become familiar with

More information

Computation Theory Finite Automata

Computation Theory Finite Automata Computation Theory Dept. of Computing ITT Dublin October 14, 2010 Computation Theory I 1 We would like a model that captures the general nature of computation Consider two simple problems: 2 Design a program

More information

CS177 Spring Final exam. Fri 05/08 7:00p - 9:00p. There are 40 multiple choice questions. Each one is worth 2.5 points.

CS177 Spring Final exam. Fri 05/08 7:00p - 9:00p. There are 40 multiple choice questions. Each one is worth 2.5 points. CS177 Spring 2015 Final exam Fri 05/08 7:00p - 9:00p There are 40 multiple choice questions. Each one is worth 2.5 points. Answer the questions on the bubble sheet given to you. Only the answers on the

More information

CMSC 330: Organization of Programming Languages. Pushdown Automata Parsing

CMSC 330: Organization of Programming Languages. Pushdown Automata Parsing CMSC 330: Organization of Programming Languages Pushdown Automata Parsing Chomsky Hierarchy Categorization of various languages and grammars Each is strictly more restrictive than the previous First described

More information

9A. Iteration with range. Topics: Using for with range Summation Computing Min s Functions and for-loops A Graphics Applications

9A. Iteration with range. Topics: Using for with range Summation Computing Min s Functions and for-loops A Graphics Applications 9A. Iteration with range Topics: Using for with range Summation Computing Min s Functions and for-loops A Graphics Applications Iterating Through a String s = abcd for c in s: print c Output: a b c d In

More information

Read all questions and answers carefully! Do not make any assumptions about the code other than those that are clearly stated.

Read all questions and answers carefully! Do not make any assumptions about the code other than those that are clearly stated. Read all questions and answers carefully! Do not make any assumptions about the code other than those that are clearly stated. CS177 Fall 2014 Final - Page 2 of 38 December 17th, 10:30am 1. Consider we

More information

8.2 Examples of Classes

8.2 Examples of Classes 206 8.2 Examples of Classes Don t be put off by the terminology of classes we presented in the preceding section. Designing classes that appropriately break down complex data is a skill that takes practice

More information

What Is a Language? Grammars, Languages, and Machines. Strings: the Building Blocks of Languages

What Is a Language? Grammars, Languages, and Machines. Strings: the Building Blocks of Languages Do Homework 2. What Is a Language? Grammars, Languages, and Machines L Language Grammar Accepts Machine Strings: the Building Blocks of Languages An alphabet is a finite set of symbols: English alphabet:

More information

Multiplying Products of Prime Powers

Multiplying Products of Prime Powers Problem 1: Multiplying Products of Prime Powers Each positive integer can be expressed (in a unique way, according to the Fundamental Theorem of Arithmetic) as a product of powers of the prime numbers.

More information

CS177 Fall Midterm 1. Wed 10/07 6:30p - 7:30p. There are 25 multiple choice questions. Each one is worth 4 points.

CS177 Fall Midterm 1. Wed 10/07 6:30p - 7:30p. There are 25 multiple choice questions. Each one is worth 4 points. CS177 Fall 2015 Midterm 1 Wed 10/07 6:30p - 7:30p There are 25 multiple choice questions. Each one is worth 4 points. Answer the questions on the bubble sheet given to you. Only the answers on the bubble

More information

Scripting Languages Fast development, extensible programs

Scripting Languages Fast development, extensible programs Scripting Languages Fast development, extensible programs Devert Alexandre School of Software Engineering of USTC November 30, 2012 Slide 1/60 Table of Contents 1 Introduction 2 Dynamic languages A Python

More information

How do regular expressions work? CMSC 330: Organization of Programming Languages

How do regular expressions work? CMSC 330: Organization of Programming Languages How do regular expressions work? CMSC 330: Organization of Programming Languages Regular Expressions and Finite Automata What we ve learned What regular expressions are What they can express, and cannot

More information

IV. Turing Machine. Yuxi Fu. BASICS, Shanghai Jiao Tong University

IV. Turing Machine. Yuxi Fu. BASICS, Shanghai Jiao Tong University IV. Turing Machine Yuxi Fu BASICS, Shanghai Jiao Tong University Alan Turing Alan Turing (23Jun.1912-7Jun.1954), an English student of Church, introduced a machine model for effective calculation in On

More information

Using the EartH2Observe data portal to analyse drought indicators. Lesson 4: Using Python Notebook to access and process data

Using the EartH2Observe data portal to analyse drought indicators. Lesson 4: Using Python Notebook to access and process data Using the EartH2Observe data portal to analyse drought indicators Lesson 4: Using Python Notebook to access and process data Preface In this fourth lesson you will again work with the Water Cycle Integrator

More information

6.001 Recitation 22: Streams

6.001 Recitation 22: Streams 6.001 Recitation 22: Streams RI: Gerald Dalley, dalleyg@mit.edu, 4 May 2007 http://people.csail.mit.edu/dalleyg/6.001/sp2007/ The three chief virtues of a programmer are: Laziness, Impatience and Hubris

More information

n CS 160 or CS122 n Sets and Functions n Propositions and Predicates n Inference Rules n Proof Techniques n Program Verification n CS 161

n CS 160 or CS122 n Sets and Functions n Propositions and Predicates n Inference Rules n Proof Techniques n Program Verification n CS 161 Discrete Math at CSU (Rosen book) Sets and Functions (Rosen, Sections 2.1,2.2, 2.3) TOPICS Discrete math Set Definition Set Operations Tuples 1 n CS 160 or CS122 n Sets and Functions n Propositions and

More information

Plan for Today and Beginning Next week (Lexical Analysis)

Plan for Today and Beginning Next week (Lexical Analysis) Plan for Today and Beginning Next week (Lexical Analysis) Regular Expressions Finite State Machines DFAs: Deterministic Finite Automata Complications NFAs: Non Deterministic Finite State Automata From

More information

CSE 211. Pushdown Automata. CSE 211 (Theory of Computation) Atif Hasan Rahman

CSE 211. Pushdown Automata. CSE 211 (Theory of Computation) Atif Hasan Rahman CSE 211 Pushdown Automata CSE 211 (Theory of Computation) Atif Hasan Rahman Lecturer Department of Computer Science and Engineering Bangladesh University of Engineering & Technology Adapted from slides

More information

Lab 2 Worksheet. Problems. Problem 1: Geometry and Linear Equations

Lab 2 Worksheet. Problems. Problem 1: Geometry and Linear Equations Lab 2 Worksheet Problems Problem : Geometry and Linear Equations Linear algebra is, first and foremost, the study of systems of linear equations. You are going to encounter linear systems frequently in

More information

Introduction. How to use this book. Linear algebra. Mathematica. Mathematica cells

Introduction. How to use this book. Linear algebra. Mathematica. Mathematica cells Introduction How to use this book This guide is meant as a standard reference to definitions, examples, and Mathematica techniques for linear algebra. Complementary material can be found in the Help sections

More information

1 Definition of a Turing machine

1 Definition of a Turing machine Introduction to Algorithms Notes on Turing Machines CS 4820, Spring 2017 April 10 24, 2017 1 Definition of a Turing machine Turing machines are an abstract model of computation. They provide a precise,

More information

FIT100 Spring 01. Project 2. Astrological Toys

FIT100 Spring 01. Project 2. Astrological Toys FIT100 Spring 01 Project 2 Astrological Toys In this project you will write a series of Windows applications that look up and display astrological signs and dates. The applications that will make up the

More information

LAST TIME..THE COMMAND LINE

LAST TIME..THE COMMAND LINE LAST TIME..THE COMMAND LINE Maybe slightly to fast.. Finder Applications Utilities Terminal The color can be adjusted and is simply a matter of taste. Per default it is black! Dr. Robert Kofler Introduction

More information

Lexical Analysis Part II: Constructing a Scanner from Regular Expressions

Lexical Analysis Part II: Constructing a Scanner from Regular Expressions Lexical Analysis Part II: Constructing a Scanner from Regular Expressions CS434 Spring 2005 Department of Computer Science University of Alabama Joel Jones Copyright 2003, Keith D. Cooper, Ken Kennedy

More information

Part I: Definitions and Properties

Part I: Definitions and Properties Turing Machines Part I: Definitions and Properties Finite State Automata Deterministic Automata (DFSA) M = {Q, Σ, δ, q 0, F} -- Σ = Symbols -- Q = States -- q 0 = Initial State -- F = Accepting States

More information

8. More on Iteration with For

8. More on Iteration with For 8. More on Iteration with For Topics: Using for with range Summation Computing Min s Functions and for-loops Graphical iteration Traversing a String Character-by-Character s = abcd for c in s: print c

More information

Math Review. for the Quantitative Reasoning measure of the GRE General Test

Math Review. for the Quantitative Reasoning measure of the GRE General Test Math Review for the Quantitative Reasoning measure of the GRE General Test www.ets.org Overview This Math Review will familiarize you with the mathematical skills and concepts that are important for solving

More information

CSE 105 THEORY OF COMPUTATION

CSE 105 THEORY OF COMPUTATION CSE 105 THEORY OF COMPUTATION Fall 2016 http://cseweb.ucsd.edu/classes/fa16/cse105-abc/ Today's learning goals Sipser Ch 3 Trace the computation of a Turing machine using its transition function and configurations.

More information

MA/CSSE 474 Theory of Computation

MA/CSSE 474 Theory of Computation MA/CSSE 474 Theory of Computation CFL Hierarchy CFL Decision Problems Your Questions? Previous class days' material Reading Assignments HW 12 or 13 problems Anything else I have included some slides online

More information

A High Level Programming Language for Quantum Computing

A High Level Programming Language for Quantum Computing QUARK QUantum Analysis and Realization Kit A High Level Programming Language for Quantum Computing Team In lexicographical order: Name UNI Role Daria Jung djj2115 Verification and Validation Jamis Johnson

More information

Introduction to Metalogic

Introduction to Metalogic Philosophy 135 Spring 2008 Tony Martin Introduction to Metalogic 1 The semantics of sentential logic. The language L of sentential logic. Symbols of L: Remarks: (i) sentence letters p 0, p 1, p 2,... (ii)

More information

Assignment 1: Due Friday Feb 6 at 6pm

Assignment 1: Due Friday Feb 6 at 6pm CS1110 Spring 2015 Assignment 1: Due Friday Feb 6 at 6pm You must work either on your own or with one partner. If you work with a partner, you and your partner must first register as a group in CMS and

More information

Solutions to Exercises Chapter 2: On numbers and counting

Solutions to Exercises Chapter 2: On numbers and counting Solutions to Exercises Chapter 2: On numbers and counting 1 Criticise the following proof that 1 is the largest natural number. Let n be the largest natural number, and suppose than n 1. Then n > 1, and

More information

10. The GNFA method is used to show that

10. The GNFA method is used to show that CSE 355 Midterm Examination 27 February 27 Last Name Sample ASU ID First Name(s) Ima Exam # Sample Regrading of Midterms If you believe that your grade has not been recorded correctly, return the entire

More information

MCS 260 Exam 2 13 November In order to get full credit, you need to show your work.

MCS 260 Exam 2 13 November In order to get full credit, you need to show your work. MCS 260 Exam 2 13 November 2015 Name: Do not start until instructed to do so. In order to get full credit, you need to show your work. You have 50 minutes to complete the exam. Good Luck! Problem 1 /15

More information

Recap DFA,NFA, DTM. Slides by Prof. Debasis Mitra, FIT.

Recap DFA,NFA, DTM. Slides by Prof. Debasis Mitra, FIT. Recap DFA,NFA, DTM Slides by Prof. Debasis Mitra, FIT. 1 Formal Language Finite set of alphabets Σ: e.g., {0, 1}, {a, b, c}, { {, } } Language L is a subset of strings on Σ, e.g., {00, 110, 01} a finite

More information

Algorithms for Uncertainty Quantification

Algorithms for Uncertainty Quantification Technische Universität München SS 2017 Lehrstuhl für Informatik V Dr. Tobias Neckel M. Sc. Ionuț Farcaș April 26, 2017 Algorithms for Uncertainty Quantification Tutorial 1: Python overview In this worksheet,

More information

I. Numerical Computing

I. Numerical Computing I. Numerical Computing A. Lectures 1-3: Foundations of Numerical Computing Lecture 1 Intro to numerical computing Understand difference and pros/cons of analytical versus numerical solutions Lecture 2

More information

CMSC 330: Organization of Programming Languages. Regular Expressions and Finite Automata

CMSC 330: Organization of Programming Languages. Regular Expressions and Finite Automata CMSC 330: Organization of Programming Languages Regular Expressions and Finite Automata CMSC330 Spring 2018 1 How do regular expressions work? What we ve learned What regular expressions are What they

More information

CS 341 Homework 16 Languages that Are and Are Not Context-Free

CS 341 Homework 16 Languages that Are and Are Not Context-Free CS 341 Homework 16 Languages that Are and Are Not Context-Free 1. Show that the following languages are context-free. You can do this by writing a context free grammar or a PDA, or you can use the closure

More information

Most General computer?

Most General computer? Turing Machines Most General computer? DFAs are simple model of computation. Accept only the regular languages. Is there a kind of computer that can accept any language, or compute any function? Recall

More information

CMSC 330: Organization of Programming Languages

CMSC 330: Organization of Programming Languages CMSC 330: Organization of Programming Languages Regular Expressions and Finite Automata CMSC 330 Spring 2017 1 How do regular expressions work? What we ve learned What regular expressions are What they

More information

Computer Science Introductory Course MSc - Introduction to Java

Computer Science Introductory Course MSc - Introduction to Java Computer Science Introductory Course MSc - Introduction to Java Lecture 1: Diving into java Pablo Oliveira ENST Outline 1 Introduction 2 Primitive types 3 Operators 4 5 Control Flow

More information

(NB. Pages are intended for those who need repeated study in formal languages) Length of a string. Formal languages. Substrings: Prefix, suffix.

(NB. Pages are intended for those who need repeated study in formal languages) Length of a string. Formal languages. Substrings: Prefix, suffix. (NB. Pages 22-40 are intended for those who need repeated study in formal languages) Length of a string Number of symbols in the string. Formal languages Basic concepts for symbols, strings and languages:

More information

CSCI 239 Discrete Structures of Computer Science Lab 6 Vectors and Matrices

CSCI 239 Discrete Structures of Computer Science Lab 6 Vectors and Matrices CSCI 239 Discrete Structures of Computer Science Lab 6 Vectors and Matrices This lab consists of exercises on real-valued vectors and matrices. Most of the exercises will required pencil and paper. Put

More information

CS-141 Exam 2 Review October 19, 2016 Presented by the RIT Computer Science Community

CS-141 Exam 2 Review October 19, 2016 Presented by the RIT Computer Science Community CS-141 Exam 2 Review October 19, 2016 Presented by the RIT Computer Science Community http://csc.cs.rit.edu Linked Lists 1. You are given the linked list: 1 2 3. You may assume that each node has one field

More information

Sets. Introduction to Set Theory ( 2.1) Basic notations for sets. Basic properties of sets CMSC 302. Vojislav Kecman

Sets. Introduction to Set Theory ( 2.1) Basic notations for sets. Basic properties of sets CMSC 302. Vojislav Kecman Introduction to Set Theory ( 2.1) VCU, Department of Computer Science CMSC 302 Sets Vojislav Kecman A set is a new type of structure, representing an unordered collection (group, plurality) of zero or

More information

PetriScript Reference Manual 1.0. Alexandre Hamez Xavier Renault

PetriScript Reference Manual 1.0. Alexandre Hamez Xavier Renault PetriScript Reference Manual.0 Alexandre Hamez Xavier Renault Contents PetriScript basics 3. Using PetriScript................................. 3.2 Generalities.................................... 4.3

More information

CSE 20. Lecture 4: Introduction to Boolean algebra. CSE 20: Lecture4

CSE 20. Lecture 4: Introduction to Boolean algebra. CSE 20: Lecture4 CSE 20 Lecture 4: Introduction to Boolean algebra Reminder First quiz will be on Friday (17th January) in class. It is a paper quiz. Syllabus is all that has been done till Wednesday. If you want you may

More information

Practical Bioinformatics

Practical Bioinformatics 4/24/2017 Resources Course website: http://histo.ucsf.edu/bms270/ Resources on the course website: Syllabus Papers and code (for downloading before class) Slides and transcripts (available after class)

More information

Computation and Logic Definitions

Computation and Logic Definitions Computation and Logic Definitions True and False Also called Boolean truth values, True and False represent the two values or states an atom can assume. We can use any two distinct objects to represent

More information

CSCI-141 Exam 1 Review September 19, 2015 Presented by the RIT Computer Science Community

CSCI-141 Exam 1 Review September 19, 2015 Presented by the RIT Computer Science Community CSCI-141 Exam 1 Review September 19, 2015 Presented by the RIT Computer Science Community http://csc.cs.rit.edu 1. Python basics (a) The programming language Python (circle the best answer): i. is primarily

More information

Today s topics. Introduction to Set Theory ( 1.6) Naïve set theory. Basic notations for sets

Today s topics. Introduction to Set Theory ( 1.6) Naïve set theory. Basic notations for sets Today s topics Introduction to Set Theory ( 1.6) Sets Definitions Operations Proving Set Identities Reading: Sections 1.6-1.7 Upcoming Functions A set is a new type of structure, representing an unordered

More information

A Little Logic. Propositional Logic. Satisfiability Problems. Solving Sudokus. First Order Logic. Logic Programming

A Little Logic. Propositional Logic. Satisfiability Problems. Solving Sudokus. First Order Logic. Logic Programming A Little Logic International Center for Computational Logic Technische Universität Dresden Germany Propositional Logic Satisfiability Problems Solving Sudokus First Order Logic Logic Programming A Little

More information

Extended Introduction to Computer Science CS1001.py. Lecture 8 part A: Finding Zeroes of Real Functions: Newton Raphson Iteration

Extended Introduction to Computer Science CS1001.py. Lecture 8 part A: Finding Zeroes of Real Functions: Newton Raphson Iteration Extended Introduction to Computer Science CS1001.py Lecture 8 part A: Finding Zeroes of Real Functions: Newton Raphson Iteration Instructors: Benny Chor, Amir Rubinstein Teaching Assistants: Yael Baran,

More information

Turing machines Finite automaton no storage Pushdown automaton storage is a stack What if we give the automaton a more flexible storage?

Turing machines Finite automaton no storage Pushdown automaton storage is a stack What if we give the automaton a more flexible storage? Turing machines Finite automaton no storage Pushdown automaton storage is a stack What if we give the automaton a more flexible storage? What is the most powerful of automata? In this lecture we will introduce

More information

Theoretical Computer Science

Theoretical Computer Science Theoretical Computer Science Zdeněk Sawa Department of Computer Science, FEI, Technical University of Ostrava 17. listopadu 15, Ostrava-Poruba 708 33 Czech republic September 22, 2017 Z. Sawa (TU Ostrava)

More information

Automata Theory and Formal Grammars: Lecture 1

Automata Theory and Formal Grammars: Lecture 1 Automata Theory and Formal Grammars: Lecture 1 Sets, Languages, Logic Automata Theory and Formal Grammars: Lecture 1 p.1/72 Sets, Languages, Logic Today Course Overview Administrivia Sets Theory (Review?)

More information

Section 1: Sets and Interval Notation

Section 1: Sets and Interval Notation PART 1 From Sets to Functions Section 1: Sets and Interval Notation Introduction Set concepts offer the means for understanding many different aspects of mathematics and its applications to other branches

More information

The Turing Machine. CSE 211 (Theory of Computation) The Turing Machine continued. Turing Machines

The Turing Machine. CSE 211 (Theory of Computation) The Turing Machine continued. Turing Machines The Turing Machine Turing Machines Professor Department of Computer Science and Engineering Bangladesh University of Engineering and Technology Dhaka-1000, Bangladesh The Turing machine is essentially

More information

Meta-programming & you

Meta-programming & you Meta-programming & you Robin Message Cambridge Programming Research Group 10 th May 2010 What s meta-programming about? 1 result=somedb. customers. select 2 { first_name+ +last_name } 3 where name LIKE

More information

Digital Systems and Information Part II

Digital Systems and Information Part II Digital Systems and Information Part II Overview Arithmetic Operations General Remarks Unsigned and Signed Binary Operations Number representation using Decimal Codes BCD code and Seven-Segment Code Text

More information

Chapter 0 Introduction. Fourth Academic Year/ Elective Course Electrical Engineering Department College of Engineering University of Salahaddin

Chapter 0 Introduction. Fourth Academic Year/ Elective Course Electrical Engineering Department College of Engineering University of Salahaddin Chapter 0 Introduction Fourth Academic Year/ Elective Course Electrical Engineering Department College of Engineering University of Salahaddin October 2014 Automata Theory 2 of 22 Automata theory deals

More information

1.2 The Role of Variables

1.2 The Role of Variables 1.2 The Role of Variables variables sentences come in several flavors true false conditional In this section, a name is given to mathematical sentences that are sometimes true, sometimes false they are

More information

Finite State Transducers

Finite State Transducers Finite State Transducers Eric Gribkoff May 29, 2013 Original Slides by Thomas Hanneforth (Universitat Potsdam) Outline 1 Definition of Finite State Transducer 2 Examples of FSTs 3 Definition of Regular

More information

Modern Functional Programming and Actors With Scala and Akka

Modern Functional Programming and Actors With Scala and Akka Modern Functional Programming and Actors With Scala and Akka Aaron Kosmatin Computer Science Department San Jose State University San Jose, CA 95192 707-508-9143 akosmatin@gmail.com Abstract For many years,

More information

Rigorous Software Development CSCI-GA

Rigorous Software Development CSCI-GA Rigorous Software Development CSCI-GA 3033-009 Instructor: Thomas Wies Spring 2013 Lecture 2 Alloy Analyzer Analyzes micro models of software Helps to identify key properties of a software design find

More information

Lectures about Python, useful both for beginners and experts, can be found at (http://scipy-lectures.github.io).

Lectures about Python, useful both for beginners and experts, can be found at  (http://scipy-lectures.github.io). Random Matrix Theory (Sethna, "Entropy, Order Parameters, and Complexity", ex. 1.6, developed with Piet Brouwer) 2016, James Sethna, all rights reserved. This is an ipython notebook. This hints file is

More information

With Question/Answer Animations. Chapter 2

With Question/Answer Animations. Chapter 2 With Question/Answer Animations Chapter 2 Chapter Summary Sets The Language of Sets Set Operations Set Identities Functions Types of Functions Operations on Functions Sequences and Summations Types of

More information

The Pip Project. PLT (W4115) Fall Frank Wallingford

The Pip Project. PLT (W4115) Fall Frank Wallingford The Pip Project PLT (W4115) Fall 2008 Frank Wallingford (frw2106@columbia.edu) Contents 1 Introduction 4 2 Original Proposal 5 2.1 Introduction....................................................... 5

More information