Introduction to Python for Biologists
|
|
- Spencer Hopkins
- 6 years ago
- Views:
Transcription
1 Introduction to Python for Biologists Katerina Taškova 1 Jean-Fred Fontaine 1,2 1 Faculty of Biology, Johannes Gutenberg-Universität Mainz, Mainz, Germany 2 Genomics and Computational Biology, Kernel Press, Mainz, Germany March 21, 2017
2 Introduction to Python for Biologists Table of Contents Introduction Running code Literals and variables Numeric types Strings Exercise Lists, tuples and ranges Sets and dictionaries Convert and copy Loops Functions Branching Regular Expressions Annexes March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 2
3 Introduction to Python for Biologists Introduction Introduction Running code Literals and variables Numeric types Strings Exercise Lists, tuples and ranges Sets and dictionaries Convert and copy Loops Functions Branching Regular Expressions Annexes March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 3
4 Introduction to Python for Biologists Introduction What is Python? Python is a general-purpose programming language created by Guido van Rossum (1991) high-level (abstraction from the details of the computer) interpreted (needs an interpreter software) Python design philosophy code readability syntax brevity Python is widely used for Biology rich built-in features powerful scientific extensions plotting capabilities March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 4
5 Introduction to Python for Biologists Introduction Structured programming I Instructions are executed sequentially, one per line Conditional statements allow selective execution of code blocks Loops allow repeated execution of code blocks Functions allow on-demand execution of code blocks March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 5
6 Introduction to Python for Biologists Introduction Structured programming II 1 i n s t r u c t i o n 1 # 1 s t i n s t r u c t i o n ( hashtag # s t a r t s comments ) 2 # blank l i n e 3 repeat 20 times # 2nd i n s t r u c t i o n ( loop s t a r t s a block ) 4 i n s t r u c t i o n a # block defined by i n d e n t a t i o n ( spaces or tabs ) 5 i n s t r u c t i o n b # 2nd i n s t r u c t i o n i n block 6 # blank l i n e 7 i f n>10 # 3 rd i n s t r u c t i o n ( C o n d i t i o n a l statement ) 8 i n s t r u c t i o n a # 1 s t i n s t r u c t i o n i n block 9 i n s t r u c t i o n b # 2nd i n s t r u c t i o n i n block 10 # blank l i n e 11 # blank l i n e 12 # backslashs j o i n l i n e s 13 i n s t r u c t i o n 3 \ # 3 rd i n s t r u c t i o n, p a r t 1 14 i n s t r u c t i o n 3 # 3 rd i n s t r u c t i o n, p a r t 2 15 # blank l i n e 16 # Expressions i n ( ), {}, or [ ] can span m u l t i p l e l i n e s 17 i n s t r u c t i o n 4 ( 1, 2, 3 # 4 th i n s t r u c t i o n, p a r t , 5, 6) # 4 th i n s t r u c t i o n, p a r t 2 March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 6
7 Introduction to Python for Biologists Introduction Namespace Variables are names associated with data e.g. a=2 assigns value 2 to variable a Functions are names associated to specific code blocks built-in functions are available (see list on slide 100) e.g. print(a) will display 2 on the screen The user namespace is the set of names available to the user users can define new names of variables and functions in their namespace imported modules can add names of variables and functions in the user namespace March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 7
8 Introduction to Python for Biologists Introduction Object-oriented programming Data is organized in classes and objects a class is a template defining what objects can store and do an object is an instance of a class objects have attributes to store data and methods to do actions object namespaces are different from user namespace Example class Human is defined as: has a name (an attribute name ) has an age (an attribute age ) can introduce itself (a method who ) example with 1 existing Human object P1: 1 P1. name = Mary # assigns value to a t t r i b u t e name 2 P1. age = 26 # assigns value to a t t r i b u t e age 3 P1. who ( ) # d i s p l a y s My name i s Mary I am 2 6! 4 who ( ) # e r r o r! not i n the user namespace March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 8
9 Introduction to Python for Biologists Introduction Modules Modules can add functionalities to Python e.g. classes and functions Example of available modules: NumPy for scientific computing Matplotlib for plotting BioPython for Biology Modules have to be imported into the code 1 # import datetime module i n i t s own namespace 2 import datetime 3 datetime. date. today ( ) # today ( ) # e r r o r! 5 6 # import f u n c t i o n s log2 and log10 from module math 7 # i n c u r r e n t namespace 8 from math import log2, log10 9 log10 ( 1 ) # equal 0 March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 9
10 Introduction to Python for Biologists Running code Introduction Running code Literals and variables Numeric types Strings Exercise Lists, tuples and ranges Sets and dictionaries Convert and copy Loops Functions Branching Regular Expressions Annexes March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 10
11 Introduction to Python for Biologists Running code Running code I From a terminal by using the interactive Python shell 1 $ python3 # opens Python s h e l l 2 a=2 # assigns 2 to a 3 b=3 # assigns 3 to b 4 e x i t ( ) # closes Python s h e l l From a terminal by running a script file e.g. let say myscript.py is a script file (simple text file) and it contains: print( hello world! ) 1 $ python3 myscript. py # runs python3 and the s c r i p t 2 h e l l o world! # r e s u l t of the s c r i p t on the t e r m i n a l March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 11
12 Introduction to Python for Biologists Running code Running code II From Jupyter Notebook web-based graphical interface manage cells of code or text see execution results on the same notebook save/open notebooks March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 12
13 Introduction to Python for Biologists Running code Documentation and messages I Documentation and help: use the built-in help() function e.g. help(print) to display help for function print() see help menu or Google it Examples of error messages 1 # F o r g e t t i n g quotes 2 p r i n t ( Hello world ) 3 # F i l e < s t d i n >, l i n e 2 4 # p r i n t ( Hello world ) 5 # ˆ 6 # SyntaxError : i n v a l i d syntax March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 13
14 Introduction to Python for Biologists Running code Documentation and messages II 1 # S p e l l i n g mistakes 2 p r i n ( Hello world ) 3 # Traceback ( most recent c a l l l a s t ) : 4 # F i l e < s t d i n >, l i n e 2, i n <module> 5 # NameError : name p r i n i s not defined 1 # Wrong l i n e break w i t h i n a s t r i n g 2 p r i n t ( Hello 3 World ) 4 # F i l e < s t d i n >, l i n e 2 5 # p r i n t ( Hello 6 # ˆ 7 # SyntaxError : EOL while scanning s t r i n g l i t e r a l March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 14
15 Introduction to Python for Biologists Literals and variables Introduction Running code Literals and variables Numeric types Strings Exercise Lists, tuples and ranges Sets and dictionaries Convert and copy Loops Functions Branching Regular Expressions Annexes March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 15
16 Introduction to Python for Biologists Literals and variables Numeric and strings literals I 1 # Numeric l i t e r a l s E3 # means # S t r i n g s l i t e r a l s 7 A s t r i n g # A s t r i n g 8 A s t r i n g # A s t r i n g 9 A s t r i n g # A s t r i n g 10 Three s i n g l e quotes # Three s i n g l e quotes 11 Three double quotes # Three double quotes 12 A \ s t r i n g \ # A s t r i n g ( backslash escape sequence ) 13 r A \ s t r i n g \ # A \ s t r i n g \ ( raw s t r i n g ) Python stores literals in objects of corresponding classes (class int for integers, float for floatting point, and str for strings) March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 16
17 Introduction to Python for Biologists Literals and variables Numeric and strings literals II Printing numeric and strings literals 1 p r i n t (12) # 12 2 p r i n t (1+2) # p r i n t ( Hello World ) # Hello World 5 6 p r i n t ( Hello World, 1+2) # Hello World 3 7 p r i n t ( Hello World, 1+2, sep= ) # Hello World 3 8 p r i n t ( Hello World, 1+2, sep= \ t ) # Hello World 3 9 # (\ t : tab, \n : newline ) p r i n t ( AB, end= ) # AB ( avoid newline at the end ) 12 p r i n t ( CD ) # ABCD p r i n t ( Max i s, 12, and Min i s, 3) # Max i s 12 and Min i s 3 March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 17
18 Introduction to Python for Biologists Literals and variables Variables I Variables are names used to access objects first letter is a character (not a digit) no space characters allowed case-sensitive (variable name var is not Var) prefer alphanumeric characters (e.g. abc123) avoid accents, non-alphanumeric, non English underscores may be used (e.g. abc 123) The following keywords can not be used as variable names and, assert, break, class, continue def, del, elif, else, except, exec, finally, for, from global, if, import, in, is, lambda, not, or, pass print, raise, return, try, while, yield March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 18
19 Introduction to Python for Biologists Literals and variables Variables II 1 # Numeric types 2 a=2 # a i s assigned an i n t o b j e c t of value 2 3 p r i n t ( a ) # p r i n t s the o b j e c t assigned to a ( 2 ) 4 b=a # b i s assigned the same o b j e c t as a ( 2 ) 5 p r i n t ( b ) # 2 6 a=5 # a i s assigned a new o b j e c t of value 5 7 p r i n t ( a ) # 5 8 p r i n t ( b ) # 2 ( b i s s t i l l assigned to o b j e c t of value 2) 9 10 # S t r i n g s 11 c1= a 12 p r i n t ( c1 ) # a 13 myname125 = abc March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 19
20 Introduction to Python for Biologists Numeric types Introduction Running code Literals and variables Numeric types Strings Exercise Lists, tuples and ranges Sets and dictionaries Convert and copy Loops Functions Branching Regular Expressions Annexes March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 20
21 Introduction to Python for Biologists Numeric types Numeric types I 1 type ( 7 ) # <class i n t > ( i n t e g e r number ) 2 type ( ) # <class f l o a t > ( f l o a t i n g p o i n t ) 3 type (4.52 e 3) # <class f l o a t > ( f l o a t i n g p o i n t ) 4 5 # Operators ( s p e c i a l b u i l t i n f u n c t i o n s ) # 4 ( a d d i t i o n ) # 3 ( s u b s t r a c t i o n ) # 6 ( m u l t i p l i c a t i o n ) 9 9 / 2 # 4.5 ( d i v i s i o n ) 10 9 / / 2 # 4 ( i n t e g e r d i v i s i o n ) 11 9 % 2 # 1 ( i n t e g e r d i v i s i o n remainder ) # 8 ( exponent ) # Lowest to highest operators precedence ( equal i f on same l i n e ) 15 +, # Addition, S u b t r a c t i o n 16, /, / /, % # M u l t i p l i c a t i o n, D i v i s i o n s, Remainder 17 +x, x # P o s i t i v e, Negative 18 # Exponentiation March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 21
22 Introduction to Python for Biologists Numeric types Numeric types II 1 # B u i l t i n f u n c t i o n s 2 abs ( 2.58) # 2.58 ( absolute value of x ) 3 round ( 2. 5 ) # 2 ( round to c l o s e s t i n t e g e r ) 4 5 # With v a r i a b l e s 6 a = 1 # 1 7 b = # 2 8 c = a + b # 3 9 d = a+c b # 7 ( precedence of over +) 10 d = ( a+c ) b # 8 ( use parentheses to break precedence ) # Short n o t a t i o n s ( v a l i d f o r +,,, /,... ) 13 a += 1 # a = a a = 5 # a = a # Special f l o a t values 17 f l o a t ( NaN ) # nan ( Not a Number ) 18 f l o a t ( I n f ) # i n f : I n f i n i t e p o s i t i v e ; i n f : I n f i n i t e negative March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 22
23 Introduction to Python for Biologists Strings Introduction Running code Literals and variables Numeric types Strings Exercise Lists, tuples and ranges Sets and dictionaries Convert and copy Loops Functions Branching Regular Expressions Annexes March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 23
24 Introduction to Python for Biologists Strings Sequence types Text sequence type: Strings: immutable sequences of characters Basic sequence types: Lists: mutable sequences Tuples: immutable sequences Ranges: immutable sequence of numbers Sequence operations: All sequence types support common sequence operations (slide 98) Mutable sequence types support specific operations (slide 99) March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 24
25 Introduction to Python for Biologists Strings Strings I 1 # Quotes 2 A s t r i n g # A s t r i n g 3 A s t r i n g # A s t r i n g 4 A s t r i n g # A s t r i n g 5 Three s i n g l e quotes # Three s i n g l e quotes 6 Three double quotes # Three double quotes 7 8 # Escape sequences ( see annexes ) 9 A s i n g l e quote # A s i n g l e quote 10 A s i n g l e quote \ # A s i n g l e quote 11 A t a b u l a t i o n \ t 12 A newline \n See other escape sequences in slide 97 Triple quoted strings may span multiple lines - all associated whitespace will be included in the string literal March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 25
26 Introduction to Python for Biologists Strings Strings II 1 # Operators 2 pipe + t t e # = p i p e t t e ( concatenation ) 3 A 7 # = AAAAAAA ( r e p l i c a t i o n ) 4 A 3 + C 2 # = AAACC 5 A + s t r ( 2. 0 ) # = A2. 0 ( convert number then concatenate ) 6 7 # B u i l t i n f u n c t i o n s 8 len ( A s t r i n g of characters ) # 22 ( l e n g t h i n characters ) 9 type ( a ) # <class s t r > ( s t r i n g ) # S l i c e s [ s t a r t : end : step ] (0 i s index of f i r s t character ) 12 ABCDEFG [ 2 : 5 ] # CDE ( F at index 5 excluded ) 13 ABCDEFG [ : 5 ] # ABCDE ( from begining ) 14 ABCDEFG [ 5 : ] # FG ( to the end ) 15 ABCDEFG [ 2:] # FG ( 2 from the end : to the end ) 16 ABCDEFG [ 0 : 5 : 2 ] # ACE ( every second l e t t e r w ith step =2) March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 26
27 Introduction to Python for Biologists Strings Strings methods I Strings are immutable: new objects are created for changes 1 seq = ACGtCCAgTnAGaaGT 2 3 # Case 4 seq. c a p i t a l i z e ( ) # Acgtccagtnagaagt 5 seq. casefold ( ) # acgtccagtnagaagt ( e s z e t t => ss ) 6 seq. lower ( ) # acgtccagtnagaagt ( e s z e t t => e s z e t t ) 7 seq. swapcase ( ) # acgtccagtnagaagt 8 seq. upper ( ) # ACGTCCAGTNAGAAGT 9 10 # Search and replace 11 seq. count ( a ) # 2 ( case s e n s i t i v e ) 12 seq. count ( G,0, 4) # 1 ( s l i c e s t a r t and end indexes ) 13 seq. endswith ( GT ) # True 14 seq. endswith ( G,0, 4) # False ( s l i c e s t a r t and end indexes ) 15 seq. f i n d ( GtC ) # 2 (1 s t h i t index, 1 otherwise ) 16 seq. replace ( aa, t t ) # ACGtCCAgTnAGttGT ( case s e n s i t i v e ) 17 seq. replace ( A, x, 2) # xcgtccxgtnagaagt (2 f i r s t h i t s only ) March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 27
28 Introduction to Python for Biologists Strings Strings methods II 1 seq = ACGtCCAgTnAGaaGT 2 3 # I s f u n c t i o n s 4 seq. isalnum ( ) # True ( Are a l l characters alphanumeric?) 5 seq. i s a l p h a ( ) # True ( Are a l l characters a l p h a b e t i c?) 6 seq. i s l o w e r ( ) # False ( Are a l l characters lowercase?) 7 seq. isnumeric ( ) # False ( Are a l l numeric characters?) 8 seq. isspace ( ) # False ( Are a l l whitespace characters?) 9 seq. isupper ( ) # False ( Are a l l characters uppercase?) # Join and s p l i t 12. j o i n ( [ A, B ] ) # A B 13. j o i n ( seq ) # A C G t C C A g T n A G a a G T 14 seq. p a r t i t i o n ( aa ) # ( ACGtCCAgTnAG, aa, GT ) : a t u p l e 15 seq. s p l i t ( aa ) # [ ACGtCCAgTnAG, GT ] : a l i s t 16 1\n2. s p l i t l i n e s ( ) # [ 1, 2 ] ( s p l i t at l i n e boundaries \r, \n ) March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 28
29 Introduction to Python for Biologists Strings Strings methods III 1 seq = ACGtCCAgTnAGaaGT 2 3 # D e l e t i n g 4 seq. l s t r i p ( ) # remove leading whitespace characters 5 seq. r s t r i p ( ) # remove t r a i l i n g whitespace characters 6 seq. s t r i p ( ) # remove whitespace characters from both ends 7 8 seq. l s t r i p ( AC ) # GtCCAgTnAGaaGT ( remove C s or A s ) 9 seq. l s t r i p ( CA ) # GtCCAgTnAGaaGT ( remove C s or A s ) 10 seq. l s t r i p ( C ) # ACGtCCAgTnAGaaGT ( no impact ) 11 # same f o r r s t r i p but from the r i g h t and s t r i p from both ends # Simple parsing of t e x t l i n e s from CSV f i l e s 14 l i n e. s t r i p ( ). s p l i t (, ) # remove newline and s p l i t CSV (\ t i f TSV) # t r a n s l a t e ( case s e n s i t i v e ) 17 t a b l e = seq. maketrans ( atcg, tagc ) # map characters by index 18 seq. lower ( ). t r a n s l a t e ( t a b l e ) # t g c a g g t c a n t c t t c a March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 29
30 Introduction to Python for Biologists Exercise Introduction Running code Literals and variables Numeric types Strings Exercise Lists, tuples and ranges Sets and dictionaries Convert and copy Loops Functions Branching Regular Expressions Annexes March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 30
31 Introduction to Python for Biologists Exercise Exercise Create the following directory structure Dokumente python notebooks data Jupyter Notebook File: Literals.ipynb URL: Download the file into the notebooks folder March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 31
32 Introduction to Python for Biologists Lists, tuples and ranges Introduction Running code Literals and variables Numeric types Strings Exercise Lists, tuples and ranges Sets and dictionaries Convert and copy Loops Functions Branching Regular Expressions Annexes March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 32
33 Introduction to Python for Biologists Lists, tuples and ranges Sequence types Text sequence type: Strings: immutable sequences of characters Basic sequence types: Lists: mutable sequences Tuples: immutable sequences Ranges: immutable sequence of numbers Sequence operations: All sequence types support common sequence operations (slide 98) Mutable sequence types support specific operations (slide 99) March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 33
34 Introduction to Python for Biologists Lists, tuples and ranges Lists I A List is an ordered collection of objects 1 L i s t 1 = [ ] # an empty l i s t 2 3 L i s t 1 = [ b, a, 1, cat, K, dog, F ] 4 L i s t 1 [ 0 ] # b ( access item of index 0) 5 L i s t 1 [ 1 ] # a ( access item of index 1) 6 L i s t 1 [ 1] # F ( access the l a s t item ) 7 L i s t 1 [ 2] # dog ( access the second l a s t item ) 8 9 # S l i c e s [ s t a r t : end : step ] 10 L i s t 1 [ 2 : 5 ] # [ 1, cat, K ] ( index 5 excluded ) 11 L i s t 1 [ : 5 ] # [ b, a, 1, cat, K ] 12 L i s t 1 [ 5 : ] # [ dog, F ] 13 L i s t 1 [ 2:] # [ dog, F ] 14 L i s t 1 [ 0 : 5 : 2 ] # [ b, 1, K ] March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 34
35 Introduction to Python for Biologists Lists, tuples and ranges Lists II 1 # B u i l t i n f u n c t i o n s 2 L i s t 2 = [ 1, 2, 3, 4, 5] 3 len ( L i s t 1 ) # 5 ( l e n g t h = 7 items ) 4 max( L i s t 2 ) # 5 5 min ( L i s t 2 ) # 1 6 sum( L i s t 2 ) # # L i s t methods 9 L i s t 2 = [ ] # empty l i s t 10 L i s t 2. append ( 1 ) # [ 1 ] 11 L i s t 2. append ( A ) # [ 1, A ] 12 L i s t 2. extend ( [ B, 2 ] ) # [ 1, A, B, 2] 13 L i s t 2. pop ( 2 ) # [ 1, A, 2] 14 L i s t 2. i n s e r t ( 3, A ) # [ 1, A, 2, A ] ( i n s e r t A at index 3) 15 L i s t 2. index ( A ) # 1 ( index of the 1 s t A ) 16 L i s t 2. count ( A ) # 2 ( number of A ) 17 L i s t 2. reverse ( ) # [ A, 2, A, 1] March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 35
36 Introduction to Python for Biologists Lists, tuples and ranges Lists III 1 # s o r t i n g 2 L i s t 3 = [ 5, 3, 4, 1, 2] 3 sorted ( L i s t 3 ) # [ 1, 2, 3, 4, 5] ( b u i l d a new sorted l i s t ) 4 L i s t 3 # [ 5, 3, 4, 1, 2] ( L i s t 3 not changed ) 5 L i s t 3. s o r t ( ) # modifies the l i s t in place 6 L i s t 3 # [ 1, 2, 3, 4, 5] (. s o r t ( ) did modify L i s t 3! ) 7 8 # nested l i s t / 2D l i s t s / t a b l e s 9 mylist = [ [ b, a ], 10 [ 1, cat ] ] # a l i s t of 2 l i s t s 11 mylist [ 0 ] # r e t u r n s the f i r s t l i s t [ b, a ] 12 mylist [ 0 ] [ 0 ] # b (1 s t item of the 1 s t l i s t ) 13 mylist [ 0 ] [ 1 ] # a (2 nd item of the 1 s t l i s t ) 14 mylist [ 1 ] # r e t u r n s the 2nd l i s t [ 1, cat ] 15 mylist [ 1 ] [ 0 ] = 1 0 # [ [ b, a ], [10, cat ] ] March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 36
37 Introduction to Python for Biologists Lists, tuples and ranges Lists IV 1 mylist = [ [ b, a ], 2 [ 1, cat ] ] 3 4 f o r s u b l i s t i n mylist : # loop over s u b l i s t s 5 f o r value i n s u b l i s t : # loop over values 6 p r i n t ( value ) # p r i n t 1 value per l i n e 7 # b 8 # a 9 # # cat f o r s u b l i s t i n mylist : # loop over s u b l i s t s 13 n e w s u b l i s t = map( s t r, s u b l i s t ) # convert each item to s t r i n g 14 p r i n t ( \ t. j o i n ( n e w s u b l i s t ) ) # p r i n t as TSV t a b l e 15 # b a 16 # 10 cat March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 37
38 Introduction to Python for Biologists Lists, tuples and ranges Tuples and ranges A Tuple is an ordered collection of objects 1 Tuple1 = ( ) # empty t u p l e 2 Tuple1 = ( b, a, 1, cat, K, dog, F ) # defined t u p l e 3 4 Tuple1 [ 0 ] # b 5 Tuple1 [ 1 : 3 ] # ( a, 1) ( index 3 excluded ) Ranges 1 # Range ( s t a r t, stop [, step ] ) 2 range (10) # range ( 0, 10) => no nice p r i n t method 3 l i s t ( range (10) ) # [ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9] 4 l i s t ( range ( 0, 30, 5) ) # [ 0, 5, 10, 15, 20, 25] 5 l i s t ( range ( 0, 5, 1) ) # [ 0, 1, 2, 3, 4] 6 l i s t ( range ( 0 ) ) # [ ] 7 l i s t ( range ( 1, 0) ) # [ ] March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 38
39 Introduction to Python for Biologists Sets and dictionaries Introduction Running code Literals and variables Numeric types Strings Exercise Lists, tuples and ranges Sets and dictionaries Convert and copy Loops Functions Branching Regular Expressions Annexes March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 39
40 Introduction to Python for Biologists Sets and dictionaries Sets I A Set is a mutable unordered collection of objects 1 S0 = set ( ) # an empty set 2 S0 = { a, 1} # a new set of 2 items 3 S1 = { a, 1, b, R } # a new set of 4 items 4 S2 = { a, 1, b, S } # a new set of 4 items 5 len ( S0 ) # # Operators 8 R i n S1 # True 9 R not i n S2 # True 10 S1 S2 # i n S1 but not i n S2 => { R } 11 S1 S2 # i n S1 or i n S2 => {1, a, S, R, b } 12 S1 & S2 # i n S1 and i n S2 => {1, b, a } 13 S1 ˆ S2 # i n S1 or i n S2 but not i n both => { R, S } 14 S0 <= S1 # S0 i s subset of S2 => True 15 S1 >= S2 # S1 i s superset of S2 => False 16 S1 >= S0 # True 17 S0. i s d i s j o i n t ( S1 ) # False March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 40
41 Introduction to Python for Biologists Sets and dictionaries Sets II 1 # Methods 2 S0. copy ( ) # r e t u r n a new set w ith a shallow copy of S0 3 S0. add ( item ) # add element item to the set 4 S0. remove ( item ) # remove element item from the set 5 S0. discard ( item ) # remove element item from the set i f present 6 S0. pop ( ) # remove and r e t u r n an a r b i t r a r y element 7 S0. c l e a r ( ) # remove a l l elements from the set March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 41
42 Introduction to Python for Biologists Sets and dictionaries Dictionaries I A Dictionary is a mutable indexed collection of objects (indexed by unique keys) 1 d = {} # empty d i c t i o n a r y 2 d = { A : ALA, C : CYS } # d i c t i o n a r y w ith 2 items 3 d [ A ] # ALA 4 d [ C ] # CYS 5 d [ H ]= HIS # add new item 6 d # { H : HIS, C : CYS, A : ALA } 7 del d [ A ] # { C : CYS, H : HIS } 8 9 C i n d # True ( key C i s i n d ) 10 A not i n d # True ( key A i s not i n d anymore ) March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 42
43 Introduction to Python for Biologists Sets and dictionaries Dictionaries II d[key] get value by key d[key] = val set value by key del d[key] delete item by key d.clear() delete all items len(d) number of items d.copy() make a shallow copy d.keys() return a view of all keys d.values() return a view of all values d.items() return a view of all items (key,value) d.update(d2) add all items from dictionary d2 d.get(key [, val]) get value by key if exists, otherwise val d.setdefaults(key [, val]) like d.get(k,val), also set d[k]=val if k not in d pop(key[, default]) remove key and return its value, return default otherwise. d.popitem() remove a random item and returns it as tuple Table: Functions for dictionaries March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 43
44 Introduction to Python for Biologists Convert and copy Introduction Running code Literals and variables Numeric types Strings Exercise Lists, tuples and ranges Sets and dictionaries Convert and copy Loops Functions Branching Regular Expressions Annexes March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 44
45 Introduction to Python for Biologists Convert and copy Converting types I Many Python functions are sensitive to the type of data. For example, you cannot concatenate a string with an integer: 1 sign = You are years old # e r r o r!! 2 sign = You are + s t r ( 21) + years old # OK 3 sign # You are 21 years old 4 5 # convert to i n t ( from s t r or f l o a t ) 6 i n t ( 2014 ) # from a s t r i n g 7 i n t ( ) # from a f l o a t 8 9 # convert to f l o a t ( from s t r or i n t ) 10 f l o a t ( 1.99 ) # from a s t r i n g 11 f l o a t ( 5 ) # from an i n t e g e r March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 45
46 Introduction to Python for Biologists Convert and copy Converting types II 1 # convert to s t r ( from i n t, f l o a t, l i s t, tuple, d i c t and set ) 2 s t r ( ) # s t r ( [ 1, 2, 3, 4 ] ) # [ 1, 2, 3, 4 ] 4 5 # convert a sequence type to another 6 # ( s t r, l i s t, tuple, and set f u n c t i o n s ) 7 new set = set ( o l d l i s t ) # l i s t to set 8 new tuple = t u p l e ( o l d l i s t ) # l i s t to t u p l e 9 new set = set ( Hello ) # s t r i n g to set { H, o, e, l } 10 n e w l i s t = l i s t ( Hello ) # s t r i n g to l i s t [ H, e, l, l, o ] March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 46
47 Introduction to Python for Biologists Convert and copy Copy I Assignments (=) do not copy objects, they create bindings between a target and an object. 1 # Numeric types ( immutable ) 2 a = 1 # a binds the o b j e c t 1 3 b = a # b binds the o b j e c t 1 4 b = b + 1 # b binds a new o b j e c t created by the sum 5 a # 1 6 b # # S t r i n g s ( immutable ) 9 a = Hello # a binds the o b j e c t Hello 10 b = a # b binds the o b j e c t Hello 11 a = a. replace ( o, o World! ) # a binds a new o b j e c t 12 a # Hello World! 13 b # Hello March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 47
48 Introduction to Python for Biologists Convert and copy Copy II For collections that are mutable or contain mutable items, a shallow copy is sometimes needed so one can change one copy without changing the other. 1 # D i c t i o n a r y ( mutable ) 2 d1 = { A : ALA, C : CYS } # d1 binds the o b j e c t 3 d2 = d1 # d2 binds the o b j e c t 4 d2 [ H ]= HIS # add item to the o b j e c t 5 d1 # { A : ALA, H : HIS, C : CYS } 6 d2 # { A : ALA, H : HIS, C : CYS } 7 8 d2 = d1. copy ( ) # d2 binds a shallow copy of the o b j e c t 9 d2 [ P ]= PRO # add item to the copied o b j e c t 10 d1 # { A : ALA, H : HIS, C : CYS } 11 d2 # { A : ALA, H : HIS, P : PRO, C : CYS } March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 48
49 Introduction to Python for Biologists Convert and copy Copy III 1 # L i s t ( mutable ) 2 l 1 = [ A, H, C ] 3 l 2 = l 1 4 l 2. append ( P ) 5 l 1 # [ A, H, C, P ] 6 l 2 # [ A, H, C, P ] 7 8 l 2 = l 1 [ : ] # shallow copy by assigning a s l i c e of the a l l l i s t 9 l 2. append ( V ) 10 l 1 # [ A, H, C, P ] 11 l 2 # [ A, H, C, P, V ] March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 49
50 Introduction to Python for Biologists Convert and copy Copy IV Convert types to get copies 1 n e w l i s t = l i s t ( o l d l i s t ) # shallow copy 2 new dict = d i c t ( o l d d i c t ) # shallow copy 3 new set = set ( o l d l i s t ) # copy l i s t as a set 4 new tuple = t u p l e ( o l d l i s t ) # copy l i s t a t u p l e The copy module 1 import copy 2 x. copy ( ) # shallow copy of x 3 x. deepcopy ( ) # deep copy of x, i n c l u d i n g embedded o b j e c t s March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 50
51 Introduction to Python for Biologists Loops Introduction Running code Literals and variables Numeric types Strings Exercise Lists, tuples and ranges Sets and dictionaries Convert and copy Loops Functions Branching Regular Expressions Annexes March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 51
52 Introduction to Python for Biologists Loops For loop I 1 # For items i n a l i s t 2 f o r person i n [ I s a b e l, Kate, Michael ] : 3 p r i n t ( Hi, person ) 4 # Hi I s a b e l 5 # Hi Kate 6 # Hi Michael 7 8 # For items i n a d i c t i o n a r y 9 seq = # an empty s t r i n g 10 d = { A : ALA, C : CYS } # a d i c t i o n a r y w ith 2 keys 11 f o r k i n d. keys ( ) : # loop over the keys 12 seq += d [ k ] # append value to seq 13 p r i n t ( seq ) # CYSALA March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 52
53 Introduction to Python for Biologists Loops For loop II 1 # For items i n a s t r i n g 2 f o r c i n abc : 3 p r i n t ( c ) 4 # a 5 # b 6 # c 7 8 # For items i n a range 9 f o r n i n range ( 3 ) : 10 p r i n t ( n ) 11 # 0 12 # 1 13 # # For items from any i t e r a t o r 16 f o r n i n i t e r a t o r : 17 p r i n t ( n ) March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 53
54 Introduction to Python for Biologists Loops Enumerate 1 # loop g e t t i n g index and value 2 RNAs = [ mirna, trna, mrna ] 3 f o r i, rna i n enumerate (RNAs) : 4 p r i n t ( i, rna ) 5 # 0 mirna 6 # 1 trna 7 # 2 mrna 8 9 # loop over 2 l i s t s 10 RNAtypes = [ micro, t r a n s f e r, messenger ] 11 f o r i, t i n enumerate ( RNAtypes ) : 12 r = RNAs[ i ] 13 p r i n t ( i, t, r ) 14 # 0 micro mirna 15 # 1 t r a n s f e r trna 16 # 2 messenger mrna March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 54
55 Introduction to Python for Biologists Loops While loop 1 i =0 2 value=1 3 while value <200: 4 i +=1 5 value = i 6 p r i n t ( i, value ) 7 # # # # # # March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 55
56 Introduction to Python for Biologists Introduction Running code Literals and variables Numeric types Strings Exercise Lists, tuples and ranges Sets and dictionaries Convert and copy Loops Functions Branching Regular Expressions Annexes March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 56
57 Introduction to Python for Biologists Exercise URL Jupyter Notebook File: Sequences.ipynb Download the file into the notebooks folder Data file File: shrub dimensions.csv Download the file into the data folder March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 57
58 Introduction to Python for Biologists Functions Introduction Running code Literals and variables Numeric types Strings Exercise Lists, tuples and ranges Sets and dictionaries Convert and copy Loops Functions Branching Regular Expressions Annexes March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 58
59 Introduction to Python for Biologists Functions Functions I 1 from random import choice # import f u n c t i o n choice 2 3 # Simple f u n c t i o n 4 def kmerfixed ( ) : # d e f i n e f u n c t i o n kmerfixed 5 p r i n t ( ACGTAGACGC ) # p r i n t predefined s t r i n g 6 7 kmerfixed ( ) # d i s p l a y ACGTAGACGC 8 9 # Returning a value 10 def kmer10 ( ) : # d e f i n e f u n c t i o n kmer10 11 seq= # d e f i n e an empty s t r i n g 12 f o r count i n range ( 10) : # repeat 10 times 13 seq += choice ( CGTA ) # add 1 random nt to s t r i n g 14 r e t u r n ( seq ) # r e t u r n s t r i n g newkmer = kmer10 ( ) # get r e s u l t of f u n c t i o n i n t o v a r i a b l e 17 p r i n t ( newkmer ) # c a l l the f u n c t i o n e. g. ACGGATACGC March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 59
60 Introduction to Python for Biologists Functions Functions II 1 # One parameter 2 def kmer ( k ) : # defin e kmer with 1 param. k 3 seq= 4 f o r count i n range ( k ) : # k i s used to d e f i n e the range 5 seq+=choice ( CGTA ) 6 r e t u r n ( seq ) 7 8 p r i n t ( kmer ( k =4) ) # e. g. TACC 9 p r i n t ( kmer (20) ) # e. g. CACAATGGGTACCCCGGACC 10 p r i n t ( kmer ( 0 ) ) # 11 p r i n t ( kmer ( ) ) # TypeError : kmer ( ) missing 1 r e q u i r e d 12 # p o s i t i o n a l argument : k March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 60
61 Introduction to Python for Biologists Functions Functions III 1 # Parameters with more parameters and d e f a u l t values 2 def generic kmer ( alphabet= ACGT, k=10) : 3 seq= 4 f o r count i n range ( k ) : 5 seq+=choice ( alphabet ) 6 r e t u r n ( seq ) 7 8 generic kmer ( AB12, 15) # e. g. 112AA1A12AA generic kmer ( AB12 ) # e. g. 1AA1B1BA2A 10 generic kmer ( k=20) # e. g. GTGGGCTTGTGCCCTGCACT 11 generic kmer ( ) # e. g. CTTGCCGGGA 12 generic kmer ( k =8, alphabet= #$%& ) # e. g. $$#&%$%$ March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 61
62 Introduction to Python for Biologists Functions Name spaces I Variable and function names defined globally can be seen in functions: this is the global namespace 1 a = 10 # g l o b a l v a r i a b l e 2 3 def my function ( ) : 4 p r i n t ( a ) # w i l l use the g l o b a l v a r i a b l e 5 6 my function ( ) # 10 ( the g l o b a l a ) 7 p r i n t ( a ) # 10 ( the g l o b a l a ) March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 62
63 Introduction to Python for Biologists Functions Name spaces II Names defined within a function can not be seen outside: the function has its own namespace. 1 a = 10 # g l o b a l v a r i a b l e 2 3 def my function ( ) : 4 a = 1 # l o c a l v a r i a b l e defined by assignment 5 b = 2 # l o c a l v a r i a b l e defined by assignment 6 p r i n t ( a ) 7 8 my function ( ) # 1 ( the l o c a l a ) 9 p r i n t ( a ) # 10 ( the g l o b a l a ) 10 p r i n t ( b ) # NameError : name b i s not defined March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 63
64 Introduction to Python for Biologists Functions Name spaces III Use parameters and returned values to get and set variables outside the name space 1 a = 10 # g l o b a l v a r i a b l e 2 3 def my function ( v a l ) : # l o c a l v a r i a b l e v a l 4 b = 2 5 v a l = v a l + b 6 r e t u r n ( v a l ) 7 p r i n t ( a ) # 10 ( the g l o b a l a ) 8 p r i n t ( my function ( a ) ) # 12 9 p r i n t ( a ) # 10 ( the g l o b a l a unchanged ) c = my function ( a ) # set v a l to 10 and assign 10+2 to c 12 p r i n t ( c ) # 12 ( g l o b a l a was changed ) 13 p r i n t ( a ) # 10 ( g l o b a l a was unchanged ) a = my function ( a ) # change g l o b a l a w ith value 10+2 March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 64
65 Introduction to Python for Biologists Branching Introduction Running code Literals and variables Numeric types Strings Exercise Lists, tuples and ranges Sets and dictionaries Convert and copy Loops Functions Branching Regular Expressions Annexes March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 65
66 Introduction to Python for Biologists Branching Truth Value Testing I Any object can be tested for truth value. The following values are considered false (other values are considered True): None False zero value: e.g. 0 or 0.0 an empty sequence or mapping: e.g., (), [ ], { }. Operations and built-in functions that have a Boolean result always return 0 for False and 1 for True March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 66
67 Introduction to Python for Biologists Branching Boolean Operations I A Boolean is equal to True or False a and b (true if a and b are true, false otherwise) a or b (true if a or b is true (1 alone or both), false otherwise) a ˆ b (true if either a or b is true (not both), false otherwise) not b (true if b is false, false otherwise) March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 67
68 Introduction to Python for Biologists Branching Boolean Operations II All example code for tests below return True unless otherwise specified 1 # l e t set values of 3 v a r i a b l e s ( s i n g l e = symbol ) 2 a = True 3 b = False 4 c = True # simple t e s t s using two = symbols (==) 8 a == True 9 b == False 10 c == True March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 68
69 Introduction to Python for Biologists Branching Boolean Operations III 1 # l e t set values of 3 v a r i a b l e s ( one = symbol ) 2 a = True 3 b = False 4 c = True 5 6 # order i s i r r e l e v a n t 7 ( a or b ) == ( b or a ) 8 ( a and b ) == ( b and a ) 9 10 # n e u t r a l ( whatever value of a ) 11 ( a or False ) == a 12 ( a and True ) == a # always the same ( whatever value of a ) 15 ( a and False ) == False 16 ( a or True ) == True March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 69
70 Introduction to Python for Biologists Branching Boolean Operations IV 1 # l e t set values of 3 v a r i a b l e s ( one = symbol ) 2 a = True 3 b = False 4 c = True 5 6 # precedence == > not > and > or 7 ( a and b or c ) == ( ( a and b ) or c ) 8 ( not a == b ) == ( not ( a == b ) ) 9 10 # e q u i v a l e n t expressions 11 ( ( a or b ) or c ) == ( a or ( b or c ) ) == ( a or b or c ) 12 ( a or a or a ) == a 13 ( b and b and b ) == b b and b and b == b # False and False and True => False!! a and ( b or c ) == ( a and b ) or ( a and c ) 18 a or ( b and c ) == ( a or b ) and ( a or c ) March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 70
71 Introduction to Python for Biologists Branching Comparisons 1 Operations 2 < # s t r i c t l y l ess than 3 <= # less than or equal 4 > # s t r i c t l y g r e a t e r than 5 >= # g r e a t e r than or equal 6 == # equal ( two symbols =) 7 math. i s c l o s e ( a, b ) # equal f o r f l o a t i n g p o i n t s a and b 8! = # not equal 9 i s # o b j e c t i d e n t i t y 10 i s not # negated o b j e c t i d e n t i t y 11 x < y <= z # i s e q u i v a l e n t to x < y and y <= z Comparisons between objects of same class are supported if operator defined for the class. Different numerical types can be compared: e.g. 2<4.56 Floating points can not be compared exactly due to the limited precision to represent infinite numbers such as 1/3 = March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 71
72 Introduction to Python for Biologists Branching Conditionals IF-ELIF-ELSE 1 seq = ATGAnnATG 2 i f n i n seq : 3 p r i n t ( sequence contains undefined bases ( n ) ) 4 e l i f x i n seq : 5 p r i n t ( sequence contains unknown bases x but not n ) 6 else : 7 p r i n t ( no undefined bases i n sequence ) 8 9 # 10 # sequence contains undefined bases ELIF and ELSE are optional multiple ELIF are possible March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 72
73 Introduction to Python for Biologists Introduction Running code Literals and variables Numeric types Strings Exercise Lists, tuples and ranges Sets and dictionaries Convert and copy Loops Functions Branching Regular Expressions Annexes March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 73
74 Introduction to Python for Biologists Exercise URL Jupyter Notebook File: Conditionals.ipynb Download the file into the notebooks folder March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 74
75 Introduction to Python for Biologists Regular Expressions Introduction Running code Literals and variables Numeric types Strings Exercise Lists, tuples and ranges Sets and dictionaries Convert and copy Loops Functions Branching Regular Expressions Annexes March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 75
76 Introduction to Python for Biologists Regular Expressions RE: Regular Expressions I Regular expressions (called REs, or regexes, or regex patterns) are a powerful language for matching text patterns (re module) In Python a regular expression search is typically written as: 1 match = re. search ( expression, s t r i n g ) The re.search() method takes a regular expression pattern and a string and searches for that pattern within the string. If the search is successful, re.search() returns a Match object (actually class sre.sre Match ) or None otherwise. March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 76
77 Introduction to Python for Biologists Regular Expressions RE: Regular Expressions II 1 import re # import re module 2 s t r = an example word : cat!! # Example s t r i n g 3 match = re. search ( r word : \w\w\w, s t r ) # Search a p a t t e r n 4 i f match : 5 p r i n t ( found, match. group ( ) ) # found word : cat 6 else : 7 p r i n t ( did not f i n d ) In the pattern string, \w codes a character (letter, digit or underscore) The r at the start of the pattern string designates a python raw string which passes through backslashes without change. March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 77
78 Introduction to Python for Biologists Regular Expressions RE: Basic Patterns Pattern Match a, X, 9, < ordinary characters match themselves exactly. a period matches any single character except newline \w matches a word character: a letter or digit or underbar [a-za-z0-9 ] \W matches any non-word character \b boundary between word and non-word \s a single whitespace character space, newline, return, tab, form [\n \r \t \f] \S matches any non-whitespace character \t tab \n newline \r return \d decimal digit [0-9] ˆ circumflex (top hat) matches the start of a string $ dollar matches the end of a string \ inhibits the specialness of a character. So, for example, use \. to match a period Table: Regular expressions: basic patterns March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 78
79 Introduction to Python for Biologists Regular Expressions RE: Basic examples I The basic rules of RE search for a pattern within a string are: The search proceeds through the string from start to end, stopping at the first match found All of the pattern must be matched, but not all of the string If match = re.search(pat, str) is successful, match is not None and in particular match.group() is the matching text March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 79
80 Introduction to Python for Biologists Regular Expressions RE: Basic examples II 1 match = re. search ( r i i i, p i i i g ) # found 2 match. group ( ) == i i i # True 3 4 match = re. search ( r i g s, p i i i g ) # not found 5 match == None # True 6 7 match = re. search ( r.. g, p i i i g ) # found 8 match. group ( ) == i i g # True 9 10 match = re. search ( r \d\d\d, p123g ) # found 11 match. group ( ) == 123 # True match = re. search ( r ) # found 14 match. group ( ) == abc # True March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 80
81 Introduction to Python for Biologists Regular Expressions RE: Repetitions I Repetitions are defined using +, *,? and { } + means 1 or more occurrences of the pattern to its left e.g. i+ = one or more i s * means 0 or more occurrences of the pattern to its left? means match 0 or 1 occurrences of the pattern to its left curly brackets are used to specify exact number of repetitions e.g. A{5} for 5 A letters A{6,10} for 6 to 10 A letters Leftmost and Largest: First the search finds the leftmost match for the pattern, and second it tries to use up as much of the string as possible i.e. + and * go as far as possible (they are said to be greedy ). March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 81
82 Introduction to Python for Biologists Regular Expressions RE: Repetitions II 1 # simple r e p e t i t i o n s 2 re. search ( r p i +, p i i i g ). group ( ) # p i i i 3 re. search ( r p i?, ap ). group ( ) # p 4 re. search ( r p i?, a p i i ). group ( ) # p i 5 re. search ( r p i, ap ). group ( ) # p 6 re. search ( r p i, a p i i ). group ( ) # p i i 7 re. search ( r p i {3}, a p i i i i i ). group ( ) # p i i i 8 re. search ( r i +, p i i g i i i i ). group ( ) # i i (1 s t h i t only ) 9 10 # 3 d i g i t s p o s s i b l y separated by whitespaces (\ s ) 11 re. search ( r \d\s \d\s \d, xx1 2 3xx ). group ( ) # re. search ( r \d\s \d\s \d, xx12 3xx ). group ( ) # re. search ( r \d\s \d\s \d, xx123xx ). group ( ) # 123 March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 82
83 Introduction to Python for Biologists Regular Expressions RE: Sets of characters I Square brackets indicate a set of characters [ABC] matches A or B or C. The codes \w, \s etc. work inside square brackets too with the one exception that dot (.) just means a literal dot Dash indicate a range or itself if put at the end [a-z] for lowercase alphabetic characters [a-za-z] for alphabetic characters [AB-] for A, B or dash Circumflex (ˆ) at the start inverts the set [ˆAB] for any character except A or B. March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 83
84 Introduction to Python for Biologists Regular Expressions RE: Sets of characters II 1 s t r = purple a l i c e b@google. com monkey dishwasher 2 match = re. search ( r \w+@\w+, s t r ) 3 i f match : 4 p r i n t match. group ( ) ## b@google 5 6 match = re. search ( r [ \w. ]+@[ \w. ]+, s t r ) 7 i f match : 8 p r i n t match. group ( ) ## a l i c e b@google. com March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 84
85 Introduction to Python for Biologists Regular Expressions RE: Functions I RE module functions: re.match() returns a Match object if occurrence found at begining of string, None otherwise re.search() returns a Match object for 1st occurrence, None if not found re.findall() returns a list of matched sub strings, an empty list if not found re.finditer() returns an iterator on Match objects of the occurrences, an empty iterator if not found Match object methods: match.start() returns start index match.end() returns end index match.span() returns start and end index in a tuple match.group() returns matched string March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 85
86 Introduction to Python for Biologists Regular Expressions RE: Functions II 1 import re 2 seq = RPAPPDRAPDQX # A sequence 3 expr = A.{ 1, 2}D # A and D separated by 1 or 2 characters 4 5 match = re. search ( expr, seq ) 6 i f match : 7 p r i n t ( 8 match. s t a r t ( ), # s t a r t index 9 match. end ( ), # end index 10 match. span ( ), # s t a r t and end index 11 match. group ( ), # the matched s t r i n g 12 seq [ match. s t a r t ( ) : match. end ( ) ], # the matched s t r i n g 13 sep= 14 ) 15 # 2 6 ( 2, 6) APPD APPD March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 86
87 Introduction to Python for Biologists Regular Expressions RE: Functions III 1 import re 2 seq = RPAPPDRAPDQX # A sequence 3 expr = A.{ 1, 2}D # A and D separated by 1 or 2 characters 4 5 match = re. match ( expr, seq ) # Not found at begining 6 p r i n t ( match ) 7 # None 8 9 matches = re. f i n d a l l ( expr, seq ) # Found 2 occurrences 10 p r i n t ( matches ) 11 # [ APPD, APD ] matches = re. f i n d i t e r ( expr, seq ) # Found 2 occurrences 14 f o r m i n matches : # I t e r a t e over Match o b j e c t s 15 p r i n t ( m. span ( ), m. group ( ) ) # Use each Match o b j e c t 16 # ( 2, 6) APPD 17 # ( 7, 10) APD March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 87
88 Introduction to Python for Biologists Regular Expressions RE: Group Extraction Groups are defined with parentheses On a successful search match.group(): the whole match text match.group(1): match text of 1st left parenthesis match.group(2): match text of 2nd left parenthesis... 1 import re 2 s t r = purple a l i c e b@google. com monkey dishwasher 3 match = re. search ( ( [ \w. ]+)@( [ \w. ]+), s t r ) 4 i f match : 5 p r i n t ( match. group ( ) ) ## a l i c e b@google. com 6 p r i n t ( match. group ( 1 ) ) ## a l i c e b 7 p r i n t ( match. group ( 2 ) ) ## google. com March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 88
89 Introduction to Python for Biologists Regular Expressions RE: Group Extraction and Findall If the pattern includes a single set of parenthesis, then findall() returns a list of strings corresponding to that single group If the pattern includes 2 or more parenthesis groups, then instead of returning a list of strings, findall() returns a list of tuples. Each tuple represents one match of the pattern, and inside the tuple is the group(1), group(2)... data. 1 s t r = alice@google. com, monkey bob@abc. com dishwasher 2 t u p l e s = re. f i n d a l l ( r ( [ \w\. ]+)@( [ \w\. ]+), s t r ) 3 p r i n t ( t u p l e s ) 4 # [ ( a l i c e, google. com ), ( bob, abc. com ) ] 5 6 f o r t i n t u p l e s : 7 p r i n t ( t [ 0 ], t [ 1 ], sep= ) 8 # a l i c e google. com 9 # bob abc. com March 21, 2017 Johannes Gutenberg-Universität Mainz Taškova & Fontaine 89
Notater: INF3331. Veronika Heimsbakk December 4, Introduction 3
Notater: INF3331 Veronika Heimsbakk veronahe@student.matnat.uio.no December 4, 2013 Contents 1 Introduction 3 2 Bash 3 2.1 Variables.............................. 3 2.2 Loops...............................
More informationPython. chrysn
Python chrysn 2008-09-25 Introduction Structure, Language & Syntax Strengths & Weaknesses Introduction Structure, Language & Syntax Strengths & Weaknesses Python Python is an interpreted,
More informationContents. 6.1 Conditional statements 6.2 Iteration 6.3 Procedures and flow control
Introduction This reference manual gives an essentially complete description of the standard facilities available in SMP. Extensive illustrative examples are included. The same text is given without these
More informationIntroduction to Python
Introduction to Python Luis Pedro Coelho Institute for Molecular Medicine (Lisbon) Lisbon Machine Learning School II Luis Pedro Coelho (IMM) Introduction to Python Lisbon Machine Learning School II (1
More informationRegular Expressions and Finite-State Automata. L545 Spring 2008
Regular Expressions and Finite-State Automata L545 Spring 2008 Overview Finite-state technology is: Fast and efficient Useful for a variety of language tasks Three main topics we ll discuss: Regular Expressions
More informationL435/L555. Dept. of Linguistics, Indiana University Fall 2016
in in L435/L555 Dept. of Linguistics, Indiana University Fall 2016 1 / 13 in we know how to output something on the screen: print( Hello world. ) input: input() returns the input from the keyboard
More information1 Opening URLs. 2 Regular Expressions. 3 Look Back. 4 Graph Theory. 5 Crawler / Spider
1 Opening URLs 2 Regular Expressions 3 Look Back 4 Graph Theory 5 Crawler / Spider Sandeep Sadanandan (TU, Munich) Python For Fine Programmers June 5, 2009 1 / 14 Opening URLs The module used for opening
More informationIntroduction to Python
Introduction to Python Luis Pedro Coelho luis@luispedro.org @luispedrocoelho European Molecular Biology Laboratory Lisbon Machine Learning School 2015 Luis Pedro Coelho (@luispedrocoelho) Introduction
More informationPython & Numpy A tutorial
Python & Numpy A tutorial Devert Alexandre School of Software Engineering of USTC 13 February 2012 Slide 1/38 Table of Contents 1 Why Python & Numpy 2 First steps with Python 3 Fun with lists 4 Quick tour
More informationPython. Tutorial. Jan Pöschko. March 22, Graz University of Technology
Tutorial Graz University of Technology March 22, 2010 Why? is: very readable easy to learn interpreted & interactive like a UNIX shell, only better object-oriented but not religious about it slower than
More informationCOMP 204. Exceptions continued. Yue Li based on material from Mathieu Blanchette, Carlos Oliver Gonzalez and Christopher Cameron
COMP 204 Exceptions continued Yue Li based on material from Mathieu Blanchette, Carlos Oliver Gonzalez and Christopher Cameron 1 / 27 Types of bugs 1. Syntax errors 2. Exceptions (runtime) 3. Logical errors
More informationFunctions in Python L435/L555. Dept. of Linguistics, Indiana University Fall Functions in Python. Functions.
L435/L555 Dept. of Linguistics, Indiana University Fall 2016 1 / 18 What is a function? Definition A function is something you can call (possibly with some parameters, i.e., the things in parentheses),
More informationIntroduction to Python and its unit testing framework
Introduction to Python and its unit testing framework Instructions These are self evaluation exercises. If you know no python then work through these exercises and it will prepare yourself for the lab.
More informationIntroduction to Language Theory and Compilation: Exercises. Session 2: Regular expressions
Introduction to Language Theory and Compilation: Exercises Session 2: Regular expressions Regular expressions (RE) Finite automata are an equivalent formalism to regular languages (for each regular language,
More informationC++ For Science and Engineering Lecture 13
C++ For Science and Engineering Lecture 13 John Chrispell Tulane University Wednesday September 22, 2010 Logical Expressions: Sometimes you want to use logical and and logical or (even logical not) in
More information10. Finite Automata and Turing Machines
. Finite Automata and Turing Machines Frank Stephan March 2, 24 Introduction Alan Turing s th Birthday Alan Turing was a completely original thinker who shaped the modern world, but many people have never
More information2 Getting Started with Numerical Computations in Python
1 Documentation and Resources * Download: o Requirements: Python, IPython, Numpy, Scipy, Matplotlib o Windows: google "windows download (Python,IPython,Numpy,Scipy,Matplotlib" o Debian based: sudo apt-get
More informationMA/CSSE 474 Theory of Computation
MA/CSSE 474 Theory of Computation Closure properties of Regular Languages Pumping Theorem Your Questions? Previous class days' material Reading Assignments HW 6 or 7 problems Anything else 1 Regular Expressions
More informationMiniMat: Matrix Language in OCaml LLVM
Terence Lim - tl2735@columbia.edu August 17, 2016 Contents 1 Introduction 4 1.1 Goals........................................ 4 1.1.1 Flexible matrix notation......................... 4 1.1.2 Uncluttered................................
More informationAutomata and Computability. Solutions to Exercises
Automata and Computability Solutions to Exercises Fall 28 Alexis Maciel Department of Computer Science Clarkson University Copyright c 28 Alexis Maciel ii Contents Preface vii Introduction 2 Finite Automata
More informationAutomata and Computability. Solutions to Exercises
Automata and Computability Solutions to Exercises Spring 27 Alexis Maciel Department of Computer Science Clarkson University Copyright c 27 Alexis Maciel ii Contents Preface vii Introduction 2 Finite Automata
More informationCOMP 204. Object Oriented Programming (OOP) Mathieu Blanchette
COMP 204 Object Oriented Programming (OOP) Mathieu Blanchette 1 / 14 Object-Oriented Programming OOP is a way to write and structure programs to make them easier to design, understand, debug, and maintain.
More informationSets are one of the basic building blocks for the types of objects considered in discrete mathematics.
Section 2.1 Introduction Sets are one of the basic building blocks for the types of objects considered in discrete mathematics. Important for counting. Programming languages have set operations. Set theory
More information1 Alphabets and Languages
1 Alphabets and Languages Look at handout 1 (inference rules for sets) and use the rules on some examples like {a} {{a}} {a} {a, b}, {a} {{a}}, {a} {{a}}, {a} {a, b}, a {{a}}, a {a, b}, a {{a}}, a {a,
More informationComp 11 Lectures. Dr. Mike Shah. August 9, Tufts University. Dr. Mike Shah (Tufts University) Comp 11 Lectures August 9, / 34
Comp 11 Lectures Dr. Mike Shah Tufts University August 9, 2017 Dr. Mike Shah (Tufts University) Comp 11 Lectures August 9, 2017 1 / 34 Please do not distribute or host these slides without prior permission.
More informationFoundations of
91.304 Foundations of (Theoretical) Computer Science Chapter 1 Lecture Notes (Section 1.3: Regular Expressions) David Martin dm@cs.uml.edu d with some modifications by Prof. Karen Daniels, Spring 2012
More informationContext Free Grammars
Automata and Formal Languages Context Free Grammars Sipser pages 101-111 Lecture 11 Tim Sheard 1 Formal Languages 1. Context free languages provide a convenient notation for recursive description of languages.
More informationOutline. policies for the second part. with answers. MCS 260 Lecture 10.5 Introduction to Computer Science Jan Verschelde, 9 July 2014
Outline 1 midterm exam on Friday 11 July 2014 policies for the second part 2 some review questions with answers MCS 260 Lecture 10.5 Introduction to Computer Science Jan Verschelde, 9 July 2014 Intro to
More informationData Intensive Computing Handout 11 Spark: Transitive Closure
Data Intensive Computing Handout 11 Spark: Transitive Closure from random import Random spark/transitive_closure.py num edges = 3000 num vertices = 500 rand = Random(42) def generate g raph ( ) : edges
More informationUNIT II REGULAR LANGUAGES
1 UNIT II REGULAR LANGUAGES Introduction: A regular expression is a way of describing a regular language. The various operations are closure, union and concatenation. We can also find the equivalent regular
More informationDiscrete Mathematical Structures: Theory and Applications
Chapter 1: Foundations: Sets, Logic, and Algorithms Discrete Mathematical Structures: Theory and Applications Learning Objectives Learn about sets Explore various operations on sets Become familiar with
More informationComputation Theory Finite Automata
Computation Theory Dept. of Computing ITT Dublin October 14, 2010 Computation Theory I 1 We would like a model that captures the general nature of computation Consider two simple problems: 2 Design a program
More informationCS177 Spring Final exam. Fri 05/08 7:00p - 9:00p. There are 40 multiple choice questions. Each one is worth 2.5 points.
CS177 Spring 2015 Final exam Fri 05/08 7:00p - 9:00p There are 40 multiple choice questions. Each one is worth 2.5 points. Answer the questions on the bubble sheet given to you. Only the answers on the
More informationCMSC 330: Organization of Programming Languages. Pushdown Automata Parsing
CMSC 330: Organization of Programming Languages Pushdown Automata Parsing Chomsky Hierarchy Categorization of various languages and grammars Each is strictly more restrictive than the previous First described
More information9A. Iteration with range. Topics: Using for with range Summation Computing Min s Functions and for-loops A Graphics Applications
9A. Iteration with range Topics: Using for with range Summation Computing Min s Functions and for-loops A Graphics Applications Iterating Through a String s = abcd for c in s: print c Output: a b c d In
More informationRead all questions and answers carefully! Do not make any assumptions about the code other than those that are clearly stated.
Read all questions and answers carefully! Do not make any assumptions about the code other than those that are clearly stated. CS177 Fall 2014 Final - Page 2 of 38 December 17th, 10:30am 1. Consider we
More information8.2 Examples of Classes
206 8.2 Examples of Classes Don t be put off by the terminology of classes we presented in the preceding section. Designing classes that appropriately break down complex data is a skill that takes practice
More informationWhat Is a Language? Grammars, Languages, and Machines. Strings: the Building Blocks of Languages
Do Homework 2. What Is a Language? Grammars, Languages, and Machines L Language Grammar Accepts Machine Strings: the Building Blocks of Languages An alphabet is a finite set of symbols: English alphabet:
More informationMultiplying Products of Prime Powers
Problem 1: Multiplying Products of Prime Powers Each positive integer can be expressed (in a unique way, according to the Fundamental Theorem of Arithmetic) as a product of powers of the prime numbers.
More informationCS177 Fall Midterm 1. Wed 10/07 6:30p - 7:30p. There are 25 multiple choice questions. Each one is worth 4 points.
CS177 Fall 2015 Midterm 1 Wed 10/07 6:30p - 7:30p There are 25 multiple choice questions. Each one is worth 4 points. Answer the questions on the bubble sheet given to you. Only the answers on the bubble
More informationScripting Languages Fast development, extensible programs
Scripting Languages Fast development, extensible programs Devert Alexandre School of Software Engineering of USTC November 30, 2012 Slide 1/60 Table of Contents 1 Introduction 2 Dynamic languages A Python
More informationHow do regular expressions work? CMSC 330: Organization of Programming Languages
How do regular expressions work? CMSC 330: Organization of Programming Languages Regular Expressions and Finite Automata What we ve learned What regular expressions are What they can express, and cannot
More informationIV. Turing Machine. Yuxi Fu. BASICS, Shanghai Jiao Tong University
IV. Turing Machine Yuxi Fu BASICS, Shanghai Jiao Tong University Alan Turing Alan Turing (23Jun.1912-7Jun.1954), an English student of Church, introduced a machine model for effective calculation in On
More informationUsing the EartH2Observe data portal to analyse drought indicators. Lesson 4: Using Python Notebook to access and process data
Using the EartH2Observe data portal to analyse drought indicators Lesson 4: Using Python Notebook to access and process data Preface In this fourth lesson you will again work with the Water Cycle Integrator
More information6.001 Recitation 22: Streams
6.001 Recitation 22: Streams RI: Gerald Dalley, dalleyg@mit.edu, 4 May 2007 http://people.csail.mit.edu/dalleyg/6.001/sp2007/ The three chief virtues of a programmer are: Laziness, Impatience and Hubris
More informationn CS 160 or CS122 n Sets and Functions n Propositions and Predicates n Inference Rules n Proof Techniques n Program Verification n CS 161
Discrete Math at CSU (Rosen book) Sets and Functions (Rosen, Sections 2.1,2.2, 2.3) TOPICS Discrete math Set Definition Set Operations Tuples 1 n CS 160 or CS122 n Sets and Functions n Propositions and
More informationPlan for Today and Beginning Next week (Lexical Analysis)
Plan for Today and Beginning Next week (Lexical Analysis) Regular Expressions Finite State Machines DFAs: Deterministic Finite Automata Complications NFAs: Non Deterministic Finite State Automata From
More informationCSE 211. Pushdown Automata. CSE 211 (Theory of Computation) Atif Hasan Rahman
CSE 211 Pushdown Automata CSE 211 (Theory of Computation) Atif Hasan Rahman Lecturer Department of Computer Science and Engineering Bangladesh University of Engineering & Technology Adapted from slides
More informationLab 2 Worksheet. Problems. Problem 1: Geometry and Linear Equations
Lab 2 Worksheet Problems Problem : Geometry and Linear Equations Linear algebra is, first and foremost, the study of systems of linear equations. You are going to encounter linear systems frequently in
More informationIntroduction. How to use this book. Linear algebra. Mathematica. Mathematica cells
Introduction How to use this book This guide is meant as a standard reference to definitions, examples, and Mathematica techniques for linear algebra. Complementary material can be found in the Help sections
More information1 Definition of a Turing machine
Introduction to Algorithms Notes on Turing Machines CS 4820, Spring 2017 April 10 24, 2017 1 Definition of a Turing machine Turing machines are an abstract model of computation. They provide a precise,
More informationFIT100 Spring 01. Project 2. Astrological Toys
FIT100 Spring 01 Project 2 Astrological Toys In this project you will write a series of Windows applications that look up and display astrological signs and dates. The applications that will make up the
More informationLAST TIME..THE COMMAND LINE
LAST TIME..THE COMMAND LINE Maybe slightly to fast.. Finder Applications Utilities Terminal The color can be adjusted and is simply a matter of taste. Per default it is black! Dr. Robert Kofler Introduction
More informationLexical Analysis Part II: Constructing a Scanner from Regular Expressions
Lexical Analysis Part II: Constructing a Scanner from Regular Expressions CS434 Spring 2005 Department of Computer Science University of Alabama Joel Jones Copyright 2003, Keith D. Cooper, Ken Kennedy
More informationPart I: Definitions and Properties
Turing Machines Part I: Definitions and Properties Finite State Automata Deterministic Automata (DFSA) M = {Q, Σ, δ, q 0, F} -- Σ = Symbols -- Q = States -- q 0 = Initial State -- F = Accepting States
More information8. More on Iteration with For
8. More on Iteration with For Topics: Using for with range Summation Computing Min s Functions and for-loops Graphical iteration Traversing a String Character-by-Character s = abcd for c in s: print c
More informationMath Review. for the Quantitative Reasoning measure of the GRE General Test
Math Review for the Quantitative Reasoning measure of the GRE General Test www.ets.org Overview This Math Review will familiarize you with the mathematical skills and concepts that are important for solving
More informationCSE 105 THEORY OF COMPUTATION
CSE 105 THEORY OF COMPUTATION Fall 2016 http://cseweb.ucsd.edu/classes/fa16/cse105-abc/ Today's learning goals Sipser Ch 3 Trace the computation of a Turing machine using its transition function and configurations.
More informationMA/CSSE 474 Theory of Computation
MA/CSSE 474 Theory of Computation CFL Hierarchy CFL Decision Problems Your Questions? Previous class days' material Reading Assignments HW 12 or 13 problems Anything else I have included some slides online
More informationA High Level Programming Language for Quantum Computing
QUARK QUantum Analysis and Realization Kit A High Level Programming Language for Quantum Computing Team In lexicographical order: Name UNI Role Daria Jung djj2115 Verification and Validation Jamis Johnson
More informationIntroduction to Metalogic
Philosophy 135 Spring 2008 Tony Martin Introduction to Metalogic 1 The semantics of sentential logic. The language L of sentential logic. Symbols of L: Remarks: (i) sentence letters p 0, p 1, p 2,... (ii)
More informationAssignment 1: Due Friday Feb 6 at 6pm
CS1110 Spring 2015 Assignment 1: Due Friday Feb 6 at 6pm You must work either on your own or with one partner. If you work with a partner, you and your partner must first register as a group in CMS and
More informationSolutions to Exercises Chapter 2: On numbers and counting
Solutions to Exercises Chapter 2: On numbers and counting 1 Criticise the following proof that 1 is the largest natural number. Let n be the largest natural number, and suppose than n 1. Then n > 1, and
More information10. The GNFA method is used to show that
CSE 355 Midterm Examination 27 February 27 Last Name Sample ASU ID First Name(s) Ima Exam # Sample Regrading of Midterms If you believe that your grade has not been recorded correctly, return the entire
More informationMCS 260 Exam 2 13 November In order to get full credit, you need to show your work.
MCS 260 Exam 2 13 November 2015 Name: Do not start until instructed to do so. In order to get full credit, you need to show your work. You have 50 minutes to complete the exam. Good Luck! Problem 1 /15
More informationRecap DFA,NFA, DTM. Slides by Prof. Debasis Mitra, FIT.
Recap DFA,NFA, DTM Slides by Prof. Debasis Mitra, FIT. 1 Formal Language Finite set of alphabets Σ: e.g., {0, 1}, {a, b, c}, { {, } } Language L is a subset of strings on Σ, e.g., {00, 110, 01} a finite
More informationAlgorithms for Uncertainty Quantification
Technische Universität München SS 2017 Lehrstuhl für Informatik V Dr. Tobias Neckel M. Sc. Ionuț Farcaș April 26, 2017 Algorithms for Uncertainty Quantification Tutorial 1: Python overview In this worksheet,
More informationI. Numerical Computing
I. Numerical Computing A. Lectures 1-3: Foundations of Numerical Computing Lecture 1 Intro to numerical computing Understand difference and pros/cons of analytical versus numerical solutions Lecture 2
More informationCMSC 330: Organization of Programming Languages. Regular Expressions and Finite Automata
CMSC 330: Organization of Programming Languages Regular Expressions and Finite Automata CMSC330 Spring 2018 1 How do regular expressions work? What we ve learned What regular expressions are What they
More informationCS 341 Homework 16 Languages that Are and Are Not Context-Free
CS 341 Homework 16 Languages that Are and Are Not Context-Free 1. Show that the following languages are context-free. You can do this by writing a context free grammar or a PDA, or you can use the closure
More informationMost General computer?
Turing Machines Most General computer? DFAs are simple model of computation. Accept only the regular languages. Is there a kind of computer that can accept any language, or compute any function? Recall
More informationCMSC 330: Organization of Programming Languages
CMSC 330: Organization of Programming Languages Regular Expressions and Finite Automata CMSC 330 Spring 2017 1 How do regular expressions work? What we ve learned What regular expressions are What they
More informationComputer Science Introductory Course MSc - Introduction to Java
Computer Science Introductory Course MSc - Introduction to Java Lecture 1: Diving into java Pablo Oliveira ENST Outline 1 Introduction 2 Primitive types 3 Operators 4 5 Control Flow
More information(NB. Pages are intended for those who need repeated study in formal languages) Length of a string. Formal languages. Substrings: Prefix, suffix.
(NB. Pages 22-40 are intended for those who need repeated study in formal languages) Length of a string Number of symbols in the string. Formal languages Basic concepts for symbols, strings and languages:
More informationCSCI 239 Discrete Structures of Computer Science Lab 6 Vectors and Matrices
CSCI 239 Discrete Structures of Computer Science Lab 6 Vectors and Matrices This lab consists of exercises on real-valued vectors and matrices. Most of the exercises will required pencil and paper. Put
More informationCS-141 Exam 2 Review October 19, 2016 Presented by the RIT Computer Science Community
CS-141 Exam 2 Review October 19, 2016 Presented by the RIT Computer Science Community http://csc.cs.rit.edu Linked Lists 1. You are given the linked list: 1 2 3. You may assume that each node has one field
More informationSets. Introduction to Set Theory ( 2.1) Basic notations for sets. Basic properties of sets CMSC 302. Vojislav Kecman
Introduction to Set Theory ( 2.1) VCU, Department of Computer Science CMSC 302 Sets Vojislav Kecman A set is a new type of structure, representing an unordered collection (group, plurality) of zero or
More informationPetriScript Reference Manual 1.0. Alexandre Hamez Xavier Renault
PetriScript Reference Manual.0 Alexandre Hamez Xavier Renault Contents PetriScript basics 3. Using PetriScript................................. 3.2 Generalities.................................... 4.3
More informationCSE 20. Lecture 4: Introduction to Boolean algebra. CSE 20: Lecture4
CSE 20 Lecture 4: Introduction to Boolean algebra Reminder First quiz will be on Friday (17th January) in class. It is a paper quiz. Syllabus is all that has been done till Wednesday. If you want you may
More informationPractical Bioinformatics
4/24/2017 Resources Course website: http://histo.ucsf.edu/bms270/ Resources on the course website: Syllabus Papers and code (for downloading before class) Slides and transcripts (available after class)
More informationComputation and Logic Definitions
Computation and Logic Definitions True and False Also called Boolean truth values, True and False represent the two values or states an atom can assume. We can use any two distinct objects to represent
More informationCSCI-141 Exam 1 Review September 19, 2015 Presented by the RIT Computer Science Community
CSCI-141 Exam 1 Review September 19, 2015 Presented by the RIT Computer Science Community http://csc.cs.rit.edu 1. Python basics (a) The programming language Python (circle the best answer): i. is primarily
More informationToday s topics. Introduction to Set Theory ( 1.6) Naïve set theory. Basic notations for sets
Today s topics Introduction to Set Theory ( 1.6) Sets Definitions Operations Proving Set Identities Reading: Sections 1.6-1.7 Upcoming Functions A set is a new type of structure, representing an unordered
More informationA Little Logic. Propositional Logic. Satisfiability Problems. Solving Sudokus. First Order Logic. Logic Programming
A Little Logic International Center for Computational Logic Technische Universität Dresden Germany Propositional Logic Satisfiability Problems Solving Sudokus First Order Logic Logic Programming A Little
More informationExtended Introduction to Computer Science CS1001.py. Lecture 8 part A: Finding Zeroes of Real Functions: Newton Raphson Iteration
Extended Introduction to Computer Science CS1001.py Lecture 8 part A: Finding Zeroes of Real Functions: Newton Raphson Iteration Instructors: Benny Chor, Amir Rubinstein Teaching Assistants: Yael Baran,
More informationTuring machines Finite automaton no storage Pushdown automaton storage is a stack What if we give the automaton a more flexible storage?
Turing machines Finite automaton no storage Pushdown automaton storage is a stack What if we give the automaton a more flexible storage? What is the most powerful of automata? In this lecture we will introduce
More informationTheoretical Computer Science
Theoretical Computer Science Zdeněk Sawa Department of Computer Science, FEI, Technical University of Ostrava 17. listopadu 15, Ostrava-Poruba 708 33 Czech republic September 22, 2017 Z. Sawa (TU Ostrava)
More informationAutomata Theory and Formal Grammars: Lecture 1
Automata Theory and Formal Grammars: Lecture 1 Sets, Languages, Logic Automata Theory and Formal Grammars: Lecture 1 p.1/72 Sets, Languages, Logic Today Course Overview Administrivia Sets Theory (Review?)
More informationSection 1: Sets and Interval Notation
PART 1 From Sets to Functions Section 1: Sets and Interval Notation Introduction Set concepts offer the means for understanding many different aspects of mathematics and its applications to other branches
More informationThe Turing Machine. CSE 211 (Theory of Computation) The Turing Machine continued. Turing Machines
The Turing Machine Turing Machines Professor Department of Computer Science and Engineering Bangladesh University of Engineering and Technology Dhaka-1000, Bangladesh The Turing machine is essentially
More informationMeta-programming & you
Meta-programming & you Robin Message Cambridge Programming Research Group 10 th May 2010 What s meta-programming about? 1 result=somedb. customers. select 2 { first_name+ +last_name } 3 where name LIKE
More informationDigital Systems and Information Part II
Digital Systems and Information Part II Overview Arithmetic Operations General Remarks Unsigned and Signed Binary Operations Number representation using Decimal Codes BCD code and Seven-Segment Code Text
More informationChapter 0 Introduction. Fourth Academic Year/ Elective Course Electrical Engineering Department College of Engineering University of Salahaddin
Chapter 0 Introduction Fourth Academic Year/ Elective Course Electrical Engineering Department College of Engineering University of Salahaddin October 2014 Automata Theory 2 of 22 Automata theory deals
More information1.2 The Role of Variables
1.2 The Role of Variables variables sentences come in several flavors true false conditional In this section, a name is given to mathematical sentences that are sometimes true, sometimes false they are
More informationFinite State Transducers
Finite State Transducers Eric Gribkoff May 29, 2013 Original Slides by Thomas Hanneforth (Universitat Potsdam) Outline 1 Definition of Finite State Transducer 2 Examples of FSTs 3 Definition of Regular
More informationModern Functional Programming and Actors With Scala and Akka
Modern Functional Programming and Actors With Scala and Akka Aaron Kosmatin Computer Science Department San Jose State University San Jose, CA 95192 707-508-9143 akosmatin@gmail.com Abstract For many years,
More informationRigorous Software Development CSCI-GA
Rigorous Software Development CSCI-GA 3033-009 Instructor: Thomas Wies Spring 2013 Lecture 2 Alloy Analyzer Analyzes micro models of software Helps to identify key properties of a software design find
More informationLectures about Python, useful both for beginners and experts, can be found at (http://scipy-lectures.github.io).
Random Matrix Theory (Sethna, "Entropy, Order Parameters, and Complexity", ex. 1.6, developed with Piet Brouwer) 2016, James Sethna, all rights reserved. This is an ipython notebook. This hints file is
More informationWith Question/Answer Animations. Chapter 2
With Question/Answer Animations Chapter 2 Chapter Summary Sets The Language of Sets Set Operations Set Identities Functions Types of Functions Operations on Functions Sequences and Summations Types of
More informationThe Pip Project. PLT (W4115) Fall Frank Wallingford
The Pip Project PLT (W4115) Fall 2008 Frank Wallingford (frw2106@columbia.edu) Contents 1 Introduction 4 2 Original Proposal 5 2.1 Introduction....................................................... 5
More information