Pathline: A Tool For Comparative Functional Genomics. Miriah Meyer 1,2

Transcription

1 Pathline: A Tool For Comparative Functional Genomics. Miriah Meyer 1,2, Bang Wong 2, Mark Styczynski 3, Tamara Munzner 4, Hanspeter Pfister 1. 1 Harvard University, 2 Broad Institute, 3 Georgia Institute of Technology, 4 University of British Columbia

2 roadmap background data & tasks Pathline case studies conclusions and future work

3 background

4 functional genomics how do genes work together to perform different functions in a cell?

5 functional genomics data gene expression molecular pathways

6 gene expression is the measured level of how much a gene is on or off... a single quantitative value 0.2

7 gene expression is the measured level of how much a gene is on or off... a single quantitative value biologists measure it for many genes

8 gene expression is the measured level of how much a gene is on or off... a single quantitative value biologists measure it for many genes... in many samples (time points, tissue types, species)

9 gene expression is the measured level of how much a gene is on or off... a single quantitative value biologists measure it for many genes... in many samples (time points, tissue types, species) visualized with heatmaps [Wilkinson09] [Saldanha04] [Seo02] [Eisen98] [Gehlenborg10] [Weinstein08] encode value with color

10 gene expression is the measured level of how much a gene is on or off... a single quantitative value it is measured for many genes... in many samples (time points, tissue types, species) visualized with heatmaps [Wilkinson09] [Saldanha04] [Seo02] [Eisen98] [Gehlenborg10] [Weinstein08] encode value with color augmented with clustering [Eisen98]

11 the functioning of a cell is controlled by many interrelated chemical reactions performed by genes [diagram: input, genes, output] genes = cell function

12

13 glycolysis tca cycle pathways

14 [Kim09] [Bourqui09] [Barsky08] [Letunic08] [Junker06] [Alm05] [Mlecnik05] [Shannon03]

15 [Kim09] [Bourqui09] [Barsky08] [Letunic08] [Junker06] [Alm05] [Mlecnik05] [Shannon03]

16 [Kim09] [Bourqui09] [Barsky08] [Letunic08] [Junker06] [Alm05] [Mlecnik05] [Shannon03] bioinformatics.cnmcresearch.org/geneshelf

17 functional genomics how do genes work together to perform different functions in a cell? comparative functional genomics how do the gene interactions vary across different species?

18 collaborators: Regev Lab at the Broad Institute biology: metabolism in yeast data: multiple genes multiple time points multiple related species multiple pathways problem: existing tools can only look at a subset of this data comparative functional genomics how do the gene interactions vary across different species?

19 contributions Pathline first interactive tool for visualizing multiple genes, time points, species, and pathways linearized pathway representation for comparing quantitative data along a pathway curvemap visual encoding of temporal gene expression validation case studies describing efficiency gains and new biological findings

20 data & tasks

21 metabolic pathways (glycolysis, TCA cycle): 10 to 50 pathways of interest, inputs/outputs called metabolites. gene expression: 6000 genes and 140 metabolites, 6 time points, 14 species of yeast. phylogeny. similarity scores: aggregate time series for a gene/metabolite over species, giving the similarity of expression across species (e.g. 0.83); aggregate: Pearson, Spearman, others
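The similarity score on this slide can be sketched in a few lines. The slide names Pearson and Spearman as possible aggregates but does not say how the per-species time series are combined, so the mean over all species pairs used below is an assumption, as are the function names and the expression values.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # assumption: a flat series counts as uncorrelated
    return cov / (sx * sy)


def similarity_score(series_by_species):
    """Mean pairwise Pearson correlation over all species pairs.

    Averaging over pairs is an assumed aggregate, not taken from the slides.
    """
    pairs = list(combinations(series_by_species.values(), 2))
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)


# Hypothetical expression values for one gene: 3 species x 6 time points.
g1 = {
    "s1": [0.2, 0.5, 1.0, 0.7, 0.1, -0.3],
    "s2": [0.1, 0.4, 0.9, 0.8, 0.2, -0.2],
    "s3": [0.3, 0.6, 1.1, 0.6, 0.0, -0.4],
}
score = similarity_score(g1)
```

A score near 1.0 means the gene behaves similarly over time in every species; swapping `pearson` for a rank correlation would give the Spearman variant.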

22 tasks 1. study expression data as a time series look for peaks, valleys, time shifts 2. detailed comparison of a limited number of time series filter using pathways filter again using genes or species 3. comparison of similarity scores of genes along a pathway(s) 4. comparison of multiple similarity scores
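Task 1 (peaks, valleys, time shifts) can be illustrated with a small sketch. The slides describe this as something the biologists do visually; the functions below are hypothetical helpers showing what those features mean numerically, and the mean-squared-difference criterion for the shift is an assumption.

```python
def peak_and_valley(series):
    """Return (index of the maximum, index of the minimum) of a time series."""
    indices = range(len(series))
    peak = max(indices, key=series.__getitem__)
    valley = min(indices, key=series.__getitem__)
    return peak, valley


def best_shift(a, b, max_lag=2):
    """Lag (in time points) at which b best matches a.

    For each lag, a[i] is compared with b[i + lag] over the overlapping
    time points; the lag with the smallest mean squared difference wins.
    """
    def cost(lag):
        pairs = [(a[i], b[i + lag]) for i in range(len(a)) if 0 <= i + lag < len(b)]
        return sum((x - y) ** 2 for x, y in pairs) / len(pairs)

    return min(range(-max_lag, max_lag + 1), key=cost)


# Hypothetical series for one gene in two species; s2 lags s1 by one time point.
s1 = [0.0, 0.5, 1.0, 0.4, 0.0, -0.2]
s2 = [-0.1, 0.0, 0.5, 1.0, 0.4, 0.0]
shift = best_shift(s1, s2)
```

With only 6 time points per series, a small `max_lag` keeps the overlap long enough for the comparison to be meaningful.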

23 Pathline

24 [Pathline overview screenshot] metabolic pathways

25 [Pathline overview screenshot] similarity scores

26 [Pathline overview screenshot] similarity scores

27 [Pathline overview screenshot] similarity scores

28 [Pathline overview screenshot] gene expression

29 [Pathline overview screenshot] phylogeny

30 Pathline design decisions

31 encode quantitative values with spatial position [Cleveland84] [Lam07]

32 encode quantitative values with spatial position [Cleveland84] [Lam07] instead of a.. topological layout linearized pathway

33 encode quantitative values with spatial position [Cleveland84] [Lam07] instead of a.. conditioned heatmap curvemap (heatmap image courtesy of M. Styczynski, from JavaTreeview, jtreeview.sourceforge.net/)

34 Pathline linearized pathway representation

35 linearized pathway representation 0.0 pathway similarity score 1.0 common axes to compare similarity scores

36 linearized pathway representation pathway similarity score common axes to compare similarity scores bars and circles

37 linearized pathway representation pathway similarity score common axes to compare similarity scores bars and circles

38 linearized pathway representation pathway similarity score common axes to compare similarity scores bars and circles visual layer for attenuation color-code gene direction

39 linearized pathway representation 0.0 pathway similarity score 1.0 common axes to compare similarity scores bars and circles visual layer for attenuation color-code gene direction multiple similarity scores

40 linearized pathway representation common axes to compare similarity scores bars and circles visual layer for attenuation color-code gene direction multiple similarity scores multiple pathways

41 unroll from pathway to ordered list of nodes: unroll and cut, reinsert, stylized marks and shared coordinate frame
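One way to picture "unroll and cut" is a depth-first walk over the pathway graph. The slides do not give an algorithm, and the future-work slide lists automating linearization as an open item, so the sketch below is only an illustrative heuristic; the function name, the graph encoding, and the toy pathway are all hypothetical.

```python
def linearize(graph, start):
    """Depth-first ordering of the nodes reachable from start.

    Returns (order, cut_edges). order is the linear node list; cut_edges are
    the edges whose endpoints do not end up adjacent in that list, i.e. the
    links that would have to be cut and shown as secondary information.
    """
    order, seen = [], set()
    stack = [start]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        order.append(node)
        # Reverse so that the first-listed successor is visited first.
        stack.extend(reversed(graph.get(node, [])))
    position = {n: i for i, n in enumerate(order)}
    cut_edges = [
        (u, v)
        for u in order
        for v in graph.get(u, [])
        if position[v] != position[u] + 1
    ]
    return order, cut_edges


# Toy pathway: metabolites m1..m3 and genes g1..g3, with a branch at m2.
pathway = {
    "m1": ["g1"],
    "g1": ["m2"],
    "m2": ["g2", "g3"],
    "g2": ["m3"],
    "g3": ["m3"],
    "m3": [],
}
order, cut_edges = linearize(pathway, "m1")
```

The ordered list is what gets laid out along one axis, which frees the other axis for the similarity scores; the cut edges are the topology that becomes secondary.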

42 linearized pathway representation putting it together... use spatial position for similarity scores instead of topology topology is secondary

43 Pathline curvemap

44 curvemap inspired by heatmaps base visual unit is a curve

45 curvemap inspired by heatmaps base visual unit is a curve filled, framed line charts to enhance shape perception

46 curvemap inspired by heatmaps base visual unit is a curve filled, framed line charts to enhance shape perception rows are species columns are genes/metabolites

47 curvemap inspired by heatmaps base visual unit is a curve filled, framed line charts to enhance shape perception rows are species columns are genes/metabolites overlays to enhance trend perception
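The curvemap layout described above (rows are species, columns are genes/metabolites, each cell a filled line chart) can be sketched as plain geometry. This is not Pathline's implementation: the function names, the cell size, the shared value scale across cells, and the choice to close the fill along the bottom of the cell are assumptions for illustration.

```python
def cell_polygon(series, x0, y0, w, h, vmin, vmax):
    """Outline of a filled line chart for one cell, closed along the cell's bottom edge.

    Assumes at least two time points and vmax > vmin.
    """
    n = len(series)
    curve = []
    for i, v in enumerate(series):
        x = x0 + w * i / (n - 1)
        y = y0 + h * (1 - (v - vmin) / (vmax - vmin))  # screen y grows downward
        curve.append((x, y))
    return [(x0, y0 + h)] + curve + [(x0 + w, y0 + h)]


def curvemap_layout(data, species, genes, w=60, h=30):
    """Map each (species, gene) cell to its polygon.

    data[(species, gene)] is a time series; rows are species, columns are genes.
    """
    values = [v for series in data.values() for v in series]
    vmin, vmax = min(values), max(values)  # assumed: one scale shared by all cells
    return {
        (sp, g): cell_polygon(data[(sp, g)], c * w, r * h, w, h, vmin, vmax)
        for r, sp in enumerate(species)
        for c, g in enumerate(genes)
    }


# Hypothetical data: 2 species x 2 genes, 3 time points each.
data = {
    ("s1", "g1"): [0.0, 1.0, 0.5],
    ("s1", "g2"): [0.2, 0.4, 0.6],
    ("s2", "g1"): [0.1, 0.9, 0.4],
    ("s2", "g2"): [0.6, 0.4, 0.2],
}
layout = curvemap_layout(data, ["s1", "s2"], ["g1", "g2"])
```

Each polygon can be handed to any drawing library as a filled shape, with the cell rectangle drawn as the frame.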

48 Demo

49 case studies

50 [Pathline screenshot] whole genome duplication both genes one gene

51 [Pathline screenshot] whole genome duplication both genes one gene

52 [Pathline screenshot] whole genome duplication both genes one gene

53 [Pathline screenshot] whole genome duplication both genes one gene

54 gene-level relationships [Pathline screenshot]

55 gene-level relationships [Pathline screenshot]

56 gene-level relationships [Pathline screenshot]

57 gene-level relationships [Pathline screenshot]

58 gene-level relationships [Pathline screenshot]

59 gene-level relationships [Pathline screenshot]

60 conclusions and future work

61 conclusions Pathline: first interactive tool for comparative functional genomics multiple: genes, time points, species, and pathways two new visual encodings curvemap for expression data with multiple dimensions linearized pathway representation for comparing quantitative data case studies: efficiency gains and new discoveries

62 future work automate pathway selection and linearization continue with Regev Lab: more data types apply ideas to other biological systems Pathline linearized pathway representation curvemaps beyond biology: curvemap vs heatmap

63 questions? acknowledgements Regev lab: Michelle Chan, Courtney French, Jay Konieczka, Jenna Pfiffner, Aviv Regev, and Dawn Thompson. Data Visualization Initiative at the Broad Institute. National Science Foundation under Grant [...] to the Computing Research Association for the CIFellows Project


More information

Dynamic modeling and analysis of cancer cellular network motifs

Dynamic modeling and analysis of cancer cellular network motifs SUPPLEMENTARY MATERIAL 1: Dynamic modeling and analysis of cancer cellular network motifs Mathieu Cloutier 1 and Edwin Wang 1,2* 1. Computational Chemistry and Bioinformatics Group, Biotechnology Research

More information

10-810: Advanced Algorithms and Models for Computational Biology. Optimal leaf ordering and classification

10-810: Advanced Algorithms and Models for Computational Biology. Optimal leaf ordering and classification 10-810: Advanced Algorithms and Models for Computational Biology Optimal leaf ordering and classification Hierarchical clustering As we mentioned, its one of the most popular methods for clustering gene

More information

Dr. Amira A. AL-Hosary

Dr. Amira A. AL-Hosary Phylogenetic analysis Amira A. AL-Hosary PhD of infectious diseases Department of Animal Medicine (Infectious Diseases) Faculty of Veterinary Medicine Assiut University-Egypt Phylogenetic Basics: Biological

More information

Clustering & microarray technology

Clustering & microarray technology Clustering & microarray technology A large scale way to measure gene expression levels. Thanks to Kevin Wayne, Matt Hibbs, & SMD for a few of the slides 1 Why is expression important? Proteins Gene Expression

More information

A Multiobjective GO based Approach to Protein Complex Detection

A Multiobjective GO based Approach to Protein Complex Detection Available online at www.sciencedirect.com Procedia Technology 4 (2012 ) 555 560 C3IT-2012 A Multiobjective GO based Approach to Protein Complex Detection Sumanta Ray a, Moumita De b, Anirban Mukhopadhyay

More information

Analysis of Complex Systems

Analysis of Complex Systems Analysis of Complex Systems Lecture 1: Introduction Marcus Kaiser m.kaiser@ncl.ac.uk www.dynamic-connectome.org Preliminaries - 1 Lecturers Dr Marcus Kaiser, m.kaiser@ncl.ac.uk Practicals Frances Hutchings

More information

CS612 - Algorithms in Bioinformatics

CS612 - Algorithms in Bioinformatics Fall 2017 Databases and Protein Structure Representation October 2, 2017 Molecular Biology as Information Science > 12, 000 genomes sequenced, mostly bacterial (2013) > 5x10 6 unique sequences available

More information

Evidence for dynamically organized modularity in the yeast protein-protein interaction network

Evidence for dynamically organized modularity in the yeast protein-protein interaction network Evidence for dynamically organized modularity in the yeast protein-protein interaction network Sari Bombino Helsinki 27.3.2007 UNIVERSITY OF HELSINKI Department of Computer Science Seminar on Computational

More information

A taxonomy of visualization tasks for the analysis of biological pathway data

A taxonomy of visualization tasks for the analysis of biological pathway data The Author(s) BMC Bioinformatics 2016, 18(Suppl 2):21 DOI 10.1186/s12859-016-1443-5 RESEARCH A taxonomy of visualization tasks for the analysis of biological pathway data Paul Murray 1*,FintanMcGee 2 and

More information

Genome Annotation. Bioinformatics and Computational Biology. Genome sequencing Assembly. Gene prediction. Protein targeting.

Genome Annotation. Bioinformatics and Computational Biology. Genome sequencing Assembly. Gene prediction. Protein targeting. Genome Annotation Bioinformatics and Computational Biology Genome Annotation Frank Oliver Glöckner 1 Genome Analysis Roadmap Genome sequencing Assembly Gene prediction Protein targeting trna prediction

More information

BTRY 7210: Topics in Quantitative Genomics and Genetics

BTRY 7210: Topics in Quantitative Genomics and Genetics BTRY 7210: Topics in Quantitative Genomics and Genetics Jason Mezey Biological Statistics and Computational Biology (BSCB) Department of Genetic Medicine jgm45@cornell.edu February 12, 2015 Lecture 3:

More information

6.047 / Computational Biology: Genomes, Networks, Evolution Fall 2008

6.047 / Computational Biology: Genomes, Networks, Evolution Fall 2008 MIT OpenCourseWare http://ocw.mit.edu 6.047 / 6.878 Computational Biology: Genomes, Networks, Evolution Fall 2008 For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.

More information

Preface. Contributors

Preface. Contributors CONTENTS Foreword Preface Contributors PART I INTRODUCTION 1 1 Networks in Biology 3 Björn H. Junker 1.1 Introduction 3 1.2 Biology 101 4 1.2.1 Biochemistry and Molecular Biology 4 1.2.2 Cell Biology 6

More information

Protein Complex Identification by Supervised Graph Clustering

Protein Complex Identification by Supervised Graph Clustering Protein Complex Identification by Supervised Graph Clustering Yanjun Qi 1, Fernanda Balem 2, Christos Faloutsos 1, Judith Klein- Seetharaman 1,2, Ziv Bar-Joseph 1 1 School of Computer Science, Carnegie

More information

5/4/05 Biol 473 lecture

5/4/05 Biol 473 lecture 5/4/05 Biol 473 lecture animals shown: anomalocaris and hallucigenia 1 The Cambrian Explosion - 550 MYA THE BIG BANG OF ANIMAL EVOLUTION Cambrian explosion was characterized by the sudden and roughly simultaneous

More information

2 GENE FUNCTIONAL SIMILARITY. 2.1 Semantic values of GO terms

2 GENE FUNCTIONAL SIMILARITY. 2.1 Semantic values of GO terms Bioinformatics Advance Access published March 7, 2007 The Author (2007). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oxfordjournals.org

More information

Supplementary Information

Supplementary Information Supplementary Information Supplementary Figure 1. Schematic pipeline for single-cell genome assembly, cleaning and annotation. a. The assembly process was optimized to account for multiple cells putatively

More information

Applying Visual Analytics Methods to Spatial Time Series Data: Forest Fires, Phone Calls,

Applying Visual Analytics Methods to Spatial Time Series Data: Forest Fires, Phone Calls, Applying Visual Analytics Methods to Spatial Time Series Data: Forest Fires, Phone Calls, Dr. & Dr. Natalia Andrienko In collaboration with TU Darmstadt and Univ. Constance, DFG Priority Research Program

More information

Visualization. Maneesh Agrawala and Michael Bernstein CS 247

Visualization. Maneesh Agrawala and Michael Bernstein CS 247 Visualization Maneesh Agrawala and Michael Bernstein CS 247 Topics Why do we create visualizations? Data and image Estimating magnitude Deconstructions What is Visualization? Definition [www.oed.com] 1.

More information

Genomic Comparison of Bacterial Species Based on Metabolic Characteristics

Genomic Comparison of Bacterial Species Based on Metabolic Characteristics Genomic Comparison of Bacterial Species Based on Metabolic Characteristics Gaurav Jain 1, Haozhu Wang 1, Li Liao 1*, E. Fidelma Boyd 2* 1 Department of Computer and Information Sciences University of Delaware

More information

Learning in Bayesian Networks

Learning in Bayesian Networks Learning in Bayesian Networks Florian Markowetz Max-Planck-Institute for Molecular Genetics Computational Molecular Biology Berlin Berlin: 20.06.2002 1 Overview 1. Bayesian Networks Stochastic Networks

More information

From Arabidopsis roots to bilinear equations

From Arabidopsis roots to bilinear equations From Arabidopsis roots to bilinear equations Dustin Cartwright 1 October 22, 2008 1 joint with Philip Benfey, Siobhan Brady, David Orlando (Duke University) and Bernd Sturmfels (UC Berkeley), research

More information

Phylogenetic inference

Phylogenetic inference Phylogenetic inference Bas E. Dutilh Systems Biology: Bioinformatic Data Analysis Utrecht University, March 7 th 016 After this lecture, you can discuss (dis-) advantages of different information types

More information

Advanced Statistical Methods: Beyond Linear Regression

Advanced Statistical Methods: Beyond Linear Regression Advanced Statistical Methods: Beyond Linear Regression John R. Stevens Utah State University Notes 3. Statistical Methods II Mathematics Educators Worshop 28 March 2009 1 http://www.stat.usu.edu/~jrstevens/pcmi

More information

Text mining and natural language analysis. Jefrey Lijffijt

Text mining and natural language analysis. Jefrey Lijffijt Text mining and natural language analysis Jefrey Lijffijt PART I: Introduction to Text Mining Why text mining The amount of text published on paper, on the web, and even within companies is inconceivably

More information

Computational methods for predicting protein-protein interactions

Computational methods for predicting protein-protein interactions Computational methods for predicting protein-protein interactions Tomi Peltola T-61.6070 Special course in bioinformatics I 3.4.2008 Outline Biological background Protein-protein interactions Computational

More information

Chapter 5: Microarray Techniques

Chapter 5: Microarray Techniques Chapter 5: Microarray Techniques 5.2 Analysis of Microarray Data Prof. Yechiam Yemini (YY) Computer Science Department Columbia University Normalization Clustering Overview 2 1 Processing Microarray Data

More information

NOISE-ROBUST SOFT CLUSTERING OF GENE EXPRESSION TIME-COURSE DATA

NOISE-ROBUST SOFT CLUSTERING OF GENE EXPRESSION TIME-COURSE DATA Journal of Bioinformatics and Computational Biology Vol., No. 4 (5) 965 988 c Imperial College Press NOISE-ROBUST SOFT CLUSTERING OF GENE EXPRESSION TIME-COURSE DATA MATTHIAS E. FUTSCHIK, Institute of

More information

Compounding insights Thermo Scientific Compound Discoverer Software

Compounding insights Thermo Scientific Compound Discoverer Software Compounding insights Thermo Scientific Compound Discoverer Software Integrated, complete, toolset solves small-molecule analysis challenges Thermo Scientific Orbitrap mass spectrometers produce information-rich

More information

Homework 5. (due Wednesday 8 th Nov midnight)

Homework 5. (due Wednesday 8 th Nov midnight) Homework (due Wednesday 8 th Nov midnight) Use this definition for Column Space of a Matrix Column Space of a matrix A is the set ColA of all linear combinations of the columns of A. In other words, if

More information

V14 Graph connectivity Metabolic networks

V14 Graph connectivity Metabolic networks V14 Graph connectivity Metabolic networks In the first half of this lecture section, we use the theory of network flows to give constructive proofs of Menger s theorem. These proofs lead directly to algorithms

More information

Next Generation Computational Chemistry Tools to Predict Toxicity of CWAs

Next Generation Computational Chemistry Tools to Predict Toxicity of CWAs Next Generation Computational Chemistry Tools to Predict Toxicity of CWAs William (Bill) Welsh welshwj@umdnj.edu Prospective Funding by DTRA/JSTO-CBD CBIS Conference 1 A State-wide, Regional and National

More information

Gene Ontology and overrepresentation analysis

Gene Ontology and overrepresentation analysis Gene Ontology and overrepresentation analysis Kjell Petersen J Express Microarray analysis course Oslo December 2009 Presentation adapted from Endre Anderssen and Vidar Beisvåg NMC Trondheim Overview How

More information

Written Exam 15 December Course name: Introduction to Systems Biology Course no

Written Exam 15 December Course name: Introduction to Systems Biology Course no Technical University of Denmark Written Exam 15 December 2008 Course name: Introduction to Systems Biology Course no. 27041 Aids allowed: Open book exam Provide your answers and calculations on separate

More information

Introduction to clustering methods for gene expression data analysis

Introduction to clustering methods for gene expression data analysis Introduction to clustering methods for gene expression data analysis Giorgio Valentini e-mail: valentini@dsi.unimi.it Outline Levels of analysis of DNA microarray data Clustering methods for functional

More information

Structure and Centrality of the Largest Fully Connected Cluster in Protein-Protein Interaction Networks

Structure and Centrality of the Largest Fully Connected Cluster in Protein-Protein Interaction Networks 22 International Conference on Environment Science and Engieering IPCEE vol.3 2(22) (22)ICSIT Press, Singapoore Structure and Centrality of the Largest Fully Connected Cluster in Protein-Protein Interaction

More information

Introduction to Bioinformatics. Shifra Ben-Dor Irit Orr

Introduction to Bioinformatics. Shifra Ben-Dor Irit Orr Introduction to Bioinformatics Shifra Ben-Dor Irit Orr Lecture Outline: Technical Course Items Introduction to Bioinformatics Introduction to Databases This week and next week What is bioinformatics? A

More information

Chapter 18 Lecture. Concepts of Genetics. Tenth Edition. Developmental Genetics

Chapter 18 Lecture. Concepts of Genetics. Tenth Edition. Developmental Genetics Chapter 18 Lecture Concepts of Genetics Tenth Edition Developmental Genetics Chapter Contents 18.1 Differentiated States Develop from Coordinated Programs of Gene Expression 18.2 Evolutionary Conservation

More information

Phylogenetic Analysis of Molecular Interaction Networks 1

Phylogenetic Analysis of Molecular Interaction Networks 1 Phylogenetic Analysis of Molecular Interaction Networks 1 Mehmet Koyutürk Case Western Reserve University Electrical Engineering & Computer Science 1 Joint work with Sinan Erten, Xin Li, Gurkan Bebek,

More information

CS444/544: Midterm Review. Carlos Scheidegger

CS444/544: Midterm Review. Carlos Scheidegger CS444/544: Midterm Review Carlos Scheidegger D3: DATA-DRIVEN DOCUMENTS The essential idea D3 creates a two-way association between elements of your dataset and entries in the DOM D3 operates on selections

More information

Agilent MassHunter Profinder: Solving the Challenge of Isotopologue Extraction for Qualitative Flux Analysis

Agilent MassHunter Profinder: Solving the Challenge of Isotopologue Extraction for Qualitative Flux Analysis Agilent MassHunter Profinder: Solving the Challenge of Isotopologue Extraction for Qualitative Flux Analysis Technical Overview Introduction Metabolomics studies measure the relative abundance of metabolites

More information

6-2 Matrix Multiplication, Inverses and Determinants

6-2 Matrix Multiplication, Inverses and Determinants Find AB and BA, if possible. 1. A = A = ; A is a 1 2 matrix and B is a 2 2 matrix. Because the number of columns of A is equal to the number of rows of B, AB exists. To find the first entry of AB, find

More information

Comparative Bioinformatics Midterm II Fall 2004

Comparative Bioinformatics Midterm II Fall 2004 Comparative Bioinformatics Midterm II Fall 2004 Objective Answer, part I: For each of the following, select the single best answer or completion of the phrase. (3 points each) 1. Deinococcus radiodurans

More information

Introduction to clustering methods for gene expression data analysis

Introduction to clustering methods for gene expression data analysis Introduction to clustering methods for gene expression data analysis Giorgio Valentini e-mail: valentini@dsi.unimi.it Outline Levels of analysis of DNA microarray data Clustering methods for functional

More information

Lecture 3: A basic statistical concept

Lecture 3: A basic statistical concept Lecture 3: A basic statistical concept P value In statistical hypothesis testing, the p value is the probability of obtaining a result at least as extreme as the one that was actually observed, assuming

More information

The Accuracy of Network Visualizations. Kevin W. Boyack SciTech Strategies, Inc.

The Accuracy of Network Visualizations. Kevin W. Boyack SciTech Strategies, Inc. The Accuracy of Network Visualizations Kevin W. Boyack SciTech Strategies, Inc. kboyack@mapofscience.com Overview Science mapping history Conceptual Mapping Early Bibliometric Maps Recent Bibliometric

More information

Dynamic optimisation identifies optimal programs for pathway regulation in prokaryotes. - Supplementary Information -

Dynamic optimisation identifies optimal programs for pathway regulation in prokaryotes. - Supplementary Information - Dynamic optimisation identifies optimal programs for pathway regulation in prokaryotes - Supplementary Information - Martin Bartl a, Martin Kötzing a,b, Stefan Schuster c, Pu Li a, Christoph Kaleta b a

More information