Integrating and Ranking Uncertain Scientific Data
- Percival Pearson
- 6 years ago
Transcription
1 Integrating and Ranking Uncertain Scientific Data. Jan 19. Wolfgang Gatterbauer (2). Based on joint work with: Todd Detwiler (1), Abhay Jha (2), Brent Louie (1), Dan Suciu (2), and Peter Tarczy-Hornoch (1). Affiliations: (1) Biomedical and Health Informatics, (2) Computer Science and Engineering, University of Washington.
2 Motivation: Retrieving relevant information across several databases. [Figure: the keyword ABCC8 is expanded across DB1, DB2, DB3, and DB4 into a list of results.] Problem: multiple expansions across different databases can quickly lead to many less relevant results. Question: how can we prune or rank those results?
3 Agenda: How to model uncertainty in data integration? How do we rank? How well, how fast, how robust on real data? A short database research point of view.
4 The 4 Probabilistic Metrics in UII. Granularity (schema or instance) by E/R element. Entity: $p_s$ (schema), $p_r$ (instance). Relationship: $q_s$ (schema), $q_r$ (instance).
5 Example Belief Metrics. [Figure: a schema graph with node weights $p_s$ and edge weights $q_s$, and an instance graph over entities $E_0$ to $E_6$ with node weights $p_r$ and edge weights $q_r$.] Final scores: $p = p_s \cdot p_r$, $q = q_s \cdot q_r$.
6 Translation of Uncertainties into Probabilistic Weights. We use domain experts to quantify and transform data uncertainties into the 4 types of probabilistic weights. [Table: example transformations.]
7 How can we assign some score (here the color)... Source: Todd
8 ... that allows ranking? Source: Todd
9 Agenda: How to model uncertainty in data integration? How do we rank? How well, how fast, how robust on real data? A short database research point of view.
10 Network Reliability Theory (source-target reachability). Source-target reachability: the probability that a node is reachable from the start (query) node. [Figure: a query node connected to Function #1 and Function #2; edge weights not recoverable.] Function #2 = … = 0.81. Function #1 = 1 - Prob(all paths failed) = 1 - (…)(…) = …
11 Incorporating Uncertainty: Network Reliability Theory. Score = probability that an answer node is reachable from the start (query) node. [Figure: a graph from s to t with node weights p and edge weights q.] Problem: computing this score is #P-hard.
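Since the exact score is hard to compute in general, a naive Monte Carlo estimate is one way to approximate it. The sketch below is only an illustration under stated assumptions: the graph, its weights, and the function name `estimate_reachability` are hypothetical and not taken from the slides. Each simulation keeps every node with probability p and every edge with probability q, then checks whether the target is reachable from the source.

```python
import random
from collections import deque


def estimate_reachability(nodes, edges, source, target, n_sims=20_000, seed=0):
    """Naive Monte Carlo estimate of source-target reachability.

    nodes: {name: p}   probability that a node is present (assumed independent)
    edges: {(u, v): q} probability that a directed edge is present
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        # Sample one possible world: keep each node and edge independently.
        alive = {n for n, p in nodes.items() if rng.random() < p}
        if source not in alive or target not in alive:
            continue
        adjacency = {}
        for (u, v), q in edges.items():
            if u in alive and v in alive and rng.random() < q:
                adjacency.setdefault(u, []).append(v)
        # Breadth-first search from the source in the sampled world.
        seen = {source}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adjacency.get(u, []):
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        hits += target in seen
    return hits / n_sims


# Hypothetical example: two independent two-edge paths from s to t.
example_nodes = {"s": 1.0, "a": 1.0, "b": 1.0, "t": 1.0}
example_edges = {("s", "a"): 0.9, ("a", "t"): 0.9,
                 ("s", "b"): 0.9, ("b", "t"): 0.9}
estimate = estimate_reachability(example_nodes, example_edges, "s", "t")
print(f"estimated reachability of t: {estimate:.3f}")
```

For this hypothetical graph the exact value is 1 - (1 - 0.81)(1 - 0.81) = 0.9639, so the estimate should land close to that number; the accuracy depends on the number of simulations.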
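12 Why is reliability (= reachability) hard? The following graph is nasty (= hard)! It can come in different forms: a Wheatstone bridge, or chains of relationships such as 1:n, n:m, n:1 or 1:n, 1:n, n:1, n:1. [Figure: example graph with its reachability score; the value is not recoverable.]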
13 A closed solution is sometimes possible. Detail: gene ABCC8, upstream node.
14 Techniques to perform probabilistic scoring: naive Monte Carlo simulation; improved Monte Carlo simulation; analysis of the necessary number of simulations; graph reductions (parallel-serial reductions); closed solution for subgraphs; propagation score; deterministic counterparts.
15 Ignoring correlations: the relevance propagation model. Ignoring correlations leads to a local point of view. There is one equation for the relevance r of each node $n_i$ and of each arc $a_{i,j}$; solve the resulting simple equation system (in closed form or iteratively). Arc $a_{i,j}$: $r_{i,j} = r_i \cdot q_{i,j}$. Node $n_j$: $r_j = \left(1 - \prod_i (1 - r_{i,j})\right) \cdot p_j$.
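The two propagation equations can be solved by simple fixed-point iteration. The following is a minimal sketch, assuming that the source node's relevance is fixed to its own weight p and that all other nodes start at 0; the example graph and the function name `propagate` are hypothetical.

```python
def propagate(nodes, edges, source, n_iter=100, tol=1e-12):
    """Iteratively solve r_ij = r_i * q_ij and r_j = (1 - prod_i(1 - r_ij)) * p_j.

    nodes: {name: p}, edges: {(i, j): q}. Assumption: r[source] = p[source].
    """
    incoming = {n: [] for n in nodes}
    for (i, j), q in edges.items():
        incoming[j].append((i, q))

    r = {n: 0.0 for n in nodes}
    r[source] = nodes[source]
    for _ in range(n_iter):
        new_r = dict(r)
        delta = 0.0
        for j in nodes:
            if j == source:
                continue
            # Probability that no incoming arc delivers relevance, treating
            # the incoming arcs as independent (the "local" point of view).
            all_fail = 1.0
            for i, q in incoming[j]:
                all_fail *= 1.0 - r[i] * q
            new_r[j] = (1.0 - all_fail) * nodes[j]
            delta = max(delta, abs(new_r[j] - r[j]))
        r = new_r
        if delta < tol:
            break
    return r


# Hypothetical graph: one uncertain arc s -> a, then two certain paths a -> t.
example_nodes = {"s": 1.0, "a": 1.0, "b": 1.0, "c": 1.0, "t": 1.0}
example_edges = {("s", "a"): 0.5, ("a", "b"): 1.0, ("a", "c"): 1.0,
                 ("b", "t"): 1.0, ("c", "t"): 1.0}
scores = propagate(example_nodes, example_edges, "s")
print(scores["t"])  # 0.75
```

In this hypothetical graph t is reachable exactly when the arc s -> a is present, so its reliability is 0.5, whereas propagation yields 0.75 because it treats the two paths through b and c as independent.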
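16 Example: reliability vs. propagation. [Figure: example graphs from s to u with the resulting score r under reliability, under propagation, and for a case where reliability = propagation; the numeric values are not recoverable.]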
17 Comparing reliability and propagation: complexity. Reliability: global measure; combinatorial problem; #P (hard); Monte Carlo estimates. Propagation: local measure; continuous state space; P (not hard); iterative algorithm.
18 Agenda: How to model uncertainty in data integration? How do we rank? How well, how fast, how robust on real data? A short database research point of view.
19 Experiments: Functional gene annotation. 3 questions: 1) How well do different approaches perform? [Average precision (AP)] 2) How fast is probabilistic query evaluation? [Focus on reliability] 3) Where do you get the probabilities from? How robust is our system to variations in the input probabilities? [Sensitivity analysis] 6 data sources: Pfam, TIGRFAM, NCBIBlast, EntrezProtein, EntrezGene, AmiGO. 3 scenarios: 1) Well-known functions for well-studied proteins (306/20). 2) Less-known functions for well-studied proteins (7/3). 3) Unknown functions for less-studied proteins (11/11).
20 1. How well (1/3): Average Precision. Assume 4 out of 10 items are relevant. Ranking method 1: relevant items at ranks 1, 2, 4, 7, with precision@k = 1.00 (=1/1), 1.00 (=2/2), 0.75 (=3/4), 0.57 (=4/7). Ranking method 2: relevant items at ranks 1, 3, 4, 8, with precision@k = 1.00 (=1/1), 0.67 (=2/3), 0.75 (=3/4), 0.50 (=4/8). Random: AP averaged over all $\binom{10}{4}$ permutations. AP serves as a measure for the quality of the ranking semantics with regard to ground truth.
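The average precision of the two example rankings can be recomputed from the ranks of the relevant items. This is a small sketch; the function names are illustrative, and the random baseline is computed by enumerating every placement of 4 relevant items among 10 ranks, which is one straightforward reading of "averaged over all permutations".

```python
from itertools import combinations


def average_precision(relevant_ranks):
    """Mean of precision@k taken at the rank k of each relevant item (1-based)."""
    ranks = sorted(relevant_ranks)
    return sum((i + 1) / rank for i, rank in enumerate(ranks)) / len(ranks)


def random_average_precision(n_items, n_relevant):
    """AP averaged over every possible placement of the relevant items."""
    placements = list(combinations(range(1, n_items + 1), n_relevant))
    return sum(average_precision(p) for p in placements) / len(placements)


ap_method_1 = average_precision([1, 2, 4, 7])
ap_method_2 = average_precision([1, 3, 4, 8])
ap_random = random_average_precision(10, 4)

print(f"method 1: {ap_method_1:.3f}")  # 0.830
print(f"method 2: {ap_method_2:.3f}")  # 0.729
print(f"random:   {ap_random:.3f}")
```

Method 1 scores higher than method 2 because it places its relevant items earlier, and both lie above the random baseline.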
21 1. How well (2/3): Scoring functions. [Figure: one example graph from s to t per scoring function.] Reliability; Propagation; InEdge (example score: 2 incoming edges); PathCount (example score: 3 paths, 1 shown); Random (no score: AP averaged over all ranking permutations).
22 1. How well (3/3): AP across 3 scenarios. Scenario 1: 306 well-known functions, 20 well-studied proteins. Scenario 2: 7 less-known functions, 3 well-studied proteins. Scenario 3: 11 unknown functions, 11 less-studied proteins. Observation 1: Probabilistic methods perform better for predicting less-known or previously unknown functions!
23 2. How fast: Several techniques for speeding up reliability. Techniques (not discussed in detail): naive Monte Carlo (N), efficient Monte Carlo (M), … instead of … simulations (e4, e5), graph reductions (R), closed solution (C). Observation 2: Several techniques allowed us to evaluate the reliability semantics in ~20 ms (propagation ~5 ms, InEdge and PathCount ~1 ms).
24 3. How robust: sensitivity analysis. Our approach depends on transforming uncertainty into probabilistic weights. How robust is the performance to systematic variations in these input parameters? Idea: multi-way sensitivity analysis with $p' = \mathrm{Lo}^{-1}(\mathrm{Lo}(p) + \varepsilon)$, $\varepsilon \sim N(0, \sigma^2)$, $\mathrm{Lo}(p) = \log\frac{p}{1-p}$. Observation 3: Small random perturbations to the initial parameters do not negatively affect the quality of rankings. The approach is robust!
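The perturbation adds Gaussian noise in log-odds space and maps the result back to a probability, which keeps every perturbed value strictly between 0 and 1. A minimal sketch follows; the input probabilities and the value of sigma are hypothetical, and the function names are illustrative.

```python
import math
import random


def log_odds(p):
    """Lo(p) = log(p / (1 - p)), defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))


def inverse_log_odds(x):
    """Lo^{-1}(x): the logistic function, mapping log-odds back to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def perturb(p, sigma, rng):
    """p' = Lo^{-1}(Lo(p) + eps) with eps drawn from N(0, sigma^2)."""
    return inverse_log_odds(log_odds(p) + rng.gauss(0.0, sigma))


rng = random.Random(0)
weights = [0.1, 0.5, 0.9]  # hypothetical input probabilities
perturbed = [perturb(p, sigma=0.5, rng=rng) for p in weights]
print(perturbed)
```

A sensitivity analysis along these lines would rerun the ranking with many such perturbed weight sets and compare the resulting average precision against the unperturbed run.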
25 Takeaways from experiments on real data. [Figure: uncertainty of information (unknown, less-known, well-known information) against information integration approach (deterministic, probabilistic).] Explicit modeling of uncertainties as probabilities increases our ability to predict less-known or previously unknown protein functions. This suggests that uncertainty models offer utility for knowledge discovery. Small perturbations in the input probabilities (parameters) tend to produce only minor changes in the quality of our result rankings. This suggests that probabilistic methods are robust against variations in the way uncertainties are transformed into probabilities. Several techniques allow us to evaluate probabilistic rankings efficiently. This suggests that probabilistic query evaluation is not as hard for real-world problems as theory indicates.
26 Agenda: How to model uncertainty in data integration? How do we rank? How well, how fast, how robust on real data? A short database research point of view.
27 Short database background (1/2). Schema: ATTEND(student, class), TEACH(class, prof), DEP(prof, department). SQL query: select A.student, D.department from ATTEND A, TEACH T, DEP D where A.class = T.class and T.prof = D.prof
28 Short database background (2/2). Schema: R(A,B), S(B,C), T(C,D). SQL query: select R.A, T.D from R, S, T where R.B = S.B and S.C = T.C. Datalog: q(x,u) :- R(x,y), S(y,z), T(z,u). Conjunctive queries: very efficient!
29 Probabilistic databases (1/3). q(x,u) :- R(x,y), S(y,z), T(z,u). R(A,B) = {(a, y1), (a, y2)}; S(B,C) = {(y1, z1), (y1, z2), (y2, z2)}; T(C,D) = {(z1, d), (z2, d)}. Which tuples? q(a,d).
30 Probabilistic databases (2/3). q(x,u) :- R^p(x,y), S(y,z), T^p(z,u). Which tuples, and how likely? R^p(A,B) = {(a, y1): p1, (a, y2): p2}; S(B,C) = {(y1, z1), (y1, z2), (y2, z2)}; T^p(C,D) = {(z1, d): p3, (z2, d): p4}. [Figure: graph from a through y1, y2 and z1, z2 to d.] $P[q(a,d)] = P[p_1 p_3 \lor p_1 p_4 \lor p_2 p_4]$ = reachability. Nasty graph! Not efficient! Can propagation help?
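For a database this small, the probability of q(a,d) can be checked by brute force: enumerate every possible world of the uncertain tuples, evaluate the query in each world, and add up the weights of the worlds where it holds. The sketch below assumes independent tuples and uses hypothetical values for p1 to p4; the function names are illustrative, and the approach is exponential in the number of uncertain tuples.

```python
from itertools import product

# Hypothetical tuple probabilities p1..p4; S is deterministic.
R_p = {("a", "y1"): 0.9, ("a", "y2"): 0.5}
S = {("y1", "z1"), ("y1", "z2"), ("y2", "z2")}
T_p = {("z1", "d"): 0.8, ("z2", "d"): 0.6}


def query_holds(r_world, s, t_world, x, u):
    """q(x,u) :- R(x,y), S(y,z), T(z,u) evaluated on one deterministic world."""
    return any((x, y) in r_world and (z, u) in t_world for (y, z) in s)


def query_probability(r_p, s, t_p, x, u):
    """Sum the weights of all possible worlds in which q(x,u) holds."""
    uncertain = [("R", t, p) for t, p in r_p.items()]
    uncertain += [("T", t, p) for t, p in t_p.items()]
    total = 0.0
    for presence in product([False, True], repeat=len(uncertain)):
        weight = 1.0
        r_world, t_world = set(), set()
        for present, (relation, tup, p) in zip(presence, uncertain):
            weight *= p if present else 1.0 - p
            if present:
                (r_world if relation == "R" else t_world).add(tup)
        if query_holds(r_world, s, t_world, x, u):
            total += weight
    return total


prob = query_probability(R_p, S, T_p, "a", "d")
print(f"P[q(a,d)] = {prob:.3f}")  # 0.858
```

With these hypothetical values the result agrees with conditioning on the tuple (z2, d): p4 (1 - (1 - p1)(1 - p2)) + (1 - p4) p1 p3 = 0.6 · 0.95 + 0.4 · 0.72 = 0.858.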
31 Probabilistic databases (3/3). q(x,u) :- R^p(x,y), S(y,z), T^p(y,u). R^p(A,B) = {(a, y1): p1, (a, y2): p2}; S(B,C) = {(y1, z1), (y1, z2), (y2, z2)}; T^p(C,D) = {(y1, d): p3, (y2, d): p4}. q(x,y) :- R^p(x,y), R^p(x,z), T^p(z,u). Non-linear chain queries / self-joins. How to define a propagation semantics?
32 Which ranking semantics is appropriate for real data? [Figure: input (probabilistic) data, illustrated by a relation R^p(A,B) with tuple probabilities p1 to p8, is turned into output ranked results either by possible-world semantics (~ reliability) or by an alternative ranking semantics (~ propagation).] Issues noted on the slide: hard in general; sensitivity of the ranking with respect to the accuracy of the input probabilities; decrease in ranking quality due to approximation; hidden dependencies in the input data in the first place. Can we get good ranking results for arbitrary queries on real data, or at least a good trade-off between speed and ranking accuracy?
33 Further information. PAPER: L. Detwiler, W. Gatterbauer, B. Louie, D. Suciu, and P. Tarczy-Hornoch. Integrating and Ranking Uncertain Scientific Data. In Proceedings of the 25th International Conference on Data Engineering, PROJECT WEB PAGE: http:// DATABASE RESEARCH GROUP: http://db.cs.washington.edu/ CONTACT: Wolfgang Gatterbauer: THANKS!
More informationData Envelopment Analysis (DEA) with an applica6on to the assessment of Academics research performance
Data Envelopment Analysis (DEA) with an applica6on to the assessment of Academics research performance Outline DEA principles Assessing the research ac2vity in an ICT School via DEA Selec2on of Inputs
More informationMulti-join Query Evaluation on Big Data Lecture 2
Multi-join Query Evaluation on Big Data Lecture 2 Dan Suciu March, 2015 Dan Suciu Multi-Joins Lecture 2 March, 2015 1 / 34 Multi-join Query Evaluation Outline Part 1 Optimal Sequential Algorithms. Thursday
More informationStatistical Models for sequencing data: from Experimental Design to Generalized Linear Models
Best practices in the analysis of RNA-Seq and CHiP-Seq data 4 th -5 th May 2017 University of Cambridge, Cambridge, UK Statistical Models for sequencing data: from Experimental Design to Generalized Linear
More informationGraphical Models. Lecture 3: Local Condi6onal Probability Distribu6ons. Andrew McCallum
Graphical Models Lecture 3: Local Condi6onal Probability Distribu6ons Andrew McCallum mccallum@cs.umass.edu Thanks to Noah Smith and Carlos Guestrin for some slide materials. 1 Condi6onal Probability Distribu6ons
More informationGraphical Models. Lecture 10: Variable Elimina:on, con:nued. Andrew McCallum
Graphical Models Lecture 10: Variable Elimina:on, con:nued Andrew McCallum mccallum@cs.umass.edu Thanks to Noah Smith and Carlos Guestrin for some slide materials. 1 Last Time Probabilis:c inference is
More informationDeriva'on of The Kalman Filter. Fred DePiero CalPoly State University EE 525 Stochas'c Processes
Deriva'on of The Kalman Filter Fred DePiero CalPoly State University EE 525 Stochas'c Processes KF Uses State Predic'ons KF es'mates the state of a system Example Measure: posi'on State: [ posi'on velocity
More informationMachine learning for Dynamic Social Network Analysis
Machine learning for Dynamic Social Network Analysis Manuel Gomez Rodriguez Max Planck Ins7tute for So;ware Systems UC3M, MAY 2017 Interconnected World SOCIAL NETWORKS TRANSPORTATION NETWORKS WORLD WIDE
More informationPSPACE, NPSPACE, L, NL, Savitch's Theorem. More new problems that are representa=ve of space bounded complexity classes
PSPACE, NPSPACE, L, NL, Savitch's Theorem More new problems that are representa=ve of space bounded complexity classes Outline for today How we'll count space usage Space bounded complexity classes New
More informationPredicate abstrac,on and interpola,on. Many pictures and examples are borrowed from The So'ware Model Checker BLAST presenta,on.
Predicate abstrac,on and interpola,on Many pictures and examples are borrowed from The So'ware Model Checker BLAST presenta,on. Outline. Predicate abstrac,on the idea in pictures 2. Counter- example guided
More informationComputational Issues in BSM Theories -- Past, Present and Future
Computational Issues in BSM Theories -- Past, Present and Future Meifeng Lin Computa0onal Science Center Brookhaven Na0onal Laboratory Field Theore0c Computer Simula0ons for Par0cle Physics And Condensed
More informationLecture 10: Introduction to reasoning under uncertainty. Uncertainty
Lecture 10: Introduction to reasoning under uncertainty Introduction to reasoning under uncertainty Review of probability Axioms and inference Conditional probability Probability distributions COMP-424,
More informationSome thoughts on linearity, nonlinearity, and partial separability
Some thoughts on linearity, nonlinearity, and partial separability Paul Hovland Argonne Na0onal Laboratory Joint work with Boyana Norris, Sri Hari Krishna Narayanan, Jean Utke, Drew Wicke Argonne Na0onal
More information