Outline. Theory-based Bayesian framework for property induction Causal structure induction


Outline
- Theory-based Bayesian framework for property induction
- Causal structure induction
  - Constraint-based (bottom-up) learning
  - Theory-based Bayesian learning

The origins of causal knowledge
Question: how do people reliably come to true beliefs about the causal structure of their world?
Answer must specify:
- Prior causal knowledge
- Causal inference procedure

Multiple goals
Descriptive:
- Prior knowledge must be psychologically realistic.
- Inference procedure must generate the same beliefs that people do, given the same input.
Explanatory:
- Prior knowledge must be approximately correct.
- Inference procedure (constrained by prior knowledge) must be reliable.

Analogy with vision (Pearl, Cheng, Gopnik et al.)
[Figure: external world structure and observed images, linked in one direction by graphics and in the other by vision (inverse graphics).]

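The fundamental problem
Hidden causal structure: [figure: causal graph over A, B, C, D, E]
Causal structure causes observations; causal induction goes from the observed data back to the structure.
Observed data:

Case  A  B  C  D  E
1     0  1  1  1  1
2     1  0  1  0  1
3     0  0  0  1  0
4     0  1  1  0  1
...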

Under-constrained problems
In both visual perception and causal induction, many world structures could have produced the same data.
[Figure: an image and the possible world structures that could have produced it. Image removed due to copyright considerations. Please see: Freeman, W. T. "The Generic Viewpoint Assumption in a Framework for Visual Perception." Nature 368 (7 April 1994): 542-545.]

Under-constrained problems
In both visual perception and causal induction, many world structures could have produced the same data.
Correlation: $P(A, B) \neq P(A)\,P(B)$
Possible world structures: [figure: alternative causal graphs over A and B, some including an additional variable X]

Questions in visual perception
How is the external world represented?
- 3-D models
- 2-D views
- Intermediate: 2 1/2-D sketch, layers, intrinsic images, etc.
What kind of knowledge does the mind have about the world?
- Structure of objects
- Physics of surfaces
- Statistics of scenes
How does inference work?
- Bottom-up, modular, context-free
- Top-down, flexible, context-sensitive

Questions in causal induction
How is the external world represented?
- Associations
- Causal structures
- Intermediate: causal strength parameters
What kind of knowledge does the mind have about the world?
- Constraints on causal structure (e.g., causal order)
- Faithfulness (observed independence relations are real)
- Causal mechanisms
How does inference work?
- Bottom-up: constraint-based (data mining) approach
- Top-down: theory-based Bayesian approach

Some vocabulary
Causal structure: what causes what. Specifies nothing about causal mechanisms or parameterizations.
[Figure: two alternative causal graphs over A, B, C, D, E]

Some vocabulary
Causal structure: what causes what.
Causal mechanism: how causes influence effects.
[Figure: two graphs in which C and D are causes of E; one includes an additional node X]
Deterministic mechanism: $E = f(C, D)$
Mechanism with noise: $E = f(C, D, \varepsilon)$, $\varepsilon \sim \mathrm{Gaussian}(\mu, \sigma)$

Some vocabulary
Causal structure: what causes what.
Causal mechanism: how causes influence effects.
Knowledge about causal structures and mechanisms can be represented at different scales of detail. Abstract ("light") mechanism knowledge will be particularly important, e.g.:
- deterministic, quasi-deterministic, semi-deterministic or stochastic?
- strong or weak?
- generative or preventive influence?
- independent of or interactive with other causes?

Some vocabulary
Causal structure: what causes what.
Causal mechanism: how causes influence effects.
Parameterization: form of P(effect | causes), e.g. noisy-OR.
Causal strengths (parameters): relative contributions of different causes given a particular mechanism or parameterization.
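To make the difference between a parameterization and its parameters concrete, here is a minimal sketch of a noisy-OR form for P(effect | causes). The function name, the optional background rate, and the strength values are illustrative assumptions, not taken from the lecture.

```python
def noisy_or(strengths, present, background=0.0):
    """P(effect = 1 | causes) under a noisy-OR parameterization.

    strengths  -- causal strength of each cause (the parameters)
    present    -- 0/1 flags saying which causes are present
    background -- assumed rate at which the effect occurs with no cause present
    """
    p_no_effect = 1.0 - background
    for w, on in zip(strengths, present):
        if on:
            # each present cause independently fails to produce the effect
            p_no_effect *= 1.0 - w
    return 1.0 - p_no_effect


# Hypothetical strengths for two causes C and D of an effect E.
strengths = [0.8, 0.5]
for c in (0, 1):
    for d in (0, 1):
        print(f"P(E=1 | C={c}, D={d}) = {noisy_or(strengths, [c, d]):.2f}")
```

The functional form (noisy-OR) stays fixed while the strengths vary, which is the distinction the slide draws between parameterization and causal strengths.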

Approaches to structure learning
- Constraint-based learning (Pearl, Glymour, Gopnik): assume structure is unknown, with no knowledge of parameterization or parameters.
- Bayesian learning (Heckerman, Friedman/Koller): assume structure is unknown, with arbitrary parameterization.
- Theory-based Bayesian inference (T & G): assume structure is partially unknown; parameterization is known but parameters may not be. Prior knowledge about structure and parameterization depends on domain theories (derived from ontology and mechanisms).


Causal inference in science
Standard question: is X a direct cause of Y?
Standard empirical methodologies in many domains:
- Psychology
- Medicine
- Epidemiology
- Economics
- Biology
Constraint-based inference attempts to formalize this methodology.

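Constraint-based learning
Causal graph: [figure: directed graph over A, B, C, D, E]
The causal graph and the probability distribution are linked by the causal Markov assumption and the faithfulness assumption.
Probability distribution:
$P(A, B, C, D, E) = \prod_{V \in \{A, B, C, D, E\}} P(V \mid \mathrm{parents}[V])$
$P(A, B, C, D, E) = P(A)\,P(B)\,P(C \mid A, B)\,P(D \mid B)\,P(E \mid C, D)$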
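The factorization can be checked numerically. The sketch below reads the parent sets off the factorization P(A) P(B) P(C | A, B) P(D | B) P(E | C, D) and multiplies local conditional probabilities to get the joint. All probability values are made up for illustration; only the graph structure comes from the slide.

```python
from itertools import product

# Parent sets read off the factorization on the slide.
PARENTS = {"A": (), "B": (), "C": ("A", "B"), "D": ("B",), "E": ("C", "D")}

# Hypothetical conditional probability tables: P(V = 1 | parent values).
P_TRUE = {
    "A": {(): 0.3},
    "B": {(): 0.6},
    "C": {(0, 0): 0.1, (0, 1): 0.7, (1, 0): 0.6, (1, 1): 0.9},
    "D": {(0,): 0.2, (1,): 0.8},
    "E": {(0, 0): 0.05, (0, 1): 0.5, (1, 0): 0.5, (1, 1): 0.95},
}


def joint(assignment):
    """Joint probability of a full 0/1 assignment, as a product of local terms."""
    p = 1.0
    for v, pars in PARENTS.items():
        p_one = P_TRUE[v][tuple(assignment[u] for u in pars)]
        p *= p_one if assignment[v] == 1 else 1.0 - p_one
    return p


names = list(PARENTS)
total = sum(joint(dict(zip(names, vals))) for vals in product((0, 1), repeat=5))
print("sum over all 32 assignments:", round(total, 10))
```

Because every local table is a proper conditional distribution, the product sums to 1 over all assignments without any extra normalization.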

Definition of cause
Under the causal Markov principle, "A is a direct cause of B" implies that when all other potentially relevant variables are held constant, the probability of B depends upon the presence or absence of A.
Under the faithfulness assumption, (in)dependence and conditional (in)dependence relations in the observed data imply constraints on the hidden causal structure (see picture).

Example
What is the causal structure relating smoking (S), yellow teeth (Y), and lung cancer (L)?
Epidemiological data:

Patient  Smoking?  Yellow teeth?  Lung cancer?
1        yes       yes            yes
2        yes       yes            no
3        yes       no             yes
4        no        no             no
5        yes       yes            yes
6        yes       no             no
7        yes       no             yes
8        no        no             no
...

[Figure: candidate causal structures: Full, Common Effect, Common Cause, Chain, One link, Empty]

Inference process
A hypothesis: [figure: one candidate causal graph]
- What evidence would support this hypothesis?
- Would that evidence be consistent with any other hypothesis?

Example
What is the causal structure relating smoking (S), yellow teeth (Y), and lung cancer (L)?
Expected simple correlations:
- smoking, yellow teeth: yes
- smoking, lung cancer: yes
- yellow teeth, lung cancer: yes
Expected partial (conditional) correlations:
- smoking, yellow teeth | lung cancer: yes
- smoking, lung cancer | yellow teeth: yes
- yellow teeth, lung cancer | smoking: no

Example
What is the causal structure relating smoking (S), yellow teeth (Y), and lung cancer (L)?
Expected simple correlations:
- smoking, yellow teeth: yes
- smoking, lung cancer: yes
- yellow teeth, lung cancer: yes
Under faithfulness, two variables that are correlated must share a common ancestor. In this example, each pair of nodes must share a common ancestor.

[Figure: candidate causal structures: Common Effect, Chain, Common Cause, Full, One link, Empty]

Global semantics
The joint probability distribution factorizes into a product of local conditional probabilities:
$P(V_1, \ldots, V_n) = \prod_{i=1}^{n} P(V_i \mid \mathrm{parents}[V_i])$
Example network with nodes Burglary, Earthquake, Alarm, JohnCalls, MaryCalls:
$P(B, E, A, J, M) = P(B)\,P(E)\,P(A \mid B, E)\,P(J \mid A)\,P(M \mid A)$

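Local semantics
Global factorization is equivalent to a set of constraints on pairwise relationships between variables.
Markov property: each node is conditionally independent of its non-descendants given its parents.
[Figure: node X with parents U_1, ..., U_m, children Y_1, ..., Y_n, and the children's other parents Z_1j, ..., Z_nj. Image by MIT OCW.]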

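Local semantics
Global factorization is equivalent to a set of constraints on pairwise relationships between variables.
Each node is conditionally independent of all others given its Markov blanket: parents, children, and children's parents.
[Figure: node X with parents U_1, ..., U_m, children Y_1, ..., Y_n, and the children's other parents Z_1j, ..., Z_nj. Image by MIT OCW.]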
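The Markov blanket definition (parents, children, children's parents) translates directly into a few set operations. The sketch below applies it to the Burglary network named under global semantics, with the edges read off that slide's factorization; the function name and the dictionary encoding of the graph are illustrative choices.

```python
def markov_blanket(parents, node):
    """Parents, children, and children's other parents of `node`.

    `parents` maps each node to the list of its direct causes.
    """
    children = {v for v, ps in parents.items() if node in ps}
    blanket = set(parents[node]) | children
    for child in children:
        blanket |= set(parents[child])  # co-parents of each child
    blanket.discard(node)
    return blanket


# Parent sets taken from P(B) P(E) P(A | B, E) P(J | A) P(M | A).
ALARM_NETWORK = {
    "Burglary": [],
    "Earthquake": [],
    "Alarm": ["Burglary", "Earthquake"],
    "JohnCalls": ["Alarm"],
    "MaryCalls": ["Alarm"],
}

for node in ALARM_NETWORK:
    print(node, "->", sorted(markov_blanket(ALARM_NETWORK, node)))
```

Note that Earthquake lands in the blanket of Burglary only because the two share the child Alarm, which is the "children's parents" clause at work.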

Example
What is the causal structure relating smoking (S), yellow teeth (Y), and lung cancer (L)?
Expected partial (conditional) correlations:
- smoking, yellow teeth | lung cancer: yes
- smoking, lung cancer | yellow teeth: yes
- yellow teeth, lung cancer | smoking: no
Under faithfulness: if two variables Y and L are conditionally independent given S, then Y and L must not be in each other's Markov blanket, and S must be in the Markov blanket of both.
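The Markov blanket rule can be applied mechanically to every candidate graph over the three variables. The sketch below enumerates all directed acyclic graphs on S, Y, L and keeps those in which Y and L are outside each other's Markov blanket while S is inside both. It is an illustration of the rule as stated on the slide, not a full constraint-based learning algorithm, and all function names are assumptions.

```python
from itertools import combinations, product

NODES = ("S", "Y", "L")


def is_acyclic(nodes, edges):
    """Peel off nodes with no remaining parents; failure means a cycle."""
    remaining = set(nodes)
    while remaining:
        roots = {n for n in remaining
                 if not any(c == n and p in remaining for p, c in edges)}
        if not roots:
            return False
        remaining -= roots
    return True


def all_dags(nodes):
    """Every DAG on `nodes`, each as a frozenset of (parent, child) edges."""
    pairs = list(combinations(nodes, 2))
    dags = []
    for choice in product((0, 1, 2), repeat=len(pairs)):  # absent, a->b, b->a
        edges = set()
        for (a, b), c in zip(pairs, choice):
            if c == 1:
                edges.add((a, b))
            elif c == 2:
                edges.add((b, a))
        if is_acyclic(nodes, edges):
            dags.append(frozenset(edges))
    return dags


def markov_blanket(edges, node):
    parents = {p for p, c in edges if c == node}
    children = {c for p, c in edges if p == node}
    coparents = {p for p, c in edges if c in children}
    return (parents | children | coparents) - {node}


def consistent(edges):
    """Rule for 'Y and L conditionally independent given S'."""
    return ("L" not in markov_blanket(edges, "Y")
            and "S" in markov_blanket(edges, "Y")
            and "S" in markov_blanket(edges, "L"))


survivors = [sorted(d) for d in all_dags(NODES) if consistent(d)]
for graph in survivors:
    print(graph)
```

Three graphs survive: the common-cause structure with S as the cause and the two chains that pass through S. The common-effect structure with S as the effect is excluded because Y and L would then be co-parents and so sit in each other's Markov blanket.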

[Figure: candidate causal structures: Common Effect, Chain, Common Cause, Full, One link, Empty]

[Figure: candidate causal structures: Empty, Full, Common Effect, Chain, Common Cause, One link]
Can we distinguish between the remaining structures?

The limits of constraint-based inference
Markov equivalence class: a set of causal graphs that cannot be distinguished based on (in)dependence relations.
With two variables, there are three possible causal graphs and two equivalence classes:
- A and B not independent. [figure: the two graphs with a link between A and B]
- A and B independent. [figure: the graph with no link]
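The two-variable count can be reproduced by brute force. The sketch below groups DAGs by a standard graphical criterion that is not stated on the slide: two DAGs are Markov equivalent exactly when they have the same skeleton (undirected links) and the same v-structures (a -> c <- b with a and b not linked). Function names are illustrative.

```python
from itertools import combinations, product


def is_acyclic(nodes, edges):
    """Peel off nodes with no remaining parents; failure means a cycle."""
    remaining = set(nodes)
    while remaining:
        roots = {n for n in remaining
                 if not any(c == n and p in remaining for p, c in edges)}
        if not roots:
            return False
        remaining -= roots
    return True


def all_dags(nodes):
    """Every DAG on `nodes`, each as a frozenset of (parent, child) edges."""
    pairs = list(combinations(nodes, 2))
    dags = []
    for choice in product((0, 1, 2), repeat=len(pairs)):  # absent, a->b, b->a
        edges = set()
        for (a, b), c in zip(pairs, choice):
            if c == 1:
                edges.add((a, b))
            elif c == 2:
                edges.add((b, a))
        if is_acyclic(nodes, edges):
            dags.append(frozenset(edges))
    return dags


def skeleton(edges):
    return frozenset(frozenset(e) for e in edges)


def v_structures(edges):
    """Colliders a -> c <- b whose parents a and b are not adjacent."""
    skel = skeleton(edges)
    found = set()
    for a, child_a in edges:
        for b, child_b in edges:
            if child_a == child_b and a < b and frozenset((a, b)) not in skel:
                found.add((a, child_a, b))
    return frozenset(found)


def equivalence_classes(nodes):
    classes = {}
    for dag in all_dags(nodes):
        classes.setdefault((skeleton(dag), v_structures(dag)), []).append(dag)
    return list(classes.values())


for nodes in ("AB", "ABC"):
    print(nodes, len(all_dags(nodes)), "graphs,",
          len(equivalence_classes(nodes)), "equivalence classes")
# "AB" gives 3 graphs in 2 classes, matching the slide.
```

With three variables the same enumeration gives 25 graphs in 11 classes, so (in)dependence relations alone leave many structures indistinguishable.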

[Figure: candidate causal structures: Full, Common Effect, Common Cause, Chain, One link, Empty]

[Figure: candidate causal structures: Full, Common Effect, One link, Chain, Common Cause, Empty]

Additional sources of constraint
Prior knowledge about causal structure:
- Temporal order
- Domain-specific constraints
Interventions:
- Exogenously clamp one or more variables to some known value, and observe other variables over a series of cases.

Interventions
Example: force a sample of subjects to smoke, and another sample to not smoke.
Ideal interventions block all other direct causes of the manipulated variable:
[Figure: candidate graphs with an intervention node I acting on the manipulated variable]

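Interventions
Example: force a sample of subjects to smoke, and another sample to not smoke.
Non-ideal interventions simply add an extra cause that is under the learner's control:
[Figure: candidate graphs with an intervention node I added as an extra cause of the manipulated variable]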
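In graph terms, the two kinds of intervention differ only in what happens to the existing edges into the manipulated variable. The sketch below shows this on a made-up graph in which a hypothetical cause X influences smoking; the edge list, the node X, and the function names are assumptions for illustration.

```python
def ideal_intervention(edges, target):
    """Cut every existing edge into `target`, then add the intervention I."""
    kept = {(p, c) for p, c in edges if c != target}
    return kept | {("I", target)}


def non_ideal_intervention(edges, target):
    """Keep the graph as it is and add I as one more cause of `target`."""
    return set(edges) | {("I", target)}


# Hypothetical graph: X -> S, S -> Y, S -> L.
graph = {("X", "S"), ("S", "Y"), ("S", "L")}

print("ideal:    ", sorted(ideal_intervention(graph, "S")))
print("non-ideal:", sorted(non_ideal_intervention(graph, "S")))
```

Under the ideal intervention the edge from X to S disappears, so any remaining dependence between S and another variable has to run through S's own effects.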

Advantages of the constraint-based approach
- Deductive
- Domain-general
- No essential role for domain knowledge:
  - Knowledge of possible causal structures not needed.
  - Knowledge of possible causal mechanisms not used.

Disadvantages of the constraint-based approach
- Deductive
- Domain-general
- No essential role for domain knowledge:
  - Knowledge of possible causal structures not needed.
  - Knowledge of possible causal mechanisms not used.
- Requires large sample sizes to make reliable inferences.

Example
What is the causal structure relating smoking, yellow teeth, and lung cancer?
Epidemiological data:

Patient  Smoking?  Yellow teeth?  Lung cancer?
1        yes       yes            yes
2        yes       yes            no
3        yes       no             yes
4        no        no             no
5        yes       yes            yes
6        yes       no             no
7        yes       no             yes
8        no        no             no
...

Computing (in)dependence
Standard methods are based on the $\chi^2$ test:

       V=0  V=1
U=0    a    b
U=1    c    d

$\chi^2 = \frac{(ad - bc)^2\,(a + b + c + d)}{(a + b)(c + d)(a + c)(b + d)}$

- significantly > 0: not independent
- not significantly > 0: independent

Computing (in)dependence
Are smoking and yellow teeth independent?

       Y=0  Y=1
S=0    2    0
S=1    3    3

$\chi^2 = 1.6$, p = 0.21

Computing (in)dependence
Are smoking and lung cancer independent?

       L=0  L=1
S=0    2    0
S=1    2    4

$\chi^2 = 2.67$, p = 0.10

Computing (in)dependence
Are lung cancer and yellow teeth conditionally independent given smoking?

S=1:
       L=0  L=1
Y=0    1    2
Y=1    1    2
$\chi^2 = 0$, p = 1.0

S=0:
       L=0  L=1
Y=0    2    0
Y=1    0    0
$\chi^2$ undefined
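The test statistics on these slides can be recomputed from the eight patients in the example. The sketch below uses the standard 2x2 chi-square statistic with one degree of freedom and no continuity correction, which is an assumption about the exact method used in the lecture; the 0/1 coding (1 = yes) and the function names are illustrative.

```python
import math

# (smoking, yellow teeth, lung cancer) for patients 1-8, coded 1 = yes, 0 = no.
PATIENTS = [
    (1, 1, 1), (1, 1, 0), (1, 0, 1), (0, 0, 0),
    (1, 1, 1), (1, 0, 0), (1, 0, 1), (0, 0, 0),
]
S, Y, L = 0, 1, 2  # column indices


def table(rows, i, j):
    """2x2 counts: t[u][v] = number of rows with column i == u and column j == v."""
    t = [[0, 0], [0, 0]]
    for r in rows:
        t[r[i]][r[j]] += 1
    return t


def chi_square(t):
    """Chi-square statistic for a 2x2 table, or None when a margin is zero."""
    (a, b), (c, d) = t
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return None  # undefined: some row or column is empty
    return (a * d - b * c) ** 2 * (a + b + c + d) / denom


def p_value(chi2):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))


for name, i, j in (("smoking vs yellow teeth", S, Y), ("smoking vs lung cancer", S, L)):
    x = chi_square(table(PATIENTS, i, j))
    print(f"{name}: chi2 = {x:.2f}, p = {p_value(x):.2f}")

for s in (1, 0):
    subset = [r for r in PATIENTS if r[S] == s]
    print(f"yellow teeth vs lung cancer given S={s}:", chi_square(table(subset, Y, L)))
```

With only eight cases none of the unconditional tests reaches conventional significance, which is the slide's point about constraint-based methods needing large samples.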


The Blicket detector
[Image removed due to copyright considerations. Please see: Gopnik, A., and D. M. Sobel. "Detecting Blickets: How Young Children Use Information about Novel Causal Powers in Categorization and Induction." Child Development 71 (2000): 1205-1222.]


The Blicket detector
- Can we explain these inferences using constraint-based learning?
- What other explanations can we come up with?