Integrating Logic Programs and Connectionist Systems

1 Integrating Logic Programs and Connectionist Systems Sebastian Bader, Steffen Hölldobler International Center for Computational Logic Technische Universität Dresden Germany Pascal Hitzler Institute of Applied Informatics and Formal Description Methods Technische Universität Karlsruhe Germany Integrating Logic Programs and Connectionist Systems 1

2 Introduction & Motivation: Overview Introduction & Motivation Propositional Logic Existing Approaches Propositional Logic Programs and the Core Method First-Order Logic Existing Approaches First-Order Logic Programs and the Core Method The Neural-Symbolic Learning Cycle Challenge Problems Introduction & Motivation

3 Introduction & Motivation: Connectionist Systems Well-suited to learn, to adapt to new environments, to degrade gracefully etc. Many successful applications. Approximate functions. Hardly any knowledge about the functions is needed. Trained using incomplete data. Declarative semantics is not available. Recursive networks are hardly understood. McCarthy 1988: We still observe a propositional fixation. Structured objects are difficult to represent. Smolensky 1987: Can we instantiate the power of symbolic computation within fully connectionist systems? Introduction & Motivation 3

4 Introduction & Motivation: Logic Systems Well-suited to represent and reason about structured objects and structure-sensitive processes. Many successful applications. Direct implementation of relations and functions. Explicit expert knowledge is required. Highly recursive structures. Well understood declarative semantics. Logic systems are brittle. Expert knowledge may not be available. Can we instantiate the power of connectionist computation within a logic system? Introduction & Motivation 4

5 Introduction & Motivation: Objective Seek the best of both paradigms! Understanding the relation between connectionist and logic systems. Contribute to the open research problems of both areas. Well developed for propositional case. Hard problem: going beyond. In this lecture: Overview on existing approaches. Logic programs and recurrent networks. Semantic operators for logic programs can be computed by connectionist systems. Introduction to the core method. Introduction & Motivation 5

6 Connectionist Networks A connectionist network consists of a set U of units and a set W ⊆ U × U of connections. Each connection is labeled by a weight w ∈ ℝ. If there is a connection from unit u_j to u_k, then w_kj is its associated weight. A unit is specified by an input vector i = (i_1, ..., i_m), i_j ∈ ℝ, 1 ≤ j ≤ m, an activation function Φ mapping i to a potential p ∈ ℝ, and an output function Ψ mapping p to an (output) value v ∈ ℝ. If there is a connection from u_j to u_k then w_kj·v_j is the input received by u_k along this connection. The potential and value of a unit are synchronously recomputed (or updated). Often a linear time t is added as a parameter to input, potential and value. The state of a network with units u_1, ..., u_n at time t is (v_1(t), ..., v_n(t)). Introduction & Motivation 6

7 Propositional Logic Existing Approaches Finite Automata and McCulloch-Pitts Networks Weighted Automata and Semiring Artificial Neural Networks Propositional Reasoning and Symmetric/Stochastic Networks Other Approaches Propositional Logic Programs and the Core Method The Very Idea Logic Programs Propositional Core Method Backpropagation Knowledge-Based Artificial Neural Networks Propositional Core Method using Sigmoidal Units Further Extensions Propositional Logic 7

8 Finite Automata and McCulloch-Pitts Networks McCulloch, Pitts 1943: Can the activities of nervous systems be modelled by a logical calculus? A McCulloch-Pitts network consists of a set U of binary threshold units and a set W ⊆ U × U of weighted connections. The set U_I of input units is defined as U_I = {u_k ∈ U | (∀u_j ∈ U) w_kj = 0}. The set U_O of output units is defined as U_O = {u_j ∈ U | (∀u_k ∈ U) w_kj = 0}. [Figure: a McCulloch-Pitts network with input units U_I and output units U_O.] Theorem McCulloch-Pitts networks are finite automata and vice versa. Propositional Logic: Existing Approaches 8

9 Binary Threshold Units u_k is a binary threshold unit if Φ(i_k) = p_k = Σ_{j=1}^m w_kj·v_j and Ψ(p_k) = v_k = 1 if p_k ≥ θ_k, 0 otherwise, where θ_k ∈ ℝ is a threshold. [Figure: three example binary threshold units, among them a unit u_3 with weights w_31 = w_32 = 1 and threshold θ_3 = 0.5 computing v_3 = v_1 ∨ v_2, and a unit u_3 with weights w_31 = w_32 = 1 and threshold θ_3 = 1.5 computing v_3 = v_1 ∧ v_2.] Propositional Logic: Existing Approaches 9
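As an illustration (not part of the original slides), a minimal Python sketch of such a unit; the weights and thresholds below reproduce the disjunction and conjunction units from the figure, and all function names are chosen here for illustration only.

```python
def threshold_unit(weights, theta):
    """Binary threshold unit: potential p = sum_j w_j * v_j, output 1 iff p >= theta."""
    def unit(inputs):
        p = sum(w * v for w, v in zip(weights, inputs))
        return 1 if p >= theta else 0
    return unit

# Disjunction: weights 1, threshold 0.5 -> fires if at least one input is 1.
or_unit = threshold_unit([1, 1], 0.5)
# Conjunction: weights 1, threshold 1.5 -> fires only if both inputs are 1.
and_unit = threshold_unit([1, 1], 1.5)

assert [or_unit([a, b]) for a in (0, 1) for b in (0, 1)] == [0, 1, 1, 1]
assert [and_unit([a, b]) for a in (0, 1) for b in (0, 1)] == [0, 0, 0, 1]
```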

10 Weighted Automata and Semiring Artificial Neural Networks Bader, Hölldobler, Scalzitti 2004: Can the result by McCulloch and Pitts be extended to weighted automata? Let (K, ⊕, ⊗, 0_K, 1_K) be a semiring. u_k is a ⊕-unit if Φ(i_k) = p_k = ⊕_{j=1}^m w_kj ⊗ v_j and Ψ(p_k) = v_k = p_k. u_k is a ⊗-unit if Φ(i_k) = p_k = ⊗_{j=1}^m w_kj ⊗ v_j and Ψ(p_k) = v_k = p_k. A semiring artificial neural network consists of a set U of ⊕- and ⊗-units and a set W ⊆ U × U of K-weighted connections. Theorem Weighted automata are semiring artificial neural networks. Propositional Logic: Existing Approaches 10

11 Symmetric Networks Hopfield 1982: Can statistical models for magnetic materials explain the behavior of certain classes of networks? A symmetric network consists of a set U of binary threshold units and a set W ⊆ U × U of weighted connections such that w_kj = w_jk for all k, j with k ≠ j. Asynchronous update procedure: while the state v is unstable, update an arbitrary unit. This minimizes the energy function E(v) = −Σ_{k<j} w_kj·v_j·v_k + Σ_k θ_k·v_k. Propositional Logic: Existing Approaches 11
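A minimal sketch of the asynchronous update and the energy function, assuming unit values in {0, 1} and no self-connections; the function names are illustrative only.

```python
import random

def energy(W, theta, v):
    """E(v) = -sum_{k<j} w_kj v_j v_k + sum_k theta_k v_k for a symmetric weight matrix W."""
    n = len(v)
    quad = sum(W[k][j] * v[j] * v[k] for k in range(n) for j in range(k + 1, n))
    return -quad + sum(theta[k] * v[k] for k in range(n))

def asynchronous_update(W, theta, v, steps=100):
    """Pick an arbitrary unit and recompute its value; for symmetric W no step increases E."""
    v = list(v)
    for _ in range(steps):
        k = random.randrange(len(v))
        p = sum(W[k][j] * v[j] for j in range(len(v)) if j != k)
        v[k] = 1 if p >= theta[k] else 0
    return v
```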

12 Stochastic Networks or Boltzmann Machines Hinton, Sejnowski 1983: Can we escape local minima? A stochastic network is a symmetric network, but the values are computed probabilistically: P(v_k = 1) = 1/(1 + e^{(θ_k − p_k)/T}), where T is called the pseudo temperature. In equilibrium, stochastic networks are more likely to be in a state with low energy. Kirkpatrick et al. 1983: Can we compute a global minimum? Simulated annealing: decrease T gradually. Theorem (Geman, Geman 1984) A global minimum is reached if T is decreased in infinitesimally small steps. Propositional Logic: Existing Approaches 12

13 Propositional Reasoning and Energy Minimization Pinkas 1991: Is there a link between propositional logic and symmetric networks? Let D = C_1, ..., C_m be a propositional formula in clause form. We define τ(C) = 0 if C = [ ], p if C = [p], 1 − p if C = [¬p], and τ(C_1) + τ(C_2) − τ(C_1)·τ(C_2) if C = (C_1 ∨ C_2), and τ(D) = Σ_{i=1}^m (1 − τ(C_i)). Example τ([¬o, m], [¬s, ¬m], [¬c, m], [¬c, s], [¬v, ¬m]) = vm − cm − cs + sm − om + 2c + o. Propositional Logic: Existing Approaches 13
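A small sketch (with a hypothetical representation: a clause is a list of (atom, sign) pairs) that evaluates τ for a given 0/1 assignment; an assignment satisfies D exactly when τ(D) evaluates to 0.

```python
def tau_clause(clause, assignment):
    """tau(C): 0 for the empty clause, p / 1-p for unit clauses, and
    tau(C1) + tau(C2) - tau(C1)*tau(C2) for disjunctions, built up literal by literal."""
    value = 0.0
    for atom, positive in clause:          # [("o", False), ("m", True)] stands for [-o, m]
        lit = assignment[atom] if positive else 1 - assignment[atom]
        value = value + lit - value * lit  # tau(C1 or C2)
    return value

def tau_formula(clauses, assignment):
    """tau(D) = sum_i (1 - tau(C_i)); D is satisfied by the assignment iff this is 0."""
    return sum(1 - tau_clause(c, assignment) for c in clauses)

# The clause set from the slide: { [-o, m], [-s, -m], [-c, m], [-c, s], [-v, -m] }.
D = [[("o", False), ("m", True)], [("s", False), ("m", False)],
     [("c", False), ("m", True)], [("c", False), ("s", True)],
     [("v", False), ("m", False)]]
all_false = {"o": 0, "m": 0, "s": 0, "c": 0, "v": 0}
print(tau_formula(D, all_false))   # 0.0: the all-false assignment satisfies every clause
```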

14 Propositional Reasoning and Symmetric Networks Theorem v ⊨ D iff τ(D) has a global minimum at v. Compare τ(D) = vm − cm − cs + sm − om + 2c + o with E(v) = −Σ_{k<j} w_kj·v_j·v_k + Σ_k θ_k·v_k. [Figure: the corresponding symmetric network with units u_1 = o, u_2 = m, u_3 = s, u_4 = c, u_5 = v.] Propositional Logic: Existing Approaches 14

15 Propositional Non-Monotonic Reasoning Pinkas 1991a: Can the above mentioned approach be extended to non-monotonic reasoning? Consider D = (C_1, k_1), ..., (C_m, k_m), where the C_i are clauses and k_i ∈ ℝ⁺. The penalty of v for (C, k) is k if v ⊭ C and 0 otherwise. The penalty of v for D is the sum of the penalties for the (C_i, k_i). v is preferred over w wrt D if the penalty of v for D is smaller than the penalty of w for D. Modify τ to become τ(D) = Σ_{i=1}^m k_i·(1 − τ(C_i)), e.g., τ(([¬o, m], 1), ([¬s, ¬m], 2), ([¬c, m], 4), ([¬c, s], 4), ([¬v, ¬m], 4)) = 4vm − 4cm − 4cs + 2sm − om + 8c + o. The corresponding stochastic network computes the most preferred interpretations. Propositional Logic: Existing Approaches 15

16 Propositional Logic Programs and the Core Method The Very Idea Logic Programs Propositional Core Method Backpropagation Knowledge-Based Artificial Neural Networks Propositional Core Method using Sigmoidal Units Further Extensions Propositional Logic Programs and the Core Method 16

17 The Very Idea Various semantics for logic programs coincide with fixed points of associated immediate consequence operators (e.g., Apt, van Emden 1982). Banach Contraction Mapping Theorem A contraction mapping f defined on a complete metric space (X, d) has a unique fixed point. The sequence y, f(y), f(f(y)), ... converges to this fixed point for any y ∈ X. Fitting 1994: Consider logic programs whose immediate consequence operator is a contraction. Funahashi 1989: Every continuous function on the reals can be uniformly approximated by feedforward connectionist networks. Hölldobler, Kalinke, Störr 1999: Consider logic programs whose immediate consequence operator is continuous on the reals. Propositional Logic Programs and the Core Method 17

18 Metrics A metric on a space M is a mapping d : M × M → ℝ such that d(x, y) = 0 iff x = y, d(x, y) = d(y, x), and d(x, y) ≤ d(x, z) + d(z, y). Let (M, d) be a metric space and S = (s_i | s_i ∈ M) a sequence. S converges if (∃s ∈ M)(∀ε > 0)(∃N)(∀n ≥ N) d(s_n, s) ≤ ε. S is Cauchy if (∀ε > 0)(∃N)(∀n, m ≥ N) d(s_n, s_m) ≤ ε. (M, d) is complete if every Cauchy sequence converges. A mapping f : M → M is a contraction on (M, d) if (∃ 0 < k < 1)(∀x, y ∈ M) d(f(x), f(y)) ≤ k·d(x, y). Propositional Logic Programs and the Core Method 18

19 Propositional Logic Programs A propositional logic program P over a propositional language L is a finite set of clauses A ← L_1 ∧ ... ∧ L_n, where A is an atom, the L_i are literals and n ≥ 0. P is definite if all L_i, 1 ≤ i ≤ n, are atoms. Let V be the set of all propositional variables occurring in L. An interpretation I is a mapping V → {⊤, ⊥}. I can be represented by the set of atoms which are mapped to ⊤ under I. 2^V is the set of all interpretations. Immediate consequence operator T_P : 2^V → 2^V: T_P(I) = {A | there is a clause A ← L_1 ∧ ... ∧ L_n ∈ P such that I ⊨ L_1 ∧ ... ∧ L_n}. I is a supported model iff T_P(I) = I. Propositional Logic Programs and the Core Method 19

20 The Core Method Let L be a logic language. Given a logic program P together with its immediate consequence operator T_P. Let I be the set of interpretations for P. Find a mapping R : I → ℝⁿ. Construct a feed-forward network computing f_P : ℝⁿ → ℝⁿ, called the core, such that the following holds: If T_P(I) = J then f_P(R(I)) = R(J), where I, J ∈ I. If f_P(s) = t then T_P(R⁻¹(s)) = R⁻¹(t), where s, t ∈ ℝⁿ. Connect the units in the output layer recursively to the units in the input layer. Show that the following holds: I = lfp(T_P) iff the recurrent network converges to or approximates R(I). Connectionist model generation using recurrent networks with feed-forward core. Propositional Logic Programs and the Core Method 20

21 3-Layer Recurrent Networks [Figure: a core consisting of an input layer, a hidden layer and an output layer, with the output units fed back to the input units.] At each point in time all units do: apply the activation function to obtain the potential, apply the output function to obtain the output. Propositional Logic Programs and the Core Method 21

22 Propositional Core Method using Binary Threshold Units Let L be the language of propositional logic over a set V of variables. Let P be a propositional logic program, e.g., P = {A ←, C ← A ∧ B, C ← A ∧ ¬B}. I = 2^V is the set of interpretations for P. T_P(I) = {A | A ← L_1 ∧ ... ∧ L_m ∈ P such that I ⊨ L_1 ∧ ... ∧ L_m}. T_P(∅) = {A}, T_P({A}) = {A, C}, T_P({A, C}) = {A, C} = lfp(T_P). Propositional Logic Programs and the Core Method 22
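A sketch of the immediate consequence operator and its fixed-point iteration for this example; the (head, positive body, negative body) clause representation is just one possible encoding.

```python
# Clauses as (head, positive_body, negative_body); the example program
# P = { A <- , C <- A & B, C <- A & ~B }.
P = [("A", set(), set()),
     ("C", {"A", "B"}, set()),
     ("C", {"A"}, {"B"})]

def tp(program, I):
    """Immediate consequence operator: heads of clauses whose body is true under I."""
    return {head for head, pos, neg in program if pos <= I and not (neg & I)}

I = set()
while True:                      # iterate T_P starting from the empty interpretation
    J = tp(P, I)
    if J == I:
        break
    I = J
print(I)                         # {'A', 'C'} = lfp(T_P)
```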

23 Representing Interpretations I = 2^V Let n = |V| and identify V with {1, ..., n}. Define R : I → ℝⁿ such that for all 1 ≤ j ≤ n we find: R(I)[j] = 1 if j ∈ I, 0 if j ∉ I. E.g., if V = {A, B, C} = {1, 2, 3} and I = {A, C} then R(I) = (1, 0, 1). Other encodings are possible, e.g., R(I)[j] = 1 if j ∈ I, −1 if j ∉ I. Propositional Logic Programs and the Core Method 23

24 Computing the Core Consider again P = {A ←, C ← A ∧ B, C ← A ∧ ¬B}. A translation algorithm translates P into a core of binary threshold units: [Figure: an input layer with units for A, B, C, a hidden layer with one unit per clause, and an output layer with units for A, B, C; the connections carry weights ±ω.] Propositional Logic Programs and the Core Method 24
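A sketch of one way such a translation can look for binary threshold units (one hidden unit per clause, weights ±ω, thresholds chosen so that a hidden unit fires exactly when its clause body is true); the code is an illustration consistent with the slides, not a verbatim transcription of the algorithm.

```python
OMEGA = 1.0
P = [("A", set(), set()), ("C", {"A", "B"}, set()), ("C", {"A"}, {"B"})]

def core_step(program, atoms, I):
    """One pass through the core: input layer -> hidden layer (clauses) -> output layer (atoms)."""
    x = {A: (1.0 if A in I else 0.0) for A in atoms}
    hidden = []
    for head, pos, neg in program:
        # +omega for positive body literals, -omega for negative ones;
        # threshold (|pos| - 0.5) * omega: the unit fires iff the clause body is true under I
        p = sum(OMEGA * x[A] for A in pos) - sum(OMEGA * x[A] for A in neg)
        hidden.append((head, 1.0 if p >= (len(pos) - 0.5) * OMEGA else 0.0))
    out = set()
    for A in atoms:
        # weight omega from every hidden unit belonging to a clause with head A, threshold 0.5*omega
        if sum(OMEGA * h for head, h in hidden if head == A) >= 0.5 * OMEGA:
            out.add(A)
    return out

assert core_step(P, {"A", "B", "C"}, set()) == {"A"}
assert core_step(P, {"A", "B", "C"}, {"A"}) == {"A", "C"}      # one pass computes T_P(I)
```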

25 Some Results Proposition 2-layer networks cannot compute T_P for definite P. Theorem For each program P, there exists a core computing T_P. Recall P = {A ←, C ← A ∧ B, C ← A ∧ ¬B}. [Figure: the core from the previous slide with recurrent connections added from the output layer back to the input layer.] Propositional Logic Programs and the Core Method 25

26 More Results A logic program P is said to be strongly determined if there exists a metric d on the set of all Herbrand interpretations for P such that T_P is a contraction wrt d. Corollary Let P be a strongly determined program. Then there exists a core with recurrent connections such that the computation with an arbitrary initial input converges and yields the unique fixed point of T_P. Let n be the number of clauses and m be the number of propositional variables occurring in P. The core consists of 2m + n units and at most 2mn connections. T_P(I) is computed in 2 steps. The parallel computational model to compute T_P(I) is optimal. The recurrent network settles down in 3n steps in the worst case. See Hölldobler, Kalinke 1994 or Hitzler, Hölldobler, Seda 2004 for details. Propositional Logic Programs and the Core Method 26

27 Rule Extraction (1) Proposition For each core C there exists a program P such that C computes T_P. [Figure and table: a core over the atoms A_1, A_2 with hidden units u_3, ..., u_7, together with their potentials p_3, ..., p_7 and values v_3, ..., v_7, from which the program on the next slide is read off.] Propositional Logic Programs and the Core Method 27

28 Rule Extraction (2) Extracted program: P = { A_1 ← A_1 ∧ A_2, A_1 ← A_1 ∧ ¬A_2, A_2 ← A_1 ∧ A_2, A_1 ← ¬A_1 ∧ A_2, A_2 ← A_1 ∧ ¬A_2, A_1 ← ¬A_1 ∧ ¬A_2, A_2 ← ¬A_1 ∧ A_2 }. Simplified form: P = {A_1 ←, A_2 ← A_1, A_2 ← ¬A_1 ∧ A_2}. Propositional Logic Programs and the Core Method 28

29 3-Layer Feed-Forward Networks Revisited Theorem (Funahashi 1989) Suppose that Ψ : ℝ → ℝ is non-constant, bounded, monotone increasing and continuous. Let K ⊆ ℝⁿ be compact, let f : K → ℝ be continuous, and let ε > 0. Then there exists a 3-layer feed-forward network with output function Ψ for the hidden layer and linear output function for the input and output layer whose input-output mapping f̄ : K → ℝ satisfies max_{x∈K} |f(x) − f̄(x)| < ε. Every continuous function f : K → ℝ can be uniformly approximated by input-output functions of 3-layer feed-forward networks. u_k is a sigmoidal unit if Φ(i_k) = p_k = Σ_{j=1}^m w_kj·v_j and Ψ(p_k) = v_k = 1/(1 + e^{β(θ_k − p_k)}), where θ_k ∈ ℝ is a threshold (or bias) and β > 0 a steepness parameter. Propositional Logic Programs and the Core Method 29

30 Backpropagation Bryson, Ho 1969, Werbos 1974, Parker 1985, Rumelhart et al. 1986: Can 3-layer feed-forward networks learn a particular function? Training set of input-output pairs {(i^l, o^l) | 1 ≤ l ≤ n}. Minimize E = Σ_l E^l where E^l = ½ Σ_k (o^l_k − v^l_k)². Gradient descent algorithm to learn appropriate weights. Backpropagation: 1 Initialize weights arbitrarily. 2 Present input pattern i^l at time t. 3 Compute output pattern v^l at time t + 2. 4 Change weights according to Δw^l_ij = η·δ^l_i·v^l_j, where δ^l_i = Ψ'_i(p^l_i)·(o^l_i − v^l_i) if i is an output unit, and δ^l_i = Ψ'_i(p^l_i)·Σ_k δ^l_k·w_ki if i is a hidden unit; η > 0 is called the learning rate. Propositional Logic Programs and the Core Method 30
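A compact numpy sketch of one backpropagation step following the update rule above; bias terms and the time indexing of the slide are omitted, and all names are illustrative.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def backprop_step(W1, W2, x, o, eta=0.5):
    """One weight update for a 3-layer net with sigmoidal hidden and output units.
    delta_i = Psi'(p_i) * (o_i - v_i) for output units,
    delta_i = Psi'(p_i) * sum_k delta_k * w_ki for hidden units."""
    p_h = W1 @ x;   v_h = sigmoid(p_h)          # hidden layer
    p_o = W2 @ v_h; v_o = sigmoid(p_o)          # output layer
    delta_o = v_o * (1 - v_o) * (o - v_o)       # Psi'(p) = v * (1 - v) for the sigmoid
    delta_h = v_h * (1 - v_h) * (W2.T @ delta_o)
    W2 += eta * np.outer(delta_o, v_h)          # Delta w_ij = eta * delta_i * v_j
    W1 += eta * np.outer(delta_h, x)
    return 0.5 * np.sum((o - v_o) ** 2)         # E^l for this pattern

rng = np.random.default_rng(0)
W1, W2 = rng.normal(size=(3, 2)), rng.normal(size=(1, 3))
x, o = np.array([1.0, 0.0]), np.array([1.0])
errors = [backprop_step(W1, W2, x, o) for _ in range(200)]
print(errors[0], errors[-1])                    # the error on the training pair decreases
```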

31 Knowledge Based Artificial Neural Networks Towell, Shavlik 1994: Can we do better than empirical learning? Sets of hierarchical logic programs, e.g., P = {A ← B ∧ C ∧ D, A ← D ∧ E, H ← F ∧ G, K ← A ∧ ¬H}. [Figure: the rules mapped onto a feed-forward network with input units B, C, D, E, F, G, intermediate units A and H, and unit K on top; the connections carry weight ω and the unit thresholds are derived from the rules.] Propositional Logic Programs and the Core Method 31

32 Knowledge Based Artificial Neural Networks Learning Given hierarchical sets of propositional rules as background knowledge. Map the rules into multi-layer feed-forward networks with sigmoidal units. Add hidden units (optional). Add units for known input features that are not referenced in the rules. Fully connect layers. Add near-zero random numbers to all links and thresholds. Apply backpropagation. Empirical evaluation: the system performs better than purely empirical and purely hand-built classifiers. Propositional Logic Programs and the Core Method 32

33 Knowledge Based Artificial Neural Networks A Problem Works if rules have few conditions and there are few rules with the same head. [Figure: a unit C receiving weight-ω connections from units A and B; A and B each have ten antecedents A_1, ..., A_10 and B_1, ..., B_10 with weights ω and threshold 9.5ω.] If nine of the ten antecedents of A and of B are true, then p_A = p_B = 9ω and v_A = v_B = 1/(1 + e^{β(9.5ω − 9ω)}) ≈ 0.46 with β = 1, while p_C ≈ 0.9ω and v_C = 1/(1 + e^{β(0.5ω − 0.9ω)}) ≈ 0.6 with β = 1, i.e., C becomes active although neither A nor B does. Propositional Logic Programs and the Core Method 33

34 Propositional Core Method using Bipolar Sigmoidal Units d'Avila Garcez, Zaverucha, Carvalho 1997: Can we combine the ideas in Hölldobler, Kalinke 1994 and Towell, Shavlik 1994 while avoiding the above mentioned problem? Consider a propositional logic language. Let I be an interpretation and a ∈ [0, 1]. R(I)[j] = v ∈ [a, 1] if j ∈ I, w ∈ [−1, −a] if j ∉ I. Replace threshold and sigmoidal units by bipolar sigmoidal ones, i.e., units with Φ(i_k) = p_k = Σ_{j=1}^m w_kj·v_j and Ψ(p_k) = v_k = 2/(1 + e^{β(θ_k − p_k)}) − 1, where θ_k ∈ ℝ is a threshold (or bias) and β > 0 a steepness parameter. Propositional Logic Programs and the Core Method 34

35 The Task How should a, ω and θ_i be selected such that v_i ∈ [a, 1] or v_i ∈ [−1, −a] and the core computes the immediate consequence operator? Propositional Logic Programs and the Core Method 35

36 Hidden Layer Units Consider A ← L_1 ∧ ... ∧ L_n. Let u be the hidden layer unit for this rule. Suppose I ⊨ L_1 ∧ ... ∧ L_n. Then u receives an input of at least ωa from each unit representing an L_i, hence p_u ≥ nωa =: p⁺_u. Suppose I ⊭ L_1 ∧ ... ∧ L_n. Then u receives an input of at most −ωa from at least one unit representing an L_i, hence p_u ≤ (n − 1)ω·1 − ωa =: p⁻_u. Set θ_u = (nωa + (n − 1)ω − ωa)/2 = ((na + n − 1 − a)/2)·ω = ((n − 1)(a + 1)/2)·ω. Propositional Logic Programs and the Core Method 36

37 Output Layer Units Let µ be the number of clauses with head A. Consider A ← L_1 ∧ ... ∧ L_n. Suppose I ⊨ L_1 ∧ ... ∧ L_n. Then p_A ≥ ωa + (µ − 1)ω·(−1) = ωa − (µ − 1)ω =: p⁺_A. Suppose for all rules of the form A ← L_1 ∧ ... ∧ L_n we find I ⊭ L_1 ∧ ... ∧ L_n. Then p_A ≤ −µωa =: p⁻_A. Set θ_A = (ωa − (µ − 1)ω − µωa)/2 = ((a − µ + 1 − µa)/2)·ω = ((1 − µ)(a + 1)/2)·ω. Propositional Logic Programs and the Core Method 37

38 Computing a Value for a p⁺_u > p⁻_u: nωa > (n − 1)ω − ωa, nωa + ωa > (n − 1)ω, a(n + 1)ω > (n − 1)ω, a > (n − 1)/(n + 1). p⁺_A > p⁻_A: ωa − (µ − 1)ω > −µaω, ωa + µaω > (µ − 1)ω, a(1 + µ)ω > (µ − 1)ω, a > (µ − 1)/(µ + 1). Considering all rules yields a minimum value for a (see the sketch below). Propositional Logic Programs and the Core Method 38
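A small sketch computing the resulting lower bound for a over all clauses (body lengths n) and all heads (µ clauses per head), using the clause representation from the earlier sketches; names are illustrative.

```python
def minimal_a(program):
    """a must exceed (l-1)/(l+1) for every body length l = n and for every number
    of clauses l = mu sharing a head: a > (n-1)/(n+1) and a > (mu-1)/(mu+1)."""
    body_lengths = [len(pos) + len(neg) for _, pos, neg in program]
    heads = [head for head, _, _ in program]
    mus = [heads.count(h) for h in set(heads)]
    return max((l - 1) / (l + 1) for l in body_lengths + mus)

P = [("A", set(), set()), ("C", {"A", "B"}, set()), ("C", {"A"}, {"B"})]
print(minimal_a(P))     # 1/3: any a strictly greater than this bound will do
```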

39 Computing a Value for ω Ψ(p) = 2/(1 + e^{β(θ − p)}) − 1 ≥ a iff 2/(1 + e^{β(θ − p)}) ≥ 1 + a iff 2/(1 + a) ≥ 1 + e^{β(θ − p)} iff 2/(1 + a) − 1 = (2 − (1 + a))/(1 + a) = (1 − a)/(1 + a) ≥ e^{β(θ − p)} iff ln((1 − a)/(1 + a)) ≥ β(θ − p) iff (1/β)·ln((1 − a)/(1 + a)) ≥ θ − p. Consider a hidden layer unit: (1/β)·ln((1 − a)/(1 + a)) ≥ ((n − 1)(a + 1)/2)·ω − nωa = ((na + n − a − 1 − 2na)/2)·ω = ((n − 1 − a(n + 1))/2)·ω, hence ω ≥ 2·ln((1 − a)/(1 + a))/((n − 1 − a(n + 1))·β) because a > (n − 1)/(n + 1). Considering all hidden and output layer units as well as the case Ψ(p) ≤ −a yields a minimum value for ω. Propositional Logic Programs and the Core Method 39
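A sketch of the hidden-layer bound derived above (the full construction also takes the output-layer units and the case Ψ(p) ≤ −a into account); the function and parameter names are illustrative.

```python
from math import log

def omega_bound_hidden(n, a, beta=1.0):
    """Hidden-unit bound from the slide: with theta_u = (n-1)(a+1)w/2 and p >= n*w*a,
    Psi(p) >= a forces  w >= 2 * ln((1-a)/(1+a)) / ((n - 1 - a*(n+1)) * beta)."""
    assert a > (n - 1) / (n + 1)        # makes the denominator negative, the bound positive
    return 2 * log((1 - a) / (1 + a)) / ((n - 1 - a * (n + 1)) * beta)

print(omega_bound_hidden(2, 0.5))       # example: n = 2 body literals, a = 0.5 > 1/3
```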

40 Results The relation to logic programs is preserved. The core is trainable by backpropagation. Many interesting applications, e.g.: DNA sequence analysis. Power system fault diagnosis. Empirical evaluation: the system performs better than well-known machine learning systems. See d'Avila Garcez, Broda, Gabbay 2002 for details. Propositional Logic Programs and the Core Method 40

41 Further Extensions Many-valued logic programs Modal logic programs Answer set programming Metalevel priorities Rule extraction Propositional Logic Programs and the Core Method 41

42 Propositional Core Method Three-Valued Logic Programs Kalinke 1994: Consider the truth values ⊤, ⊥, u. Interpretations are pairs I = ⟨I⁺, I⁻⟩. Immediate consequence operator Φ_P(I) = ⟨J⁺, J⁻⟩, where J⁺ = {A | A ← L_1 ∧ ... ∧ L_m ∈ P and I(L_1 ∧ ... ∧ L_m) = ⊤}, J⁻ = {A | for all A ← L_1 ∧ ... ∧ L_m ∈ P: I(L_1 ∧ ... ∧ L_m) = ⊥}. Let n = |V| and identify V with {1, ..., n}. Define R : I → ℝ^{2n} as follows: R(I)[2j − 1] = 1 if j ∈ I⁺, 0 if j ∉ I⁺, and R(I)[2j] = 1 if j ∈ I⁻, 0 if j ∉ I⁻. Propositional Logic Programs and the Core Method 42
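A sketch of Φ_P on the pair representation ⟨I⁺, I⁻⟩, using the same hypothetical clause encoding as before; atoms without any defining clause fall into J⁻ vacuously.

```python
def phi(program, atoms, Iplus, Iminus):
    """Phi_P(<I+, I->) = <J+, J->: J+ = heads with a true body, J- = atoms all of whose bodies are false."""
    def body_value(pos, neg):
        lits = ["t" if A in Iplus else "f" if A in Iminus else "u" for A in pos] + \
               ["f" if A in Iplus else "t" if A in Iminus else "u" for A in neg]
        if all(l == "t" for l in lits):
            return "t"
        if any(l == "f" for l in lits):
            return "f"
        return "u"
    Jplus = {h for h, pos, neg in program if body_value(pos, neg) == "t"}
    Jminus = {A for A in atoms
              if all(body_value(pos, neg) == "f" for h, pos, neg in program if h == A)}
    return Jplus, Jminus

# Example: P = {C <- A & ~B} with A known true and B known false:
print(phi([("C", {"A"}, {"B"})], {"A", "B", "C"}, {"A"}, {"B"}))   # ({'C'}, {'A', 'B'})
```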

43 Propositional Core Method Multi-Valued Logic Programs For each program P, there exists a core computing Φ_P, e.g., P = {C ← A ∧ B, D ← C ∧ E, D ← ¬C}. [Figure: the corresponding core with a pair of input and output units A⁺, A⁻, ..., E⁺, E⁻ per atom and connections of weight ω.] Lane, Seda 2004: Extension to finitely determined sets of truth values. Propositional Logic Programs and the Core Method 43

44 Propositional Core Method Modal Logic Programs d'Avila Garcez, Lamb, Gabbay 2002. Let L be a propositional logic language plus the modalities □ and ◇, and a finite set of labels w_1, ..., w_k denoting worlds. Let B be an atom; then □B and ◇B are modal atoms. A modal definite logic program P is a set of clauses of the form w_i : A ← A_1 ∧ ... ∧ A_m together with a finite set of relations w_i R w_j, where the w_i, 1 ≤ i, j ≤ k, are labels and A, A_1, ..., A_m are atoms or modal atoms. P = ⋃_{i=1}^k P_i, where P_i consists of all clauses labelled with w_i. Propositional Logic Programs and the Core Method 44

45 Modal Logic Programs Semantics Example: P = {w_1 : A, w_1 : ◇C ← A} ∪ {w_2 : B} ∪ {w_3 : B} ∪ {w_4 : B} ∪ {w_1 R w_2, w_1 R w_3, w_1 R w_4, w_2 R w_4}. Kripke semantics: [Figure: the worlds w_1, ..., w_4 with their accessibility relation and the atoms over A, B, C true at each world; the choice function for ◇ satisfies f_C(w_1) = w_4.] Propositional Logic Programs and the Core Method 45

46 Modal Immediate Consequence Operator Interpretations are tuples I = ⟨I_1, ..., I_k⟩. Immediate consequence operator MT_P(I) = ⟨J_1, ..., J_k⟩, where J_i = {A | there exists A ← A_1 ∧ ... ∧ A_m ∈ P_i such that {A_1, ..., A_m} ⊆ I_i} ∪ {◇A | there exists w_i R w_j ∈ P and A ∈ I_j} ∪ {□A | for all w_i R w_j ∈ P we find A ∈ I_j} ∪ {A | there exists w_j R w_i ∈ P and □A ∈ I_j} ∪ {A | there exists w_j R w_i ∈ P, ◇A ∈ I_j and f_A(w_j) = w_i}. Propositional Logic Programs and the Core Method 46

47 Modal Logic Programs The Translation Algorithm Let n = |V| and identify V with {1, ..., n}. Let a ∈ [0, 1]. Define R : I → ℝ^{3n} such that, for each world w_i, the entries R(I_i)[3j − 2], R(I_i)[3j − 1] and R(I_i)[3j] represent the atom j and the modal atoms □j and ◇j, each with value v ∈ [a, 1] if the corresponding atom is in I_i and w ∈ [−1, −a] otherwise. Translation algorithm such that for each world the local part of MT_P is computed by a core, the cores are turned into recurrent networks, and the cores are connected with respect to the given set of relations. Propositional Logic Programs and the Core Method 47

48 The Example Network [Figure: four interconnected cores, one for each of the worlds w_1, ..., w_4, each with units for A, B and C.] Propositional Logic Programs and the Core Method 48

49 First-Order Logic Existing Approaches Reflexive Reasoning and SHRUTI Connectionist Term Representations Holographic Reduced Representations (Plate 1991) Recursive Auto-Associative Memory (Pollack 1988) Horn Logic and CHCL (Hölldobler 1990, Hölldobler, Kurfess 1992) Other Approaches First-Order Logic Programs and the Core Method Initial Approach Construction of Approximating Networks Topological Analysis and Generalisations Employing Iterated Function Systems First-Order Logic 49

50 Reflexive Reasoning Humans are capable of performing a wide variety of cognitive tasks with extreme ease and efficiency. For traditional AI systems, the same problems turn out to be intractable. Human consensus knowledge: about 10^8 rules and facts. Wanted: Reflexive decisions within sublinear time. Shastri, Ajjanagadde 1993: SHRUTI. First-Order Logic 50

51 SHRUTI Knowledge Base Finite set of constants C, finite set of variables V. Rules: (∀X_1 ... X_m) (p_1(...) ∧ ... ∧ p_n(...) → (∃Y_1 ... Y_k) p(...)). p, p_i, 1 ≤ i ≤ n, are multi-place predicate symbols. Arguments of the p_i: variables from {X_1, ..., X_m} ⊆ V. Arguments of p are from {X_1, ..., X_m} ∪ {Y_1, ..., Y_k} ∪ C. {Y_1, ..., Y_k} ⊆ V. {X_1, ..., X_m} ∩ {Y_1, ..., Y_k} = ∅. Facts and queries (goals): (∃Z_1 ... Z_l) q(...). q is a multi-place predicate symbol. Arguments of q are from {Z_1, ..., Z_l} ∪ C. {Z_1, ..., Z_l} ⊆ V. First-Order Logic 51

52 Further Restrictions Restrictions to rules, facts, and goals: No function symbols except constants. Only universally bound variables may occur as arguments in the conditions of a rule. All variables occurring in a fact or goal occur only once and are existentially bound. An existentially quantified variable is only unified with variables. A variable which occurs more than once in the conditions of a rule must occur in the conclusion of the rule and must be bound when the conclusion is unified with a goal. A rule is used only a fixed number of times. Incompleteness. First-Order Logic 52

53 SHRUTI Example Rules P = { owns(Y, Z) ← gives(X, Y, Z), owns(X, Y) ← buys(X, Y), can-sell(X, Y) ← owns(X, Y), gives(john, josephine, book), (∀X) buys(john, X), owns(josephine, ball) }. Queries: can-sell(josephine, book)? yes. (∃X) owns(josephine, X)? yes, {X ↦ book}, {X ↦ ball}. First-Order Logic 53

54 SHRUTI: The Network [Figure: the SHRUTI network for the example, with nodes for the constants book, john, ball and josephine and for the predicates can-sell, owns, gives and buys together with their argument nodes.] First-Order Logic 54

55 Solving the Variable Binding Problem [Figure: argument nodes (1st, 2nd, 3rd argument) for the predicates buys, gives, owns and can-sell and nodes for the constants josephine, ball, john and book; a binding is represented by the synchronous activation of an argument node and a constant node.] First-Order Logic 55

56 SHRUTI Remarks Answers are derived in time proportional to the depth of the search space. The number of units as well as of connections is linear in the size of the knowledge base. Extensions: compute answer substitutions, allow a fixed number of copies of rules, allow multiple literals in the body of a rule, build in a taxonomy. ROBIN (Lange, Dyer 1989): signatures instead of phases. Biological plausibility. Trading expressiveness for time and size. Logical reconstruction by Beringer, Hölldobler 1993: Reflexive reasoning is reasoning by reduction. First-Order Logic 56

57 First-Order Logic Programs and the Core Method Initial Approach Construction of Approximating Networks Topological Analysis and Generalisations Employing Iterated Function Systems First-Order Logic Programs and the Core Method 57

58 Logic Programs A logic program P over a first-order language L is a finite set of clauses A ← L_1 ∧ ... ∧ L_n, where A is an atom, the L_i are literals and n ≥ 0. B_L is the set of all ground atoms over L, called the Herbrand base. A Herbrand interpretation I is a mapping B_L → {⊤, ⊥}. 2^{B_L} is the set of all Herbrand interpretations. ground(P) is the set of all ground instances of clauses in P. Immediate consequence operator T_P : 2^{B_L} → 2^{B_L}: T_P(I) = {A | there is a clause A ← L_1 ∧ ... ∧ L_n ∈ ground(P) such that I ⊨ L_1 ∧ ... ∧ L_n}. I is a supported model iff T_P(I) = I. First-Order Logic Programs and the Core Method 58

59 The Initial Approach Hölldobler, Kalinke, Störr 1999: Can the core method be extended to first-order logic programs? Problem Given a logic program P over a first-order language L together with T_P : 2^{B_L} → 2^{B_L}. B_L is countably infinite. The method used to relate propositional logic and connectionist systems is not applicable. How can the gap between the discrete, symbolic setting of logic and the continuous, real-valued setting of connectionist networks be closed? First-Order Logic Programs and the Core Method 59

60 The Goal Find R : 2^{B_L} → ℝ and f_P : ℝ → ℝ such that the following conditions hold. T_P(I) = I′ implies f_P(R(I)) = R(I′). f_P(x) = x′ implies T_P(R⁻¹(x)) = R⁻¹(x′). f_P is a sound and complete encoding of T_P. T_P is a contraction on 2^{B_L} iff f_P is a contraction on ℝ. The contraction property and fixed points are preserved. f_P is continuous on ℝ. A connectionist network approximating f_P is known to exist. First-Order Logic Programs and the Core Method 60

61 Acyclic Logic Programs Let P be a program over a first-order language L. A level mapping for P is a function l : B_L → ℕ. We define l(¬A) = l(A). We can associate a metric d_L with L and l. Let I, J ∈ 2^{B_L}: d_L(I, J) = 0 if I = J, and 2^{−n} if n is the smallest level on which I and J differ. Proposition (Fitting 1994) (2^{B_L}, d_L) is a complete metric space. P is said to be acyclic wrt a level mapping l if for every A ← L_1 ∧ ... ∧ L_n ∈ ground(P) we find l(A) > l(L_i) for all i. Proposition Let P be an acyclic logic program wrt l and d_L the metric associated with L and l; then T_P is a contraction on (2^{B_L}, d_L). First-Order Logic Programs and the Core Method 61
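A finite sketch of the metric d_L for interpretations given as sets of ground atoms with a level mapping supplied as a dictionary; the names and the string encoding of atoms are illustrative.

```python
def d_L(I, J, level):
    """d_L(I, J) = 0 if I = J, else 2**(-n) for the smallest level n on which I and J differ."""
    if I == J:
        return 0.0
    n = min(level[A] for A in I.symmetric_difference(J))
    return 2.0 ** (-n)

level = {"q(0)": 1, "q(s(0))": 2, "q(s(s(0)))": 3}
print(d_L({"q(0)"}, {"q(0)", "q(s(s(0)))"}, level))    # they first differ at level 3 -> 0.125
```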

62 Mapping Interpretations to Real Numbers Let D = {r ∈ ℝ | r = Σ_{i=1}^∞ a_i·4^{−i}, where a_i ∈ {0, 1} for all i}. Let l be a bijective level mapping. {⊤, ⊥} can be identified with {1, 0}. The set of all mappings B_L → {⊤, ⊥} can be identified with the set of all mappings ℕ → {0, 1}. Let I_L be the set of all mappings from B_L to {0, 1}. Let R : I_L → D be defined as R(I) = Σ_{i=1}^∞ I(l^{−1}(i))·4^{−i}. Proposition R is a bijection. We have a sound and complete encoding of interpretations. First-Order Logic Programs and the Core Method 62
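A finite-precision sketch of the encoding R, cutting the infinite sum off after a fixed number of levels; inv_level plays the role of l⁻¹ and is assumed to be supplied by the caller.

```python
def R(I, inv_level, depth=30):
    """Approximates R(I) = sum_{i>=1} I(l^{-1}(i)) * 4**(-i) by its first `depth` digits."""
    return sum((1 if inv_level(i) in I else 0) * 4.0 ** (-i) for i in range(1, depth + 1))

# For P = {q(0), q(s(X)) <- q(X)} with l(q(s^n(0))) = n + 1,
# identify the atom of level i with the number i:
everything_true = set(range(1, 1000))
print(R(everything_true, lambda i: i))     # ~0.333... = 1/3 = R(B_L)
```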

63 Mapping Immediate Consequence Operators to Functions on the Reals We define f_P : D → D : r ↦ R(T_P(R⁻¹(r))). [Diagram: T_P maps I to I′ while f_P maps r = R(I) to r′ = R(I′).] We have a sound and complete encoding of T_P. Proposition Let P be an acyclic program wrt a bijective level mapping. Then f_P is a contraction on D. The contraction property and fixed points are preserved. First-Order Logic Programs and the Core Method 63

64 Approximating Continuous Functions Corollary f_P is continuous. Recall Funahashi's theorem: Every continuous function f : K → ℝ can be uniformly approximated by input-output functions of 3-layer feed-forward networks. Theorem f_P can be uniformly approximated by input-output functions of 3-layer feed-forward networks. T_P can be approximated as well by applying R⁻¹. A connectionist network approximating the immediate consequence operator exists. First-Order Logic Programs and the Core Method 64

65 An Example Consider P = {q(0) ←, q(s(X)) ← q(X)} and let l(q(s^n(0))) = n + 1. P is acyclic wrt l, l is bijective, and R(B_L) = 1/3. f_P(R(I)) = 4^{−l(q(0))} + Σ_{q(x)∈I} 4^{−l(q(s(x)))} = 4^{−l(q(0))} + Σ_{q(x)∈I} 4^{−(l(q(x))+1)} = (1 + R(I))/4. Approximation of f_P to accuracy ε yields f̄(x) ∈ [(1 + x)/4 − ε, (1 + x)/4 + ε]. Starting with some x and iterating f̄ yields in the limit a value r ∈ [(1 − 4ε)/3, (1 + 4ε)/3]. Applying R⁻¹ to r we find q(s^n(0)) ∈ R⁻¹(r) if n < −log_4(ε) − 1. First-Order Logic Programs and the Core Method 65
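A sketch of the iteration: the exact f_P(x) = (1 + x)/4 converges to 1/3, and an ε-perturbed version stays within the interval given above; all names are illustrative.

```python
def f_P(x):
    """Encoded consequence operator of P = {q(0), q(s(X)) <- q(X)}: f_P(x) = (1 + x) / 4."""
    return (1.0 + x) / 4.0

x = 0.0                                     # R(empty interpretation)
for _ in range(30):
    x = f_P(x)
print(x)                                    # ~0.3333 = 1/3 = R(lfp(T_P))

eps = 0.01                                  # worst-case over-approximation by eps per step
y = 0.0
for _ in range(200):
    y = f_P(y) + eps
print(y, (1 + 4 * eps) / 3)                 # both ~0.34667, the upper end of the interval
```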

66 Approximation of Interpretations Let P be a logic program over a first-order language L and l a level mapping. An interpretation I approximates an interpretation J to a degree n ∈ ℕ if for all atoms A ∈ B_L with l(A) < n we find I(A) = ⊤ iff J(A) = ⊤. I approximates J to a degree n iff d_L(I, J) ≤ 2^{−n}. First-Order Logic Programs and the Core Method 66

67 Approximation of Supported Models Given an acyclic logic program P with a bijective level mapping. Let T_P be the immediate consequence operator associated with P and M_P the least supported model of P. We can approximate T_P by a 3-layer feed-forward network. We can turn this network into a recurrent one. Does the recurrent network approximate the supported model of P? Theorem For an arbitrary m ∈ ℕ there exists a recurrent network with sigmoidal activation functions for the hidden layer units and linear activation functions for the input and output layer units computing a function f̄_P such that there exists an n_0 ∈ ℕ such that for all n ≥ n_0 and for all x ∈ [−1, 1] we find d_L(R⁻¹(f̄_P^n(x)), M_P) ≤ 2^{−m}. First-Order Logic Programs and the Core Method 67

68 Literature Apt, van Emden 1982: Contributions to the Theory of Logic Programming. Journal of the ACM 29. Bader, Hölldobler, Scalzitti 2004: Semiring Artificial Neural Networks and Weighted Automata and an Application to Digital Image Encoding. In: KI 2004: Advances in Artificial Intelligence, Lecture Notes in Artificial Intelligence 3238. Beringer, Hölldobler 1993: On the Adequateness of the Connection Method. In: Proceedings of the AAAI National Conference on Artificial Intelligence. d'Avila Garcez, Gabbay 2002: Neural-Symbolic Learning Systems. Springer. d'Avila Garcez, Lamb, Gabbay 2002: A Connectionist Inductive Learning System for Modal Logic Programming. In: Proceedings of the IEEE International Conference on Neural Information Processing. d'Avila Garcez, Zaverucha, Carvalho 1997: Logic Programming and Inductive Inference in Artificial Neural Networks. In: Knowledge Representation in Neural Networks, Logos, Berlin. Fitting 1994: Metric Methods: Three Examples and a Theorem. Journal of Logic Programming 21. First-Order Logic Programs and the Core Method 68

69 Funahashi 1989: On the Approximate Realization of Continuous Mappings by Neural Networks. Neural Networks 2. Geman, Geman 1984: Stochastic Relaxation, Gibbs Distributions, and the Bayesian Restoration of Images. IEEE Transactions on Pattern Analysis and Machine Intelligence 6. Hinton, Sejnowski 1983: Optimal Perceptual Inference. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Hitzler, Hölldobler, Seda 2004: Logic Programs and Connectionist Networks. Journal of Applied Logic 2. Hölldobler 1990: A Structured Connectionist Unification Algorithm. In: Proceedings of the AAAI National Conference on Artificial Intelligence. Hölldobler, Kalinke 1994: Towards a Massively Parallel Computational Model for Logic Programming. In: Proceedings of the ECAI94 Workshop on Combining Symbolic and Connectionist Processing. Hölldobler, Kalinke, Störr 1999: Approximating the Semantics of Logic Programs by Recurrent Neural Networks. Applied Intelligence 11. Hölldobler, Kurfess 1992: CHCL: A Connectionist Inference System. In: Parallelization in Inference Systems, Lecture Notes in Artificial Intelligence 590. Hopfield 1982: Neural Networks and Physical Systems with Emergent Collective Computational Abilities. In: Proceedings of the National Academy of Sciences USA. First-Order Logic Programs and the Core Method 69

70 Kalinke 1994: Ein massiv paralleles Berechnungsmodell für normale logische Programme. Diplomarbeit, TU Dresden, Fakultät Informatik (in German). Kirkpatrick et al. 1983: Optimization by Simulated Annealing. Science 220. Lane, Seda 2004: Some Aspects of the Integration of Connectionist and Logic-Based Systems. University College Cork. Lange, Dyer 1989: High-Level Inferencing in a Connectionist Network. Connection Science 1. McCarthy 1988: Epistemological Challenges for Connectionism. Behavioural and Brain Sciences 11, 44. McCulloch, Pitts 1943: A Logical Calculus of the Ideas Immanent in Nervous Activity. Bulletin of Mathematical Biophysics 5. Plate 1991: Holographic Reduced Representations. In: Proceedings of the International Joint Conference on Artificial Intelligence. Pinkas 1991: Symmetric Neural Networks and Logic Satisfiability. Neural Computation 3. Pinkas 1991a: Propositional Non-Monotonic Reasoning and Inconsistency in Symmetrical Neural Networks. In: Proceedings of the International Joint Conference on Artificial Intelligence. Pollack 1988: Recursive Auto-Associative Memory: Devising Compositional Distributed Representations. In: Proceedings of the Annual Conference of the Cognitive Science Society. First-Order Logic Programs and the Core Method 70

71 Rumelhart et al. 1986: Parallel Distributed Processing. MIT Press. Shastri, Ajjanagadde 1993: From Associations to Systematic Reasoning: A Connectionist Representation of Rules, Variables and Dynamic Bindings using Temporal Synchrony. Behavioural and Brain Sciences 16. Smolensky 1987: On Variable Binding and the Representation of Symbolic Structures in Connectionist Systems. Report No. CU-CS, Department of Computer Science & Institute of Cognitive Science, University of Colorado, Boulder. Towell, Shavlik 1994: Knowledge Based Artificial Neural Networks. Artificial Intelligence 70. First-Order Logic Programs and the Core Method 71


More information

Artificial Neural Networks. Edward Gatt

Artificial Neural Networks. Edward Gatt Artificial Neural Networks Edward Gatt What are Neural Networks? Models of the brain and nervous system Highly parallel Process information much more like the brain than a serial computer Learning Very

More information

Computational Logic and Neural Networks: a personal perspective

Computational Logic and Neural Networks: a personal perspective Computational Logic and Neural Networks: a personal perspective Ekaterina Komendantskaya School of Computing, University of Dundee 9 February 2011, Aberdeen Computational Logic and Neural Networks: a personal

More information

In the Name of God. Lecture 9: ANN Architectures

In the Name of God. Lecture 9: ANN Architectures In the Name of God Lecture 9: ANN Architectures Biological Neuron Organization of Levels in Brains Central Nervous sys Interregional circuits Local circuits Neurons Dendrite tree map into cerebral cortex,

More information

Propositional Logic Language

Propositional Logic Language Propositional Logic Language A logic consists of: an alphabet A, a language L, i.e., a set of formulas, and a binary relation = between a set of formulas and a formula. An alphabet A consists of a finite

More information

Machine Learning. Neural Networks. (slides from Domingos, Pardo, others)

Machine Learning. Neural Networks. (slides from Domingos, Pardo, others) Machine Learning Neural Networks (slides from Domingos, Pardo, others) For this week, Reading Chapter 4: Neural Networks (Mitchell, 1997) See Canvas For subsequent weeks: Scaling Learning Algorithms toward

More information

Database Theory VU , SS Complexity of Query Evaluation. Reinhard Pichler

Database Theory VU , SS Complexity of Query Evaluation. Reinhard Pichler Database Theory Database Theory VU 181.140, SS 2018 5. Complexity of Query Evaluation Reinhard Pichler Institut für Informationssysteme Arbeitsbereich DBAI Technische Universität Wien 17 April, 2018 Pichler

More information

SGD and Deep Learning

SGD and Deep Learning SGD and Deep Learning Subgradients Lets make the gradient cheating more formal. Recall that the gradient is the slope of the tangent. f(w 1 )+rf(w 1 ) (w w 1 ) Non differentiable case? w 1 Subgradients

More information

Hopfield Networks and Boltzmann Machines. Christian Borgelt Artificial Neural Networks and Deep Learning 296

Hopfield Networks and Boltzmann Machines. Christian Borgelt Artificial Neural Networks and Deep Learning 296 Hopfield Networks and Boltzmann Machines Christian Borgelt Artificial Neural Networks and Deep Learning 296 Hopfield Networks A Hopfield network is a neural network with a graph G = (U,C) that satisfies

More information

Applied Logic. Lecture 1 - Propositional logic. Marcin Szczuka. Institute of Informatics, The University of Warsaw

Applied Logic. Lecture 1 - Propositional logic. Marcin Szczuka. Institute of Informatics, The University of Warsaw Applied Logic Lecture 1 - Propositional logic Marcin Szczuka Institute of Informatics, The University of Warsaw Monographic lecture, Spring semester 2017/2018 Marcin Szczuka (MIMUW) Applied Logic 2018

More information

Artificial Neural Networks

Artificial Neural Networks Artificial Neural Networks 鮑興國 Ph.D. National Taiwan University of Science and Technology Outline Perceptrons Gradient descent Multi-layer networks Backpropagation Hidden layer representations Examples

More information

Neural Networks. Bishop PRML Ch. 5. Alireza Ghane. Feed-forward Networks Network Training Error Backpropagation Applications

Neural Networks. Bishop PRML Ch. 5. Alireza Ghane. Feed-forward Networks Network Training Error Backpropagation Applications Neural Networks Bishop PRML Ch. 5 Alireza Ghane Neural Networks Alireza Ghane / Greg Mori 1 Neural Networks Neural networks arise from attempts to model human/animal brains Many models, many claims of

More information

Need for Deep Networks Perceptron. Can only model linear functions. Kernel Machines. Non-linearity provided by kernels

Need for Deep Networks Perceptron. Can only model linear functions. Kernel Machines. Non-linearity provided by kernels Need for Deep Networks Perceptron Can only model linear functions Kernel Machines Non-linearity provided by kernels Need to design appropriate kernels (possibly selecting from a set, i.e. kernel learning)

More information

Artificial Neural Networks. MGS Lecture 2

Artificial Neural Networks. MGS Lecture 2 Artificial Neural Networks MGS 2018 - Lecture 2 OVERVIEW Biological Neural Networks Cell Topology: Input, Output, and Hidden Layers Functional description Cost functions Training ANNs Back-Propagation

More information

CSE 352 (AI) LECTURE NOTES Professor Anita Wasilewska. NEURAL NETWORKS Learning

CSE 352 (AI) LECTURE NOTES Professor Anita Wasilewska. NEURAL NETWORKS Learning CSE 352 (AI) LECTURE NOTES Professor Anita Wasilewska NEURAL NETWORKS Learning Neural Networks Classifier Short Presentation INPUT: classification data, i.e. it contains an classification (class) attribute.

More information

Yet Another Proof of the Strong Equivalence Between Propositional Theories and Logic Programs

Yet Another Proof of the Strong Equivalence Between Propositional Theories and Logic Programs Yet Another Proof of the Strong Equivalence Between Propositional Theories and Logic Programs Joohyung Lee and Ravi Palla School of Computing and Informatics Arizona State University, Tempe, AZ, USA {joolee,

More information

Part 8: Neural Networks

Part 8: Neural Networks METU Informatics Institute Min720 Pattern Classification ith Bio-Medical Applications Part 8: Neural Netors - INTRODUCTION: BIOLOGICAL VS. ARTIFICIAL Biological Neural Netors A Neuron: - A nerve cell as

More information

Foundations of Artificial Intelligence

Foundations of Artificial Intelligence Foundations of Artificial Intelligence 7. Propositional Logic Rational Thinking, Logic, Resolution Wolfram Burgard, Maren Bennewitz, and Marco Ragni Albert-Ludwigs-Universität Freiburg Contents 1 Agents

More information

6c Lecture 14: May 14, 2014

6c Lecture 14: May 14, 2014 6c Lecture 14: May 14, 2014 11 Compactness We begin with a consequence of the completeness theorem. Suppose T is a theory. Recall that T is satisfiable if there is a model M T of T. Recall that T is consistent

More information

ECE521 Lecture 7/8. Logistic Regression

ECE521 Lecture 7/8. Logistic Regression ECE521 Lecture 7/8 Logistic Regression Outline Logistic regression (Continue) A single neuron Learning neural networks Multi-class classification 2 Logistic regression The output of a logistic regression

More information

A summary of Deep Learning without Poor Local Minima

A summary of Deep Learning without Poor Local Minima A summary of Deep Learning without Poor Local Minima by Kenji Kawaguchi MIT oral presentation at NIPS 2016 Learning Supervised (or Predictive) learning Learn a mapping from inputs x to outputs y, given

More information

Multilayer Neural Networks. (sometimes called Multilayer Perceptrons or MLPs)

Multilayer Neural Networks. (sometimes called Multilayer Perceptrons or MLPs) Multilayer Neural Networks (sometimes called Multilayer Perceptrons or MLPs) Linear separability Hyperplane In 2D: w x + w 2 x 2 + w 0 = 0 Feature x 2 = w w 2 x w 0 w 2 Feature 2 A perceptron can separate

More information

Foundations of Artificial Intelligence

Foundations of Artificial Intelligence Foundations of Artificial Intelligence 7. Propositional Logic Rational Thinking, Logic, Resolution Joschka Boedecker and Wolfram Burgard and Bernhard Nebel Albert-Ludwigs-Universität Freiburg May 17, 2016

More information

A Connectionist Cognitive Model for Temporal Synchronisation and Learning

A Connectionist Cognitive Model for Temporal Synchronisation and Learning A Connectionist Cognitive Model for Temporal Synchronisation and Learning Luís C. Lamb and Rafael V. Borges Institute of Informatics Federal University of Rio Grande do Sul orto Alegre, RS, 91501-970,

More information