Notes on Propositional and First-Order Logic
(CPSC 229 Class Notes, January 23–30, 2017)
John Lasseter
Revised February 14, 2017

The following notes are a record of the class sessions we've devoted to the study of mathematical logic. Like the lectures that were based on it, these notes are intended to supplement rather than replace the more informal treatment given in Critchlow & Eck. My hope is that you'll come away with a good intuition for the precision in reasoning that these formal systems capture, and that you will in turn find this experience of precision helpful in the more informal mathematical explorations of the rest of this course.

For the curious, this presentation only explores the surface of this rich and powerful branch of mathematics. In particular, it elides all mention of the rich metatheory surrounding first-order logic, including the interplay of provability and semantic entailment, satisfiability and validity of formulas, consistency and completeness, the role of equality judgments, compactness, expressivity, undecidability, and alternative systems of inference rules.

1 Propositional Logic

1.1 Syntax

Syntax describes the ways in which we can combine the symbols of a formal language into valid sentences or well-formed formulas (WFFs). In the case of propositional logic, this means that WFFs consist of propositional variables and the application of the logical connectives {¬, ∧, ∨, →} to other WFFs. To jump ahead to a notation we'll see later in the semester, we can describe WFFs in propositional logic by the following grammar:
φ ::= ⊥ | ⊤          [constants]
    | P              [propositional variables]
    | ¬φ₁            [logical negation]
    | φ₁ ∧ φ₂        [conjunction]
    | φ₁ ∨ φ₂        [disjunction]
    | φ₁ → φ₂        [implication]

The constants ⊥ ("bottom") and ⊤ ("top") are used to denote the canonical contradiction and tautology. Their interpretations (see the following subsection on Truth Tables) are, respectively, false and true.¹ For clarity, we will often surround a formula with parentheses, much as we do with arithmetic formulas. Usually, this is not necessary, as the operators are given here in decreasing order of precedence, which means that operators listed earlier group more tightly than those given later. For example, the formula P ∨ ¬Q ∧ R → S ∧ T should be read as (P ∨ ((¬Q) ∧ R)) → (S ∧ T).

1.2 Truth Tables

Truth tables define the meaning of a WFF by providing an exhaustive list of all the possible truth values of the propositional variables in an expression, together with the corresponding truth value for the expression itself. In the following, we'll adopt the convention of writing 0 for false and 1 for true. As we saw in class, this helps to provide a reliable technique for making a complete truth table: think of each combination of the variables' truth values as a binary number.

P | ¬P
0 |  1
1 |  0

P Q | P ∧ Q
0 0 |   0
0 1 |   0
1 0 |   0
1 1 |   1

P Q | P ∨ Q
0 0 |   0
0 1 |   1
1 0 |   1
1 1 |   1

P Q | P → Q
0 0 |   1
0 1 |   1
1 0 |   0
1 1 |   1

The constants ⊥ and ⊤, of course, have only one possibility each:

⊥
0

⊤
1

¹ Strictly speaking, neither ⊥ nor ⊤ is necessary, but they allow us to include more intuitive rules for reasoning about contradiction.
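The binary-counting trick above is easy to automate. The following Python sketch (the function and names here are my own illustration, not part of the notes) enumerates the rows of a truth table for a connective given as a Boolean function:

```python
from itertools import product

def truth_table(formula, names):
    """List the truth values of `formula` for every assignment to `names`,
    generating rows in binary-counting order: 00, 01, 10, 11, ..."""
    results = []
    for row in product([0, 1], repeat=len(names)):
        env = dict(zip(names, (bool(v) for v in row)))
        value = int(formula(env))
        print(*row, "|", value)
        results.append(value)
    return results

# P -> Q, encoded with the Material Implication identity: (not P) or Q
table = truth_table(lambda env: (not env["P"]) or env["Q"], ["P", "Q"])
# table == [1, 1, 0, 1], matching the table for implication above
```

Because `product([0, 1], repeat=n)` counts from 00...0 up to 11...1, the rows come out in exactly the binary-number order recommended in class.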
1.3 Algebraic Equivalences

Here are some useful identities. In each one, the lowercase Greek letters φ, ψ, and χ represent WFFs of propositional logic. All of the identities can be verified by examining their truth tables.

Associativity of ∧ and ∨    (φ ∧ ψ) ∧ χ ≡ φ ∧ (ψ ∧ χ)
                            (φ ∨ ψ) ∨ χ ≡ φ ∨ (ψ ∨ χ)

Distributivity              φ ∧ (ψ ∨ χ) ≡ (φ ∧ ψ) ∨ (φ ∧ χ)
                            φ ∨ (ψ ∧ χ) ≡ (φ ∨ ψ) ∧ (φ ∨ χ)

Commutativity               φ ∧ ψ ≡ ψ ∧ φ
                            φ ∨ ψ ≡ ψ ∨ φ

Identity Elements           φ ∧ ⊤ ≡ φ
                            φ ∨ ⊥ ≡ φ

Annihilators                φ ∧ ⊥ ≡ ⊥
                            φ ∨ ⊤ ≡ ⊤

Idempotence                 φ ∧ φ ≡ φ
                            φ ∨ φ ≡ φ

DeMorgan's Laws             ¬(φ ∧ ψ) ≡ ¬φ ∨ ¬ψ
                            ¬(φ ∨ ψ) ≡ ¬φ ∧ ¬ψ

Complementation             φ ∨ ¬φ ≡ ⊤   (excluded middle)
                            φ ∧ ¬φ ≡ ⊥   (noncontradiction)

Double Negation             ¬¬φ ≡ φ

Transposition               φ → ψ ≡ ¬ψ → ¬φ

Material Implication        φ → ψ ≡ ¬φ ∨ ψ

Exportation                 (φ ∧ ψ) → χ ≡ φ → (ψ → χ)
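Since each identity can be checked by truth table, the check itself can be mechanized. Here is a small Python sketch (my own illustration, with made-up names) that verifies two of the identities above by exhausting all valuations:

```python
from itertools import product

def equivalent(f, g, names):
    """Two WFFs are equivalent iff they agree on every row of their truth table."""
    return all(
        f(dict(zip(names, row))) == g(dict(zip(names, row)))
        for row in product([False, True], repeat=len(names))
    )

# DeMorgan: not(phi and psi) == (not phi) or (not psi)
demorgan_ok = equivalent(
    lambda e: not (e["phi"] and e["psi"]),
    lambda e: (not e["phi"]) or (not e["psi"]),
    ["phi", "psi"],
)

# Exportation: (phi and psi) -> chi == phi -> (psi -> chi),
# with -> encoded via Material Implication as (not p) or q
exportation_ok = equivalent(
    lambda e: (not (e["phi"] and e["psi"])) or e["chi"],
    lambda e: (not e["phi"]) or ((not e["psi"]) or e["chi"]),
    ["phi", "psi", "chi"],
)
```

Both checks succeed, since each law holds on all four (respectively eight) rows of its truth table.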
In a related vein, we will occasionally use the macro operator φ ↔ ψ ("if and only if") to denote the formula (φ → ψ) ∧ (ψ → φ). Note that some formulations of logic add this as an additional operator, in which case we add this identity to our algebra as the Law of Material Equivalence.

1.4 Rules of Inference

For each of the three binary operators, we will need rules for inferring a formula that contains them (the introduction rules) and for inferring formulas that delete them ("elimination"):

Introduction
  [∧I]   from φ and ψ, infer φ ∧ ψ
  [∨I1]  from φ, infer φ ∨ ψ
  [∨I2]  from ψ, infer φ ∨ ψ
  [→I]   if assuming φ lets us derive ψ, infer φ → ψ (conditional proof)

Elimination
  [∧E1]  from φ ∧ ψ, infer φ
  [∧E2]  from φ ∧ ψ, infer ψ
  [∨E]   from φ ∨ ψ, φ → χ, and ψ → χ, infer χ (case analysis)
  [→E]   from φ → ψ and φ, infer ψ (modus ponens)

Rules for negation are a little different, in that ¬φ is really equivalent to the formula φ → ⊥. On the other hand, proof by contradiction is an important technique, so we add here two inference rules, reductio ad absurdum ("reduction to absurdity") and ex falso quodlibet ("from contradiction, anything [can be derived]"), to shortcut some of the work of this task:

  [RAA]  if assuming ¬φ lets us derive ⊥, infer φ
  [EFQ]  from ⊥, infer any formula φ
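The soundness of an inference rule can be checked the same way as an identity: a rule is truth-preserving if no valuation makes all of its premises true and its conclusion false. A Python sketch (my own names and encoding, not from the notes):

```python
from itertools import product

def sound(premises, conclusion, names):
    """True iff every valuation satisfying all premises also satisfies the conclusion."""
    for row in product([False, True], repeat=len(names)):
        env = dict(zip(names, row))
        if all(p(env) for p in premises) and not conclusion(env):
            return False
    return True

# [->E] (modus ponens): from phi -> psi and phi, infer psi
mp_ok = sound(
    [lambda e: (not e["phi"]) or e["psi"], lambda e: e["phi"]],
    lambda e: e["psi"],
    ["phi", "psi"],
)

# The converse ("affirming the consequent") is NOT sound:
# phi = False, psi = True satisfies both premises but not the conclusion
converse_ok = sound(
    [lambda e: (not e["phi"]) or e["psi"], lambda e: e["psi"]],
    lambda e: e["phi"],
    ["phi", "psi"],
)
```

The first check succeeds and the second fails, which is exactly why modus ponens is a rule of the system and its converse is a classic fallacy.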
Finally, we have the following two rules of syllogism (a kind of argument that goes back to Aristotle), Disjunctive Syllogism and Hypothetical Syllogism. Strictly speaking, neither rule is necessary, in that both can be derived from the other inference rules and identities given above.

  [DS]   from φ ∨ ψ and ¬ψ, infer φ
  [HS]   from φ → ψ and ψ → χ, infer φ → χ

1.5 Example Derivations

In carrying out a formal derivation, it is important to remember that each judgement must be justified from the premises and judgements that have already been made, using only the equivalence laws and rules of inference that are given. If there is no available rule, it does not matter how obvious an inference may seem to you, the human reader. You have to find another way to get there, using the rules alone (usually, you can).

The derivation style given in Critchlow and Eck, as well as in class, is known as a Fitch-style derivation. Here is one of the examples from class, illustrating the use of conditional proof to derive an implication (in this case, two of them). The proof makes use of two applications of the →I rule, in which we add a new assumption, show that this implies a desired result, and thereby conclude an implication between the assumption and the conclusion. While this assumption is being used, we say that it is open. For example, the assumption of G in Line 4 is open between Lines 4–11. When we no longer need an open assumption, we say that it is discharged. To save space, all applications of the algebraic rewriting laws are abbreviated with the justification R.
 1  (¬F → G) ∧ (H → I)         premise
 2  ¬F ∨ H                     premise
 3  (¬F → I) ∧ (H → ¬G)        premise    ⊢ (G → I) ∧ (¬I → G)
 4  | G                        assumption
 5  | H → ¬G                   ∧E, 3
 6  | ¬¬G → ¬H                 R, 5
 7  | ¬¬G                      R, 4
 8  | ¬H                       →E, 6, 7
 9  | ¬F                       DS, 2, 8
10  | ¬F → I                   ∧E, 3
11  | I                        →E, 10, 9
12  G → I                      →I, 4–11
13  | ¬I                       assumption
14  | H → I                    ∧E, 1
15  | ¬I → ¬H                  R, 14
16  | ¬H                       →E, 15, 13
17  | ¬F                       DS, 2, 16
18  | ¬F → G                   ∧E, 1
19  | G                        →E, 18, 17
20  ¬I → G                     →I, 13–19
21  (G → I) ∧ (¬I → G)         ∧I, 12, 20

2 Predicate Logic

Predicate logic adds to propositional logic the ability to express that a proposition is true for all elements in a domain, for some, or for none. In class, we have studied only first-order predicate logic (FOPL), meaning that we only consider quantification that ranges over individual elements, excluding quantification over propositions themselves.²

2.1 Syntax

The syntax for predicate logic is similar to propositional logic, with three significant additions:

We define a set of terms, which consist of variables and functions whose arguments are themselves terms. In class, we considered only constants instead of functions. These may be thought of as functions

² Logics of this sort are known as second-order (which allows quantification over first-order predicates), third-order (with quantification over first- and second-order predicates), and so on. They are strictly more expressive than FOPL, but at the cost of undecidability of many properties.
with arity 0 (i.e., a function that takes no arguments). Examples of constant or higher-arity functions depend on our domain of discourse ("universe"). For example, if our domain consists of the real numbers, the binary functions would include the standard arithmetic operators ({+, −, ×, ÷}), exponentiation, logarithms, roots, and so on. Unary functions would include negation, the trigonometric functions, and others. The constants are just the real numbers themselves. By convention, we will use lowercase letters near the end of the alphabet for variables and lowercase letters near the beginning of the alphabet for constants (and other functions).

We define a set of predicate symbols whose arguments are terms. By convention, these are written as uppercase letters. Predicates can be of any arity, though the most common forms we've seen are unary ("one place") and binary ("two place") predicates. An arity-0 predicate is just an ordinary propositional variable.

We add two quantifier symbols, ∀ ("for all") and ∃ ("there exists").

This gives us the following definitions for well-formed formulas:

φ ::= ⊥ | ⊤                                          [constants]
    | P(t₁, ..., tₙ), where t₁, ..., tₙ are terms    [atomic predicates]
    | ¬φ₁                                            [negation]
    | φ₁ ∧ φ₂ | φ₁ ∨ φ₂ | φ₁ → φ₂                    [binary connectives]
    | ∀x.φ₁                                          [universal quantification]
    | ∃x.φ₁                                          [existential quantification]

2.1.1 Free and Bound Variables

Programmers will find this idea familiar, as the free/bound variable distinction corresponds to the difference between a method's global (free) and local (bound) variables. We didn't spend much time with the technical details of this in class, but like the corresponding notion of scope in programming, it's important to have at least a working intuition of the definitions.
The set FV(t) of free variables of a term t is defined to be:

  FV(x) = {x}, if x is a variable
  FV(c) = ∅, if c is a constant
  FV(f(t₁, ..., tₙ)) = FV(t₁) ∪ ... ∪ FV(tₙ), for a function f of arity n
    (i.e., collect the free variables of all the argument terms).

The free variables FV(φ) of a formula φ are defined by:

  FV(⊥) = FV(⊤) = ∅
  FV(P(t₁, ..., tₙ)) = FV(t₁) ∪ ... ∪ FV(tₙ)
  FV(φ₁ ⊕ φ₂) = FV(φ₁) ∪ FV(φ₂), where ⊕ is one of the binary connectives ∧, ∨, or →
  FV(¬φ₁) = FV(φ₁)
  FV(∀x.φ₁) = FV(∃x.φ₁) = FV(φ₁) − {x}

A variable that is not free in φ is bound.³ The distinction between free and bound depends on whether we are considering a formula φ along with its quantifier. For example, the variable x is bound in the formula ∀x.(¬P(x) ∨ Q(x, y)), but it is free in ¬P(x) ∨ Q(x, y). The variable y is free in both. It is also possible for a variable to be both free and bound in the same formula. In

  (P(x) ∧ R(x)) → (∀x.[¬(Q(x) ∧ R(x))] ∨ S(x)),

for example, x is free in (P(x) ∧ R(x)) and S(x), but it is bound in ∀x.[¬(Q(x) ∧ R(x))].⁴

Among other things, the free/bound variable distinction helps to simplify the presentation of FOPL. In particular, we will informally distort the syntax for a quantified formula by writing ∀x.φ(x) (resp. ∃x.φ(x)) to indicate that the variable x in φ is bound by the associated quantifier. Similarly, for ∀x.φ(x) (resp. ∃x.φ(x)), we will write φ(u) to denote φ, where all free occurrences of x in φ are replaced by u.

³ If inductive definitions like this are unfamiliar to you, it will suffice in most cases to think of free variables as those that are not quantified and bound variables as the ones that are.
⁴ As with methods in a programming language, this just means we're using the same name for different variables.
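The inductive definition of FV translates directly into a recursive function. This Python sketch (the tuple encoding of formulas is my own, purely illustrative) computes free variables for terms and formulas built from the constructors above:

```python
def fv(node):
    """Free variables of a term or formula, encoded as nested tuples:
    ("var", x), ("const", c), ("fn"/"pred", name, [args]),
    ("not", phi), ("and"/"or"/"implies", phi1, phi2),
    ("forall"/"exists", x, phi), ("bottom",), ("top",)."""
    tag = node[0]
    if tag == "var":
        return {node[1]}
    if tag in ("const", "bottom", "top"):
        return set()
    if tag in ("fn", "pred"):
        # union of the free variables of all argument terms
        return set().union(*map(fv, node[2])) if node[2] else set()
    if tag == "not":
        return fv(node[1])
    if tag in ("and", "or", "implies"):
        return fv(node[1]) | fv(node[2])
    if tag in ("forall", "exists"):
        return fv(node[2]) - {node[1]}
    raise ValueError("unknown constructor: " + tag)

# forall x. (not P(x) or Q(x, y)): x is bound, but y stays free
phi = ("forall", "x",
       ("or",
        ("not", ("pred", "P", [("var", "x")])),
        ("pred", "Q", [("var", "x"), ("var", "y")])))
# fv(phi) == {"y"}, while the quantifier-free body has free variables {"x", "y"}
```

Note how the `forall`/`exists` case subtracts the bound variable, mirroring FV(∀x.φ₁) = FV(φ₁) − {x}.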
2.2 Algebraic Laws

All of the algebraic laws from propositional logic apply to formulas in predicate logic as well. In addition, we have rules for expanding/contracting the scope of a quantifier, plus versions of DeMorgan's Laws.

DeMorgan's Laws
  ¬∀x.φ(x) ≡ ∃x.¬φ(x)
  ¬∃x.φ(x) ≡ ∀x.¬φ(x)

Bound Variable Renaming
  ∀x.φ(x) ≡ ∀y.[y/x]φ(x) = ∀y.φ(y), y ∉ FV(φ(x))
  ∃x.φ(x) ≡ ∃y.[y/x]φ(x) = ∃y.φ(y), y ∉ FV(φ(x))

Order of Variable Bindings
  ∀x.∀y.ψ(x, y) ≡ ∀y.∀x.ψ(x, y)
  ∃x.∃y.ψ(x, y) ≡ ∃y.∃x.ψ(x, y)

Quantification over Conjunction and Disjunction
  ∀x.φ(x) ∧ ∀x.ψ(x) ≡ ∀x.(φ(x) ∧ ψ(x))
  ∃x.φ(x) ∨ ∃x.ψ(x) ≡ ∃x.(φ(x) ∨ ψ(x))

Non-Capture of Free Variables
  ψ ∧ ∀x.φ(x) ≡ ∀x.(ψ ∧ φ(x)), if x ∉ FV(ψ)
  ψ ∨ ∃x.φ(x) ≡ ∃x.(ψ ∨ φ(x)), if x ∉ FV(ψ)

In the laws for bound variable renaming (often called α-equivalence), we write [y/x]φ(x) to indicate the replacement of all free occurrences of x in φ(x) with y, provided that y does not already occur free in φ(x).⁵

2.3 Inference Rules

All of the inference rules for propositional logic are valid in FOPL as well. In addition, we have rules for the introduction and elimination of the universal (∀) and existential (∃) quantifiers:

⁵ The details of this obscure what is really a very simple intuition for any programmer: you can always change the name of a parameter in a function definition, so long as you do so consistently throughout the definition.
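Over a finite domain, ∀ becomes an iterated conjunction and ∃ an iterated disjunction, so the quantifier laws above can be spot-checked by brute force. Here is a Python sketch (hypothetical, for intuition only) that checks the first DeMorgan law against every unary predicate on a three-element domain:

```python
from itertools import product

domain = [0, 1, 2]

def demorgan_holds(pred):
    """Check: not forall x. phi(x)  iff  exists x. not phi(x)."""
    lhs = not all(pred(x) for x in domain)   # ¬∀x.φ(x)
    rhs = any(not pred(x) for x in domain)   # ∃x.¬φ(x)
    return lhs == rhs

# Enumerate all 2^3 unary predicates on the domain as lookup tables
all_ok = all(
    demorgan_holds(lambda x, table=table: table[x])
    for table in product([False, True], repeat=len(domain))
)
# all_ok is True: the law holds for every predicate on this domain
```

Of course, passing on one finite domain is evidence rather than proof; the laws themselves hold over every domain.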
Introduction (Generalization)
  [∀I]  from φ(u), where u is arbitrary, infer ∀x.φ(x)
  [∃I]  from φ(u), infer ∃x.φ(x)

Elimination (Instantiation)
  [∀E]  from ∀x.φ(x), infer φ(u)
  [∃E]  from ∃x.φ(x), infer φ(u), for a fresh witness u

Note that every application of ∃E is actually a form of conditional proof, since the rule introduces an assumption for ∃x.φ(x) in the form of a witness u, for which φ(u) is assumed to hold. This means that every application of ∃E gives us an open assumption, which should be closed when it is no longer needed. We'll adopt the convention here of reiterating the conclusion of an ∃E assumption immediately after closing the assumption (see Example 2.4.2).

2.4 Examples

2.4.1 Existential and Universal Generalization (∃I and ∀I)

 1  (∃x.A(x)) → (∀y.B(y))      premise    ⊢ ∀x.[A(x) → ∀y.B(y)]
 2  | u: A(u)                  assumption
 3  | ∃x.A(x)                  ∃I, 2
 4  | ∀y.B(y)                  →E, 1, 3
 5  A(u) → ∀y.B(y)             →I, 2–4
 6  ∀x.[A(x) → ∀y.B(y)]        ∀I, 2–5
2.4.2 Existential and Universal Instantiation (∃E and ∀E)

 1  ∃x.P(x)                               premise
 2  ∀x.∀y.[P(x) → (Q(y) ∨ R(x, y))]       premise
 3  ¬∃x.R(x, x)                           premise    ⊢ ∃x.Q(x)
 4  | u: P(u)                             ∃E, 1
 5  | ∀y.[P(u) → (Q(y) ∨ R(u, y))]        ∀E, 2
 6  | P(u) → (Q(u) ∨ R(u, u))             ∀E, 5
 7  | Q(u) ∨ R(u, u)                      →E, 6, 4
 8  | ∀x.¬R(x, x)                         R, 3
 9  | ¬R(u, u)                            ∀E, 8
10  | Q(u)                                DS, 7, 9
11  | ∃x.Q(x)                             ∃I, 10
12  ∃x.Q(x)                               R, 4–11

2.5 Restrictions and Pitfalls

The ∀I and ∃E rules require special care. In the ∀I antecedent, u must be an arbitrary variable: it must not occur free in ∀x.φ(x), and φ(u) must not follow from any open assumption containing u. This prevents invalid proofs that attempt to generalize from a variable that actually has specific assumptions. The restriction on free occurrences of u in an application of ∀I, for example, prevents this "proof" that if anyone is wealthy then everyone is:

 1  ∀x.(W(x) → W(x))           premise    ⊢ ∀x.∀y.(W(x) → W(y))
 2  | u: W(u) → W(u)           ∀E, 1
 3  | ∀y.(W(u) → W(y))         ∀I, 2
    [wrong: the arbitrary u assumed at (2) cannot occur free in (3)]
 4  ∀x.∀y.(W(x) → W(y))        ∀I, 3

The requirement that φ(u) cannot follow from an assumption containing u prevents certain kinds of misuse that arise from the interaction with ∃E. For example, consider this "proof" that, since there are two things that are not equal, nothing is equal to itself:

 1  ∃x.∃y.¬E(x, y)             premise    ⊢ ∀x.¬E(x, x)
 2  | u: ∃y.¬E(u, y)           ∃E, 1
 3  | | v: ¬E(u, v)            ∃E, 2
 4  | | ∀y.¬E(u, y)            ∀I, 3
    [wrong: ¬E(u, v) (3) is part of the assumption containing v]
 5  | ∀y.¬E(u, y)              R, 2–4
 6  | ∀x.∀y.¬E(x, y)           ∀I, 5
    [wrong: ∀y.¬E(u, y) (5) derives from an assumption containing u]
 7  ∀x.∀y.¬E(x, y)             R, 2–6
 8  | u: ∀y.¬E(u, y)           ∀E, 7
 9  | ¬E(u, u)                 ∀E, 8
10  | ∀x.¬E(x, x)              ∀I, 9
    [OK: an application of ∀E is valid for u or any other variable]
11  ∀x.¬E(x, x)                R, 7–10

Finally, the witness u that we assume in the ∃E rule must be a new variable, one that does not occur free in an earlier line of an open assumption. Among other mistakes, this prevents faulty derivations in which we confuse the witnesses from two or more applications of the ∃E rule. For example, we might attempt to prove that, because there is at least one dog and at least one cat, there is some animal that is both dog and cat:
1 ( x.d(x)) ( y.c(x)) x.(d(x) C(x)) 2 x.d(x) E, 1 3 u D(u) E, 2 4 x.c(x) E, 1 5 u C(u) E, 2 [wrong: u (Line 5) already occurs free in Line 3, which is open.] 6 D(u) C(u) I, 3, 5 7 x.(d(x) C(x)) I, 6 8 x.(d(x) C(x)) R, 5 7 9 x.(d(x) C(x)) R, 3 8 13