Categorial Grammar. Larry Moss, NASSLLI, Indiana University

Categorial Grammar (CG)
CG is the tradition in grammar that is closest to the work that we'll do in this course. Reason: in CG, syntax and semantics are closely related. This set of slides first covers the basics of syntax in CG. The specific system is applicative categorial grammar, also called the Ajdukiewicz–Bar-Hillel form. (But if these names don't mean anything to you, don't worry.) Then we'll look at how the semantics interfaces with the syntax. Later on, we'll extend the syntax to be more realistic.

Syntactic categories in CG
Basic syntactic categories. A categorial grammar always begins with basic categories. You should think of these as simple syntactic categories. In our linguistic applications, we will usually take N, NP, and S. But we could just as well take other basic categories, and to make this point, I'll present other choices as well, coming from formal language theory.

Syntactic categories in CG
Slash categories. If C and D are categories, so are C\D and C/D. It's very important to see the difference between the two slashes! I personally have names for these:
  \  look left
  /  look right
(mnemonic: go from bottom to top). But they have other names: backslash and slash, over and under. The overall idea is that they are directional versions of the usual division notation for fractions: the fraction X/Y, with numerator X and denominator Y, corresponds to both X\Y and X/Y.
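
To make the definition concrete, here is a minimal sketch of categories in Python. The class names (Basic, Slash) and the string encoding of the two slashes are my own choices, not anything from the slides.

```python
from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class Basic:
    """A basic category such as S, NP, or N."""
    name: str
    def __str__(self):
        return self.name

@dataclass(frozen=True)
class Slash:
    """A slash category X\\Y (look left) or X/Y (look right)."""
    result: "Cat"    # the X
    direction: str   # "\\" or "/"
    arg: "Cat"       # the Y
    def __str__(self):
        return f"({self.result}{self.direction}{self.arg})"

Cat = Union[Basic, Slash]

# (S\NP)/NP, the transitive-verb category we will see shortly:
S, NP = Basic("S"), Basic("NP")
tv = Slash(Slash(S, "\\", NP), "/", NP)
print(tv)   # ((S\NP)/NP)
```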

Examples of categories
First, here are some categories when the basic ones are S and NP:
  S/S    S\NP    (S\NP)/NP    ((S\NP)/NP)/S
The idea: these categories are going to play the role of parts of speech like nouns, noun phrases, adjectives, etc.

Another set of examples
This time, let's let the basic categories be U, S, T, and Y. Here are some categories:
  Y    S/T    (S\(U/Y))\U
There are infinitely many categories.

Lexicon
A lexicon is a set of pairs, each consisting of a symbol together with a category.
Our first lexicon:
  (Dana, NP)            (Kim, NP)
  (smiled, S\NP)        (laughed, S\NP)      (cried, S\NP)
  (praised, (S\NP)/NP)  (teased, (S\NP)/NP)  (interviewed, (S\NP)/NP)
A given word is usually associated with more than one category.

Another example of a lexicon
This time, we use S, T, U, X, and Y as the basic categories. The symbols here are the letters a, b, and c.
Lexicon:
  (a, T/X)  (a, S/X)  (a, U/Y)  (a, S/Y)
  (b, X)    (b, X/T)
  (c, Y)    (c, Y/U)

Our first lexicon: explanation
One uses a lexicon to make parse trees like:
  praised: (S\NP)/NP  +  Kim: NP    =>  praised Kim: S\NP
  Dana: NP  +  praised Kim: S\NP    =>  Dana praised Kim: S
If we take a tree t whose root is of category S\NP and put a tree u whose root is of category NP on the left of t, and then add a new root, the whole tree will be of category S. If we take a tree t whose root is of category (S\NP)/NP and put a tree u whose root is of category NP on the right of t, and then add a new root, the whole tree will be of category S\NP.

Example using our first lexicon
  praised: (S\NP)/NP  +  Kim: NP    =>  praised Kim: S\NP
  Dana: NP  +  praised Kim: S\NP    =>  Dana praised Kim: S
The leaves must match the categories in the lexicon, and going up we use the construction principles that we just saw. The key point here is that S\NP is the verb-phrase category.

A second lexicon
  (a, T/X)  (a, S/X)  (a, U/Y)  (a, S/Y)
  (b, X)    (b, X/T)
  (c, Y)    (c, Y/U)
Let's parse abab as a string of category S:
  a: T/X  +  b: X      =>  ab: T
  b: X/T  +  ab: T     =>  bab: X
  a: S/X  +  bab: X    =>  abab: S

Grammars and languages
A categorial grammar is a pair G = (Lex, C), where Lex is a lexicon (over some set of atomic categories) and C is some fixed category. The language of G is the set of sequences of symbols from the lexicon which can be parsed by some tree whose root is labeled C.
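
Since the language of G is defined by parsability, a small chart recognizer makes the definition concrete. Below is a sketch of a CYK-style recognizer, building on the Basic/Slash encoding above; the function names (combine, parses) are mine, and the two cases in combine are exactly the two cancellation rules used in the trees so far.

```python
from itertools import product

def combine(left, right):
    """All categories derivable from two adjacent daughters."""
    out = set()
    if isinstance(left, Slash) and left.direction == "/" and left.arg == right:
        out.add(left.result)     # X/Y followed by Y gives X
    if isinstance(right, Slash) and right.direction == "\\" and right.arg == left:
        out.add(right.result)    # Y followed by X\Y gives X
    return out

def parses(words, lexicon, target):
    """CYK-style recognition: can words be parsed as category target?"""
    n = len(words)
    # chart[i, j] = all categories of trees spanning words[i:j]
    chart = {(i, i + 1): set(lexicon[w]) for i, w in enumerate(words)}
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            cats = set()
            for k in range(i + 1, j):
                for l, r in product(chart[i, k], chart[k, j]):
                    cats |= combine(l, r)
            chart[i, j] = cats
    return target in chart[0, n]
```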

Example of a grammar and its language
Let's start with G1 = (Lex1, S), where Lex1 is our first lexicon, repeated below:
Lex1:
  (Dana, NP)            (Kim, NP)
  (smiled, S\NP)        (laughed, S\NP)      (cried, S\NP)
  (praised, (S\NP)/NP)  (teased, (S\NP)/NP)  (interviewed, (S\NP)/NP)
The language of this grammar G1 is the set containing the following 18 sequences of lexical items:
  Dana smiled          Dana laughed         Dana cried
  Kim smiled           Kim laughed          Kim cried
  Dana praised Dana    Dana teased Dana     Dana interviewed Dana
  Dana praised Kim     Dana teased Kim      Dana interviewed Kim
  Kim praised Dana     Kim teased Dana      Kim interviewed Dana
  Kim praised Kim      Kim teased Kim       Kim interviewed Kim

Second example of a grammar and its language
Let's next consider G2 = (Lex2, S), where Lex2 is our second lexicon, repeated below:
Lex2:
  (a, T/X)  (a, S/X)  (a, U/Y)  (a, S/Y)
  (b, X)    (b, X/T)
  (c, Y)    (c, Y/U)
The language of this grammar G2 is harder to determine. It turns out to be
  {ab, abab, ababab, abababab, ...}  together with  {ac, acac, acacac, acacacac, ...}
that is, all strings of the form (ab)^n or (ac)^n with n >= 1.
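
As a sanity check on this claim, here is Lex2 in the encoding from earlier, fed to the recognizer sketched above (again, the encoding is mine):

```python
S, T, U, X, Y = (Basic(c) for c in "STUXY")
lex2 = {"a": {Slash(T, "/", X), Slash(S, "/", X), Slash(U, "/", Y), Slash(S, "/", Y)},
        "b": {X, Slash(X, "/", T)},
        "c": {Y, Slash(Y, "/", U)}}
print(parses(list("abab"), lex2, S))   # True
print(parses(list("abc"), lex2, S))    # False: ab parses as T or S, neither combines with c
```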

Our third lexicon
Lex3:
  (Dana, NP)                 (Kim, NP)
  (smiled, S\NP)             (laughed, S\NP)            (cried, S\NP)
  (praised, (S\NP)/NP)       (teased, (S\NP)/NP)        (interviewed, (S\NP)/NP)
  (joyfully, (S\NP)\(S\NP))  (carefully, (S\NP)\(S\NP)) (excitedly, (S\NP)\(S\NP))

Two examples using adverbs
  smiled: S\NP  +  joyfully: (S\NP)\(S\NP)    =>  smiled joyfully: S\NP
  Dana: NP  +  smiled joyfully: S\NP          =>  Dana smiled joyfully: S

  praised: (S\NP)/NP  +  Dana: NP                  =>  praised Dana: S\NP
  praised Dana: S\NP  +  carefully: (S\NP)\(S\NP)  =>  praised Dana carefully: S\NP
  Kim: NP  +  praised Dana carefully: S\NP         =>  Kim praised Dana carefully: S

NP coordination
To get
  Farid and Bettina and Cynthia left
we add to the lexicon
  (and, (NP\NP)/NP)
Then:
  and: (NP\NP)/NP  +  C: NP           =>  and C: NP\NP
  B: NP  +  and C: NP\NP              =>  B and C: NP
  and: (NP\NP)/NP  +  B and C: NP     =>  and B and C: NP\NP
  F: NP  +  and B and C: NP\NP        =>  F and B and C: NP
  F and B and C: NP  +  left: S\NP    =>  F and B and C left: S

Semantics begins here
We have seen how to build complex categories in CG using the two directional slashes \ and /. We now carry out a parallel development on the semantic side, but with a few differences. We start with a syntax for semantics, generating a set of semantic types.

Semantic Types
We begin with a set T0 of basic types. Every basic type is a type. If σ and τ are types, so is (σ, τ). The set of all types is written T.
Our main example is when the basic types are e and t, standing for entity and truth value.
(By the way, σ is the lower-case Greek letter sigma, and τ is similarly the Greek letter tau.)

Semantic Types: examples
Note that we often drop the comma, as is customary:
  e    t    (et)    ((et), t)    ((et)((et)t))

Semantic Domains
Let D be a function which assigns sets to the basic types. In our setting, the basic types are usually e and t, so we would have D(e) and D(t). We'll write these as D_e and D_t, since this is what everyone does.
This function D can be extended to all types by the rule
  D_(στ) = (D_σ → D_τ)
This is the set of all functions from D_σ to D_τ. As a function, the domain of D is the set T of all types.
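
For finite basic domains, these function spaces can be enumerated outright. Here is a sketch, with types encoded as strings or pairs and with functions represented extensionally as tuples of (input, output) pairs; this representation is my own choice.

```python
from itertools import product

def domain(typ, D_basic):
    """All elements of D_typ; typ is 'e', 't', or a pair (sigma, tau)."""
    if isinstance(typ, str):               # a basic type
        return list(D_basic[typ])
    sigma, tau = typ
    dom = domain(sigma, D_basic)
    cod = domain(tau, D_basic)
    # a function from D_sigma to D_tau, listed as (input, output) pairs
    return [tuple(zip(dom, outs)) for outs in product(cod, repeat=len(dom))]

D_basic = {"e": ["a", "b", "c", "d"], "t": [0, 1]}
print(len(domain(("e", "t"), D_basic)))           # 2**4  = 16
print(len(domain((("e", "t"), "t"), D_basic)))    # 2**16 = 65536
```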

The idea behind the equation D_(στ) = (D_σ → D_τ)
Think of X\Y and X/Y in terms of functions: output\input and output/input.
A phrase v: X\Y will be interpreted by a function [[v]]_X\Y : Y-interpretations → X-interpretations.
A phrase v: X/Y will be interpreted by a function [[v]]_X/Y : Y-interpretations → X-interpretations.
Either way, putting one phrase after another corresponds to function application.

Semantic Domains: example
Let's use D_e = {a, b, c, d} and D_t = 2 = {0, 1}.
Incidentally, we'll always use the set 2 = {0, 1} = {false, true} for D_t. D_e is our set of entities, and this can be anything.
  type σ      D_σ
  e           {a, b, c, d}
  t           2
  (et)        the functions f : {a, b, c, d} → 2
  ((et), t)   the functions from the set above to 2
  (tt)        the four functions from 2 to 2

Connecting the syntactic categories with the semantic types
Here is how we connect the syntactic categories with the semantic types. Let Cat0 be the set of basic categories in the syntax, and let Cat be the full set of categories. We start with a function k : Cat0 → T:
  syntactic category X    semantic type k(X)
  S                       t
  N                       (et)
  NP                      ((et)t)
We then extend this to all syntactic categories by using function spaces for both directional slashes. We'll call the extended function k̂.

A picture of k and k̂
Act I: k has domain Cat0 = {S, NP, N}. It sends S to t, NP to (et, t), and N to et; other members of T, such as e and (et, (et, t)), are not in its image.
Act II: k̂ extends k to all of Cat. It agrees with k on S, NP, and N, and in addition sends, for example, NP/N to (et, (et, t)), S\NP to ((et, t), t), and N\N to (et, et).

Connecting the syntactic categories with the semantic types
  syntactic category X    name              semantic type k̂(X)
  S                       sentence          t
  N                       noun              (et)
  NP                      noun phrase       ((et)t)
  N/N                     adjective         ((et)(et))
  S\NP                    verb phrase       (((et)t)t)
  (S\NP)\(S\NP)           adverb            ((((et)t)t)(((et)t)t))
  (S\NP)/NP               transitive verb   (((et)t)(((et)t)t))
  NP/N                    determiner        ((et)((et)t))
In more detail on how k̂ is defined on the transitive verb category:
  k̂((S\NP)/NP) = (((et)t)(((et)t)t))
We are using the general definition that explains the chart:
  k̂(X\Y) = (k̂(Y), k̂(X))
  k̂(X/Y) = (k̂(Y), k̂(X))
You can also see why people who work on this need special abbreviations for commonly found semantic types.
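
The chart can be computed rather than memorized. Here is a sketch of k and its extension k̂ over the Basic/Slash encoding from earlier; note that both slashes map to the same function type.

```python
# The basic-category part of the type map, from the chart above.
k = {"S": "t", "N": ("e", "t"), "NP": (("e", "t"), "t")}

def khat(cat):
    """The extension of k to all categories."""
    if isinstance(cat, Basic):
        return k[cat.name]
    # khat(X\Y) = khat(X/Y) = (khat(Y), khat(X))
    return (khat(cat.arg), khat(cat.result))

tv = Slash(Slash(Basic("S"), "\\", Basic("NP")), "/", Basic("NP"))  # (S\NP)/NP
print(khat(tv))   # ((('e', 't'), 't'), ((('e', 't'), 't'), 't'))
```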

What do we need to do semantics on a CG? That is, what is a model?
We start with a specific CG, used for the syntax. We have the set T of semantic types, and sets D_σ for σ in T. Next, we must have a function
  k : Cat0 → T
and this induces an extension
  k̂ : Cat → T
This connects the syntactic categories with the semantic types.
A lexical interpretation function [[·]] is a function which takes an item in the lexicon, say (w, X), and gives some [[w]]_X in D_k̂(X).
This is what it takes to give a model.

The main point of the semantics, again
Given: a categorial lexicon, k : Cat0 → T, and a lexical interpretation function. Every parse tree in the grammar has a semantic correlate: every node in the parse tree, say (v, X), has a correlate on the semantic side of the form [[v]]_X in D_k̂(X). Each use of the CG cancellation rules
  v1: X/Y  +  v2: Y    =>  v1 v2: X
  v1: Y  +  v2: X\Y    =>  v1 v2: X
corresponds to function application:
  [[v1 v2]]_X = [[v1]]_X/Y([[v2]]_Y)
  [[v1 v2]]_X = [[v2]]_X\Y([[v1]]_Y)
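
Here is a sketch of this correspondence: an interpreter that walks a parse tree and applies functor to argument according to which cancellation rule licensed the node. The tree encoding and the toy lexical interpretations below are mine.

```python
def interpret(tree, sem):
    """tree is ('leaf', word, cat) or ('app', left_tree, right_tree);
    returns (denotation, category)."""
    if tree[0] == "leaf":
        _, word, cat = tree
        return sem[word], cat
    _, left, right = tree
    lval, lcat = interpret(left, sem)
    rval, rcat = interpret(right, sem)
    if isinstance(lcat, Slash) and lcat.direction == "/" and lcat.arg == rcat:
        return lval(rval), lcat.result    # [[v1 v2]] = [[v1]]([[v2]])
    if isinstance(rcat, Slash) and rcat.direction == "\\" and rcat.arg == lcat:
        return rval(lval), rcat.result    # [[v1 v2]] = [[v2]]([[v1]])
    raise ValueError("daughters do not combine")

# "Dana smiled", with NP interpreted at type ((et)t) as in the chart:
smilers = {"dana"}
sem = {"Dana": lambda p: p("dana"),                    # type ((et)t)
       "smiled": lambda Q: Q(lambda x: x in smilers)}  # type (((et)t)t)
S, NP = Basic("S"), Basic("NP")
tree = ("app", ("leaf", "Dana", NP), ("leaf", "smiled", Slash(S, "\\", NP)))
print(interpret(tree, sem))   # (True, Basic(name='S'))
```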

Algebra as grammar
We take a single base type r, and as our lexicon we take
  plus: (r/r)/r    minus: (r/r)/r    times: (r/r)/r    div2: (r/r)/r
  v: r    w: r    x: r    y: r    z: r    1: r    2: r

We get terms in Polish notation
  plus: (r/r)/r  +  v: r          =>  plus v: r/r
  plus v: r/r  +  w: r            =>  plus v w: r
  minus: (r/r)/r  +  z: r         =>  minus z: r/r
  minus z: r/r  +  plus v w: r    =>  minus z plus v w: r
This would correspond to the term usually written z − (v + w).

Semantics
The semantics will use higher-order (one-place) functions on the real numbers. We take D_r = R. Then automatically,
  D_(rr) = R → R
and
  D_(r(rr)) = R → (R → R).

Semantics
As one particular model, we take
  [[v]] = 4    [[w]] = 2    [[x]] = 65    [[y]] = 3    [[z]] = 0    [[1]] = 1    [[2]] = 2
  [[plus]](x)(y) = x + y
  [[minus]](x)(y) = x − y
  [[times]](x)(y) = x · y
  [[div2]](x)(y) = 2x/y
The part in the middle is "standard"; it would not be sensible to use any other choice.
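
As a sketch of this model, here is an evaluator for Polish-notation terms over it. It assumes the reading of div2 as 2t/u adopted above, which is a reconstruction rather than something the slides state outright.

```python
env = {"v": 4, "w": 2, "x": 65, "y": 3, "z": 0, "1": 1, "2": 2}
ops = {"plus":  lambda s: lambda t: s + t,
       "minus": lambda s: lambda t: s - t,
       "times": lambda s: lambda t: s * t,
       "div2":  lambda s: lambda t: 2 * s / t}   # reconstructed reading

def evaluate(tokens):
    """Evaluate a Polish-notation token list; returns (value, rest)."""
    head, rest = tokens[0], tokens[1:]
    if head in env:
        return env[head], rest
    f = ops[head]
    arg1, rest = evaluate(rest)
    arg2, rest = evaluate(rest)
    return f(arg1)(arg2), rest

value, _ = evaluate("minus z plus v w".split())
print(value)   # 0 - (4 + 2) = -6
```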

We get terms in Polish notation
Recall the parse of minus z plus v w from before. After some calculation,
  [[minus z plus v w]] = [[z]] − ([[v]] + [[w]]) = 0 − (4 + 2) = −6

We get terms in Polish notation
We are interested in a term corresponding to f(v, w, x, y, z) = 2(x − y)/(z − (v + w)). To fit it all on the screen, let's drop the types:
  minus x y           (that is, x − y)
  minus z plus v w    (that is, z − (v + w))
  div2 minus x y minus z plus v w    (the whole term)
Here div2(t)(u) is supposed to mean 2t/u.

Can we determine the polarities of the variables from the tree?
Go from the root to the leaves, marking green for + and red for −. The rule for propagating colors: the right branches of completed nodes for div2 and minus flip colors. Otherwise, we keep colors as we go up the tree.
Applied to div2 minus x y minus z plus v w:
  the root is +;
  minus x y keeps +, so x is + and y is −;
  minus z plus v w sits on the flipped branch of div2, so it is −;
  inside it, z is −, and plus v w flips back to +, so v and w are both +.
This agrees with what we saw before: f(v⁺, w⁺, x⁺, y⁻, z⁻).
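
The coloring procedure is easy to implement directly on the Polish-notation string: carry a sign down from the root, flipping it on the second argument of minus and div2. A sketch, with signs represented as ±1 (my choice):

```python
FLIPS_SECOND = {"minus", "div2"}
OPERATORS = {"plus", "minus", "times", "div2"}

def polarities(tokens, sign=+1, out=None):
    """Return ({variable: set of signs}, remaining tokens)."""
    if out is None:
        out = {}
    head, rest = tokens[0], tokens[1:]
    if head not in OPERATORS:                  # a variable or constant
        out.setdefault(head, set()).add(sign)
        return out, rest
    out, rest = polarities(rest, sign, out)    # first argument: same sign
    flip = -sign if head in FLIPS_SECOND else sign
    return polarities(rest, flip, out)         # second argument: maybe flipped

term = "div2 minus x y minus z plus v w".split()
marks, _ = polarities(term)
print(marks)   # {'x': {1}, 'y': {-1}, 'z': {-1}, 'v': {1}, 'w': {1}}
```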

This algorithm has a history
It was first proposed in CG by van Benthem in the 1990s to formalize the ↑, ↓ notation. His proposal was then worked out by Sánchez-Valencia. (Older versions exist: e.g., Sommers.) Versions of it are even implemented in real-world CL systems: Rowan Nairn, Cleo Condoravdi, and Lauri Karttunen, "Computing relative polarity for textual inference," in Proceedings of ICoS-5 (Inference in Computational Semantics), Buxton, UK, 2006. (Karttunen was an IU Linguistics PhD and spoke about this work in a distinguished-alumni talk here a few years ago.)